Football + Data: Where to find the next EPL wonderkid?
Football clubs are always on the lookout for great talent, and they willingly splurge on young, highly talented players.
One example is Real Madrid's purchase of Gareth Bale, then a 24-year-old Tottenham Hotspur forward, for €101 million.

Because of this, many football clubs run successful businesses by buying promising young talent cheaply and selling it on once it catches the eye of a bigger club. One example is Borussia Dortmund, a club famous for its high-potential academy talents.
On top of that, football clubs spend millions on scouts who find promising players for their academies.
Problem statement
So, how does a football club know where to look for potential wonderkids? Where should it send its scouts?
One way is to look at the big picture, using data.
In this article, I will use the player data from the English Premier League season 2020/2021 to demonstrate the process of identifying potential scouting locations with football wonderkids. The dataset is available at Kaggle here, while the dashboard is available here.
Things to note:
- We define young players as those aged 25 and below, which is reflected in the filter of the Tableau dashboard.
- We use the median in this analysis because the goal (median = 6) and assist (median = 9) counts are positively skewed, as seen in the following figure:

Median Goals by Nationality
First off, we look at Median Goals by Nationality.

Median goals are represented by the size of each circle: the bigger the circle, the better. A median of X means that a typical player from that country scores X goals.
One interesting thing is that some good players come from outside of Europe: Burkina Faso (BFA), Mali (MLI), United States (USA), Nigeria (NGA) and Ghana (GHA).
Football clubs could target these countries for cheap but promising talent, since competition for talent in other regions is quite fierce.
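To make the choice of the median concrete, here is a minimal sketch of the per-nationality calculation using only the Python standard library. The rows in `players` are made-up sample values, not records from the Kaggle dataset, and the tuple layout is an assumption for illustration.

```python
from collections import defaultdict
from statistics import mean, median

# Hypothetical sample rows: (nationality, age, goals). Not taken from the dataset.
players = [
    ("ENG", 21, 1),
    ("ENG", 23, 2),
    ("ENG", 24, 3),
    ("ENG", 22, 14),
    ("ENG", 29, 20),  # excluded below: older than 25
    ("NGA", 24, 5),
    ("NGA", 22, 7),
    ("BFA", 25, 8),
]

MAX_AGE = 25  # "young" as defined in this article

# Keep only young players, mirroring the dashboard filter.
young = [p for p in players if p[1] <= MAX_AGE]

# One high scorer pulls the mean up, while the median stays near the typical player.
eng_goals = [goals for nat, _, goals in young if nat == "ENG"]
print("ENG mean:", mean(eng_goals), "median:", median(eng_goals))

# Group goals by nationality, then take the median of each group.
goals_by_nation = defaultdict(list)
for nat, _, goals in young:
    goals_by_nation[nat].append(goals)

median_goals = {nat: median(vals) for nat, vals in goals_by_nation.items()}

# Higher is better, so sort descending.
for nat, value in sorted(median_goals.items(), key=lambda kv: kv[1], reverse=True):
    print(nat, value)
```

In the sample, the single 14-goal player lifts the English mean well above the median, which is the effect of positive skew that motivates using the median throughout.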
Median Contributions, Goals and Assists by Nationality
Secondly, we take a look at Median Contributions, Goals and Assists by Nationality.
Contributions here are the sum of goals and assists, and they are the main metric we look at in this section.

Burkina Faso shows up again, this time at the top of the list, joined by fellow non-European countries such as Algeria, Brazil, Cameroon, Egypt, and Nigeria.
Median Mins per Contribution by Nationality
Thirdly, we take a look at Median Minutes Played per Contribution by Nationality.

This is important as we want to know how long a player needs to play before making a mark in the form of goals or assists. The shorter it is, the better.
Mali tops the list this time, followed by other non-European countries such as Cameroon, Ghana, Argentina and Nigeria.
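As a rough sketch of how this metric can be computed, the snippet below divides minutes played by contributions (goals plus assists) and takes the median per nationality. The sample rows are invented for illustration. Skipping players with zero contributions is my own assumption, since the ratio is undefined for them; the dashboard may handle that case differently.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows: (nationality, minutes, goals, assists). Illustrative only.
players = [
    ("MLI", 900, 4, 2),
    ("MLI", 1200, 3, 3),
    ("CMR", 1500, 5, 1),
    ("ENG", 2000, 2, 2),
    ("ENG", 1800, 6, 3),
    ("ENG", 450, 0, 0),  # no contributions: the ratio is undefined
]


def minutes_per_contribution(minutes, goals, assists):
    """Return minutes played per goal or assist, or None when there are none."""
    contributions = goals + assists
    if contributions == 0:
        return None
    return minutes / contributions


ratios_by_nation = defaultdict(list)
for nat, minutes, goals, assists in players:
    ratio = minutes_per_contribution(minutes, goals, assists)
    if ratio is not None:  # assumption: skip players without a goal or assist
        ratios_by_nation[nat].append(ratio)

median_ratio = {nat: median(vals) for nat, vals in ratios_by_nation.items()}

# Lower is better here, so sort ascending.
ranking = sorted(median_ratio, key=median_ratio.get)
for nat in ranking:
    print(nat, median_ratio[nat])
```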
Median xG and xA by Nationality
Lastly, we take a look at Median xG and xA by Nationality.
xG and xA stand for Expected Goals and Expected Assists. They are model-based estimates of how many goals and assists a player would be expected to record, based on factors such as where on the field the chances occur. In short, the higher they are, the better.

However, we put more emphasis on goals than on assists, because goal machines hit the headlines more often than assist kings, and popularity also influences a player's transfer fee.
Looking at the chart, Serbia tops the list this time, followed by non-European nations like Egypt, Burkina Faso, Brazil, Japan and Algeria. One interesting nation that made it into the top 15 is Mauritania (MTN).
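One simple way to encode "goals first, assists second" is to rank nationalities by median xG and use median xA only to break ties. This ordering rule is an assumption made for illustration, not necessarily how the dashboard sorts, and the sample values below are invented.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample rows: (nationality, xG, xA). Illustrative values only.
players = [
    ("SRB", 0.60, 0.10),
    ("SRB", 0.55, 0.15),
    ("SRB", 0.50, 0.20),
    ("EGY", 0.50, 0.15),
    ("BRA", 0.30, 0.30),
    ("BRA", 0.35, 0.25),
    ("BRA", 0.40, 0.20),
    ("JPN", 0.35, 0.10),
]

xg_by_nation = defaultdict(list)
xa_by_nation = defaultdict(list)
for nat, xg, xa in players:
    xg_by_nation[nat].append(xg)
    xa_by_nation[nat].append(xa)

# Pair each nationality with (median xG, median xA).
medians = {
    nat: (median(xg_by_nation[nat]), median(xa_by_nation[nat]))
    for nat in xg_by_nation
}

# Tuples compare element by element: median xG decides, median xA breaks ties.
ranking = sorted(medians, key=medians.get, reverse=True)
for nat in ranking:
    print(nat, medians[nat])
```

In the sample, Brazil and Japan share the same median xG, so Brazil's higher median xA places it ahead.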
Concluding Remarks
Based on the visualizations above, if I were a football scout, I would propose scouting for talent in countries like Mali, Burkina Faso, Cameroon, Ghana and Egypt. Mauritania is optional, as it only stands out under an estimated metric, xG.
What are your thoughts on this analysis? Let me know!