World Cup: What Factors are Effective for Determining Team Efficiency and World Cup Placement?

DataRes at UCLA
Apr 8, 2023


Authors: Helena Xu (Project Lead), Ethan Rauchwerk, Nalin Chopra, Izaac Tay, Ken Isaka

Introduction

Every four years, soccer fans from around the world gather to root for their favorite teams during the FIFA World Cup. Over the course of the tournament, each team records wins, losses, and ties, which together determine the final results. This raises a natural question: what factors help determine a team’s efficiency and its placement in the World Cup? In particular, we analyze the number of players a team uses and their average age, formations, the ratio of completed to attempted passes, possession and goals scored, and goalkeeping, to explore what soccer fans might use to predict how well their favorite team will do.

Number of Players and Average Ages

Two factors that could affect success in the World Cup are the number of players a team uses in its matches and the team’s average age. It’s hard to hypothesize how these factors would affect success, as both a positive and a negative relationship are plausible. On one hand, teams that use a lot of players could be more successful because bringing fresh legs into a game adds energy. On the other hand, it may be more beneficial for a team to keep its starters in and trust its best players to handle the whole game. As for age, it’s unclear without looking at the data whether an older, more experienced team or a younger, more energetic team is more likely to advance in the World Cup.

One way to define success in the World Cup is via goal differential, which is goals scored minus goals conceded. This accounts for the margin by which teams win, and it also reflects how far teams went in the tournament, because playing more games gives them more chances to add to their goal differential.
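As a minimal sketch of this definition, the Python snippet below totals goal differential per team from a list of match results. The function name, the tuple layout, and the three placeholder teams with their scores are assumptions made up for illustration; they are not taken from the Kaggle data.

```python
from collections import defaultdict


def goal_differentials(matches):
    """Return {team: goals scored minus goals conceded} over all matches.

    Each match is a (team_1, team_2, goals_1, goals_2) tuple. This layout
    is an assumption for the sketch, not the dataset's actual schema.
    """
    diff = defaultdict(int)
    for team_1, team_2, goals_1, goals_2 in matches:
        diff[team_1] += goals_1 - goals_2
        diff[team_2] += goals_2 - goals_1
    return dict(diff)


# Made-up results for three placeholder teams (not real World Cup data).
example_matches = [
    ("Team A", "Team B", 2, 0),
    ("Team B", "Team C", 1, 1),
    ("Team A", "Team C", 3, 1),
]
example_diffs = goal_differentials(example_matches)
```

Because every goal scored by one team is conceded by another, the differentials across all teams always sum to zero, which is a quick sanity check on any such table.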

From these graphs, there does not appear to be a relationship between goal differential and either average age or the number of players used. Perhaps this is because both ends of each variable are valuable: experience is roughly as valuable as youth, and relying on starters is roughly as valuable as rotating through a lot of players. Therefore, teams do not necessarily have to concern themselves with using a certain number of players or targeting a certain age in their players to increase their chances of a high placement in the World Cup.

Effective Formations

Different teams favor different formations, and there is likely a reason for these choices. In the 2022 World Cup, 12 types of team formations were used across the various matches, as presented in the boxplot below.

The boxplot shows the formations that teams used during their matches along the x-axis and the scores achieved with each formation along the y-axis. Some formations, such as 3–1–4–2 and 4–1–2–1–2, produced the same score in every match shown in the data. The 4–2–3–1 and 4–3–3 formations had some matches with higher scores. For a majority of the formations, scores mostly range from 0 to 2. As presented, some formations do appear to be associated with higher scores, even though there is variation and many of the scores are similar.
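The per-formation comparison that a boxplot draws can be sketched numerically by grouping match scores by formation and taking the median of each group. This is only a rough illustration: the function name and the (formation, goals) rows are made up for the example and are not the values behind the boxplot.

```python
from collections import defaultdict
from statistics import median


def median_score_by_formation(rows):
    """Group (formation, goals) rows and return the median goals per formation."""
    grouped = defaultdict(list)
    for formation, goals in rows:
        grouped[formation].append(goals)
    return {formation: median(goals) for formation, goals in grouped.items()}


# Made-up rows for illustration only (not real match data).
example_rows = [
    ("4-3-3", 2),
    ("4-3-3", 0),
    ("4-3-3", 4),
    ("4-2-3-1", 1),
    ("4-2-3-1", 3),
]
example_medians = median_score_by_formation(example_rows)
```

The median is the center line of each box in a boxplot, so comparing medians is a compact way to compare formations, although formations used in only one or two matches give too few data points to say much.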

Completed Pass vs Attempted Pass Ratios

In a soccer match, passing the ball from one player to another is inevitable. Each pass is a chance for one team to move the ball closer to the goal, or for the other team to intercept it. Thus, we may wonder whether a higher ratio of completed passes to attempted passes predicts whether a team wins a match or how well it ranks in the World Cup. The following barplot shows the ten teams with the lowest pass ratios across the matches of the 2022 World Cup.

Here, the ten teams are Australia, Cameroon, Costa Rica, Ecuador, Japan, Morocco, Netherlands, Poland, Saudi Arabia, and Tunisia. On the other hand, many teams had pass ratios of one; among these, the ten unique teams with the most passes are Spain, Argentina, Germany, Croatia, Brazil, Portugal, Denmark, France, Belgium, and England. When examining the match statistics and final rankings of these twenty teams, there seems to be a slight correlation between a higher pass ratio and both more wins and a higher overall ranking. Despite this, some of the teams with lower pass ratios won matches against teams with higher pass ratios and finished with a higher final ranking than those teams. Therefore, a team can still win a match or rank higher even with a lower share of completed passes, but there could be an advantage to completing more of the passes it attempts.
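The pass ratio discussed above is simply completed passes divided by attempted passes. A minimal Python sketch, with a hypothetical function name and made-up counts, might look like this:

```python
def pass_completion_ratio(completed, attempted):
    """Return completed / attempted passes as a value between 0 and 1."""
    if attempted <= 0:
        raise ValueError("attempted passes must be positive")
    if not 0 <= completed <= attempted:
        raise ValueError("completed passes must be between 0 and attempted")
    return completed / attempted


# Made-up counts for illustration only (not from the dataset).
example_ratio = pass_completion_ratio(completed=450, attempted=500)
```

A ratio of one means every attempted pass was recorded as completed; a lower value means a larger share of passes was intercepted or otherwise lost.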

Possessions and Scored Goals

If a team has more possession of the ball during a match, we might expect that team to score more goals. In other words, we would expect a positive correlation between the percentage of possession and the number of goals scored.

Based on the scatterplot above, there is indeed a positive correlation between possession and goals for the 32 teams that played in the 2022 FIFA World Cup, as suggested by the red line of best fit. This means that a higher share of possession tends to go along with more goals scored. However, the individual data points for each team show that this relationship does not always hold. Many teams have a similar number of goals despite having different percentages of possession, so other factors clearly also play a role in determining how many goals a team scores and thus whether it wins the match. Nevertheless, the overall trend in the scatterplot still suggests that higher possession comes with a higher chance of scoring more goals.
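A line of best fit like the one described is commonly computed by ordinary least squares. The sketch below shows that calculation in plain Python; it is an assumption that the plotted line was fit this way, and the possession and goal values are made up for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up values for five placeholder teams (not real World Cup data).
possession_pct = [35, 45, 50, 55, 65]
goals_scored = [2, 1, 3, 2, 4]
slope, intercept = fit_line(possession_pct, goals_scored)
```

A positive slope corresponds to the upward-tilting line in the scatterplot. The scatter of individual points around the line is what indicates that possession alone does not determine goals.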

Goalkeeping

One interesting statistic for evaluating the quality of goalkeeping in the World Cup is save percentage. This is the percentage of the shots a goalkeeper faced that they saved. A common criticism of this statistic is that it doesn’t take into account the difficulty of the shots faced. For example, if a ball slowly rolls toward the middle of the net and the goalkeeper picks it up, it counts as a save. So, can the percentage of shots saved really indicate how well a goalkeeper did at preventing goals? To investigate, we can look at the linear relationship between save percentage and goals against.

From the graph, it is clear that there is a relatively strong, negative, linear relationship between save percentage and goals against. Therefore, it is fair to use save percentage as a way to assess a goalkeeper’s performance and to determine who the best goalkeepers were in the World Cup.
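To make the two quantities concrete, the sketch below computes save percentage as saves divided by shots faced, then measures the strength of its linear relationship with goals against using the Pearson correlation coefficient. The choice of Pearson's r, the function names, and the goalkeeper rows are assumptions for illustration; the numbers are made up and not taken from the dataset.

```python
import math


def save_percentage(saves, shots_faced):
    """Return the fraction of faced shots that were saved."""
    if shots_faced <= 0:
        raise ValueError("shots_faced must be positive")
    return saves / shots_faced


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Made-up (saves, shots faced, goals against) rows for four placeholder keepers.
keepers = [(12, 20, 8), (9, 12, 3), (20, 25, 5), (6, 12, 6)]
save_pcts = [save_percentage(saves, shots) for saves, shots, _ in keepers]
goals_against = [ga for _, _, ga in keepers]
r = pearson_r(save_pcts, goals_against)
```

A value of r close to -1 would indicate a strong negative linear relationship, meaning goalkeepers with higher save percentages tend to concede fewer goals.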

Conclusion

To soccer fans, the most exciting thing is when their favorite teams win a match in the World Cup. Of course, there is also the anticipation that comes with wondering whether their teams will win a future match. There are a few elements that may or may not be influential in these outcomes, which we have investigated here. The number of players used and the average age of players on a team do not have much of an effect, since there are benefits to both younger and older players and to using more or fewer players. The type of formation that each team uses could potentially influence the outcome of a match; teams using some formations were shown to have slightly higher scores, but this might also vary depending on the formation that the opponents employ.

Furthermore, comparing completed passes with total attempted passes suggests that there is a slight advantage to having a higher pass ratio, but this is not definitive, as teams with lower pass ratios can also win more matches and achieve a higher ranking. There is a positive correlation between the percentage of possession and the number of goals scored, meaning that possessing the ball more often would likely lead to more goals, which in turn means more chances of winning a match. Finally, the goalkeeping statistics show that a higher save percentage is associated with fewer goals against; this is an important factor, since goalkeepers play a substantial role in keeping the other team from scoring.

What other factors could be useful in predicting the outcomes of the World Cup? Based on this analysis of the various factors from the 2022 FIFA World Cup, do you think you can predict how well your favorite team will do in its matches and where it will rank in the next World Cup?

Sources

https://www.kaggle.com/datasets/swaptr/fifa-world-cup-2022-match-data

https://www.kaggle.com/datasets/swaptr/fifa-world-cup-2022-statistics?select=team_data.csv
