Evaluation of the 2018 FIFA World Cup Forecast

[This article was first published on Achim Zeileis, and kindly contributed to R-bloggers.]

A look back at the 2018 FIFA World Cup in Russia to check whether our tournament forecast based on the bookmaker consensus model was any good…

How surprising was the tournament?

Last week France won the 2018 FIFA World Cup in a match against Croatia in Russia, thus delivering an entertaining final to a sportful tournament. Many perceived the course of the tournament as very unexpected and surprising because many of the “usual” favorites like Brazil, Germany, Spain, or Argentina did not even make it to the semi-finals. In contrast, teams like host Russia and finalist Croatia advanced further than expected. However, does this really mean that the expectations of experts and fans were so wrong? Or, put differently, how surprising was the result given the pre-tournament predictions?

Therefore, we take a critical look back at our own Probabilistic Forecast for the 2018 FIFA World Cup based on the bookmaker consensus model, which aggregated the expert judgments of 26 bookmakers and betting exchanges. A set of presentation slides (in PDF format) explaining the model and its evaluation is available to accompany this blog post: slides.pdf

TL;DR

Despite some surprises in the tournament, the probabilistic bookmaker consensus forecast fit reasonably well. It is hard to evaluate probabilistic forecasts with only one realization of the tournament, but by and large most outcomes do not deviate systematically from the probabilities assigned to them.

However, there is one notable exception: Expectations about defending champion Germany were clearly wrong. “Die Mannschaft” was predicted to advance from the group stage to the round of 16 with probability 89.1% – and they not only failed to do so but instead came in last in their group.

Other events that were perceived as surprising were not so unlikely to begin with, e.g., for Argentina it was more likely to be eliminated before the quarter-finals (predicted probability: 51%) than to proceed further. Or they were not unlikely conditional on previous tournament events. Examples of the latter are the pre-tournament predictions for Belgium beating Brazil in a match (40%) or Russia beating Spain (33%). Of course, the opposite outcome of those matches was more likely, but compared with these predictions the results were maybe not as surprising as perceived by many. Finally, the pre-tournament probability of Croatia making it to the final was only 6%, but conditional on the events from the round of 16 (especially with Spain being eliminated) this increased to 27% (only surpassed by England with 36%).

Tournament animation

The animated GIF below shows the pre-tournament predictions for each team winning the 2018 FIFA World Cup. In the animation, the teams that “survived” over the course of the tournament are highlighted. This clearly shows that the elimination of Germany (winning probability: 15.8%) was the big surprise in the group stage, but otherwise almost all of the teams expected to proceed also did so. Afterwards, two of the other main favorites, Brazil (16.6%) and Spain (12.5%), dropped out, but eventually the fourth team with a double-digit winning probability (France, 12.1%) prevailed.

[Figure: tournament animation]

Correlations

Compared to other rankings of the teams in the tournament, the bookmaker consensus model did quite well. To illustrate this, we compute the Spearman rank correlation of the observed partial tournament ranking (1 FRA, 2 CRO, 3 BEL, 4 ENG, 6.5 URU, 6.5 BRA, …) with the bookmaker consensus model as well as with the Elo and FIFA ratings.

Method               Correlation
Bookmaker consensus  0.704
Elo rating           0.592
FIFA rating          0.411
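To make the computation concrete, here is a minimal Python sketch of a Spearman rank correlation with tied ranks, where teams eliminated in the same round share their mean rank (hence 6.5 for the quarter-final losers). As a stand-in for a model-based ranking it uses the group-stage survival probabilities from the tables further down, so it does not reproduce the 0.704 reported above, which rests on model output not listed in this post. The function names are illustrative only.

```python
from math import sqrt

def average_ranks(values):
    """Rank values ascending (1 = smallest); tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Plain Pearson correlation of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)

def spearman(x, y):
    """Spearman correlation = Pearson correlation of the (tie-averaged) ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Observed partial ranking: teams knocked out in the same round are tied.
observed = {"FRA": 1, "CRO": 2, "BEL": 3, "ENG": 4}
observed.update({t: 6.5 for t in ["URU", "BRA", "SWE", "RUS"]})
observed.update({t: 12.5 for t in
                 ["ARG", "POR", "ESP", "DEN", "MEX", "JPN", "SUI", "COL"]})
observed.update({t: 24.5 for t in
                 ["KSA", "EGY", "IRN", "MAR", "PER", "AUS", "NGA", "ISL",
                  "SRB", "CRC", "KOR", "GER", "TUN", "PAN", "SEN", "POL"]})

# Stand-in predictor (assumption): probability (in %) to survive the group stage.
survive = {
    "URU": 68.1, "RUS": 64.2, "KSA": 19.2, "EGY": 39.3,
    "ESP": 85.9, "POR": 66.3, "IRN": 26.5, "MAR": 27.3,
    "FRA": 87.0, "DEN": 46.7, "PER": 31.7, "AUS": 25.2,
    "CRO": 58.7, "ARG": 78.7, "NGA": 41.2, "ISL": 30.9,
    "BRA": 89.9, "SUI": 45.4, "SRB": 39.0, "CRC": 22.6,
    "SWE": 44.5, "MEX": 45.2, "KOR": 26.8, "GER": 89.1,
    "BEL": 81.7, "ENG": 75.6, "TUN": 23.5, "PAN": 23.2,
    "COL": 64.6, "JPN": 36.3, "SEN": 37.9, "POL": 57.9,
}

teams = sorted(observed)
# Negate the probabilities so that rank 1 is the most likely survivor.
rho = spearman([-survive[t] for t in teams], [observed[t] for t in teams])
print(f"Spearman correlation: {rho:.3f}")
```

Swapping in a different rating only requires replacing the `survive` dictionary with the corresponding scores for the same 32 teams.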

Match probabilities

As there is no good way to assess the predicted probabilities for winning the title with only one realization of the tournament, we at least (roughly) assess the quality of the predicted probabilities for the individual matches. To do so, we split the 63 matches into three groups, depending on the winning probability of the stronger team.

[Figure: pairwise probability evaluation]

This gives us matches that were predicted to be almost even (50-58%), had moderate advantages for the stronger team (58-72%), or had clear advantages for the stronger team (72-85%). It turns out that in the latter two groups the average predicted probabilities (dashed red line) match the actual observed proportions quite well. Only in the “almost even” group did the stronger teams win slightly more often than expected.
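The binning described above can be sketched in a few lines of Python. The match list here is entirely hypothetical (the post does not list the per-match probabilities), and two choices are assumptions of this sketch: any result other than a win for the stronger team counts as "not won", and the upper edge of each bin is exclusive.

```python
# Hypothetical data: (predicted win probability of the stronger team, did it win?)
matches = [
    (0.52, True), (0.55, False), (0.57, True),
    (0.61, True), (0.66, False), (0.70, True),
    (0.74, True), (0.80, True), (0.84, False),
]

# Bin edges taken from the text: almost even, moderate, and clear advantage.
bins = [
    ("almost even", 0.50, 0.58),
    ("moderate advantage", 0.58, 0.72),
    ("clear advantage", 0.72, 0.85),
]

def calibration_table(matches, bins):
    """Per bin: number of matches, mean predicted probability, observed win share."""
    rows = []
    for label, lo, hi in bins:
        group = [(p, won) for p, won in matches if lo <= p < hi]
        if not group:
            continue  # skip empty bins rather than divide by zero
        mean_pred = sum(p for p, _ in group) / len(group)
        observed = sum(1 for _, won in group if won) / len(group)
        rows.append((label, len(group), mean_pred, observed))
    return rows

for label, n, mean_pred, observed in calibration_table(matches, bins):
    print(f"{label:20s} n={n}  predicted={mean_pred:.2f}  observed={observed:.2f}")
```

A well-calibrated forecast shows predicted and observed values close to each other in every bin, which is what the comparison with the dashed red line checks.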

Group stage probabilities

As already mentioned above, there was only one big surprise in the group stage – with Germany being eliminated. As the tables below show, most other results from the group rankings conformed quite well with the predicted probabilities to “survive” the group stage.

Group A
Rank  Team  Prob. (in %)
1     URU   68.1
2     RUS   64.2
3     KSA   19.2
4     EGY   39.3

Group B
Rank  Team  Prob. (in %)
1     ESP   85.9
2     POR   66.3
3     IRN   26.5
4     MAR   27.3

Group C
Rank  Team  Prob. (in %)
1     FRA   87.0
2     DEN   46.7
3     PER   31.7
4     AUS   25.2

Group D
Rank  Team  Prob. (in %)
1     CRO   58.7
2     ARG   78.7
3     NGA   41.2
4     ISL   30.9

Group E
Rank  Team  Prob. (in %)
1     BRA   89.9
2     SUI   45.4
3     SRB   39.0
4     CRC   22.6

Group F
Rank  Team  Prob. (in %)
1     SWE   44.5
2     MEX   45.2
3     KOR   26.8
4     GER   89.1

Group G
Rank  Team  Prob. (in %)
1     BEL   81.7
2     ENG   75.6
3     TUN   23.5
4     PAN   23.2

Group H
Rank  Team  Prob. (in %)
1     COL   64.6
2     JPN   36.3
3     SEN   37.9
4     POL   57.9
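One rough way to see that Germany stands out is to compute, for every team, the probability the forecast assigned to what actually happened: p for the top two teams in each group (who advanced) and 1 - p for the other two. The Python sketch below does this with the numbers from the tables. Reporting the negative log probability in bits ("surprisal") is an addition for illustration, not something the original evaluation computes.

```python
from math import log2

# Survival probabilities (in %) from the tables, listed by final group rank.
groups = {
    "A": [("URU", 68.1), ("RUS", 64.2), ("KSA", 19.2), ("EGY", 39.3)],
    "B": [("ESP", 85.9), ("POR", 66.3), ("IRN", 26.5), ("MAR", 27.3)],
    "C": [("FRA", 87.0), ("DEN", 46.7), ("PER", 31.7), ("AUS", 25.2)],
    "D": [("CRO", 58.7), ("ARG", 78.7), ("NGA", 41.2), ("ISL", 30.9)],
    "E": [("BRA", 89.9), ("SUI", 45.4), ("SRB", 39.0), ("CRC", 22.6)],
    "F": [("SWE", 44.5), ("MEX", 45.2), ("KOR", 26.8), ("GER", 89.1)],
    "G": [("BEL", 81.7), ("ENG", 75.6), ("TUN", 23.5), ("PAN", 23.2)],
    "H": [("COL", 64.6), ("JPN", 36.3), ("SEN", 37.9), ("POL", 57.9)],
}

def outcome_probabilities(groups):
    """Probability the forecast assigned to each team's actual group-stage fate."""
    result = {}
    for teams in groups.values():
        for rank, (team, pct) in enumerate(teams, start=1):
            p = pct / 100
            # Ranks 1 and 2 advanced; ranks 3 and 4 were eliminated.
            result[team] = p if rank <= 2 else 1 - p
    return result

probs = outcome_probabilities(groups)

# List the five least expected outcomes, most surprising first.
for team in sorted(probs, key=probs.get)[:5]:
    print(f"{team}: probability {probs[team]:.3f}, "
          f"surprisal {-log2(probs[team]):.2f} bits")
```

Germany's elimination comes out as the least likely outcome by a wide margin (probability 0.109), in line with the assessment above that it was the one big surprise of the group stage.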
