- If team abilities are available, pairwise winning probabilities can be derived for each possible match using a Bradley-Terry approach.
- Given pairwise winning probabilities, the whole tournament can be easily simulated to see which team proceeds to which stage in the tournament and which team finally wins.
- Such a tournament simulation can then be run sufficiently often (here 100,000 times) to obtain relative frequencies for each team winning the tournament.
Predictive Bookmaker Consensus Model for the UEFA Euro 2016
(By Achim Zeileis)
From 10 June to 10 July 2016 the best European football teams will meet in France to determine the European Champion in the UEFA European Championship 2016 tournament. For the first time 24 teams compete, expanding the format from 16 teams as in the previous five Euro tournaments. For forecasting the winning probability of each team a predictive model based on bookmaker odds from 19 online bookmakers is employed. The favorite is the host France with a forecasted winning probability of 21.5%, followed by the current World Champion Germany with a winning probability of 20.1%. The defending European Champion Spain follows after some gap with 13.7% and all remaining teams are predicted to have lower chances with England (9.2%) and Belgium (7.7%) being the “best of the rest”.
Furthermore, by complementing the bookmaker consensus results with simulations of the whole tournament, predicted pairwise probabilities for each possible game at the Euro 2016 are obtained along with “survival” probabilities for each team proceeding to the different stages of the tournament. For example, it can be determined that it is much more likely that top favorites France and Germany meet in the semifinal (7.8%) rather than in the final at the Stade de France (4.2%). The latter would be a re-match of the friendly game that was played on 13 November 2015 during the terrorist attacks in Paris and that France won 2-0. Hence it is perhaps better that the tournament draw favors a match in the semifinal at Marseille (with an almost even winning probability of 50.1% for France). The most likely final is then that either of the two teams plays against the defending champion Spain, with a probability of 5.7% for France vs. Spain and 5.4% for Germany vs. Spain.
All of these forecasts are the result of a bookmaker consensus rating proposed in Leitner, Hornik, and Zeileis (International Journal of Forecasting, 26(3), 471-481, 2010). This technique correctly predicted the winner of the FIFA World Cup 2010 and of the Euro 2012. For the Euro 2008 it missed the winner but correctly predicted the final, and for the FIFA World Cup 2014 it correctly predicted three out of four semifinalists. A new working paper about the UEFA Euro 2016, upon which this blog post is based, applies the same technique and is introduced here.
The core idea is to use the expert knowledge of international bookmakers. These have to judge all possible outcomes in a sports tournament such as the UEFA Euro and assign odds to them. Doing a poor job (i.e., assigning too high or too low odds) will cost them money. Hence, in our forecasts we solely rely on the expertise of 19 such bookmakers. Specifically, we (1) adjust the quoted odds by removing the bookmakers’ profit margins (or overround, typically around 15%), (2) aggregate and average these to a consensus rating, and (3) infer the corresponding tournament-draw-adjusted team abilities using the Bradley-Terry model for pairwise comparisons.
For step (1), it is assumed that the quoted odds are derived from the underlying “true” odds as: quoted odds = odds · α + 1, where + 1 is the stake (which is to be paid back to the bookmakers’ customers in case they win) and α is the proportion of the bets that is actually paid out by the bookmakers. The so-called overround is the remaining proportion 1 – α and is the main basis of the bookmakers’ profits (see also Wikipedia and the links therein). For the 19 bookmakers employed in this analysis, the median overround is a sizeable 15.1%. Subsequently, in step (2), the overround-adjusted odds are transformed to the log-odds (or logit scale), averaged for each team, and transformed back to winning probabilities (displayed in the barchart above).
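Steps (1) and (2) can be sketched in a few lines of Python. This is only a minimal illustration, not the code behind the forecasts: the function names and the example quotes are hypothetical, and it assumes that α is chosen separately for each bookmaker so that the adjusted winning probabilities sum to one, which the text above does not spell out.

```python
import math


def remove_overround(quoted_odds, tol=1e-12):
    """Return (alpha, probabilities) for one bookmaker's decimal odds.

    Inverts quoted = odds * alpha + 1, so odds = (quoted - 1) / alpha and
    p = 1 / (1 + odds) = alpha / (alpha + quoted - 1).
    Assumption: alpha is chosen so that the probabilities sum to 1.
    """
    def total(alpha):
        return sum(alpha / (alpha + q - 1.0) for q in quoted_odds)

    if any(q <= 1.0 for q in quoted_odds) or total(1.0) < 1.0:
        raise ValueError("need decimal odds > 1 with a non-negative overround")
    lo, hi = 0.0, 1.0  # total() increases in alpha, so bisection works
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if total(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    alpha = (lo + hi) / 2.0
    return alpha, [alpha / (alpha + q - 1.0) for q in quoted_odds]


def consensus_probabilities(prob_rows):
    """Average per-team probabilities across bookmakers on the logit scale."""
    n_books = len(prob_rows)
    result = []
    for team_probs in zip(*prob_rows):
        mean_logit = sum(math.log(p / (1.0 - p)) for p in team_probs) / n_books
        result.append(1.0 / (1.0 + math.exp(-mean_logit)))  # inverse logit
    return result


# Hypothetical quotes from two bookmakers for a three-team "tournament".
quotes = [[2.0, 3.0, 4.0], [2.1, 2.9, 3.9]]
adjusted = [remove_overround(q) for q in quotes]
for alpha, _ in adjusted:
    print(f"overround = {1.0 - alpha:.3f}")
print(consensus_probabilities([probs for _, probs in adjusted]))
```

Note that after averaging on the logit scale the back-transformed probabilities need not sum to exactly one; the sketch leaves them as they are.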
Finally, step (3) of the analysis uses the following idea:
Light gray signals that either team is almost equally likely to win a match between Teams A and B (probability between 40% and 60%). Light, medium, and dark blue/red correspond to small, moderate, and high probabilities of winning/losing a match between Team A and Team B. All probabilities are obtained from the Bradley-Terry model using the following equation for the winning probability: Pr(A beats B) = ability_A / (ability_A + ability_B).
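In code, the standard Bradley-Terry form of this winning probability is a one-liner. The sketch below is only an illustration: the function name and the ability values are made up and are not the abilities estimated for the Euro 2016 teams.

```python
def win_probability(ability_a, ability_b):
    """Bradley-Terry probability that team A beats team B."""
    return ability_a / (ability_a + ability_b)


# Hypothetical abilities, for illustration only.
abilities = {"Team A": 3.0, "Team B": 1.0, "Team C": 1.0}

for a in abilities:
    for b in abilities:
        if a != b:
            p = win_probability(abilities[a], abilities[b])
            print(f"Pr({a} beats {b}) = {p:.3f}")
```

Only the ratio of the two abilities matters, so two equally strong teams always get 50% each, and the probabilities for "A beats B" and "B beats A" sum to one.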
Clearly, the bookmakers perceive France and Germany to be the strongest teams in the tournament. The two are almost on par (with a probability of only 50.5% that France beats Germany) and have moderate (70-80%) to high (> 80%) probabilities of beating almost any other team in the tournament. The only teams that come close to even chances against them are Spain (with probabilities of 43.7% and 44.2% of beating France and Germany, respectively), England (with 38.7% and 39.1%), and Belgium (with 37.4% and 37.9%). Behind these two groups of the strongest teams there are several larger clusters of teams that have approximately the same strength (i.e., yielding approximately even chances in a pairwise comparison). Interestingly, two of the teams with very low strengths (Romania and Albania) compete in the same group A together with the favorite team France.
Additionally, the tournament simulation can be used not only to infer an estimated probability for the outcome of each individual match but also for the whole course of the tournament. The plot below shows the relative frequencies from the simulation for each team to “survive” over the tournament, i.e., proceed from the group phase to the round of 16, the quarter- and semifinals, and the final. France and Germany are the clear favorites within their respective groups A and C with almost 100% probability to make it to the round of 16 and also rather small drops in probability to proceed through the subsequent rounds. All remaining teams have much poorer chances to proceed to the later stages of the Euro 2016. Group B also has a rather clear favorite with England and all remaining teams following with a certain gap. In contrast, groups D and E each have a favorite (Spain and Belgium, respectively) but with a second strong contender (Croatia and Italy, respectively). Group F is a weaker but much more balanced group compared with the previous groups. Due to the new tournament system where 16 out of 24 teams proceed from the group phase to the next stage, even the weakest teams have probabilities of about 40% to reach at least the round of 16. However, many of these weak teams then have rather poor chances to make it to the quarterfinals, resulting in clear downward kinks in the survival curves.
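How such “survival” frequencies come out of a simulation can be illustrated with a deliberately simplified sketch. It assumes a plain single-elimination bracket of four hypothetical teams rather than the actual Euro 2016 format with its group phase, it uses made-up abilities, and all names in it are hypothetical.

```python
import random
from collections import Counter


def play(a, b, abilities, rng):
    """Draw the winner of one match from the Bradley-Terry probability."""
    p_a = abilities[a] / (abilities[a] + abilities[b])
    return a if rng.random() < p_a else b


def simulate_knockout(bracket, abilities, n_runs=10000, seed=1):
    """Relative frequency with which each team survives each knockout round.

    Simplifying assumption: a single-elimination bracket whose size is a
    power of two, with neighbouring entries in `bracket` meeting first.
    """
    n_rounds = len(bracket).bit_length() - 1
    if 2 ** n_rounds != len(bracket):
        raise ValueError("bracket size must be a power of two")
    rng = random.Random(seed)
    survived = Counter()  # (team, round number) -> number of runs
    for _ in range(n_runs):
        alive = list(bracket)
        for rnd in range(1, n_rounds + 1):
            alive = [play(alive[i], alive[i + 1], abilities, rng)
                     for i in range(0, len(alive), 2)]
            for team in alive:
                survived[(team, rnd)] += 1
    return {team: [survived[(team, rnd)] / n_runs
                   for rnd in range(1, n_rounds + 1)]
            for team in bracket}


# Hypothetical abilities; the last entry per team is its winning frequency.
abilities = {"A": 4.0, "B": 1.0, "C": 1.0, "D": 1.0}
for team, curve in simulate_knockout(["A", "B", "C", "D"], abilities).items():
    print(team, [round(x, 3) for x in curve])
```

Each team's list is non-increasing from round to round, which is what the survival curves in the plot display, and the winning frequencies across all teams add up to one.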
Needless to say, all predictions are probabilities and thus far from certain. While France taking the home victory is the most likely event in the bookmakers’ expert opinions, it is still far more likely that one of the other teams wins. This is one of the two reasons why we would recommend refraining from placing bets based on our analyses. The more important second reason, though, is that the bookmakers have a sizeable profit margin of about 15%, which ensures that the best chances of making money from sports betting lie with them. Hence, this should be kept in mind when placing bets. We, ourselves, will not place bets but focus on enjoying the exciting football tournament that the UEFA Euro 2016 will be with 100% predicted probability!
Working paper: Zeileis A, Leitner C, Hornik K (2016). “Predictive Bookmaker Consensus Model for the UEFA Euro 2016”, Working Paper 2016-15, Working Papers in Economics and Statistics, Research Platform Empirical and Experimental Economics, Universität Innsbruck. URL http://EconPapers.RePEc.org/RePEc:inn:wpaper:2016-15.