The Most Unpredictable League?

[This article was first published on Sport Data Science, and kindly contributed to R-bloggers].

Hello. I watch a lot of football in the Championship, and the commentators on Sky Sports often say “this is the most unpredictable league in the world”. It occurred to me that this is actually quite easy to test: all you need is a model that makes predictions for a range of leagues.

The model I will be using is the one from fivethirtyeight.com. Nate Silver's model makes predictions for a number of leagues and competitions across the world. Below is a link to the predictions

https://projects.fivethirtyeight.com/soccer-predictions/?ex_cid=rrpromo

and this is the methodology used:

So the idea is to compare how this model does across all the leagues: if the Championship is the most unpredictable league, then the model will perform worst on it. Simple logic. Just to be clear, this is not picking faults in the model. My skill is nowhere near that of Nate Silver and his team.

They produce a CSV file which has all the predictions since the 16/17 season. The first thing I'm going to look at is simply how many results the model has got correct in each league. If the Championship is the most unpredictable league then it will have the fewest correct predictions.
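A minimal sketch of this kind of check, written in Python and not taken from the repository linked at the end of the post. It assumes each match row carries a home-win, away-win and draw probability plus the final score. The column names (`league`, `prob1`, `prob2`, `probtie`, `score1`, `score2`) are an assumption about the CSV layout, and the rows are made-up illustrative values, not real predictions. It also assumes a prediction counts as "correct" when the most likely outcome is the one that happened, and it reports a share per league rather than a raw count so that leagues with different numbers of matches can be compared.

```python
from collections import defaultdict

# Illustrative rows only (NOT real predictions). Column names are an
# assumption about the layout of the predictions CSV.
sample_matches = [
    {"league": "League A", "prob1": 0.55, "prob2": 0.20, "probtie": 0.25, "score1": 2, "score2": 0},
    {"league": "League A", "prob1": 0.30, "prob2": 0.45, "probtie": 0.25, "score1": 1, "score2": 1},
    {"league": "League B", "prob1": 0.40, "prob2": 0.35, "probtie": 0.25, "score1": 0, "score2": 1},
    {"league": "League B", "prob1": 0.25, "prob2": 0.50, "probtie": 0.25, "score1": 0, "score2": 3},
    {"league": "League B", "prob1": 0.60, "prob2": 0.15, "probtie": 0.25, "score1": 1, "score2": 0},
]


def predicted_result(match):
    """Return the outcome the model rates as most likely."""
    probs = {"home": match["prob1"], "away": match["prob2"], "draw": match["probtie"]}
    return max(probs, key=probs.get)


def actual_result(match):
    """Return the outcome that actually happened, based on the final score."""
    if match["score1"] > match["score2"]:
        return "home"
    if match["score1"] < match["score2"]:
        return "away"
    return "draw"


def accuracy_by_league(rows):
    """Share of matches per league where the most likely outcome occurred."""
    correct = defaultdict(int)
    played = defaultdict(int)
    for match in rows:
        played[match["league"]] += 1
        if predicted_result(match) == actual_result(match):
            correct[match["league"]] += 1
    return {league: correct[league] / played[league] for league in played}


accuracy = accuracy_by_league(sample_matches)
```

With real data the rows would come from the CSV (for example via `csv.DictReader`, converting the probability and score fields to numbers and skipping matches that have not been played yet). Note that a rule based on the single most likely outcome will rarely pick a draw, which is one reason other definitions of "correct" are possible.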

This turns out not to be true. While the Championship is towards the lower end, it is some way above the league with the most incorrect predictions, which is League 2. The Chinese Super League looks to be the most predictable. Looking at the leagues near the top, the Barclays Premier League and the Scottish Premiership, there are teams that are a lot better than the others (the top 6 in the Premier League, Celtic and Rangers), which makes those leagues a lot more predictable.

In League 2 a lot of teams must be pretty evenly matched, meaning predictions are harder.

The trend for the English leagues shows that, over the three years for which there are predictions, the Championship has become more predictable. This could be down to improvements in the model or to the Championship itself becoming more predictable. It's impossible to tell which, though if it were model performance then other leagues would probably improve too.

The model includes a percentage chance for each result, so to look at how close a league is I compared the difference in win probability between the two teams in each match.

Measuring the difference in percentage chance between the favourite for a match and the other team, the Championship ends up overall as the 5th closest league. A smaller gap between the favourite and the other team means a league has a lot of parity and will therefore be unpredictable. Although so far the Championship has not been the most unpredictable league, it does seem to be an unpredictable one.
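The gap measure can be sketched in the same way. This is again a hedged Python illustration and not the code from the linked repository: it assumes the gap for a match is the absolute difference between the two teams' win probabilities, averaged per league. The column names (`league`, `prob1`, `prob2`) are an assumption about the CSV layout, and the rows are made-up values.

```python
from collections import defaultdict
from statistics import mean

# Illustrative rows only (NOT real predictions). Column names are an
# assumption about the layout of the predictions CSV.
sample_odds = [
    {"league": "League A", "prob1": 0.55, "prob2": 0.20},
    {"league": "League A", "prob1": 0.30, "prob2": 0.45},
    {"league": "League B", "prob1": 0.40, "prob2": 0.35},
    {"league": "League B", "prob1": 0.25, "prob2": 0.50},
]


def win_probability_gap(match):
    """Gap between the favourite's win probability and the other team's."""
    return abs(match["prob1"] - match["prob2"])


def mean_gap_by_league(rows):
    """Average favourite-versus-other gap for each league."""
    gaps = defaultdict(list)
    for match in rows:
        gaps[match["league"]].append(win_probability_gap(match))
    return {league: mean(values) for league, values in gaps.items()}


mean_gap = mean_gap_by_league(sample_odds)

# Leagues ordered from the smallest average gap (most parity) to the largest.
closest_first = sorted(mean_gap, key=mean_gap.get)
```

Taking the absolute difference means it does not matter whether the favourite is the home or the away side. A league's position in `closest_first` corresponds to the kind of ranking described above, where a smaller average gap indicates more evenly matched teams.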

Focusing on the Championship across multiple seasons, the difference in win probability doesn't seem to alter much. In fact, compared with the correct predictions, which showed a definite increase, there is no discernible change in the difference between the two teams' win probabilities.

I think overall the Championship is shown to be quite unpredictable. It is not the most unpredictable, but across a few measures it is shown to be among the group of most unpredictable leagues. There are two main reasons for a league being unpredictable: a lack of information about the league, or a lot of parity in the league. I think with the Championship it is definitely the latter. This has also shown up some predictable leagues, which may be fruitful for betting.

All my code should now be on my GitHub below

https://github.com/alexthom2/TheChampionship/blob/master/UnpreditableExploration.Rmd
