Different winners under different criteria
A few posts ago (see here), I noted that there is a group of 7 teams in the English Premier League (EPL) that seems to be a cut above the rest:
- Arsenal
- Chelsea
- Everton
- Liverpool
- Manchester City
- Manchester United
- Tottenham Hotspur
Who is the best among these 7 teams? In thinking about this question, I realized that there are many different ways to measure what "best" means. So, instead of trying to figure out who is best (and getting caught between raging fans), I challenged myself to do the following: for each of the 7 teams, can I find a metric of success on which that team comes out best? Here, I am limiting myself to metrics that can be derived from just league rank and points.
tl;dr: I couldn’t find one for Everton, and the one I found for Liverpool is rather forced. Can you find measures against which Liverpool or Everton come out tops?
Code snippets used to obtain figures in this post are provided; the full R script can be found here.
The data we are using is stored in the variable standings, with one row per club per season; the snippets below use its Season, Club, Rank and Points columns.
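For anyone following along without the full script, here is a minimal sketch of the shape the snippets in this post assume: one row per club per season, with Season, Club, Rank and Points columns. The toy_standings name and every value in it are made up for illustration and are not real EPL results.

```r
library(dplyr)

# Toy stand-in for `standings`: made-up ranks and points, NOT real EPL results.
toy_standings <- tibble(
  Season = rep(2015:2017, each = 3),
  Club   = rep(c("Arsenal", "Chelsea", "Everton"), times = 3),
  Rank   = c(1, 2, 3,  2, 1, 3,  3, 1, 2),
  Points = c(80, 75, 70,  78, 82, 65,  70, 85, 72)
)

# Same pattern as the snippets in this post: filter, group, summarize, sort.
toy_champions <- toy_standings %>%
  filter(Rank == 1) %>%
  group_by(Club) %>%
  summarize(Champions = n()) %>%
  arrange(desc(Champions))

toy_champions
```

Swapping toy_standings for the real standings data frame should reproduce the figures quoted in the rest of the post.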
Most championships in last 10 years: Chelsea, Man C, Man U (3)
standings %>% filter(Rank == 1) %>% group_by(Club) %>% summarize(Champions = n()) %>% arrange(desc(Champions))
Most top 3 finishes in last 10 years: Chelsea, Man C (7), Man U (6)
standings %>% filter(Rank <= 3) %>% group_by(Club) %>% summarize(`Top 3` = n()) %>% arrange(desc(`Top 3`))
Most top 5 finishes in last 10 years: Arsenal, Man C (9), Chelsea, Man U, Tottenham (8)
standings %>% filter(Rank <= 5) %>% group_by(Club) %>% summarize(`Top 5` = n()) %>% arrange(desc(`Top 5`))
Most points in last 10 years: Man U (783), Man C (764), Chelsea (762)
standings %>% group_by(Club) %>% summarize(`Total Points` = sum(Points)) %>% arrange(desc(`Total Points`))
Best median rank in last 10 years: Man U (2), Man C (2.5), Chelsea (3)
standings %>% group_by(Club) %>% summarize(`Median Rank` = median(Rank)) %>% arrange(`Median Rank`)
Best median rank in last 5 years: Man C (2), Chelsea (3), Tottenham (3)
standings %>% filter(Season >= 2013) %>% group_by(Club) %>% summarize(`Median Rank` = median(Rank)) %>% arrange(`Median Rank`)
Best worst rank in last 10 years: Arsenal (6), Man U (7), Liverpool (8), Tottenham (8)
standings %>% group_by(Club) %>% summarize(`Worst Rank` = max(Rank)) %>% arrange(`Worst Rank`)
Best worst rank in last 3 years: Tottenham (3), Man C (4), Arsenal (6), Man U (6)
standings %>% filter(Season >= 2015) %>% group_by(Club) %>% summarize(`Worst Rank` = max(Rank)) %>% arrange(`Worst Rank`)
Most consistent ranking by standard deviation: Arsenal (1.14), Tottenham (1.72), Everton (2.12)
standings %>%
  filter(Club %in% c("Arsenal", "Chelsea", "Everton", "Liverpool",
                     "Manchester United", "Manchester City", "Tottenham Hotspur")) %>%
  group_by(Club) %>%
  summarize(sd = sd(Rank)) %>%
  arrange(sd)
Most consistent ranking by linear regression coefficient (slope of rank on season closest to zero): Liverpool (-0.024), Arsenal (0.133), Chelsea (0.261)
standings %>%
  filter(Club %in% c("Arsenal", "Chelsea", "Everton", "Liverpool",
                     "Manchester United", "Manchester City", "Tottenham Hotspur")) %>%
  group_by(Club) %>%
  summarize(beta = lm(Rank ~ Season)$coefficients[2]) %>%
  arrange(abs(beta))
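To make the regression-based metrics concrete: the coefficient is the slope of a straight line fitted to a club's ranks across seasons. A slope near zero means no overall upward or downward trend, and a negative slope means the rank number is falling, i.e. the club is improving. Here is a small sketch with made-up ranks for two hypothetical clubs (the vector names and values are purely illustrative):

```r
# Made-up ranks for two hypothetical clubs over five seasons (not real data).
seasons         <- 2013:2017
steady_ranks    <- c(4, 5, 4, 5, 4)   # bounces around, but with no trend
improving_ranks <- c(7, 6, 5, 4, 3)   # one place better every season

# Slope of rank on season: the same quantity as lm(Rank ~ Season)$coefficients[2].
slope_steady    <- unname(coef(lm(steady_ranks ~ seasons))[2])
slope_improving <- unname(coef(lm(improving_ranks ~ seasons))[2])

slope_steady      # essentially 0 (up to floating-point error)
slope_improving   # -1: rank drops by one place per season
```

Note that a near-zero slope only says there is no trend. It does not say the ranks never move, which is part of why this measure of "consistency" is a bit forced.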
Most improved team in last 5 years by linear regression coefficient (most negative slope of rank on season): Tottenham (-0.9), Man U (-0.8), Man C (0.1)
standings %>%
  filter(Season >= 2013) %>%
  filter(Club %in% c("Arsenal", "Chelsea", "Everton", "Liverpool",
                     "Manchester United", "Manchester City", "Tottenham Hotspur")) %>%
  group_by(Club) %>%
  summarize(beta = lm(Rank ~ Season)$coefficients[2]) %>%
  arrange(beta)