Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Each week triplemfooty tweets out a free kick ladder that is based only on total free kick differential throughout the season.
But what if we did a ladder in the same style of the home and away ladder.
- 4 points for a win (winning free kick count in game)
- 2 points for a draw (drawn free kick count in game)
- 0 points for a loss (losing the free kick count in game)
- percentage which is \(100 * (totalfreesfor/totalfreesagainst)\)
Well thankfully to the new afl R package fitzRoy that is possible!
library(tidyverse) ## -- Attaching packages -------------------------------- tidyverse 1.2.1 -- ## v ggplot2 2.2.1 v purrr 0.2.5 ## v tibble 1.4.2 v dplyr 0.7.5 ## v tidyr 0.8.1 v stringr 1.3.1 ## v readr 1.1.1 v forcats 0.3.0 ## -- Conflicts ----------------------------------- tidyverse_conflicts() -- ## x dplyr::filter() masks stats::filter() ## x dplyr::lag() masks stats::lag() fitzRoy::get_footywire_stats(9514:9630)%>% group_by(Round, Team)%>% mutate(Free_diff=sum(FF)-sum(FA), sumFF=sum(FF), sumFA=sum(FA))%>% select(Team, sumFF, sumFA, Free_diff)%>% distinct()%>% mutate(points=if_else(Free_diff>0,4,if_else(Free_diff==0,2,0)))%>% group_by(Team)%>% summarise(total_points=sum(points), percentage_Frees=100*(sum(sumFF)/sum(sumFA)))%>% arrange(desc(total_points), desc(percentage_Frees)) ## Getting data from footywire.com ## Finished getting data ## Adding missing grouping variables: `Round` ## # A tibble: 18 x 3 ## Team total_points percentage_Frees ## <chr> <dbl> <dbl> ## 1 North Melbourne 38 118. ## 2 Western Bulldogs 38 115. ## 3 Collingwood 36 126. ## 4 Adelaide 36 117. ## 5 Brisbane 36 114. ## 6 West Coast 34 123. ## 7 Hawthorn 34 103. ## 8 Geelong 32 98.3 ## 9 Carlton 30 103. ## 10 Melbourne 28 98.6 ## 11 Port Adelaide 26 93.6 ## 12 Gold Coast 18 95.1 ## 13 Sydney 18 91.6 ## 14 Essendon 18 86.6 ## 15 GWS 16 83.5 ## 16 St Kilda 12 87.7 ## 17 Fremantle 10 85.8 ## 18 Richmond 8 77.6
So there has been a lot of talk recently about how West Coast receive a free kick advantage at home. The media hype has got to the point where the AFL will investigate.
So how can we use this ladder concept to see if there might be something there?
Lets first start by looking at the free kick ladder for 2018.
library(tidyverse) fitzRoy::get_footywire_stats(9514:9630)%>% group_by(Round, Team)%>% mutate(Free_diff=sum(FF)-sum(FA), sumFF=sum(FF), sumFA=sum(FA))%>% select(Team, sumFF, sumFA, Free_diff)%>% distinct()%>% mutate(points=if_else(Free_diff>0,4,if_else(Free_diff==0,2,0)))%>% group_by(Team)%>% summarise(total_points=sum(points), percentage_Frees=100*(sum(sumFF)/sum(sumFA)))%>% arrange(desc(total_points), desc(percentage_Frees)) ## Getting data from footywire.com ## Finished getting data ## Adding missing grouping variables: `Round` ## # A tibble: 18 x 3 ## Team total_points percentage_Frees ## <chr> <dbl> <dbl> ## 1 North Melbourne 38 118. ## 2 Western Bulldogs 38 115. ## 3 Collingwood 36 126. ## 4 Adelaide 36 117. ## 5 Brisbane 36 114. ## 6 West Coast 34 123. ## 7 Hawthorn 34 103. ## 8 Geelong 32 98.3 ## 9 Carlton 30 103. ## 10 Melbourne 28 98.6 ## 11 Port Adelaide 26 93.6 ## 12 Gold Coast 18 95.1 ## 13 Sydney 18 91.6 ## 14 Essendon 18 86.6 ## 15 GWS 16 83.5 ## 16 St Kilda 12 87.7 ## 17 Fremantle 10 85.8 ## 18 Richmond 8 77.6
So West Coast do well on frees but certainly they are not top of the comp.
But the perception is that West Coast are being favoured at home, so lets divide up this table into home and away.
df<-fitzRoy::get_footywire_stats(9514:9630)%>% group_by(Round, Team, Status)%>% mutate(Free_diff=sum(FF)-sum(FA), sumFF=sum(FF), sumFA=sum(FA))%>% select(Team, sumFF, sumFA, Free_diff)%>% distinct()%>% mutate(points=if_else(Free_diff>0,4,if_else(Free_diff==0,2,0)))%>% group_by(Team,Status)%>% summarise(total_points=sum(points), percentage_Frees=100*(sum(sumFF)/sum(sumFA)))%>% arrange(desc(total_points), desc(percentage_Frees)) ## Getting data from footywire.com ## Finished getting data ## Adding missing grouping variables: `Round`, `Status` df%>% filter(Status=="Home")%>% arrange(desc(total_points), desc(percentage_Frees)) ## # A tibble: 18 x 4 ## # Groups: Team [18] ## Team Status total_points percentage_Frees ## <chr> <chr> <dbl> <dbl> ## 1 Adelaide Home 24 168. ## 2 West Coast Home 24 155. ## 3 Western Bulldogs Home 22 113. ## 4 Hawthorn Home 22 103. ## 5 Collingwood Home 20 131. ## 6 Carlton Home 20 121. ## 7 North Melbourne Home 20 116. ## 8 Brisbane Home 20 115. ## 9 Geelong Home 20 105. ## 10 Melbourne Home 16 120. ## 11 Port Adelaide Home 16 107. ## 12 Essendon Home 12 101. ## 13 GWS Home 12 101. ## 14 Gold Coast Home 8 101. ## 15 Sydney Home 8 94.7 ## 16 Richmond Home 8 86.7 ## 17 Fremantle Home 4 83.7 ## 18 St Kilda Home 0 80 df%>% filter(Status=="Away")%>% arrange(desc(total_points), desc(percentage_Frees)) ## # A tibble: 18 x 4 ## # Groups: Team [18] ## Team Status total_points percentage_Frees ## <chr> <chr> <dbl> <dbl> ## 1 North Melbourne Away 18 119. ## 2 Collingwood Away 16 120. ## 3 Western Bulldogs Away 16 117. ## 4 Brisbane Away 16 114. ## 5 Hawthorn Away 12 102. ## 6 St Kilda Away 12 95.1 ## 7 Geelong Away 12 90.2 ## 8 Melbourne Away 12 87.9 ## 9 Adelaide Away 12 85 ## 10 West Coast Away 10 93.8 ## 11 Gold Coast Away 10 90.4 ## 12 Carlton Away 10 90.3 ## 13 Sydney Away 10 88.4 ## 14 Port Adelaide Away 10 80.7 ## 15 Fremantle Away 6 87.8 ## 16 Essendon Away 6 76.6 ## 17 GWS Away 4 71.5 ## 18 Richmond Away 0 72.5
Looking at our tables for Home vs Away, it would seem as though for this year at least that Adelaide is being favoured more, they lead the home free kick table are 4th on the overall table and just outside the eight on the away table. What’s the go with North Melbourne they lead the overall table and the away table but finish a mere 7th for games at home.
What about those tigers? For all the talk about home ground advantage and there being some influence on umpires it doesn’t seem to bother them if its 80,000 screaming tigers fans. Could the noise of affirmation over in the west, be louder than the tiger roar?
Other people might want to see a plot of free kicks differential but graphically. So lets do that.
fitzRoy::player_stats%>% select(-Player)%>% mutate(Round = parse_number(Round))%>% mutate(Season_round=Round+Season)%>% group_by(Season_round,Season, Round, Venue, Team, Status, Match_id,Date)%>% summarise(Fdiff=sum(FF)-sum(FA))%>% ggplot(aes(x=Season_round, y=Fdiff))+ geom_point(aes(colour = factor(Status))) + geom_hline(yintercept = 0)+ facet_wrap(~Team) ## Warning: 3212 parsing failures. ## row # A tibble: 5 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 7745 NA a number Qualifying Final row 2 7746 NA a number Qualifying Final col 3 7747 NA a number Qualifying Final expected 4 7748 NA a number Qualifying Final actual 5 7749 NA a number Qualifying Final ## ... ................. ... ....................................... ........ ....................................... ...... ....................................... ... ....................................... ... ....................................... ........ ....................................... ...... ....................................... ## See problems(...) for more details. ## Warning: Removed 146 rows containing missing values (geom_point).
Visually if we look at the WCE plot it does look as though they get the rub of the green at home but not away.
Are they just making less mistakes at home?
fitzRoy::player_stats%>% select(-Player)%>% mutate(Round = parse_number(Round))%>% mutate(Season_round=Round+Season)%>% group_by(Season_round,Date,Season, Round, Team, Status, Opposition, Venue, Match_id)%>% summarise_all(.funs = sum)%>% group_by(Match_id)%>% arrange(Match_id)%>% mutate_if(is.numeric, funs(difference=c(-diff(.), diff(.))))%>% ggplot(aes(x=Season_round,y=CG_difference))+ geom_point(aes(colour = factor(Status))) + geom_hline(yintercept = 0)+ facet_wrap(~Team) ## Warning: 3212 parsing failures. ## row # A tibble: 5 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 7745 NA a number Qualifying Final row 2 7746 NA a number Qualifying Final col 3 7747 NA a number Qualifying Final expected 4 7748 NA a number Qualifying Final actual 5 7749 NA a number Qualifying Final ## ... ................. ... ....................................... ........ ....................................... ...... ....................................... ... ....................................... ... ....................................... ........ ....................................... ...... ....................................... ## See problems(...) for more details. ## Warning: Removed 146 rows containing missing values (geom_point).
fitzRoy::player_stats%>% select(-Player)%>% mutate(Round = parse_number(Round))%>% mutate(Season_round=Round+Season)%>% group_by(Season_round,Date,Season, Round, Team, Status, Opposition, Venue, Match_id)%>% summarise_all(.funs = sum)%>% group_by(Match_id)%>% arrange(Match_id)%>% mutate_if(is.numeric, funs(difference=c(-diff(.), diff(.))))%>% ggplot(aes(x=Season_round,y=CG_difference))+ geom_point(aes(colour = factor(Status))) + geom_hline(yintercept = 0)+ facet_wrap(~Team) ## Warning: 3212 parsing failures. ## row # A tibble: 5 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 7745 NA a number Qualifying Final row 2 7746 NA a number Qualifying Final col 3 7747 NA a number Qualifying Final expected 4 7748 NA a number Qualifying Final actual 5 7749 NA a number Qualifying Final ## ... ................. ... ....................................... ........ ....................................... ...... ....................................... ... ....................................... ... ....................................... ........ ....................................... ...... ....................................... ## See problems(...) for more details. ## Warning: Removed 146 rows containing missing values (geom_point).
Again visually it looks like the eagles are making less mistakes at home.
What about their tackle differential?
fitzRoy::player_stats%>% select(-Player)%>% mutate(Round = parse_number(Round))%>% mutate(Season_round=Round+Season)%>% group_by(Season_round,Date,Season, Round, Team, Status, Opposition, Venue, Match_id)%>% summarise_all(.funs = sum)%>% group_by(Match_id)%>% arrange(Match_id)%>% mutate_if(is.numeric, funs(difference=c(-diff(.), diff(.))))%>% ggplot(aes(x=Season_round,y=T_difference))+ geom_point(aes(colour = factor(Status))) + geom_hline(yintercept = 0)+ facet_wrap(~Team) ## Warning: 3212 parsing failures. ## row # A tibble: 5 x 4 col row col expected actual expected <int> <int> <chr> <chr> actual 1 7745 NA a number Qualifying Final row 2 7746 NA a number Qualifying Final col 3 7747 NA a number Qualifying Final expected 4 7748 NA a number Qualifying Final actual 5 7749 NA a number Qualifying Final ## ... ................. ... ....................................... ........ ....................................... ...... ....................................... ... ....................................... ... ....................................... ........ ....................................... ...... ....................................... ## See problems(...) for more details. ## Warning: Removed 146 rows containing missing values (geom_point).
Doesn’t look like their is much difference, so for WCE it looks as though they do get more free kicks at home vs away, they make less mistakes at home but tackles don’t seem to change by home/away status.
So while these plots paint a picture, we should always dig in a bit more to see if there is something there, or if it is just the mind trying to spot a pattern when there is none.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.