Couting Pairs
[This article was first published on Analysis of AFL, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Saw a tweet from [@matt_pavlich](https://twitter.com/matt_pavlich) asking twitter roughly how many games he and David Mundy have played together.
Thankfully, you don’t have to wonder anymore and you can reproduce the results yourself and do running counts for your favourite players!
library(tidyverse) ## ── Attaching packages ─────────────────────────── tidyverse 1.2.1 ── ## ✔ ggplot2 3.1.1 ✔ purrr 0.3.2 ## ✔ tibble 2.1.1 ✔ dplyr 0.8.0.1 ## ✔ tidyr 0.8.3 ✔ stringr 1.4.0 ## ✔ readr 1.3.1 ✔ forcats 0.4.0 ## ── Conflicts ────────────────────────────── tidyverse_conflicts() ── ## ✖ dplyr::filter() masks stats::filter() ## ✖ dplyr::lag() masks stats::lag() library(fitzRoy) df<-fitzRoy::get_afltables_stats(start_date = "1990-01-01", end_date=Sys.Date()) ## Returning data from 1990-01-01 to 2019-04-26 ## Downloading data ## ## Finished downloading data. Processing XMLs ## Warning: Detecting old grouped_df format, replacing `vars` attribute by ## `groups` ## Finished getting afltables data df$matchID<-paste(df$Season, df$Round, df$Home.team, df$Away.team) df$name<-paste(df$First.name, df$Surname) df_data<-df%>%filter(Playing.for=="Fremantle")%>%select(name, matchID, Playing.for) df_data %>% mutate(n = 1) %>% spread(name, n, fill=0) %>% select(-Playing.for, -matchID) %>% {crossprod(as.matrix(.))} %>% replace(lower.tri(., diag=T), NA) %>% reshape2::melt(na.rm=T) %>% unite('Pair', c('Var1', 'Var2'), sep=", ")%>% filter(value>150)%>% arrange(desc(value)) ## Pair value ## 1 Aaron Sandilands, Matthew Pavlich 232 ## 2 David Mundy, Matthew Pavlich 226 ## 3 Luke McPharlin, Matthew Pavlich 224 ## 4 David Mundy, Michael Johnson 222 ## 5 Aaron Sandilands, David Mundy 214 ## 6 Matthew Pavlich, Paul Hasleby 200 ## 7 Antoni Grover, Matthew Pavlich 191 ## 8 Matthew Pavlich, Michael Johnson 187 ## 9 David Mundy, Luke McPharlin 186 ## 10 Aaron Sandilands, Luke McPharlin 183 ## 11 David Mundy, Stephen Hill 183 ## 12 David Mundy, Ryan Crowley 175 ## 13 Aaron Sandilands, Michael Johnson 173 ## 14 Luke McPharlin, Michael Johnson 171 ## 15 Shane Parker, Shaun McManus 171 ## 16 Matthew Pavlich, Ryan Crowley 167 ## 17 Matthew Pavlich, Shaun McManus 164 ## 18 Michael Johnson, Ryan Crowley 163 ## 19 Matthew Pavlich, Peter Bell 160 ## 20 David Mundy, Garrick Ibbotson 158 ## 21 Chris Mayne, David Mundy 154 ## 22 David Mundy, Hayden Ballantyne 154 ## 23 David Mundy, Paul Duffield 153 ## 24 Antoni Grover, Paul Hasleby 153 ## 25 Matthew Pavlich, Stephen Hill 151
Some interesting things you might want to do now you have the script and data, you might want to see which pair has played the most together for each team.
Something you might also want to do is look the games that they played together in.
df_data%>% filter(name %in% c("David Mundy", "Matthew Pavlich")) %>% group_by(matchID)%>% count(n=n())%>% filter(n==2) ## # A tibble: 226 x 3 ## # Groups: matchID [226] ## matchID n nn ## <chr> <int> <int> ## 1 2005 10 Geelong Fremantle 2 2 ## 2 2005 11 Fremantle Brisbane Lions 2 2 ## 3 2005 12 Sydney Fremantle 2 2 ## 4 2005 13 Fremantle North Melbourne 2 2 ## 5 2005 14 Adelaide Fremantle 2 2 ## 6 2005 15 Fremantle Western Bulldogs 2 2 ## 7 2005 16 Carlton Fremantle 2 2 ## 8 2005 17 Fremantle Melbourne 2 2 ## 9 2005 18 Collingwood Fremantle 2 2 ## 10 2005 19 Fremantle Richmond 2 2 ## # … with 216 more rows
To leave a comment for the author, please follow the link and comment on their blog: Analysis of AFL.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.