How to Scrape Data from Euroleague

[This article was first published on R – Predictive Hacks, and kindly contributed to R-bloggers.]

This post shows how to get the results of Euroleague games in a structured form. The example uses the 2016-2017 season, but you can adapt it to any season: you need the URL of each team's page and the season, which is set by the seasoncode parameter in that URL.

Let’s start coding:

library(tidyverse)
library(rvest)

# For each team, read its page and keep the first table, which holds the games
IST<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=IST&seasoncode=E2016#!games")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p
BAS<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=BAS&seasoncode=E2016#!games")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p
BAM<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=BAM&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()

RED<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=RED&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
CSK<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=CSK&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p
#NEW
DAR<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=DAR&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p

MIL<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=MIL&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
BAR<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=BAR&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
ULK<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=ULK&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p

#NEW
GAL<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=GAL&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
TEL<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=TEL&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
OLY<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=OLY&seasoncode=E2016#!games")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p

PAN<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=PAN&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p
MAD<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=MAD&seasoncode=E2016#!games")%>% html_nodes("table")%>%.[[1]]%>%html_table() #p
#NEW
UNK<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=UNK&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()

ZAL<-read_html("http://www.euroleague.net/competition/teams/showteam?clubcode=ZAL&seasoncode=E2016")%>% html_nodes("table")%>%.[[1]]%>%html_table()
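
The calls above differ only in the club code, so the URL construction and the table extraction could be wrapped in a small helper. The following is a minimal sketch rather than the original author's code: team_url() and scrape_team() are hypothetical names, and it assumes, as the code above does, that the games table is the first table on the team page.

```r
library(rvest)

# Hypothetical helper: build the team page URL from a club code and a season code
team_url <- function(clubcode, seasoncode = "E2016") {
  paste0("http://www.euroleague.net/competition/teams/showteam",
         "?clubcode=", clubcode, "&seasoncode=", seasoncode)
}

# Hypothetical helper: read the team page and return its first table
scrape_team <- function(clubcode, seasoncode = "E2016") {
  read_html(team_url(clubcode, seasoncode)) %>%
    html_nodes("table") %>%
    .[[1]] %>%
    html_table()
}

# Example call (requires network access):
# IST <- scrape_team("IST")
```

Passing a different seasoncode would target another season, assuming the site keeps the same URL pattern.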


IST$Team<-c("Anadolu Efes Istanbul")
MIL$Team<-c("EA7 Emporio Armani Milan")
BAS$Team<-c("Baskonia Vitoria Gasteiz")

BAM$Team<-c("Brose Bamberg")
RED$Team<-c("Crvena Zvezda mts Belgrade")
CSK$Team<-c("CSKA Moscow")

BAR$Team<-c("FC Barcelona Lassa")
ULK$Team<-c("Fenerbahce Istanbul")
DAR$Team<-c("Darussafaka Dogus Istanbul")

TEL$Team<-c("Maccabi FOX Tel Aviv")
OLY$Team<-c("Olympiacos Piraeus")
PAN$Team<-c("Panathinaikos Superfoods Athens")

MAD$Team<-c("Real Madrid")
UNK$Team<-c("Unics Kazan")
GAL$Team<-c("Galatasaray Odeabank Istanbul")
ZAL$Team<-c("Zalgiris Kaunas")
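
The same labelling could be driven by a lookup table instead of one assignment per team. This is a minimal sketch, assuming the scraped tables are kept in a named list keyed by club code; the list tables below is a made-up stand-in, and only two of the team names above are shown.

```r
# Club code -> display name (two entries shown; extend with the remaining teams)
team_names <- c(IST = "Anadolu Efes Istanbul", MAD = "Real Madrid")

# Made-up stand-in for the scraped tables, keyed by club code
tables <- list(
  IST = data.frame(X1 = "1", X4 = "80 - 70", stringsAsFactors = FALSE),
  MAD = data.frame(X1 = "1", X4 = "90 - 85", stringsAsFactors = FALSE)
)

# Add the Team column to each table by looking up its club code
tables <- Map(function(tbl, code) {
  tbl$Team <- team_names[[code]]
  tbl
}, tables, names(tables))

# Stack everything into one data frame
all_games <- do.call(rbind, tables)
```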



# Stack all teams, keep only the rows whose fourth column holds a score, then derive the fields
df<-rbind(IST, MIL, BAS, BAM, RED, CSK, BAR, ULK, GAL, TEL, OLY, PAN, MAD, UNK, DAR, ZAL)%>%
  filter(!grepl("^[A-Za-z]", X4))%>%  # drop rows where X4 starts with a letter, i.e. is not a score
  mutate(Opponent = substr(X3, 4, nchar(X3)), HomeVisitor = ifelse(substr(X3, 1, 2)=="vs", "Home", "Visitor"), Score = X4)%>%
  separate(Score, into = c('HScore', 'VScore'), sep="-")%>%
  mutate(HScore = as.numeric(trimws(HScore)), VScore = as.numeric(trimws(VScore)),
         TeamScore = ifelse(HomeVisitor=='Home', HScore, VScore),
         OpponentScore = ifelse(HomeVisitor!='Home', HScore, VScore))%>%
  rename(Game=X1, WL=X2)%>%
  select(Game, Team, Opponent, TeamScore, OpponentScore, HomeVisitor, WL)
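
To make the parsing step easier to follow, here is a small sketch that runs the same logic on two made-up rows. The layout is an assumption inferred from the code above: X3 holds "vs <opponent>" for a home game or "at <opponent>" for an away game, and X4 holds the score as "home - visitor". The team names and scores are invented for illustration.

```r
library(dplyr)
library(tidyr)

# Made-up rows in the assumed layout of the scraped table
toy <- data.frame(
  X1 = c("1", "2"),
  X2 = c("W", "L"),
  X3 = c("vs Team B", "at Team C"),
  X4 = c("80 - 70", "90 - 85"),
  Team = "Team A",
  stringsAsFactors = FALSE
)

parsed <- toy %>%
  # The opponent name starts at character 4, after "vs " or "at "
  mutate(Opponent = substr(X3, 4, nchar(X3)),
         HomeVisitor = ifelse(substr(X3, 1, 2) == "vs", "Home", "Visitor")) %>%
  # Split the score into the home and visitor parts
  separate(X4, into = c("HScore", "VScore"), sep = "-") %>%
  # Assign each part to the team or the opponent depending on who played at home
  mutate(HScore = as.numeric(trimws(HScore)),
         VScore = as.numeric(trimws(VScore)),
         TeamScore = ifelse(HomeVisitor == "Home", HScore, VScore),
         OpponentScore = ifelse(HomeVisitor == "Home", VScore, HScore))

parsed[, c("Team", "Opponent", "HomeVisitor", "TeamScore", "OpponentScore")]
```

In the second row the team plays away, so its own score is the second number (85) and the opponent's is the first (90), which matches the "L" in X2.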

Let’s see what df looks like:

This is a good starting point in case you want to build a predictive model.
