UEFA Champions League Knockout Phase Draws: Monte Carlo Simulation with R
[This article was first published on Memo's Island, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The draw for the knockout phase of the 2012–13 UEFA Champions League will be held in Nyon on 20 December 2012. The rules of the draw are simple:
- The 8 group winners are seeded.
- The 8 group runners-up are unseeded.
- Teams from the same group or from the same association cannot be drawn against each other.
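The two exclusion rules can be written down as a logical matrix. The following is a minimal sketch, not part of the original script: it copies the association vectors used later in this post, assumes that position i in both vectors refers to the same group (as the simulation code does), and introduces the hypothetical names `sameAssociation`, `sameGroup` and `allowed`.

```r
# Associations of the 8 group winners and 8 runners-up, copied from the script
# in this post; position i in both vectors is assumed to be the same group.
winnerAssociation <- c('FR', 'DE', 'ES', 'DE', 'IT', 'DE', 'ES', 'ENG')
runnersUpTeamsAssociation <- c('PT', 'ENG', 'IT', 'ES', 'UA', 'ES', 'SCO', 'TR')

# sameAssociation[i, j] is TRUE when winner i and runner-up j share an association
sameAssociation <- outer(winnerAssociation, runnersUpTeamsAssociation, "==")
# sameGroup[i, j] is TRUE on the diagonal, i.e. when both come from group i
sameGroup <- diag(8) == 1

# allowed[i, j] is TRUE when winner i may be drawn against runner-up j
allowed <- !(sameAssociation | sameGroup)
print(sum(allowed))   # 50 of the 64 possible pairings remain
```

With these inputs, 14 of the 64 pairings are excluded, which matches the number of zero entries in each of the result tables in this post.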
The qualified teams, their groups and associations, the simulation call and the reporting of the results are written in R as follows. We use xtable for the HTML output; the other functions we use are listed after the tables:
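
    # Group winners (seeded); position i in every vector refers to the same group
    winnerTeams <- c('Paris-Saint-Germain', 'Schalke-04', 'Malaga', 'Borussia-Dortmund',
                     'Juventus', 'Bayern-Munich', 'Barcelona', 'Manchester-United')
    winnerAssociation <- c('FR', 'DE', 'ES', 'DE', 'IT', 'DE', 'ES', 'ENG')
    # Group runners-up (unseeded), in the same group order
    runnersUpTeams <- c('Porto', 'Arsenal', 'Milan', 'Real-Madrid',
                        'Shakhtar-Donetsk', 'Valencia', 'Celtic', 'Galatasaray')
    runnersUpTeamsAssociation <- c('PT', 'ENG', 'IT', 'ES', 'UA', 'ES', 'SCO', 'TR')

    # Pair counts: rows are winners, columns are runners-up
    countMatrix <- matrix(0, 8, 8)
    row.names(countMatrix) <- winnerTeams
    colnames(countMatrix) <- runnersUpTeams

    # drawMany, drawOneKnock and incMatrix (listed after the tables) must be defined first.
    # The counts are accumulated in the global countMatrix via incMatrix.
    many <- 20e6  # number of simulated draws
    system.time(drawMany(winnerTeams, winnerAssociation, runnersUpTeams,
                         runnersUpTeamsAssociation, countMatrix, many))
    countMatrix <- countMatrix / many  # convert counts to frequencies
    print(countMatrix)
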
The simulation results can be interpreted through the frequencies of the pairings, which estimate their probabilities. Many of these probabilities are close to each other, so the ranking should be read with some caution. For example, for Barcelona the most frequent opponents are Milan (0.23) and Arsenal (0.21). My guess is based on these frequencies: I select the maximum first, then the second maximum, and so on. Where frequencies tie, I select the pair with the higher count in the second table. Here are the predicted pairs, ordered from highest to lowest frequency:
Barcelona - Milan
Malaga - Arsenal
Bayern Munich - Real Madrid
Borussia Dortmund - Valencia
Manchester United - Celtic
Juventus - Galatasaray
Schalke 04 - Porto
Paris Saint-Germain - Shakhtar Donetsk
Note that the predicted pairs depend heavily on the selection strategy.
Table for the frequencies:
| | Porto | Arsenal | Milan | Real-Madrid | Shakhtar-Donetsk | Valencia | Celtic | Galatasaray |
|---|---|---|---|---|---|---|---|---|
| Paris-Saint-Germain | 0.00 | 0.13 | 0.14 | 0.18 | 0.12 | 0.18 | 0.12 | 0.12 |
| Schalke-04 | 0.12 | 0.00 | 0.15 | 0.19 | 0.12 | 0.19 | 0.13 | 0.12 |
| Malaga | 0.19 | 0.23 | 0.00 | 0.00 | 0.19 | 0.00 | 0.20 | 0.19 |
| Borussia-Dortmund | 0.13 | 0.14 | 0.15 | 0.00 | 0.13 | 0.19 | 0.13 | 0.13 |
| Juventus | 0.13 | 0.15 | 0.00 | 0.22 | 0.00 | 0.22 | 0.14 | 0.13 |
| Bayern-Munich | 0.13 | 0.14 | 0.15 | 0.19 | 0.13 | 0.00 | 0.13 | 0.13 |
| Barcelona | 0.18 | 0.21 | 0.23 | 0.00 | 0.19 | 0.00 | 0.00 | 0.18 |
| Manchester-United | 0.13 | 0.00 | 0.16 | 0.22 | 0.13 | 0.22 | 0.14 | 0.00 |
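A rough idea of how precise these frequencies are can be had from the binomial standard error. This is a sketch under the assumption that the 20 million simulated draws are independent, so that each pairing frequency behaves like a binomial proportion; `mcStdError` is a hypothetical helper name, not part of the original code.

```r
# Standard error of an estimated proportion p from n independent draws
mcStdError <- function(p, n) sqrt(p * (1 - p) / n)

many <- 20e6                  # number of simulated draws used in this post
se <- mcStdError(0.23, many)  # e.g. the Barcelona - Milan frequency
print(se)                     # on the order of 1e-4
```

Under that assumption, the differences visible at two decimal places are well above the simulation noise.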
Table for the counts (out of 20 million simulated draws):
| | Porto | Arsenal | Milan | Real-Madrid | Shakhtar-Donetsk | Valencia | Celtic | Galatasaray |
|---|---|---|---|---|---|---|---|---|
| Paris-Saint-Germain | 0 | 2589164 | 2897024 | 3658337 | 2356581 | 3658892 | 2494247 | 2345755 |
| Schalke-04 | 2348458 | 0 | 2924099 | 3735314 | 2371245 | 3743246 | 2517610 | 2360028 |
| Malaga | 3781019 | 4506559 | 0 | 0 | 3845872 | 0 | 3997679 | 3868871 |
| Borussia-Dortmund | 2500221 | 2819591 | 3098797 | 0 | 2542013 | 3863888 | 2646626 | 2528864 |
| Juventus | 2659773 | 2983255 | 0 | 4393444 | 0 | 4392158 | 2892103 | 2679267 |
| Bayern-Munich | 2500835 | 2818331 | 3098181 | 3866890 | 2542742 | 0 | 2647340 | 2525681 |
| Barcelona | 3620075 | 4283100 | 4684023 | 0 | 3721268 | 0 | 0 | 3691534 |
| Manchester-United | 2589619 | 0 | 3297876 | 4346015 | 2620279 | 4341816 | 2804395 | 0 |
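The counts can also be inspected directly in R. The sketch below retypes the counts table into a matrix (the names `winners`, `runners`, `counts` and `best` are illustrative, not from the original script), checks that every row adds up to the 20 million draws, and picks the most frequent opponent for each group winner.

```r
winners <- c('Paris-Saint-Germain', 'Schalke-04', 'Malaga', 'Borussia-Dortmund',
             'Juventus', 'Bayern-Munich', 'Barcelona', 'Manchester-United')
runners <- c('Porto', 'Arsenal', 'Milan', 'Real-Madrid', 'Shakhtar-Donetsk',
             'Valencia', 'Celtic', 'Galatasaray')
# Counts retyped from the counts table, one row per group winner
counts <- matrix(c(
        0, 2589164, 2897024, 3658337, 2356581, 3658892, 2494247, 2345755,
  2348458,       0, 2924099, 3735314, 2371245, 3743246, 2517610, 2360028,
  3781019, 4506559,       0,       0, 3845872,       0, 3997679, 3868871,
  2500221, 2819591, 3098797,       0, 2542013, 3863888, 2646626, 2528864,
  2659773, 2983255,       0, 4393444,       0, 4392158, 2892103, 2679267,
  2500835, 2818331, 3098181, 3866890, 2542742,       0, 2647340, 2525681,
  3620075, 4283100, 4684023,       0, 3721268,       0,       0, 3691534,
  2589619,       0, 3297876, 4346015, 2620279, 4341816, 2804395,       0),
  nrow = 8, byrow = TRUE, dimnames = list(winners, runners))

# Each winner is paired exactly once per draw, so every row sums to 20e6
stopifnot(all(rowSums(counts) == 20e6))

# Most frequent opponent for each winner, taken row by row
best <- setNames(runners[apply(counts, 1, which.max)], winners)
print(best)

# Row-wise maxima are not a valid draw: some runners-up are chosen more than once
print(table(best))
```

Taken row by row, Real-Madrid and Valencia each come out as the most frequent opponent of three different winners, so the row-wise maxima cannot all occur together. A selection strategy is therefore needed to turn the frequencies into one consistent set of eight pairs.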
This is the main simulation function:

    drawMany <- function(winnerTeams, winnerAssociation, runnersUpTeams,
                         runnersUpTeamsAssociation, countMatrix, many) {
      for (i in 1:many) {
        # Redraw until a valid draw comes back (drawOneKnock returns -1 on failure)
        repeat {
          dr <- drawOneKnock(winnerTeams, winnerAssociation, runnersUpTeams,
                             runnersUpTeamsAssociation, 0)
          if (sum(dr) > 0) break
        }
        # Increment the global countMatrix for each of the 8 pairs (dr holds indices)
        updateCount <- mapply(incMatrix, dr[, 1], dr[, 2])
      }
    }

A single draw can be generated as follows:

    drawOneKnock <- function(winnerTeams, winnerAssociation, runnersUpTeams,
                             runnersUpTeamsAssociation, names = 1) {
      k <- 1
      repeat {
        k <- k + 1
        if (k > 1000) return(-1)   # give up after too many invalid draws
        blockWin <- 1:8            # tracking for draw: winners still in the pot
        blockRun <- blockWin       # runners-up still in the pot
        winners <- c()             # draw results
        runners <- c()
        for (i in 1:7) {
          kk <- 1
          repeat {
            kk <- kk + 1
            if (kk > 1000) return(-1)  # probably no valid pair left
            winner <- sample(blockWin, 1)
            runner <- sample(blockRun, 1)
            # Accept only pairs from different groups and different associations
            if (!(runner == winner) &&
                !(winnerAssociation[winner] == runnersUpTeamsAssociation[runner])) {
              break
            }
          }
          blockWin <- blockWin[-which(blockWin == winner)]
          blockRun <- blockRun[-which(blockRun == runner)]
          winners <- c(winners, winner)
          runners <- c(runners, runner)
        }
        winner <- blockWin
        runner <- blockRun
        # Check if the last remaining pair is ok, otherwise invalidate the draw
        if (!(runner == winner) &&
            !(winnerAssociation[winner] == runnersUpTeamsAssociation[runner])) {
          winners <- c(winners, blockWin)
          runners <- c(runners, blockRun)
          if (names) dr <- cbind(winnerTeams[winners], runnersUpTeams[runners])
          if (!names) dr <- cbind(winners, runners)
          break
        }
      }
      dr
    }

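Note that `sample(x, 1)` treats a single number `x` as the range `1:x`, so calling it on a pot with one team left would not necessarily return that team. The function above never does this, because the loop runs only seven times and the last pair is taken directly from what remains. The sketch below illustrates the behaviour, along with a hypothetical helper `sampleOne` (not part of the original code) that is safe for pots of any size.

```r
set.seed(1)
pot <- c(8)   # a pot with a single team index left

# sample() interprets a length-one numeric x >= 1 as 1:x
fromSample <- replicate(1000, sample(pot, 1))
print(range(fromSample))    # values come from 1:8, not only 8

# Hypothetical helper: always picks one element of x, whatever its length
sampleOne <- function(x) x[sample.int(length(x), 1)]
fromHelper <- replicate(1000, sampleOne(pot))
print(unique(fromHelper))   # always 8
```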
And counting the pair-ups is performed by a simple function:

    incMatrix <- function(i, j) {
      # <<- updates countMatrix in the enclosing (global) environment
      countMatrix[i, j] <<- countMatrix[i, j] + 1
      return(0)
    }

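As an alternative to calling `incMatrix` once per pair, the eight cells of a draw can be incremented in one step by indexing the matrix with a two-column matrix of (row, column) indices. This is a sketch on a made-up draw, not the implementation used in this post; `dr` here is a hypothetical result in the index form returned by `drawOneKnock` with `names = 0`.

```r
countMatrix <- matrix(0, 8, 8)

# Hypothetical draw in index form: column 1 = winner index, column 2 = runner-up index
dr <- cbind(winners = 1:8, runners = c(2, 3, 5, 6, 4, 7, 8, 1))

# A two-column matrix used as an index selects one cell per row of dr.
# Within a single draw every winner appears once, so no cell is repeated.
countMatrix[dr] <- countMatrix[dr] + 1

print(sum(countMatrix))   # 8 pairs counted
```

This avoids the global assignment with `<<-`, at the cost of having to keep `countMatrix` in the scope where the draws are generated.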