More football

Gianluca Baio

10 years ago

[This article was first published on Gianluca Baio's blog, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Given that Sampdoria have lost their fourth game in a row, I am not really interested in football any more (as of yesterday, I actually find it a very boring game and think we should focus on real sports $-$ of course if we manage to break the crappy run, I’ll be back to liking it…). But, because Kees has done some more work using R2jags (here and here $-$ I’ve actually only seen this morning, although I think some of it at least is from a couple weeks back), I thought I would spend a few minutes on football anyway.

In the first, he’s modified his original model and (I guess) linked to our formulation; in his new Bayesian model, he specifies the number of goals scored by the two competing teams as Poisson variables; the log-linear predictor is then specified depending on the “home-effect”, the attack strength for that team and the defence strength of the opponents.

In our formulation, the attack and defence effects were modelled as exchangeable, which induced correlation among the different teams and therefore between the two observed variables. Kees uses a slightly different model where they both depend on an “overall” strength for each team (modelled as exchangeable) plus another exchangeable, team-specific “attack/defence balance” effect.

This is interesting, but the main problem seems to remain the fact that you tend to see clear mixtures of effects (ie “bad” teams having quite different behaviour from “good” teams); he tried to then use the mix module in JAGS to directly model the effects as mixtures of exchangeable Normal distributions, but seems to have run in some problems.

We did it slightly differently (ie using informative priors on the chance each team belongs to one of three categories: “top” teams; “strugglers”; and “average” teams). We only run it in BUGS, but because of our specification, we didn’t have problems with convergence or the interpretation (only slightly slower). I’ve not tried the JAGS version with the mix module (I suppose we should fiddle a bit with the original code $-$ which is here, by the way, is someone has time/interest in doing so). But it shouldn’t be a problem, I would think.

In Kees model, he needs to fix some of the effects to avoid convergence and identifiability issues, if I understand it correctly. Marta and I had always wanted to explore the mixture issue further, but never got around to have time to actually do it. If I only I could like football again…

To leave a comment for the author, please follow the link and comment on their blog: Gianluca Baio's blog.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.