Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I wanted to add my two cents’ worth to the “Deflate-gate” statistics row that’s been going on recently. If you want to catch up with the story, you should start with the original articles at Sharp Football Analysis (1, 2) and this critique by Gregory J. Matthews and Michael Lopez. And if you want to get the data to follow along at home, it’s available from Sharp Football Analysis.
If you’re new to all of this, then the question is this: are the New England Patriots “fumble-proof”? In other words, for whatever reason, do the Patriots drop the ball significantly less often than their competitors in the NFL? A source cited by the original Sharp Football Analysis articles claimed that yes, the statistics do look odd. Given random fluctuation and assuming a normal distribution, they claimed that the chance of seeing the Patriots’ fumble performance was 1 in 16233.77. Matthews and Lopez then debunked this, highlighting that a normal distribution on fumbles is not the same as a normal distribution on fumbles per play.
But a normal distribution is a terrible choice here. A much better model is a binomial distribution. A fumble can be thought of like a coin toss: each play is an independent event with a constant chance of fumbling (to first order, ignoring pass vs run plays, weather, etc). So if we had a fumble rate of 50%, then after 100 plays we wouldn’t necessarily see exactly 50 fumbles, but roughly 95% of the time the number of fumbles would fall between 40 and 60.
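The coin-toss numbers above can be checked directly. The following is a minimal sketch in Python using only the standard library; the function names `binom_pmf` and `central_interval` are illustrative choices, not anything from the original analysis.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k fumbles in n independent plays with fumble chance p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def central_interval(n, p, level=0.95):
    """Return (low, high, coverage) for an equal-tailed range of fumble counts.

    low is the smallest count whose cumulative probability reaches alpha/2,
    high is the smallest count whose cumulative probability reaches 1 - alpha/2,
    coverage is the probability of landing in [low, high].
    """
    alpha = 1 - level
    cumulative = 0.0
    low = None
    below_low = 0.0  # probability mass strictly below low
    for k in range(n + 1):
        before = cumulative
        cumulative += binom_pmf(k, n, p)
        if low is None and cumulative >= alpha / 2:
            low = k
            below_low = before
        if cumulative >= 1 - alpha / 2:
            return low, k, cumulative - below_low
    return low, n, cumulative - below_low


low, high, coverage = central_interval(100, 0.5)
print(low, high, round(coverage, 3))  # 40 60 0.965
```

Because the number of fumbles is a whole number, the range cannot hit 95% exactly; 40 to 60 covers slightly more than that.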
The data linked above tells us the number of plays and fumbles for the rest of the NFL and for the Patriots for every year from 2000 to 2014. I’ve used this to calculate the fumble rate for the rest of the NFL for each year and then calculated the corresponding 95% confidence interval; in other words, how many fumbles would you expect to see given this league-wide fumble rate and the number of plays the Patriots ran? This is plotted below, alongside the actual performance of the Patriots. As you can see, the Patriots are on the lower side of the distribution, but in most years the observed number of fumbles is safely within the 95% confidence interval.
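For a single season, that calculation might look something like the sketch below. It assumes the interval is an equal-tailed binomial range, which is one reasonable reading of the description rather than a confirmed detail. The function name `expected_fumble_range` is hypothetical and the input numbers are invented for illustration; they are not taken from the Sharp Football Analysis data.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability, computed on the log scale so that seasons with
    a thousand or more plays do not overflow. Requires 0 < p < 1."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + k * log(p) + (n - k) * log(1 - p))


def expected_fumble_range(league_fumbles, league_plays, team_plays, level=0.95):
    """Return (rate, low, high): the league-wide fumble rate and the
    equal-tailed range of fumble counts expected for a team running
    team_plays plays at that rate."""
    rate = league_fumbles / league_plays
    alpha = 1 - level
    cumulative = 0.0
    low = None
    for k in range(team_plays + 1):
        cumulative += binom_pmf(k, team_plays, rate)
        if low is None and cumulative >= alpha / 2:
            low = k
        if cumulative >= 1 - alpha / 2:
            return rate, low, k
    return rate, low, team_plays


# Invented numbers, for illustration only (NOT from the real dataset).
rate, low, high = expected_fumble_range(
    league_fumbles=600, league_plays=30000, team_plays=1000
)
print(f"league rate {rate:.3f}, expected fumbles between {low} and {high}")
```

Repeating this for each year with the real counts would give one expected range per season to compare against the Patriots’ observed fumbles.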
The major exception to this is 2010, an outlier which corresponds to a probability of 0.04% or 1 in 2344. I don’t watch much (read: any) football so I have no idea why 2010 was special. But I am pretty sure that a binomial model is a better way of thinking about fumble rates, exceptional or not.
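One way a probability like this could be computed is as a lower-tail binomial probability: the chance of seeing the observed number of fumbles or fewer, given the league-wide rate. Whether the figure above is exactly this one-sided tail is an assumption, and the inputs below are invented rather than the real 2010 counts. The name `lower_tail_probability` is illustrative.

```python
from math import exp, lgamma, log


def binom_pmf(k, n, p):
    """Binomial probability on the log scale (requires 0 < p < 1)."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + k * log(p) + (n - k) * log(1 - p))


def lower_tail_probability(observed, plays, rate):
    """Chance of seeing `observed` fumbles or fewer in `plays` plays
    when each play has an independent fumble chance of `rate`."""
    return sum(binom_pmf(k, plays, rate) for k in range(observed + 1))


# Invented numbers, for illustration only (NOT the real 2010 figures).
p_value = lower_tail_probability(observed=8, plays=1000, rate=0.02)
print(f"probability {p_value:.4%}, or about 1 in {1 / p_value:.0f}")
```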