[This article was first published on More or Less Numbers, and kindly contributed to R-bloggers.]
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Last year I posted that the chances of a runner from a non-African country winning the Boston Marathon seemed good because of the widening interval of winning times (recent years had produced both historically “slower” races and historically “faster” ones), and this actually happened. Meb Keflezighi ran a remarkable race and was widely celebrated, as he represented the US in a race that had recently been dominated by runners from African countries. His time was obviously the fastest on the day, but others in the field had faster PRs. Because of the variation in winning times, my conclusion has been that certain runners representing non-African countries have an opportunity to contest the race well.
The number of participants from Africa in the elite field clearly increases the likelihood that the winner represents an African country. The runners in the elite field mostly fall into or below the confidence interval shown in the graph above, with the slight exception of Matt Tegenkamp, whose marathon PR of roughly 2:12 sits just above what the interval encompasses. It is clear that once again the elite field is dominated by African runners who are putting up some really impressive PRs.
And yet last year the dynamic was similar, despite the difference in PRs. Dennis Kimetto came to the race with a 2:03 PR, and Meb Keflezighi, whose previous PR was 2:09, won the Boston Marathon. So we have another great story this year: incredible athletes, some of whom have run much faster than others in the past. And yet, who can tell what will happen on race day?
But why try? Why did Meb think he could beat someone who, in marathon terms, could go somewhere he could not? More broadly, why do we love these events? Why should Matt Tegenkamp attempt to rival someone who would be nearly 2 miles ahead of him if each ran his best race? Variance. Among these elite athletes there is the notion that, on any given day, the guy next to you could be at his best or his worst. As spectators, we’re drawn to variance: we love the possibility that things will not turn out predictably, or that there is variation in what we assume to be true. Athletes place their hopes in this, that they could run their absolute best while others may not. Confidence intervals tell the story of variance, that statistically we can’t know for certain. I think that this year, yet again, we could see this same variance play out: the athlete who doesn’t have the fastest PR runs their best despite the odds. This is what makes a great race, and it is what we could see again tomorrow.
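For anyone who wants to check the gap of about 2 miles mentioned above, here is a rough sketch of the arithmetic. It is only an illustration under simplifying assumptions: both runners are assumed to hold a perfectly even pace equal to their PR pace, the PRs are rounded to the minute as quoted in this post (2:03 and 2:12), and the helper names `parse_hm` and `gap_at_finish` are made up for this example.

```python
MARATHON_MILES = 26.2188  # standard marathon distance (42.195 km) in miles

def parse_hm(time_str):
    """Convert an 'H:MM' marathon time such as '2:03' into minutes."""
    hours, minutes = time_str.split(":")
    return int(hours) * 60 + int(minutes)

def gap_at_finish(fast_pr, slow_pr, distance=MARATHON_MILES):
    """Miles the slower runner still has to run when the faster one finishes.

    Assumption: each runner holds an even pace equal to their PR pace.
    """
    fast_min = parse_hm(fast_pr)
    slow_min = parse_hm(slow_pr)
    # Distance the slower runner has covered after fast_min minutes.
    covered = distance * fast_min / slow_min
    return distance - covered

gap = gap_at_finish("2:03", "2:12")
print(f"Gap at the finish: {gap:.2f} miles")  # prints "Gap at the finish: 1.79 miles"
```

Under these assumptions the gap comes out a little under 2 miles, which is consistent with the rough figure in the text. The same function applied to last year's 2:03 and 2:09 PRs gives a gap of roughly 1.2 miles.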