The case of the missing zeroes
Political polling is a big industry these days, especially here in the US. Mainstream news outlets and many of the bigger political blogs commission their own polls to measure, for example, the popularity of a policy, an incumbent, or a candidate. In the last week, though, a very public spat has erupted between the left-leaning political blog Daily Kos and Research 2000, the polling firm Daily Kos commissioned to produce a series of polls for the site. An investigation by statistician-readers of Daily Kos suggested some unusual features of the polling data provided by Research 2000, mainly that the results from tracking polls appear to be underdispersed: they vary less from one poll to the next than you’d expect from sampling error alone, given the sample sizes. Other results suggest data drawn from unexpected distributions, as illustrated, for example, by a histogram of week-to-week changes in a measure of Obama’s favorability over a period of 60 weeks.
[Figure: histogram of week-to-week changes in a measure of Obama’s favorability over 60 weeks]
This and other anomalies in the data have led Daily Kos to sue Research 2000, claiming that they “handed Daily Kos fiction and claimed it was fact”. (On the other hand, Nate Silver of fivethirtyeight.com, while having his own reservations about Research 2000, suggests a less nefarious explanation for modifying data by hand.)
In any case, this story illustrates how statistical concepts (randomness, distributions, samples, data) have become increasingly prominent in political discourse, largely as a result of the growing use of polling, and how much now rides on the data those polls generate.
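To make the idea of underdispersed tracking polls concrete, here is a minimal Python sketch. It is not the Daily Kos readers' analysis and uses none of the Research 2000 data: the true support level, the sample size, the number of polls, and the "smoothing" rule are all hypothetical values chosen for illustration. It simulates honest polls, computes how much they should vary under simple random sampling, and compares that with a series whose swings have been artificially damped.

```python
import math
import random
import statistics


def expected_sd(p, n):
    """Sampling standard deviation, in percentage points, of a poll
    percentage when n respondents are drawn at random and a share p agree."""
    return 100 * math.sqrt(p * (1 - p) / n)


def simulate_polls(p, n, num_polls, rng):
    """Simulate num_polls independent polls of n respondents each and
    return the percentage answering 'yes' in each poll."""
    results = []
    for _ in range(num_polls):
        yes = sum(1 for _ in range(n) if rng.random() < p)
        results.append(100 * yes / n)
    return results


def dispersion_ratio(percentages, p, n):
    """Observed variance divided by the variance expected from sampling
    error alone. Near 1 is what honest sampling should produce; well
    below 1 means the series is underdispersed."""
    return statistics.variance(percentages) / expected_sd(p, n) ** 2


# All values below are hypothetical, chosen only for illustration.
P_TRUE, N_RESP, N_POLLS = 0.5, 600, 200
rng = random.Random(2010)  # fixed seed so the run is repeatable

honest = simulate_polls(P_TRUE, N_RESP, N_POLLS, rng)

# A deliberately "too smooth" series: every result is pulled halfway
# back toward 50%, which cuts the variance to a quarter.
smoothed = [50 + 0.5 * (x - 50) for x in honest]

ratio_honest = dispersion_ratio(honest, P_TRUE, N_RESP)
ratio_smoothed = dispersion_ratio(smoothed, P_TRUE, N_RESP)

print(f"expected SD: {expected_sd(P_TRUE, N_RESP):.2f} points")
print(f"honest polls, dispersion ratio:   {ratio_honest:.2f}")
print(f"smoothed polls, dispersion ratio: {ratio_smoothed:.2f}")
```

With 600 respondents and 50% true support, sampling error alone gives a standard deviation of about 2 percentage points. The honest series should land near a dispersion ratio of 1, while the damped series comes out at exactly a quarter of that, the kind of gap that makes a set of results look too stable to be real samples.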
Daily Kos: Research 2000: Problems in plain sight