Russian elections
[This article was first published on Wiekvoet, and kindly contributed to R-bloggers].
Just a few words about the Russian election. I read this entry, http://www.badscience.net/2012/03/is-there-statistical-evidence-of-fraud-in-the-russian-election-data/, and decided to look at the data myself. To me, the data do not seem good enough to answer the fraud question.
Download the data, read it in, and take a first look:
> library(gdata)  # assumption: read.xls() here is the one from the gdata package
> r1 <- read.xls("xxxxxxxxxxxxxx")
> head(r1)
projecturl id updt region uik obstrusted INVALID VALID
1 http://sms.golos.org 1 38324.72 27 650 1 4 323
2 http://sms.golos.org 2 38689.09 25 216 0 9 927
3 http://sms.golos.org 3 38324.72 38 732 1 7 1282
4 http://sms.golos.org 4 38324.72 25 291 0 14 1185
5 http://sms.golos.org 5 38324.72 38 668 0 15 1510
6 http://sms.golos.org 6 38324.72 27 198 0 15 1889
Zhirinovsky Zyuganov Mironov Prokhorov Putin
1 42 40 3 24 214
2 88 229 58 92 460
3 80 333 46 150 673
4 129 315 67 175 499
5 76 395 70 227 742
6 127 353 115 379 915
At first sight the data look good. Some columns are unknown to me, but region, VALID and the contenders look pretty straightforward.
Some regions occur only once, others quite often, and some are missing completely. The regions that occur exactly once:
> regs <- xtabs(~ region,data=r1)
> names(regs[regs==1])
[1] "13" "32" "43" "65" "75" "86" "87"
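The regions that are missing completely can be listed along the same lines. A minimal sketch, assuming region codes are whole numbers and using only the six region values printed by head(r1) above, so the result is illustrative and says nothing about the full file. The names missing_regions and single_regions are introduced here for illustration.

```r
# Region codes of the six rows printed by head(r1); the full file has many more
region <- c(27, 25, 38, 25, 38, 27)

# Count rows per region, keeping every code in the observed range as a level
# so that codes which never occur show up with a count of zero
regs <- table(factor(region, levels = min(region):max(region)))

# Regions inside the observed range that never occur
missing_regions <- names(regs[regs == 0])
print(missing_regions)

# Regions that occur exactly once (none among these six rows)
single_regions <- names(regs[regs == 1])
print(single_regions)
```

On the full data the same lines would work with r1$region in place of the hand-typed vector.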
The counts differ quite a lot between regions, as the next plot shows. To someone who does not know this field, that looks very odd.
> plot(xtabs(VALID ~ factor(region,levels=min(region):max(region)),data=r1))
And if we expect VALID = Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin, that does not hold either.
> r1$myValid <- with(r1, Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin)
> plot(myValid ~ VALID,data=r1)
The data just do not add up.
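A plot shows the disagreement, and a count puts a number on it. The sketch below is hedged: it uses only the six rows printed by head(r1) above, retyped by hand, and the names r1_head, diff and n_mismatch are introduced here. On these six rows the candidate totals do equal VALID, so the mismatches must sit elsewhere in the file.

```r
# The six rows printed by head(r1), retyped by hand (not the full data set)
r1_head <- data.frame(
  VALID       = c(323, 927, 1282, 1185, 1510, 1889),
  Zhirinovsky = c(42, 88, 80, 129, 76, 127),
  Zyuganov    = c(40, 229, 333, 315, 395, 353),
  Mironov     = c(3, 58, 46, 67, 70, 115),
  Prokhorov   = c(24, 92, 150, 175, 227, 379),
  Putin       = c(214, 460, 673, 499, 742, 915)
)

# Sum of the votes for the five contenders, as in the post
r1_head$myValid <- with(r1_head,
                        Zhirinovsky + Zyuganov + Mironov + Prokhorov + Putin)

# Difference between the reported and the recomputed number of valid votes
r1_head$diff <- r1_head$VALID - r1_head$myValid

# Number of rows where the two disagree
n_mismatch <- sum(r1_head$diff != 0)
print(n_mismatch)
```

Run on the full r1 instead of r1_head, the same lines would count how many rows fail to add up, and summary() of the difference would show how large the gaps are.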
Conclusion
Either the data are incomplete and raise too many questions to even start looking for fraud, or these are the true data, they are as bad as they look here, and the fraud is obvious.