It’s common knowledge that the way you ask a question in a survey can bias the results you get. (It’s been a staple of political pollsters since the dawn of time.) But Aaron Shaw from Dolores Labs has used an interesting technique to demonstrate that bias: crowdsourcing. He asked the same question of crowdsourced respondents randomly assigned to one of two groups, and gave each group a different set of response options:
Q: About how many hours do you spend online per day?
Group 1 selected from these responses:
< face="Courier">(a) 0 – 1 hour
(b) 1 – 2 hours
(c) 2 – 3 hours
(d) More than 3 hours< >Group 2 selected from these responses:
< face="Courier">(a) 0 – 3 hours
(b) 3 – 6 hours
(c) 6 – 9 hours
(d) More than 9 hours< >
Each set of answers covers the entire range of possible hours in a day, just grouped into different buckets. In theory, you could estimate the true underlying distribution from either set of responses, and since respondents were assigned to groups at random, those underlying distributions should be the same. With some analysis in R, though, Aaron finds that they aren't: the response options themselves shift the answers. See the link below for the details.
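To get a feel for the comparison, here is a minimal sketch in R (not Aaron's actual analysis) using entirely hypothetical response counts. It approximates each group's implied mean hours online by assuming answers are spread uniformly within each bucket, with the open-ended top bucket capped at an assumed 12 hours:

# A rough sketch with made-up counts, for illustration only.
# Hypothetical numbers of respondents choosing each option:
group1 <- c("0-1" = 30, "1-2" = 40, "2-3" = 20, ">3" = 10)
group2 <- c("0-3" = 55, "3-6" = 30, "6-9" = 10, ">9" = 5)

# Bucket midpoints under a uniform-within-bucket assumption,
# capping the open-ended bucket at 12 hours (an arbitrary choice):
mid1 <- c(0.5, 1.5, 2.5, (3 + 12) / 2)
mid2 <- c(1.5, 4.5, 7.5, (9 + 12) / 2)

implied_mean <- function(counts, midpoints) {
  sum(counts * midpoints) / sum(counts)
}

implied_mean(group1, mid1)  # implied mean hours online, group 1
implied_mean(group2, mid2)  # implied mean hours online, group 2

If the two groups really shared the same underlying distribution, these two estimates should roughly agree; a large gap is a sign that the answer buckets, not the respondents' behavior, are driving the difference.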
Dolores Labs: Ask a Stupid Question