laptop-friendly analysis of the census of 82 countries with r and monetdb
[This article was first published on asdfree, and kindly contributed to R-bloggers]. (You can report an issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
the integrated public use microdata series international (ipumsi) has been my white whale since i started in survey research. non-demographers, perhaps think of this repository as a matryoshka varanasi-kaaba-ark of the covenant: nothing compares. the minnesota population center amassed half a billion person-level records from national statistics offices across the globe. it’s all free and ready for download, so long as you have a project idea and an institutional affiliation. so my turn to talk? because now the software needed for analysis is free as well, and markedly superior to anything that’s available for purchase. 277 censuses later, roll credits. these tutorials maniacally document every step necessary to:
- download and import your extract either directly into working memory or definitively into a hyperfast column-store
- construct a probability-weighted survey design object with legitimate, defensible standard errors
- compute any statistic that a mad (social) scientist might conceive from this infinity
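as a rough illustration of the second and third steps above, here is a minimal sketch using the survey package on a tiny made-up data frame rather than a real extract. it assumes `serial` acts as the household cluster identifier, `strata` as the stratum, and `perwt` as the person weight; the toy values and the `age` column are hypothetical, and the actual tutorial scripts may set things up differently.

```r
# a minimal sketch under stated assumptions, not the tutorial scripts themselves
library(survey)

# toy stand-in for a single-country, single-year extract (all values made up)
x <- data.frame(
  serial = c(1, 1, 2, 3, 3, 4, 5, 6),         # household identifier, used as the cluster
  strata = c(1, 1, 1, 1, 1, 2, 2, 2),         # sampling stratum
  perwt  = c(10, 10, 20, 15, 15, 30, 25, 25), # person weight
  age    = c(34, 6, 51, 29, 2, 70, 45, 18)    # hypothetical analysis variable
)

# probability-weighted design: households clustered within strata
ipumsi_design <-
  svydesign(
    id = ~ serial,
    strata = ~ strata,
    weights = ~ perwt,
    data = x
  )

# weighted mean of age, with a design-based standard error
age_mean <- svymean(~ age, ipumsi_design)
print(age_mean)

# weighted count of persons aged 18 or older
adult_total <- svytotal(~ as.numeric(age >= 18), ipumsi_design)
print(adult_total)
```

the point estimate from `svymean` matches a plain weighted mean; what the design object adds is a standard error that accounts for the clustering and stratification.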
click here to get started working with ipums international
notes: unless you plan to make severe edits to my example code, individual extracts must contain a single year and a single country and be formatted as a csv. the actual extract link can simply be copied and pasted into your r script from the url highlighted in the screenshot below. each extract should include the variables “serial”, “strata”, and “perwt” if you plan on calculating statistics to be shared anywhere beyond fingerpainting class. these census files cannot be treated as simple random samples; those three columns contain the information necessary for my scripts to handle everything correctly.
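one way to sanity-check an extract against these notes before doing anything else is sketched below in base r. the function name `check_extract` is hypothetical, the toy csv text is made up, and the `country` and `year` column names are an assumption about how the extract labels them; adjust to whatever your own extract contains.

```r
# a minimal sketch: confirm an extract carries the design columns
# and covers only one country and one year

required_columns <- c("serial", "strata", "perwt")

# hypothetical helper, not part of the tutorial scripts
check_extract <- function(x) {

  # extracts may arrive with uppercase headers, so normalize them first
  names(x) <- tolower(names(x))

  missing_columns <- setdiff(required_columns, names(x))
  if (length(missing_columns) > 0) {
    stop("extract is missing: ", paste(missing_columns, collapse = ", "))
  }

  # only checked when these (assumed) columns are present
  for (this_column in intersect(c("country", "year"), names(x))) {
    if (length(unique(x[[this_column]])) != 1) {
      stop("extract contains more than one ", this_column)
    }
  }

  x
}

# made-up csv text standing in for a downloaded extract
toy_extract <-
  read.csv(
    text = "SERIAL,STRATA,PERWT,COUNTRY,YEAR\n1,1,10,76,2010\n2,1,20,76,2010\n3,2,15,76,2010"
  )

good_extract <- check_extract(toy_extract)
print(names(good_extract))
```

failing early like this is cheaper than discovering a missing weight column after a long import.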
confidential to sas, spss, stata, and sudaan users: neil armstrong would give pogo sticks the same look i’m giving your softwares right now. time to reserve your spot on apollo eleven. time to transition to r. 😀
To leave a comment for the author, please follow the link and comment on their blog: asdfree.