Handling missing data with Amelia
So, what if you have data, but some of the observations are missing? Many statistical techniques assume complete data, so we might want to “fill in,” or rectangularize, our data by replacing missing observations with plausible substitutes. There are many ways of going about this, but one of the most robust and accessible is the Amelia package for R.
Today’s Gist applies multiple imputation to some sample ANES survey data, and compares listwise-deleted regression results to results pooled from the same regression run on ten imputed data sets. Amelia makes this imputation, modeling, and recombination straightforward, and I’ve thrown in a nice coefficient plot (using position_dodge!) to illustrate the differences between missing data approaches.
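For readers who want the shape of that workflow without opening the Gist, here is a minimal sketch. It is not the Gist's code: the data are simulated rather than taken from ANES, and the variable names (y, x1, x2) and the missingness pattern are assumptions made purely for illustration. It uses Amelia's amelia() to build ten imputed data sets and mi.meld() to pool the per-data-set coefficients and standard errors.

```r
library(Amelia)

set.seed(42)

# Simulated stand-in for the survey data (hypothetical variables)
n  <- 500
x1 <- rnorm(n)
x2 <- 0.5 * x1 + rnorm(n)
y  <- 1 + 2 * x1 - x2 + rnorm(n)
dat <- data.frame(y = y, x1 = x1, x2 = x2)

# Knock out some predictor values at random
dat$x1[sample(n, 75)] <- NA
dat$x2[sample(n, 75)] <- NA

# Listwise deletion: lm() drops incomplete rows by default
fit_lw <- lm(y ~ x1 + x2, data = dat)

# Multiple imputation: ten completed data sets (p2s = 0 silences output)
imp <- amelia(dat, m = 10, p2s = 0)

# Run the same regression on each imputed data set
fits <- lapply(imp$imputations, function(d) lm(y ~ x1 + x2, data = d))

# One row per imputation, one column per coefficient
b  <- t(sapply(fits, coef))
se <- t(sapply(fits, function(f) sqrt(diag(vcov(f)))))

# Pool estimates and standard errors across the ten fits
pooled <- mi.meld(q = b, se = se)

# Long-format comparison of the two approaches
results <- data.frame(
  term     = rep(colnames(b), times = 2),
  method   = rep(c("listwise", "imputed"), each = ncol(b)),
  estimate = c(unname(coef(fit_lw)), as.numeric(pooled$q.mi)),
  se       = c(unname(sqrt(diag(vcov(fit_lw)))), as.numeric(pooled$se.mi))
)
print(results)
```

A long data frame like results is a convenient input for a dodged coefficient plot, for example with ggplot2's geom_pointrange() and position_dodge(), mapping method to color.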