Handling missing data with Amelia

[This article was first published on is.R(), and kindly contributed to R-bloggers.]

What if you have data, but some of the observations are missing? Many statistical techniques assume complete data, so we may want to “fill in,” or rectangularize, our data by replacing missing observations with plausible substitutes. There are many ways of going about this, but one of the more robust and accessible is the Amelia package, which implements multiple imputation.

Today’s Gist applies multiple imputation to some sample ANES survey data and compares regression results under listwise deletion with results pooled from the same regression run on ten imputed data sets. Amelia makes the imputation, modeling, and recombination steps straightforward, and I’ve thrown in a coefficient plot (using position_dodge!) to show how the estimates differ between the two missing-data approaches.
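
For readers who want the shape of that workflow before opening the Gist, here is a minimal sketch. It is not the Gist's code: it assumes the Amelia package is installed, and it swaps in the freetrade example data that ships with Amelia in place of the ANES sample. The model formula and the seed are likewise just illustrative choices.

```r
library(Amelia)

# Stand-in data (an assumption): the freetrade panel bundled with Amelia,
# which has missing values in several columns. The Gist uses ANES data.
data(freetrade)

# Illustrative model, not the one from the Gist.
model <- tariff ~ polity + pop + gdp.pc

# Listwise deletion: lm() drops any row with a missing value by default.
fit_listwise <- lm(model, data = freetrade)

# Multiple imputation: build ten completed copies of the data.
# ts/cs tell Amelia which columns index time and cross-section units;
# p2s = 0 silences the progress output.
set.seed(1234)
imp <- amelia(freetrade, m = 10, ts = "year", cs = "country", p2s = 0)

# Run the same regression on each imputed data set.
fits <- lapply(imp$imputations, function(d) lm(model, data = d))

# Stack coefficients and standard errors, one row per imputation.
coefs <- t(sapply(fits, coef))
ses   <- t(sapply(fits, function(f) sqrt(diag(vcov(f)))))

# Pool across imputations with Rubin's rules.
pooled <- mi.meld(q = coefs, se = ses)

# Side-by-side comparison of the two approaches.
comparison <- data.frame(
  term        = names(coef(fit_listwise)),
  listwise    = unname(coef(fit_listwise)),
  listwise_se = unname(sqrt(diag(vcov(fit_listwise)))),
  imputed     = as.vector(pooled$q.mi),
  imputed_se  = as.vector(pooled$se.mi)
)
print(comparison)
```

A data frame like comparison, reshaped to one row per term and method, is the sort of input a dodged coefficient plot would take.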

https://gist.github.com/4224887