Import data to R from SAS, SPSS and Stata with Haven
Regardless of the tool you use to analyse data, you'll often have to access data living in file formats generated by other tools. The “haven” package from RStudio allows you to import and export data in SAS, SPSS and Stata formats. Version 1.0 was released on October 4, and is now available on CRAN. Haven is also installed as part of the tidyverse.
Haven complements the foreign package that ships with R, adding support for additional formats. Its core read/write engine is the ReadStat C library, which provides support for:
- SAS binary files (SAS7BDAT), including compressed files
- SPSS .sav and .por files
- Stata files
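As a minimal sketch of what importing and exporting looks like in practice, the example below writes a small made-up data frame to temporary Stata and SPSS files and reads it back. The data and file paths are assumptions for illustration only; the SAS path in the final comment is hypothetical.

```r
library(haven)

# Small example data frame (made up for illustration)
df <- data.frame(id = 1:3, score = c(2.5, NA, 4.0))

# Write to temporary Stata and SPSS files
dta_path <- tempfile(fileext = ".dta")
sav_path <- tempfile(fileext = ".sav")
write_dta(df, dta_path)
write_sav(df, sav_path)

# Read the files back in; each call returns a tibble
from_stata <- read_dta(dta_path)
from_spss <- read_sav(sav_path)

# SAS binary files are read the same way (hypothetical path):
# from_sas <- read_sas("path/to/data.sas7bdat")
```

The missing value in `score` should survive the round trip as `NA` in both formats.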
Haven takes special care in handling missing values in these file formats, and includes tools for extracting information from the specialized missing value representations in each format. It also has improved support for handling dates and times. (See the blog post announcing Haven for details.)
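One of those tools is the "tagged" missing value, which Haven uses to represent the lettered missing codes found in SAS and Stata files. The sketch below builds such values by hand rather than reading them from a file, so the vector itself is an assumption for illustration.

```r
library(haven)

# A numeric vector with two tagged missing values and one ordinary NA
x <- c(1, 2, tagged_na("a"), tagged_na("z"), NA)

# Tagged values behave like regular missing values in base R
missing_count <- sum(is.na(x))

# ...but Haven can tell them apart and recover the tag
tagged <- is_tagged_na(x)
tags <- na_tag(x)
```

Because tagged values still register as missing under `is.na()`, existing code that handles `NA` should keep working, while the tag remains available when the reason for missingness matters.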
For those working in the pharmaceutical industry, there's one unfortunate omission in ReadStat (and therefore Haven) thus far. While the FDA does not mandate that SAS be used for analysis in clinical trials, it does mandate that the data be provided in the SAS Transport File (XPORT) format, which is an open standard. Hopefully XPORT support will be added in a future release, but in the meantime you can read XPORT files with the read.xport function from the foreign package, and write them with the SASxport package.
For more information on Haven, check out the Haven website. Thanks to Hadley and the RStudio team for providing this useful functionality to R!