Reading REDATAM databases in R

[This article was first published on pacha.dev/blog, and kindly contributed to R-bloggers.]

REDATAM

REDATAM (Retrieval of Data for Small Areas by Microcomputer) is a data storage and retrieval system created by ECLAC. National statistics offices widely use it to store and manipulate census and survey data.

However, running statistical analyses on REDATAM databases, such as Poisson or Negative Binomial regression, can be tricky because of their unique format. The official point-and-click tool that opens them can compute counts and averages but little else, unlike SPSS, another point-and-click tool, which supports hypothesis testing and a wide range of statistical functions.

REDATAM Converter

The REDATAM Converter is an open-source tool that extracts the raw census microdata stored in REDATAM databases. Whether you’re a statistician, researcher, or data analyst, this tool allows you to convert these databases into CSV files compatible with R, Python, Google Sheets, Microsoft Excel, and other data analysis tools.

Initially written in C# by Pablo de Grande, the REDATAM Converter has been fully rewritten in C++ for improved portability and efficiency. Now, with the release of an R package, the REDATAM Converter allows seamless integration of REDATAM databases directly into your R workflows.

REDATAM Converter R Package

The latest development in the REDATAM Converter is an R package that lets users read REDATAM data directly into R. This can be a game-changer for researchers and analysts working in R: it removes the need to first export the data to CSV files with our converter, a step that requires the command line. Instead, you can load and work with the data directly in R.

Key Features of the R Package:

  • Directly reads REDATAM databases into R using read_redatam().
  • Works with both .dic and .dicx formats.
  • Integrates seamlessly with other R packages such as dplyr, allowing for easy data manipulation and analysis.
  • Uses the REDATAM Converter written in C++, which is fast and memory efficient.

To install the R package, you can run the following command in R:

remotes::install_github("pachadotdev/redatam-converter", subdir = "rpkg")

Once installed, using the package is as simple as pointing it to your REDATAM dictionary. For example, to load data from the 2017 Chilean Census, you can unzip the file and run:

redatam::read_redatam("CP2017CHL/BaseOrg16/CPV2017-16.dicx")
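
If you are not sure the package or the unzipped census files are in place, a defensive wrapper can help. The following is only a minimal sketch: read_redatam_if_available() is a hypothetical helper, not part of the package, and because the structure of the object returned by read_redatam() is not described here, the sketch only inspects it with str().

```r
# Hypothetical helper (not part of the redatam package): returns NULL
# instead of failing when the package or the dictionary file is missing.
read_redatam_if_available <- function(dic_path) {
  if (!requireNamespace("redatam", quietly = TRUE)) {
    message("Package 'redatam' is not installed.")
    return(NULL)
  }
  if (!file.exists(dic_path)) {
    message("Dictionary not found: ", dic_path)
    return(NULL)
  }
  redatam::read_redatam(dic_path)
}

# Path taken from the example above; adjust it to where you unzipped the census
census <- read_redatam_if_available("CP2017CHL/BaseOrg16/CPV2017-16.dicx")

# Inspect the top level of whatever was returned, without assuming its shape
if (!is.null(census)) {
  str(census, max.level = 1)
}
```

Keeping the result in an object such as census lets you explore it before deciding how to pass it on to other packages.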

For more detailed examples, check out the vignette included in the package documentation.
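
To illustrate the kind of analysis that becomes possible once the data is in R, here is a rough sketch of a Poisson regression with base R. It does not use real census data: the persons data frame and its columns (age, urban, n_children) are simulated and purely hypothetical, standing in for whatever table you extract from your own REDATAM database.

```r
# Toy stand-in for a table of census microdata; the column names are
# hypothetical and do not come from any real REDATAM dictionary.
set.seed(123)
n <- 500
persons <- data.frame(
  age   = sample(15:49, n, replace = TRUE),
  urban = rbinom(n, size = 1, prob = 0.6)
)

# Simulate a count outcome whose log-mean depends on age and urban
persons$n_children <- rpois(
  n,
  lambda = exp(-1 + 0.04 * persons$age - 0.3 * persons$urban)
)

# Poisson regression with base R's glm()
fit <- glm(
  n_children ~ age + urban,
  family = poisson(link = "log"),
  data = persons
)

# Coefficient table: estimates, standard errors, z values, p-values
summary(fit)$coefficients
```

A Negative Binomial fit would follow the same pattern, for example with MASS::glm.nb() and the same formula.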
