
Open data sets you can use with R

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

R is an environment for programming with data, so unless you're doing a simulation study, you'll need some data to work with. If you don't have data of your own, we've made a list of open data sets you can use with R to accompany the latest release of Revolution R Open.

At the Data Sources on the Web page on MRAN, you can find links to dozens of open data sources, both large and small. You'll find some classics of data science and machine learning, like the Enron emails data set and the famous Airlines data. You can find official statistics on economics and government from countries around the world, including links to every country's official data repositories at UNdata. There are links to scientific data, including several sources from the social sciences. And of course you'll find links to various financial data sources (though not all of these are completely free to use).

Many of the data sets are marked as available in a ready-to-use R format; for the others, you can use R's various data import tools to access the data (there is a great guide to these at ComputerWorld).
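As a rough illustration of the import step, here is a minimal sketch using base R's `read.csv()`. The file contents, column names, and the `flights` variable are made up for this example and do not come from any of the listed sources; a small temporary file stands in for a data set you would normally download.

```r
# Hypothetical sample data standing in for a downloaded open data file.
# The columns and values are invented for illustration only.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "carrier,flights,avg_delay",
  "AA,120,8.5",
  "UA,95,12.25",
  "DL,143,NA"
), csv_path)

# Read the CSV into a data frame. read.csv() treats the string "NA"
# as a missing value by default, so avg_delay stays numeric.
flights <- read.csv(csv_path, stringsAsFactors = FALSE)

# Inspect the column types and compute a summary that skips missing values.
str(flights)
mean_delay <- mean(flights$avg_delay, na.rm = TRUE)
print(mean_delay)
```

`read.csv()` also accepts a URL in place of a local path, so the same call can work directly against a CSV file published on the web. Data sets distributed in R's own formats can typically be loaded with `load()` (for `.RData` files) or `readRDS()` (for `.rds` files) instead.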

Got other suggestions for great open data sources? Let us know in the comments below, or send an email to mran@revolutionanalytics.com.

MRAN: Data Sources on the Web
