How to access 100M time series in R in under 60 seconds
DataMarket, a portal that provides access to more than 14,000 data sets from various public and private sector organizations, has more than 100 million time series available for download and analysis. (Check out this presentation for more info about DataMarket.) And now, with the new rdatamarket package, it's trivially easy to import those time series into R for charting, analysis, or anything else. Here's what you need to do:
- Register an account on DataMarket.com (it's free)
- Install the rdatamarket package in R with install.packages("rdatamarket")
- Browse DataMarket.com for a time series of interest (I found this series on unemployment)
- Copy the URL of the page you're on (the short URL works too; I used "http://data.is/qb61uf")
- Use the dmseries function with the URL to extract the time series as a zoo object
Here's an example:
> library(rdatamarket)
> dminfo("http://data.is/qb61uf")
Title: "Persons Unemployed 15 weeks or longer, as a percent of the civilian labor force"
Provider: "Federal Reserve Bank of St. Louis" (citing "U.S. Department of Labor: Bureau of Labor Statistics")
Dimensions:
> unemp <- dmseries("http://data.is/qb61uf")
> plot(unemp)
> str(unemp)
‘zoo’ series from Jan 1948 to Jul 2011
  Data: num [1:763, 1] 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 0.5 ...
 - attr(*, "dimnames")=List of 2
  ..$ : chr [1:763] "1" "2" "3" "4" ...
  ..$ : chr "Persons.Unemployed.15.weeks.or.longer..as.a.percent.of.the.civilian.labor.force"
  Index: Class 'yearmon'  num [1:763] 1948 1948 1948 1948 1948 ...
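Once the series is a zoo object, the usual zoo tools apply. The sketch below is an illustration rather than part of the rdatamarket package: it builds a small synthetic monthly series (the values and the column name "rate" are made up) with the same shape that str() reports above, a one-column zoo series with a 'yearmon' index, and then subsets it, averages it by year, and flattens it to a data frame.

```r
library(zoo)

# Synthetic stand-in for the object returned by dmseries(): a one-column
# monthly zoo series indexed by 'yearmon'. The values are invented.
idx <- as.yearmon(seq(as.Date("2008-01-01"), by = "month", length.out = 36))
unemp <- zoo(matrix(seq(1, 4.5, length.out = 36), ncol = 1,
                    dimnames = list(NULL, "rate")),
             order.by = idx)

# Restrict the series to a single calendar year.
recent <- window(unemp,
                 start = as.yearmon("2009-01"),
                 end = as.yearmon("2009-12"))

# Collapse monthly values to annual means, grouped by calendar year.
annual <- aggregate(unemp, by = floor(as.numeric(index(unemp))), FUN = mean)

# Flatten to a plain data frame for tools that do not understand zoo objects.
unemp_df <- data.frame(month = as.Date(index(unemp)),
                       rate = as.numeric(coredata(unemp)))
```

With a real series from dmseries(), the same window(), aggregate(), and coredata() calls should apply unchanged, assuming the index is 'yearmon' as in the example output.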
With this package, you can go from finding interesting data on DataMarket to working with it in R in less than a minute. With such a wealth of data so easily accessible from R, this will be a fantastic tool for data scientists and data journalists.
CRAN: rdatamarket package