source_GitHubData: a simple function for downloading data from GitHub into R
Update 31 January: I’ve folded source_GitHubData into the repmis package. See this post.
Update 7 January 2012: I updated the internal workings of source_GitHubData so that it now relies on httr rather than RCurl. It is also now more directly descended from devtools’ source_url command.
This has two advantages.
- Shortened URLs can be used instead of the data sets’ full GitHub URLs.
- The ssl.verifypeer issue is resolved (though please let me know if you have problems).
The post has been rewritten to reflect these changes.
In previous posts I’ve discussed how to download data stored in plain-text data files (e.g. CSV, TSV) on GitHub directly into R.
Not sure why it took me so long to get around to this, but I’ve finally created a little function that simplifies the process of downloading plain-text data from GitHub. It’s called source_GitHubData. (The name mimics the devtools syntax for functions like source_gist and source_url. The function’s syntax is actually just a modified version of source_url.)
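For readers curious about what an httr-based download of this kind might look like, here is a minimal sketch. It is not the actual source of source_GitHubData; the helper names parse_delimited_text and fetch_delimited_data are made up for illustration, and the sketch assumes the httr package is installed.

```r
# Hypothetical helper: turn delimited plain text into a data frame.
parse_delimited_text <- function(text, sep = ",", header = TRUE) {
  read.table(text = text, sep = sep, header = header,
             stringsAsFactors = FALSE)
}

# Hypothetical helper: download a plain-text data file and parse it.
# Assumes httr is installed; this is a sketch, not source_GitHubData itself.
fetch_delimited_data <- function(url, sep = ",", header = TRUE) {
  if (!requireNamespace("httr", quietly = TRUE)) {
    stop("The httr package is required for this sketch.")
  }
  response <- httr::GET(url)       # httr should follow redirects, e.g. short URLs
  httr::stop_for_status(response)  # stop with an error on HTTP failures
  text <- httr::content(response, as = "text", encoding = "UTF-8")
  parse_delimited_text(text, sep = sep, header = header)
}

# Offline check of the parsing step, using made-up values
example_text <- "country,year,disproportionality\nA,2000,1.5\nB,2001,2.5"
example_data <- parse_delimited_text(example_text)
names(example_data)
```

Keeping the download and the parsing in separate helpers means the parsing step can be tried on a literal string without a network connection.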
The function is stored in a GitHub Gist HERE (it’s also at the end of this post). You can load it directly into R with devtools’ source_gist command.
Here is an example of how to use the function to download the electoral disproportionality data I discussed in an earlier post.
# Load source_GitHubData
library(devtools)

# The function's gist ID is 4466237
source_gist("4466237")

# Create Disproportionality data UrlAddress object
# Make sure the URL is for the "raw" version of the file
# The URL was shortened using bitly
UrlAddress <- "http://bit.ly/Ss6zDO"

# Download data
Data <- source_GitHubData(url = UrlAddress)

# Show Data variable names
names(Data)
## [1] "country" "year" "disproportionality"
There you go.
Note that the function is set by default to load comma-separated data (CSV). This can easily be changed with the sep argument.
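To illustrate what changing the separator does, here is a small sketch using base R's read.table on literal strings with made-up values. It assumes that sep in source_GitHubData behaves like the sep argument of read.table; the commented call at the end is hypothetical and would need a URL that points to a tab-separated file.

```r
# The same made-up data written as CSV and as TSV
csv_text <- "country,year\nA,2000\nB,2001"
tsv_text <- "country\tyear\nA\t2000\nB\t2001"

# sep tells the parser which character separates the columns
csv_data <- read.table(text = csv_text, sep = ",", header = TRUE)
tsv_data <- read.table(text = tsv_text, sep = "\t", header = TRUE)

# Both calls should yield the same variable names
names(csv_data)
names(tsv_data)

# Hypothetical call for a tab-separated file on GitHub:
# Data <- source_GitHubData(url = UrlAddress, sep = "\t")
```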