R and the Data Science Toolkit
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I recently decided to present a talk to the Denver R Users Group (DRUG) on how to make an R package (May 17). There were only two problems: (1) I’ve never made a package and (2) I had nothing in mind to package up. At about this same time, Pete Warden and others were blogging about the iPhone tracking issue [1]. How are these two events related? Well, I remembered that a few of my favorite Twitter ‘friends’ posted some things related to Pete Warden’s “The Data Science Toolkit (DSTK)” [2] a while back. And? And at the time I thought that it would be cool to have an R package/wrapper for accessing the DSTK’s API, similar to Drew Conway’s R wrapper for the infochimps API.
So I’m happy to announce that after spending a little time on this project in the past week, Version 0.1 of the RDSTK package is available on github. I haven’t submitted this package to CRAN and, hence, you need to install it from source (RDSTK_0.1.tar.gz). In order to do this, use the install.packages() function within R or R CMD INSTALL from the shell prompt. Note that the package depends on the RCurl, plyr, and rjson packages.
The following functions are included in the package:
- street2coordinates
- ip2coordinates
- coordinates2politics
- text2sentences
- text2people
- html2text
- text2times
They should be easy to use if you are familiar to the DSTK API. If not, RTFM!
Let me know if you have any comments and/or suggestions. Happy hacking.
Acknowledgements:
I wanted to mention that I received a bit of help with the RCurl package from “Noah” on stackoverflow, Andy Gayton on stackoverflow, and Duncan Temple Lang on the R-Help list. Thanks!
Footnotes:
- To borrow a joke from Asi Behar, “Right after word leaks that the iPhone has been tracking your location at all times, we find Osama. Coincidence? Thanks Apple!”
- You may recall that a while back, I tweeted about disliking the phrase “data science”. My feelings have not changed.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.