socialR: Reproducible Research & Notebook integration with R
I’ve created an R package that uses social media tools for reproducible research. The goal of the package is this: whenever I run code, output figures are automatically added to my figure repository (Flickr) and linked to the timestamped version of the code that produced them in the code repository. Figures should be tagged by project and embedded selectively or automatically into this lab notebook. The basic workflow of the notebook looks like this:[ref]Diagram of my notebook as presented at Science Online, 2011, see other slides in my entry on this.[/ref]
To do this, I use a few simple R functions that wrap the system command-line programs git, flickr_upload, and hpc-autotweets, which lets me monitor my simulations through social media. The package has its own git repository here. This is a rather custom development made for rapid deployment on my own machines, and it depends largely on Linux tools external to R, so it may not be easily deployed by others. See my earlier post, Making R Twitter, for examples and back story.
Basic Features
All of these tasks are run by wrapping any plot command in my function social_plot():
- Push the running code version to GitHub.
- Grab the git commit hash to reference this version of the code.
- Push figures to Flickr as they complete. Tag images appropriately and provide a link in the description to the code (version-stable, on GitHub) that produced them.
- Tweet notification of a figure upload, parameter values, links to code, and timestamp.
- Tweet when an error occurs.
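The steps above can be sketched roughly as follows. This is an illustration only, not the socialR source: the function names (git_hash, flickr_args, social_plot_sketch), their arguments, and the USER/REPO URL are all made up for the example, and the flickr_upload flags (--tag, --description) are as I recall them from its man page, so check your installed version. The push to GitHub and the tweeting steps are left out to keep it short.

```r
# Illustrative sketch only; names and arguments are assumptions, not the socialR API.

# Current commit hash, or NA if git is missing or this is not a repository.
git_hash <- function() {
  out <- tryCatch(
    suppressWarnings(
      system2("git", c("rev-parse", "HEAD"), stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) character(0)
  )
  if (length(out) == 1 && grepl("^[0-9a-f]{40,64}$", out)) out else NA_character_
}

# Build the argument vector for flickr_upload: one --tag per tag, plus a
# description that links to the exact code version on GitHub.
flickr_args <- function(file, tags, hash, repo_url) {
  tag_args <- unlist(lapply(tags, function(tag) c("--tag", tag)))
  description <- paste0(repo_url, "/tree/", hash)
  c(tag_args, "--description", description, file)
}

# Evaluate a plotting expression into a PNG file, then (optionally) upload it.
# plot_expr is evaluated lazily, so the drawing happens after png() is opened.
social_plot_sketch <- function(plot_expr, tags, repo_url,
                               file = tempfile(fileext = ".png"),
                               upload = FALSE) {
  hash <- git_hash()
  grDevices::png(file)
  tryCatch(plot_expr, finally = grDevices::dev.off())
  args <- flickr_args(file, tags, hash, repo_url)
  if (upload) system2("flickr_upload", shQuote(args))
  invisible(args)
}
```

With upload = FALSE the sketch only writes the figure and returns the arguments it would pass, for example social_plot_sketch(plot(1:10), tags = "myproject", repo_url = "https://github.com/USER/REPO"), which makes it easy to inspect before anything is sent to Flickr.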
Setup / Install
- Create a Flickr account (need not be unique for the computer).
- Create a Twitter account (preferably a separate one for the machine).
- Install flickr_upload:
sudo apt-get install libflickr-upload-perl
- Install tweepy:
easy_install tweepy
- Configure flickr_upload credentials.
- Configure OAuth for tweepy.
(See link for more detailed instructions)
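Since everything depends on external programs, it can help to confirm from within R that they are on the PATH before running anything. A minimal sketch, assuming the three tools named earlier (git, flickr_upload, hpc-autotweets) are invoked by those exact names; missing_tools is a hypothetical helper, not part of the package:

```r
# Hypothetical helper, not part of socialR: report which external tools
# cannot be found on the PATH. Sys.which() returns "" for a missing program.
missing_tools <- function(tools) {
  paths <- Sys.which(tools)
  names(paths)[paths == ""]
}

required_tools <- c("git", "flickr_upload", "hpc-autotweets")
not_found <- missing_tools(required_tools)
if (length(not_found) > 0) {
  message("Not found on PATH: ", paste(not_found, collapse = ", "))
}
```

This only checks that the programs exist; it does not verify that the Flickr or Twitter credentials have been configured.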
Future modifications
The current program relies entirely on external command-line tools, and there is probably no easy way to make the package self-contained and cross-platform. Still, a good bit of functionality can be added:
- Add option to include the git log message.
- Write smarter, more informative git commit messages.
- Add option/default to use truncated git commit IDs.
- Make the Flickr description actually link directly to the code.
- Make tweets include URLs/actual links (to code, files).
- Identify machine credentials?
- Documentation still needed.
- Verify that the git version is current.
- Grab a DOI for the object (e.g. using EZID from UC3?).
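Two of these items are simple enough to sketch. The following is one possible approach, not something the package does yet: short_hash and working_tree_clean are hypothetical names, and I am reading "current" as "no uncommitted changes", which is only one interpretation.

```r
# Hypothetical helpers for two of the items above; not part of socialR.

# Truncate a full commit ID to its first n characters (git's usual default is 7).
short_hash <- function(hash, n = 7L) {
  stopifnot(is.character(hash), n >= 4L)
  substr(hash, 1L, n)
}

# TRUE if `git status --porcelain` prints nothing (no uncommitted changes),
# FALSE if it lists changes, NA if git is missing or this is not a repository.
working_tree_clean <- function() {
  out <- tryCatch(
    suppressWarnings(
      system2("git", c("status", "--porcelain"), stdout = TRUE, stderr = FALSE)
    ),
    error = function(e) NULL
  )
  # A non-zero exit status is attached to the result as a "status" attribute.
  if (is.null(out) || !is.null(attr(out, "status"))) return(NA)
  length(out) == 0
}
```

A wrapper could warn before uploading whenever working_tree_clean() is not TRUE, since in that case the recorded commit may not match the code that actually produced the figure.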