
An update to the checkpoint package

[This article was first published on Revolutions, and kindly contributed to R-bloggers.]

by Andrie de Vries

In October 2014 we announced the Reproducible R Toolkit (RRT), which consists of the checkpoint package and MRAN. In January, David Smith followed up with another post about reproducibility using Revolution R Open.

Since then, we've had several requests for new features and enhancements. The development code for checkpoint is available on GitHub.

The current release, 0.3.8, contains many new features:

  • Allow users to specify any folder location as the checkpoint library location.  Previously, checkpoint always installed packages in the location ~/.checkpoint.  This is still the default, but now you can change this, for example to store your checkpoint packages on a USB drive.
  • Add option to run checkpoint() without scanning for packages. This option covers the use case where you run code in a production environment and you are already certain that all package dependencies are installed in the .checkpoint folder.  In this special case, skipping the scan means checkpoint() returns sooner.
  • You can now specify that your checkpoint project depends on a specific version of R.
  • Removed the hard dependency on the knitr package.  knitr is now optional: if it is available on your machine, checkpoint will scan all rmarkdown script files in the project for package dependencies.
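
As a rough sketch of how the first three options might be used together: the snippet below assumes the argument names checkpointLocation, scanForPackages and R.version as listed on the package help page, so check ?checkpoint against your installed version. The snapshot date, the USB path and the R version string are hypothetical placeholders.

```r
library(checkpoint)

# Hypothetical values for illustration only; substitute your own.
snapshot_date <- "2015-03-01"
usb_library   <- "/media/usb"   # parent folder that will hold the checkpoint library

# Store the checkpoint library somewhere other than the default location,
# and declare the R version this project depends on.
checkpoint(snapshot_date,
           checkpointLocation = usb_library,
           R.version = "3.1.3")

# Production use: the packages are already installed, so skip the project scan.
checkpoint(snapshot_date,
           checkpointLocation = usb_library,
           scanForPackages = FALSE)
```

The two calls are alternatives rather than a sequence: use the first while developing and the second once the checkpoint library is known to be complete.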

Also, several enhancements:

  • Progress reporting while installing packages.  If you happen to scan a project with many R scripts, the scanning process can take some time.  The checkpoint function now provides a progress bar indicator.
  • Inform user when packages are found that don't exist in the MRAN snapshot.
  • Include direct namespace calls with :: or ::: into scan for packages, for example package::foo() or package:::bar(). This is in addition to any occurrences of library() or require() statements.
  • Return diagnostic information from checkpoint().  Previously, checkpoint() always returned NULL.  Now checkpoint() invisibly returns a list with diagnostic information, e.g. which packages were found during the scan process.
  • Improve messages when scanning project for packages. Rather than providing a cryptic message for each package, now checkpoint() prints a helpful message and lists all files that could not be scanned.
  • Improve handling of checking for knitr availability. For example, if checkpoint() finds an rmarkdown file, but knitr is not available, you get a helpful warning message with the name of the file that could not be scanned.
  • Added vignette with sample code.
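
To give a feel for what scanning for library(), require(), :: and ::: involves, here is a minimal base R sketch. It is not checkpoint's actual implementation: the helper name find_packages and the regular expressions are assumptions made purely for illustration, and a regex-based scan will miss or over-match cases that a real parser handles correctly.

```r
# Illustrative sketch only; NOT how checkpoint itself scans files.
# find_packages() is a hypothetical helper that takes lines of R code and
# returns the package names it can spot with simple regular expressions.
find_packages <- function(code) {
  # Return every match of a Perl-style pattern across all lines
  extract <- function(pattern) {
    unlist(regmatches(code, gregexpr(pattern, code, perl = TRUE)))
  }

  # library(pkg) or require(pkg), with or without quotes around the name
  attach_hits <- extract("\\b(?:library|require)\\(\\s*[\"']?[A-Za-z][A-Za-z0-9.]*")
  attach_pkgs <- sub("^(?:library|require)\\(\\s*[\"']?", "", attach_hits, perl = TRUE)

  # pkg::fun or pkg:::fun; the lookahead keeps only the package name
  ns_pkgs <- extract("\\b[A-Za-z][A-Za-z0-9.]*(?=:::?)")

  sort(unique(c(attach_pkgs, ns_pkgs)))
}

code <- c(
  "library(ggplot2)",
  "require(\"dplyr\")",
  "x <- stats::median(1:10)",
  "y <- utils:::head.default(letters)"
)
print(find_packages(code))
```

Here the helper picks up ggplot2 and dplyr from the attach calls, and stats and utils from the direct namespace calls, which is the same kind of information the scan now collects.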

Performance improvements:

  • checkpoint() now checks whether required packages are already installed and does not re-install them.

Finally, some bug fixes that were potentially annoying:

  • No longer displays warnings when installing base packages.
  • When encountering checkpoint() no longer throws an error.

You can easily download and install the latest version from GitHub as follows (with thanks to the devtools package):

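# install.packages("devtools")  # run this first if devtools is not yet installed
library(devtools)
# Install checkpoint from the RevolutionAnalytics repository on GitHub
install_github("RevolutionAnalytics/checkpoint")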

GitHub (Revolution Analytics): checkpoint 

