A Case Study in Reproducible Model Building
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The U.S. Geological Survey (USGS) recently published a report describing a groundwater-flow model of the Wood River Valley (WRV) aquifer system. What makes this report unique (at least in my opinion) is the authors’ desire to make their work as reproducible as possible under budgetary constraints. The collection of raw data, source code, and processing instructions used to build and analyze the model was placed in a non-general-use R package named wrv. The package repository can be found on GitHub. Commands for installing the package are as follows:
repos <- c("http://owi.usgs.gov/R", getOption("repos"))
install.packages("wrv", repos = repos, dependencies = TRUE, type = "both") # about 100 MB, so be patient
While most of the functions in the package are specific to this project, a few may be of interest to the larger R community.
For example, the PlotMap, PlotGraph, and PlotCrossSection functions have been designed for general use.
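To see which plotting helpers an installed copy of the package actually exports, something like the following could be used. This is only a sketch: it assumes wrv was installed with the commands shown earlier, and the variable names `pkg` and `plot_funs` are made up for illustration. It relies solely on base R functions and does not call any wrv function directly.

```r
pkg <- "wrv"  # package name as given in this post

if (requireNamespace(pkg, quietly = TRUE)) {
  # Collect exported objects whose names start with "Plot"
  # (PlotMap, PlotGraph, and PlotCrossSection would be expected among them).
  plot_funs <- sort(grep("^Plot", getNamespaceExports(pkg), value = TRUE))
  print(plot_funs)
} else {
  # Fall back gracefully when the package is not available.
  plot_funs <- character(0)
  message("Package '", pkg, "' is not installed; see the installation commands above.")
}
```

From there, the help page for any listed function can be opened in the usual way, for example with `help("PlotMap", package = "wrv")`.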
Report documentation was included in the wrv package as vignettes; these files are also available from the USGS Publications Warehouse.
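Once the package is installed, the bundled vignettes can be listed from within R. The snippet below is a minimal sketch using only base R utilities; it assumes the installation succeeded, and the names `pkg` and `vigs` are illustrative rather than part of wrv.

```r
pkg <- "wrv"  # package name as given in this post

if (requireNamespace(pkg, quietly = TRUE)) {
  # vignette() returns a list whose 'results' matrix holds one row per vignette.
  vigs <- utils::vignette(package = pkg)$results[, c("Item", "Title"), drop = FALSE]
} else {
  # Empty placeholder with the same columns when the package is missing.
  vigs <- matrix(character(0), ncol = 2,
                 dimnames = list(NULL, c("Item", "Title")))
  message("Package '", pkg, "' is not installed; see the installation commands above.")
}

print(vigs)
```

In an interactive session, `browseVignettes("wrv")` opens the same list in a web browser, with links to each document.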
For a general overview of the project, I recommend my useR! 2016 talk.
Any comments or suggestions regarding our approach to reproducible model building can be left below. Please realize that your opinions go a long way in determining whether this type of approach will be used in future projects.