Here is Your Data
It’s a common situation: you want to code and debug in R *and* leverage RMarkdown for a presentation or document.
The challenge: file paths.
Executing code in the console and from within a saved RMarkdown document typically requires distinct file paths to locate data files.
While you’re writing your code and debugging, you’ve probably got your source code open and are sending lines of code to the console to be evaluated. The code is evaluated with respect to the current working directory. In an RStudio project, that will be the top directory level of your project.
In a typical project, you’ll have several subdirectories:
- project
  - data
  - data-raw
  - output
  - R
So, a casual reference to the data directory might look like this:
read.csv("data/mtcars.csv") # to execute in the console
If you write an RMarkdown document, you’ll save that in your R directory. While you’re debugging code, you’ll be sending lines of code to the console for evaluation, as above. So, the project’s top level directory is used as a starting point.
However, when you knit the document (or spin, or compose a notebook… any of the aliases to knitr::knit() and associated functions), the document is evaluated in its own environment. From the perspective of knitr, the base directory is the directory in which you saved the .R, .md, or .Rmd document. Assuming you saved your RMarkdown script in R/, then knitting that document with the same code as above
read.csv("data/mtcars.csv")
looks for R/data/mtcars.csv within your project directory, which doesn’t exist.
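The mismatch can be reproduced without knitting anything. The following is a minimal sketch in base R: it builds a throwaway project in a temporary directory (the name example_project is only an assumption for illustration) and checks whether the same relative path resolves from the project root and from R/.

```r
# Build a throwaway project layout in a temporary directory.
# "example_project" is a hypothetical name used only for this sketch.
proj <- file.path(tempdir(), "example_project")
dir.create(file.path(proj, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(proj, "R"), showWarnings = FALSE)
write.csv(mtcars, file.path(proj, "data", "mtcars.csv"))

old_wd <- getwd()

# Console-style evaluation: the working directory is the project root.
setwd(proj)
found_from_root <- file.exists("data/mtcars.csv")

# Knit-style evaluation: the working directory is the document's folder, R/.
setwd(file.path(proj, "R"))
found_from_R <- file.exists("data/mtcars.csv")

setwd(old_wd)  # restore the original working directory

found_from_root  # TRUE: data/ sits directly under the project root
found_from_R     # FALSE: there is no R/data/ directory
```

The code never changes between the two checks; only the working directory does, which is exactly what happens when you move from the console to knitting.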
The common, but not optimal, solution is to tell the knitted document to look up one directory level before looking for the data directory:
read.csv("../data/mtcars.csv") # to knit
.. is geek-speak for “parent directory” AKA “up one directory level”. If you needed to look up two levels, you would refer to ../../data/mtcars.csv.
That’s great for knitting, but when you want to debug your code interactively in the console, you have to remove the ../ portion again. What a mess!
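To make the ../ behaviour concrete, here is a small base R sketch. The project name dotdot_demo and the nested R/reports folder are assumptions made up for this example. It shows that the number of ../ segments depends on how deep the working directory sits, which is why these paths are fragile.

```r
# Throwaway layout: <tempdir>/dotdot_demo/{data, R/reports}
# (directory names are hypothetical, chosen for this sketch only)
proj <- file.path(tempdir(), "dotdot_demo")
dir.create(file.path(proj, "data"), recursive = TRUE, showWarnings = FALSE)
dir.create(file.path(proj, "R", "reports"), recursive = TRUE, showWarnings = FALSE)
write.csv(mtcars, file.path(proj, "data", "mtcars.csv"))

# setwd() returns the previous working directory, so we can restore it later.
old_wd <- setwd(file.path(proj, "R"))

# From R/, the data directory is one level up.
one_up <- file.exists("../data/mtcars.csv")

# From R/reports/, the same file is two levels up.
setwd("reports")
two_up <- file.exists("../../data/mtcars.csv")

# The one-level path no longer works from the deeper folder.
one_up_from_reports <- file.exists("../data/mtcars.csv")

setwd(old_wd)
```

Moving a document one folder deeper silently breaks every ../ path in it, on top of the console-versus-knit mismatch.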
The here package addresses this issue. here provides a function, called here(), that returns a file path based on where your project’s top level directory is. So, when you evaluate the following line in the console
read.csv(here::here("data", "mtcars.csv"))
here looks for an .Rproj file or other indicators of where your current project is located and constructs a file path to that top level directory, plus a data subdirectory, plus a file named mtcars.csv in that directory:
[1] "/Users/wdoane/Documents/Repositories/example_project/data/mtcars.csv"
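The idea of locating the project root can be sketched in a few lines of base R. This is a simplified illustration, not the here package's actual code: find_project_root() is a hypothetical helper that checks only for an .Rproj file, whereas the package also considers other indicators.

```r
# Hypothetical helper (not part of the here package): walk up from `start`
# until a directory containing an .Rproj file is found.
find_project_root <- function(start = getwd()) {
  dir <- normalizePath(start, winslash = "/", mustWork = TRUE)
  repeat {
    if (length(list.files(dir, pattern = "\\.Rproj$")) > 0) {
      return(dir)
    }
    parent <- dirname(dir)
    if (identical(parent, dir)) {
      # Reached the filesystem root without finding a project file.
      stop("No .Rproj file found at or above ", start)
    }
    dir <- parent
  }
}

# Throwaway project to try it on ("root_demo" is an assumed name).
proj <- file.path(tempdir(), "root_demo")
dir.create(file.path(proj, "R"), recursive = TRUE, showWarnings = FALSE)
file.create(file.path(proj, "root_demo.Rproj"))

# The same root is found from the top level and from the R/ subdirectory.
root_from_top <- find_project_root(proj)
root_from_R   <- find_project_root(file.path(proj, "R"))

# Build a path relative to that root, much as here("data", "mtcars.csv") does.
data_path <- file.path(root_from_R, "data", "mtcars.csv")
```

Because the path is anchored at the discovered root rather than at the working directory, the same line of code works in the console and when knitting from R/.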
Equivalently, you could first library() or require() here:
library(here)
dir.create(here("data"))
write.csv(mtcars, here("data", "mtcars.csv"))
read.csv(here("data", "mtcars.csv"))
To install the current release version of the here package:
install.packages("here")
Or, to install the development version:
devtools::install_github("r-lib/here")
For more information about here, visit https://github.com/jennybc/here_here