Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
How many times has your code unexpectedly stopped working? Even better, how many times did the code work on your machine, but your coworkers couldn’t run it? Both questions share the same underlying problem – the R environment.
That’s where R renv comes in – a dependency management package that helps prevent issues like these. Today you’ll learn all about renv through a hands-on example. You’ll also see how to work with Plotly and Git. Let’s get started!
Need to make interactive Markdown documents? Try R Quarto – you’ll never look back.
Table of contents:
- What is R renv and Why Should You Care?
- 2 Ways to Use renv in Your R Project
- How to Take Snapshots of your R Environment
- How to Restore your R Environment with a Single Command
- Summing up R renv
What is R renv and Why Should You Care?
Renv stands for Reproducible Environment and does just what the name suggests. Developers often have trouble managing R environments and dependencies because of how R works by default: it installs packages to a central library and shares them between projects.
It sounds like a good and time-saving feature. After all, you don’t need to install the same package in every project. But that’s where the problems arise. You might have a newer version of some package than your coworkers, so your code may rely on functionality that is deprecated, changed, or not yet implemented in the version they have installed.
End result? The app crashes or the code won’t run at all, and no one knows why because the code didn’t change.
The answer is usually the same – environment differences and package version mismatches. R renv is here to create a separate, reproducible environment that you and your coworkers can use, hassle-free.
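To see the shared-library setup described above on your own machine, here is a minimal base R sketch. The helper name installed_version() is our own invention for illustration, not part of renv or base R.

```r
# Library folders R searches for packages. Without renv, every project on
# this machine shares these same folders.
print(.libPaths())

# Hypothetical helper: returns the installed version of a package as text,
# or NA if the package is not installed on this machine.
installed_version <- function(pkg) {
  if (nzchar(system.file(package = pkg))) {
    as.character(packageVersion(pkg))
  } else {
    NA_character_
  }
}

# Compare this output with a coworker's to spot version mismatches.
print(installed_version("stats"))
print(installed_version("plotly"))
```

If you and a coworker get different results for the same package, you have likely found the kind of mismatch renv is designed to remove.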
To be more precise, renv will do the following:
- Create a project-specific library (no dealing with the central R library).
- Create a list of dependencies and their versions, making it easy to share the project.
You now know why you should use renv, but how do you actually go about it? Let’s answer this question next.
2 Ways to Use renv in Your R Project
There are two distinct ways to leverage renv – when first creating a project, or later through the R console.
When creating a new project, just tick the second checkbox, as shown in the image below:
In case you forgot to do it, or prefer doing things through a command line, just type the following into the R console:
renv::activate()
This will do all the housekeeping for you, and create several files and folders (we started with an empty directory):
Let’s discuss the purpose of each file next.
Files and folders created by R renv – What do they mean?
Here’s the meaning behind every file/folder created by renv:
- .Rprofile – A file run by R every time you load or reload an R session in the project. It calls the renv/activate.R file.
- renv/.gitignore – Tells Git to ignore the library folder, as it contains dependencies that can be large in size. There’s no need to keep track of them, as the correct versions can easily be downloaded by your coworkers.
- renv/activate.R – A file used to activate a local R environment.
- renv/library/* – Folder with many subfolders – contains the project dependencies.
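If you want to confirm that the housekeeping files listed above are in place, a quick base R check like the one below can help. It is only a sketch and assumes your working directory is the project root.

```r
# Paths renv is expected to create, relative to the project root
# (taken from the list above).
renv_paths <- c(".Rprofile", "renv/activate.R", "renv/.gitignore", "renv/library")

# file.exists() works for both files and folders.
present <- file.exists(renv_paths)
names(present) <- renv_paths
print(present)
```

Any FALSE value suggests that renv has not been activated in this folder yet, or that you are running the check from a different directory.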
And with that out of the way, let’s discuss an essential R renv topic – snapshots.
How to Take Snapshots of your R Environment
A snapshot is renv’s term for determining the dependencies used in your R project and writing them to a separate file – renv.lock.
You’ll see how it works in a second, but first, let’s install some R packages:
install.packages("dplyr")
install.packages("plotly")
Once the packages are installed, take a snapshot by running the following line from the R console:
renv::snapshot()
Specify that you want to proceed by typing y when prompted, and you’re good to go! A new file named renv.lock is now created, containing all project dependencies and sub-dependencies with the correct version:
The file is almost empty (the only dependency being renv), so what gives? Well, renv stores only the dependencies actually used in your project. Since there’s no R code importing either dplyr or plotly, they weren’t added to the lock file.
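You can preview which packages renv considers "used" before taking a snapshot. The sketch below relies on renv::dependencies(), which scans R files for package usage. The temporary folder and the chart.R file name are made up for illustration, and the code assumes renv is installed.

```r
# Create a throwaway folder containing a single script that loads plotly.
demo_dir <- file.path(tempdir(), "renv-deps-demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("library(plotly)", file.path(demo_dir, "chart.R"))

# Assumption: renv is installed. The scan is static, so plotly itself
# does not need to be installed for it to be detected.
if (requireNamespace("renv", quietly = TRUE)) {
  deps <- renv::dependencies(path = demo_dir)
  print(unique(deps$Package))
}
```

Pointing the same call at your project folder should show why dplyr and plotly were left out of the first lock file: nothing in the project referenced them yet.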
Let’s change that by making an R file that will render a chart. We’ve copied some code from the Plotly documentation, and encourage you to do the same:
library(plotly)

x <- c("Product A", "Product B", "Product C")
y <- c(20, 14, 23)
y2 <- c(16, 12, 27)
text <- c("27% market share", "24% market share", "19% market share")
data <- data.frame(x, y, y2, text)

fig <- data %>% plot_ly()
fig <- fig %>% add_trace(
  x = ~x, y = ~y, type = "bar",
  text = y, textposition = "auto",
  marker = list(
    color = "rgb(158, 202, 225)",
    line = list(color = "rgb(8, 48, 107)", width = 1.5)
  )
)
fig <- fig %>% add_trace(
  x = ~x, y = ~y2, type = "bar",
  text = y2, textposition = "auto",
  marker = list(
    color = "rgb(58, 200, 225)",
    line = list(color = "rgb(8, 48, 107)", width = 1.5)
  )
)
fig <- fig %>% layout(
  title = "January 2013 Sales Report",
  barmode = "group",
  xaxis = list(title = ""),
  yaxis = list(title = "")
)

fig
Here’s the figure displayed by Plotly:
You can now once again take the snapshot:
renv::snapshot()
And take a look at the lock file:
As you can see, there are many dependencies listed, but we’ve used only one – plotly. The reason is simple: plotly needs a handful of packages in order to work, and each of these packages has its own dependencies. You can now see how quickly dependency management can become a nightmare.
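Since renv.lock is a JSON file, you can also inspect it from R instead of scrolling through it. The sketch below assumes the jsonlite package is installed and that the lock file keeps its packages under a top-level Packages entry, each with a Version field. The helper name lock_versions() is hypothetical.

```r
# Hypothetical helper: returns a named character vector of package versions
# recorded in an renv lock file.
lock_versions <- function(path = "renv.lock") {
  lock <- jsonlite::fromJSON(path, simplifyVector = FALSE)
  vapply(lock$Packages, function(pkg) pkg$Version, character(1))
}

# Only runs when a lock file is present in the working directory.
if (file.exists("renv.lock") && requireNamespace("jsonlite", quietly = TRUE)) {
  versions <- lock_versions()
  print(length(versions))  # how many packages are locked in total
  print(head(versions))    # the first few packages and their versions
}
```

Comparing the count before and after adding the Plotly script is a quick way to see how many sub-dependencies a single library() call pulls in.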
But how easy is it now for other developers to recreate this environment? To answer this question, we have to put ourselves in the shoes of other developers.
Pushing Your R Project to GitHub
This will allow us to clone the project and start fresh (just as if you weren’t the author of the code), and determine if dependency management with renv really works.
Start by creating a new repository on GitHub:
Initialize the R project folder as a Git project, and push it to a remote with the following set of commands:
git init
git add .
git commit -m "initial commit"
# make sure the local branch is named main before pushing
git branch -M main
git remote add origin https://github.com/<you>/<project>.git
git push -u origin main
Assuming you did everything correctly, you’ll see the R project pushed to the main branch:
Next, let’s restore this R environment to test if dependency management works as advertised.
How to Restore your R Environment with a Single Command
Let’s start by cloning the repository into a new folder – NewRenvProject:
git clone https://github.com/<you>/<project>.git NewRenvProject
As soon as you open it as a project in RStudio, you’ll see the message from renv telling you how to restore the environment:
Just run the following command:
renv::restore()
And you’ll be good to go:
The R environment is now restored and you have access to all dependencies at the correct version, just as intended by the project author.
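If you’d like to double-check the result, the sketch below shows one way to do it. It assumes you run it from the cloned project with renv installed. renv::status() is used here to report any differences between the project library and the lock file.

```r
# With renv active, the first library path should point inside the
# project's renv/library folder rather than the central R library.
print(.libPaths()[1])

# Assumption: renv is installed and renv.lock is in the working directory.
if (requireNamespace("renv", quietly = TRUE) && file.exists("renv.lock")) {
  renv::status()  # reports whether the library and lock file are in sync
}
```

If the first path still points to your central library, the project most likely wasn’t opened as an RStudio project, so .Rprofile never ran.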
Summing up R renv
And that’s how easy it is to manage project package dependencies in R environments. It all boils down to three renv functions – activate(), snapshot(), and restore(). You’ve learned how each works through a practical example, and by now, we hope you can appreciate the heavy lifting renv does for you.
What’s your favorite way to manage dependencies in R? Are you still using the outdated packrat package? Please let us know in the comment section below. Also, don’t hesitate to move the discussion to Twitter – @appsilon. We’d love to hear from you.
What do R Shiny developers do? Here’s a typical day in the life of Appsilon’s Alexandros Kouretsis.
The post appeared first on appsilon.com/blog/.