Speeding up your Continuous Integration Builds
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Continuous integration is an amazing tool when developing R packages. We push a change to the server, and a process is spawned that checks we haven’t done something silly. It protects us from ourselves! However, this process can become slow, as typically the CI process starts with a blank virtual machine (VM).
If you are using R, then currently the most popular CI pipeline is Travis CI, but there’s also Jenkins, GitHub Actions, GitLab CI, CircleCI and a few others. They all follow the same idea. Start a VM, install your R package, then run a bunch of checks. One obvious bottleneck is the “install your R package” step, as any R package may have a large number of dependencies.
In a recent post, we showed the different ways of speeding up package installation (worth checking this out if you find package installation/updating slow). In this post, we’ll discuss leveraging some of those techniques for our CI pipeline.
RStudio Package Manager (RSPM)
The RStudio Package Manager is perhaps the easiest way of speeding up your CI process. RSPM provides precompiled binaries for CRAN packages, which should ensure a faster install. To test this, I made a simple package with no functions but a dependency on the tidyverse, i.e. Imports: tidyverse in the DESCRIPTION file. Then I started two Travis CI jobs. The first had a .travis.yml file containing

    language: r
    cache: packages

The total time for this Travis job was around twelve minutes.
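For anyone wanting to repeat the timing, a test package along these lines could be recreated with a few shell commands. This is only a sketch: the function name make_test_pkg, the package name citest and every metadata value in the DESCRIPTION file are placeholders, not the ones used for the numbers in this post.

```shell
# Create a minimal package skeleton whose only dependency is the tidyverse.
# $1 = target directory. The package name and metadata are placeholders.
make_test_pkg() {
  pkg_dir="$1"
  mkdir -p "$pkg_dir"

  # DESCRIPTION: the Imports line is the only part taken from the post
  cat > "$pkg_dir/DESCRIPTION" <<'EOF'
Package: citest
Title: Placeholder Package for Timing CI Builds
Version: 0.0.1
Authors@R: person("First", "Last", email = "first.last@example.com", role = c("aut", "cre"))
Description: Contains no functions and exists only to pull in dependencies.
License: GPL-3
Imports: tidyverse
EOF

  # NAMESPACE: nothing is exported, so a comment is all it holds
  printf '# No exports\n' > "$pkg_dir/NAMESPACE"

  # .travis.yml: the two-line baseline configuration from the first job
  cat > "$pkg_dir/.travis.yml" <<'EOF'
language: r
cache: packages
EOF
}
```

Calling make_test_pkg citest creates the directory, which can then be committed and pushed to trigger a build.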
The second job had the same two lines, plus an additional before_install: section

    before_install:
      - echo "options(repos = c(CRAN = 'https://packagemanager.rstudio.com/all/__linux__/xenial/latest'))" >> ~/.Rprofile.site
      - echo "options(HTTPUserAgent = paste0('R/', getRversion(), ' R (', paste(getRversion(), R.version['platform'], R.version['arch'], R.version['os']), ')'))" >> ~/.Rprofile.site
While this looks complicated, it is actually fairly simple. The first line adds the RStudio binary package repository to ~/.Rprofile.site. The second adds an HTTPUserAgent option to the same file, so that packages installed via Rscript also use the binary package versions. These few lines cut the Travis build time from around 12 minutes to under 4 minutes.
The above is an incredibly easy way to speed up your CI steps and works with other CI systems. If you use GitHub Actions, then this has already been implemented.
A couple of things to note
- The above code is for Ubuntu 16.04 Xenial. If you are using 18.04 bionic, replace xenial with bionic in the repository URL.
- There are a few different OSs available for RSPM
- If you are interested in using the RSPM in your own organisation, give us a shout – we’re RStudio Partners.
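On CI systems other than Travis, the same two options() lines can be written from a small shell script. The sketch below assumes the RSPM URL pattern shown above stays the same apart from the Ubuntu codename. The helper names rspm_repo_url and write_rspm_profile are made up for illustration, and the default profile path simply mirrors the Travis example.

```shell
# Build the RSPM binary repository URL for an Ubuntu codename (xenial, bionic, ...).
rspm_repo_url() {
  printf 'https://packagemanager.rstudio.com/all/__linux__/%s/latest\n' "$1"
}

# Append the two options() calls from the Travis example to an R profile file.
# $1 = Ubuntu codename
# $2 = profile file (optional; defaults to ~/.Rprofile.site as in the Travis example)
write_rspm_profile() {
  codename="$1"
  profile="${2:-$HOME/.Rprofile.site}"
  {
    # Point CRAN at the binary repository
    printf "options(repos = c(CRAN = '%s'))\n" "$(rspm_repo_url "$codename")"
    # Set the user agent so that Rscript installs also receive binaries
    printf '%s\n' "options(HTTPUserAgent = paste0('R/', getRversion(), ' R (', paste(getRversion(), R.version['platform'], R.version['arch'], R.version['os']), ')'))"
  } >> "$profile"
}
```

Running write_rspm_profile bionic as an early CI step, before any packages are installed, should have the same effect as the before_install lines above. Whether your CI image actually reads ~/.Rprofile.site is worth checking, so pass a different path as the second argument if needed.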
Other methods
There are three other possibilities for reducing your CI time.
- The first is similar to the RStudio Package Manager and uses binary builds, but this time the Ubuntu versions provided by Michael Rutter. The general idea is to add a new Ubuntu package repository, then install packages via apt install r-cran-*. Details are available at CRAN. Also see Dirk Eddelbuettel’s recent blog post and YouTube video for even more details.
- Alternatively, we could use the ccache trick, where we store compiled files to be used for the next build. This requires a little more work, but it has already been done by Patrick Schratz.
- Parallel builds using the Ncpus argument with install.packages() typically don’t work on most CI systems, as the (free) VM will only have a single core.
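Before relying on the last point, it is easy to check how many cores the CI machine really has. The following sketch only adds Ncpus when more than one core is reported. The function names detect_cores and install_expr are hypothetical, and nproc is assumed to be available, as it is on typical Linux images.

```shell
# Report the number of available cores; fall back to 1 if nproc is missing.
detect_cores() {
  if command -v nproc >/dev/null 2>&1; then
    nproc
  else
    echo 1
  fi
}

# Print (but do not run) an R call that installs a package.
# $1 = number of cores, $2 = package name
# Ncpus is only added when there is more than one core to use.
install_expr() {
  cores="$1"
  pkg="$2"
  if [ "$cores" -gt 1 ]; then
    printf 'install.packages("%s", Ncpus = %s)\n' "$pkg" "$cores"
  else
    printf 'install.packages("%s")\n' "$pkg"
  fi
}
```

A CI step could then run Rscript -e "$(install_expr "$(detect_cores)" tidyverse)". On a single-core VM this reduces to a plain install.packages() call, so nothing is lost by leaving it in.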
Jumping Rivers are full service, RStudio certified. Part of our role is to offer support in RStudio Pro products. If you use any RStudio Pro products, feel free to contact us ([email protected]). We may be able to offer free support.
The post Speeding up your Continuous Integration Builds appeared first on Jumping Rivers.