Continuous deployment of package documentation with pkgdown and Travis CI
The problem
pkgdown is an R package that can create a beautiful-looking website for your own R package. Built and maintained by Hadley Wickham and his gang of prolific contributors, this package can parse the documentation files and vignettes for your package and build a website from them with a single command: build_site(). This is what such a pkgdown-generated website looks like in action.
The HTML files that pkgdown generates are stored in a docs folder. If your source code is hosted on GitHub, you just have to commit this folder to GitHub, navigate to the Settings panel of your GitHub repo and enable GitHub Pages to host the docs folder at https://<name_or_org>.github.io/<package_name>. It’s remarkably easy and a great first step. In fact, this is how the pkgdown-built website for pkgdown itself is hosted.
Although it’s an elegant flow, there are some issues with this approach. First, you’re committing files that were automatically generated even though the source required to build them is already stored in the package. In general, it’s not good practice to commit automatically generated files to your repo. What if you update your documentation, and commit the changes without rerendering the pkgdown website locally? Your repo files will be out of sync, and the pkgdown website will not reflect the latest changes. Second, there is no easy way to control when you release your documentation. Maybe you want to work off of the master branch, but you don’t want to update the docs until you’ve done a CRAN release and corresponding GitHub release. With the ad-hoc approach of committing the docs folder, this would be tedious.
The solution
There’s a quick fix for these concerns though, and that is to use Travis CI. Travis CI is a continuous integration tool that is free for open-source projects. When configured properly, Travis will pick up on any changes you make to your repo. For R packages, Travis is typically used to automatically run the battery of unit tests and check if the package builds on several previous versions of R, among other things. But that’s not all; Travis is also capable of doing deployments. In this case, I’ll show you how you can set up Travis so it automatically builds the pkgdown website for you and commits the web files to the gh-pages branch, which GitHub then uses to host your package website. To see how it’s set up for an R package in production, check out the testwhat package on GitHub, which we use at DataCamp to grade student submissions and give useful feedback. In this tutorial, I will set up pkgdown for the tutorial package, another one of DataCamp’s open-source projects, which makes your blogs interactive.
The steps
- Go to https://travis-ci.org and link your GitHub account.
- On your Travis CI profile page, enable Travis for the project repo that you want to build the documentation for. The next time you push a change to your GitHub project, Travis will be notified and will try to build your project. More on that later.
- In the DESCRIPTION file of your R package, add pkgdown to the Suggests list of packages. This ensures that when Travis builds and installs your package, it will also install pkgdown so we can use it for building the website.
- In the .gitignore file, make sure that the entire docs folder is ignored by git: add the line docs/*.
- Add a file with the name .travis.yml to your repo’s root folder, with the following content:

    language: r
    cache: packages

    after_success:
      - Rscript -e 'pkgdown::build_site()'

    deploy:
      provider: pages
      skip-cleanup: true
      github-token: $GITHUB_PAT
      keep-history: true
      local-dir: docs
      on:
        branch: master

  This configuration file is very short, but it’s doing a lot of different things. Jeroen Ooms and Jim Hester maintain a default Travis build configuration for R packages that does a lot of things for you out of the box. A Travis config file with only the language: r tag would already build, test and check your package for inconsistencies. Let’s go over the other fields:
  - cache: packages tells Travis to cache the package installs between builds. This will significantly speed up your package build time if you have some package dependencies.
  - after_success tells Travis which steps to take when the R CMD check step has succeeded. In our case, we’re telling Travis to build the pkgdown website, which will create a docs folder on Travis’s servers.
  - Finally, deploy asks Travis to go ahead and upload the files in the docs folder (local-dir) to GitHub Pages, as specified through provider: pages. The on field tells Travis to do this deployment step only if the change that triggered the build happened on the master branch.
  For a full overview of the settings, you can visit this help article. We do not have to specify the GitHub target branch where the docs have to be pushed to, as it defaults to gh-pages.
- Notice that the deploy step also features a github-token field that takes an environment variable. Travis needs this key to make changes to the gh-pages branch. To get these credentials and make sure Travis can find them:
  - Go to your GitHub profile settings and create a new personal access token (PAT) under the Developer Settings tab. Give it a meaningful description, and make sure to generate a PAT that has either the public_repo (for public packages) or repo (for private packages) scope.
  - Copy the PAT and head over to the Travis repository settings, where you can specify environment variables. Make sure to name the environment variable GITHUB_PAT.
- The build should be good to go now! Commit the changes to your repo (DESCRIPTION and .travis.yml) to the master branch of your GitHub repo with a meaningful message.
- Travis will be notified and get to work: it builds the package, checks it, and if these steps are successful, it will build the pkgdown website and upload it to gh-pages.
- GitHub notices that a gh-pages branch has been created, and immediately hosts it at https://<name_or_org>.github.io/<package_name>. In our case, that is https://datacamp.github.io/tutorial. Have a look. What a beauty! Without any additional configuration, pkgdown has built a website with the GitHub README as the homepage, a full overview of all exported functions, and vignettes under the Articles section.
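The deploy block from the steps relies on the provider's default target branch. As a minimal sketch, the same block with that default written out could look like the fragment below. The target-branch key is an assumption here: it is an option of Travis's GitHub Pages provider, but it does not appear in the configuration used in this post.

```yaml
# Sketch only: same deploy block as in the steps, with the default
# target branch made explicit. `target-branch` is not in the original
# configuration; omitting it should give the same result.
deploy:
  provider: pages
  skip-cleanup: true
  github-token: $GITHUB_PAT   # environment variable set in the Travis repo settings
  keep-history: true
  local-dir: docs             # folder created by pkgdown::build_site()
  target-branch: gh-pages     # assumed to be the provider's default
  on:
    branch: master
```

Spelling out the branch changes nothing in behavior; it only makes it obvious to later readers of the file where the site ends up.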
From now on, every time you update the master branch of your package and the package checks pass, your documentation website will be updated automatically. You no longer have to worry about keeping the generated files in sync with your actual in-package documentation and vignettes. You can also easily tweak the deployment process so it only deploys the documentation whenever you make a GitHub release. Along the way, you got continuous integration for your R package for free: the next time you make a change, Travis will notify you if it broke any tests or checks.
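One way to restrict deployment to releases is to condition the deploy step on tags instead of on a branch. The fragment below is a sketch under two assumptions: that each GitHub release creates a git tag, and that Travis is allowed to build tags for your repo. It is not the configuration used for the tutorial package.

```yaml
# Sketch only: deploy the docs folder when the build was triggered by a tag,
# for example the tag created by a GitHub release.
deploy:
  provider: pages
  skip-cleanup: true
  github-token: $GITHUB_PAT
  keep-history: true
  local-dir: docs
  on:
    tags: true   # replaces `branch: master`
```

With this condition, pushes to master are still built and checked, and the after_success step still renders the site on Travis, but the gh-pages branch is only updated for tagged builds.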
Happy packaging!
References
- pkgdown: https://github.com/r-lib/pkgdown, https://pkgdown.r-lib.org/.
- testwhat: https://github.com/datacamp/testwhat, https://datacamp.github.io/testwhat.
- tutorial: https://github.com/datacamp/tutorial, https://datacamp.github.io/tutorial.
- Full docs on building an R project with Travis CI: https://docs.travis-ci.com/user/languages/r/.