Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The problem
Some time ago, I started a blog. I actually did not post a whole lot of stuff though. I was using Jekyll, but my set-up was rather brittle and there were a few problems:
- I could not use R Markdown directly. I always had to knitr manually to get plain Markdown from my R Markdown files and then use them as input for Jekyll. Since the majority of my posts involved at least some R code, this was not a very elegant thing to do.
- Since every R Markdown post had to be converted into plain Markdown in a ( more or less) manual way, I could not use automated builds on GitHub Pages.
- This, in turn, meant I had to build the content locally and commit all derivatives (i.e. html documents), which isn’t very elegant either.
My solution
Below, I outline how I managed to address these three problems with the following three tools: blogdown to create content, Netlify to deploy the website and Travis to automate the build and deployment step. My approach does not require html or Markdown derivatives (from R Markdown files) to be committed (and pushed) in git, which I think is nice and helps keeping the repo lightweight. You can also preview PR builds. But let’s go step by step.
Step 1: Creating content
I started my journey reading up on blogdown, which I think was a good starting point. If you are like me a few months ago and still are not quite sure what Pandoc, Markdown and knitr exactly are, there is a great resource from RStudio on that and I recommend first freshing up your knowledge on the concepts and tools blogdown is built on.
When I got the basics, I checked out the hugo theme
gallery. I was looking for a simplistic theme, and I
found one: cactus plus. As
advised by Yihui, I created a new RStudio Project and ran
blogdown::new_site(theme = "nodejh/hugo-theme-cactus-plus")
. So far so good. I
copied over some of my old blog posts into this directory and adapted the YAML
headers to match the format of the example posts, which worked smoothly. Next, I
adapted some configurations using the explanations in the cactus plus
README. In
particular, I did the following:
- Adapted the avatar, title and subtitle of the webpage.
- Set
useDescriptionReplaceSummary
inconfig.toml
totrue
to use the custom description from the YAML header for each post. - Added my
disqus
andgoogle_analytics
user name toconfig.toml
. - Most importantly, I set the base url to “/” in in
config.toml
. Note that the base URL should in any case end with a back slash, otherwise certain pages won’t render correctly, i.e. thetags
. However, it may be better to set the base URL to your actual domoain (e.g. https://lorenzwalthert.netlify.com in my case), for example if you want to make sure sites like r-bloggers correctly link back to your page for relative links and your main page. I to my actual domain after “/” has not worked well with r-bloggers. - As we are talking of r-bloggers, you also want to make sure posts as a whole get rendered there, so I followed these instructions on stackoverflow.com on the issue of partial rendering.
I noted that the cactus theme had gorgeous tables of content (toc) at the
beginning of every post. However, they only work if the posts that are input to
hugo are in Markdown, not if they are html already. If you add a YAML option in
your Rmd file that creates a toc, it’s using some default css instead of using
the one defined in the hugo theme. The solution is to use .Rmarkdown
files
instead of .Rmd
for all the posts that should inherit css from the hugo theme.
Using .Rmarkdown
means blogdown won’t render to html, but first to Markdown
and then call hugo to turn the Markdown files into html. You can find some
comments on that in the blogdown
documentation and in
rstudio/blogdown#165 on the
matter. Main drawback of using .Rmarkdown
over .Rmd
is that html widgets and
citations are not supported and you might have issues with math expressions,
which is because hugo is used instead of Pandoc to go from Markdown to html. I
ended up using .Rmarkdown
for the post you are reading because I wanted a toc.
For quite a few other posts, I stuck to .Rmd
.
Now that I had the content, I had to think about how to deploy the page.
Step 2: Deploying the page
I read the section on
deployment in the blogdown
documentation, but it was not particularly revealing to me. It seemed as most
other people in the community just commit html or Markdown files, but I figured
that I’d like to deploy my page without committing these derivatives and operate
on pure R Markdown (i.e either .Rmd
or .Rmarkdown
) instead. A quick Google
search did not help much either. I then realized I might want to use the
Netlify cli tools to deploy a locally built
site with Netlify. The only problem with that was that these only seemed to work
properly on Linux, which is not my primary operating system. It was only later
that I figured out that there are also officially supported cli tools for
Windows and Mac, but they are a legacy
API and hence, you better don’t use it if you are starting fresh. I managed to
deploy a page from my local Linux environment with Netlify. It’s pretty
straightforward once you succeeded to build your website locally and stored it
in the public/
directory. Here is what I did:
- Use the RStudio Addin Serve Site (or
blogdown::serve_site()
), which updates your static site in the self-containing publishing directory,public/
by default. - Follow the video in the Netlify cli tools
documentation, i.e. essentially
$ netlify deploy
mypublic/
folder.
Please note at that point that you cannot proceed if you don’t have a .netlify
folder with state.json
in in. You need to commit this folder later so your
remote CI service has it available for step 3a onward.
Alternatively to everything I wrote under step 2 so far, you can commit and push
Markdown or html derivatives of your R Markdown files and use the Netlify build
command hugo
, potentially specifying the hugo version as part of the build
command (like hugo_0.41
) or via an environment variable. However, I refrained
from doing this because I felt like this approach was not as clean as building
the whole page on a remote CI tool.
Next, I looked into domain settings. I changed the publishing URL in the Netlify
settings, so my page would be available under
https://lorenzwalthert.netlify.com
. You can also use a custom non-Netlify
domain. Next, I added a CNAME
file to the root directory of the repo where my
old website was hosted, so https://lorenzwalthert.github.io
would forward to
https://lorenzwalthert.netlify.com
. I then tried to use the Netlify DNS
service, but I think I won’t be able to show the content of
https://lorenzwalthert.netlify.com
to people visiting
https://lorenzwalthert.github.io
because I do not own the domain. So for now,
it’s ok for me that people just get redirected. Note the side effects of this:
All your other websites such as pkgdown websites published via GitHub pages
become invalid. For this reason, think about whether you really want to redirect
with CNAME.
In addition, I made sure all old URLs redirected correctly by adding a aliases:
[new_url]
entry to the YAML header of my posts where [new_url]
is relative to
the base URL.
Step 3a: Automatically deploying the page
Finally, deploying the page manually every time you make a change seems like a
step that could be automatized – and it is. I had no idea how though. For some
reason, I knew that the tidyverse style guide gets
deployed automatically with Netlify, so I looked at the GitHub repo to figure
out how it might work. I realized that they use Travis to do the deployment. I
inspected the .travis.yml
file and copied it over to my repo. After tweaking
it a little, I ended up with the following .travis.yml
file:
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r language: R sudo: false cache: packages: true directories: - $HOME/.npm branches: only: - master before_script: - nvm install node - npm install -g netlify-cli - Rscript -e 'blogdown::install_hugo()' script: - Rscript -e 'blogdown::build_site(local = Sys.getenv("TRAVIS_PULL_REQUEST_BRANCH") != "")' after_success: - ./deploy.sh
In addition, I found
this
blog post, where I learned how I could create an access token for Netlify which
I could pass to Travis so it has the privileges to deploy to Netlify.
Essentially you go here and create a new
access token, which you in turn add a hidden environment variable in the Travis
settings of the repo that contains the source code for the webpage to deploy.
$ netlify
command in the deploy step,
so if you want to use this script, you need to call the environment variable
NETLIFY_AUTH_TOKEN
.
There were only two more steps to complete:
- Add a DESCRIPTION file to my repo and add the dependencies I used in my R code
to it, which you can do most conveniently with
usethis::use_package("your_package")
after you added a basic DESCRIPTION file. Alternatively, you can add install commands to your.travis.yaml
file but I think the DESCRIPTION solution is nicer. - Commit the
.netlify
directory from my local Linux environment from which I have already successfully deployed the page. Don’t know exactly what it does, but I think it creates some deplyoment context that is used when you call$ netlify deploy
, i.e. it does not prompt the user to give further instructions such as the directory that contains the files to deploy. This is helpful because you can’t give interactive instructions to Travis during the build process.
Step 3b: Automatically deploying the page with merge preview
This was it. The next build succeeded and my site got deployed. A nice feature
of Netlify is also that you can pre-view rendered pull requests before you merge
them, so you can detect dead links, missing pictures etc. When I clicked on the
Netlify CI link in the PR on GitHub, I got an unexpected result though. The
pages was empty. What was that? After skimming through my Netlify settings, I
discovered that the build command was set to hugo
and when I ran $ hugo
locally, I got the same result as the Netlify preview. Hugo was expecting either
html or Markdown files, but I only had R Markdown. So just as for the regular
build process, I could not use Netlify to build my page. Dang.
I checked out the tidyverse.org website from which I knew that they had Netlify previews. The way it works ( as of June 2018) was that people had to locally serve the site and also commit the derivatives, i.e. the html pages, which is basically equivalent to the alternative approach mentioned in step 2. Since I already achieved deploying the page with Travis and Netlify without committing any derivatives, I felt like it could not be all that complicated to do the same for PRs.
A few hours later, I figured out how to do it.
- First, you need to know that in the above script, the deployment happens only
for the production branch. Hence, you should use
after_success:
in.travis.yml
instead ofdeploy:
, the former getting executed for every branch. - Then, I wrote a little bash script that
- First checks if the current branch is the master branch or some other branch.
- Depending on that, either do a normal deploy of the page (in case we are on master) or only deploy a draft.1
This translates to the following deployment script:
#!/bin/bash echo "Your PATH is $PATH" echo "You are on branch $TRAVIS_BRANCH" echo "The TRAVIS_PULL_REQUEST_BRANCH is $TRAVIS_PULL_REQUEST_BRANCH" if [[ "$TRAVIS_PULL_REQUEST_BRANCH" = "" ]] then echo "you are on master, deploying production." netlify deploy --prod --dir=public else echo "you are not on master, deploying preview." netlify deploy --dir=public fi
The draft mode in particular uses your key and publishing directory from the
context stored in your .netlify/state.json
file. The draft won’t show up in
your Netlify dashboard, but you need to go to the Travis log, unfold the
./deploy.sh
call at the very bottom to see where the site got deployed to.
Paste this URL to a browser and check if the page looks as expected. Keep in
mind that the Netlify CI check that is shown in the GitHub PR (depicted bellow)
is the build of the empty page and is hence irrelevant for the approach
presented under the header 3b. Use the one from the Travis log in the above
picture.
I nevertheless changed the settings in Netlify for the deployment for the former so I get three green ticks. The reason it first failed was that the cactus plus theme has a minimal version requirement for hugo that is larger than the hugo default version.
For the build.sh
to get called, you obviously also need to change your
.travis.yml
. Here is my final version:
# R for travis: see documentation at https://docs.travis-ci.com/user/languages/r language: R sudo: false cache: packages: true directories: - $HOME/.npm branches: only: - master before_script: - nvm install node - npm install -g netlify-cli - Rscript -e 'blogdown::install_hugo()' script: - Rscript -e 'blogdown::build_site(local = Sys.getenv("TRAVIS_PULL_REQUEST_BRANCH") != "")' after_success: - ./deploy.sh
Alternative approach with the tic package
Maëlle Salmon kindly reviewed the blog post you are reading and she pointed to some related resources. In particular, she mentioned the tic package – the name stands for taks integrating continuously – which supports deploying blogdown pages. I forked the example repsoitory and tried it out. Here is what I did:
- I first set up a public-private key pair for a ssh connection with
travis::use_travis_deploy()
. This allows travis to push to GitHub. - Next, I added the GitHub repository to Netlify.
- I changed the base URL as described above, so the links would work.
tic offers a rich and elegant domain specific language. Just have a look at
./tic.R
of the above example repository, which contains the code to build and
deploy the site.In a nutshell, I think this approach does the following:
- runs
blogdown::build_site()
on your R Markdown or plain Markdown inputs and creates html derivatives (for.Rmd
) and Markdown derivatives (for.Rmarkdown
). - commits and pushes all new files (i.e the derivatives) to GitHub.
- the push triggers a Netlify built process using the
hugo
build command.
I could not figure out quickly how to create merge previews for PRs with the tic approach or how to refrain from having derivatives committed. But in principle, it should be possible with tic too.
Roundup
If you were able to follow the blog post so far, you have probably realized by now that all these different approaches essentially do the same, it’s just a matter of what get’s done where and whether it gets committed.
Overall, I am pretty happy with the set-up presented up to chapter 3b. Let me summarize again how the approach outlined in step 3b compares to existing solutions I have found.
- It is similar to style.tidyverse.org in that it does not require derivatives to be committed and PR builds can be previewed. However, style.tidyverse.org was built with bookdown. This approach also uses the Netlify CLI tools.
- It is similar to tidyverse.org as the page also uses blogdown and offers PR build previews. The difference is that tidyverse.org does not use Travis to deploy via Netlify, but instead requires html versions of all pages to be committed that are build locally with blogdown.
Looking back, it would have been much easier to build the page locally and simply commit Markdown or html derivatives, as suggested by Yihui in his chapter on Netlify deployment. If I’d foreseen the trouble involved in getting it working, I possibly would have choose this approach. Now that I figured out how you can do without committing derivatives and how to get merge previews for PRs, I hope some people find this post useful and decide to adapt this strategy.
I think if we’d manage to combine the approach outlined in paragraph 3b with tic, this would be my preferred solution. Maybe a topic for another post.
I’d like to thank Yihui for the wonderful blogdown package, Maëlle Salmon for reviewing this post and pointing to interesting related resources which I discussed in the penultimate header of this post.
In case you are interested in some of the details I skipped above for the sake of brevity, feel free to reach out to me via the comment functionality below. Checking out the source code of this webpage might be clarifying too.
I hope these instructions are informative so that you’ll need a fraction of the time I spend on configuring my site.
- Note that there seems to be only one URL for built drafts. Hence, the URL always reflects the last built PR, so it’s inconvenient to work on multiple drafts at the same time. [return]
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.