Optimal workflows for package vignettes

This article was first published on Posts on R-hub blog, and kindly contributed to R-bloggers.

Yet another post with a focus on package documentation! This time, we’ll cover vignettes, a.k.a. “long-form package documentation”: both the basics of vignette building and infrastructure, and some tips for more maintainer- and user-friendliness.

What is a vignette? Where does it live?

In this section we shall go over basics of package vignettes.

Vignette 101

In the “R packages” book by Hadley Wickham and Jenny Bryan, the vignettes chapter starts with “A vignette is a long-form guide to your package. Function documentation is great if you know the name of the function you need, but it’s useless otherwise.”1 In “Writing R Extensions”, vignettes are defined as “documents in PDF or HTML format obtained from plain-text literate source files from which R knows how to extract R code and create output (in PDF/HTML or intermediate LaTeX)”.

In practice, if your package contains one or several vignette(s), a user could list them with vignette() or browseVignettes():

vignette(package = "rhub")

Item             Title
rhub             get-started (source, html)
local-debugging  Local Linux checks with Docker (source, html)

browseVignettes("rhub")

Vignette              Title
rhub.html             get-started
local-debugging.html  Local Linux checks with Docker

Note that if the user installs your package from GitHub using devtools, they will need to explicitly ask for installing vignettes.
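
As a rough illustration of that explicit request, the sketch below wraps remotes::install_github() with build_vignettes = TRUE. The helper name install_with_vignettes() is made up for this example, and the "r-hub/rhub" repository slug in the commented call is an assumption about where the package source lives.

```r
# Hypothetical helper: install a package from GitHub and build its vignettes.
# Assumes the remotes package is installed and the machine is online.
install_with_vignettes <- function(repo) {
  if (!requireNamespace("remotes", quietly = TRUE)) {
    stop("Please install the 'remotes' package first.")
  }
  # Vignettes are not built unless this is requested explicitly.
  remotes::install_github(repo, build_vignettes = TRUE)
}

# Not run here, since it needs network access:
# install_with_vignettes("r-hub/rhub")
```

Building vignettes at install time also means the packages needed to build them have to be available on the user's machine.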

As a package author you could be fine only knowing about usethis::use_vignette() for creating a vignette, and that packages used in the vignette need to be listed in DESCRIPTION (under Suggests if they’re only used in the vignette3). Still, it’s useful to know about vignettes for debugging problems or finding workarounds for issues you might encounter.

Infrastructure & dependencies for vignettes

The building of package vignettes can either use the default Sweave vignette engine, or a vignette engine provided by a CRAN package like knitr by Yihui Xie. The knitr::rmarkdown vignette engine is the one recommended in the R packages book and by usethis. It allows writing vignettes in R Markdown.

See the source of rhub’s main vignette. It has YAML metadata at the top, some non-executed code chunks, and some executed code chunks. For that vignette to be built, a field in DESCRIPTION mentions the vignette engine4:

VignetteBuilder: knitr, rmarkdown

And these two packages are declared as dependencies under Suggests as well.
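
To make the link between the two fields concrete, here is a minimal sketch using only base R that checks whether every package named in VignetteBuilder is also declared under Depends, Imports or Suggests. The toy DESCRIPTION contents, the package name mypkg and the helper split_field() are all invented for illustration.

```r
# Write a toy DESCRIPTION to a temporary file (contents are illustrative).
desc_file <- tempfile()
writeLines(c(
  "Package: mypkg",
  "Version: 0.0.1",
  "Suggests: knitr, rmarkdown, testthat",
  "VignetteBuilder: knitr, rmarkdown"
), desc_file)

# Hypothetical helper: split a comma-separated field into bare package names.
split_field <- function(x) {
  if (is.na(x)) return(character())
  pkgs <- trimws(strsplit(x, ",")[[1]])
  pkgs <- sub("[[:space:]]*\\(.*\\)$", "", pkgs)  # drop version requirements
  pkgs[nzchar(pkgs)]
}

fields <- c("VignetteBuilder", "Depends", "Imports", "Suggests")
desc <- read.dcf(desc_file, fields = fields)  # absent fields come back as NA

builders <- split_field(desc[1, "VignetteBuilder"])
declared <- unlist(lapply(desc[1, c("Depends", "Imports", "Suggests")], split_field))

# Vignette builders that are not declared as dependencies anywhere.
missing_builders <- setdiff(builders, declared)
missing_builders
```

An empty result means the vignette builders are all declared; pointing desc_file at a real DESCRIPTION would run the same check on an actual package.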

The creation of a boilerplate Rmd under a new vignettes folder, and the dependencies declaration in DESCRIPTION, are what usethis::use_vignette() would handle for you. Then you can write as you would a standard R Markdown document, knitting for previewing it.
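
In code, that step can look like the sketch below. The wrapper add_vignette() and the vignette name and title in the commented call are hypothetical; only usethis::use_vignette() itself comes from the text above.

```r
# Hypothetical wrapper around usethis::use_vignette().
# Assumes usethis is installed and the working directory is the package root.
add_vignette <- function(name, title = name) {
  if (!requireNamespace("usethis", quietly = TRUE)) {
    stop("Please install the 'usethis' package first.")
  }
  usethis::use_vignette(name, title = title)
}

# Not run here, since it modifies the package it is called from:
# add_vignette("get-started", title = "Get started with mypkg")
```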

Other vignette builders include R.rsp, which we’ll mention again later; noweb, to use the noweb literate programming tool (which actually looks a lot like Sweave?); and rasciidocs, which was recently archived at the time of writing. It is unlikely you’ll want to write your own vignette engine.

How many packages use a non-Sweave vignette? One way to assess that is to look for packages that have a VignetteBuilder field in DESCRIPTION with R-hub’s own pkgsearch.5

results <- pkgsearch::advanced_search("_exists_" = "VignetteBuilder")
attr(results, "metadata")$total
#> [1] 4969

knitr <- pkgsearch::advanced_search(VignetteBuilder = "knitr")
attr(knitr, "metadata")$total
#> [1] 4739

# for comparison
nrow(available.packages())
#> [1] 15694

Quite a lot: about 32% of CRAN packages use a non-Sweave vignette engine and about 30% use knitr for at least one vignette.6 Other packages might have Sweave vignettes, and some CRAN packages don’t have vignettes, whereas having a vignette is compulsory for Bioconductor packages.
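
For the record, the percentages follow directly from the three counts printed above. The variable names in this small sketch are ours, not pkgsearch's.

```r
# Counts copied from the output shown in this post.
with_builder <- 4969   # packages with a VignetteBuilder field
with_knitr   <- 4739   # packages listing knitr in VignetteBuilder
on_cran      <- 15694  # packages on CRAN at the time of writing

share_builder <- round(100 * with_builder / on_cran)
share_knitr   <- round(100 * with_knitr / on_cran)

share_builder  # 32
share_knitr    # 30
```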

Overview of vignette states

Following the neat diagram of the R packages book,

If your vignette shows an external image not generated by the build process, you also need to include it in install_extras,

fs::dir_tree(find.package("rhub"))
/home/maelle/R/x86_64-pc-linux-gnu-library/3.6/rhub
├── DESCRIPTION
├── INDEX
├── LICENSE
├── Meta
│   ├── Rd.rds
│   ├── features.rds
│   ├── hsearch.rds
│   ├── links.rds
│   ├── nsInfo.rds
│   ├── package.rds
│   └── vignette.rds
├── NAMESPACE
├── NEWS.md
├── R
│   ├── rhub
│   ├── rhub.rdb
│   └── rhub.rdx
├── bin
│   ├── rhub-linux-docker.sh
│   └── rhub-linux.sh
├── doc
│   ├── index.html
│   ├── local-debugging.R
│   ├── local-debugging.Rmd
│   ├── local-debugging.html
│   ├── rhub.R
│   ├── rhub.Rmd
│   └── rhub.html
├── help
│   ├── AnIndex
│   ├── aliases.rds
│   ├── figures
│   │   └── logo.png
│   ├── paths.rds
│   ├── rhub.rdb
│   └── rhub.rdx
└── html
    ├── 00Index.html
    └── R.css
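
In the tree above, the installed vignettes sit in doc/, with the .Rmd source, the extracted .R code and the rendered .html side by side. A small base R sketch for listing that folder for any installed package could look like this; the helper name installed_vignette_files() is hypothetical, and grid is only used as an example of a package shipped with R.

```r
# Hypothetical helper: list the files in an installed package's doc/ folder.
installed_vignette_files <- function(pkg) {
  doc_dir <- system.file("doc", package = pkg)
  # system.file() returns "" when the package or the folder does not exist.
  if (!nzchar(doc_dir)) return(character())
  list.files(doc_dir)
}

installed_vignette_files("grid")
```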

Your vignette for R CMD check

So, sometimes R CMD check7 will throw errors related to vignette building. How to deal with them?

There is good troubleshooting advice in the R packages book.

Vignette metadata is important. A non-placeholder title in VignetteIndexEntry is compulsory! Vignettes with a placeholder title are even called bad_vignettes in R source.

Based on what we said in the previous subsection, R CMD build builds vignettes from vignettes/ whereas R CMD check checks they can be rebuilt from inst/doc/. So if there were data in vignettes/, given it’s not copied to inst/doc/, R CMD check will error!
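
The metadata point can be checked before R CMD check complains. The sketch below is a rough base R check, assuming the placeholder in question is the literal "Vignette Title" found in older vignette templates; the helper names vignette_title() and has_placeholder_title() are made up, and the regular expression is a simplification of whatever R itself does.

```r
# Hypothetical helper: extract the title from %\VignetteIndexEntry{...}.
vignette_title <- function(lines) {
  pattern <- "%\\\\VignetteIndexEntry[{](.*)[}]"
  hit <- grep(pattern, lines, value = TRUE)
  if (length(hit) == 0) return(NA_character_)
  sub(paste0(".*", pattern, ".*"), "\\1", hit[1])
}

# Hypothetical helper: TRUE when the title is missing, empty or the placeholder.
# "Vignette Title" as the placeholder is an assumption of this sketch.
has_placeholder_title <- function(lines) {
  title <- vignette_title(lines)
  is.na(title) || title %in% c("", "Vignette Title")
}

# Toy vignette header; in practice use readLines("vignettes/some-vignette.Rmd").
header <- c(
  "---",
  "title: \"Get started\"",
  "vignette: >",
  "  %\\VignetteIndexEntry{Vignette Title}",
  "  %\\VignetteEngine{knitr::rmarkdown}",
  "---"
)
has_placeholder_title(header)
```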

It’s also useful to know that there are options related to vignette building and checking in R CMD build and R CMD check. Of course you don’t control these options for CRAN but you do control them when sending your packages to R-hub package builder, and when setting up continuous integration. See for instance this great tip by John Blischak, “checking the package while ignoring the vignettes can be done with the following steps:”

R CMD build --no-build-vignettes --no-manual .
R CMD check --no-manual --ignore-vignettes --as-cran *.tar.gz
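
If you prefer to stay in R, the same options can be passed along from an R session. The sketch below assumes the rcmdcheck package is installed, which the tip above does not mention; the wrapper name check_without_vignettes() is hypothetical.

```r
# Hypothetical wrapper: build and check a package while ignoring its vignettes.
# Assumes rcmdcheck is installed and `path` points at a package source folder.
check_without_vignettes <- function(path = ".") {
  if (!requireNamespace("rcmdcheck", quietly = TRUE)) {
    stop("Please install the 'rcmdcheck' package first.")
  }
  rcmdcheck::rcmdcheck(
    path,
    # Options handed to R CMD check.
    args = c("--no-manual", "--ignore-vignettes", "--as-cran"),
    # Options handed to R CMD build.
    build_args = c("--no-build-vignettes", "--no-manual")
  )
}

# Not run here, since it needs a package source folder:
# check_without_vignettes(".")
```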

For R-hub package builder,

Workaround workflows for vignettes

In this section we’ll go over workarounds for some common vignette “problems”.

How to include my pre-print / cheatsheet as a PDF vignette?

Sometimes you might want to include a PDF as a vignette, without wanting to deal with missing LaTeX dependencies; or because the PDF is not knit from R (a cheatsheet); or the computations are too long. In that case there are two alternatives:

As an example of R.rsp usage, the treeBUGS package has HTML vignettes, and a PDF vignette corresponding to a pre-print. In its DESCRIPTION it indicates R.rsp as one of the vignette engines.

VignetteBuilder: 
    knitr,
    R.rsp

In the vignettes/ folder of its source one sees a file called Heck_2018_BRM.pdf.asis:

%\VignetteIndexEntry{Heck, Arnold, & Arnold (2018): TreeBUGS paper (Behavior Research Methods)}
%\VignetteEngine{R.rsp::asis}
%\VignetteKeyword{PDF}
%\VignetteKeyword{HTML}
%\VignetteKeyword{vignette}
%\VignetteKeyword{package}
%\VignetteKeyword{TreeBUGS}
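
Such a file is short enough to write by hand, but as a sketch, here is how one could generate it with base R. The helper write_asis() is hypothetical, and the sketch assumes the PDF itself sits in the same folder under the same name minus the .asis suffix.

```r
# Hypothetical helper: write the .asis companion file for a static PDF vignette.
write_asis <- function(pdf, title, dir = "vignettes") {
  asis_file <- file.path(dir, paste0(basename(pdf), ".asis"))
  writeLines(c(
    sprintf("%%\\VignetteIndexEntry{%s}", title),
    "%\\VignetteEngine{R.rsp::asis}",
    "%\\VignetteKeyword{PDF}"
  ), asis_file)
  invisible(asis_file)
}

# Demonstration in a temporary folder rather than a real vignettes/ folder.
demo_dir <- tempfile()
dir.create(demo_dir)
asis <- write_asis("Heck_2018_BRM.pdf", "TreeBUGS paper", dir = demo_dir)
readLines(asis)
```

R.rsp still has to be listed in VignetteBuilder and declared as a dependency, as in the DESCRIPTION excerpt above.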

Slightly related is this workaround by Iñaki Úcar for building a vignette with a different output format based on the pandoc version available.

How to include a compute-intensive / authentication-dependent vignette?

A very similar problem can happen with HTML vignettes, when their computations are too long, or depend on a system dependency or authentication token absent from CRAN machines – hence R CMD check would fail for sure. So, what can you do?

```{r, echo = FALSE} 
NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  purl = NOT_CRAN,
  eval = NOT_CRAN
)
```
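
The setup chunk above keys evaluation on NOT_CRAN only. For the authentication case, a variation is to also require that a token is present. In this sketch the environment variable MYPKG_TOKEN and the helper should_eval_chunks() are invented names, so adapt them to whatever your package actually uses.

```r
# Hypothetical helper: evaluate chunks only off CRAN and when a token is set.
# MYPKG_TOKEN is a made-up environment variable name.
should_eval_chunks <- function(token_var = "MYPKG_TOKEN") {
  not_cran <- identical(tolower(Sys.getenv("NOT_CRAN")), "true")
  has_token <- nzchar(Sys.getenv(token_var))
  not_cran && has_token
}

# In the vignette setup chunk one could then write:
# knitr::opts_chunk$set(eval = should_eval_chunks(), purl = should_eval_chunks())
```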

Hey what about testing? And reproducibility?

In the two previous subsections we recommended pre-building stuff, which might make some people cringe, but we like this quote by Henrik Bengtsson in R-package-devel.

Some may argue that your package is not fully tested this way, but that depends on how well your package tests/ are written. I tend to look at examples() and vignettes as demos, and tests/ as actually tests. All should of course pass R CMD check and run, but the tests/ are what really test the package.

He also makes the point,

For reproducibility, I would include the root/source vignette in the package as well, e.g. in inst/full-vignettes/ with instructions and/or a function on how to rebuild it.

User-friendly vignettes

In this section we’ll give some tips for making vignettes easier to navigate.

Pretty vignettes

You might want to tweak the layout and look of your vignettes a bit to make people even more likely to read them, maybe with custom CSS8. Using a disappointingly unspecific GitHub code search on R-hub’s mirror of CRAN, we found the example of idiogramFISH, which defines and uses custom stylesheets for its vignette, making the vignette look very modern on its CRAN page! Note that it also uses some JavaScript for the table of contents and “return to top” links: definitely not light-weight styling.

Now, an even better way to tweak your vignettes is to invest some time in creating a pkgdown website that will feature manual pages, vignettes, changelogs, etc. It’s actually little work. It’s worth reading how vignettes are built in the pkgdown docs, in particular

Once you’ve created the website, do not forget to indicate its URL in DESCRIPTION.

Some further thoughts around vignettes and pkgdown. Since vignettes look better and are more integrated with other docs in the pkgdown website than locally, should your local vignettes contain a link to the pkgdown version to be sure that users that look at an offline vignette but have an internet connection can get a better user experience? And regarding the offline experience, would it make sense to also generate a PDF version of HTML vignettes, maybe with paged.js9?

Cross-references

Vignettes and manual pages serve different roles and complement each other.

In places other than the vignettes you could tell the user to type vignette("vignette-name"). In pkgdown websites, using that function will create a link to the vignette page.
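
For instance, a manual page can point to a vignette from its roxygen2 comments. The function below is a made-up placeholder, and the vignette name is borrowed from the rhub listing earlier in this post purely for illustration.

```r
#' Run a local check (hypothetical example function)
#'
#' For a longer walkthrough, see
#' \code{vignette("local-debugging", package = "rhub")}.
#'
#' @return `NULL`, invisibly.
#' @export
local_check_demo <- function() {
  invisible(NULL)
}
```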

To link a vignette from another vignette, the R packages book mentions “Although it’s a slight hack, you can link various vignettes by taking advantage of how files are stored on disk: to link to vignette abc.Rmd, just make a link to abc.html." Again, this is supported in pkgdown websites, where functions are furthermore automatically linked to their manual page.

If you have many vignettes, you might want to use the ultimate R Markdown machinery for cross-references, bookdown, i.e. writing a book instead of a pkgdown website! See how the drake website links to a “Full manual” in its navbar. This process is currently separate from your usual vignettes/pkgdown workflow, but might not always be.

Repeat yourself

Even better than cross-references, or complementary to them, is the idea to repeat yourself. As a quick reminder from our post about READMEs, and as explained very well by Garrick Aden-Buie, you can re-use Rmd fragments in your package README, vignettes and manual pages without actually needing to copy-paste content!

Conclusion

In this post we offered a quite detailed, but probably not exhaustive, guide around R package vignettes. We haven’t discussed content of vignettes, how to best assess their usefulness (surveys? traffic data in pkgdown websites?), or their use as a way to encapsulate analyses in a package structure or “research compendium”. Do you have any special vignette setup or favorite trick? Don’t hesitate to share!

Footnotes
  1. Note that on a pkgdown website, a well-organized reference page can help make function documentation more useful.

  2. For rendering the vignettes list in this post we used the printr package.

  3. If a package is suggested your vignette should be resilient to its not being present; or use the workarounds from the workarounds section of this post.

  4. “Writing R Extensions” states, in the section about DESCRIPTION: “Note that if, for example, a vignette has engine ‘knitr::rmarkdown’, then knitr provides the engine but both knitr and rmarkdown are needed for using it, so both these packages need to be in the ‘VignetteBuilder’ field and at least suggested (as rmarkdown is only suggested by knitr, and hence not available automatically along with it). Many packages using knitr also need the package formatR which it suggests and so the user package needs to do so too and include this in ‘VignetteBuilder’.” This isn’t acted upon, since most CRAN packages using the knitr::rmarkdown engine don’t list rmarkdown in VignetteBuilder, and since VignetteBuilder packages need to be declared as dependencies in other fields.

  5. Another way is to use tools::CRAN_package_db() like Julia Silge did in her blog post “Mining CRAN DESCRIPTION Files”.

  6. A query like pkgsearch::advanced_search("VignetteBuilder: knitr AND VignetteBuilder: R.rsp") would show how many packages use both knitr and R.rsp as vignette engines, meaning they have at least one vignette using knitr and one vignette using R.rsp.

  7. R CMD check will both try re-building vignettes and running R code as noted by Jenny Bryan on R-pkg-devel. It seems intricate, with the R code for check including interesting comments.

  8. Bioconductor has its own vignette style.

  9. Compared with pagedown the vague idea mentioned here would mean adding a custom print stylesheet to the vignette and using the paged.js CLI for generating a PDF locally before submission, a PDF that’d be present in inst/doc, and linked to from the html vignette. There are other efforts for making docs easier to use offline.
