Documenting Rcpp functions and classes in R packages
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Introduction
Roxygen2 is a convenient way to document functions in
R packages. Based on the Doxygen model, it parses relevant
information from the comments and generates the corresponding man/*.Rd
files. The major strength
of this model – besides not having to write *.Rd
files by hand – is that documentation ends up
living right next to the functionality it describes, thus enabling easy maintenance and
synchronization between the two.
When using Rcpp in package development, the R functions in RcppExports.R
are generated
automatically from the source .cpp
files. Rcpp also faithfully transcribes comment blocks from
.cpp
to .R
. We can take advantage of this fact to add documentation directly to the .cpp
files
and have it appear in the package reference manual.
This vignette demonstrates four different ways to document Rcpp functions and classes in R
packages. We will create an R package from scratch, add some very basic functionality and show how
Roxygen2 blocks appear in the final documentation. Throughout the vignette, we will focus on base R,
Rcpp and roxygen2 with no additional dependencies. There are a number of popular add-on packages
helping with package development (e.g. devtools
, remotes
, usethis
as of January 2020), and
these additional helpers may be reviewed in a follow-up vignette. In case
Rcpp or roxygen2 are not yet installed, do
Create a package
We begin by initializing a package skeleton. The following functions create a new directory and establish a barebones structure for an R package with Rcpp support:
Using your favorite text editor, create a new file ~/test/testpkg/R/zzz.R
and add the following
lines to it:
This exposes the functionality of an Rcpp
module that we will use to encapsulate our
C++ class. Replace "double_cpp"
with your desired module name, but note that the name has to match
what appears in the .cpp
file below.
Add C++ functionality and Roxygen2 comment blocks
All code blocks presented in this section are part of the same contiguous .cpp
file. For the
purposes of this tutorial, assume that the file lives in ~/test/testpkg/src/mult.cpp
. Let’s review
the four different ways to introduce Roxygen2 comment blocks inside .cpp
files.
Approach 1: Standalone functions
The first approach is the most straightforward and mimics traditional usage of Roxygen2 in pure R
packages. The comments are placed immediately before a stand-alone function and begin with //'
,
which Rcpp will translate into #'
in the corresponding RcppExports.R
file:
Approach 2: Nested field structure
Classes impose an additional layer of hierarchy by grouping multiple functions and variables. We need to decide how this hierarchy should be represented in the R help pages, which themselves follow a flat format. If a class is relatively small, a simple solution is to use a nested field structure to describe individual member functions. The entire class will then appear as a single entry in the final package reference manual.
Approach 3: Stand-alone pages for individual class methods
Occasionally, a member function may be complex enough to require its own ?
reference entry. A
major advantage of Roxygen2 is that such entries can be created with relative ease by placing the
corresponding comment block at the top level in a file.
The only disadvantage is that the documentation does not live directly next to the function it describes. Unforunately, this is due to a limitation that only the top-level comment blocks are exported into R by Rcpp.
Approach 4: Rcpp module docstrings
The final place for function documentation is inside the docstring feature provided by the Rcpp modules themselves. This works well for relatively simple classes. Unfortunately, the documentation becomes overly verbose, if a class makes heavy use of templates.
Compile, install and test
After finishing ~/test/testpkg/R/zzz.R
and ~/test/testpkg/src/mult.cpp
, run the following
commands in an R session somewhere inside ~/test/testpkg
:
followed by the usual R CMD build
and R CMD install
(or equivalent helper functions via RStudio,
the usethis
or devtools
package or alike).
The first command compiles the .cpp
code and generates R/RcppExports.R
. You will notice that the
roxygen comment blocks are faithfully transcribed from .cpp
to .R
. The second command then
generates the man/*.Rd
files based on these roxygen blocks. Having prepared the sources, we then
build and install package as usual, making it available through the standard library(testpkg)
interface.
To test the package and view its documentation, start a fresh R session and examine the help pages.
We can also make sure that the package functions as expected.
Going Further
As mentioned above, additional packages (or tools like the RStudio IDE) offer to help with package
creation, documentation, build and more. As well, the
roxygen2 can do more than we showed here by
auto-generating the NAMESPACE
file, collating R files as needed and more. However, we feel it
helps to understand what each relevant tool offers in and by itself, and also appreciate the
relative simplicity of the tools describe in the Writing R
Extensions manual that is part of
base R. A follow-up vignette may describe the additional tools.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.