Why share C++ code from an Rcpp package
Using the {Rcpp} package is the dominant method for linking the usability of R with the speed of C++, and can be used to write R packages that are fast and easy to use for both end-users and developers.
From the point of view of developers, it’s very easy to export R code such as functions and classes from an R(cpp) package, but the guidance in the Rcpp documentation does not detail how to export the C++ code so that it can be shared with your other Rcpp packages.
Allowing C++ code to be shared can be very beneficial for the same reasons that sharing R code is — packaging code is a reliable way to reuse it.
Some widely used examples of this practice are the {RcppEigen}, {RcppArmadillo}, {RcppGSL}, and Boost Headers {BH} packages. Indeed, in the Epiverse-TRACE team, {RcppEigen} underpins the {finalsize} and upcoming {epidemics} packages.
Two ways to share C++ code?
Developers searching for a way to make the C++ code of their Rcpp-based packages shareable will likely find two seemingly distinct ways of doing so.
Developers reading the Rcpp Attributes documentation will find that package C++ code can be shared by generating a C++ interface for functions that also have an R interface, using Rcpp attributes.
Developers instead scanning widely used Rcpp packages such as {RcppEigen} will notice that C++ code can also be shared by defining the majority of the C++ functions in a package header, to which other Rcpp packages can be linked.
These are simply different pathways to the writing and export of an R(cpp) package header, which allows Rcpp to link the package’s C++ code to other packages.
This blog post explores different ways of doing this, and explains how the Rcpp packages from Epiverse-TRACE implement C++ code sharing.
The package header
The package header of the package {mypackage} is a file named mypackage.h under inst/include. Defining this header is the key step in making (R)C++ code shareable.
    # conceptual organisation of an Rcpp package with a package header
    .
    ├── DESCRIPTION
    ├── NAMESPACE
    ├── R
    │   └── RcppExports.R
    ├── inst
    │   └── include
    │       └── mypackage.h      # <= the package header
    └── src
        ├── RcppExports.cpp
        └── rcpp_hello.cpp       # <= code from which RcppExports.cpp generates
Autogenerating the package header
The package header is autogenerated when the attributes of an Rcpp function are edited to also generate a C++ interface.
Consider the Rcpp function below, which is exposed to R and exported from the package. The line // [[Rcpp::interfaces(r, cpp)]] (r requests the R bindings, cpp the C++ interface) instructs Rcpp to autogenerate two header files under inst/include:
- A package header, called mypackage.h, and
- A helper header called mypackage_RcppExports.h with ‘automagic’ C++ bindings for the function hello_world_rcpp().
src/rcpp_hello.cpp
    #include <Rcpp.h>

    // request both an R and a C++ interface
    // [[Rcpp::interfaces(r, cpp)]]

    //' @title Test Rcpp function
    //'
    //' @export
    // [[Rcpp::export]]
    void hello_world_rcpp() {
      Rcpp::Rcout << "hello world!\n";
    }
Manually creating the package header
The package header can also be created manually, as mypackage.h under inst/include. In this case, the helper file mypackage_RcppExports.h is not generated.
Examples of this are the widely used {RcppEigen} and {RcppArmadillo} packages, while this demonstration package by James Balamuta is a minimal example that is a good place to get started to understand how this approach works.
The manually defined package header can initially be empty, and is populated by the developer — more on header contents below.
It is possible to edit an autogenerated package header to include manually created header files in addition to mypackage_RcppExports.h. To do this, remove the generator tag (see below) to prevent this file from being overwritten by Rcpp::compileAttributes(). Then include any extra header files as usual.
We would however recommend not autogenerating headers from Rcpp functions, but rather writing a header-heavy package — this is the approach used by {RcppEigen} etc. (see more below on how we organise our packages).
Contents of the package header
We found it difficult to get information on the content of the package header.
Autogenerated package headers contain an autogeneration message and a generator token, similar to that present in RcppExports files. Package headers should contain a header include guard.
The style of the header name in the include guard for autogenerated headers is RCPP_mypackage_H_GEN_. Package headers from the Rcpp core team, such as {RcppEigen} and {RcppArmadillo}, are manually defined and follow the convention mypackage__mypackage__h. In examples, such as this bare-bones demonstration package by James Balamuta, you might also encounter a single underscore (_) and a capital H (mypackage_mypackage_H).
If you are linting your Rcpp package’s C++ code with Cpplint, all three are incompatible with Cpplint’s preference, which is DIR_SUBDIR_FILE_H. Exclude the package header from linting to avoid this warning if you wish to follow an Rcpp community style instead.
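As a rough illustration of why the include guard matters, the sketch below pastes the same guarded header text twice into one file, which is effectively what the preprocessor sees when mypackage.h is reached through two different include routes. The function version_major() is a hypothetical placeholder, and the guard name uses the single-underscore style mentioned above.

```cpp
// Sketch only: in a real package this text would live in
// inst/include/mypackage.h and arrive here through #include directives.
// version_major() is a hypothetical function used for illustration.

// First inclusion: the guard macro is not yet defined, so the body is kept.
#ifndef mypackage_mypackage_H
#define mypackage_mypackage_H

namespace mypackage {
inline int version_major() { return 1; }
}  // namespace mypackage

#endif  // mypackage_mypackage_H

// Second inclusion: the guard macro is now defined, so the preprocessor
// skips the body and the function is not defined a second time.
#ifndef mypackage_mypackage_H
#define mypackage_mypackage_H

namespace mypackage {
inline int version_major() { return 1; }
}  // namespace mypackage

#endif  // mypackage_mypackage_H
```

Without the guard, the compiler would reject the second copy as a redefinition of version_major().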
The package header must also link to the code you want to export, and there are at least three ways of doing this.
- Include the autogenerated file mypackage_RcppExports.h; this is already done as part of the package header generation.
- Directly write C++ code in the package header. This is technically possible, but unlikely to be a good option as your package’s C++ codebase grows.
- Manually include any other C++ header files in the package header. This last option might lead to a package header such as that shown below.
inst/include/mypackage.h
    // Manually created package header with manual code inclusion
    #ifndef mypackage_mypackage_H
    #define mypackage_mypackage_H

    // include files using paths relative to inst/include
    #include "header_01.h"
    #include "header_02.h"

    #endif  // mypackage_mypackage_H
Here, the header files might contain code that you wish to make available to other packages, such as a C++ function, struct, or class, and indeed in the current package as well — more on how to do this below.
Using Rcpp in header code
Using {Rcpp}’s C++ functionality, such as the Rcpp classes DataFrame or List, or classes and functions of Rcpp-based packages such as {RcppEigen}, is as simple as including those headers in the appropriate location, just as one would in a source file; see the example below.
inst/include/header_01.h
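    // In a manually created header file, say, header_01.h
    // which is included in mypackage.h

    // to use Rcpp
    #include <Rcpp.h>

    // wrap the code in the package namespace so that it can be
    // called as mypackage::hello_world_rcpp()
    namespace mypackage {

    // note the use of inline, more on this later
    inline void hello_world_rcpp() {
      Rcpp::Rcout << "hello world!\n";
    }

    }  // namespace mypackage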
The appropriate headers are automatically included in autogenerated package headers’ helper files, and the developer need not do anything more.
Don’t forget to link {Rcpp} or similar packages to the package under development by adding the package names under Imports, Depends, or LinkingTo in the DESCRIPTION file, as appropriate.
This can often be handled by functions in the {usethis} package such as usethis::use_rcpp_eigen(). You might also need to add // [[Rcpp::depends(<package>)]] in your package’s C++ source files, with a suitable package dependency specified.
The same principles apply to using C++ code from this package ({mypackage}) in future packages.
Using header code in the package
There are some considerations when seeking to use header code from {mypackage} within {mypackage} itself.
Any functions defined in the package headers must be inline functions (see the example above). This prevents compilation errors related to multiple definitions.
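A minimal sketch of header code that is safe to include from several source files is given here in plain C++, without Rcpp, so that it compiles on its own. The names add(), twice(), and Counter are hypothetical. Free functions need the inline keyword, whereas function templates and member functions defined inside a class body may already appear in multiple source files without it.

```cpp
// Sketch of a header such as inst/include/header_01.h (hypothetical contents).
#ifndef mypackage_header_01_H
#define mypackage_header_01_H

namespace mypackage {

// A free function defined in a header must be marked inline, otherwise
// every source file that includes the header emits its own definition
// and the linker reports a multiple-definition error.
inline int add(int a, int b) { return a + b; }

// Function templates may be defined in headers without the inline keyword.
template <typename T>
T twice(T x) {
  return x + x;
}

// Member functions defined inside the class body are implicitly inline.
struct Counter {
  int n = 0;
  void increment() { ++n; }
};

}  // namespace mypackage

#endif  // mypackage_header_01_H
```

A source file that includes this header can then call, for example, mypackage::add(2, 3) or create a mypackage::Counter.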
C++ source files should include the package header, using #include <mypackage.h>. Functions, structs, or classes defined in header files will be available from the namespace in which the headers declare them (mypackage in the example below).
The code in header files will usually need to be wrapped in (R)C++ code that is exposed to R to make functions from the headers available in R — see the snippet below.
mypackage/src/hello_world.cpp
    // #include <Rcpp.h>      // include Rcpp if necessary
    #include <mypackage.h>    // include package header

    // Function exposed to R
    //' @title Rcpp function wrapping a header function
    //'
    //' @export
    // [[Rcpp::export]]
    void print_hello_world() {
      mypackage::hello_world_rcpp();  // note the namespacing
    }
Remember to add PKG_CPPFLAGS += -I../inst/include/ to both Makevars and Makevars.win under src/. Furthermore, as noted in the Rcpp attributes documentation, modifying the headers will not automatically trigger a rebuild of the package; this needs to be done manually.
Linking header code between packages
Once you have developed your package, you can link to its C++ header code in the same way as you would to any other Rcpp-based package.
Consider the snippet below which shows how to link the C++ code from {mypackage} in a different package called {yourpackage}.
yourpackage/src/hello_world.cpp
    // specify dependency
    // [[Rcpp::depends(mypackage)]]
    #include <mypackage.h>

    // Define and export an Rcpp function
    // [[Rcpp::export]]
    void print_linked_hello() {
      mypackage::hello_world_rcpp();
    }
Be sure to add LinkingTo: mypackage in the DESCRIPTION of the second package {yourpackage}.
C++ code sharing in Epiverse-TRACE
In Epiverse-TRACE, we have structured the {finalsize} and {epidemics} packages to have manually created headers, following the principles laid out above. We follow some additional principles as well.
- Header-heavy packages: Our packages are header-heavy, so that most of the actual code is defined in the headers. The source files are primarily intended to contain wrappers that expose the header code to R (and our users).
- Namespaces to organise header code: Our header code is organised into C++ namespaces, which makes it easier to understand where functions are likely to be defined, and what they might be related to. It also makes it possible to include the package headers (and namespaces) that are relevant to users, rather than including the entire codebase. As an example, functions related to non-pharmaceutical interventions or vaccination regimes from the {epidemics} package can be used in other packages without also including the compartmental epidemic models contained therein.
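As a sketch of this layout, the plain C++ below places two groups of functions in nested namespaces. The namespace and function names (intervention, vaccination, scale_contacts(), remaining_susceptibles()) are hypothetical and are not taken from {epidemics}; in a real package each namespace would typically sit in its own header file under inst/include so that it can be included separately.

```cpp
// Hypothetical layout: each inner namespace would normally live in its own
// header, e.g. inst/include/intervention.h and inst/include/vaccination.h.

namespace mypackage {

namespace intervention {
// Scale a contact rate by a fractional reduction between 0 and 1.
inline double scale_contacts(double contacts, double reduction) {
  return contacts * (1.0 - reduction);
}
}  // namespace intervention

namespace vaccination {
// Number of susceptibles left after vaccinating a fraction of them.
inline double remaining_susceptibles(double susceptibles, double fraction) {
  return susceptibles * (1.0 - fraction);
}
}  // namespace vaccination

}  // namespace mypackage
```

A caller then writes mypackage::intervention::scale_contacts(10.0, 0.5), which makes the origin of the function clear at the call site.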
Ensuring the quality of header code
You can lint and statically check code in a package header using tools for linting C++ code such as Cpplint and Cppcheck. When doing so, it may be important to specify minimum C++ standards, or even the language (C or C++) to avoid linter errors. This is because tools such as Cppcheck assume that headers with the extension .h are C headers, which throws errors when encountering C++ features such as the use of namespaces.
Cppcheck’s language and C++ standard can be set using:
cppcheck --std=c++14 --language=c++ --enable=warning,style --error-exitcode=1 inst/include/*.h
Furthermore, header code can also be tested independently of the R(cpp) code that eventually wraps it. This can be done using the Catch2 testing framework, which is conveniently available using {testthat} — this is an extensive topic for another post.
Conclusion
Developing an Rcpp-based package with C++ code sharing in mind takes some organisation, or even reorganisation, of the C++ codebase. It is probably a good idea to consider whether your package will implement code that would be of interest to other developers, or to you in related projects. If either of these is true, it may help to structure your package with C++ code sharing in mind from the very beginning of development. This can substantially reduce development overheads and mistakes associated with maintaining multiple copies of the same or similar code in different projects. Fortunately, some great examples of how to do this are among the most-used Rcpp-based packages, providing both a conceptual template to consult for your work, as well as being a demonstration of how beneficial this practice can be in the long run. In Epiverse-TRACE, we intend to continue developing with C++ code sharing as a core principle so that we and other developers can build on our initial work.
Citation
@online{gupte2023,
  author = {Gupte, Pratik},
  title = {Sharing the {C++} {Code} of an {Rcpp} {Package}},
  date = {2023-04-24},
  url = {https://epiverse-trace.github.io//posts/share-cpp},
  langid = {en}
}