Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
[This is the second post in a three-part series that demonstrates how to create an R package that includes RcppArmadillo source code. Follow these links for part one and part three.]
Last time I showed how you might speed up getting the coefficients from a linear regression. Comparisons once the code was compiled and loaded were, of course, flattering for the Rcpp solution.
But this misses the fact that compilation takes time — and at this stage we have to wait while Rcpp::sourceCpp compiles the code each session.
On my system I’d have to run about 50 million regressions per session to repay the compilation time. That’s plausible as a one-off, but most of the time it would not be worth it.
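The break-even point is just the compile time divided by the time saved per call. A minimal sketch of that arithmetic follows; the function name `break_even_calls` and all of the timings are hypothetical and are not the measurements behind the figure quoted above.

```r
# Number of calls needed before a one-off compile cost is repaid.
# All three arguments must be in the same time unit (microseconds here).
break_even_calls <- function(compile_cost, slow_per_call, fast_per_call) {
  saving <- slow_per_call - fast_per_call
  if (saving <= 0) return(Inf)  # the compiled version never catches up
  ceiling(compile_cost / saving)
}

# Hypothetical timings: 10 s to compile, 50 microseconds per fit in
# plain R, 10 microseconds per fit in C++.
break_even_calls(compile_cost = 10e6, slow_per_call = 50, fast_per_call = 10)
# returns 250000
```

The smaller the per-call saving, the more calls are needed, which is why a fast baseline can push the break-even point into the millions.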
The solution is to build an R package that includes the C++ code. That way you pay the compilation tax only once. After your package is built and installed, you load the package the regular way with library(PAX) — which is basically instantaneous.
Are you ready?
To build packages, you need to have Rtools installed.
- Install Rtools:
 - First go to the CRAN Rtools page and download the version that’s suitable for your R installation;
 - Accept all the defaults, particularly the path (probably C:\Rtools). Make sure you keep note of where you install Rtools; we are going to need it soon.
- Check your PATH:
 - Open cmd and type in path: $ path
 - Do you see the path to R and Rtools? You should see something like the following: C:\Program Files\R\R-3.5.1\bin\x64 for R and C:\Rtools\bin for Rtools.
 - If you don’t, then you need to edit your path: hit the Windows key and type in “edit the system environment variables”.
 - When you see it pop up in the search pane, hit enter. This should open the “System Properties” box. Select the “Advanced” tab (if it is not already selected) and press the “Environment Variables” button.
 - Now click the “Browse” button and navigate to your R and your Rtools folders. If you are having trouble finding R and can find a working R shortcut (perhaps on your desktop) you can see the path in the properties if you right-click. Of course you wrote down the path to Rtools two steps back (right?) so that one will be easy.
 - Now open cmd and check your path with $ path. You should see the paths to R and Rtools (you may have to reset).
 - Now you’re set!
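The PATH check can also be run from inside R, which is handy if you would rather not switch to cmd. The sketch below uses only base R; the helper name `missing_tools` is made up for illustration, and using `make` as a stand-in for the Rtools bin folder is an assumption about what that folder contains.

```r
# Return the names of command-line tools that cannot be found on the PATH.
# The default tool list is an assumption: "R" for the R binaries and
# "make" as a stand-in for the Rtools bin folder.
missing_tools <- function(tools = c("R", "make")) {
  found <- Sys.which(tools)  # "" (or NA) for anything not found
  tools[is.na(found) | !nzchar(found)]
}

# character(0) means everything was found; otherwise the missing
# tool names are returned.
missing_tools()

# List the individual PATH entries, one per element, for a closer look.
strsplit(Sys.getenv("PATH"), .Platform$path.sep, fixed = TRUE)[[1]]
```

Note that an R session started before you edited the environment variables still sees the old PATH, so restart R before relying on the result.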
Creating an RcppArmadillo package
- Open R and create an RcppArmadillo package skeleton: R> RcppArmadillo.package.skeleton("PAX", path = "~/Dropbox/R/packages")
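- Using whatever method you prefer (cmd, Explorer, etc.), copy the .cpp files into .../PAX/src/
- Back in R … R> setwd("C:/Users/abc/Dropbox/R/packages/PAX") (use forward slashes or doubled backslashes here: a single backslash inside an R string is read as an escape character, and "\U" is an error)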
- … R> Rcpp::compileAttributes()
- … R> tools::package_native_routine_registration_skeleton(dir = "path-to-PAX", character_only = TRUE). NOTE: the character_only argument should be TRUE the first time, and FALSE if you’re updating the package.
- Copy the text that was output in R to \PAX\src\init.c. This tells R about your C++ functions.
- Now build your package. Open cmd and run $ R CMD build C:\Users\abc\Dropbox\R\packages\PAX. NOTE: complete paths always work; relative paths sometimes fail. You should see output similar to the below:
 - * checking for file 'C:/Users/abc/Dropbox/R/packages/PAX/DESCRIPTION' ... OK
 - * preparing 'PAX':
 - * checking DESCRIPTION meta-information ... OK
 - * cleaning src
 - * installing the package to process help pages
 - * saving partial Rd database
 - * cleaning src
 - * checking for LF line-endings in source and make files and shell scripts
 - * checking for empty or unneeded directories
 - * building 'PAX_1.0.tar.gz'
 
- Still in cmd, run $ R CMD INSTALL PAX_1.0.tar.gz. Note that you don’t need the full path in this step. You should see some compilation output such as:
 - * installing *source* package 'PAX' ...
 - ** libs
 - c:/Rtools/mingw_64/bin/g++ -std=gnu++11 -I"C:/PROGRA~1/R/R-35~1.1/include" -DNDEBUG -I"C:/Users/abc/R/rpax/Rcpp/include" -I"C:/Users/abc/R/rpax/RcppArmadillo/include" -fopenmp -O2 -Wall -mtune=generic -c RcppExports.cpp -o RcppExports.o
 - …
 
- This should conclude with * DONE (PAX)
- Open up R and try R> library(PAX). Now execute R> getCoef and you should see the function and its definition:
 R> getCoef
function (X,Y)
{
.Call('_PAX_getCoef', X, Y)
} 
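Seeing the function print is a good sign, but it is worth confirming that it returns the right numbers too. The sketch below is one way to do that in base R. The helpers `ols_coef` and `check_coef` are hypothetical names introduced here, and the commented-out last line assumes that getCoef(X, Y) takes a design matrix and a response and returns the regression coefficients, as described in part one.

```r
# Reference OLS coefficients via the normal equations (base R only).
ols_coef <- function(X, Y) {
  drop(solve(crossprod(X), crossprod(X, Y)))
}

# Compare a candidate implementation against lm.fit() on simulated data.
# Returns TRUE when the coefficients agree to within `tol`.
check_coef <- function(coef_fun, n = 200, p = 3, tol = 1e-8) {
  set.seed(1)
  X <- cbind(1, matrix(rnorm(n * (p - 1)), n, p - 1))  # intercept + noise
  Y <- drop(X %*% seq_len(p)) + rnorm(n)
  expected <- unname(lm.fit(X, Y)$coefficients)
  actual <- unname(drop(as.matrix(coef_fun(X, Y))))
  isTRUE(all.equal(actual, expected, tolerance = tol))
}

# Sanity-check the checker itself with the pure-R reference.
check_coef(ols_coef)

# After installing the package (assumed signature, see above):
# library(PAX); check_coef(getCoef)
```

The as.matrix() and drop() calls are there so that the comparison works whether the candidate returns a plain vector or a one-column matrix.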
