If you are a user who works with Rcpp-based packages, or you maintain one such package, you may be interested in the recent development of the unwind API, which can be leveraged to boost performance since the last Rcpp update. In a nutshell, until R 3.5.0, every R call from C++ code was executed inside a try-catch to keep things from breaking apart, which is really slow. From v3.5.0 on, this API provides a new evaluation path for such calls that is both safe and fast.
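To make the "safe" part concrete, here is a minimal sketch, assuming R >= 3.5.0, Rcpp >= 0.12.18 and a working C++ toolchain. It checks that an R error raised inside a callback invoked from C++ still reaches the R side as an ordinary error when the unwind plugin is enabled. The helper name `call_once` is hypothetical and only used for illustration.

```r
# Assumes R >= 3.5.0, Rcpp >= 0.12.18 and a working C++ toolchain.
# `call_once` is a hypothetical helper name used only for this sketch.
Rcpp::cppFunction(plugins = "unwindProtect", '
  void call_once(Function func) {
    func();  // any R error raised here unwinds through the C++ frame
  }
')

# The callback fails on purpose; tryCatch() on the R side should still see it.
res <- tryCatch(
  call_once(function() stop("boom")),
  error = function(e) conditionMessage(e)
)
print(res)
```

The exact wording of the message may differ between the old and the new evaluation path, so the sketch only relies on the error being catchable from R.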
Some motivation
Here is a small comparison of the old and the new APIs. The following toy example just calls an R function N times from C++. A pure R for loop is also provided as a reference.
    Rcpp::cppFunction('
      void old_api(Function func, int n) {
        for (int i=0; i<n; i++) func();
      }
    ')

    Rcpp::cppFunction(plugins = "unwindProtect", '
      void new_api(Function func, int n) {
        for (int i=0; i<n; i++) func();
      }
    ')

    reference <- function(func, N) {
      for (i in 1:N) func()
    }

    func <- function() 1
    N <- 1e6

    system.time(old_api(func, N))
    ##    user  system elapsed
    ##  17.863   0.006  17.950
    system.time(new_api(func, N))
    ##    user  system elapsed
    ##   0.289   0.000   0.290
    system.time(reference(func, N))
    ##    user  system elapsed
    ##   0.216   0.000   0.217
Obviously, there is still some penalty compared to not switching between domains, but the performance gain with respect to the old API is outstanding: the new API is roughly 60 times faster in this example.
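The size of the gain can be read off the elapsed times reported above. The following sketch simply redoes that arithmetic; the variable names `elapsed`, `speedup` and `overhead` are illustrative, and the numbers are copied from the benchmark output rather than measured again.

```r
# Elapsed times in seconds, copied from the benchmark output above.
elapsed <- c(old_api = 17.950, new_api = 0.290, reference = 0.217)

# Speedup of the new API over the old one.
speedup <- elapsed[["old_api"]] / elapsed[["new_api"]]

# Remaining overhead of the new API relative to the pure R loop.
overhead <- elapsed[["new_api"]] / elapsed[["reference"]]

round(c(speedup = speedup, overhead = overhead), 1)
```

With these figures the new API comes out about 62 times faster than the old one, while still taking about 1.3 times as long as the pure R loop.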
A real-world example
This is quite a heavy simulation of an M/M/1 system using simmer:
    library(simmer)

    system.time({
      mm1 <- trajectory() %>%
        seize("server", 1) %>%
        timeout(function() rexp(1, 66)) %>%
        release("server", 1)

      env <- simmer() %>%
        add_resource("server", 1) %>%
        add_generator("customer", mm1, function() rexp(50, 60), mon=F) %>%
        run(10000, progress=progress::progress_bar$new()$update)
    })
On my system, it takes around 17 seconds with the old API. With the new API, it finishes in less than 5 seconds. As a reference, if we avoid R calls in the timeout activity and precompute all the arrivals instead of defining a dynamic generator, i.e.:
    system.time({
      input <- data.frame(
        time = rexp(10000*60, 60),
        service = rexp(10000*60, 66)
      )

      mm1 <- trajectory() %>%
        seize("server", 1) %>%
        timeout_from_attribute("service") %>%
        release("server", 1)

      env <- simmer() %>%
        add_resource("server", 1) %>%
        add_dataframe("customer", mm1, input, mon=F, batch=50) %>%
        run(10000, progress=progress::progress_bar$new()$update)
    })
then the simulation takes around 2.5 seconds.
How to start using this feature
First of all, you need R >= 3.5.0 and Rcpp >= 0.12.18 installed. Then, if you are a user, the easiest way to enable this globally is to add CPPFLAGS += -DRCPP_USE_UNWIND_PROTECT to your ~/.R/Makevars. Packages installed or re-installed afterwards, as well as functions compiled with Rcpp::sourceCpp and Rcpp::cppFunction, will benefit from these performance gains. If you are a package maintainer, you can add -DRCPP_USE_UNWIND_PROTECT to your package's PKG_CPPFLAGS in src/Makevars. Alternatively, there is a plugin available, so this flag can be enabled by adding the attribute comment // [[Rcpp::plugins(unwindProtect)]] to one of your source files.
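The plugin route can be sketched as follows, assuming a working C++ toolchain. The snippet first checks the version requirements stated above and only then compiles a small source file that carries the plugin attribute. The names `unwind_ok`, `src` and `call_n_times` are hypothetical and chosen only for this illustration.

```r
# Check the prerequisites stated above: R >= 3.5.0 and Rcpp >= 0.12.18.
unwind_ok <- getRversion() >= "3.5.0" &&
  requireNamespace("Rcpp", quietly = TRUE) &&
  utils::packageVersion("Rcpp") >= "0.12.18"

# Contents of a C++ source file; the plugin attribute lives in a comment.
# `call_n_times` is a hypothetical name used only for this illustration.
src <- '
#include <Rcpp.h>
// [[Rcpp::plugins(unwindProtect)]]

// [[Rcpp::export]]
int call_n_times(Rcpp::Function func, int n) {
  for (int i = 0; i < n; i++) func();
  return n;
}
'

# Compile and call the function only when the requirements are met.
if (unwind_ok) {
  Rcpp::sourceCpp(code = src)
  call_n_times(function() 1, 10L)
}
```

In a package, the same attribute line would sit in one of the files under src/, or the flag would go into PKG_CPPFLAGS as described above.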
Note that this is fairly safe according to reverse dependency checks, but there might still be issues in some packages. The sooner we start testing this feature and reporting possible issues, the sooner it will be enabled by default in Rcpp.