sapply with variadic trailing arguments (…)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Motivation
In R, we can pass further arguments to sapply. The arguments are then passed the function to be applied over.
x <- seq(-3, 3, by=.2 ) sapply( x, dnorm, 0, 4, FALSE )
Conceptually this does something like:
sapply( x, function(.){ dnorm(.,0,4,FALSE) } )
Implementation in Rcpp11
sapply has been part of sugar for a long time, and is now very central to the modernized version of sugar in the devel version of Rcpp11
, but until now we did not have a mechanism similar to R's ellipsis.
variadic templates and std::tuple
give us the tools to implement the feature in Rcpp11.
#include <Rcpp11> using namespace Rcpp11 ; // [[Rcpp::export]] NumericVector bazinga(NumericVector x){ NumericVector res = sapply( x, ::Rf_dnorm4, 0.0, 4.0, false ) ; return res ; } /*** R x <- seq(-3, 3, by=.2 ) bazinga(x) */
Details
For the details, further arguments are tied together into a functor object SapplyFunctionBinder wrapping both the underlying function to be called ::Rf_dnorm4
and the data as std::tuple<double,double,bool>
.
template <int RTYPE, typename Function, typename... Args > class SapplyFunctionBinder { public: typedef typename Rcpp::traits::storage_type<RTYPE>::type storage_type ; typedef typename std::tuple<Args...> Tuple ; typedef typename Rcpp::traits::index_sequence<Args...>::type Sequence ; typedef typename std::result_of<Function(storage_type,Args...)>::type fun_result_type ; SapplyFunctionBinder( Function fun_, Args&&... args) : fun(fun_), tuple(std::forward<Args>(args)...){} inline fun_result_type operator()( storage_type x ) const { return apply( x, Sequence() ) ; } private: Function fun ; Tuple tuple ; template <int... S> inline fun_result_type apply( storage_type x, Rcpp::traits::sequence<S...> ) const { return fun( x, std::get<S>(tuple)... ); } } ;
Alternatives
Alternatively, this can be done with lambda functions :
NumericVector res = sapply( x, [](double a){ return ::Rf_dnorm4(a, 0.0, 4.0, false ) ; } ) ;
We could also bind the function with std::bind
:
using namespace std::placeholders ; NumericVector res = sapply( x, std::bind(::Rf_dnorm4, _1, 0.0, 4.0, false) ) ;
Comparison. Cost of the abstraction
Let's compare these alternatives through microbenchmark.
#include <Rcpp11> using namespace Rcpp11 ; // [[Rcpp::export]] NumericVector sapply_variadic(NumericVector x){ NumericVector res = sapply( x, ::Rf_dnorm4, 0.0, 4.0, false ) ; return res ; } // [[Rcpp::export]] NumericVector sapply_lambda(NumericVector x){ NumericVector res = sapply( x, [](double a){ return ::Rf_dnorm4(a, 0.0, 4.0, false ) ; } ) ; return res ; } // [[Rcpp::export]] NumericVector sapply_bind(NumericVector x){ using namespace std::placeholders ; NumericVector res = sapply( x, std::bind(::Rf_dnorm4, _1, 0.0, 4.0, false) ) ; return res ; } // [[Rcpp::export]] NumericVector sapply_loop(NumericVector x){ int n = x.size() ; NumericVector res(n); for( int i=0; i<n; i++){ res[i] = Rf_dnorm4( x[i], 0.0, 4.0, false ) ; } return res ; }
Here are the results.
$ Rcpp11Script /tmp/test.cpp > x <- seq(-3, 3, length.out = 1e+06) > require(microbenchmark) Loading required package: microbenchmark > microbenchmark(sapply_variadic(x), sapply_lambda(x), + sapply_bind(x), sapply_loop(x)) Unit: milliseconds expr min lq median uq max neval sapply_variadic(x) 20.01696 20.11962 21.36856 22.07377 31.22063 100 sapply_lambda(x) 20.53550 20.63079 21.83883 22.55680 31.62075 100 sapply_bind(x) 19.96870 20.56051 21.32460 22.26086 30.66203 100 sapply_loop(x) 20.81417 20.92458 22.13156 22.84175 31.48991 100 All 4 solutions give pretty identical performance. This is abstraction we did not have to pay for. In comparisons, a direct call to the vectorised R function dnorm R_direct <- function(x){ dnorm(x, 0, 4, FALSE) }
yields:
> microbenchmark(sapply_variadic(x), sapply_lambda(x), + sapply_bind(x), sapply_loop(x), R_direct(x)) Unit: milliseconds expr min lq median uq max neval sapply_variadic(x) 20.05441 20.12312 21.35391 22.67544 31.34657 100 sapply_lambda(x) 20.28648 20.39238 21.60423 22.31166 30.66797 100 sapply_bind(x) 19.94212 20.02965 21.26132 21.92637 30.68198 100 sapply_loop(x) 20.25010 20.31937 21.57333 22.89537 31.75865 100 R_direct(x) 33.73723 33.89319 35.06729 36.05020 43.95266 100
I also intended to test the version using R's sapply.
sapply_R <- function(x){ sapply(x, dnorm, 0, 4, FALSE ) }
But ... life's too short and I killed it.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.