Vectorized R vs Rcpp
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In my previous post, I tried to show, that Rcpp is 1000 faster than pure R and that generated the fuss in the comments. Being lazy, I didn’t vectorize R code and at the end I was comparing apples vs oranges.
To fix that problem, I built a new script, where I’m trying to compare apples against apples. First piece of code named “ifelse R” uses R “ifelse” function to vectorize code. Second piece of code is fully vectorized code written in R, third – pure C++ code and the last one is C++, where Rcpp ”ifelse” function is used.
name | seconds |
---|---|
ifelse R | 27.50 |
vectorized R | 10.40 |
pure C++ | 0.44 |
vectorized C++ | 2.24 |
Here we go – vectorization truly helps, but pure C++ code still 23 times faster. Of course you pay the price when writing it in C++.
I found a bit strange, that vectorized C++ code doesn’t perform that well…
You can get the code from github or review it below:
#Author Dzidorius Martinaitis #Date 2012-02-01 #Description http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp bid = runif(50000000,5,9) ask = runif(50000000,5,9) close = runif(50000000,5,9) x=data.frame(bid=bid,ask=ask,last_price=close) rez=0 ########### ifelse R ################# answ=as.vector(system.time( { rez = ifelse(x$last_price>0,ifelse(x[, "bid"] > x[, "last_price"], x[, "bid"], ifelse((x[, "ask"] > 0) & (x[, "ask"] < x[, "last_price"]), x[, "ask"], x[, "last_price"])), 0.5*(x[, "ask"] + x[,"bid"])) })[1]) ########### end ifelse R ################# ########### vectorized R ################# answ=append(answ,system.time( { lgt0 = x$last_price > 0 bgtl = x$bid > x$last_price agt0 = x$ask > 0 altl = x$ask > x$last_price rez = x$last_price rez[lgt0 & agt0 & altl] = x$ask[lgt0 & agt0 & altl] rez[lgt0 & bgtl] = x$bid[lgt0 & bgtl] rez[!lgt0] = (x$ask[!lgt0]+x$bid[!lgt0])/2 } )[1]) ########### end vectorized R ################# #C++ code starts here library(inline) library(Rcpp) ########### pure C++ ################# code=' NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);NumericVector ret(ask_); int bid_size = bid.size(); for(int i =0;i<bid_size;i++) { if(close[i]>0) { if(bid[i]>close[i]) { ret[i] = bid[i]; } else if(ask[i]>0 && ask[i]<close[i]) { ret[i] = ask[i];// } else { ret[i] = close[i];// } } else { ret[i]=(bid[i]+ask[i])/2; } } return ret; ' getLastPrice <- cxxfunction(signature( bid_ = "numeric",ask_ = "numeric",close_="numeric"),body=code,plugin="Rcpp") rez=0 answ=append(answ,system.time( { rez=getLastPrice(as.numeric(x$bid),as.numeric(x$ask),as.numeric(x$last_price)) })[1]) ########### end pure C++ ################# #summary(rez) ########### vectorized C++ ################# code=' NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);NumericVector ret(ask_); int bid_size = bid.size(); ret=ifelse(close>0,ifelse(bid >close, bid, ifelse(ask > 0,ifelse(ask < close,ask, close),close)), 0.5*(ask + bid)); return ret; ' getLastPrice <- cxxfunction(signature( bid_ = "numeric",ask_ = "numeric",close_="numeric"),body=code,plugin="Rcpp") rez=0 answ=append(answ,system.time( { rez=getLastPrice(as.numeric(x$bid),as.numeric(x$ask),as.numeric(x$last_price)) } )[1]) ########### end vectorized C++ ################# #summary(rez) names(answ)=c('ifelse R','vectorized R','pure C++','vectorized C++') library(ggplot2) a=data.frame(ind=1:4,val=answ) ggplot(a,aes(ind,val))+geom_point(legend=F)+geom_text(aes(label=names(answ),hjust=c(-0.2,-0.2,-0.2,0.8),vjust=c(0,0,0,-1)),size=4) |
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.