Avoiding unnecessary memory allocations in R

[This article was first published on bioCS, and kindly contributed to R-bloggers.]

As a rule, everything I discover in R has already been discussed by Hadley Wickham. In this case, he writes:
"The reason why the C++ function is faster is subtle, and relates to memory management. The R version needs to create an intermediate vector the same length as y (x - ys), and allocating memory is an expensive operation. The C++ function avoids this overhead because it uses an intermediate scalar."
In my case, I want to count the number of items in a vector below a certain threshold. With sum(v < threshold), R first allocates a new logical vector, as long as the input, to hold the result of the comparison, and only then sums over it. It's possible to speed that up about ten-fold by counting directly in C++:

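library(Rcpp)

# Count how many elements satisfy x < y without allocating the intermediate
# logical vector. At most one of x and y may have length greater than one.
`%count<%` <- cppFunction('
size_t count_less(NumericVector x, NumericVector y) {
  const R_xlen_t nx = x.size();
  const R_xlen_t ny = y.size();
  if (nx > 1 && ny > 1) stop("Only one parameter can be a vector!");
  if (nx == 0 || ny == 0) return 0;  // nothing to compare
  size_t count = 0;
  if (nx == 1) {
    // scalar on the left: count the elements of y above it
    const double c = x[0];
    for (R_xlen_t i = 0; i < ny; i++) count += c < y[i];
  } else {
    // scalar on the right: count the elements of x below it
    const double c = y[0];
    for (R_xlen_t i = 0; i < nx; i++) count += x[i] < c;
  }
  return count;
}
')

set.seed(42)
N <- 100000000  # 1e8 doubles, roughly 800 MB
v <- runif(N, 0, 10000)
system.time(sum(v < 5000))    # allocates a temporary logical vector of length N
system.time(v %count<% 5000)  # single pass with a scalar accumulator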
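The difference between the two strategies can also be sketched in plain C++, without Rcpp. This is only an illustration under my own assumptions: the function names count_less_two_pass and count_less_one_pass are made up for this example, and the first function merely imitates the allocate-then-sum pattern rather than reproducing what R does internally. It uses a vector of int for the temporary because R stores logical values in four bytes each.

```cpp
#include <cstddef>
#include <vector>

// Imitates sum(v < threshold): materialise the whole comparison result
// first, then sum it. The temporary has one element per input element.
std::size_t count_less_two_pass(const std::vector<double>& v, double threshold) {
    std::vector<int> below(v.size());  // the intermediate vector (extra allocation)
    for (std::size_t i = 0; i < v.size(); ++i) below[i] = v[i] < threshold;
    std::size_t count = 0;
    for (int flag : below) count += flag;
    return count;
}

// Single pass with a scalar accumulator: no temporary vector is allocated.
std::size_t count_less_one_pass(const std::vector<double>& v, double threshold) {
    std::size_t count = 0;
    for (double value : v) count += value < threshold;
    return count;
}
```

Both functions return the same count. The second one just never asks for memory proportional to the input, which is the same idea the Rcpp operator above relies on.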

Often this won't be the bottleneck, but it may be useful at some point.
