Rcpp vs. R implementation of cosine similarity

[This article was first published on R Chronicle, and kindly contributed to R-bloggers.]

While speeding up some code for a project with a colleague the other day, I ended up trying Rcpp for the first time. Re-implementing the cosine distance function with RcppArmadillo was relatively easy, using bits and pieces of code I found scattered around the web. But the speedup of the Rcpp code over pure R was smaller than I expected.
library(inline)
library(RcppArmadillo)

## cosine distance (1 - cosine similarity) between the columns of x
cosine <- function(x) {
  y <- t(x) %*% x                                      # cross-products of all column pairs
  res <- 1 - y / (sqrt(diag(y)) %*% t(sqrt(diag(y))))  # divide by the outer product of column norms
  return(res)
}

## the same computation in C++ via RcppArmadillo
cosineRcpp <- cxxfunction(
  signature(Xs = "matrix"),
  plugin = c("RcppArmadillo"),
  body = '
    Rcpp::NumericMatrix Xr(Xs);            // creates Rcpp matrix from SEXP
    int n = Xr.nrow(), k = Xr.ncol();
    arma::mat X(Xr.begin(), n, k, false);  // reuses memory and avoids extra copy
    arma::mat Y = arma::trans(X) * X;      // matrix product
    arma::mat res = (1 - Y / (arma::sqrt(arma::diagvec(Y)) * arma::trans(arma::sqrt(arma::diagvec(Y)))));
    return Rcpp::wrap(res);
  ')

mat <- matrix(rnorm(100000), ncol = 1000)
x <- cosine(mat)
y <- cosineRcpp(mat)
identical(x, y)
[1] TRUE
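
A note on the check above: identical() tests for exact equality, which can fail when floating-point results are computed along different code paths, whereas all.equal() compares within a numeric tolerance. As a minimal sketch in base R only (no compilation needed), the following checks the pure-R cosine() against a pair-by-pair computation of 1 - sum(a * b) / (|a| |b|). Here pair_dist is a hypothetical helper that is not part of the original post, and the small random matrix is purely illustrative.

```r
# Pure-R cosine distance between columns, as defined in the post.
cosine <- function(x) {
  y <- t(x) %*% x
  1 - y / (sqrt(diag(y)) %*% t(sqrt(diag(y))))
}

# Hypothetical helper (not from the post): cosine distance for one pair of vectors.
pair_dist <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

set.seed(1)                        # illustrative data only
m <- matrix(rnorm(20), ncol = 4)   # 5 rows, 4 columns

d <- cosine(m)

# Build the same 4 x 4 matrix one column pair at a time.
idx <- seq_len(ncol(m))
ref <- outer(idx, idx, Vectorize(function(i, j) pair_dist(m[, i], m[, j])))

# Compare within a numeric tolerance rather than bit for bit.
print(all.equal(d, ref))
```

The same all.equal() call could be used in place of identical(x, y) when comparing the R and Rcpp results, since the two implementations are not guaranteed to agree to the last bit on every platform.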
And here is the speed comparison…