Rcpp vs. R implementation of cosine similarity
[This article was first published on R Chronicle, and kindly contributed to R-bloggers.]
While speeding up some code for a project with a colleague the other day, I ended up trying Rcpp for the first time. I re-implemented the cosine distance function with RcppArmadillo relatively easily, using bits and pieces of code I found scattered around the web. But the speed-up of the Rcpp code over pure R was smaller than I expected.
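For reference, the cosine distance between two vectors is one minus the cosine of the angle between them. A minimal sketch of that definition for a single pair of vectors follows; `cosine_dist_pair` is a hypothetical helper name used only for illustration and is not part of the original code.

```r
## hypothetical helper: cosine distance between two numeric vectors
cosine_dist_pair <- function(a, b) {
  1 - sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

a <- c(1, 0)
b <- c(0, 1)
cosine_dist_pair(a, b)      # orthogonal vectors: 1
cosine_dist_pair(a, 2 * a)  # same direction: 0
```

The matrix versions in the post compute this quantity for every pair of columns at once.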
require(inline)
require(RcppArmadillo)

## extract cosine distance (1 - cosine similarity) between columns
cosine <- function(x) {
  y <- t(x) %*% x
  res <- 1 - y / (sqrt(diag(y)) %*% t(sqrt(diag(y))))
  return(res)
}

cosineRcpp <- cxxfunction(
  signature(Xs = "matrix"),
  plugin = c("RcppArmadillo"),
  body='
    Rcpp::NumericMatrix Xr(Xs);  // creates Rcpp matrix from SEXP
    int n = Xr.nrow(), k = Xr.ncol();
    arma::mat X(Xr.begin(), n, k, false);  // reuses memory and avoids extra copy
    arma::mat Y = arma::trans(X) * X;  // matrix product
    arma::mat res = (1 - Y / (arma::sqrt(arma::diagvec(Y)) * arma::trans(arma::sqrt(arma::diagvec(Y)))));
    return Rcpp::wrap(res);
')

mat <- matrix(rnorm(100000), ncol=1000)
x <- cosine(mat)
y <- cosineRcpp(mat)
identical(x, y)
[1] TRUE
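To put a number on the comparison, a small timing harness along these lines could be used. This is only a sketch under stated assumptions: `time_it` and `cosine_crossprod` are hypothetical names that do not appear in the original code, and the variant shown is a pure-R alternative built on `crossprod()` and `tcrossprod()` rather than the Rcpp function. It runs without a compiler; `cosineRcpp` could be passed to `time_it()` in the same way once it has been built.

```r
## cosine distance between columns, as defined above
cosine <- function(x) {
  y <- t(x) %*% x
  1 - y / (sqrt(diag(y)) %*% t(sqrt(diag(y))))
}

## hypothetical variant: crossprod()/tcrossprod() instead of explicit transposes
cosine_crossprod <- function(x) {
  y <- crossprod(x)        # same product as t(x) %*% x
  d <- sqrt(diag(y))       # column norms
  1 - y / tcrossprod(d)    # tcrossprod(d) is d %*% t(d)
}

## hypothetical helper: median elapsed seconds over a few repetitions
time_it <- function(f, x, reps = 5) {
  median(replicate(reps, system.time(f(x))[["elapsed"]]))
}

set.seed(1)
mat <- matrix(rnorm(100000), ncol = 1000)

## compare with a tolerance rather than identical(), since the two
## pure-R versions may differ in the last floating-point digits
all.equal(cosine(mat), cosine_crossprod(mat))

timings <- c(cosine = time_it(cosine, mat),
             cosine_crossprod = time_it(cosine_crossprod, mat))
print(timings)
```

Taking the median over several repetitions smooths out one-off noise in `system.time()`; the actual numbers will depend on the machine and on the BLAS library that R is linked against.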