[This article was first published on jared huling, and kindly contributed to R-bloggers].
This post is mostly an attempt to familiarize myself with R Markdown, Jekyll, and GitHub. I recently posted an R package (rfunctions), which contains some functions I wrote (or modified) that make my life a little easier. I’ll go through some examples in R to highlight the various functions included.
Installation
rfunctions is not available on CRAN, but it can be installed with the R package devtools using the following R code:
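A minimal sketch of the installation, assuming the package is hosted on GitHub under the repository path `jaredhuling/rfunctions` (the path is an assumption, so adjust it if the repository lives elsewhere):

```r
# Install devtools from CRAN if it is not already available
if (!requireNamespace("devtools", quietly = TRUE)) {
  install.packages("devtools")
}

# Assumed repository path; it is not confirmed by the text above
devtools::install_github("jaredhuling/rfunctions")

# Load the package once it is installed
library(rfunctions)
```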
Accelerated crossprod function
A project I’ve been working on requires fast evaluation of $X^TX$ for a design matrix $X$. I found a great example of exactly this in the RcppEigen paper by Douglas Bates and Dirk Eddelbuettel. RcppEigen provides a simple and effective interface between R and the blazing-fast Eigen C++ library for numerical linear algebra. Their example uses inline, a nice tool for inline C++ code in R, and I made a proper R function from it. The following showcases the speed of Eigen. Note that since $X^TX$ is symmetric, we only have to compute about half of the entries (one triangle plus the diagonal), which further reduces computation time.
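To make the symmetry argument concrete, here is a small sketch in base R only. It does not call crossprodcpp, whose exact signature is not shown in this post. The matrix sizes and variable names are made up for illustration.

```r
set.seed(1)
n <- 2000
p <- 50
X <- matrix(rnorm(n * p), n, p)

# Naive version: forms t(X) explicitly and computes all p^2 entries
xtx_naive <- t(X) %*% X

# Base R's crossprod() avoids the explicit transpose
xtx_fast <- crossprod(X)

# Symmetry by hand: compute only the upper triangle (including the
# diagonal), then mirror it into the lower triangle
xtx_half <- matrix(0, p, p)
for (j in seq_len(p)) {
  idx <- seq_len(j)
  xtx_half[idx, j] <- crossprod(X[, idx, drop = FALSE], X[, j])
}
xtx_half[lower.tri(xtx_half)] <- t(xtx_half)[lower.tri(xtx_half)]

# All three versions agree up to floating-point error
all.equal(xtx_naive, xtx_fast)
all.equal(xtx_half, xtx_fast)
```

The hand-rolled loop is only there to show that one triangle determines the whole matrix. In practice a compiled routine does this work.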
crossprodcpp can also compute a weighted cross product $X^T W X$, where $W$ is a diagonal weight matrix.
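As a rough illustration of what the weighted cross product computes, the following base R sketch forms $X^T W X$ three ways. It does not use crossprodcpp, since its argument names are not given here. The weight vector `w` is a hypothetical stand-in for the diagonal of $W$.

```r
set.seed(2)
n <- 500
p <- 10
X <- matrix(rnorm(n * p), n, p)
w <- runif(n)  # diagonal entries of W (nonnegative weights)

# Direct formula: builds the full n x n matrix W, wasteful for large n
xtwx_direct <- t(X) %*% diag(w) %*% X

# Row scaling: w * X multiplies row i of X by w[i], so W is never formed
xtwx_scaled <- crossprod(X, w * X)

# With nonnegative weights, scaling rows by sqrt(w) gives a plain
# (unweighted) cross product of the rescaled matrix
xtwx_sqrt <- crossprod(sqrt(w) * X)

all.equal(xtwx_direct, xtwx_scaled)
all.equal(xtwx_direct, xtwx_sqrt)
```

Because $W$ is diagonal, the product reduces to rescaling the rows of $X$, which is why the $n \times n$ matrix never needs to be stored.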
Largest Singular Value Computation
The Lanczos algorithm is a well-known method for fast computation of extremal eigenvalues. The Golub-Kahan-Lanczos (GKL) bidiagonalization algorithm extends this idea to approximate the largest singular values of a matrix $X$ from below. The function gklBidiag approximates the largest singular value of a matrix. Since GKL bidiagonalization is initialized from a random vector, we can compute a probabilistic upper bound for the singular value. The following compares the speed of gklBidiag and the implementation in the popular Fortran library PROPACK found in the svd package.
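For readers who want to see the idea behind the bidiagonalization, here is a minimal sketch in plain R for dense matrices. It is not the gklBidiag implementation. The function name `gkl_sigma_max` and its arguments are hypothetical, it performs no reorthogonalization, and it does not compute the probabilistic upper bound.

```r
# Sketch of Golub-Kahan-Lanczos bidiagonalization: after k steps,
# X V = U B with B upper bidiagonal (alpha on the diagonal, beta on the
# superdiagonal). The largest singular value of B approximates that of
# X from below.
gkl_sigma_max <- function(X, k = 20, tol = 1e-12) {
  v <- rnorm(ncol(X))
  v <- v / sqrt(sum(v^2))          # random unit starting vector
  u_prev <- numeric(nrow(X))
  beta_prev <- 0
  alpha <- numeric(0)
  beta <- numeric(0)
  for (j in seq_len(k)) {
    u <- as.numeric(X %*% v) - beta_prev * u_prev
    a <- sqrt(sum(u^2))
    if (a < tol) break             # breakdown: Krylov space exhausted
    u <- u / a
    alpha <- c(alpha, a)
    w <- as.numeric(crossprod(X, u)) - a * v
    b <- sqrt(sum(w^2))
    if (b < tol) break
    v <- w / b
    beta <- c(beta, b)
    u_prev <- u
    beta_prev <- b
  }
  m <- length(alpha)
  B <- diag(alpha, nrow = m)
  if (m > 1) B[cbind(1:(m - 1), 2:m)] <- beta[1:(m - 1)]
  svd(B, nu = 0, nv = 0)$d[1]
}

set.seed(3)
X <- matrix(rnorm(500 * 20), 500, 20)
est <- gkl_sigma_max(X, k = 20)
exact <- svd(X, nu = 0, nv = 0)$d[1]
c(estimate = est, exact = exact)
```

Each step costs only one product with $X$ and one with $X^T$, which is what makes this family of methods attractive for large or sparse matrices.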
As gklBidiag also works on sparse matrices (of the sparseMatrix class from the Matrix package), I can showcase another function in rfunctions, simSparseMatrix, which unsurprisingly simulates matrices with very few nonzero values. The nonzero values can either be all 1’s or generated from a normal distribution. The level of sparsity of the simulated matrix can be specified.
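The arguments of simSparseMatrix are not shown here, so as a stand-in the sketch below uses `rsparsematrix()` from the Matrix package to produce the same two kinds of matrices: one with normally distributed nonzero entries and one whose nonzero entries are all 1. The dimensions and density are arbitrary choices for illustration.

```r
library(Matrix)

set.seed(4)

# 1000 x 500 matrix with about 1% nonzero entries drawn from N(0, 1)
S_norm <- rsparsematrix(1000, 500, density = 0.01, rand.x = rnorm)

# Same sparsity level, but every nonzero entry equals 1
S_ones <- rsparsematrix(1000, 500, density = 0.01,
                        rand.x = function(n) rep(1, n))

# Fraction of nonzero entries in each matrix
nnzero(S_norm) / prod(dim(S_norm))
nnzero(S_ones) / prod(dim(S_ones))
```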
Faster Addition/Subtraction of Matrices
This may seem pointless, but I wrote functions to add and subtract matrices. It turns out my functions can be faster than the + and - operators, at least for sparse matrices. I’m sure someone will be quick to point out why using my add() and subtract() functions is silly and a bad idea.
Note, however, that the add() and subtract() methods for dense matrices are slower than the corresponding operators, so they’re only worth using when you have sparse matrices.