wapply: A faster (but less functional) ‘rollapply’ for vector setups
For some cryptic reason I needed a function that calculates function values on sliding windows of a vector. Googling around soon brought me to ‘rollapply’, which, when I tested it, turned out to be a very versatile function. However, I wanted to code my own version just for vectors, in the hope that it would be somewhat faster.
This is what I came up with (wapply for “window apply”):
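wapply <- function(x, width, by = NULL, FUN = NULL, ...) {
  FUN <- match.fun(FUN)
  # Default to non-overlapping windows
  if (is.null(by)) by <- width

  lenX <- length(x)
  # Start index of every window
  SEQ1 <- seq(1, lenX - width + 1, by = by)
  # Full index vector of every window
  SEQ2 <- lapply(SEQ1, function(x) x:(x + width - 1))

  # Apply FUN to each window, then simplify the list to a vector/array
  OUT <- lapply(SEQ2, function(a) FUN(x[a], ...))
  OUT <- base:::simplify2array(OUT, higher = TRUE)
  return(OUT)
}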
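To make the windowing logic concrete, here is a minimal sketch of the index construction on a tiny vector. The values of `x`, `width` and `by` are chosen purely for illustration, and `sum` stands in for whatever function you want to apply.

```r
# Illustrative values only; not taken from the benchmarks in this post
x <- 1:10
width <- 3
by <- 2

# Start index of every window: 1 3 5 7
starts <- seq(1, length(x) - width + 1, by = by)

# Elements covered by each window: 1:3, 3:5, 5:7, 7:9
windows <- lapply(starts, function(s) x[s:(s + width - 1)])

# One value per window: 6 12 18 24
sums <- sapply(windows, sum)
print(sums)
```

Note that the last element (10) never ends up in a window: a window starting at 9 would run past the end of the vector, and there is no padding.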
‘wapply’ is much more restricted than ‘rollapply’ (no padding and no left/center/right alignment, for example).
But interestingly, for some setups it is very much faster:
library(zoo)
x <- 1:200000
large window, small slides:
> system.time(RES1 <- rollapply(x, width = 1000, by = 50, FUN = fun))
   user  system elapsed
   3.71    0.00    3.84
> system.time(RES2 <- wapply(x, width = 1000, by = 50, FUN = fun))
   user  system elapsed
   1.89    0.00    1.92
> all.equal(RES1, RES2)
[1] TRUE
small window, small slides:
> system.time(RES1 <- rollapply(x, width = 50, by = 50, FUN = fun))
   user  system elapsed
   2.59    0.00    2.67
> system.time(RES2 <- wapply(x, width = 50, by = 50, FUN = fun))
   user  system elapsed
   0.86    0.00    0.89
> all.equal(RES1, RES2)
[1] TRUE
small window, large slides:
> system.time(RES1 <- rollapply(x, width = 50, by = 1000, FUN = fun))
   user  system elapsed
   1.68    0.00    1.77
> system.time(RES2 <- wapply(x, width = 50, by = 1000, FUN = fun))
   user  system elapsed
   0.06    0.00    0.06
> all.equal(RES1, RES2)
[1] TRUE
That is about a 2- to 3-fold gain in speed for the first two setups, but roughly a 30-fold gain (going by the elapsed times shown) in the small window/large slides setup. Interesting…
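For reference, the speed-up factors can be recomputed directly from the elapsed times printed above. This is only arithmetic on the rounded numbers in this post, so the last ratio in particular is sensitive to rounding: 0.06 s could be anything from 0.055 to 0.065.

```r
# Elapsed times (seconds) copied from the three benchmarks above
elapsed_rollapply <- c(large_small = 3.84, small_small = 2.67, small_large = 1.77)
elapsed_wapply    <- c(large_small = 1.92, small_small = 0.89, small_large = 0.06)

# Speed-up of wapply over rollapply per setup
speedup <- round(elapsed_rollapply / elapsed_wapply, 1)
print(speedup)
# large_small small_small small_large
#         2.0         3.0        29.5
```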
I noticed that zoo:::rollapply.zoo uses mapply internally; maybe that adds some overhead for pure vector calculations…
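One rough way to probe that guess is to time an lapply-based and a mapply-based window loop on the same plain vector. The sketch below is a hypothetical comparison and does not reproduce zoo's internal code: `via_lapply` and `via_mapply` are made-up helper names, and `mean` is just a stand-in function. Whatever difference it shows says nothing definite about where rollapply spends its time.

```r
# Hypothetical comparison; this is NOT zoo's internal implementation
x <- as.numeric(1:200000)
width <- 50
by <- 50
starts <- seq(1, length(x) - width + 1, by = by)
ends <- starts + width - 1

# Variant A: lapply over the start indices (the approach wapply takes)
via_lapply <- function() {
  unlist(lapply(starts, function(s) mean(x[s:(s + width - 1)])))
}

# Variant B: mapply over paired start/end indices
via_mapply <- function() {
  mapply(function(s, e) mean(x[s:e]), starts, ends)
}

# Both variants must agree before the timings mean anything
stopifnot(isTRUE(all.equal(via_lapply(), via_mapply())))

print(system.time(via_lapply()))
print(system.time(via_mapply()))
```

For a fairer picture you would repeat each call many times, since a single run of either variant is fast enough for timer noise to matter.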
Cheers,
Andrej