Using memoise to cache R values
Introduction
The memoise package can be very handy for caching the results of slow calculations. In interactive work, reading data is often the slowest step, so that is what is demonstrated here. The microbenchmark package is used to measure timings.
Methods and results
Setup
First, load the package being tested, and also a benchmarking package.
    library(memoise)
    library(microbenchmark)
Test conventional function
The demonstration will be for reading a CTD file.
    library(oce)

    ## Loading required package: methods
    ## Loading required package: mapproj
    ## Loading required package: maps
    ## Loading required package: ncdf4
    ## Loading required package: tiff
    microbenchmark(d <- read.oce("/data/arctic/beaufort/2012/d201211_0002.cnv"))

    ## Unit: milliseconds
    ##                                                          expr   min    lq
    ##  d <- read.oce("/data/arctic/beaufort/2012/d201211_0002.cnv") 160.4 162.5
    ##  median    uq   max neval
    ##   162.9 167.6 258.6   100
Memoise the function
Memoising read.oce() is simple:
    r <- memoise(read.oce)
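As a minimal sketch of what memoise() does, the following uses a hypothetical slow_square() function in place of read.oce(), so that it runs without the oce package or the data file. The function names and the 0.2 s delay are assumptions made for illustration only.

```r
library(memoise)

# Hypothetical stand-in for a slow operation such as reading a file.
slow_square <- function(x) {
  Sys.sleep(0.2)  # pretend this is expensive
  x^2
}

# memoise() returns a wrapper that caches results, keyed on the arguments.
fast_square <- memoise(slow_square)

# First call: the wrapped function runs and the result is stored.
t1 <- system.time(a <- fast_square(4))[["elapsed"]]

# Second call with the same argument: the result comes from the cache.
t2 <- system.time(b <- fast_square(4))[["elapsed"]]

cat("first call:", t1, "s; second call:", t2, "s\n")
```

The second call should return the same value as the first, without the delay.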
Measure the speed of memoised code
    microbenchmark(d <- r("/data/arctic/beaufort/2012/d201211_0002.cnv"))

    ## Unit: microseconds
    ##                                                  expr   min    lq median
    ##  d <- r("/data/arctic/beaufort/2012/d201211_0002.cnv") 47.47 48.61   49.5
    ##     uq    max neval
    ##  52.57 165199   100
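The two benchmarks can also be run in a single microbenchmark() call, which makes the comparison of medians direct. The sketch below is not the code used for the results above: slow_read() and the file name are hypothetical stand-ins, chosen so the example runs without any data file.

```r
library(memoise)
library(microbenchmark)

# Hypothetical stand-in for a slow file reader: waits 50 ms, returns a value.
slow_read <- function(path) {
  Sys.sleep(0.05)
  nchar(path)
}
cached_read <- memoise(slow_read)

# Named expressions are timed side by side; 'times' sets the repetitions.
mb <- microbenchmark(
  plain  = slow_read("some/file.cnv"),
  cached = cached_read("some/file.cnv"),
  times  = 20
)

# Raw timings are stored in nanoseconds in the 'time' column.
med <- tapply(mb$time, mb$expr, median)
speedup <- as.numeric(med["plain"]) / as.numeric(med["cached"])
print(speedup)
```

Comparing medians rather than means keeps the one slow first call of the memoised function from dominating the result.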
Conclusions
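In this example, the median time fell from 162.9 ms to 49.5 µs, a speedup by a factor of about 3000. The maximum of 165199 µs in the memoised benchmark is comparable to an unmemoised read, and is presumably the first call, which has to do the actual read and fill the cache.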
The operation tested here is quick enough for interactive work, but this is a 1-dbar file. Reading raw CTD data would take several seconds, and reading a whole section of CTD profiles perhaps half a minute or so. On repeated calls with the same arguments, using memoise() would reduce that half minute to a hundredth of a second, easily converting an annoyingly slow operation to what feels like zero time in an interactive session.
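One point to keep in mind when memoising a file reader is that the cache is keyed on the function arguments (here, the file name), not on the file contents. If the file changes on disk, the memoised function keeps returning the old value until the cache is cleared with forget(). The sketch below illustrates this with a temporary text file; read_file() is a hypothetical helper, not part of oce.

```r
library(memoise)

# Hypothetical helper standing in for a file reader such as read.oce().
read_file <- function(path) readLines(path)
read_cached <- memoise(read_file)

path <- tempfile(fileext = ".txt")
writeLines("first", path)
x1 <- read_cached(path)   # reads the file and caches the result

writeLines("second", path)
x2 <- read_cached(path)   # same argument, so the cached value is returned

forget(read_cached)       # clear the cache for this memoised function
x3 <- read_cached(path)   # reads the file again

unlink(path)
```

Here x2 still holds the old contents, and x3 holds the new contents.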