microbenchmarking with R

Posted on April 28, 2012 by tylerrinker in R bloggers

[This article was first published on TRinker's R Blog » R, and kindly contributed to R-bloggers].

I love to benchmark. Maybe I’m a bit weird, but I love to bench everything in R. Recently I’ve had people challenge the accuracy of the typical system.time and rbenchmark package approaches to benchmarking. I saw Hadley Wickham promoting the microbenchmark package and decided to give it a whirl. This approach claims to improve accuracy and picks a timing method suited to your OS. A box plot or a ggplot of the function’s output can also help in understanding and comparing the expressions being timed. Here’s a demo test:

library(microbenchmark); library(plyr)
op <- microbenchmark(
    PLYR = ddply(mtcars, .(cyl, gear), summarise,
        output = mean(hp)),
    AGGR = aggregate(hp ~ cyl + gear, mtcars, mean),
    TAPPLY = tapply(mtcars$hp, interaction(mtcars$cyl,
        mtcars$gear), mean),
    times = 1000L)  # evaluate each expression 1000 times

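print(op)  # summary table: min, quartiles and max per expression
boxplot(op)  # box plot of the timings
library(ggplot2)  # for a log-scale plot of the raw timings
# op stores one row per run; its time column is in nanoseconds
# (qplot() is deprecated in recent ggplot2 releases)
qplot(y = time, data = op, colour = expr) + scale_y_log10()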

The output to the console window using print(op) looks something like this:

Unit: milliseconds
    expr      min       lq   median       uq       max
1   AGGR 2.856758 2.972932 3.121999  3.48615 121.49828
2   PLYR 7.880229 8.497956 8.983880 10.71436 139.04940
3 TAPPLY 1.108085 1.159873 1.196731  1.30824  67.33326
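
The max column sits far above the upper quartile for all three expressions, so I'd lean on the medians when comparing them. As a rough base-R sketch (the names med and rel are my own, and the values are copied by hand from the table above, not taken from a fresh run), the medians can be expressed relative to the fastest expression:

```r
# Medians in milliseconds, copied from the printed table above
med <- c(AGGR = 3.121999, PLYR = 8.983880, TAPPLY = 1.196731)

# Ratio of each median to the smallest median (the fastest expression)
rel <- med / min(med)

# Slowest first, rounded for readability
round(sort(rel, decreasing = TRUE), 2)
```

On these numbers aggregate comes out about 2.6 times slower than tapply, and ddply about 7.5 times slower.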

The ggplot log plot from the output: