microbenchmarking with R
[This article was first published on TRinker's R Blog » R, and kindly contributed to R-bloggers].
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I love to benchmark. Maybe I’m a bit weird, but I love to bench everything in R. Recently I’ve had people question the accuracy of the typical system.time and rbenchmark package approaches to benchmarking. I saw Hadley Wickham promoting the microbenchmark package and decided to give it a whirl. The package claims to improve accuracy by using the most precise timing functions your OS provides. A box plot or a ggplot of the function’s output can also aid in understanding and comparing functions. Here’s a demo test:
library(microbenchmark); library(plyr)

op <- microbenchmark(
    PLYR = ddply(mtcars, .(cyl, gear), summarise, output = mean(hp)),
    AGGR = aggregate(hp ~ cyl + gear, mtcars, mean),
    TAPPLY = tapply(mtcars$hp, interaction(mtcars$cyl, mtcars$gear), mean),
    times = 1000L)

print(op)    # standard data frame of the output
boxplot(op)  # boxplot of output

library(ggplot2)  # nice log plot of the output
qplot(y = time, data = op, colour = expr) + scale_y_log10()
The output to the console window using print(op) looks something like this:
Unit: milliseconds
    expr      min       lq   median       uq       max
1   AGGR 2.856758 2.972932 3.121999  3.48615 121.49828
2   PLYR 7.880229 8.497956 8.983880 10.71436 139.04940
3 TAPPLY 1.108085 1.159873 1.196731  1.30824  67.33326
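If you want the comparison as a ratio rather than raw milliseconds, you can work on the benchmark object directly. It behaves like a data frame with expr and time columns (the qplot call relies on exactly that), and time is stored in nanoseconds. Here is a rough sketch; the two expressions, the op2, med_ms and rel names, and times = 100L are my own choices to keep the demo quick, not the benchmark above:

```r
library(microbenchmark)

# Smaller demo so it runs quickly; the expressions and times = 100L are
# assumptions for illustration, not the benchmark from the post.
op2 <- microbenchmark(
    AGGR   = aggregate(hp ~ cyl, mtcars, mean),
    TAPPLY = tapply(mtcars$hp, mtcars$cyl, mean),
    times = 100L
)

# time is stored in nanoseconds; convert per-expression medians to milliseconds
med_ms <- tapply(op2$time, op2$expr, median) / 1e6

# ratio of each median to the fastest one (the fastest gets 1)
rel <- med_ms / min(med_ms)

print(round(med_ms, 3))
print(round(rel, 2))
```

I compare medians here because, as the max column above shows, a few slow runs can stretch the upper tail a long way.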
The ggplot log plot from the output:
The boxplot from the output:
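For contrast, here is roughly what the plain system.time approach looks like when the call only takes a few milliseconds. This is a sketch in base R only; the wrapper f and the choice of 20 repetitions are hypothetical, picked just for illustration:

```r
# Hypothetical wrapper around one of the expressions from the demo
f <- function() aggregate(hp ~ cyl + gear, mtcars, mean)

# A single timing: elapsed wall-clock seconds for one call
single <- system.time(f())[["elapsed"]]

# Repeat the single-call timing 20 times to see how much it moves around
elapsed <- replicate(20, system.time(f())[["elapsed"]])

print(single)
print(summary(elapsed))
```

A single short call sits close to the resolution system.time reports at, so one timing tells you little on its own. Repeating the expression many times and looking at the whole distribution is the part that microbenchmark automates.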