Timing in R
As time goes on, your R scripts are probably getting longer and more complicated, right? Timing parts of your script could save you precious time when re-running code over and over again. Today I’m going to go through the 4 main functions for doing so.
Nested timings
1) Sys.time()
Sys.time() takes a “snapshot” of the current time, so it can be used to record the start and end times of a piece of code.
start_time = Sys.time()
Sys.sleep(0.5)
end_time = Sys.time()
To calculate the difference, we just use a simple subtraction:
end_time - start_time
## Time difference of 0.5061 secs
Notice it creates a neat little message for the time difference.
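The subtraction returns a difftime object, and R picks the units (seconds, minutes, hours and so on) automatically. If the result is going to be stored or compared later, it can be safer to ask for the units explicitly. A minimal sketch using base R's difftime(); the names elapsed and elapsed_secs are just for illustration:

```r
start_time = Sys.time()
Sys.sleep(0.5)
end_time = Sys.time()

# Ask for seconds explicitly rather than letting R choose the units
elapsed = difftime(end_time, start_time, units = "secs")

# Drop the difftime class to get a plain number for later use
elapsed_secs = as.numeric(elapsed)
elapsed_secs
```

This way a long-running section will not silently switch from seconds to minutes in your stored results.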
2) The tictoc package
You can install the CRAN version of tictoc via
install.packages("tictoc")
whilst the most recent development version is available via the tictoc GitHub page.
library("tictoc")
Like Sys.time(), tictoc also gives us the ability to nest timings within code. However, we now have some more options to customise our timing. At its most basic it acts like Sys.time():
tic()
Sys.sleep(0.5)
toc()
## 0.506 sec elapsed
Now for a more contrived example.
# start timer for the entire section, notice we can name sections of code
tic("total time")
# start timer for first subsection
tic("Start time til half way")
Sys.sleep(2)
# end timer for the first subsection, log = TRUE tells toc to also save the timing to the log
toc(log = TRUE)
## Start time til half way: 2.037 sec elapsed
Now to start the timer for the second subsection
tic("Half way til end")
Sys.sleep(2)
# end timer for second subsection
toc(log = TRUE)
## Half way til end: 2.012 sec elapsed
# end timer for entire section
toc(log = TRUE)
## total time: 4.067 sec elapsed
We can view the logged results as a list (format = TRUE returns each entry as a nicely formatted message, rather than the raw timing data):
tic.log(format = TRUE)
## [[1]]
## [1] "Start time til half way: 2.037 sec elapsed"
##
## [[2]]
## [1] "Half way til end: 2.012 sec elapsed"
##
## [[3]]
## [1] "total time: 4.067 sec elapsed"
We can also reset the log via
tic.clearlog()
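If you want numbers rather than messages, the raw log can be turned into a small table. The sketch below assumes the tictoc package is installed and that each raw log entry carries tic, toc and msg elements, which is how the log is structured in the versions I have seen; the section names and the timings data frame are made up for illustration.

```r
# Assumes the tictoc package is installed
library("tictoc")

tic.clearlog()

tic("first step")
Sys.sleep(0.2)
toc(log = TRUE, quiet = TRUE)  # quiet = TRUE suppresses the printed message

tic("second step")
Sys.sleep(0.1)
toc(log = TRUE, quiet = TRUE)

# format = FALSE returns the raw entries instead of formatted messages
raw_log = tic.log(format = FALSE)

# Build a table with one row per logged section
timings = data.frame(
  section = sapply(raw_log, function(x) x$msg),
  seconds = sapply(raw_log, function(x) unname(x$toc - x$tic))
)
timings

tic.clearlog()
```

Storing the timings this way makes it straightforward to sort the sections or write them to a file after a long run.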
Comparing functions
1) system.time()
Why oh WHY did R choose to give system.time() a lower case s and Sys.time() an upper case S? Anyway… system.time() can be used to time functions without having to take note of the start and end times.
system.time(Sys.sleep(0.5))
##    user  system elapsed
##   0.000   0.000   0.501
system.time(Sys.sleep(1))
##    user  system elapsed
##   0.000   0.000   1.003
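We only want to take notice of the “elapsed” time here. Roughly speaking, “user” is the CPU time spent running the R code itself and “system” is the CPU time the operating system spent on its behalf; both are close to zero above because Sys.sleep() just waits. For the full definition of the “user” and “system” times see this thread.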
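To see the difference between the columns, it can help to compare a sleeping call with some CPU-bound work. This is only a rough sketch: the loop is an arbitrary example of busy work, and the exact numbers will depend on your machine.

```r
# Sleeping: the process just waits, so almost no CPU time is used
sleep_time = system.time(Sys.sleep(0.5))

# CPU-bound work: "user" time should make up most of the elapsed time
cpu_time = system.time({
  total = 0
  for (i in 1:5e6) total = total + sqrt(i)
})

# system.time() returns a named vector, so components can be extracted by name
sleep_time[["elapsed"]]
sleep_time[["user.self"]]
cpu_time[["elapsed"]]
cpu_time[["user.self"]]
```

Extracting the elapsed component by name is also handy when you want to keep a single number rather than the printed three-column output.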
For a repeated timing, we would use the replicate() function.
system.time(replicate(10, Sys.sleep(0.1)))
##    user  system elapsed
##   0.000   0.000   1.007
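The call above gives one total for all ten runs. If you would rather see how the individual runs vary, one option is to flip the nesting and time each run separately. A small sketch in base R; n_reps and run_times are illustrative names:

```r
n_reps = 10

# Time each run on its own and keep only the elapsed component
run_times = replicate(n_reps, system.time(Sys.sleep(0.1))[["elapsed"]])

# Average time per run, plus the spread across runs
mean(run_times)
summary(run_times)
```

This is essentially a hand-rolled version of what a benchmarking package does for you.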
2) The microbenchmark package
You can install the CRAN version of microbenchmark via
install.packages("microbenchmark")
Alternatively you can install the latest update via the microbenchmark GitHub page.
library("microbenchmark")
At its most basic, microbenchmark() can be used to time single pieces of code.
# times = 10: repeat the test 10 times
# unit = "s": output in seconds
microbenchmark(Sys.sleep(0.1), times = 10, unit = "s")
## Unit: seconds
##            expr    min     lq   mean median     uq    max neval
##  Sys.sleep(0.1) 0.1001 0.1006 0.1005 0.1006 0.1006 0.1006    10
Notice we get a nicely formatted table of summary statistics. We can record our times in anything from seconds to nanoseconds(!!!!). Already this is better than system.time(). Not only that, but we can compare sections of code in an easy-to-do way and name the sections of code for an easy-to-read output.
sleep = microbenchmark(sleepy = Sys.sleep(0.1),
                       sleepier = Sys.sleep(0.2),
                       sleepiest = Sys.sleep(0.3),
                       times = 10,
                       unit = "s")
As well as this (more?!) microbenchmark comes with two built-in plotting functions.
microbenchmark:::autoplot.microbenchmark(sleep)
microbenchmark:::boxplot.microbenchmark(sleep)
These provide quick and efficient ways of visualising our timings.
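When you need the numbers rather than a plot, the benchmark object can also be summarised or converted to a plain data frame. The sketch below assumes the microbenchmark package is installed; it uses fewer repetitions than the example above just to keep it quick, and sleep_summary and raw_times are illustrative names.

```r
# Assumes the microbenchmark package is installed
library("microbenchmark")

sleep = microbenchmark(sleepy = Sys.sleep(0.1),
                       sleepier = Sys.sleep(0.2),
                       times = 5)

# summary() returns a data frame of summary statistics in the requested unit
sleep_summary = summary(sleep, unit = "ms")
sleep_summary[, c("expr", "median", "neval")]

# The raw timings are stored in nanoseconds, one row per evaluation
raw_times = as.data.frame(sleep)
head(raw_times)
```

The summary data frame is convenient for reports, while the raw timings are useful if you want to run your own analysis or plots.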
Conclusion
Sys.time() and system.time() have their place, but for most cases we can do better. The tictoc and microbenchmark packages are particularly useful and make it easy to store timings for later use, and the range of options for both packages stretches far past the options for Sys.time() and system.time(). The built-in plotting functions are handy.
Thanks for chatting!