Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
To quantify the impact of the CPU on an analysis, I created the package benchmarkme. The idea is simple. If everyone runs the same R script, we can easily compare machines.
One of the benchmarks in the package compares read/write speeds: we write a large CSV file (using write.csv) and read it back in using read.csv.
The package is on CRAN and can be installed in the usual way
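As a rough illustration of what such a read/write benchmark measures, here is a minimal base R sketch. It is not the package's implementation: the function name time_csv_io, its arguments and the small default matrix size are assumptions made for this example.

```r
# Hypothetical helper: time one CSV write/read round trip with base R only.
time_csv_io = function(n_rows = 1e4, n_cols = 10, dir = tempdir()) {
  # Random numeric matrix to serialise
  m = matrix(runif(n_rows * n_cols), nrow = n_rows, ncol = n_cols)
  path = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(path))  # remove the file when the function exits

  # Elapsed (wall-clock) seconds for the write and the read
  write_time = system.time(write.csv(m, path, row.names = FALSE))[["elapsed"]]
  size_mb = file.size(path) / 1024^2
  read_time = system.time(d <- read.csv(path))[["elapsed"]]

  c(size_mb = size_mb, write = write_time, read = read_time)
}

timings = time_csv_io()
print(timings)
```

Increasing n_rows gives a larger file. A single run is noisy, which is one reason to repeat the timing several times, as the runs argument in the package does.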
install.packages("benchmarkme")
Running
library(benchmarkme)
## If your computer is relatively slow, remove 200 from below
res = benchmark_io(runs = 3, size = c(5, 50, 200))
## Upload your data set
upload_results(res)
creates three matrices of size 5MB, 50MB and 200MB, writes the associated CSV file to the directory
Sys.getenv("TMPDIR")
and then reads the data set back into R. The object res contains the timings, which can be compared to other users' results via
plot(res)
The above graph plots the current benchmarking results for writing a 5MB file (my machine is relatively fast).
Shiny
You can also compare your results using the Shiny interface. Simply create a results bundle
create_bundle(res, filename = "results.rds")
and upload to the webpage.
Network drives
Often the dataset we wish to access is on a network drive. Unfortunately, network drives can be slow. The benchmark_io function has a tmpdir argument that lets us change the directory and so estimate the impact of the network drive:
res_net = benchmark_io(runs = 3, size = c(5, 50, 200), tmpdir = "path_to_dir")
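To show the idea of comparing directories without needing a real network drive, here is a small base R sketch. The helper time_write and the directory other_dir are hypothetical; other_dir is only a local stand-in that you would replace with the path to your own network drive.

```r
# Hypothetical helper: elapsed seconds to write one CSV file into `dir`.
time_write = function(dir, n_rows = 2e4, n_cols = 10) {
  m = matrix(runif(n_rows * n_cols), nrow = n_rows)
  path = tempfile(tmpdir = dir, fileext = ".csv")
  on.exit(unlink(path))  # remove the file when the function exits
  system.time(write.csv(m, path, row.names = FALSE))[["elapsed"]]
}

# Local temporary directory versus a stand-in for the network location
local_dir = tempdir()
other_dir = file.path(tempdir(), "stand_in_for_network_drive")
dir.create(other_dir, showWarnings = FALSE)

dirs = c(local = local_dir, other = other_dir)
timings = vapply(dirs, time_write, numeric(1))
print(timings)
```

With a real network path in place of other_dir, a clearly larger value in the second entry would point to the drive as the bottleneck rather than the CPU.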