R tip: consider using radix sort.
The method = "radix" option of order() can greatly speed up sorting and ordering tables in R.
For a 1 million row table the speedup is already about 35 times (around 9.7 seconds versus under 3 tenths of a second). Below is an excerpt from a sorting experiment that times order() with its default settings against order() with radix sort (full code here).
timings <- microbenchmark(
  order_default = d[order(d$col_a, d$col_b, d$col_c, d$col_x), ,
                    drop = FALSE],
  order_radix = d[order(d$col_a, d$col_b, d$col_c, d$col_x,
                        method = "radix"), ,
                  drop = FALSE],
  check = my_check,
  times = 10L)
print(timings)
## Unit: milliseconds
##           expr       min        lq      mean    median        uq        max neval
##  order_default 9531.2865 9653.6827 9759.8929 9690.6702 9833.2170 10329.3520    10
##    order_radix  262.1377  263.3226  278.2547  265.1452  274.2476   382.2544    10
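To try the pattern without the full benchmark, here is a minimal sketch on a small made-up table. The data frame d and its columns below are hypothetical example data, not the 1 million row table timed above, and the sketch only shows the calling pattern, not the speedup. One caveat worth checking on your own data: per the R documentation for order(), method = "radix" collates character vectors in the C locale, so for character keys the row order may differ from the locale-aware default.

```r
# Hypothetical example data (an assumption for illustration only).
set.seed(2018)
n <- 1000L
d <- data.frame(
  col_a = sample(letters, n, replace = TRUE),  # character key
  col_b = sample(1:10, n, replace = TRUE),     # integer key
  col_x = runif(n),                            # numeric key
  stringsAsFactors = FALSE
)

# Row permutation computed with radix sort requested explicitly.
idx_radix <- order(d$col_a, d$col_b, d$col_x, method = "radix")

# Reorder the table; drop = FALSE keeps the result a data.frame.
d_radix <- d[idx_radix, , drop = FALSE]

head(d_radix)
```

The same method = "radix" argument is accepted by sort() for a single vector.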
