Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
When experimenting with various quantile functions in R, I was shocked [ok, this is a bit excessive, let us say surprised] by how widely the execution times vary, to the point that I first blamed a completely different feature of R. Borrowing the list of standard distributions from Charlie Geyer’s webpage on probability distributions in R, here is a table of timings. For each distribution, I ran
    u=runif(1e7)
    system.time(x<-qcauchy(u))
choosing an arbitrary parameter whenever needed.
Distribution | Function | Time (s) |
---|---|---|
Cauchy | qcauchy | 2.2 |
Chi-Square | qchisq | 43.8 |
Exponential | qexp | 0.95 |
F | qf | 34.2 |
Gamma | qgamma | 37.2 |
Logistic | qlogis | 1.7 |
Log Normal | qlnorm | 2.2 |
Normal | qnorm | 1.4 |
Student t | qt | 31.7 |
Uniform | qunif | 0.86 |
Weibull | qweibull | 2.9 |
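A comparison of this kind can be scripted rather than run by hand. The following is only a sketch, not the code behind the table: the parameter values (df = 3, shape = 1.5, and so on) are arbitrary assumptions, the list name `quantile_calls` is made up for the illustration, and the sample size is cut to 1e5 so that it runs quickly (raise `n` to 1e7 to match the post).

```r
# Sketch: time several quantile functions on the same uniform sample.
# Parameter values are arbitrary assumptions, not those behind the table.
set.seed(1)
n <- 1e5            # the post uses 1e7; smaller here to keep the run short
u <- runif(n)

# One wrapper per distribution, so that each call takes a single argument
quantile_calls <- list(
  qcauchy  = function(p) qcauchy(p),
  qchisq   = function(p) qchisq(p, df = 3),
  qexp     = function(p) qexp(p),
  qf       = function(p) qf(p, df1 = 3, df2 = 5),
  qgamma   = function(p) qgamma(p, shape = 1.5),
  qlogis   = function(p) qlogis(p),
  qlnorm   = function(p) qlnorm(p),
  qnorm    = function(p) qnorm(p),
  qt       = function(p) qt(p, df = 3),
  qunif    = function(p) qunif(p),
  qweibull = function(p) qweibull(p, shape = 1.5)
)

# Elapsed time in seconds for each inversion
timings <- sapply(quantile_calls,
                  function(f) system.time(f(u))[["elapsed"]])
print(round(sort(timings), 3))
```

The absolute numbers depend on the machine and the R version, so only the ordering and the ratios are worth comparing with the table.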
Of course, the comparison does not mean much, since all the slow distributions are parameterised (Weibull being the only parameterised one in the list that stays fast). Nonetheless, that a chi-square inversion takes 50 times longer than a uniform inversion remains puzzling: why is it not coded more efficiently? In particular, I was wondering why the chi-square inversion was slower than the Gamma inversion. Rerunning both inversions showed that they take the same time:
    > u=runif(1e7)
    > system.time(x<-qgamma(u,sha=1.5))
       user  system elapsed
     21.534   0.016  21.532
    > system.time(x<-qchisq(u,df=3))
       user  system elapsed
     21.372   0.008  21.361
Which also shows how variable system.time can be: both calls took about 21 seconds here, against 37.2 and 43.8 seconds in the table above.
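That the two inversions cost the same is less surprising once one recalls that a chi-square distribution with df degrees of freedom is a Gamma distribution with shape df/2 and scale 2, so the quantiles for df=3 are simply twice the Gamma quantiles for shape 1.5 (and the default rate of 1). A quick numerical check, as a sketch with an arbitrary seed and a reduced sample size:

```r
# Sketch: chi-square(3) quantiles versus rescaled Gamma(shape = 1.5) quantiles.
# Seed and sample size are arbitrary choices for the illustration.
set.seed(1)
u <- runif(1e5)

x_gamma <- qgamma(u, shape = 1.5)   # rate = 1 by default
x_chisq <- qchisq(u, df = 3)        # chi-square(3) = Gamma(shape 1.5, scale 2)

# Compare up to numerical tolerance
same_quantiles <- isTRUE(all.equal(x_chisq, 2 * x_gamma))
print(same_quantiles)
```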
Filed under: Books, R, Statistics Tagged: Charlie Geyer, execution time, pseudo-random generator, R, random simulation, standard quantile functions, system.time