[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In extreme value theory, we assume that the 90% quantile of the initial distribution can be obtained easily, e.g. as the empirical quantile. Then, for the observations exceeding it, we fit a Pareto distribution (a Generalized Pareto one, to be precise) and use it to get a parametric estimate of the 95% quantile. I.e., a quantile of the overall distribution is derived from a model fitted only on the exceedances.
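A minimal sketch of that two-step procedure, in base R only. The simulated Student-t sample, the helper name `gpd_nll` and the use of `optim()` for the maximum-likelihood fit are assumptions made for illustration; they are not taken from the post.

```r
# Peaks-over-threshold sketch (illustrative, base R only)
set.seed(1)
x <- rt(5000, df = 4)                    # heavy-tailed sample (assumption)

u <- quantile(x, 0.90, names = FALSE)    # step 1: empirical 90% quantile
excess <- x[x > u] - u                   # exceedances above the threshold

# Negative log-likelihood of the Generalized Pareto distribution,
# parametrised by log(sigma) and xi (hypothetical helper)
gpd_nll <- function(par, y) {
  sigma <- exp(par[1]); xi <- par[2]
  if (abs(xi) < 1e-8) return(length(y) * log(sigma) + sum(y) / sigma)
  z <- 1 + xi * y / sigma
  if (any(z <= 0)) return(1e10)          # outside the support
  length(y) * log(sigma) + (1 / xi + 1) * sum(log(z))
}

# step 2: fit the GPD on the exceedances only
fit <- optim(c(log(mean(excess)), 0.1), gpd_nll, y = excess)
sigma_hat <- exp(fit$par[1]); xi_hat <- fit$par[2]

# 95% quantile of the full distribution: the level is rescaled by the
# share of observations above u, (1 - 0.95) / (1 - 0.90) = 0.5
p <- 0.95
q95_pot <- u + sigma_hat / xi_hat * (((1 - p) / (1 - 0.90))^(-xi_hat) - 1)
q95_emp <- quantile(x, 0.95, names = FALSE)
c(pot = q95_pot, empirical = q95_emp)
```

The point to notice is the rescaling of the level: the 95% quantile of the whole sample is the 50% quantile of the exceedances above the 90% threshold.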
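If I want to get the 5% quantile regression, once from the full sample and once from a sample where the central 50% of the observations has been removed (so that the level has to be doubled, to 10%), the code is simply,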
    library(mnormt)
    library(quantreg)
    library(splines)
    # simulate 2500 pairs from a bivariate Gaussian with correlation r
    set.seed(1)
    mu=c(0,0)
    r=0
    Sigma <- matrix(c(1,r,r,1), 2, 2)
    Z=rmnorm(2500,mu,Sigma)
    X=Z[,1]
    Y=Z[,2]
    base=data.frame(X,Y)
    plot(X,Y,col="blue",cex=.7)
    # I flags the central 50% of Y; baseI keeps only the two tails
    I=(Y>qnorm(.25))&(Y<qnorm(.75))
    baseI=base[I==FALSE,]
    points(X[I],Y[I],col="light blue",cex=.7)
    abline(h=qnorm(.25),lty=2,col="blue")
    abline(h=qnorm(.75),lty=2,col="blue")
    u=seq(-5,5,by=.02)
    # 5% quantile regression on the full sample (dashed line)
    reg=rq(Y~X,data=base,tau=.05)
    lines(u,predict(reg,newdata=data.frame(X=u)),lty=2)
    # 10% quantile regression on the truncated sample (solid line)
    reg=rq(Y~X,data=baseI,tau=.05*2)
    lines(u,predict(reg,newdata=data.frame(X=u)))

The graph is the following
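The factor 2 in `tau=.05*2` can be checked numerically without any regression. The sketch below is an illustration under its own assumptions (a univariate Gaussian sample and the variable names `keep`, `Y_trunc`, `q_full`, `q_trunc`): once the central half of the sample is dropped, the lowest 5% of the original observations make up about 10% of what remains.

```r
set.seed(1)
Y <- rnorm(2500)

# drop the central 50% of the distribution, as in the construction of baseI
keep <- !((Y > qnorm(.25)) & (Y < qnorm(.75)))
Y_trunc <- Y[keep]

q_full  <- quantile(Y, 0.05, names = FALSE)            # 5% level, full sample
q_trunc <- quantile(Y_trunc, 0.05 * 2, names = FALSE)  # 10% level, truncated sample
c(full = q_full, truncated = q_trunc, theoretical = qnorm(0.05))
```

By the same argument, the 95% quantile of the full sample would correspond to the level `1 - 0.05*2`, i.e. 90%, on the truncated sample.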
But what if observations
But why could that be interesting? Well, because I wanted to run a quantile regression on marathon results. But I could not get the whole dataset (I had to import observations manually, and I have to admit that it was a bit boring). So I extracted the finish times of the first 10% of the athletes, and of the last 10%. And I was wondering if that was enough to look at the 5% and 95% quantiles, as a function of the age of the runner… To be continued.
To leave a comment for the author, please follow the link and comment on their blog: Freakonometrics - Tag - R-english.