Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Tomorrow, we will discuss Fisher-Tippett theorem. The idea is that there are only three possible limiting distributions for normalized versions of the maxima of i.i.d. samples
i.e. the limiting distribution of the maximum is Weibull’s.
set.seed(1) s=1000000 n=100 M=matrix(runif(s),n,s/n) V=apply(M,2,max) bn=1 an=1/n U=(V-bn)/an hist(U,probability=TRUE,,col="light green", xlim=c(-7,1),main="",breaks=seq(-20,10,by=.25)) u=seq(-10,0,by=.1) v=exp(u) lines(u,v,lwd=3,col="red")
which means that the limiting distribution is Fréchet’s.
set.seed(1) s=1000000 n=100 M=matrix((runif(s))^(-1/2),n,s/n) V=apply(M,2,max) bn=0 an=n^(1/2) U=(V-bn)/an hist(U,probability=TRUE,col="light green", xlim=c(0,7),main="",breaks=seq(0,max(U)+1,by=.25)) u=seq(0,10,by=.1) v=dfrechet(u,shape=2) lines(u,v,lwd=3,col="red")
i.e. the limiting distribution for the maximum is Gumbel’s distribution.
library(evd) set.seed(1) s=1000000 n=100 M=matrix(rexp(s,1),n,s/n) V=apply(M,2,max) (bn=qexp(1-1/n)) log(n) an=1 U=(V-bn)/an hist(U,probability=TRUE,col="light green", xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25)) u=seq(-5,15,by=.1) v=dgumbel(u) lines(u,v,lwd=3,col="red")
Consider now a Gaussian
as
as
and it is only slightly better with 1,000 observations,
set.seed(1) s=10000000 n=1000 M=matrix(rnorm(s,0,1),n,s/n) V=apply(M,2,max) (bn=qnorm(1-1/n,0,1)) an=1/bn U=(V-bn)/an hist(U,probability=TRUE,col="light green", xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,15,by=.25)) u=seq(-5,15,by=.1) v=dgumbel(u) lines(u,v,lwd=3,col="red")
Even worst, consider lognormal observations. In that case, recall that if we consider (increasing) transformation of variates, we are in the same domain of attraction. Hence, since
then
i.e. using Taylor’s approximation on the right term,
This gives us normalizing coefficients we should use here.
set.seed(1) s=10000000 n=1000 M=matrix(rlnorm(s,0,1),n,s/n) V=apply(M,2,max) bn=exp(qnorm(1-1/n,0,1)) an=exp(qnorm(1-1/n,0,1))/(qnorm(1-1/n,0,1)) U=(V-bn)/an hist(U,probability=TRUE,col="light green", xlim=c(-2,7),ylim=c(0,.39),main="",breaks=seq(-5,40,by=.25)) u=seq(-5,15,by=.1) v=dgumbel(u) lines(u,v,lwd=3,col="red")
Credit: illustration is from Maurice Sendak’s popular book where the wild things are, translated in French as Max et les Maximonstres.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.