variance of an exponential order statistics

[This article was first published on R – Xi'an's Og, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

This afternoon, one of my Monte Carlo students at ENSAE came to me with an exercise from Monte Carlo Statistical Methods that I did not remember having written. And I thus “charged” George Casella with authorship for that exercise!

Exercise 3.3 starts with the usual question (a) about the (Binomial) precision of a tail probability estimator, which is easy to answer by iterating simulation batches. Expressed via the empirical cdf, it is concerned with the vertical variability of this empirical cdf. The second part (b) is more unusual in that the first part is again an evaluation of a tail probability, but then it switches to find the .995 quantile by simulation and produce a precise enough [to three digits] estimate. Which amounts to assess the horizontal variability of this empirical cdf.

As we discussed about this question, my first suggestion was to aim at a value of N, number of Monte Carlo simulations, such that the .995 x N-th spacing had a length of less than one thousandth of the .995 x N-th order statistic. In the case of the Exponential distribution suggested in the exercise, generating order statistics is straightforward, since, as suggested by Devroye, see Section V.3.3, the i-th spacing is an Exponential variate with rate (N-i+1). This is so fast that Devroye suggests simulating Uniform order statistics by inverting Exponential order statistics (p.220)!

However, while still discussing the problem with my student, I came to a better expression of the question, which was to figure out the variance of the .995 x N-th order statistic in the Exponential case. Working with the density of this order statistic however led nowhere useful. A bit later, after Google-ing the problem, I came upon this Stack Exchange solution that made use of the spacing result mentioned above, namely that the expectation and variance of the k-th order statistic are

\mathbb{E}[X_{(k)}]=\sum\limits_{i=N-k+1}^N\frac1i,\qquad \mbox{Var}(X_{(k)})=\sum\limits_{i=N-k+1}^N\frac1{i^2}

which leads to the proper condition on N when imposing the variability constraint.


Filed under: Books, Kids, pictures, R, Statistics, University life Tagged: climate simulation, ecdf, empirical cdf, ENSAE, George Casella, Luc Devroye, Malakoff, Master program, Monte Carlo Statistical Methods, order statistics, spacings

To leave a comment for the author, please follow the link and comment on their blog: R – Xi'an's Og.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)