A simple proof that the p-value distribution is uniform when the null hypothesis is true
[This article was first published on Shravan Vasishth's Slog (Statistics blog), and kindly contributed to R-bloggers].
[Scroll to graphic below if math doesn’t render for you]
Thanks to Mark Andrews for correcting some crucial typos (I hope I got it right this time!).
Thanks also to Andrew Gelman for pointing out that the proof below holds only when the null hypothesis is a point null $H_0: \mu = 0$, and the dependent measure is continuous, such as reading time in milliseconds, or EEG responses.
Someone asked this question in my linear modeling class: why is it that the p-value has a uniform distribution when the null hypothesis is true? The proof is remarkably simple (and is called the probability integral transform).
First, notice that when a random variable $Z$ comes from a $Uniform(0,1)$ distribution, the probability that $Z$ is less than or equal to some value $z$ in $[0,1]$ is exactly $z$: $P(Z\leq z)=z$.
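As a quick numerical check of this property, here is a minimal Python sketch using only the standard library. The helper name `empirical_cdf_at`, the seed, and the number of draws are illustrative choices, not part of the original argument.

```python
import random

def empirical_cdf_at(samples, z):
    """Return the fraction of samples that are less than or equal to z."""
    return sum(s <= z for s in samples) / len(samples)

rng = random.Random(1)  # fixed seed, chosen arbitrarily for reproducibility
draws = [rng.random() for _ in range(100_000)]  # draws from Uniform(0,1)

# For each z, the empirical P(Z <= z) should be close to z itself.
for z in (0.05, 0.25, 0.5, 0.9):
    print(z, round(empirical_cdf_at(draws, z), 3))
```

Each printed proportion should land near the corresponding $z$, up to simulation noise.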
Next, we prove the following proposition:
Proposition:
If $T$ is a random variable with a continuous, strictly increasing CDF $F$ (so that the inverse $F^{-1}$ exists), and $Z=F(T)$, then $Z \sim Uniform(0,1)$.
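Note here that the p-value is a random variable, call it $Z$. The p-value is computed by calculating the probability, under the null hypothesis, of seeing the observed t-statistic or something more extreme. The t-statistic comes from a random variable $T$ that is a transformation of the random variable $\bar{X}$: $T=(\bar{X}-\mu)/(\hat\sigma/\sqrt{n})$, where $\hat\sigma$ is the sample standard deviation. This random variable $T$ has a CDF $F$, so a lower-tailed p-value is $Z=F(T)$. (An upper-tailed p-value is $1-F(T)$, and $1-Z$ is uniform whenever $Z$ is.)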
So, if we can prove the above proposition, we have shown that the p-value’s distribution under the null hypothesis is $Uniform(0,1)$.
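The claim can also be checked by simulation before proving it. The sketch below is a simplified illustration, not the setup from the class: it assumes $\sigma$ is known, so the test statistic is standard normal under the null (a z-test rather than a t-test), which keeps the code within Python's standard library. The function name `lower_tail_p_value`, the sample size, and the number of replications are hypothetical choices.

```python
import random
from statistics import NormalDist, fmean

rng = random.Random(2)      # fixed seed, chosen arbitrarily
std_normal = NormalDist()   # assumed null distribution of the statistic

def lower_tail_p_value(n, mu0=0.0, sigma=1.0):
    """Simulate one sample of size n under H0: mu = mu0 and return F(T).

    Assumption: sigma is known, so T is standard normal under the null.
    """
    sample = [rng.gauss(mu0, sigma) for _ in range(n)]
    t = (fmean(sample) - mu0) / (sigma / n ** 0.5)
    return std_normal.cdf(t)

# Repeat the experiment many times with the null hypothesis true.
p_values = [lower_tail_p_value(n=20) for _ in range(20_000)]

# If the p-values are Uniform(0,1), the share below each cutoff
# should be close to the cutoff itself.
for cutoff in (0.01, 0.05, 0.5):
    share = sum(p <= cutoff for p in p_values) / len(p_values)
    print(cutoff, round(share, 3))
```

In particular, roughly 5% of the simulated p-values should fall below 0.05, which is exactly the Type I error rate one expects when the null is true.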
Proof:
Let $Z=F(T)$.
$P(Z\leq z) = P(F(T)\leq z) = P(F^{-1}(F(T)) \leq F^{-1}(z)) = P(T \leq F^{-1}(z)) = F(F^{-1}(z)) = z$.
Since $P(Z\leq z)=z$ for every $z$ in $(0,1)$, $Z$ has the CDF of a $Uniform(0,1)$ random variable; that is, $Z \sim Uniform(0,1)$.
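Nothing in the proof depends on $T$ being a t-statistic; it only uses the CDF $F$ and its inverse. To illustrate, here is a small sketch that applies the same steps to an exponential distribution, chosen only because its CDF and inverse have simple closed forms. The rate value and the names `F` and `F_inv` are assumptions made for this example.

```python
import math
import random

rng = random.Random(3)  # fixed seed, chosen arbitrarily
rate = 2.0              # hypothetical rate parameter of the exponential

def F(t):
    """CDF of the exponential distribution with the rate above."""
    return 1.0 - math.exp(-rate * t)

def F_inv(z):
    """Inverse CDF (quantile function) of the same distribution."""
    return -math.log(1.0 - z) / rate

t_draws = [rng.expovariate(rate) for _ in range(100_000)]  # draws of T
z_draws = [F(t) for t in t_draws]                          # Z = F(T)

for z in (0.1, 0.5, 0.9):
    # Middle step of the proof: the event F(T) <= z is the event T <= F^{-1}(z).
    lhs = sum(zd <= z for zd in z_draws) / len(z_draws)
    rhs = sum(td <= F_inv(z) for td in t_draws) / len(t_draws)
    print(z, round(lhs, 3), round(rhs, 3))
```

Both proportions should be close to $z$, mirroring the chain of equalities in the proof.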
A screengrab in case the above doesn’t render: