Parametric Inference: Karlin-Rubin Theorem

[This article was first published on Analysis with Programming, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

A family of pdfs or pmfs $\{g(t|\theta):\theta\in\Theta\}$ for a univariate random variable $T$ with real-valued parameter $\theta$ has a monotone likelihood ratio (MLR) if, for every $\theta_2>\theta_1$, $g(t|\theta_2)/g(t|\theta_1)$ is a monotone (nonincreasing or nondecreasing) function of $t$ on $\{t:g(t|\theta_1)>0\;\text{or}\;g(t|\theta_2)>0\}$. Note that $c/0$ is defined as $\infty$ if $0< c$.
Consider testing $H_0:\theta\leq \theta_0$ versus $H_1:\theta>\theta_0$. Suppose that $T$ is a sufficient statistic for $\theta$ and the family of pdfs or pmfs $\{g(t|\theta):\theta\in\Theta\}$ of $T$ has an MLR. Then for any $t_0$, the test that rejects $H_0$ if and only if $T >t_0$ is a UMP level $\alpha$ test, where $\alpha=P_{\theta_0}(T >t_0)$.
Example 1
To better understand the theorem, consider a single observation, $X$, from $\mathrm{n}(\theta,1)$, and test the following hypotheses: $$ H_0:\theta\leq \theta_0\quad\mathrm{versus}\quad H_1:\theta>\theta_0. $$ Then $\theta_1>\theta_0$, and the likelihood ratio test statistics would be $$ \lambda(x)=\frac{f(x|\theta_1)}{f(x|\theta_0)}. $$ And we say that the null hypothesis is rejected if $\lambda(x)>k$. To see if the distribution of the sample has MLR property, we simplify the above equation as follows: $$ \begin{aligned} \lambda(x)&=\frac{\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(x-\theta_1)^2}{2}\right]}{\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(x-\theta_0)^2}{2}\right]}\\ &=\exp \left[-\frac{x^2-2x\theta_1+\theta_1^2}{2}+\frac{x^2-2x\theta_0+\theta_0^2}{2}\right]\\ &=\exp\left[\frac{2x\theta_1-\theta_1^2-2x\theta_0+\theta_0^2}{2}\right]\\ &=\exp\left[\frac{2x(\theta_1-\theta_0)-(\theta_1^2-\theta_0^2)}{2}\right]\\ &=\exp\left[x(\theta_1-\theta_0)\right]\times\exp\left[-\frac{\theta_1^2-\theta_0^2}{2}\right] \end{aligned} $$ which is increasing as a function of $x$, since $\theta_1>\theta_0$.
Figure 1. Normal Densities with $\mu=1,2$.
library(magrittr)
library(lattice)
x <- seq(-3, 6, length.out = 100)
{ dnorm(x, 2, 1) ~ x } %>% xyplot(
col = 'black', type = 'l', lwd = 2,
ylab = 'Density',
panel = function (x, y, ...) {
panel.grid(h = -1, v = -1)
panel.xyplot(x, y, ...)
panel.curve(dnorm(x, 1, 1), -3, 6, lty = 2, lwd = 2)
panel.arrows(3.9, dnorm(3.8, 2, 1), 4.5, .15, angle = 20)
panel.text(4.8, .17, labels = expression(paste('Model under ', H[1])))
panel.arrows(-1.1, dnorm(-1, 1, 1), -2, .15, angle = 20)
panel.text(-2, .17, labels = expression(paste('Model under ', H[0])))
}, key = list(corner = c(.9, .9),
text = list(c(expression(paste('f(x', '|', theta[1], ')')),
expression(paste('f(x', '|', theta[0], ')')))),
lines = list(lty = c(1, 2), lwd = 2)
)
)
view raw karlin1.R hosted with ❤ by GitHub
By illustration, consider Figure 1. The plot of the likelihood ratio of these models is monotone increasing as seen in Figure 2, where rejecting $H_0$ if $\lambda(x)>k$ is equivalent to rejecting it if $T\geq t_0$.
Figure 2. Likelihood Ratio of the Normal Densities.
x <- seq(-3, 3, length.out = 100)
{ dnorm(x, 2, 1) / dnorm(x, 1, 1) ~ x } %>%
xyplot(
col = 'black', type = 'l', lwd = 2, ylim = c(-.6, 4.8),
ylab = expression(paste('f(x', '|', theta[1], ')', ' / ', 'f(x', '|', theta[0], ')')),
panel = function (x, y, ...) {
panel.grid(h = -1, v = -1)
panel.xyplot(x, y, ...)
panel.abline(h = 0, lty = 2)
panel.segments(0, dnorm(2, 2, 1) / dnorm(2, 1, 1),
2, dnorm(2, 2, 1) / dnorm(2, 1, 1), lty = 2)
panel.text(0.5, dnorm(2.1, 2, 1) / dnorm(2.1, 1, 1), labels = c('Assumed k'))
from.z <- 2
to.z <- 3
S.x <- c(from.z, seq(from.z, to.z, 0.01), to.z)
S.y <- c(0, dnorm(seq(from.z, to.z, 0.01), 2, 1) / dnorm(seq(from.z, to.z, 0.01), 1, 1), 0)
panel.polygon(S.x, S.y, col = 'gray', border = 'white')
panel.segments(2, dnorm(2, 2, 1) / dnorm(2, 1, 1), 2, 0)
panel.segments(2, 0, 3, 0)
panel.segments(3, dnorm(3, 2, 1) / dnorm(3, 1, 1), 3, 0)
panel.curve(dnorm(x, 2, 1) / dnorm(x, 1, 1), 2, 3, lty = 2, lwd = 2)
panel.text(2.5, 1, labels = expression(T >= t[0]))
panel.text(2, -.2, labels = expression(t[0]))
panel.arrows(2.45, dnorm(2.5, 2, 1) / dnorm(2.5, 1, 1), 1.7, 2.9, angle = 20)
panel.text(1.3, 3, labels = expression(lambda(x) > k))
}
)
view raw karlin2.R hosted with ❤ by GitHub
And by factorization theorem the likelihood ratio test statistic can be written as a function of the sufficient statistics since the term, $h(x)$ will be cancelled out. That is, $$ \lambda(t)=\frac{g(t|\theta_1)}{g(t|\theta_0)}. $$ And by Karlin-Rubin theorem, the rejection region $R=\{t:t>t_0\}$ is a uniformly most powerful level-$\alpha$ test. Where $t_0$ satisfies the following: $$ \begin{aligned} \mathrm{P}(T>t_0|\theta_0)&=\mathrm{P}(T\in R|\theta_0)\\ \alpha&=1-\mathrm{P}(X\leq t_0|\theta_0)\\ 1-\alpha&=\int_{-\infty}^{t_0}\frac{1}{\sqrt{2\pi}}\exp\left[-\frac{(x-\theta_0)^2}{2}\right]\operatorname{d}x \end{aligned} $$ Hence the quantile of the $1-\alpha$ probability, which is $z_{\alpha}$ is equal to $t_0$, that is $z_{\alpha}=t_0$, and thus we reject $H_0$ if $T>z_{\alpha}$.

Example 2
Now consider testing the hypotheses, $H_0:\theta\geq \theta_0$ versus $H_1:\theta< \theta_0$ using the sample $X$ (single observation) from Beta($\theta$, 2), and to be more specific let $\theta_0=4$ and $\theta_1=3$. Can we apply Karlin-Rubin? Of course! Visually, we have something like in Figure 3.
Figure 3. Beta Densities Under Different Parameters.
x <- seq(0, 1, length.out = 100)
{ dbeta(x, 3, 2) ~ x } %>% xyplot(
col = 'black', type = 'l', lwd = 2,
ylab = 'Density', ylim = c(-0.2, 2.2),
panel = function (x, y, ...) {
panel.grid(h = -1, v = -1)
panel.xyplot(x, y, ...)
panel.curve(dbeta(x, 4, 2), 0, 1, lty = 2, lwd = 2)
panel.arrows(.24, dbeta(.25, 3, 2), .18, .9, angle = 20)
panel.text(.14, 1, labels = expression(paste('Model under ', H[1])))
panel.arrows(.39, dbeta(.38, 4, 2), .52, .55, angle = 20)
panel.text(.65, .51, labels = expression(paste('Model under ', H[0])))
}, key = list(corner = c(.1, .9),
text = list(c(expression(paste('f(x', '|', theta[1]==3, ', ', 2, ')')),
expression(paste('f(x', '|', theta[0]==4, ', ', 2, ')')))),
lines = list(lty = c(1, 2), lwd = 2)
)
)
view raw karlin3.R hosted with ❤ by GitHub
Note that for this test, $\theta_1<\theta_0$, and so the likelihood ratio test statistics is simplified as follows: $$ \begin{aligned} \lambda(x)&=\frac{f(x|\theta_1=3, 2)}{f(x|\theta_0=4, 2)}=\frac{\displaystyle\frac{\Gamma(\theta_1+2)}{\Gamma(\theta_1)\Gamma(2)}x^{\theta_1-1}(1-x)^{2-1}}{\displaystyle\frac{\Gamma(\theta_0+2)}{\Gamma(\theta_0)\Gamma(2)}x^{\theta_0-1}(1-x)^{2-1}}\\ &=\frac{\displaystyle\frac{\Gamma(5)}{\Gamma(3)\Gamma(2)}x^{2}(1-x)}{\displaystyle\frac{\Gamma(6)}{\Gamma(4)\Gamma(2)}x^{3}(1-x)}=\frac{\displaystyle\frac{12\Gamma(3)}{\Gamma(3)\Gamma(2)}x^{2}(1-x)}{\displaystyle\frac{20\Gamma(4)}{\Gamma(4)\Gamma(2)}x^{3}(1-x)}\\ &=\frac{3}{5x}, \end{aligned} $$ which is decreasing as a function of $x$, see the plot of this in Figure 4. And we say that $H_0$ is rejected if $\lambda(x) > k$ if and only if $T < t_0$. Where $t_0$ satisfies the following equations: $$ \begin{aligned} \mathrm{P}(T < t_0|\theta_0)&=\mathrm{P}(X < t_0|\theta_0)\\ \alpha&=\int_{0}^{t_0}\frac{\Gamma(\theta_0+2)}{\Gamma(\theta_0)\Gamma(2)}x^{\theta_0-1}(1-x)^{2-1}\operatorname{d}x\\ \alpha&=\int_{0}^{t_0}\frac{\Gamma(6)}{\Gamma(4)\Gamma(2)}x^{3}(1-x)\operatorname{d}x. \end{aligned} $$ Hence the quantile of the $\alpha$ probability, $x_{\alpha}=t_0$. And thus we reject $H_0$ if $T < x_{\alpha}$.
Figure 4. Likelihood Ratio of the Beta Densities.
x <- seq(0, 1, length.out = 100)
{ dbeta(x, 3, 2) / dbeta(x, 4, 2) ~ x } %>% xyplot(
col = 'black', type = 'l', lwd = 2,
ylab = expression(paste('f(x', '|', theta[1], ')', ' / ', 'f(x', '|', theta[0], ')')),
ylim = c(-0.2, 5.15),
panel = function (x, y, ...) {
panel.grid(h = -1, v = -1)
panel.xyplot(x, y, ...)
panel.abline(h = 0, lty = 2)
panel.segments(.3, dbeta(.3, 3, 2) / dbeta(.3, 4, 2), .6, dbeta(.3, 3, 2) / dbeta(.3, 4, 2), lty = 2)
panel.text(.5, 2.2, labels = 'Assumed k')
from.z <- 0.01
to.z <- .3
S.x <- c(from.z, seq(from.z, to.z, 0.01), to.z)
S.y <- c(0, dbeta(seq(from.z, to.z, 0.01), 3, 2) / dbeta(seq(from.z, to.z, 0.01), 4, 2), 0)
panel.polygon(S.x, S.y, col = 'gray', border = 'white')
panel.segments(.3, dbeta(.3, 3, 2) / dbeta(.3, 4, 2), .3, 0)
panel.segments(0, 0, 0, 0)
panel.segments(0, 6, 0, 0)
panel.polygon(c(0, seq(0, .02, .01), .02), c(6, rep(6, each = length(seq(0, .02, .01))), 6))
panel.curve(dbeta(x, 3, 2) / dbeta(x, 4, 2), 0, .3, lty = 2, lwd = 2)
panel.arrows(.21, dbeta(.2, 3, 2) / dbeta(.2, 4, 2), .35, 3.3, angle = 20)
panel.text(.42, 3.3, labels = expression(lambda(x) > k))
panel.text(.15, 1.5, labels = expression(T < t[0]))
}
)
view raw karlin4.R hosted with ❤ by GitHub

Reference

  1. Casella, G. and Berger, R.L. (2001). Statistical Inference. Thomson Learning, Inc.

To leave a comment for the author, please follow the link and comment on their blog: Analysis with Programming.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)