Confidence interval for predictions with GLMs
[This article was first published on Freakonometrics - Tag - R-english, and kindly contributed to R-bloggers].
Consider a (simple) Poisson regression $Y_i\sim\mathcal{P}(\mu_i)$, with $\log\mu_i=\beta_0+\beta_1 x_i$. Given a sample $(x_i,Y_i)$ where $Y_i\sim\mathcal{P}(\mu_i)$, the goal is to derive a 95% confidence interval for $\mu(x)=\exp(\beta_0+\beta_1 x)$ given $x$, where $\widehat{\mu}(x)=\exp(\widehat{\beta}_0+\widehat{\beta}_1 x)$ is the prediction. Hence, we want a confidence interval for the prediction, not for a potential observation, i.e. for the dot on the graph below.
> r=glm(dist~speed,data=cars,family=poisson)
> P=predict(r,type="response",
+ newdata=data.frame(speed=seq(-1,35,by=.2)))
> plot(cars,xlim=c(0,31),ylim=c(0,170))
> abline(v=30,lty=2)
> lines(seq(-1,35,by=.2),P,lwd=2,col="red")
> P0=predict(r,type="response",se.fit=TRUE,
+ newdata=data.frame(speed=30))
> points(30,P0$fit,pch=4,lwd=3)

i.e. the point $\widehat{\mu}(30)=\exp(\widehat{\beta}_0+30\,\widehat{\beta}_1)$.
Let $\widehat{\beta}=(\widehat{\beta}_0,\widehat{\beta}_1)$ denote the maximum likelihood estimator of $\beta$. Then, from standard maximum likelihood theory,
$$\widehat{\beta}\approx\mathcal{N}\left(\beta,I(\beta)^{-1}\right)$$
where $I(\beta)$ is the Fisher information of $\beta$. Recall that
$$I(\beta)=-\mathbb{E}\left[\frac{\partial^2\log\mathcal{L}(\beta)}{\partial\beta\,\partial\beta^{\top}}\right]$$
where the computation of those values is based on the log-likelihood
$$\log\mathcal{L}(\beta)=\sum_{i=1}^n\left[Y_i(\beta_0+\beta_1x_i)-e^{\beta_0+\beta_1x_i}-\log(Y_i!)\right]$$
In the case of the log-Poisson regression, with $\mu_i=e^{\beta_0+\beta_1x_i}$,
$$I(\beta)=\begin{pmatrix}\sum_i\mu_i&\sum_ix_i\mu_i\\\sum_ix_i\mu_i&\sum_ix_i^2\mu_i\end{pmatrix}$$
Let us get back to our initial problem.
- confidence interval for the linear combination $\beta_0+\beta_1x$
Then, since $\widehat{\beta}$ has an asymptotic multivariate normal distribution, any linear combination of the parameters will also be asymptotically normal, i.e. $\widehat{\beta}_0+\widehat{\beta}_1x$ has a normal distribution, centered on $\beta_0+\beta_1x$, with variance $V_{1,1}+2xV_{1,2}+x^2V_{2,2}$, where $V=I(\widehat{\beta})^{-1}$ is the variance matrix of $\widehat{\beta}$. All those quantities can easily be computed. First, we can get the variance matrix of the estimators,
> i1=sum(predict(r,type="response"))
> i2=sum(cars$speed*predict(r,type="response"))
> i3=sum(cars$speed^2*predict(r,type="response"))
> I=matrix(c(i1,i2,i2,i3),2,2)
> V=solve(I)

Hence, if we compare with the output of the regression,
> summary(r)$cov.unscaled
              (Intercept)         speed
(Intercept)  0.0066870446 -3.474479e-04
speed       -0.0003474479  1.940302e-05
> V
              [,1]          [,2]
[1,]  0.0066871228 -3.474515e-04
[2,] -0.0003474515  1.940318e-05

Based on those values, it is easy to derive the standard deviation of the linear combination,
> x=30
> P2=predict(r,type="link",se.fit=TRUE,
+ newdata=data.frame(speed=x))
> P2
$fit
       1
5.046034

$se.fit
[1] 0.05747075

$residual.scale
[1] 1

> sqrt(V[1,1]+2*x*V[2,1]+x^2*V[2,2])
[1] 0.05747084
> sqrt(t(c(1,x))%*%V%*%c(1,x))
           [,1]
[1,] 0.05747084

Once we have the standard deviation, and normality (at least asymptotically), confidence intervals can be derived; then, taking the exponential of the bounds, we get a confidence interval for the prediction,
> segments(30,exp(P2$fit-1.96*P2$se.fit),
+ 30,exp(P2$fit+1.96*P2$se.fit),col="blue",lwd=3)

Based on that technique, the confidence interval is not centered on the prediction, but who cares?
- delta method
> estmean=t(c(1,x))%*%coef(r)
> var=t(c(1,x))%*%summary(r)$cov.unscaled%*%c(1,x)
> library(msm)
> deltamethod(~ exp(x1), estmean, var)
[1] 8.931232
> P1=predict(r,type="response",se.fit=TRUE,
+ newdata=data.frame(speed=30))
> P1
$fit
       1
155.4048

$se.fit
       1
8.931232

$residual.scale
[1] 1

The delta method gives us (asymptotic) normality, so once we have a standard deviation, we get the confidence interval.
> segments(30,P1$fit-1.96*P1$se.fit,30,
+ P1$fit+1.96*P1$se.fit,col="blue",lwd=3)
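For reference, the delta method behind that standard error linearizes the exponential around the estimated linear predictor (a standard result, not spelled out in the post):

```latex
\mathrm{Var}\big(e^{\widehat{\eta}}\big)\approx\big(e^{\widehat{\eta}}\big)^2\,
\mathrm{Var}\big(\widehat{\eta}\big),
\qquad \widehat{\eta}=\widehat{\beta}_0+\widehat{\beta}_1 x
```

Numerically, $e^{5.046034}\times 0.05747075\approx 155.40\times 0.0575\approx 8.93$, which matches the `se.fit` returned on the response scale.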
Note that those quantities, obtained with two different approaches, are rather close here,
> exp(P2$fit-1.96*P2$se.fit)
       1
138.8495
> P1$fit-1.96*P1$se.fit
       1
137.8996
> exp(P2$fit+1.96*P2$se.fit)
       1
173.9341
> P1$fit+1.96*P1$se.fit
       1
172.9101
- bootstrap techniques
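The post stops short of showing bootstrap code. A minimal sketch, assuming we resample observations from the `cars` dataset with replacement, refit the Poisson regression on each replicate, and read off percentile bounds of the prediction at speed 30 (an illustration, not the author's code), could look like this:

```r
# Bootstrap sketch (illustrative, not from the original post):
# resample (speed, dist) pairs, refit the Poisson regression,
# and collect the prediction at speed = 30 for each replicate.
set.seed(1)
n <- nrow(cars)
pred <- rep(NA, 1000)
for (b in 1:1000) {
  idx <- sample(1:n, n, replace = TRUE)
  regb <- glm(dist ~ speed, data = cars[idx, ], family = poisson)
  pred[b] <- predict(regb, type = "response",
                     newdata = data.frame(speed = 30))
}
# percentile confidence interval for the prediction
quantile(pred, c(.025, .975))
```

Unlike the two asymptotic approaches above, the percentile interval makes no normality assumption, at the cost of refitting the model a thousand times.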