[This article was first published on R-english – Freakonometrics, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Consider here the case where, in some parametric inference problem, parameter
> library(compositions)
> data(DiagnosticProb)
> Y=DiagnosticProb[,"type"]-1
> X=DiagnosticProb[,c("A","B","C")]
> model = glm(Y~ilr(X),family=binomial)
> b = ilrInv(coef(model)[-1],orig=X)
> as.numeric(b)
[1] 0.3447106 0.2374977 0.4177917
We can visualize that estimator on the simplex, using
> tripoint=function(s){
+ p=s/sum(s)
+ abc2xy(matrix(p,1,3))
+ }
> lab=LETTERS[1:3]
> xl=c(-.1,1.25)
> yl=c(-.1,1.15)
> library(trifield)
> A=abc2xy(matrix(c(1,0,0),1,3))
> B=abc2xy(matrix(c(0,1,0),1,3))
> C=abc2xy(matrix(c(0,0,1),1,3))
> plot(0:1,0:1,col="white",
+ xlim=xl,ylim=yl,xlab="",ylab="",axes=FALSE)
> polygon(rbind(A,B,C),col="light yellow")
> text(B[1],-.05,lab[2])
> text(A[1],1.05,lab[1])
> text(C[1],-.05,lab[3])
> segments((A[1]+C[1])/2,(A[2]+C[2])/2,B[1],B[2],col="grey",lty=2)
> segments((A[1]+B[1])/2,(A[2]+B[2])/2,C[1],C[2],col="grey",lty=2)
> segments((B[1]+C[1])/2,(B[2]+C[2])/2,A[1],A[2],col="grey",lty=2)
> points(tripoint(b),pch=19,cex=2,col="red")
If we want to compute a ‘confidence region’, we can either use Bayesian models (with a Dirichlet distribution as prior distribution), or use bootstrap. We will use here the second idea
> MB=matrix(NA,1e4,2)
> for(sim in 1:1e4){
+ idx=sample(1:nrow(DiagnosticProb),
+ size=nrow(DiagnosticProb),replace=TRUE)
+ Y=DiagnosticProb[idx,"type"]-1
+ X=DiagnosticProb[idx,c("A","B","C")]
+ model = glm(Y~ilr(X),family=binomial)
+ MB[sim,]=tripoint(as.numeric(
+ ilrInv(coef(model)[-1],orig=X)))}
To get some ‘confidence region’, we can then use the bagplot, to get either a region where 50% of the boostraped estimators are, or 95%,
> library(aplpack) > P1=bagplot(MB[,1],MB[,2], factor =1.96, cex=.9, + dkmethod=2,show.baghull=TRUE) > P2=bagplot(MB[,1],MB[,2], factor =0.67, cex=.9, + dkmethod=2,show.baghull=TRUE)
Then we can easily plot those two regions,
> plot(0:1,0:1,col="white") > polygon(rbind(A,B,C),col="light yellow") > text(B[1],-.05,lab[2]) > text(A[1],1.05,lab[1]) > text(C[1],-.05,lab[3]) > polygon(P1$hull.loop,col="yellow",border=NA) > polygon(P2$hull.loop,col="orange",border=NA) > segments((A[1]+C[1])/2,(A[2]+C[2])/2,B[1],B[2],col="grey",lty=2) > segments((A[1]+B[1])/2,(A[2]+B[2])/2,C[1],C[2],col="grey",lty=2) > segments((B[1]+C[1])/2,(B[2]+C[2])/2,A[1],A[2],col="grey",lty=2) > points(tripoint(b),pch=19,cex=2,col="red")
To leave a comment for the author, please follow the link and comment on their blog: R-english – Freakonometrics.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
