[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I am working through Bayesian Approaches to Clinical Trials and Health-Care Evaluation (David J. Spiegelhalter, Keith R. Abrams, Jonathan P. Myles) (referred to as BACTHCE from here on). In chapter three I saw an example (3.13) where I wanted to do the calculations myself. The example concerns a hierarchical model of death rates which is calculated via a normal approximation of odds ratios.Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Data
According to BACTHCE: ‘Epidemiology, animal models and biochemical studies suggested intravenous magnesium sulphate may have a protective effect after acute myocardial infarction (AMI), articularly through preventing serious arrhythmias‘. A small set of randomized trials was performed. We will analyze those data. In the data below MgD and ControlD refers to deaths for Mg treated and control patients, while MgN and ControlN refers to the number of patients enrolled in each group.r1 <- read.csv(text=”
Trial MgD MgN ControlD ControlN
Morton 1 40 2 36
Rasmussen 9 135 23 135
Smith 2 200 7 200
Abraham 1 48 1 46
Feldstedt 10 150 8 148
Shechter 1 59 9 56
Ceremuzynski 1 25 3 23
LIMIT-2 90 1159 118 1157
“,skip=1,sep=”)
Analysis
Odds rations
The first analysis will be to reproduce all the derived variables, such as log odds ratios. This will reproduce the essentials of table 3.8. All formulas are in BACTHCE. Equivalent N is the N if this were normal distributed data with SD=2.logoddrat <- function(
a1,b1=N1-a1,N1=a1+b1,
a2,b2=N2-a2,N2=a2+b2) {
log(((a1+.5)/(b1+.5))/
((a2+.5)/(b2+.5)))
}
sdlogoddrat <- function(
a1,b1=N1-a1,N1=a1+b1,
a2,b2=N2-a2,N2=a2+b2) {
sqrt(1/(a1+.5)+1/(b1+.5) +1
/(a2+.5)+1/(b2+.5))
}
r1$logoddsratio <- with(r1,
logoddrat(a1=MgD,N1=MgN,a2=ControlD,N2=ControlN))
r1$sdodds <- with(r1,
sdlogoddrat(a1=MgD,N1=MgN,a2=ControlD,N2=ControlN))
r1$equivN <- (2/r1$sdodds)^2
r1
Trial MgD MgN ControlD ControlN logoddsratio sdodds equivN
1 Morton 1 40 2 36 -0.64616697 1.0587581 3.568342
2 Rasmussen 9 135 23 135 -1.02299771 0.4057220 24.299805
3 Smith 2 200 7 200 -1.12412388 0.7372510 7.359177
4 Abraham 1 48 1 46 -0.04301739 1.1731854 2.906208
5 Feldstedt 10 150 8 148 0.21130909 0.4765711 17.611833
6 Shechter 1 59 9 56 -2.05412373 0.9000425 4.937805
7 Ceremuzynski 1 25 3 23 -1.02554609 1.0207731 3.838854
8 LIMIT-2 90 1159 118 1157 -0.29801453 0.1462380 187.042101
Odds ratios in plot
The plot shows the odds ratios as function of standard deviation of the hyper priors (tau). Mu is the overall mean. Note that I could not find formula for the overall mean in BACTHCE, and just assumed that the value for By in shrinkage was applied.
mut <- function(tau,r1) {
yy <- r1$logoddsratio
By <- r1$sdodds^2/(r1$sdodds^2+tau^2)
mus <- sum(yy*r1$equivN*By)/sum(r1$equivN*By)
yy <- r1$logoddsratio*(1-By)+mus*By
data.frame(Trial=c(as.character(r1$Trial),’Mu’),tau=tau,mu=c(yy,mus))
}
la <- lapply(seq(0,2,.01),function(x) mut(x,r1))
lac <- do.call(rbind,la)
lac$Trial <- factor(lac$Trial,levels=c(levels(r1$Trial),’Mu’))
ggplot(lac,aes(x=tau,y=mu,col=Trial)) +
geom_line() +
theme(legend.position=”bottom”) +
Beta binomial hierarchical model
I wanted to know how that would compare if I did that calculation as a beta-binomial, for which I used JAGS. Lor is the estimated log odds ratio, which with -0.7 is close to what is found in BACTHCE using a method of moments estimator for tau. It can be seen that tau in the neighborhood of 0 is possible. The theta parameter represents the probability of death in both groups. What I noticed here is that the control group has a slightly larger standard deviation in theta. Is that a reason for heterogeneity in tau which was mentioned in BACTHCE?
library(R2jags)
datain <- as.list(r1[,2:5])
datain$n <- nrow(r1)
model1 <- function() {
for (i in 1:2) {
kappa_log[i] ~ dpar(1,1.5)
kappa[i] <- exp(kappa_log[i])
theta[i] ~ dnorm(thetam[i],thetatau[i])
thetam[i] ~ dunif(0,1)
thetatau[i] <- pow(thetasd[i],-2)
thetasd[i] ~ dnorm(0,.3) %_% I(0,)
a[i] <- kappa[i] * theta[i]
b[i] <- kappa[i] * (1 – theta[i])
}
for (i in 1:n) {
MgD[i] ~ dbetabin(a[1],b[1],MgN[i])
ControlD[i] ~ dbetabin(a[2],b[2],ControlN[i])
}
lor <- log((theta[1]/(1-theta[1]))/
(theta[2]/(1-theta[2])))
}
parameters <- c(‘theta’,’lor’)
jj <- jags(model.file=model1,data=datain,parameters=parameters,DIC=FALSE)
jj
Inference for Bugs model at “C:/Users/Kees/AppData/Local/Temp/Rtmp6h4pKI/model1814317e3684.txt”, fit using jags,
3 chains, each with 2000 iterations (first 1000 discarded)
n.sims = 3000 iterations saved
mu.vect sd.vect 2.5% 25% 50% 75% 97.5% Rhat n.eff
lor -0.701 0.485 -1.637 -1.019 -0.706 -0.388 0.298 1.001 2500
theta[1] 0.057 0.021 0.029 0.043 0.053 0.065 0.116 1.003 1200
theta[2] 0.106 0.031 0.060 0.085 0.101 0.123 0.183 1.001 3000
For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).
Beta binomial in SAS
In clinical trials, SAS is often the tool of choice. Hence the beta binomial in PROC MCMC. The results are a bit different from JAGS, but then I had some difficulty creating the model in PROC MCMC and reruns gave somewhat varying results. I already did increase the number of samples, but there might be better tricks to get stable results.
data r1;
input Trial $ MgD MgN ControlD ControlN;
cards;
Morton 1 40 2 36
Rasmussen 9 135 23 135
Smith 2 200 7 200
Abraham 1 48 1 46
Feldstedt 10 150 8 148
Shechter 1 59 9 56
Ceremuzynski 1 25 3 23
LIMIT-2 90 1159 118 1157
;;;;
run;
proc mcmc data=r1 monitor=(theta1 theta2 lor)
ntu=2000 nbi=4000 nmc=8000 nthin=8;
parms (kappa_log1 kappa_log2) 1.8 (theta1 theta2) .1;
prior kappa_log1 ~ uniform(0,5);
prior kappa_log2 ~ uniform(0,5);
prior theta1 ~ uniform(0,1);
prior theta2 ~ uniform(0,1);
kappa1 = exp(kappa_log1);
kappa2 = exp(kappa_log2);
a1 = kappa1*theta1;
b1 = kappa1*(1-theta1);
a2 = kappa2*theta2;
b2 = kappa2*(1-theta2);
lor = log((a1/b1)/(a2/b2));
random Mgp ~beta(a1,b1) subject=trial;
random Controlp ~beta(a2,b2) subject=trial;
model MgD ~ binomial(MgN,Mgp);
model ControlD ~ binomial(ControlN,Controlp);
run;
Posterior Summaries and Intervals | |||||
---|---|---|---|---|---|
Parameter | N | Mean | Standard Deviation | 95% HPD Interval | |
theta1 | 1000 | 0.0519 | 0.0157 | 0.0244 | 0.0841 |
theta2 | 1000 | 0.0959 | 0.0224 | 0.0596 | 0.1404 |
lor | 1000 | -0.6790 | 0.3880 | -1.4862 | 0.0872 |
< section style="background-color: #e0e0e0; color: #002288; -family: Arial, Helvetica, sans-serif; -size: small; line-height: 16.0029983520508px; padding-top: 1px; text-align: center;">< article style="padding-top: 1px;">
Effective Sample Sizes | |||
---|---|---|---|
Parameter | ESS | Autocorrelation Time | Efficiency |
theta1 | 591.5 | 1.6907 | 0.5915 |
theta2 | 247.0 | 4.0483 | 0.2470 |
lor | 343.7 | 2.9097 | 0.3437 |
To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.