Hierarchical log odds model example

[This article was first published on Wiekvoet, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

I am working through Bayesian Approaches to Clinical Trials and Health-Care Evaluation (David J. Spiegelhalter, Keith R. Abrams, Jonathan P. Myles) (referred to as BACTHCE from here on). In chapter three I saw an example (3.13) where I wanted to do the calculations myself. The example concerns a hierarchical model of death rates which is calculated via a normal approximation of odds ratios.

Data 

According to BACTHCE: ‘Epidemiology, animal models and biochemical studies suggested intravenous magnesium sulphate may have a protective effect after acute myocardial infarction (AMI), articularly through preventing serious arrhythmias‘. A small set of randomized trials was performed. We will analyze those data. In the data below MgD and ControlD refers to deaths for Mg treated and control patients, while MgN and ControlN refers to the number of patients enrolled in each group.
r1 <- read.csv(text="
        Trial      MgD  MgN ControlD ControlN
        Morton       1   40   2   36 
        Rasmussen    9  135  23  135
        Smith        2  200   7  200
        Abraham      1   48   1   46
        Feldstedt   10  150   8  148
        Shechter     1   59   9   56
        Ceremuzynski 1   25   3   23
        LIMIT-2     90 1159 118 1157
        “,skip=1,sep=”)

Analysis

Odds rations

The first analysis will be to reproduce all the derived variables, such as log odds ratios. This will reproduce the essentials of table 3.8. All formulas are in BACTHCE. Equivalent N is the N if this were normal distributed data with SD=2.
logoddrat <- function(
    a1,b1=N1-a1,N1=a1+b1,
    a2,b2=N2-a2,N2=a2+b2) {
  log(((a1+.5)/(b1+.5))/
          ((a2+.5)/(b2+.5)))
}
sdlogoddrat <- function(
    a1,b1=N1-a1,N1=a1+b1,
    a2,b2=N2-a2,N2=a2+b2) {
  sqrt(1/(a1+.5)+1/(b1+.5) +1
          /(a2+.5)+1/(b2+.5))
}

r1$logoddsratio <- with(r1,
    logoddrat(a1=MgD,N1=MgN,a2=ControlD,N2=ControlN))
r1$sdodds <- with(r1,
    sdlogoddrat(a1=MgD,N1=MgN,a2=ControlD,N2=ControlN))
r1$equivN <- (2/r1$sdodds)^2
r1
         Trial MgD  MgN ControlD ControlN logoddsratio    sdodds     equivN
1       Morton   1   40        2       36  -0.64616697 1.0587581   3.568342
2    Rasmussen   9  135       23      135  -1.02299771 0.4057220  24.299805
3        Smith   2  200        7      200  -1.12412388 0.7372510   7.359177
4      Abraham   1   48        1       46  -0.04301739 1.1731854   2.906208
5    Feldstedt  10  150        8      148   0.21130909 0.4765711  17.611833
6     Shechter   1   59        9       56  -2.05412373 0.9000425   4.937805
7 Ceremuzynski   1   25        3       23  -1.02554609 1.0207731   3.838854
8      LIMIT-2  90 1159      118     1157  -0.29801453 0.1462380 187.042101

Odds ratios in plot

The plot shows the odds ratios as function of standard deviation of the hyper priors (tau). Mu is the overall mean. Note that I could not find formula for the overall mean in BACTHCE, and just assumed that the value for By in shrinkage was applied. 
mut <- function(tau,r1) {  
  yy <- r1$logoddsratio
  By <- r1$sdodds^2/(r1$sdodds^2+tau^2)
  mus <- sum(yy*r1$equivN*By)/sum(r1$equivN*By)
  yy <- r1$logoddsratio*(1-By)+mus*By
  data.frame(Trial=c(as.character(r1$Trial),’Mu’),tau=tau,mu=c(yy,mus)) 
}

la <- lapply(seq(0,2,.01),function(x) mut(x,r1))
lac <- do.call(rbind,la) 
lac$Trial <- factor(lac$Trial,levels=c(levels(r1$Trial),'Mu'))

ggplot(lac,aes(x=tau,y=mu,col=Trial)) +
    geom_line() +
    theme(legend.position=”bottom”)  +
    guides(col=guide_legend(ncol=3)) +
    scale_y_continuous(‘Log odds ratio’)

Beta binomial hierarchical model

I wanted to know how that would compare if I did that calculation as a beta-binomial, for which I used JAGS. Lor is the estimated log odds ratio, which with -0.7 is close to what is found in BACTHCE using a method of moments estimator for tau. It can be seen that tau in the neighborhood of 0 is possible. The theta parameter represents the probability of death in both groups. What I noticed here is that the control group has a slightly larger standard deviation in theta. Is that a reason for heterogeneity in tau which was mentioned in BACTHCE?
library(R2jags)
datain <- as.list(r1[,2:5])
datain$n <- nrow(r1)
model1 <- function() {
  for (i in 1:2) {
    kappa_log[i] ~  dpar(1,1.5)
    kappa[i] <- exp(kappa_log[i])
    theta[i] ~ dnorm(thetam[i],thetatau[i])
    thetam[i] ~ dunif(0,1)
    thetatau[i] <- pow(thetasd[i],-2) 
    thetasd[i] ~ dnorm(0,.3) %_% I(0,) 
    a[i] <- kappa[i] * theta[i]
    b[i] <- kappa[i] * (1 - theta[i]) 
  } 
  for (i in 1:n) {
    MgD[i] ~ dbetabin(a[1],b[1],MgN[i])
    ControlD[i] ~ dbetabin(a[2],b[2],ControlN[i])
  }
  lor <- log((theta[1]/(1-theta[1]))/
          (theta[2]/(1-theta[2])))
}

parameters <- c('theta','lor')

jj <- jags(model.file=model1,data=datain,parameters=parameters,DIC=FALSE)
jj 
Inference for Bugs model at “C:/Users/Kees/AppData/Local/Temp/Rtmp6h4pKI/model1814317e3684.txt”, fit using jags,
 3 chains, each with 2000 iterations (first 1000 discarded)
 n.sims = 3000 iterations saved
         mu.vect sd.vect   2.5%    25%    50%    75% 97.5%  Rhat n.eff
lor       -0.701   0.485 -1.637 -1.019 -0.706 -0.388 0.298 1.001  2500
theta[1]   0.057   0.021  0.029  0.043  0.053  0.065 0.116 1.003  1200
theta[2]   0.106   0.031  0.060  0.085  0.101  0.123 0.183 1.001  3000

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

Beta binomial in SAS

In clinical trials, SAS is often the tool of choice. Hence the beta binomial in PROC MCMC. The results are a bit different from JAGS, but then I had some difficulty creating the model in PROC MCMC and reruns gave somewhat varying results. I already did increase the number of samples, but there might be better tricks to get stable results.
data r1;
input  Trial $ MgD  MgN ControlD ControlN;
cards;
        Morton       1   40   2   36 
        Rasmussen    9  135  23  135
        Smith        2  200   7  200
        Abraham      1   48   1   46
        Feldstedt   10  150   8  148
        Shechter     1   59   9   56
        Ceremuzynski 1   25   3   23
        LIMIT-2     90 1159 118 1157
;;;;
run;

proc mcmc data=r1 monitor=(theta1 theta2 lor) 
     ntu=2000 nbi=4000 nmc=8000 nthin=8;
      parms (kappa_log1 kappa_log2) 1.8 (theta1 theta2) .1;
      prior kappa_log1 ~ uniform(0,5);
      prior kappa_log2 ~ uniform(0,5);
      prior theta1 ~ uniform(0,1);
      prior theta2 ~ uniform(0,1);
      kappa1 = exp(kappa_log1);
      kappa2 = exp(kappa_log2);
      a1 = kappa1*theta1;
      b1 = kappa1*(1-theta1);
      a2 = kappa2*theta2;
      b2 = kappa2*(1-theta2);
      lor = log((a1/b1)/(a2/b2));
      random Mgp ~beta(a1,b1) subject=trial;
      random Controlp ~beta(a2,b2) subject=trial;
      model MgD ~  binomial(MgN,Mgp);
      model ControlD ~ binomial(ControlN,Controlp); 
 run;  
Posterior Summaries and Intervals
ParameterNMeanStandard
Deviation
95% HPD Interval
theta110000.05190.01570.02440.0841
theta210000.09590.02240.05960.1404
lor1000-0.67900.3880-1.48620.0872



Effective Sample Sizes
ParameterESSAutocorrelation
Time
Efficiency
theta1591.51.69070.5915
theta2247.04.04830.2470
lor343.72.90970.3437

To leave a comment for the author, please follow the link and comment on their blog: Wiekvoet.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

Never miss an update!
Subscribe to R-bloggers to receive
e-mails with the latest R posts.
(You will not see this message again.)

Click here to close (This popup will not appear again)