R and Meta-Analysis
by Joseph Rickert
Broadly speaking, a meta-analysis is any statistical analysis that attempts to combine the results of several individual studies. The term was apparently coined by statistician Gene V Glass in a 1976 speech he made to the American Educational Research Association. Since that time, not only has meta-analysis become a fundamental tool in medicine, but it is also becoming popular in economics, finance, the social sciences and engineering. Organizations responsible for setting standards for evidence-based medicine, such as the United Kingdom’s National Institute for Health and Care Excellence (NICE), make extensive use of meta-analysis.
The application of meta-analysis to medicine is intuitive and, on the surface, compelling. Clinical trials designed to test the efficacy of a new treatment for a disease against the standard treatment tend to be based on relatively small samples. (For example, the largest of four trials for Respiratory Tract Diseases currently listed on ClinicalTrials.gov has an estimated enrollment of 533 patients.) It would seem to be a “no brainer” to use “all of the information” to get more accurate results. However, as with so many things, the devil is in the details. The preliminary tasks of establishing a rigorous protocol to guide the meta-analysis and conducting the systematic review to search for relevant studies are themselves far from trivial. One has to work hard to avoid “selection bias”, “publication bias” and other even more subtle difficulties.
In my limited experience with meta-analysis, I found it extraordinarily difficult to determine whether patient populations from different clinical trials were sufficiently homogeneous to be included in the same meta-analysis. Even when working with well-written papers, published in quality journals, a considerable amount of medical expertise was required to interpret the data. I came away with the strong impression that a good meta-analysis requires collaboration from a team of experts.
Historically, it has probably been the case that most meta-analyses were conducted either with general tools such as Excel or specialized software like RevMan from the Cochrane Collaboration. However, R is the natural platform for meta-analysis both because of the myriad possibilities for statistical analyses that are not generally available through the specialized software, and because of the many packages devoted to various aspects of meta-analysis. The CRAN Meta-Analysis Task View is exceptionally well organized, listing R packages according to the different stages of conducting a meta-analysis and also calling out some specialized techniques such as meta-regression and network meta-analysis.
In a future post, I hope to be able to explore some of these packages more closely. For now, let’s look at a very simple analysis based on Thomas Lumley’s rmeta package, which has been a part of R since 1999. The following simple meta-analysis is written up very nicely in the book by Chen and Peace titled Applied Meta-Analysis with R.
The cochrane data set in the rmeta package contains the results from seven randomized clinical trials designed to test the effectiveness of corticosteroid therapy in preventing neonatal deaths in premature labor. The columns of the data set are: the name of the trial center (name), the number of deaths in the treatment group (ev.trt), the total number of patients in the treatment group (n.trt), the number of deaths in the control group (ev.ctrl) and the total number of patients in the control group (n.ctrl).
    # Simple Meta-analysis
    library(rmeta)
    data(cochrane)
    cochrane

              name ev.trt n.trt ev.ctrl n.ctrl
    1     Auckland     36   532      60    538
    2        Block      1    69       5     61
    3        Doran      4    81      11     63
    4        Gamsu     14   131      20    137
    5     Morrison      3    67       7     59
    6 Papageorgiou      1    71       7     75
    7      Tauesch      8    56      10     71
The null hypothesis is that there is no difference between treatment and control. Following Chen and Peace, we fit both fixed effects and random effects models to look at the odds ratios.
    model.FE <- meta.MH(n.trt, n.ctrl, ev.trt, ev.ctrl, names = name, data = cochrane)
    model.RE <- meta.DSL(n.trt, n.ctrl, ev.trt, ev.ctrl, names = name, data = cochrane)
The summary for the fixed effects model shows that while only two studies, Auckland and Doran, individually show a significant effect (their 95% confidence intervals exclude 1), the pooled Mantel-Haenszel confidence interval does indicate a benefit from the treatment.
    Fixed effects ( Mantel-Haenszel ) meta-analysis
    Call: meta.MH(ntrt = n.trt, nctrl = n.ctrl, ptrt = ev.trt, pctrl = ev.ctrl,
        names = name, data = cochrane)
    ------------------------------------
                   OR (lower  95% upper)
    Auckland     0.58    0.38       0.89
    Block        0.16    0.02       1.45
    Doran        0.25    0.07       0.81
    Gamsu        0.70    0.34       1.45
    Morrison     0.35    0.09       1.41
    Papageorgiou 0.14    0.02       1.16
    Tauesch      1.02    0.37       2.77
    ------------------------------------
    Mantel-Haenszel OR =0.53 95% CI ( 0.39,0.73 )
    Test for heterogeneity: X^2( 6 ) = 6.9 ( p-value 0.3303 )
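The point estimates in this summary can be checked by hand. The base R sketch below is not part of the original analysis: it re-enters the seven trials' counts, computes each study's odds ratio from its 2x2 table, and then pools them with the usual Mantel-Haenszel formula, sum(a*d/n) / sum(b*c/n). The data frame name trials and the helper variable names are illustrative, and the control-group size for Tauesch is assumed to be 71, the value consistent with that study's reported odds ratio of 1.02.

```r
# Counts re-entered by hand from the cochrane data set (rmeta package).
# Assumption: n.ctrl for Tauesch is 71, which matches its reported OR of 1.02.
trials <- data.frame(
  name    = c("Auckland", "Block", "Doran", "Gamsu",
              "Morrison", "Papageorgiou", "Tauesch"),
  ev.trt  = c(36, 1, 4, 14, 3, 1, 8),
  n.trt   = c(532, 69, 81, 131, 67, 71, 56),
  ev.ctrl = c(60, 5, 11, 20, 7, 7, 10),
  n.ctrl  = c(538, 61, 63, 137, 59, 75, 71)
)

# Cells of each study's 2x2 table (illustrative names)
deaths_trt  <- trials$ev.trt
alive_trt   <- trials$n.trt - trials$ev.trt
deaths_ctrl <- trials$ev.ctrl
alive_ctrl  <- trials$n.ctrl - trials$ev.ctrl
n_total     <- trials$n.trt + trials$n.ctrl

# Per-study odds ratio: odds of death under treatment / odds under control
or_study <- (deaths_trt * alive_ctrl) / (alive_trt * deaths_ctrl)

# Mantel-Haenszel pooled odds ratio: each study contributes in
# proportion to its size rather than being averaged equally
or_mh <- sum(deaths_trt * alive_ctrl / n_total) /
         sum(alive_trt * deaths_ctrl / n_total)

print(round(setNames(or_study, trials$name), 2))
print(round(or_mh, 2))
```

Under these assumptions the printed values should agree with the OR column and the Mantel-Haenszel OR in the summary to two decimals. The sketch covers point estimates only; the confidence intervals and the heterogeneity test are left to meta.MH.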
The summary for the random effects model for these data is identical except that, as one would expect, the overall confidence interval is somewhat wider: SummaryOR= 0.53 95% CI ( 0.37,0.78 ). A slight modification of the forest plot code provided by Chen and Peace (which works for both the fixed effects and random effects model objects) shows the typical way to present these results.
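    # Forest plot for an rmeta fit; works for both meta.MH and meta.DSL objects
    CPplot <- function(model) {
      # Text columns: two header rows, one row per study, a blank row, the summary row
      c1 <- c("", "Study", model$names, NA, "Summary")
      c2 <- c("Deaths", "(Steroid)", cochrane$ev.trt, NA, NA)
      c3 <- c("Deaths", "(Placebo)", cochrane$ev.ctrl, NA, NA)
      c4 <- c("", "OR", format(exp(model[[1]]), digits = 2), NA,
              format(exp(model[[3]]), digits = 2))
      tableText <- cbind(c1, c2, c3, c4)

      # model[[1]], model[[2]]: per-study log odds ratios and their standard errors
      # model[[3]], model[[4]]: pooled log odds ratio and its standard error
      mean   <- c(NA, NA, model[[1]], NA, model[[3]])
      stderr <- c(NA, NA, model[[2]], NA, model[[4]])
      low    <- mean - 1.96 * stderr
      up     <- mean + 1.96 * stderr

      # Values are on the log scale; xlog = TRUE labels the axis as odds ratios
      forestplot(tableText, mean, low, up, zero = 0,
                 is.summary = c(TRUE, TRUE, rep(FALSE, 8), TRUE),
                 clip = c(log(0.1), log(2.5)), xlog = TRUE)
    }

    CPplot(model.FE)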
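To see where the wider random effects interval quoted above comes from, here is a minimal base R sketch of the DerSimonian-Laird calculation. It is written from the textbook formulas rather than taken from the rmeta source: per-study log odds ratios with Woolf variances, Cochran's Q around the inverse-variance fixed effects mean, a method-of-moments estimate of the between-study variance tau2, and re-weighting by 1/(variance + tau2). All object names are illustrative, and the control-group size for Tauesch is again assumed to be 71.

```r
# Counts re-entered by hand from the cochrane data set (rmeta package).
# Assumption: n.ctrl for Tauesch is 71, which matches its reported OR of 1.02.
trials <- data.frame(
  name    = c("Auckland", "Block", "Doran", "Gamsu",
              "Morrison", "Papageorgiou", "Tauesch"),
  ev.trt  = c(36, 1, 4, 14, 3, 1, 8),
  n.trt   = c(532, 69, 81, 131, 67, 71, 56),
  ev.ctrl = c(60, 5, 11, 20, 7, 7, 10),
  n.ctrl  = c(538, 61, 63, 137, 59, 75, 71)
)

deaths_trt  <- trials$ev.trt
alive_trt   <- trials$n.trt - trials$ev.trt
deaths_ctrl <- trials$ev.ctrl
alive_ctrl  <- trials$n.ctrl - trials$ev.ctrl

# Per-study log odds ratios and their (Woolf) variances
y <- log((deaths_trt * alive_ctrl) / (alive_trt * deaths_ctrl))
v <- 1 / deaths_trt + 1 / alive_trt + 1 / deaths_ctrl + 1 / alive_ctrl

# Inverse-variance fixed effects mean and Cochran's Q heterogeneity statistic
w       <- 1 / v
y_fixed <- sum(w * y) / sum(w)
Q       <- sum(w * (y - y_fixed)^2)
k       <- length(y)

# DerSimonian-Laird moment estimate of the between-study variance,
# truncated at zero
tau2 <- max(0, (Q - (k - 1)) / (sum(w) - sum(w^2) / sum(w)))

# Random effects weights add tau2 to every study's variance, which
# flattens the weights and widens the pooled interval
w_re  <- 1 / (v + tau2)
y_re  <- sum(w_re * y) / sum(w_re)
se_re <- sqrt(1 / sum(w_re))

or_re <- exp(y_re)
ci_re <- exp(y_re + c(-1, 1) * qnorm(0.975) * se_re)

print(round(c(Q = Q, tau2 = tau2), 3))
print(round(c(OR = or_re, lower = ci_re[1], upper = ci_re[2]), 2))
```

Because tau2 comes out slightly above zero for these data, every study's effective variance grows a little, and that is what stretches the interval relative to the fixed effects fit. If the sketch matches what meta.DSL does internally, the printed odds ratio and interval should agree with the random effects summary to two decimals.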
The whole idea of meta-analysis is intriguing. However, because of the challenges I mentioned above, I would be remiss not to point out that it elicits considerable criticism. The article Meta-analysis and its problems by H J Eysenck captures the issues and is well worth reading. Also, have a look at the review article by Walker, Hernandez and Kattan writing in the Cleveland Clinic Journal of Medicine.