Site icon R-bloggers

Simulation of a Two-Group Parallel-Arm RCT with Interim Analyses

[This article was first published on Statistical Science & Related Matters on Less Likely, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

  • Recently Andrew Althouse informed me that he was going to simulate a two-group parallel-arm randomized trial with interim analyses using the rpact R package, so I offered to also help in constructing the R code to do so. He already has a number of R scripts on his GitHub repo for doing similar simulations, which can be viewed here and a number of tweets explaining these simulations. For this example, his goal was to simulate a trial where the outcome was binary and the probability of death for each group could be tuned in addition to:


    • the total number of participants
    • the number of interim analyses
    • the schedule of the interim analyses
    • the group-sequential design used

    along with the usual trial analysis parameters such as:


    • the \(\alpha\) level
    • the type of test (1-sided vs. 2-sided).

    The goal was to be able to produce a table of various statistics such as :


    • the odds ratio
    • the confidence limits
    • the \(P\)-value
    • the number of successes

    for each of the interim analyses specified.

    The function below is a reflection of our efforts to do so, and also returns several plots from the rpact package for the design that is chosen along with a plot comparing the design to other designs. In order to get similar results, you will need to load the R function first, and then simply enter the proper inputs. While there may be more efficient ways to write the code, for example using lapply() instead of for loops, we have chosen not to do so, and we have also tried to minimize the number of R packages necessary for the function to work but the following will be required:



    You can quickly install and load both using:


    req_packs <- c("rpact", "stringr")
    install.packages(req_packs)
    lapply(req_packs, library, character.only = TRUE)

    Setting up the Function


    #' @title Simulation of a Two-Group Parallel-Arm Trial With Interim Analyses
    #' @docType Custom function for simulation from the rpact package
    #' @author Andrew Althouse with edits by Zad Rafi
    #' NOTE: If you want to confirm "type 1 error" under different stopping rules,
    #' make death = in the two treatment arms (e.g. no treatment effect)
    #' NOTE: I have set this one up to test the power for a treatment that would reduce mortality
    #' from 40% in control group (1) to 30% in treatment group (2)
    #' NOTE: Trial Design Parameters - Part 1
    #' Here we will specify the basics: total N patients to enroll, and death rate for each treatment arm
    #' NOTE: Trial Design Parameters - Part 2
    #' Here we will define the interim analysis strategy and stopping rules
    #' For this trial we will include provisions for efficacy stopping only (no pre-specified futility stopping)
    #' We will use the rpact package to compute the stopping/success thresholds at the interim and final analysis
    #' NOTE: Required packages: rpact and stringr
    #' @param nSims # The number of simulations, the default is 1000
    #' @param nPatients # here is where you specify the planned max number of patients you want included in each RCT
    #' @param death1 # here is where you specify the event rate for patients receiving 'treatment 1' in these trial
    #' @param death2 # here is where you specify the event rate for patients receiving 'treatment 2' in these trials
    #' @param nLooks # here is where you put the number of looks that will take place (INCLUDING the final analysis)
    #' @param analyses_scheduled # schedule of interim analyses
    #' @param sided # Whether the test is 1-sided or 2-sided
    #' @param alpha # Specified alpha level, the default is 0.05
    #' @param informationRates #
    #' @param trials # The total number of trials you wish to load in the table results. 
    #' @param typeOfDesign # The type of design.
    #' @param seed # Argument to set the seed for the simulations
    #' @return list of dataframes and plots
    #' @example See below this code block
    interim_sim <- function(nPatients = 1000, death1 = 0.4, death2 = 0.3, 
                            nLooks = 4, analyses_scheduled = c(0.25, 0.50, 0.75, 1),
                            sided = 1, alpha = 0.05, 
                            informationRates = analyses_scheduled, 
                            typeOfDesign = "asOF", nSims = 1000, trials = 10,
                            seed = 1031) {
    
    efficacy_thresholds <- numeric(nLooks)
    design <- getDesignGroupSequential(
      sided = sided, alpha = alpha,
      informationRates = analyses_scheduled,
      typeOfDesign = typeOfDesign)
    design_2 <- getDesignGroupSequential(typeOfDesign = "P") # Pocock
    design_3 <- getDesignGroupSequential(typeOfDesign = "asP") # Alpha-spending Pocock
    design_4 <- getDesignGroupSequential(typeOfDesign = "OF") # O'Brien-Fleming
    designSet <- getDesignSet(designs = c(design, design_2,
                                          design_3, design_4),
                              variedParameters = "typeOfDesign")
    
    RNGkind(kind = "L'Ecuyer-CMRG")
    set.seed(seed)
    for (j in 1:nLooks) {
      efficacy_thresholds[j] <- design$stageLevels[j]
    }
    
    analyses_nPatients <- analyses_scheduled * nPatients
    efficacy_thresholds
    pb <- txtProgressBar(min = 0, max = nSims, initial = 0, style = 3)
    trialnum <- numeric(nSims)
    or <- data.frame(matrix(ncol = nLooks, nrow = nSims))
    lcl <- data.frame(matrix(ncol = nLooks, nrow = nSims))
    ucl <- data.frame(matrix(ncol = nLooks, nrow = nSims))
    pval <- data.frame(matrix(ncol = nLooks, nrow = nSims))
    success <- data.frame(matrix(ncol = nLooks, nrow = nSims))
    
    strings <- c("OR_%d", "LCL_%d", "UCL_%d",
                 "Pval_%d", "Success_%d")
    
    colnames(or) <- sprintf("OR_%d", (1:nLooks))
    colnames(lcl) <- sprintf("LCL_%d", (1:nLooks))
    colnames(ucl) <- sprintf("UCL_%d", (1:nLooks))
    colnames(pval) <- sprintf("Pval_%d", (1:nLooks))
    colnames(success) <- sprintf("Success_%d", (1:nLooks))
    overall_success <- numeric(nSims)
    df <- data.frame(trialnum, or, lcl, ucl,
                     pval, success, overall_success)
    
    RNGkind(kind = "L'Ecuyer-CMRG")
    set.seed(seed)
    time <- system.time(
    for (i in 1:nSims) {
      trialnum[i] <- i
      pid <- seq(1, nPatients, by = 1)
      treatment <- rep(1:2, nPatients / 2)
      deathprob <- numeric(nPatients)
      deathprob[treatment == 1] <- death1
      deathprob[treatment == 2] <- death2
      death <- rbinom(nPatients, 1, deathprob)
      trialdata <- data.frame(cbind(pid, treatment, death))
    
      for (j in 1:nLooks) {
        analysisdata <- subset(trialdata, pid <= analyses_nPatients[j])
        model <- glm(death ~ treatment,
                     family = binomial(link = "logit"),
                     data = analysisdata)
        or[i, j] <- exp(summary(model)$coefficients[2])
        lcl[i, j] <- exp(confint.default((model))[2, 1])
        ucl[i, j] <- exp(confint.default((model))[2, 2])
        pval[i, j] <- summary(model)$coefficients[8]
        success[i, j] <- ifelse(or[i, j] < 1 & pval[i, j] < efficacy_thresholds[j], 1, 0)
      }
      overall_success[i] <- 0
      for (j in 1:nLooks) {
        if (success[i, j] == 1) {
          overall_success[i] <- 1
        }
      }
        setTxtProgressBar(pb, i)
    })
    
    df <- data.frame(trialnum, or, lcl, ucl, pval,
                     success, overall_success)
    simulation_results <- data.frame(matrix(vector(),
      nrow = nPatients, ncol = (length(df))))
    colnames(simulation_results) <- c("trialnum", (do.call(
      rbind,
      lapply(1:length(strings),
        FUN = function(j) {
          (do.call(rbind, lapply(1:nLooks,
            FUN = function(i) ((sprintf(strings, i)))
          )[]))[, j]
        }
      )
    )), "overall_success")
    simulation_results[intersect(names(df), names(simulation_results))] <- (df[intersect(
      names(df),
      names(simulation_results)
    )])
    simulation_results <- as.data.frame(simulation_results)
    
    outputs <- simulation_results[, c(-1, -length(simulation_results))]
    
    cols <- as.character(1:nLooks)
    rows <- as.character(1:length(strings))
    interim <- matrix(
      nrow = length(strings), ncol = nLooks,
      dimnames = list((rows), (cols)), 
      data = rep(0, (length(strings) * nLooks)))
    
    for (i in cols) {
      interim[, i] <- str_subset(colnames(outputs), i)
    }
    
    colnames(interim) <- sprintf("Interim_Look_%d", (1:nLooks))
    colnames(simulation_results)[1] <- c("TrialNum")
    colnames(simulation_results)[length(simulation_results)] <- c("Overall_Success")
    simulation_results_trials <- head(simulation_results, trials)
    results <- list(
      summary(design),
      summary(time),
      plot(design, 1),
      plot(designSet, type = 1),
      table(overall_success),
      simulation_results_trials)
    
    names(results) <- c(
      "Design Summary", "Time to complete simulation", 
      "Main Design Plot", "Plot of Various Designs", 
      "Table of Overall Success", "Simulation Results")
    
    return(results)
    }

    A Simulated Example


    results <- interim_sim(nPatients = 1000, death1 = 0.4, death2 = 0.3, nLooks = 4,
                           analyses_scheduled = c(0.25, 0.50, 0.75, 1),
                           sided = 1, alpha = 0.025, typeOfDesign = "asOF",
                           informationRates =  c(0.25, 0.50, 0.75, 1),
                           nSims = 1000, trials = 20, seed = 1031)

    Examining the Results

    results[1:5]
    #> $`Design Summary`
    #> Sequential analysis with a maximum of 4 looks (group sequential design)
    #> 
    #> O'Brien & Fleming type alpha spending design, one-sided local 
    #> significance level 2.5%, power 80%, undefined endpoint.
    #> 
    #> Stage                                    1       2       3       4 
    #> Information rate                       25%     50%     75%    100% 
    #> Efficacy boundary (z-value scale)    4.333   2.963   2.359   2.014 
    #> Cumulative alpha spent             <0.0001  0.0015  0.0096  0.0250 
    #> One-sided local significance level <0.0001  0.0015  0.0092  0.0220 
    #> 
    #> $`Time to complete simulation`
    #>    user  system elapsed 
    #>  11.818   0.112  12.008 
    #> 
    #> $`Main Design Plot`

    #> 
    #> $`Plot of Various Designs`

    #> 
    #> $`Table of Overall Success`
    #> overall_success
    #>   0   1 
    #> 134 866


    To examine the results, I used the kableExtra package, though this is not necessary, and simply using the following script will suffice


    table_results <- results[[6]]
    View(table_results) 

    Interim_Look_1
    Interim_Look_2
    Interim_Look_3
    Interim_Look_4
    TrialNum OR_1 LCL_1 UCL_1 Pval_1 Success_1 OR_2 LCL_2 UCL_2 Pval_2 Success_2 OR_3 LCL_3 UCL_3 Pval_3 Success_3 OR_4 LCL_4 UCL_4 Pval_4 Success_4 Overall_Success
    1 0.936 0.565 1.55 0.797 0 0.868 0.601 1.255 0.453 0 1.036 0.767 1.398 0.818 0 0.891 0.687 1.157 0.387 0 0
    2 0.87 0.519 1.459 0.598 0 0.757 0.525 1.092 0.136 0 0.745 0.551 1.007 0.056 0 0.721 0.555 0.936 0.014 1 1
    3 0.59 0.35 0.996 0.048 0 0.555 0.382 0.806 0.002 0 0.676 0.5 0.916 0.011 0 0.686 0.529 0.89 0.005 1 1
    4 0.652 0.385 1.103 0.111 0 0.671 0.461 0.976 0.037 0 0.725 0.533 0.988 0.042 0 0.75 0.576 0.976 0.032 0 0
    5 0.674 0.398 1.142 0.143 0 0.763 0.526 1.108 0.155 0 0.835 0.617 1.132 0.246 0 0.764 0.587 0.994 0.045 0 0
    6 0.525 0.309 0.892 0.017 0 0.622 0.427 0.907 0.014 0 0.668 0.491 0.909 0.01 0 0.625 0.479 0.815 0.001 1 1
    7 0.742 0.433 1.269 0.275 0 0.769 0.525 1.125 0.175 0 0.76 0.558 1.037 0.083 0 0.719 0.551 0.938 0.015 1 1
    8 0.661 0.394 1.108 0.116 0 0.437 0.301 0.632 0 1 0.535 0.397 0.721 0 1 0.551 0.425 0.714 0 1 1
    9 1.288 0.76 2.183 0.348 0 1.113 0.769 1.612 0.571 0 0.87 0.645 1.173 0.361 0 0.771 0.596 0.999 0.049 0 0
    10 0.783 0.466 1.316 0.356 0 0.701 0.484 1.015 0.06 0 0.655 0.482 0.89 0.007 1 0.614 0.472 0.798 0 1 1
    11 0.577 0.344 0.968 0.037 0 0.639 0.444 0.921 0.016 0 0.56 0.415 0.754 0 1 0.577 0.446 0.748 0 1 1
    12 0.538 0.315 0.919 0.023 0 0.629 0.434 0.913 0.015 0 0.576 0.426 0.78 0 1 0.561 0.431 0.731 0 1 1
    13 0.79 0.474 1.315 0.364 0 0.763 0.531 1.095 0.142 0 0.789 0.588 1.06 0.115 0 0.791 0.611 1.025 0.076 0 0
    14 0.535 0.318 0.902 0.019 0 0.69 0.477 0.999 0.049 0 0.583 0.43 0.792 0.001 1 0.621 0.477 0.809 0 1 1
    15 0.695 0.419 1.152 0.158 0 0.744 0.516 1.073 0.114 0 0.73 0.541 0.985 0.04 0 0.64 0.493 0.83 0.001 1 1
    16 1.116 0.657 1.896 0.685 0 0.763 0.526 1.108 0.155 0 0.69 0.509 0.935 0.017 0 0.69 0.532 0.896 0.005 1 1
    17 0.533 0.32 0.886 0.015 0 0.585 0.407 0.839 0.004 0 0.679 0.504 0.913 0.011 0 0.622 0.48 0.806 0 1 1
    18 0.613 0.364 1.033 0.066 0 0.692 0.478 1 0.05 0 0.78 0.577 1.055 0.107 0 0.705 0.543 0.914 0.008 1 1
    19 0.841 0.502 1.409 0.511 0 0.768 0.531 1.11 0.16 0 0.685 0.506 0.927 0.014 0 0.665 0.511 0.864 0.002 1 1
    20 0.275 0.159 0.476 0 1 0.449 0.31 0.651 0 1 0.498 0.369 0.672 0 1 0.498 0.384 0.646 0 1 1
    Abbreviations:
    TrialNum: Trial Number | OR: Odds Ratio | LCL: Lower Confidence Level | UCL: Upper Confidence Level | Pval: P-value

    Statistical Package Citations


    Please remember to cite the R packages if you use any of the R scripts from above.


    citation("rpact")
    #> 
    #> To cite package 'rpact' in publications use:
    #> 
    #>   Gernot Wassmer and Friedrich Pahlke (2021). rpact: Confirmatory Adaptive Clinical Trial Design and Analysis. R
    #>   package version 3.0.4. https://CRAN.R-project.org/package=rpact
    #> 
    #> A BibTeX entry for LaTeX users is
    #> 
    #>   @Manual{,
    #>     title = {rpact: Confirmatory Adaptive Clinical Trial Design and Analysis},
    #>     author = {Gernot Wassmer and Friedrich Pahlke},
    #>     year = {2021},
    #>     note = {R package version 3.0.4},
    #>     url = {https://CRAN.R-project.org/package=rpact},
    #>   }
    citation("stringr")
    #> 
    #> To cite package 'stringr' in publications use:
    #> 
    #>   Hadley Wickham (2019). stringr: Simple, Consistent Wrappers for Common String Operations. R package version 1.4.0.
    #>   https://CRAN.R-project.org/package=stringr
    #> 
    #> A BibTeX entry for LaTeX users is
    #> 
    #>   @Manual{,
    #>     title = {stringr: Simple, Consistent Wrappers for Common String Operations},
    #>     author = {Hadley Wickham},
    #>     year = {2019},
    #>     note = {R package version 1.4.0},
    #>     url = {https://CRAN.R-project.org/package=stringr},
    #>   }
    citation("knitr")
    #> Warning in meta$Date: partial match of 'Date' to 'Date/Publication'
    #> 
    #> To cite the 'knitr' package in publications use:
    #> 
    #>   Yihui Xie (2021). knitr: A General-Purpose Package for Dynamic Report Generation in R. R package version 1.32.
    #> 
    #> A BibTeX entry for LaTeX users is
    #> 
    #>   @Manual{,
    #>     title = {knitr: A General-Purpose Package for Dynamic Report Generation in R},
    #>     author = {Yihui Xie},
    #>     year = {2021},
    #>     note = {R package version 1.32},
    #>     url = {https://yihui.org/knitr/},
    #>   }
    #> 
    #>   Yihui Xie (2015) Dynamic Documents with R and knitr. 2nd edition. Chapman and Hall/CRC. ISBN 978-1498716963
    #> 
    #> A BibTeX entry for LaTeX users is
    #> 
    #>   @Book{,
    #>     title = {Dynamic Documents with {R} and knitr},
    #>     author = {Yihui Xie},
    #>     publisher = {Chapman and Hall/CRC},
    #>     address = {Boca Raton, Florida},
    #>     year = {2015},
    #>     edition = {2nd},
    #>     note = {ISBN 978-1498716963},
    #>     url = {https://yihui.org/knitr/},
    #>   }
    #> 
    #>   Yihui Xie (2014) knitr: A Comprehensive Tool for Reproducible Research in R. In Victoria Stodden, Friedrich Leisch
    #>   and Roger D. Peng, editors, Implementing Reproducible Computational Research. Chapman and Hall/CRC. ISBN
    #>   978-1466561595
    #> 
    #> A BibTeX entry for LaTeX users is
    #> 
    #>   @InCollection{,
    #>     booktitle = {Implementing Reproducible Computational Research},
    #>     editor = {Victoria Stodden and Friedrich Leisch and Roger D. Peng},
    #>     title = {knitr: A Comprehensive Tool for Reproducible Research in {R}},
    #>     author = {Yihui Xie},
    #>     publisher = {Chapman and Hall/CRC},
    #>     year = {2014},
    #>     note = {ISBN 978-1466561595},
    #>     url = {http://www.crcpress.com/product/isbn/9781466561595},
    #>   }

    Environment


    The analyses were run on:


    #> R version 4.0.5 (2021-03-31)
    #> Platform: x86_64-apple-darwin17.0 (64-bit)
    #> Running under: macOS Big Sur 10.16
    #> 
    #> Matrix products: default
    #> BLAS:   /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRblas.dylib
    #> LAPACK: /Library/Frameworks/R.framework/Versions/4.0/Resources/lib/libRlapack.dylib
    #> 
    #> Random number generation:
    #>  RNG:     L'Ecuyer-CMRG 
    #>  Normal:  Inversion 
    #>  Sample:  Rejection 
    #>  
    #> locale:
    #> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
    #> 
    #> attached base packages:
    #> [1] stats     graphics  grDevices utils     datasets  methods   base     
    #> 
    #> other attached packages:
    #> [1] stringr_1.4.0    rpact_3.0.4      kableExtra_1.3.4
    #> 
    #> loaded via a namespace (and not attached):
    #>  [1] Rcpp_1.0.6           svglite_2.0.0        prettyunits_1.1.1    ps_1.6.0             assertthat_0.2.1     digest_0.6.27       
    #>  [7] utf8_1.2.1           V8_3.4.0             R6_2.5.0             sys_3.4              stats4_4.0.5         evaluate_0.14       
    #> [13] highr_0.9            httr_1.4.2           ggplot2_3.3.3        blogdown_1.3         pillar_1.6.0         rlang_0.4.10        
    #> [19] curl_4.3             rstudioapi_0.13      callr_3.7.0          jquerylib_0.1.3      rmarkdown_2.7        labeling_0.4.2      
    #> [25] webshot_0.5.2        loo_2.4.1            munsell_0.5.0        compiler_4.0.5       xfun_0.22            rstan_2.21.1        
    #> [31] systems_1.0.1    pkgconfig_2.0.3      askpass_1.1          pkgbuild_1.2.0       htmltools_0.5.1.1    openssl_1.4.3       
    #> [37] tidyselect_1.1.0     tibble_3.1.1         gridExtra_2.3        bookdown_0.22        codetools_0.2-18     matrixStats_0.58.0  
    #> [43] viridisLite_0.4.0    fansi_0.4.2          crayon_1.4.1         dplyr_1.0.5          grid_4.0.5           jsonlite_1.7.2      
    #> [49] gtable_0.3.0         lifecycle_1.0.0      DBI_1.1.1            magrittr_2.0.1       credentials_1.3.0    StanHeaders_2.21.0-7
    #> [55] scales_1.1.1         RcppParallel_5.1.2   cli_2.4.0            stringi_1.5.3        farver_2.1.0         xml2_1.3.2          
    #> [61] bslib_0.2.4          ellipsis_0.3.1       generics_0.1.0       vctrs_0.3.7          RColorBrewer_1.1-2   tools_4.0.5         
    #> [67] glue_1.4.2           purrr_0.3.4          processx_3.5.1       parallel_4.0.5       yaml_2.2.1           inline_0.3.17       
    #> [73] colorspace_2.0-0     rvest_1.0.0          knitr_1.32           sass_0.3.1

    Article Citation

    To leave a comment for the author, please follow the link and comment on their blog: Statistical Science & Related Matters on Less Likely.

    R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
    Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.