Log-rank statistic for 2 groups
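The Log-rank test, based on the Log-rank statistic, is a popular tool for determining whether 2 (or more) estimated survival curves differ significantly. As stated in the literature, the Log-rank test for comparing survival (estimates of survival curves) in 2 groups ($A$ and $B$) is based on the statistic

$$LR = \frac{U^2}{V},$$

which under the null hypothesis is asymptotically $\chi^2(1)$ distributed, where

$$U = \sum_{i=1}^{T} w_{t_i} \left( o_{t_i}^A - e_{t_i}^A \right), \qquad V = Var(U) = \sum_{i=1}^{T} w_{t_i}^2 \, \frac{n_{t_i}^A \, n_{t_i}^B \, o_{t_i} \, (n_{t_i} - o_{t_i})}{n_{t_i}^2 \, (n_{t_i} - 1)}$$

and
- $t_i$ for $i = 1, \dots, T$ are possible event times,
- $n_{t_i}$ is the overall risk set size on the time $t_i$ ($n_{t_i} = n_{t_i}^A + n_{t_i}^B$),
- $n_{t_i}^A$ is the risk set size on the time $t_i$ in group $A$,
- $n_{t_i}^B$ is the risk set size on the time $t_i$ in group $B$,
- $o_{t_i}$ overall observed events in the time $t_i$ ($o_{t_i} = o_{t_i}^A + o_{t_i}^B$),
- $o_{t_i}^A$ observed events in the time $t_i$ in group $A$,
- $o_{t_i}^B$ observed events in the time $t_i$ in group $B$,
- $e_{t_i}$ number of overall expected events in the time $t_i$ ($e_{t_i} = e_{t_i}^A + e_{t_i}^B$),
- $e_{t_i}^A$ number of expected events in the time $t_i$ in group $A$,
- $e_{t_i}^B$ number of expected events in the time $t_i$ in group $B$,
- $w_{t_i}$ is a weight for the statistic.

Also remember a few notes:

$$e_{t_i}^A = n_{t_i}^A \frac{o_{t_i}}{n_{t_i}}, \qquad e_{t_i}^B = n_{t_i}^B \frac{o_{t_i}}{n_{t_i}}, \qquad e_{t_i}^A + e_{t_i}^B = o_{t_i}^A + o_{t_i}^B,$$

which is why we can substitute group $A$ with $B$ in $U$ and receive the same results: $U$ only changes sign, so $U^2$ and the statistic stay the same.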
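To make the definitions concrete, here is a minimal sketch that computes the unweighted statistic by hand and compares it with `survival::survdiff`. It assumes the `aml` data set shipped with the survival package (not the `kidney` data used later in this post), treats the first factor level as group A, and uses the usual hypergeometric variance for V; the variable names are made up for this illustration.

```r
library(survival)

d <- survival::aml                 # columns: time, status (1 = event), x (group)
in_a <- d$x == levels(d$x)[1]      # assumption: first factor level plays the role of group A

# distinct times at which at least one event was observed
event_times <- sort(unique(d$time[d$status == 1]))

# per event time: observed minus expected events in group A, and the variance term
per_time <- t(sapply(event_times, function(ti) {
  n   <- sum(d$time >= ti)                           # overall risk set size
  n_a <- sum(d$time >= ti & in_a)                    # risk set size in group A
  o   <- sum(d$time == ti & d$status == 1)           # overall observed events
  o_a <- sum(d$time == ti & d$status == 1 & in_a)    # observed events in group A
  e_a <- n_a * o / n                                 # expected events in group A
  v   <- if (n > 1) n_a * (n - n_a) * o * (n - o) / (n^2 * (n - 1)) else 0
  c(u = o_a - e_a, v = v)
}))

w  <- rep(1, length(event_times))  # regular Log-rank: all weights equal to 1
U  <- sum(w * per_time[, "u"])
V  <- sum(w^2 * per_time[, "v"])
LR <- U^2 / V
p_value <- pchisq(LR, df = 1, lower.tail = FALSE)

# reference value from the survival package
reference <- survdiff(Surv(time, status) ~ x, data = d)
all.equal(LR, unname(reference$chisq))
```

Swapping the roles of the two groups (using `!in_a`) flips the sign of `U` but leaves `LR` unchanged.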
Weighted Log-rank extensions
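Regular Log-rank comparison uses $w_{t_i} = 1$, but many modifications to that approach have been proposed. The most popular modifications, called weighted Log-rank tests, are available in ?survMisc::comp (below, $n_{t_i}$ is the overall number at risk and $o_{t_i}$ the overall number of observed events at the event time $t_i$):
- n: Gehan and Breslow proposed to use $w_{t_i} = n_{t_i}$ (this is also called generalized Wilcoxon),
- sqrtN: Tarone and Ware proposed to use $w_{t_i} = \sqrt{n_{t_i}}$,
- S1: Peto-Peto’s modified survival estimate $w_{t_i} = S1(t_i) = \prod_{t_j \le t_i} \left(1 - \frac{o_{t_j}}{n_{t_j} + 1}\right)$,
- S2: modified Peto-Peto (by Andersen) $w_{t_i} = S2(t_i) = S1(t_i) \frac{n_{t_i}}{n_{t_i} + 1}$,
- FH: Fleming-Harrington $w_{t_i} = S(t_i)^p (1 - S(t_i))^q$, where $S$ is the Kaplan-Meier estimate of the pooled survival function.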
Watch out for FH: I reported an issue in the survMisc repository because I think their mathematical notation for Fleming-Harrington is misleading.
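The weights listed above can be sketched in a few lines of base R. This is an illustration only, not the survMisc implementation: `logrank_weights` and its column names are hypothetical, the `aml` data from the survival package is assumed as input, and the Fleming-Harrington weight uses the pooled Kaplan-Meier estimate evaluated at the event time itself (some implementations use its value just before the event time instead).

```r
library(survival)

# Hypothetical helper: weights at each distinct event time of the pooled sample.
logrank_weights <- function(time, status, p = 1, q = 1) {
  event_times <- sort(unique(time[status == 1]))
  n <- sapply(event_times, function(ti) sum(time >= ti))                 # at risk
  o <- sapply(event_times, function(ti) sum(time == ti & status == 1))   # events
  km <- cumprod(1 - o / n)        # pooled Kaplan-Meier estimate (assumed at t_i)
  s1 <- cumprod(1 - o / (n + 1))  # Peto-Peto's modified survival estimate
  data.frame(
    time    = event_times,
    n       = n,
    o       = o,
    w_1     = 1,                  # regular Log-rank
    w_n     = n,                  # Gehan-Breslow (generalized Wilcoxon)
    w_sqrtN = sqrt(n),            # Tarone-Ware
    w_S1    = s1,                 # Peto-Peto
    w_S2    = s1 * n / (n + 1),   # modified Peto-Peto (by Andersen)
    w_FH    = km^p * (1 - km)^q   # Fleming-Harrington
  )
}

weights <- logrank_weights(survival::aml$time, survival::aml$status)
head(weights)
```

Looking at the columns side by side shows how each proposal shifts emphasis: `w_n`, `w_sqrtN`, `w_S1` and `w_S2` shrink as the risk set shrinks, while `w_FH` with p = q = 1 starts near zero and is largest where the pooled survival estimate is close to 0.5.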
Why are they useful?
The regular Log-rank test is sensitive to differences in late survival times, whereas the Gehan-Breslow and Tarone-Ware propositions might be used if one is interested in early differences in survival times. Peto-Peto modifications are also useful for early differences and are more robust (than Tarone-Ware or Gehan-Breslow) in situations where many observations are censored. The most flexible is the Fleming-Harrington method for weights, where high p indicates detecting early differences and high q indicates detecting differences in late survival times. But there is always the issue of how to choose p and q.
Remember that test selection should be performed at the research design level! Not after looking in the dataset.
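As a small illustration of how the choice of weights changes the result, the sketch below uses the `rho` argument of `survival::survdiff`, where `rho = 0` is the regular Log-rank test and `rho = 1` corresponds to the Peto and Peto modification of the Gehan-Wilcoxon test. Note the assumptions: this is the survival package's own weight family, not the survMisc weights listed earlier, the `aml` data set is used instead of `kidney`, and `p_value` is a helper defined only for this example.

```r
library(survival)

d <- survival::aml   # columns: time, status (1 = event), x (group)

logrank <- survdiff(Surv(time, status) ~ x, data = d, rho = 0)  # equal weights
peto    <- survdiff(Surv(time, status) ~ x, data = d, rho = 1)  # more weight on early times

# helper for this example: p-value from the chi-square statistic
p_value <- function(fit) {
  pchisq(fit$chisq, df = length(fit$n) - 1, lower.tail = FALSE)
}

c(logrank = p_value(logrank), peto = p_value(peto))
```

The two p-values differ on the same data, which is exactly why the test should be fixed before looking at the dataset.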
Plots
library(survminer)
library(survival)
data("kidney", package="KMsurv")
fit <- survfit(Surv(time=time, event=delta) ~ type, data=kidney)
After implementing the functionality requested in the GitHub issue "Other tests than log-rank for testing survival curves", we are now able to compute p-values for various Log-rank tests in the survminer package. Let us see below examples of executing all possible tests.
ggthemr
Let’s make it more interesting (or not) with the ggthemr package, which has many predefined palettes.
After installation
devtools::install_github('cttobin/ggthemr')
one can set up a global ggplot2 palette/theme with
library(ggthemr)
ggthemr('dust')
and check current colors with
swatch()
[1] "#555555" "#db735c" "#EFA86E" "#9A8A76" "#F3C57B" "#7A6752" "#2A91A2" "#87F28A" "#6EDCEF"
attr(,"class")
[1] "ggthemr_swatch"
Note: the first colour in a swatch is a special one. It is reserved for outlining boxplots, text etc. It is not used for colouring lines, which is why the calls below pass swatch()[2:3] as the palette.
Log-rank (survdiff) + sea theme
ggthemr("sea") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(3, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "survdiff",  # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
Log-rank (comp) + dust theme
ggthemr("dust") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(3, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "1",         # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
Gehan-Breslow (generalized Wilcoxon) + flat dark theme
ggthemr("flat dark") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(5, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "n",         # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
Tarone-Ware + camoflauge theme
ggthemr("camoflauge") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(3, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "sqrtN",     # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
Peto-Peto’s modified survival estimate + fresh theme
ggthemr("fresh") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(5, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "S1",        # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
modified Peto-Peto’s (by Andersen) + grass theme
ggthemr("grass") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(5, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "S2",        # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
Fleming-Harrington (p=1, q=1) + light theme
ggthemr("light") # set ggthemr theme
ggsurvplot(
  fit,                            # fitted survfit object
  risk.table = TRUE,              # include risk table?
  conf.int = TRUE,                # add confidence intervals?
  pval = TRUE,                    # add p-value to the plot?
  pval.method = TRUE,             # write the name of the test
                                  # that was used to compute the p-value?
  pval.method.coord = c(5, 0.1),  # coordinates for the name
  pval.method.size = 4,           # size for the name of the test
  log.rank.weights = "FH_p=1_q=1",  # type of weights in log-rank test
  ### a few options are set by default in survminer;
  ### we need to turn them off to allow
  ### ggthemr to work in its full glory
  palette = swatch()[2:3],        # pass the active palette
  ggtheme = NULL,                 # disable adding custom survminer theme
  font.x = NULL,                  # disable adding custom survminer font for the x axis
  font.y = NULL,                  # disable adding custom survminer font for the y axis
  font.main = NULL,               # disable adding custom survminer font for the title
  font.submain = NULL,            # disable adding custom survminer font for the subtitle
  font.caption = NULL             # disable adding custom survminer font for the caption
)
References
- Gehan EA 1965 A Generalized Wilcoxon Test for Comparing Arbitrarily Singly-Censored Samples. Biometrika 52(1/2):203-23.
- Tarone RE, Ware J 1977 On Distribution-Free Tests for Equality of Survival Distributions. Biometrika 64(1):156-60.
- Peto R, Peto J 1972 Asymptotically Efficient Rank Invariant Test Procedures. J Royal Statistical Society 135(2):186-207.
- Fleming TR, Harrington DP, O’Sullivan M 1987 Supremum Versions of the Log-Rank and Generalized Wilcoxon Statistics. J American Statistical Association 82(397):312-20.
- Billingsley P 1999 Convergence of Probability Measures. New York: John Wiley & Sons.