Survival Analysis With Generalized Additive Models: Part III (the baseline hazard)
In the third part of the series on survival analysis with GAMs we will review the use of the baseline hazard estimates provided by this regression model. In contrast to the Cox model, the log-baseline hazard is estimated along with other quantities (e.g. the log hazard ratios) by the Poisson GAM (PGAM) as:
$$\log h(t) = \beta_0 + \lambda(t) + F\beta = \lambda_I(t) + F\beta$$
In the aforementioned expression, the baseline hazard is equivalently modeled as a time-varying deviation ($\lambda(t)$) from a constant (the intercept $\beta_0$), or as a time-varying function ($\lambda_I(t)$). In the latter case, the constant is absorbed into the smooth term. The choice between these equivalent forms is dictated by the application at hand; in particular, the intercept may be switched on or off by centering (or not centering) the smooth terms appearing in the call to the gam function. Hence, in the PGAM formulation time is treated as yet another covariate, whose effect (the log-baseline hazard) is estimated by a smooth function; other covariates may modify this hazard in a proportional fashion by additively shifting the log-baseline hazard ($F\beta$).
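The equivalence of the two forms can be checked numerically. The following is a minimal sketch on simulated data, not the PBC analysis from the series: the Weibull event times, the censoring time, the interval grid and all variable names are assumptions made for illustration, and a crude midpoint rule stands in for the Gauss Lobatto nodes. It fits a Poisson GAM with mgcv and verifies that the intercept plus the centered smooth of time reproduces the linear predictor, i.e. the log-baseline hazard.

```r
library(mgcv)

set.seed(1)
n      <- 300
time   <- rweibull(n, shape = 1.5, scale = 2)  # hypothetical event times
status <- as.numeric(time <= 3)                # administrative censoring at t = 3
time   <- pmin(time, 3)

# Expand each subject into short intervals (assumption: a midpoint rule
# replaces the Gauss-Lobatto quadrature used in the series)
cuts <- seq(0, 3, by = 0.25)
pieces <- do.call(rbind, lapply(seq_len(n), function(i) {
  lo <- cuts[cuts < time[i]]                       # interval starts
  hi <- pmin(cuts[seq_along(lo) + 1], time[i])     # interval ends
  data.frame(mid  = (lo + hi) / 2,                 # time at which hazard is evaluated
             expo = hi - lo,                       # time at risk in the interval
             d    = c(rep(0, length(lo) - 1), status[i]))  # event only in last piece
}))

# Poisson GAM: log-hazard = intercept + centered smooth of time
fit <- gam(d ~ s(mid) + offset(log(expo)),
           family = poisson, data = pieces, method = "REML")

# Evaluate on a grid with unit exposure, so the offset contributes log(1) = 0
grid <- data.frame(mid = seq(0.1, 2.9, length.out = 50), expo = 1)

b0        <- unname(coef(fit)["(Intercept)"])                      # the constant
deviation <- predict(fit, newdata = grid, type = "terms")[, "s(mid)"]  # centered smooth
log_h     <- predict(fit, newdata = grid, type = "link")           # full linear predictor

# constant + deviation reproduces the log-baseline hazard
all.equal(as.numeric(log_h), b0 + as.numeric(deviation))
```

The sum b0 + deviation plays the role of the single time-varying function in which the constant has been absorbed; the two forms differ only in how the same curve is parameterized.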
In the “standard” way of fitting a PGAM by mgcv, the log-baseline hazard is estimated in the constant+deviation form. Exponentiation may be used to derive the baseline hazard and its standard errors. Continuing the analysis of the Primary Biliary Cirrhosis example from the second part of the series, we may write:
par(mfrow = c(2, 2))
plot(fGAM, main = "Gauss Lobatto (5)", ylab = "log-basehaz")
plot(fGAM2, main = "Gauss Lobatto (10)", ylab = "log-basehaz")
plot(fGAM, main = "Gauss Lobatto (5)", ylab = "basehaz", trans = exp)
plot(fGAM2, main = "Gauss Lobatto (10)", ylab = "basehaz", trans = exp)
There is no substantial difference between the estimates obtained with the coarse (Gauss Lobatto (5)) and the finer (Gauss Lobatto (10)) discretization. Note that, as a result of fitting the log-hazard as a constant plus a time-varying deviation, the standard error of the curve vanishes at ~1050: the value of the log-hazard at that instant, in events per unit time, is provided by the intercept term.
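The plotted bands describe the centered deviation only. If pointwise uncertainty for the whole log-baseline hazard (intercept included) is wanted, one option is to go through predict rather than plot. The sketch below again uses simulated data rather than the PBC fit; the data-generating values, the midpoint-rule expansion and all object names are assumptions for illustration. Intervals are built on the log scale and then exponentiated, and the standard error on the hazard scale comes from a first-order delta-method approximation.

```r
library(mgcv)

set.seed(2)
n      <- 300
time   <- rweibull(n, shape = 1.5, scale = 2)  # hypothetical event times
status <- as.numeric(time <= 3)                # administrative censoring at t = 3
time   <- pmin(time, 3)

# Midpoint-rule expansion (assumption: stands in for Gauss-Lobatto nodes)
cuts <- seq(0, 3, by = 0.25)
pieces <- do.call(rbind, lapply(seq_len(n), function(i) {
  lo <- cuts[cuts < time[i]]
  hi <- pmin(cuts[seq_along(lo) + 1], time[i])
  data.frame(mid = (lo + hi) / 2, expo = hi - lo,
             d = c(rep(0, length(lo) - 1), status[i]))
}))

fit <- gam(d ~ s(mid) + offset(log(expo)),
           family = poisson, data = pieces, method = "REML")

# Unit exposure on the grid, so predictions are log-hazards per unit time
grid <- data.frame(mid = seq(0.1, 2.9, length.out = 50), expo = 1)

# Standard error of the centered deviation alone
se_dev <- predict(fit, newdata = grid, type = "terms",
                  se.fit = TRUE)$se.fit[, "s(mid)"]

# Standard error of intercept + deviation, i.e. the full log-baseline hazard
pred     <- predict(fit, newdata = grid, type = "link", se.fit = TRUE)
log_h    <- as.numeric(pred$fit)
se_log_h <- as.numeric(pred$se.fit)

basehaz <- data.frame(
  time   = grid$mid,
  hazard = exp(log_h),
  se     = exp(log_h) * se_log_h,          # delta-method SE on the hazard scale
  lower  = exp(log_h - 1.96 * se_log_h),   # approximate 95% pointwise interval
  upper  = exp(log_h + 1.96 * se_log_h)
)
head(basehaz)
```

Comparing se_dev with se_log_h along the grid shows how much of the apparent precision of the centered curve comes from leaving the intercept out. The shift and seWithMean arguments of plot.gam may give a similar picture directly from the plotting call.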
Estimation of the log-baseline hazard allows the PGAM to function as a parametric, smooth alternative to the Kaplan-Meier estimator. This will be examined in the fourth part of this series.