Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
In the third part of the series on survival analysis with GAMs we will review the use of the baseline hazard estimates provided by this regression model. In contrast to the Cox model, the log-baseline hazard is estimated along with other quantities (e.g. the log hazard ratios) by the Poisson GAM (PGAM) as:
$$\log h(t) = \beta_0 + \lambda(t) + \beta^{\top} x$$
In the aforementioned expression, the baseline hazard is equivalently modeled as a time-varying deviation ($\lambda(t)$) from a constant (the intercept $\beta_0$).
In the “standard” way of fitting a PGAM by mgcv, the log-baseline hazard is estimated in the constant+deviation form. Exponentiation may be used to derive the baseline hazard and its standard errors. Continuing the analysis of the Primary Biliary Cirrhosis example from the second part of the series, we may write:
par(mfrow = c(2, 2))
plot(fGAM,  main = "Gauss Lobatto (5)",  ylab = "log-basehaz")
plot(fGAM2, main = "Gauss Lobatto (10)", ylab = "log-basehaz")
plot(fGAM,  main = "Gauss Lobatto (5)",  ylab = "basehaz", trans = exp)
plot(fGAM2, main = "Gauss Lobatto (10)", ylab = "basehaz", trans = exp)
There is no substantial difference in the estimates obtained by the coarse (Gauss Lobatto (5)) and finer (Gauss Lobatto (10)) discretizations. Note that, as a result of fitting the log-hazard as a constant plus a time-varying deviation, the standard error of the curve vanishes at ~1050: the value of the log-hazard at that instant, in events per unit time, is provided by the intercept term.
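Because the plotted smooth is only the deviation from the intercept, it can help to see how the baseline hazard might be recovered on the absolute scale, with the intercept and its uncertainty included. The following is a minimal sketch under stated assumptions, not the PBC fit from this series: the simulated data and the names `toy`, `toy_fit`, `exposure` and `basehaz` are all hypothetical, and the model has no covariates. It uses `predict()` on the link scale and exponentiates the estimate and its approximate 95% limits.

```r
library(mgcv)

set.seed(42)

# Hypothetical toy data standing in for the expanded (quadrature) data set;
# none of these values come from the PBC example.
n   <- 400
toy <- data.frame(
  time     = runif(n, 0, 10),    # time points at which the hazard is evaluated
  exposure = runif(n, 0.5, 1.5)  # assumed positive weight entering as an offset
)
toy$events <- rpois(n, exp(-2 + 0.4 * sin(toy$time)) * toy$exposure)

# Poisson GAM: intercept + smooth deviation in time, with a log-exposure offset
toy_fit <- gam(events ~ s(time) + offset(log(exposure)),
               family = poisson, data = toy)

# Prediction grid; exposure = 1 makes the offset log(1) = 0
grid <- data.frame(time = seq(0, 10, length.out = 50), exposure = 1)

# Link-scale prediction = intercept + smooth, so the standard error
# includes the uncertainty of the intercept
pr <- predict(toy_fit, newdata = grid, type = "link", se.fit = TRUE)
log_haz <- as.numeric(pr$fit)
se      <- as.numeric(pr$se.fit)

# Back-transform the estimate and its approximate 95% limits
basehaz <- data.frame(
  time   = grid$time,
  hazard = exp(log_haz),
  lower  = exp(log_haz - 1.96 * se),
  upper  = exp(log_haz + 1.96 * se)
)

head(basehaz)
```

In a model that also contains covariates, the grid passed to `predict()` would need a value for each of them (for instance their reference levels) so that the prediction corresponds to the baseline.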
Estimation of the log-baseline hazard allows the PGAM to function as a parametric, smooth alternative to the Kaplan-Meier estimator. This will be examined in the fourth part of this series.