Fitting Survival Regression Models via 'Stan'.
survstan
The aim of the R package survstan is to provide atoolkit for fitting survival models using Stan.
The R package survstan can be used to fit right-censored survival data under independent censoring. The implemented models allow the fitting of survival data in the presence/absence of covariates. All inferential procedures are currently based on the maximum likelihood (ML) approach.
Installation
You can install the released version of survstan from CRAN with:
install.packages("survstan")
You can install the development version of survstan from GitHub with:
# install.packages("devtools")
devtools::install_github("fndemarqui/survstan")
Inference procedures
Let $(t_{i}, \delta_{i})$ be the observed survival time and its corresponding failure indicator, $i=1, \cdots, n$, and $\boldsymbol{\theta}$ be a $k \times 1$ vector of parameters. Then, the likelihood function for right-censored survival data under independent censoring can be expressed as:
$$ L(\boldsymbol{\theta}) = \prod_{i=1}^{n}f(t_{i}|\boldsymbol{\theta})^{\delta_{i}}S(t_{i}|\boldsymbol{\theta})^{1-\delta_{i}}. $$
The maximum likelihood estimate (MLE) of $\boldsymbol{\theta}$ is obtained by directly maximization of $\log(L(\boldsymbol{\theta}))$ using the rstan::optimizing()
function. The function rstan::optimizing()
further provides the hessian matrix of $\log(L(\boldsymbol{\theta}))$, needed to obtain the observed Fisher information matrix, which is given by:
$$ \mathscr{I}(\hat{\boldsymbol{\theta}}) = -\frac{\partial^2}{\partial \boldsymbol{\theta}\boldsymbol{\theta}'} \log L(\boldsymbol{\theta})\mid_{\boldsymbol{\theta}=\hat{\boldsymbol{\theta}}}, $$
Inferences on $\boldsymbol{\theta}$ are then based on the asymptotic properties of the MLE, $\hat{\boldsymbol{\theta}}$, that state that:
$$ \hat{\boldsymbol{\theta}} \asymp N_{k}(\boldsymbol{\theta}, \mathscr{I}^{-1}(\hat{\boldsymbol{\theta}})). $$
Baseline Distributions
Some of the most popular baseline survival distributions are implemented in the R package survstan. Such distributions include:
- Exponential
- Weibull
- Lognormal
- Loglogistic
- Gamma,
- Generalized Gamma (original Stacy’s parametrization)
- Generalized Gamma (alternative Prentice’s parametrization)
- Gompertz
- Rayleigh
- Birnbaum-Saunders (fatigue)
The parametrizations adopted in the package survstan are presented next.
Exponential Distribution
If $T \sim \mbox{Exp}(\lambda)$, then
$$ f(t|\lambda) = \lambda\exp\left{-\lambda t\right}I_{[0, \infty)}(t), $$ where $\lambda>0$ is the rate parameter.
The survival and hazard functions in this case are given by:
$$ S(t|\lambda) = \exp\left{-\lambda t\right} $$ and $$ h(t|\lambda) = \lambda. $$
Weibull Distribution
If $T \sim \mbox{Weibull}(\alpha, \gamma)$, then
$$ f(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}t^{\alpha-1}\exp\left{-\left(\frac{t}{\gamma}\right)^{\alpha}\right}I_{[0, \infty)}(t), $$ where $\alpha>0$ and $\gamma>0$ are the shape and scale parameters, respectively.
The survival and hazard functions in this case are given by:
$$ S(t|\alpha, \gamma) = \exp\left{-\left(\frac{t}{\gamma}\right)^{\alpha}\right} $$ and $$ h(t|\alpha, \gamma) = \frac{\alpha}{\gamma^{\alpha}}t^{\alpha-1}. $$
Lognormal Distribution
If $T \sim \mbox{LN}(\mu, \sigma)$, then
$$ f(t|\mu, \sigma) = \frac{1}{\sqrt{2\pi}t\sigma}\exp\left{-\frac{1}{2}\left(\frac{log(t)-\mu}{\sigma}\right)^2\right}I_{[0, \infty)}(t), $$ where $-\infty < \mu < \infty$ and $\sigma>0$ are the mean and standard deviation in the log scale of $T$.
The survival and hazard functions in this case are given by:
$$S(t|\mu, \sigma) = \Phi\left(\frac{-log(t)+\mu}{\sigma}\right)$$ and $$h(t|\mu, \sigma) = \frac{f(t|\mu, \sigma)}{S(t|\mu, \sigma)},$$ where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.
Loglogistic Distribution
If $T \sim \mbox{LL}(\alpha, \gamma)$, then
$$ f(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha-1}}{\left[1 + \left(\frac{t}{\gamma}\right)^{\alpha}\right]^2}I_{[0, \infty)}(t), ~ \alpha>0, \gamma>0, $$
where $\alpha>0$ and $\gamma>0$ are the shape and scale parameters, respectively.
The survival and hazard functions in this case are given by:
$$S(t|\alpha, \gamma) = \frac{1}{1+ \left(\frac{t}{\gamma}\right)^{\alpha}}$$ and $$ h(t|\alpha, \gamma) = \frac{\frac{\alpha}{\gamma}\left(\frac{t}{\gamma}\right)^{\alpha-1}}{1 + \left(\frac{t}{\gamma}\right)^{\alpha}}. $$
Gamma Distribution
If $T \sim \mbox{Gamma}(\alpha, \lambda)$, then
$$f(t|\alpha, \lambda) = \frac{\lambda^{\alpha}}{\Gamma(\alpha)}t^{\alpha-1}\exp\left{-\lambda t\right}I_{[0, \infty)}(t),$$
where $\Gamma(\alpha) = \int_{0}^{\infty}u^{\alpha-1}\exp{-u}du$ is the gamma function.
The survival function is given by
$$S(t|\alpha, \lambda) = 1 - \frac{\gamma^{}(\alpha, \lambda t)}{\Gamma(\alpha)},$$ where $\gamma^{}(\alpha, \lambda t)$ is the lower incomplete gamma function, which is available only numerically. Finally, the hazard function is expressed as:
$$h(t|\alpha, \lambda) = \frac{f(t|\alpha, \lambda)}{S(t|\alpha, \lambda)}.$$
Generalized Gamma Distribution (original Stacy’s parametrization)
If $T \sim \mbox{ggstacy}(\alpha, \gamma, \kappa)$, then
$$f(t|\alpha, \gamma, \kappa) = \frac{\kappa}{\gamma^{\alpha}\Gamma(\alpha/\kappa)}t^{\alpha-1}\exp\left{-\left(\frac{t}{\gamma}\right)^{\kappa}\right}I_{[0, \infty)}(t),$$ for $\alpha>0$, $\gamma>0$ and $\kappa>0$.
It can be show that the survival function can be expressed as:
$$S(t|\alpha, \gamma, \kappa) = S_{G}(x|\nu, 1),$$ where $x = \displaystyle\left(\frac{t}{\gamma}\right)^\kappa$, and $F_{G}(\cdot|\nu, 1)$ corresponds to the distribution function of a gamma distribution with shape parameter $\nu = \alpha/\gamma$ and scale parameter equals to 1.
Finally, the hazard function is expressed as:
$$h(t|\alpha, \gamma, \kappa) = \frac{f(t|\alpha, \gamma, \kappa)}{S(t|\alpha, \gamma, \kappa)}.$$
Generalized Gamma Distribution (alternative Prentice’s parametrization)
If $T \sim \mbox{ggprentice}(\mu, \sigma, \varphi)$, then
$$f(t | \mu, \sigma, \varphi) = \begin{cases} \frac{|\varphi|(\varphi^{-2})^{\varphi^{-2}}}{\sigma t\Gamma(\varphi^{-2})}\exp{\varphi^{-2}[\varphi w - \exp(\varphi w)]}I_{[0, \infty)}(t), & \varphi \neq 0 \
\frac{1}{\sqrt{2\pi}t\sigma}\exp\left{-\frac{1}{2}\left(\frac{log(t)-\mu}{\sigma}\right)^2\right}I_{[0, \infty)}(t), & \varphi = 0 \end{cases} $$ where $w = \frac{\log(t) - \mu}{\sigma}$, for $-\infty < \mu < \infty$, $\sigma>0$ and $-\infty < \varphi < \infty$$.
It can be show that the survival function can be expressed as:
$$ S(t|\mu, \sigma, \varphi) = \begin{cases} S_{G}(x|1/\varphi^2, 1), & \varphi > 0 \
1-S_{G}(x|1/\varphi^2, 1), & \varphi < 0 \
S_{LN}(x|\mu, \sigma), & \varphi = 0 \end{cases} $$ where $x = \frac{1}{\varphi^2}\exp{\varphi w}$, $S_{G}(\cdot|1/\varphi^2, 1)$ is the distribution function of a gamma distribution with shape parameter $1/\varphi^2$ and scale parameter equals to 1, and $S_{LN}(x|\mu, \sigma)$ corresponds to the survival function of a lognormal distribution with location parameter $\mu$ and scale parameter $\sigma$.
Finally, the hazard function is expressed as:
$$h(t|\alpha, \gamma, \kappa) = \frac{f(t|\alpha, \gamma, \kappa)}{S(t|\alpha, \gamma, \kappa)}.$$
Gompertz Distribution
If $T \sim \mbox{Gamma}(\alpha, \gamma)$, then
$$f(t|\alpha, \lambda) = \alpha\exp\left{\gamma x-\frac{\alpha}{\gamma}\left(e^{\gamma x} - 1\right)\right}I_{[0, \infty)}(t).$$
The survival and hazard functions are given, respectively, by
$$S(t|\alpha, \lambda) = \exp\left{-\frac{\alpha}{\gamma}\left(e^{\gamma x} - 1\right)\right}.$$ and
$$h(t|\alpha, \lambda) = \alpha\exp{\gamma x}.$$
Rayleigh Distribution
Let $T \sim \mbox{rayleigh}(\sigma)$, where $\sigma>0$ is a scale parameter. Then, the density, survival and hazard functions are respectively given by:
$$f(t|\sigma) = \frac{x}{\sigma^2}\exp\left{-\frac{x^2}{2\sigma^2}\right},$$ $$S(t|\sigma) = \exp\left{-\frac{x^2}{2\sigma^2}\right}$$ and
$$h(t|\sigma) = \frac{x}{\sigma^2}.$$
Birnbaum-Saunders (fatigue) Distribution
If $T \sim \mbox{fatigue}(\alpha, \gamma)$, then
$$ f(t|\alpha, \gamma) = \frac{\sqrt{\frac{t}{\gamma}}+\sqrt{\frac{\gamma}{t}}}{2 \alpha t}\phi\left(\sqrt{\frac{t}{\gamma}}+\sqrt{\frac{\gamma}{t}}\right)(t), ~ \alpha>0, \gamma>0, $$
where $\phi(\cdot)$ is the probability density function of a standard normal distribution, $\alpha>0$ and $\gamma>0$ are the shape and scale parameters, respectively.
The survival function in this case is given by:
$$ S(t|\alpha, \gamma) =\Phi\left(\sqrt{\frac{t}{\gamma}}-\sqrt{\frac{\gamma}{t}}\right)(t) $$,
where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal distribution. The hazard function is given by $$h(t|\mu, \sigma) = \frac{f(t|\alpha, \gamma)}{S(t|\alpha, \gamma)}. $$
Regression models
When covariates are available, it is possible to fit six different regression models with the R package survstan:
- accelerated failure time (AFT) models;
- proportional hazards (PH) models;
- proportional odds (PO) models;
- accelerated hazard (AH) models.
- Yang and Prentice (YP) models.
- extended hazard (EH) models.
The regression survival models implemented in the R package survstan are briefly described in the sequel. Denote by $\mathbf{x}$ a $1\times p$ vector of covariates, and let $\boldsymbol{\beta}$ and $\boldsymbol{\phi}$ be $p \times 1$ vectors of regression coefficients, and $\boldsymbol{\theta}$ a vector of parameters associated with some baseline survival distribution. To ensure identifiability of the implemented regression models, we shall assume that the linear predictors $\mathbf{x} \boldsymbol{\beta}$ and $\mathbf{x} \boldsymbol{\phi}$ do not include an intercept term.
Accelerate Failure Time Models
Accelerated failure time (AFT) models are defined as
$$ T = \exp{\mathbf{x} \boldsymbol{\beta}}\nu, $$ where $\nu$ follows a baseline distribution with survival function $S_{0}(\cdot|\boldsymbol{\theta})$ so that
$$ f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = e^{-\mathbf{x} \boldsymbol{\beta}}f_{0}(te^{-\mathbf{x} \boldsymbol{\beta}}|\boldsymbol{\theta}) $$ and
$$ S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = S_{0}(t e^{-\mathbf{x} \boldsymbol{\beta}}|\boldsymbol{\theta}). $$
Proportional Hazards Models
Proportional hazards (PH) models are defined as
$$ h(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}(t|\boldsymbol{\theta})\exp{\mathbf{x} \boldsymbol{\beta}}, $$ where $h_{0}(t|\boldsymbol{\theta})$ is a baseline hazard function so that
$$ f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}(t|\boldsymbol{\theta})\exp\left{\mathbf{x} \boldsymbol{\beta} - H_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}\right}, $$ and
$$ S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \exp\left{ - H_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}\right}. $$
Proportional Odds Models
Proportional Odds (PO) models are defined as
$$ R(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = R_{0}(t|\boldsymbol{\theta})\exp{\mathbf{x} \boldsymbol{\beta}}, $$ where $\displaystyle R_{0}(t|\boldsymbol{\theta}) = \frac{1-S_{0}(t|\boldsymbol{\theta})}{S_{0}(t|\boldsymbol{\theta})} = \exp{H_{0}(t|\boldsymbol{\theta})}-1$ is a baseline odds function so that
$$ f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \frac{h_{0}(t|\boldsymbol{\theta})\exp{\mathbf{x} \boldsymbol{\beta} + H_{0}(t|\boldsymbol{\theta})}}{[1 + R_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}]^2}. $$
and
$$ S(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = \frac{1}{1 + R_{0}(t|\boldsymbol{\theta})e^{\mathbf{x} \boldsymbol{\beta}}}. $$
Accelerated Hazard Models
Accelerated hazard (AH) models can be defined as
$$h(t|\boldsymbol{\theta}, \boldsymbol{\beta},\mathbf{x}) = h_{0}\left(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta}\right)$$
so that
$$S(t|\boldsymbol{\theta}, \boldsymbol{\beta},\mathbf{x}) = \exp\left{- H_{0}\left(t/ e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta}\right)e^{\mathbf{x}\boldsymbol{\beta}} \right} $$ and $$f(t|\boldsymbol{\theta}, \boldsymbol{\beta}, \mathbf{x}) = h_{0}\left(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta}\right)\exp\left{- H_{0}\left(t/ e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta}\right)e^{\mathbf{x}\boldsymbol{\beta}} \right}. $$
Extended hazard Models
The survival function of the extended hazard (EH) model is given by:
$$S(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \exp\left{-H_{0}(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta})\exp(\mathbf{x}(\boldsymbol{\beta} + \boldsymbol{\phi}))\right}. $$
The hazard and the probability density functions are then expressed as:
$$h(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = h_{0}(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta})\exp{\mathbf{x}\boldsymbol{\phi}} $$ and
$$f(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = h_{0}(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta})\exp{\mathbf{x}\boldsymbol{\beta}}\exp\left{-H_{0}(t/e^{\mathbf{x}\boldsymbol{\beta}}|\boldsymbol{\theta})\exp(\mathbf{x}(\boldsymbol{\beta}+ \boldsymbol{\phi}))\right}, $$
respectively.
The EH model includes the AH, AFT and PH models as particular cases when $\boldsymbol{\phi} = \mathbf{0}$, $\boldsymbol{\phi} = -\boldsymbol{\beta}$, and $\boldsymbol{\beta} = \mathbf{0}$, respectively.
Yang and Prentice Models
The survival function of the Yang and Prentice (YP) model is given by:
$$S(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]^{-\kappa_{L}}. $$
The hazard and the probability density functions are then expressed as:
$$h(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \frac{\kappa_{S}h_{0}(t|\boldsymbol{\theta})\exp{H_{0}(t|\boldsymbol{\theta})}}{\left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]} $$ and
$$f(t|\boldsymbol{\theta},\boldsymbol{\beta}, \boldsymbol{\phi}) = \kappa_{S}h_{0}(t|\boldsymbol{\theta})\exp{H_{0}(t|\boldsymbol{\theta})}\left[1+\frac{\kappa_{S}}{\kappa_{L}}R_{0}(t|\boldsymbol{\theta})\right]^{-(1+\kappa_{L})}, $$
respectively, where $\kappa_{S} = \exp{\mathbf{x}\boldsymbol{\beta}}$ and $\kappa_{L} = \exp{\mathbf{x}\boldsymbol{\phi}}$.
The YO model includes the PH and PO models as particular cases when $\boldsymbol{\phi} = \boldsymbol{\beta}$ and $\boldsymbol{\phi} = \mathbf{0}$, respectively.