Description

Principal Stratification Analysis in R.

Description

Estimating causal effects in the presence of post-treatment confounding using principal stratification. 'PStrata' allows for customized monotonicity assumptions and exclusion restriction assumptions, with automatic full Bayesian inference supported by 'Stan'. The main function to use in this package is PStrata(), which provides posterior estimates of principal causal effect with uncertainty quantification. Visualization tools are also provided for diagnosis and interpretation. See Liu and Li (2023) <arXiv:2304.02740> for details.

README.md

cran.r-project.org

Manual (Version 0.0.3) Updated: 8/28/2023

The package PStrata fits Bayesian principal stratification models using Stan. A wide variety of models, priors and assumptions are allowed to provide flexibility.

Overview

Let $Z$ denote the assigned treatment, $D$ denote the post-randomization intermediate variable, $X$ denote the covariates and $Y$ denote the outcome. Define $S = (D(0), D(1))$ as the principal stratum.

PStrata requires specification of two models.

S-model: a multinomial model that specifies the distribution of principal strata given covariates $$\log \frac{p(S = s\mid X)}{p(S = s_0 \mid X)} = X \beta^S.$$
Y-model: a generalized linear model that specifies the distribution of outcome specific to a principal stratum and treatment, given the covariates $$\mathbb{E}[Y \mid X, S, Z] = g^{-1}(X \beta_{s, z}^Y).$$

Assumption

monotonicity: the monitonicity assumption assumes that $D(0)\leq D(1)$, and hence ruling out the existence of principal stratum $S = (1, 0)$. In PStrata, users explicitly specify all principal strata that are assumed to exist. This allows for more flexibility than the original monotonicity assumption.
exclusion restriction (ER): the ER assumption for principal stratum $S$ assumes that if $D(0) = D(1)$, then $p(Y \mid X, S, Z) = p(Y \mid X, S)$.

Example

Consider fitting a Bayesian principal stratification model on dataset sim_data_normal, with both S-model and Y-model involving only intercepts and no covariates. The Y-model is a linear model (gaussian distribution with link function being the identity function). Assume monotonicity and exclusion restriction on both $S = (0, 0)$ and $S = (1, 1)$.

We specify standard normal priors for the intercepts included in the S-model and Y-model. Also, the Y-model introduces a standard deviation parameter $\sigma$ on which we specify a standard inverse gamma prior.

The following script runs the Bayesian analysis (may take around 2 minutes).

obj <- PStrata(
    S.formula = Z + D ~ 1, # treatment + intermediate ~ covariates
    Y.formula = Y ~ 1,     # outcome ~ covariates
    Y.family = gaussian(link = "identity"),
    data = sim_data_normal,
    strata = c(n = "00*", c = "01", a = "11*"), # * for ER, names are optional
    prior_intercept = prior_normal(0, 1),
    prior_sigma = prior_inv_gamma(1),
    chains = 4, warmup = 500, iter = 1000 # optional parameters to pass to rstan()
)

Raw output of obj provides an overview of the estimated proportion for each strata and summary(obj) provides quantiles and confidence intervals.

To obtain the estimated mean effects of each principal stratum and treatment arm, run the following script.

res <- PSOutcome(obj)
summary(res, "data.frame") # returns a data.frame object. Other options: "array", "matrix"
plot(res)

To obtain contrasts, use function PSContrast. For example, to compare the stratum-specific treatment effects (defined by $Y(1) - Y(0)$), use the following script.

cont <- PSContrast(res, Z = TRUE)
summary(cont, "data.frame")
plot(cont)

r-PStrata

Manual (Version 0.0.3) Updated: 8/28/2023

Overview

Assumption

Example

Version

License

Status

Source

Homepage

Platforms (77)

Manual (Version 0.0.3) Updated: 8/28/2023

Overview

Assumption

Example

Version

License

Status

Source

Homepage

Platforms77 (77)

Platforms (77)