Description

Partial Identification of Causal Effects with Mostly Invalid Instruments.

A tuneable and interpretable method for relaxing the instrumental variables (IV) assumptions to infer treatment effects in the presence of unobserved confounding. For a treatment-associated covariate to be a valid IV, it must (a) be unconfounded with the outcome and (b) have a causal effect on the outcome that is exclusively mediated by the exposure. There is no general test of the validity of these IV assumptions for any particular pre-treatment covariate. However, if different pre-treatment covariates give differing causal effect estimates when treated as IVs, then we know at least some of the covariates violate these assumptions. 'budgetIVr' exploits this fact by taking as input a minimum budget of pre-treatment covariates assumed to be valid IVs and identifying the set of causal effects that are consistent with the user's data and budget assumption. The following generalizations of this principle can be used in this package: (1) a vector of multiple budgets can be assigned alongside corresponding thresholds that model degrees of IV invalidity; (2) budgets and thresholds can be chosen using specialist knowledge or varied in a principled sensitivity analysis; (3) treatment effects can be nonlinear and/or depend on multiple exposures (at a computational cost). The methods in this package require only summary statistics. Confidence sets are constructed under the "no measurement error" (NOME) assumption from the Mendelian randomization literature. For further methodological details, please refer to Penn et al. (2024) <doi:10.48550/arXiv.2411.06913>.

BudgetIV: partial identification of causal effects with mostly invalid instruments

Introduction

budgetIV provides a tuneable and interpretable method for relaxing the instrumental variables (IV) assumptions to infer treatment effects in the presence of unobserved confounding. For a pre-treatment covariate to be a valid IV, it must (a) be unconfounded with the outcome and (b) have a causal effect on the outcome that is exclusively mediated by the exposure. It is impossible to test the validity of these IV assumptions for any particular pre-treatment covariate; however, if different pre-treatment covariates give differing causal effect estimates when treated as IVs, then we know at least one of the covariates violates these assumptions. budgetIV exploits this fact by taking as input a minimum "budget" of pre-treatment covariates assumed to be valid IVs. This can be extended to a set of budgets for varying "degrees" of validity, which are set by the user and defined formally through a parameter that captures violation of either IV assumption. These budget constraints can be chosen using specialist knowledge or varied in a principled sensitivity analysis. budgetIV supports non-linear treatment effects and multi-dimensional treatments, requires only summary statistics rather than raw data, and can be used to construct confidence sets under a standard assumption from the Mendelian randomisation literature. With one-dimensional $\Phi (X)$, a computationally efficient variant, Budget_IV_Scalar, allows for use with thousands of pre-treatment covariates.
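
The disagreement argument above can be illustrated with a few made-up numbers. The sketch below assumes a single scalar exposure and hypothetical summary statistics (they are not taken from the package's example data), and computes the ratio estimate that each candidate instrument would give if it were valid.

```r
# Hypothetical summary statistics for five candidate instruments
# (illustrative numbers only, not from the package's example data).
beta_phi <- c(1.0, 2.0, 0.5, 1.0, 1.5)  # association of each Z_i with the exposure
beta_y   <- c(2.0, 4.0, 1.0, 2.8, 2.4)  # association of each Z_i with the outcome

# Ratio estimate of the causal effect that Z_i would imply if it were a valid IV.
ratio_estimates <- beta_y / beta_phi
print(ratio_estimates)

# More than one distinct value means the candidates cannot all be valid.
n_distinct <- length(unique(round(ratio_estimates, 8)))
all_agree <- n_distinct == 1
print(all_agree)
```

In this toy example three candidates agree with each other and two do not, which is the kind of pattern a budget of three valid instruments is meant to exploit.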

We assume a homogeneous treatment effect, implying the following structural causal model: $$Z := f_z (\epsilon_z),$$ $$X := f_x (Z, \epsilon_x),$$ $$Y := \theta \Phi (X) + g_y (Z, \epsilon_y).$$ There may be association between $\epsilon_y$ and $\epsilon_z$, indicating a violation of the unconfoundedness assumption (a); and $g_y$ may depend on $Z$, indicating a violation of exclusivity (b). With budgetIV, the user defines degrees of validity $0 \leq \tau_1 \leq \tau_2 \leq \ldots \leq \tau_K$ that may apply to any candidate instrument $Z_i$. If $Z_i$ satisfies the $j$th degree of validity, this means $\lvert \mathrm{Cov} (g_y (Z, \epsilon_y), Z_i) \rvert \leq \tau_j$. Choosing $\tau_1 = 0$ would demand that some pre-treatment covariates give valid causal effect estimates, while choosing $\tau_K = \infty$ would allow some covariates to give arbitrarily biased causal effect estimates if treated as IVs. budgetIV will return the corresponding identified/confidence set over causal effects that agree with the budget constraints and with the user-input summary statistics: beta_y corresponding to $\mathrm{Cov} (Y, Z)$ and beta_phi corresponding to $\mathrm{Cov} (\Phi (X), Z)$. Other regression coefficients such as odds ratios, hazard ratios or multicollinearity-adjusted regression coefficients may be used for beta_y and beta_phi, but this also changes the interpretation of the $\tau$'s.
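
To make the budget constraint concrete: taking the covariance of the outcome equation with $Z_i$ gives $\mathrm{Cov} (Y, Z_i) = \theta \, \mathrm{Cov} (\Phi (X), Z_i) + \mathrm{Cov} (g_y (Z, \epsilon_y), Z_i)$, so the $j$th degree of validity is equivalent to requiring that the gap between the $i$th entry of beta_y and $\theta$ times the $i$th entry of beta_phi is at most $\tau_j$ in absolute value. The sketch below is a brute-force illustration of that condition for a scalar exposure, assuming each budget is a minimum count of candidates meeting the corresponding threshold. It is not the package's algorithm; the helper `satisfies_budget` and all numbers are hypothetical.

```r
# Brute-force illustration for a scalar exposure; NOT the package's algorithm.
# All numbers are hypothetical.
beta_phi <- c(1.0, 2.0, 0.5, 1.0, 1.5)
beta_y   <- c(2.0, 4.0, 1.0, 2.8, 2.4)

tau_vec <- c(0.05)  # one degree of validity
b_vec   <- c(3)     # at least 3 candidates must satisfy it

# TRUE if theta is consistent with every (tau_j, b_j) pair.
satisfies_budget <- function(theta, beta_y, beta_phi, tau_vec, b_vec) {
  # Implied |Cov(g_y, Z_i)| for each candidate at this value of theta.
  violation <- abs(beta_y - theta * beta_phi)
  n_within <- vapply(tau_vec, function(tau) sum(violation <= tau), integer(1))
  all(n_within >= b_vec)
}

# Scan a grid of candidate effects and keep those that meet the budget.
theta_grid <- seq(0, 4, by = 0.01)
feasible <- vapply(theta_grid, satisfies_budget, logical(1),
                   beta_y = beta_y, beta_phi = beta_phi,
                   tau_vec = tau_vec, b_vec = b_vec)
theta_feasible <- theta_grid[feasible]
print(range(theta_feasible))
```

A grid scan like this scales poorly with the dimension of $\theta$, so it is only meant to show what "consistent with the budget" means.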

For further methodological details, theoretical results, and advanced use cases, please refer to Penn et al. (2024) doi:10.48550/arXiv.2411.06913.

Installation

To install the development version from GitHub, using devtools, run:

devtools::install_github('jpenn2023/budgetivr')

library(budgetIVr)

Examples

First, we load precomputed summary statistics from the example dataset:

data(simulated_data_budgetIV)

beta_y <- simulated_data_budgetIV$beta_y

beta_phi_1 <- simulated_data_budgetIV$beta_phi_1
beta_phi_2 <- simulated_data_budgetIV$beta_phi_2

beta_phi <- matrix(c(beta_phi_1, beta_phi_2), nrow = 2, byrow = TRUE)

delta_beta_y <- simulated_data_budgetIV$delta_beta_y
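
For readers starting from individual-level data rather than precomputed statistics, the following is a minimal sketch of how covariance-based summary statistics of the same shape could be obtained with base R. The simulated data and its coefficients are hypothetical and unrelated to simulated_data_budgetIV, and the sketch does not cover standard errors such as delta_beta_y.

```r
# Simulate a toy dataset (hypothetical; unrelated to simulated_data_budgetIV).
set.seed(1)
n <- 1000
Z <- matrix(rnorm(n * 3), nrow = n, ncol = 3)  # three candidate instruments
U <- rnorm(n)                                  # unobserved confounder
X <- as.vector(Z %*% c(0.5, 0.3, 0.8)) + U + rnorm(n)
Y <- 1.0 * X - 0.5 * X^2 + U + rnorm(n)

# Covariance-based summary statistics, one entry per candidate instrument.
beta_y     <- as.vector(cov(Y, Z))
beta_phi_1 <- as.vector(cov(X, Z))     # basis function x
beta_phi_2 <- as.vector(cov(X^2, Z))   # basis function x^2

# One row per basis function, one column per candidate instrument.
beta_phi <- rbind(beta_phi_1, beta_phi_2)
print(dim(beta_phi))
```

Apart from row names, `rbind(beta_phi_1, beta_phi_2)` produces the same matrix as `matrix(c(beta_phi_1, beta_phi_2), nrow = 2, byrow = TRUE)`.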

Then, we define the basis functions $\Phi (X)$ and set background budget constraints:

phi_basis <- expression(x, x^2)

tau_vec <- c(0)
b_vec <- c(3)

Then, we define the baseline treatment $x_0$ and the treatment values over which to calculate the average treatment effect:

X_baseline <- list("x" = c(0))

x_vals <- seq(from = 0, to = 1, length.out = 500)

ATE_search_domain <- expand.grid("x" = x_vals)
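
Under the structural model in the introduction, moving the exposure from the baseline $x_0$ to a value $x$ changes the outcome by $\theta \cdot (\Phi (x) - \Phi (x_0))$. The sketch below evaluates that quantity over a grid for one hypothetical value of $\theta$. The helper `eval_basis` and the vector `theta` are illustrative assumptions and not part of the package, which reports bounds over all $\theta$ consistent with the budget constraints rather than a single curve.

```r
phi_basis <- expression(x, x^2)

# Hypothetical helper: evaluate each basis function at the given x values.
# Returns a matrix with one row per x value and one column per basis function.
eval_basis <- function(basis, x) {
  out <- matrix(NA_real_, nrow = length(x), ncol = length(basis))
  for (k in seq_along(basis)) {
    out[, k] <- eval(basis[[k]], list(x = x))
  }
  out
}

theta      <- c(1.0, -0.5)  # hypothetical coefficients on x and x^2
x_baseline <- 0
x_vals     <- seq(from = 0, to = 1, length.out = 500)

# Phi(x) - Phi(x_0) for every x in the grid, then the implied effect curve.
phi_baseline <- as.vector(eval_basis(phi_basis, x_baseline))
phi_diff     <- sweep(eval_basis(phi_basis, x_vals), 2, phi_baseline)
ate_curve    <- as.vector(phi_diff %*% theta)

print(head(ate_curve))
```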

Now we run budgetIV to partially identify the budget assignments and corresponding average causal effect bounds:

partial_identification_ATE <- budgetIV(beta_y = beta_y,
                                       beta_phi = beta_phi,
                                       phi_basis = phi_basis,
                                       tau_vec = tau_vec,
                                       b_vec = b_vec,
                                       ATE_search_domain = ATE_search_domain,
                                       X_baseline = X_baseline,
                                       delta_beta_y = delta_beta_y)
