MatchThem
Matching and Weighting Multiply Imputed Datasets
Introduction
Missing covariate data is one of the major challenges in matching procedures. Matching compares covariate values between units in the control and treated subgroups, either directly or through predictions (such as propensity scores) from a logistic regression model. When a unit has missing values in the model covariates, neither a valid comparison nor a prediction can be made for that unit. Several solutions have been proposed, including complete-case analysis, but these approaches have flaws and limitations. As a result, multiple imputation of the missing data is gaining popularity as an alternative.
The mice and Amelia packages are established statistical tools for imputing missing data in R. In combination with these packages, the MatchThem package streamlines matching and weighting for multiply imputed datasets, making it easier to apply these methods credibly in practice.
Installation
The MatchThem package can be installed from the Comprehensive R Archive Network (CRAN) repository:
install.packages("MatchThem")
The latest version of the package can be installed from GitHub:
devtools::install_github(repo = "FarhadPishgar/MatchThem")
Suggested Workflow
Multiple imputation of missing data, followed by matching or weighting, may appear complex at first. To simplify the process, a five-step workflow is suggested. For more detailed information, please refer to the package's cheat sheet or vignette.
- Multiply Imputing Missing Data in the Dataset: The mice and Amelia packages are recommended for performing multiple imputation of missing data in the dataset.
- Matching or Weighting the Multiply Imputed Datasets: Matched units can be selected from the control and treated subgroups of each imputed dataset, or weights can be estimated for each imputed dataset, using the matchthem() or weightthem() functions provided by the MatchThem package.
- Assessing Balance on Matched or Weighted Datasets: The cobalt package provides tools for evaluating the balance of all covariates in the multiply imputed datasets after matching or weighting.
- Analyzing the Matched or Weighted Datasets: The with() function from the MatchThem package should be used to estimate causal effects in each matched or weighted dataset.
- Pooling the Causal Effect Estimates: The pool() function from the MatchThem package should be used to combine the estimates from each dataset into an overall estimate of the causal effects.
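The five steps above can be sketched in a short script. This is a minimal illustration, not a recommended analysis: it assumes the osteoarthritis example dataset that ships with MatchThem (with OSP as the treatment, KOA as the outcome, and AGE, SEX, BMI, RAC, and SMK as covariates), uses the survey package's svyglm() as one possible outcome model, and picks arbitrary settings (five imputations, nearest-neighbor matching) purely for demonstration.

```r
# Sketch of the suggested workflow; dataset, variables, and settings are
# illustrative assumptions, not recommendations.
library(mice)       # multiple imputation
library(MatchThem)  # matching/weighting multiply imputed datasets
library(cobalt)     # balance assessment
library(survey)     # svyglm() for the outcome model

# Example data bundled with MatchThem (contains missing values)
data("osteoarthritis", package = "MatchThem")

# Step 1: multiply impute the missing data (m = 5 imputed datasets)
imputed <- mice(osteoarthritis, m = 5, seed = 123, printFlag = FALSE)

# Step 2: match within each imputed dataset
# (weightthem() takes the same formula and datasets arguments for weighting)
matched <- matchthem(OSP ~ AGE + SEX + BMI + RAC + SMK,
                     datasets = imputed,
                     approach = "within",
                     method = "nearest")

# Step 3: assess covariate balance across the imputed datasets
print(bal.tab(matched, stats = c("m", "ks")))

# Step 4: fit the outcome model in each matched dataset
fits <- with(matched, svyglm(KOA ~ OSP, family = quasibinomial()))

# Step 5: pool the estimates across datasets
pooled <- pool(fits)
print(summary(pooled, conf.int = TRUE))
```

For weighting instead of matching, the same script applies with the matchthem() call swapped for weightthem(); the remaining steps are unchanged. The package's vignette remains the reference for choosing methods and options.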
Acknowledgments
The logo for this package, a trip to the Arctic, was designed by Max Josino. You can view and explore more of his work on his website and Dribbble profile. We sincerely thank Max Josino for his kind contribution. This package relies on the functionality provided by the mice, MatchIt, and WeightIt packages.