Description
Tools and Tests for Experiments with Partially Synthetic Data Sets.
A set of functions to support experimentation with the utility of partially synthetic data sets. All functions compare an observed data set to one or more partially synthetic data sets derived from it in order to (1) check that the data sets have identical attributes, (2) calculate overall and variable-specific perturbation rates, (3) check for potential logical inconsistencies, and (4) calculate confidence intervals and standard errors of desired variables in multiply imputed data sets. The confidence interval and standard error formulas have options for either synthetic data sets or multiply imputed data sets. For more information on the formulas and methods used, see Reiter & Raghunathan (2007) <doi:10.1198/016214507000000932>.
README.md
SynthTools
The goal of SynthTools is to make it easier to measure the utility of partially synthetic and multiply imputed data sets. SynthTools includes functions that verify that original and derived data sets have comparable attributes, compute overall and variable-specific perturbation rates, and compute standard errors and confidence intervals for continuous and categorical variables.
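To make the standard-error step concrete, here is a minimal base R sketch of the usual combining rules. It is not SynthTools' own API. It assumes the textbook formulas: total variance u_bar + b/m for partially synthetic data and u_bar + (1 + 1/m) * b for multiply imputed data, where b is the variance of the m point estimates and u_bar is the mean of their estimated variances. The helper name `combine_estimates`, its arguments, and the numbers are hypothetical and chosen only for illustration.

```r
# Hypothetical helper (not a SynthTools function): combine m point estimates
# `q` and their estimated variances `u` into one estimate, standard error,
# and t-based confidence interval.
combine_estimates <- function(q, u, type = c("synthetic", "imputed"), level = 0.95) {
  type <- match.arg(type)
  m <- length(q)
  stopifnot(m > 1, length(u) == m)

  q_bar <- mean(q)  # combined point estimate
  b     <- var(q)   # between-data-set variance
  u_bar <- mean(u)  # average within-data-set variance

  if (type == "synthetic") {
    # Assumed rule for partially synthetic data
    total_var <- u_bar + b / m
    df <- (m - 1) * (1 + u_bar / (b / m))^2
  } else {
    # Assumed rule for multiply imputed data (Rubin's rules)
    total_var <- u_bar + (1 + 1 / m) * b
    df <- (m - 1) * (1 + u_bar / ((1 + 1 / m) * b))^2
  }

  se   <- sqrt(total_var)
  half <- qt(1 - (1 - level) / 2, df) * se  # half-width of the interval
  list(estimate = q_bar, se = se, df = df, ci = c(q_bar - half, q_bar + half))
}

# Made-up estimates of a mean from m = 5 derived data sets
q <- c(10.2, 9.8, 10.1, 10.3, 9.9)
u <- c(0.25, 0.24, 0.26, 0.25, 0.25)

syn_fit <- combine_estimates(q, u, type = "synthetic")
imp_fit <- combine_estimates(q, u, type = "imputed")
```

With the same inputs, the multiply imputed rule yields a larger standard error than the partially synthetic rule because it weights the between-data-set variance more heavily.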
Installation
You can install the released version of SynthTools from CRAN with:
install.packages("SynthTools")
Example
This basic example shows how to check the comparability of an observed data set and a data set derived from it. PPA is the observed data set and PPAps1 is a partially synthetic data set derived from PPA:
library(SynthTools)
dataComp(PPA, PPAps1)
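A perturbation rate can be read as the share of values that differ between the observed and the derived data set. The base R sketch below illustrates that idea on a toy data frame. It does not use SynthTools functions, and the helper name `perturb_rate` and the toy data are hypothetical. It assumes both data frames have the same dimensions and column names and contain no missing values.

```r
# Hypothetical helper (not a SynthTools function): proportion of values that
# differ between an observed data frame and a partially synthetic copy.
perturb_rate <- function(obs, syn) {
  stopifnot(identical(dim(obs), dim(syn)), identical(names(obs), names(syn)))

  # Compare column by column; values are compared as character strings so
  # that factors with different level sets can still be compared.
  changed <- Map(function(o, s) as.character(o) != as.character(s), obs, syn)

  list(
    by_variable = vapply(changed, mean, numeric(1)),  # rate for each column
    overall     = mean(unlist(changed))               # rate over all cells
  )
}

# Toy observed data and a copy in which two ages were replaced
obs <- data.frame(age = c(23, 35, 47, 52),
                  sex = c("F", "M", "F", "M"),
                  stringsAsFactors = FALSE)
syn <- obs
syn$age[c(2, 4)] <- c(33, 55)

rates <- perturb_rate(obs, syn)
rates$by_variable
rates$overall
```

Here half of the `age` values and none of the `sex` values were changed, so the variable-specific rates differ while the overall rate averages over all eight cells.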