Description

Ease the Implementation of Imputation Methods.

The general workflow of most imputation methods is quite similar. This package provides the shared parts of that workflow to make implementing imputation methods easier. The heart of an imputation method is usually the model it uses. Such models can be defined using the 'parsnip' package or through customized specifications. The remainder of an imputation method consists of more technical specifications, e.g. which columns and rows should be used for imputation and in which order. These technical specifications can be set inside the imputation functions.
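
The shared workflow described above (choose rows, fit a model, predict the missing cells) can be sketched in base R, independent of this package. This is a minimal sketch, assuming a single incomplete column and a linear model from stats::lm(); the helper name impute_column_lm() is hypothetical and not part of imputeGeneric.

```r
# Hypothetical helper (not an imputeGeneric function): impute one column
# with a linear model fitted on the rows where that column is observed.
impute_column_lm <- function(ds, target) {
  predictors <- setdiff(names(ds), target)
  missing_rows <- is.na(ds[[target]])
  # 1. Select the rows used for model building (target observed)
  train <- ds[!missing_rows, , drop = FALSE]
  # 2. Fit the model: target ~ all other columns
  model <- lm(reformulate(predictors, response = target), data = train)
  # 3. Replace the missing cells with the model predictions
  ds[missing_rows, target] <- predict(model, newdata = ds[missing_rows, , drop = FALSE])
  ds
}

# Toy data: Z depends on X, the first 10 values of Z are missing
set.seed(1)
toy <- data.frame(X = rnorm(50))
toy$Z <- 1 + 2 * toy$X + rnorm(50, sd = 0.1)
toy$Z[1:10] <- NA

toy_imp <- impute_column_lm(toy, "Z")
anyNA(toy_imp)
#> [1] FALSE
```

The model in step 2 is the part that differs most between imputation methods; the row selection and the write-back in steps 1 and 3 are the technical parts that stay largely the same.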

imputeGeneric

The goal of imputeGeneric is to ease the implementation of imputation functions.

Installation

You can install the development version of imputeGeneric from GitHub with:

# install.packages("devtools")
devtools::install_github("torockel/imputeGeneric")

Purpose

The aim of imputeGeneric is to make the implementation and usage of imputation methods easier. The main function of the package is impute_iterative(). This function can turn any parsnip model into an imputation method. Furthermore, other customized approaches can be used within the same general imputation framework. For more information, see the documentation of impute_iterative(), impute_supervised() and impute_unsupervised(), and the following examples.

Examples

Simple example

The use of a parsnip model for imputation is demonstrated using regression trees from the rpart package via parsnip (decision_tree("regression")). First, a data set with missing values is created. Then, this data set is imputed once with regression trees using only completely observed rows and columns for the model building.

library(imputeGeneric)
library(parsnip)
# create data set
set.seed(123)
ds_mis <- data.frame(X = rnorm(100), Y = rnorm(100))
ds_mis$Z <- 5 + 2 * ds_mis$X + ds_mis$Y + rnorm(100)
ds_mis$Z[sample.int(100, 30)] <- NA
ds_mis$Y[sample.int(100, 20)] <- NA
# impute data set
ds_imp <- impute_iterative(ds_mis, decision_tree("regression"), max_iter = 1)
anyNA(ds_imp)
#> [1] FALSE

To use another parsnip model instead of regression trees, only the model_spec_parsnip argument has to be changed. For example, use linear_reg() for linear regression.

ds_imp_lm <- impute_iterative(ds_mis, linear_reg(), max_iter = 1)
anyNA(ds_imp_lm)
#> [1] FALSE
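
To see what such a model specification does outside of the imputation framework, the following sketch fits a parsnip specification by hand on the completely observed rows and predicts the missing Z values. It only uses parsnip's own fit() and predict() and is meant as an illustration, not as a description of how impute_iterative() works internally. The data set is recreated so that the snippet runs on its own, and the choice of X as the only predictor is an assumption made here because X is the only complete column.

```r
library(parsnip)

# Recreate the data set from the simple example
set.seed(123)
ds_mis <- data.frame(X = rnorm(100), Y = rnorm(100))
ds_mis$Z <- 5 + 2 * ds_mis$X + ds_mis$Y + rnorm(100)
ds_mis$Z[sample.int(100, 30)] <- NA
ds_mis$Y[sample.int(100, 20)] <- NA

# The specification only describes the model; nothing is fitted yet
spec <- linear_reg()

# Fit on the completely observed rows, using the complete column X as predictor
complete_rows <- ds_mis[complete.cases(ds_mis), ]
fitted_model <- fit(spec, Z ~ X, data = complete_rows)

# predict() returns a tibble; the predictions are in the column .pred
z_missing <- is.na(ds_mis$Z)
z_hat <- predict(fitted_model, new_data = ds_mis[z_missing, "X", drop = FALSE])$.pred
length(z_hat)
#> [1] 30
```

Replacing linear_reg() by decision_tree("regression") in this sketch switches the model without touching the rest of the code, which is the same idea the model_spec_parsnip argument builds on.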

More complex example

Many aspects of the imputation can be specified and customized. The missing values can be imputed initially, e.g. with per-column mean values (initial_imputation_fun = missMethods::impute_mean). In addition, all rows (objects) and all columns can be used for the imputation models (rows_used_for_imputation = "all" and cols_used_for_imputation = "all"). Furthermore, the imputation can be iterative. The iteration stops as soon as either the difference between two consecutive imputed data sets falls below a threshold (stop_fun = stop_ds_difference, stop_fun_args = list(eps = 0.1)) or the maximum number of iterations (max_iter = 5) is reached.

ds_imp2 <- impute_iterative(
  ds_mis, decision_tree("regression"), 
  initial_imputation_fun = missMethods::impute_mean,
  cols_used_for_imputation = "all",
  rows_used_for_imputation = "all",
  stop_fun = stop_ds_difference,
  stop_fun_args = list(eps = 0.1),
  max_iter = 5)
anyNA(ds_imp2)
#> [1] FALSE
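
The interplay of initial imputation, repeated model fitting and a stopping rule can be sketched in base R. This is a rough sketch under stated assumptions and not the package's implementation: the helpers mean_impute(), ds_difference() and iterate_imputation() are hypothetical, the model is a plain lm(), and the mean absolute difference is only one possible way to compare two imputed data sets.

```r
# Hypothetical helpers for illustration; they are not imputeGeneric functions.

# Initial imputation: replace the NAs in every column by the column mean
mean_impute <- function(ds) {
  for (col in names(ds)) {
    ds[[col]][is.na(ds[[col]])] <- mean(ds[[col]], na.rm = TRUE)
  }
  ds
}

# Mean absolute difference between two completed (numeric) data sets
ds_difference <- function(ds_old, ds_new) {
  mean(abs(as.matrix(ds_old) - as.matrix(ds_new)))
}

iterate_imputation <- function(ds_mis, eps = 0.1, max_iter = 5) {
  missing <- is.na(ds_mis)  # remember which cells were originally missing
  ds_imp <- mean_impute(ds_mis)
  for (iter in seq_len(max_iter)) {
    ds_old <- ds_imp
    for (col in names(ds_mis)[colSums(missing) > 0]) {
      # All rows and all other columns are used to build the model
      model <- lm(reformulate(setdiff(names(ds_imp), col), response = col),
                  data = ds_imp)
      rows <- missing[, col]
      ds_imp[rows, col] <- predict(model, newdata = ds_imp[rows, , drop = FALSE])
    }
    # Stop early if the imputed data set hardly changed in this iteration
    if (ds_difference(ds_old, ds_imp) < eps) break
  }
  list(ds = ds_imp, iterations = iter)
}

set.seed(123)
ds <- data.frame(X = rnorm(100), Y = rnorm(100))
ds$Z <- 5 + 2 * ds$X + ds$Y + rnorm(100)
ds$Z[sample.int(100, 30)] <- NA
ds$Y[sample.int(100, 20)] <- NA

res <- iterate_imputation(ds, eps = 0.1, max_iter = 5)
anyNA(res$ds)
#> [1] FALSE
```

Because the initial imputation fills every cell, all rows can serve as training data from the first iteration on; max_iter acts as a safeguard in case the difference never drops below eps.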

Metadata

Version

0.1.0

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows