Description

Efficient Model Functions for Bagging.

Tree- and rule-based models can be bagged (<doi:10.1007/BF00058655>) using this package, and their prediction equations are stored in an efficient format to reduce the size of the model objects and improve speed.

baguette

Introduction

The goal of baguette is to provide efficient functions for bagging (aka bootstrap aggregating) ensemble models.
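
To make the idea of bootstrap aggregating concrete, here is a minimal sketch in base R. It is not how baguette is implemented: it uses lm() as a stand-in base model (baguette bags tree, MARS, and neural network models), and the names n_members, members, member_preds, and bagged_pred are chosen only for this illustration.

```r
# Bagging by hand: fit one model per bootstrap resample, then average predictions.
set.seed(42)
n_members <- 25  # illustrative ensemble size

members <- lapply(seq_len(n_members), function(i) {
  # Bootstrap resample: draw nrow(mtcars) rows with replacement
  rows <- sample(nrow(mtcars), replace = TRUE)
  # Stand-in base model; an assumption made to keep the sketch base-R only
  lm(mpg ~ wt + hp, data = mtcars[rows, ])
})

# One column of predictions per ensemble member, all for the same rows
member_preds <- sapply(members, predict, newdata = mtcars)

# Aggregate: the bagged prediction is the mean across members
bagged_pred <- rowMeans(member_preds)
head(bagged_pred)
```

For a regression outcome the aggregation step is a simple average, as above; for classification, class probabilities or votes would be combined instead.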

The model objects produced by baguette are kept smaller than they would otherwise be through two operations:

  • The butcher package is used to remove object elements that are not crucial to using the models. For example, some models contain copies of the training set or model residuals when created. These are removed to save space.

  • For ensembles whose base models use a formula method, there is a built-in redundancy because each model has an identical terms object. However, each one of these takes up separate space in memory and can be quite large when there are many predictors. The baguette package solves this problem by replacing each terms object with the object from the first model in the ensemble. Because the terms objects are identical across the ensemble members, we get the same functional capabilities using far less memory to save the ensemble.
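
The second operation can be illustrated with a small base R sketch. This is an assumption-laden illustration, not baguette's internal code: it uses lm() fits as stand-in ensemble members, and the names fits, shared_terms, and fits_shared are hypothetical. It only checks that predictions are unchanged when every member points at the first member's terms object.

```r
# Fit a few formula-based models on bootstrap resamples (stand-in ensemble)
set.seed(1)
fits <- lapply(1:5, function(i) {
  rows <- sample(nrow(mtcars), replace = TRUE)
  lm(mpg ~ wt + hp, data = mtcars[rows, ])
})

# Take the terms object from the first member ...
shared_terms <- fits[[1]]$terms

# ... and make every member reference it instead of its own copy
fits_shared <- lapply(fits, function(f) {
  f$terms <- shared_terms
  f
})

# Predictions before and after the replacement should agree
p_before <- sapply(fits, predict, newdata = mtcars)
p_after  <- sapply(fits_shared, predict, newdata = mtcars)
all.equal(p_before, p_after)
```

The check works because the formula, and therefore the terms object, is the same for every member; only the fitted coefficients differ between resamples.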

Installation

You can install the released version of baguette from CRAN with:

install.packages("baguette")

Install the development version from GitHub with:

# install.packages("pak")
pak::pak("tidymodels/baguette")

Available Engines

The baguette package provides engines for the models in the following table.

model     engine  mode
bag_mars  earth   classification
bag_mars  earth   regression
bag_mlp   nnet    classification
bag_mlp   nnet    regression
bag_tree  rpart   classification
bag_tree  rpart   regression
bag_tree  C5.0    classification
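
As a sketch of how the rows of this table map to model specifications, the following builds two specifications without fitting them. The object names c5_spec and mars_spec and the choice of times = 10 are illustrative assumptions; fitting would additionally require the corresponding engine packages to be installed.

```r
library(baguette)

# Bagged C5.0 trees for classification; `times` sets the number of ensemble members
c5_spec <-
  bag_tree() %>%
  set_engine("C5.0", times = 10) %>%
  set_mode("classification")

# Bagged MARS models for regression, using the earth engine
mars_spec <-
  bag_mars() %>%
  set_engine("earth", times = 10) %>%
  set_mode("regression")

c5_spec
mars_spec
```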

Example

Let’s build a bagged decision tree model to predict a continuous outcome.

library(baguette)

bag_tree() %>% 
  set_engine("rpart") # C5.0 is also available here
#> Bagged Decision Tree Model Specification (unknown mode)
#> 
#> Main Arguments:
#>   cost_complexity = 0
#>   min_n = 2
#> 
#> Computational engine: rpart

set.seed(123)
bag_cars <- 
  bag_tree() %>% 
  set_engine("rpart", times = 25) %>% # 25 ensemble members 
  set_mode("regression") %>% 
  fit(mpg ~ ., data = mtcars)

bag_cars
#> parsnip model object
#> 
#> Bagged CART (regression with 25 members)
#> 
#> Variable importance scores include:
#> 
#> # A tibble: 10 × 4
#>    term  value std.error  used
#>    <chr> <dbl>     <dbl> <int>
#>  1 disp  905.       51.9    25
#>  2 wt    889.       56.8    25
#>  3 hp    814.       48.7    25
#>  4 cyl   581.       42.9    25
#>  5 drat  540.       54.1    25
#>  6 qsec  281.       53.2    25
#>  7 vs    150.       51.2    20
#>  8 carb   84.4      30.6    25
#>  9 gear   80.0      35.8    23
#> 10 am     51.5      22.9    18

The models also return aggregated variable importance scores.
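
Once fitted, the ensemble can be used for prediction like any other parsnip model. The sketch below refits the model from the example so that it runs on its own; new_cars and preds are names introduced here for illustration, and the first five rows of mtcars simply stand in for new data.

```r
library(baguette)

set.seed(123)
bag_cars <-
  bag_tree() %>%
  set_engine("rpart", times = 25) %>%
  set_mode("regression") %>%
  fit(mpg ~ ., data = mtcars)

# Stand-in for new observations
new_cars <- mtcars[1:5, ]

# predict() on a parsnip model takes `new_data` and returns a tibble;
# for regression the predictions are in the `.pred` column
preds <- predict(bag_cars, new_data = new_cars)
preds
```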

Contributing

This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.

Metadata

Version

1.0.2

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows