Description

Explore the Exact Bootstrap Method.

Researchers often use the bootstrap to understand a sample drawn from a population with unknown distribution. The exact bootstrap method is a practical tool for exploring the distribution of small sample size data. For a sample of size n, the exact bootstrap method generates the entire space of n to the power of n resamples and calculates all realizations of the selected statistic. The 'exactamente' package includes functions for implementing two bootstrap methods, the exact bootstrap and the regular bootstrap. The exact_bootstrap() function applies the exact bootstrap method following methodologies outlined in Kisielinska (2013) <doi:10.1007/s00180-012-0350-0>. The reg_bootstrap() function offers a more traditional bootstrap approach, where users can determine the number of resamples. The e_vs_r() function allows users to directly compare results from these bootstrap methods. To augment user experience, 'exactamente' includes the function exactamente_app() which launches an interactive 'shiny' web application. This application facilitates exploration and comparison of the bootstrap methods, providing options for modifying various parameters and visualizing results.

exactamente

‘exactamente’ is an R package that offers a collection of tools to assist researchers and data analysts in exploring bootstrap methods on small sample size data. Bootstrap methods are widely used for estimating the sampling distribution of an estimator by resampling with replacement from the original sample.

This package is focused particularly on the exact bootstrap as described in Kisielinska (2013). This method is advantageous for bootstrapping small data sets where standard methods might be inadequate.

For a given sample size N, the exact bootstrap generates all N^N resamples, including permutations such as [5, 5, 3], [5, 3, 5], and [3, 5, 5] as distinct resamples.
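
As a rough illustration of what "all N^N resamples" means, the following base R sketch enumerates every ordered resample of the three-value sample used later in this document. It does not call 'exactamente'; the variable names (idx, resamples, growth) are hypothetical, and the package's internal implementation may differ.

```r
x <- c(1, 2, 7)
n <- length(x)

# One set of indices 1..n per draw; expand.grid() forms every combination,
# so reorderings such as (1, 2, 7) and (7, 2, 1) are kept as separate rows.
idx <- expand.grid(rep(list(seq_len(n)), n))

# Map the indices back to data values: each row of `resamples` is one resample.
resamples <- matrix(x[as.matrix(idx)], nrow = nrow(idx))

nrow(resamples)  # 27, i.e. 3^3

# The number of resamples grows as n^n, which is why the method targets small samples.
growth <- (1:8)^(1:8)
growth  # 1, 4, 27, 256, 3125, 46656, 823543, 16777216
```

Already at a sample size of 8 there are more than 16 million resamples, so full enumeration is only practical for small samples.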

Furthermore, ‘exactamente’ provides a standard bootstrap function where the user can specify the desired number of resamples, allowing for direct comparison between the exact method and conventional bootstrap techniques.
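
For contrast, a conventional bootstrap draws a fixed number of random resamples rather than enumerating all of them. The sketch below uses only base R; n_resamples and boot_means are hypothetical names, and this is not the code behind reg_bootstrap().

```r
set.seed(183)

x <- c(1, 2, 7)
n_resamples <- 10000  # hypothetical name for the user-chosen number of resamples

# Draw n_resamples resamples with replacement and record the mean of each one.
boot_means <- replicate(
  n_resamples,
  mean(sample(x, size = length(x), replace = TRUE))
)

length(boot_means)  # 10000
mean(boot_means)    # close to the sample mean, 10/3
```

Because the draws are random, this sketch will not necessarily reproduce the package's printed numbers, even with the same seed.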

Here is the step-by-step process that the ‘exactamente’ package functions use to perform their bootstrap resampling and compute the summary statistics and density estimates:

  1. Bootstrap Resampling: Each bootstrap method starts by generating a collection of bootstrap samples, also known as resamples.
     • For exact_bootstrap(), the function generates N^N resamples, where each different permutation is treated as a unique resample.
     • For reg_bootstrap(), the function generates a user-specified number of resamples by sampling with replacement from the original sample.

  2. Compute the Resample Statistics: For each resample, the function computes a statistic of interest, such as the mean or the median. The function uses the user-specified ‘anon’ function for this computation. The ‘anon’ function defaults to the mean if not specified by the user.

  3. Process Bootstrap Statistics: This step is common to both methods. After generating the resample statistics, each function calls process_bootstrap_stats() to derive the summary statistics and density estimates.

  4. Density Estimation: process_bootstrap_stats() first calculates the kernel density estimate of the bootstrap statistics using the stats::density() function. If user-specified ‘density_args’ are provided, they are passed to the density function. The density estimate provides a smoothed representation of the distribution of the bootstrap statistics.

  5. Summary Statistics: After generating the density estimate, process_bootstrap_stats() computes various summary statistics for the bootstrap statistics:

     • The total number of resamples (nres).
     • The mode, defined as the value with the highest frequency in the bootstrap statistics.
     • The median, defined as the middle value when the bootstrap statistics are sorted.
     • The mean, defined as the average of the bootstrap statistics.
     • The standard deviation, which measures the dispersion of the bootstrap statistics.
     • The lower and upper bootstrap confidence interval (lCI and uCI), defined by the user-specified percentiles (‘lb’ and ‘ub’). The default is the 2.5th percentile and the 97.5th percentile, giving a 95% confidence interval.

  6. Return Result: Lastly, process_bootstrap_stats() returns a list containing the density estimate and the summary statistics. The bootstrap function then assigns a specific class to the result (‘extboot’ or ‘regboot’) and returns the result to the user.
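
The steps above can be sketched in base R as follows. This is an illustrative outline, not the package's source: summarize_boot_stats() and its return structure are hypothetical, and only the argument names lb, ub, and density_args are taken from the description. One further assumption is that the mode is read off the peak of the density estimate; this choice was made because the mode printed in the example output further down (3.303687) is not one of the realized resample means.

```r
# Hypothetical helper mirroring the described steps; not the package's own code.
summarize_boot_stats <- function(boot_stats, lb = 0.025, ub = 0.975,
                                 density_args = list()) {
  # Kernel density estimate; extra arguments are forwarded to stats::density().
  dens <- do.call(stats::density, c(list(x = boot_stats), density_args))

  summary_stats <- data.frame(
    nres   = length(boot_stats),
    mode   = dens$x[which.max(dens$y)],  # assumption: location of the density peak
    median = stats::median(boot_stats),
    mean   = mean(boot_stats),
    sd     = stats::sd(boot_stats),
    lCI    = unname(stats::quantile(boot_stats, lb)),
    uCI    = unname(stats::quantile(boot_stats, ub))
  )

  list(dens = dens, stats = summary_stats)
}

# Usage: the mean of every ordered resample of c(1, 2, 7).
x <- c(1, 2, 7)
idx <- expand.grid(rep(list(seq_along(x)), length(x)))
exact_means <- rowMeans(matrix(x[as.matrix(idx)], nrow = nrow(idx)))

res <- summarize_boot_stats(exact_means)
res$stats
```

For this sample the sketch gives nres = 27, a mean and median of 3.333333, a standard deviation of about 1.54422, and percentile limits of about 1.216667 and 5.916667.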

By providing these detailed outputs, the ‘exactamente’ package enables users to thoroughly investigate the characteristics of the bootstrap distribution and the behavior of the bootstrap estimator under different resampling methods.

Installation

You can install the released version of ‘exactamente’ from CRAN:

install.packages("exactamente")

You can install the development version of ‘exactamente’ from GitHub like so:

# install.packages("devtools")
devtools::install_github("mightymetrika/exactamente")

Example Usage

After installation, you can load the ‘exactamente’ package using the library function.

library(exactamente)

To use the fundamental tools provided by ‘exactamente’, begin by creating a numeric vector of data. You can then use the _bootstrap functions to obtain objects holding summary statistics and density estimates of the respective bootstrap distributions. In the following example, we use the [1, 2, 7] sample vector and take the mean as our statistic of interest.

data <- c(1, 2, 7)

# Run exact bootstrap
e_res <- exact_bootstrap(data)

# Run regular bootstrap
set.seed(183)
r_res <- reg_bootstrap(data)

Both _bootstrap functions come with plot() and summary() methods for further investigation of the bootstrap results.

res <- list(e_res, r_res)

lapply(res, plot)
#> [[1]]
#> 
#> [[2]]
lapply(res, summary)
#> [[1]]
#>            Method nres     mode   median     mean      sd      lCI      uCI
#> 1 exact_bootstrap   27 3.303687 3.333333 3.333333 1.54422 1.216667 5.916667
#> 
#> [[2]]
#>          Method  nres     mode   median   mean       sd lCI uCI
#> 1 reg_bootstrap 10000 3.335788 3.333333 3.3368 1.518024   1   7
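
The two methods report different interval limits above, and a short calculation suggests why. The sketch below assumes the percentiles are computed with the default interpolation of stats::quantile() (type 7), which is consistent with the printed values; the variable names are hypothetical.

```r
n_res <- 3^3  # 27 exact resamples of c(1, 2, 7)

# Probability that a random resample is (1, 1, 1), whose mean is 1.
p_all_min <- (1 / 3)^3  # 1/27, about 0.037
p_all_min > 0.025       # TRUE: more than 2.5% of random resamples have mean 1

# Type-7 quantile positions among the 27 sorted exact means.
pos_lo <- 1 + 0.025 * (n_res - 1)  # 1.65
pos_hi <- 1 + 0.975 * (n_res - 1)  # 26.35

# The two smallest sorted means are 1 and 4/3; the two largest are 16/3 and 7.
lci_exact <- 1 + (pos_lo - 1) * (4 / 3 - 1)        # 1.216667
uci_exact <- 16 / 3 + (pos_hi - 26) * (7 - 16 / 3)  # 5.916667
```

With 10000 random resamples, roughly 3.7% of the means equal 1 and roughly 3.7% equal 7, so the 2.5th and 97.5th percentiles fall on those extreme values. With only 27 enumerated means, the same percentiles are interpolated between the two smallest and the two largest sorted values.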

‘exactamente’ also includes the e_vs_r() function, which enables direct comparison between the two methods using the same data set.

set.seed(183)
comp_res <- e_vs_r(data)
comp_res$comp_plot
comp_res$summary_table
#>            Method  nres     mode   median     mean       sd      lCI      uCI
#> 1 exact_bootstrap    27 3.303687 3.333333 3.333333 1.544220 1.216667 5.916667
#> 2   reg_bootstrap 10000 3.335788 3.333333 3.336800 1.518024 1.000000 7.000000

Interactive Shiny App

One of the exciting features of the ‘exactamente’ package is the inclusion of an interactive Shiny app. The app allows you to visually explore and compare the bootstrap methods. It is designed with a user-friendly interface and offers real-time results visualization.

You can access this app with the following command:

exactamente_app()

References

Kisielinska, J. (2013). The exact bootstrap method shown on the example of the mean and variance estimation. Computational Statistics, 28, 1061–1077.

Metadata

Version

0.1.1

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows