CIPerm: Computationally-Efficient Confidence Intervals for Mean Shift from Permutation Methods
This R package implements computationally-efficient construction of confidence intervals from permutation tests or randomization tests for simple differences in means. In other words, if we set up a permutation or randomization test to evaluate $H_0: \mu_A - \mu_B = 0$, then how can we use that same single set of permutations to cheaply construct a CI for the $(\mu_A - \mu_B)$ parameter?
The method is based on Minh D. Nguyen's 2009 MS thesis, "Nonparametric Inference using Randomization and Permutation Reference Distribution and their Monte-Carlo Approximation," https://doi.org/10.15760/etd.7798. See the nguyen vignette for a brief summary of the method and for our replication of Nguyen's results.
Note that our R function arguments and outputs are structured differently than the similarly-named R functions in Nguyen (2009), but the results are equivalent.
A copy of our useR! 2022 conference poster about the package is in the data-raw folder of our GitHub repo.
Installation
Install a stable version of the package from CRAN:
install.packages("CIPerm")
Or, for the latest development version, install directly from GitHub:
# install.packages("remotes")
remotes::install_github("ColbyStatSvyRsch/CIPerm", build_vignettes = TRUE)
Usage
First use dset(x, y) to tabulate summary statistics for each permutation. Then pass the results into cint() to compute a confidence interval.
x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
demo <- dset(x, y)
cint(dset = demo, conf.level = .95, tail = "Two")
#> [1] -21 3
You can also pass the results of dset() into pval() to calculate p-values for a difference in means (value = "m"), a difference in medians (value = "d"), or the Wilcoxon rank sum test (value = "w").
pval(dset = demo, tail = "Left", value = "m")
#> [1] 0.08571429
pval(dset = demo, tail = "Left", value = "d")
#> [1] 0.08571429
pval(dset = demo, tail = "Left", value = "w")
#> [1] 0.1142857
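To see what the left-tailed p-value for the difference in means represents, here is a minimal base-R sketch that does not use CIPerm at all. It assumes every one of the choose(7, 4) = 35 ways of splitting the pooled data into groups of size 4 and 3 is enumerated, which is consistent with the 0.08571429 (that is, 3/35) shown above. The variable names are ours, for illustration only.

```r
x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
pooled <- c(x, y)

# Observed difference in means, mean(x) - mean(y)
obs_diff <- mean(x) - mean(y)

# Each column of `splits` holds the indices assigned to the first group
splits <- combn(length(pooled), length(x))

# Difference in means for every possible reassignment of group labels
perm_diffs <- apply(splits, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))

# Left tail: share of splits at least as small as the observed difference
# (small tolerance guards against floating-point ties)
p_left <- mean(perm_diffs <= obs_diff + 1e-12)
p_left
#> [1] 0.08571429
```

Only 3 of the 35 splits give a difference at or below the observed one, hence 3/35.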
See also our naive vignette for timing comparisons of Nguyen's one-pass method. In the "naive" approach, you would run many separate permutation tests with different null values for $(\mu_A - \mu_B)$, and your CI would consist of those values where the null hypothesis was not rejected. The vignette shows that if you need to check more than a few null values and your dataset isn't trivially small, Nguyen's method can be considerably faster than the naive approach.
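The naive approach described above can be sketched in base R as follows. This is an illustration under our own assumptions, not the code used in the vignette: perm_pvalue() and naive_ci() are hypothetical helper names, the test is a two-sided exhaustive permutation test on the absolute difference in means, and the grid of null values is arbitrary. Because of the grid resolution and the choice of two-sided p-value, its endpoints need not match cint() exactly.

```r
# Two-sided exhaustive permutation p-value for H0: mean(x) - mean(y) = delta0.
# (Hypothetical helper, not part of CIPerm.)
perm_pvalue <- function(x, y, delta0 = 0) {
  x0 <- x - delta0                      # under H0 the shifted groups share a mean
  pooled <- c(x0, y)
  obs <- mean(x0) - mean(y)
  idx <- combn(length(pooled), length(x0))
  diffs <- apply(idx, 2, function(i) mean(pooled[i]) - mean(pooled[-i]))
  mean(abs(diffs) >= abs(obs) - 1e-12)  # tolerance guards against float ties
}

# Naive CI: rerun the full permutation test at every null value in `grid`
# and report the range of values that are not rejected at level alpha.
naive_ci <- function(x, y, grid, alpha = 0.05) {
  pvals <- vapply(grid, function(d) perm_pvalue(x, y, d), numeric(1))
  range(grid[pvals > alpha])
}

x <- c(19, 22, 25, 26)
y <- c(23, 33, 40)
ci <- naive_ci(x, y, grid = seq(-30, 15, by = 0.5))
ci
```

Each grid point repeats the whole enumeration, so the cost grows with the number of null values checked. That repeated work is what the one-pass method avoids.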
References
Ernst, M.D. (2004). "Permutation Methods: A Basis for Exact Inference," Statistical Science, vol. 19, no. 4, 676-685, DOI:10.1214/088342304000000396.
Nguyen, M.D. (2009). "Nonparametric Inference using Randomization and Permutation Reference Distribution and their Monte-Carlo Approximation" [unpublished MS thesis; Mara Tableman, advisor], Portland State University. Dissertations and Theses. Paper 5927. DOI:10.15760/etd.7798.
Tupaj, E. and Wieczorek, J. (2022). "CIPerm: An R Package for Computationally Efficient Confidence Intervals from Permutation Test," poster presented at the useR! Conference, held virtually, June 22, 2022. https://user2022.r-project.org/program/posters/