Description

Argmin Inference over a Discrete Candidate Set.

Provides methods to construct frequentist confidence sets with valid marginal coverage for identifying the population-level argmin or argmax from IID data. For instance, given an n by p loss matrix (where n is the sample size and p is the number of models), the CS.argmin() method produces a discrete confidence set that contains the model with the minimal (best) expected risk with the desired probability. The argmin.HT() method tests whether a specific model should be included in such a confidence set. The main implemented method was proposed by Tianyu Zhang, Hao Lee and Jing Lei (2024), "Winners with confidence: Discrete argmin inference with an application to model selection".
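As a rough illustration of the n by p loss matrix described above, the following sketch builds one from three fixed prediction rules under squared-error loss. The data-generating process, the `candidates` list, and the rule names are assumptions made for this example and are not part of the package. The call to CS.argmin() is guarded so that the block also runs without argminCS installed.

```r
# Sketch: build an n-by-p loss matrix from p candidate models (illustrative names only).
set.seed(1)
n <- 200
x <- runif(n)
y <- sin(2 * pi * x) + rnorm(n, sd = 0.3)

# Hypothetical candidates: fixed prediction rules that are not fitted on (x, y),
# so the rows of the resulting loss matrix are IID.
candidates <- list(
  constant = function(x) rep(0, length(x)),
  linear   = function(x) 1 - 2 * x,
  sine     = function(x) sin(2 * pi * x)
)

# Column j holds the squared-error loss of candidate j on each observation.
loss <- sapply(candidates, function(f) (y - f(x))^2)
dim(loss)       # n rows (samples), p columns (models)
colMeans(loss)  # empirical risk of each candidate

# Optional: pass the matrix to the package if it is installed.
if (requireNamespace("argminCS", quietly = TRUE)) {
  print(argminCS::CS.argmin(unname(loss), method = "SML"))
}
```

The candidates are deliberately fixed in advance: if they were trained on the same observations used to compute the losses, the rows of the loss matrix would no longer be independent.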

argminCS

Welcome! This repository contains the source code for performing argmin hypothesis tests.

Overview

The goal of argminCS is to produce a confidence set for the argmin from IID samples with valid type I error control, while retaining good statistical power. In particular, the method ‘softmin.LOO’ is the main innovative component of our paper Winners with Confidence: Argmin Inference over a High-Dimensional Discrete Candidate Set. Several other methods are also implemented in the package to ease method comparison and simulations.
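One way to see what type I error control means in practice is a small Monte Carlo check of how often the confidence set contains the true argmin. The sketch below is an illustration under stated assumptions: `estimate_coverage` is a hypothetical helper, the simulation design mirrors the README example (means (1:p)/p, identity covariance), and CS.argmin() is assumed to return the indices of the retained dimensions.

```r
# Hypothetical helper: fraction of simulated data sets whose confidence set
# contains the true argmin (dimension 1 under the means used here).
estimate_coverage <- function(cs_fun, n = 200, p = 20, reps = 100) {
  mu <- (1:p) / p
  hits <- logical(reps)
  for (b in seq_len(reps)) {
    # IID rows with independent unit-variance coordinates and mean vector mu
    data <- matrix(rnorm(n * p), nrow = n) +
      matrix(mu, nrow = n, ncol = p, byrow = TRUE)
    hits[b] <- 1 %in% cs_fun(data)
  }
  mean(hits)
}

set.seed(108)
if (requireNamespace("argminCS", quietly = TRUE)) {
  print(estimate_coverage(function(d) argminCS::CS.argmin(d, method = "SML")))
}
```

Passing the confidence-set procedure as a function argument makes it easy to compare methods on the same simulation design.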

Citation

If you use the argminCS package for your research or any experiment, please cite our paper “Winners with Confidence: Argmin Inference over a High-Dimensional Discrete Candidate Set”.

Installation

You can install the development version of argminCS from GitHub with:

# install.packages("devtools")
devtools::install_github("xu3cl4/argminCS")

Example

This basic example tests whether a given dimension is the argmin of a mean vector and then constructs a discrete confidence set for the argmin:

library(argminCS)
dimension <- 4
sample.size <- 200
p <- 20
mu <- (1:p)/p
cov <- diag(length(mu))
set.seed(108)
data <- MASS::mvrnorm(sample.size, mu, cov)

## test whether 'dimension' is the argmin, using the (default) softmin.LOO method
difference.matrix <- matrix(rep(data[, dimension], p-1), 
                            ncol = p-1, 
                            byrow = FALSE) - data[, -dimension]
argmin.HT(difference.matrix, method='SML')
#> $test.stat.scale
#> [1] 2.274283
#> 
#> $critical.value
#> [1] 1.644854
#> 
#> $std
#> [1] 1
#> 
#> $ans
#> [1] "Reject"
#> 
#> $lambda
#> [1] 1.94368e+11
#> 
#> $lambda.capped
#> [1] TRUE
#> 
#> $residual.slepian
#> [1] 0
#> 
#> $variance.bound
#> [1] 0.0842562

## rather than performing a hypothesis test for a specific dimension, 
## one can directly generate a discrete confidence set with
CS.argmin(data, method='SML')
#> [1] 1 2
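The difference matrix built in the example above can be wrapped in a small helper, and a confidence set can then be sketched by keeping every dimension whose test is not rejected. This is only an illustration: `difference_matrix` is a hypothetical helper, the `$ans` field is assumed to equal "Reject" on rejection as in the printed output, and the resulting set is not guaranteed to match what CS.argmin() returns.

```r
# Hypothetical helper: differences between column r and every other column.
difference_matrix <- function(data, r) {
  # data[, r] has length nrow(data), so it is recycled down each remaining column.
  data[, r] - data[, -r, drop = FALSE]
}

set.seed(108)
toy <- matrix(rnorm(5 * 4), nrow = 5)
# Same result as the explicit matrix(rep(...)) construction in the example.
explicit <- matrix(rep(toy[, 2], 3), ncol = 3, byrow = FALSE) - toy[, -2]
stopifnot(isTRUE(all.equal(difference_matrix(toy, 2), explicit)))

# Test inversion: keep every dimension that is not rejected.
if (requireNamespace("argminCS", quietly = TRUE)) {
  p <- 20
  data <- matrix(rnorm(200 * p), nrow = 200) +
    matrix((1:p) / p, nrow = 200, ncol = p, byrow = TRUE)
  kept <- Filter(function(r) {
    res <- argminCS::argmin.HT(difference_matrix(data, r), method = "SML")
    res$ans != "Reject"
  }, seq_len(p))
  print(kept)
}
```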

Detailed Tutorial

Here is a detailed tutorial.

For further details on the methods, we encourage users to install the package and consult the function documentation.

Loss Matrices

Two differentially private loss matrices are provided in this package for reproducibility. You can access them via:

file_2023 <- system.file("extdata", "loss_matrix_2023_differentially_private.csv", package = "argminCS")
loss.2023 <- read.csv(file_2023)
head(loss.2023[,1:5])
#>   V1 V2 V3 V4 V5
#> 1  0  0  0  0  0
#> 2  0  0  0  0  0
#> 3  0  0  0  0  0
#> 4  0  1  0  0  0
#> 5  0  0  0  0  0
#> 6  0  0  0  0  0
dim(loss.2023)
#> [1] 183  44

file_2024 <- system.file("extdata", "loss_matrix_2024_differentially_private.csv", package = "argminCS")
loss.2024 <- read.csv(file_2024)
head(loss.2024[,1:5])
#>   V1 V2 V3 V4 V5
#> 1  0  0  0  1  0
#> 2  0  0  0  0  0
#> 3  0  0  0  0  0
#> 4  0  0  0  0  0
#> 5  0  0  0  0  0
#> 6  0  0  0  0  0
dim(loss.2024)
#> [1] 1236   39
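Before running argmin inference on these matrices, it can help to look at the empirical risk of each column. The sketch below uses a hypothetical helper, `summarize_risks`, demonstrated on a small made-up 0/1 matrix (not package data). It assumes that each column corresponds to one candidate and each row to one observation, as in the loss-matrix description.

```r
# Hypothetical helper, not part of argminCS.
summarize_risks <- function(loss) {
  loss <- as.matrix(loss)   # read.csv() returns a data frame
  risks <- colMeans(loss)   # empirical risk of each column (candidate)
  list(risks = risks, empirical.argmin = unname(which.min(risks)))
}

# Toy 0/1 loss matrix for illustration only.
toy <- data.frame(V1 = c(0, 1, 0, 0), V2 = c(1, 1, 0, 1), V3 = c(0, 0, 0, 0))
summarize_risks(toy)$empirical.argmin

# With the bundled data, assuming the package is installed:
if (requireNamespace("argminCS", quietly = TRUE)) {
  file_2023 <- system.file("extdata", "loss_matrix_2023_differentially_private.csv",
                           package = "argminCS")
  loss.2023 <- as.matrix(read.csv(file_2023))
  print(head(sort(colMeans(loss.2023)), 5))   # five smallest empirical risks
}
```

The empirical argmin alone carries no uncertainty statement. Converting the data frame with as.matrix() gives a numeric matrix that could then be passed to CS.argmin() in the same way as `data` in the example.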

Key References

Chernozhukov, V., D. Chetverikov, and K. Kato. 2013. “Testing Many Moment Inequalities.” RePEc.

Dey, N., M. Ryan, and J. P. Williams. 2024. “Anytime-Valid Generalized Universal Inference on Risk Minimizers.” arXiv.org. https://doi.org/10.48550/arxiv.2402.00202.

Futschik, Andreas, and Georg Pflug. 1995. “Confidence Sets for Discrete Stochastic Optimization.” Annals of Operations Research 56 (1): 95–108. https://doi.org/10.1007/BF02031702.

Gupta, Shanti S. 1965. “On Some Multiple Decision (Selection and Ranking) Rules.” Technometrics 7 (2): 225–45. https://doi.org/10.1080/00401706.1965.10490251.

Lei, Jing. 2020. “Cross-Validation with Confidence.” Journal of the American Statistical Association 115 (532): 1978–97. https://doi.org/10.1080/01621459.2019.1672556.

Mogstad, Magne, Joseph P Romano, Azeem M Shaikh, and Daniel Wilhelm. 2024. “Inference for Ranks with Applications to Mobility Across Neighbourhoods and Academic Achievement Across Countries.” Review of Economic Studies 91 (1): 476–518.

Zhang, Tianyu, Hao Lee, and Jing Lei. 2024. “Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection.” arXiv Preprint arXiv:2408.02060.
