Testing Two-Sample Mean in High Dimension.

Description

Implements the high-dimensional two-sample tests proposed by Zhang (2019) <http://hdl.handle.net/2097/40235> and by Srivastava, Katayama, and Kano (2013) <doi:10.1016/j.jmva.2012.08.014>. These tests are suited to high-dimensional data from two populations, where the classical multivariate Hotelling's T-square test fails because the sample sizes are smaller than the dimensionality. In this case, the ZWL and ZWLm tests of Zhang (2019), available through zwl_test() in this package, provide a reliable and powerful test.

README.md

highDmean

The highDmean package implements the high-dimensional two-sample test proposed by Zhang and Wang (2020), “Result consistency of high dimensional two-sample tests applied to gene ontology terms with gene sets”. The classical test for equality of two multivariate means is Hotelling’s T-square test. When the dimensionality exceeds the sample sizes, Hotelling’s test fails because the sample covariance matrix is singular. In this case, the test proposed by Zhang and Wang (2020), available as zwl_test() in this package, remains applicable and provides a reliable and powerful test. The package also implements the test proposed by Srivastava, Katayama, and Kano (2013), “A two sample test in high dimensional data”.
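The singularity mentioned above can be checked numerically with base R alone. The sketch below is only an illustration: the sample sizes and dimension are borrowed from the Example section, and the data are simulated standard normal draws, not anything taken from the package. It builds the pooled sample covariance matrix that Hotelling's statistic would need to invert and compares its rank with p.

```r
# Simulated data with more variables than observations (sizes are
# illustrative and match the README example: n = 45, m = 60, p = 300).
set.seed(1)
n <- 45; m <- 60; p <- 300
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y <- matrix(rnorm(m * p), nrow = m, ncol = p)

# Pooled sample covariance matrix used by Hotelling's T-square statistic.
S_pooled <- ((n - 1) * cov(X) + (m - 1) * cov(Y)) / (n + m - 2)

# The rank of S_pooled cannot exceed n + m - 2, which is below p here,
# so the p x p matrix has no inverse.
rank_S <- qr(S_pooled)$rank
rank_bound <- n + m - 2
is_singular <- rank_S < p
print(c(rank = rank_S, bound = rank_bound, p = p))
```

Because the rank is bounded by n + m - 2, any setting with p larger than that bound leaves Hotelling's statistic undefined, which is the situation the tests in this package target.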

Installation

You can install the released version of highDmean from CRAN with:

install.packages("highDmean")

Example

This basic example simulates one pair of samples with equal mean vectors (n = 45, m = 60, p = 300) and applies zwl_test() to them:

library(highDmean)
data <- buildData(n = 45, m = 60, p = 300,
                  muX = rep(0, 300), muY = rep(0, 300),
                  dep = 'IND', S = 1, innov = rnorm)
zwl_test(data[[1]]$X, data[[1]]$Y, order = 2)
#> $statistic
#> [1] 0.7534648
#> 
#> $pvalue
#> [1] 0.4511707
#> 
#> $Tn
#> [1] 1.08859
#> 
#> $var
#> [1] 0.007897337

Main functions

The functions zwl_test() and SKK_test() accept n by p and m by p data matrices with sample data from the first and second populations and return test statistics and p-values for the null hypothesis of equal means.
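As a rough sketch of how these functions might be called on your own data rather than on buildData() output, the snippet below creates two hypothetical matrices in base R and passes them to both tests. The zwl_test() call follows the form shown in the Example section; calling SKK_test() with the same two matrices and no further arguments is an assumption based on the description above, and the structure of its return value is not assumed here, so it is only inspected with str(). The calls run only if the package is installed.

```r
# Two hypothetical samples stored as matrices: rows are observations,
# columns are the p variables (the same p in both samples).
set.seed(2)
n <- 20; m <- 25; p <- 100
X <- matrix(rnorm(n * p), nrow = n, ncol = p)
Y <- matrix(rnorm(m * p, mean = 0.3), nrow = m, ncol = p)  # shifted means

# Basic shape check before testing.
stopifnot(is.matrix(X), is.matrix(Y), ncol(X) == ncol(Y))

# Run the tests only if the package is available.
if (requireNamespace("highDmean", quietly = TRUE)) {
  # Call form taken from the Example section.
  zwl_res <- highDmean::zwl_test(X, Y, order = 2)
  # Assumed to accept the same two matrices (not confirmed here).
  skk_res <- highDmean::SKK_test(X, Y)
  print(zwl_res$pvalue)
  str(skk_res)
}
```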

The buildData() function simulates high-dimensional data in the two-population setting with specified sample sizes, numbers of components, covariance structure, etc., and the functions zwl_sim() and SKK_sim() return test statistic values and p-values for lists of simulated data sets generated by buildData().
