Description
Tools for Hypothesis Testing Based on Hypergeometric Intersection Distributions.
Description
Hypergeometric Intersection distributions are a broad group of distributions that describe the probability of picking intersections when drawing independently from two (or more) urns containing variable numbers of balls belonging to the same n categories. <arXiv:1305.0717>.
README.md
hint
Summary
hint
is an R
package for performing hypothesis testing based on Hypergeometric Intersection distributions. For example if you had three gene sets arising from three separate experiments with x
genes shared between the three, you could determine the probability of arriving at this intersection size by chance. See the companion paper for more information.
Installation
install.packages("hint")
Usage
library(hint)
## The probability of an intersection of size 5 or greater
## when sampling 15, 8, and 7 balls from three urns
## each with 1 ball in each of 29 categories.
phint(29, c(15, 8, 7), vals = 5)
v cum.p
1 5 0.0002289938
## Formalising a hypothesis test using 'hint.test'.
# Categories given in the first column.
# Numbers of balls in each category given in subsequent columns: each column representing an urn.
dd <- data.frame(categories = letters[1:20],
urn1_count = rep(1,20),
urn2_count = rep(1,20))
tt <- hint.test(dd, letters[1:9],
letters[4:15], alternative = "greater")
print(tt)
Hypergeometric intersection test
Parameters:
n a b q v
20 9 12 0 6
P(X >= v) = 0.4649917
plot(tt)
## Allow duplicates in the second urn.
dd <- data.frame(letters[1:20],
rep(1,20),
c(rep(1,4), rep(2,16)))
tt <- hint.test(dd, letters[1:9],
letters[9:14], alternative = "less")
print(tt)
Hypergeometric intersection test
Parameters:
n a b q v
20 9 6 16 1
P(X <= v) = 0.1596769
References
Kalinka (2013). The probability of drawing intersections: extending the hypergeometric distribution. arXiv.1305.0717