Description

Fast Mutual Information Based Independence Test.

A mutual information estimator based on the k-nearest neighbor method proposed by A. Kraskov, et al. (2004) <doi:10.1103/PhysRevE.69.066138> for measuring general dependence. The time complexity of the estimator is only quadratic in the sample size, which makes it faster than other dependence statistics. In addition, an implementation of a mutual information based independence test is provided for analyzing multivariate data in Euclidean space (T. B. Berrett, et al. (2019) <doi:10.1093/biomet/asz024>); furthermore, we extend it to handle datasets in metric spaces.

Fast Mutual Information based independence Test (fastmit)

Introduction

A fundamental problem in data mining and statistical analysis is:

  • Are two random variables dependent?

The fastmit package provides a solution to this problem. It implements the kNN method described by Kraskov et al. (2004) to estimate the empirical mutual information, and it uses a permutation test to detect whether two random variables are independent. The core functions in the fastmit package are mi and mi.test.

These functions, which are based on mutual information, have two main advantages:

  • They are applicable to complex data in metric spaces.

  • They are faster than other statistics (e.g., distance covariance and ball covariance), which is an advantage for large samples.
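The kNN estimate mentioned above can be illustrated with a minimal base R sketch of the first estimator in Kraskov et al. (2004). This is not the fastmit implementation: `ksg_mi` is a hypothetical helper name, the default `k = 5` is an assumption, and the code favors readability over speed. It only needs pairwise distances, which hints at why the idea carries over to metric spaces.

```r
# Sketch of a Kraskov-style kNN mutual information estimate (base R only).
# ksg_mi is a hypothetical helper name, not a function from fastmit.
ksg_mi <- function(x, y, k = 5) {
  dx <- as.matrix(dist(x))   # pairwise Euclidean distances within x
  dy <- as.matrix(dist(y))   # pairwise Euclidean distances within y
  n <- nrow(dx)
  stopifnot(nrow(dy) == n, k >= 1, k < n)
  diag(dx) <- Inf            # a point is never its own neighbor
  diag(dy) <- Inf
  dz <- pmax(dx, dy)         # max-norm distance in the joint space
  # distance from each point to its k-th nearest neighbor in the joint space
  eps <- apply(dz, 1, function(d) sort(d)[k])
  # number of points strictly closer than eps in each marginal space
  nx <- rowSums(dx < eps)
  ny <- rowSums(dy < eps)
  digamma(k) + digamma(n) - mean(digamma(nx + 1) + digamma(ny + 1))
}

set.seed(1)
x <- rnorm(200)
mi_dependent <- ksg_mi(x, x + rnorm(200, sd = 0.1))  # strongly dependent pair
mi_independent <- ksg_mi(x, rnorm(200))              # independent pair
```

The estimate for the dependent pair should come out clearly larger than the one for the independent pair, which should be close to zero.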

Installation

CRAN version

You can install the released version of fastmit from CRAN with:

install.packages("fastmit")

GitHub version

You can install the development version of fastmit from GitHub with:

library(devtools)
install_github("Mamba413/fastmit/R-package")

Windows users will need to install Rtools first.

Example

  • mi function
library(fastmit)

## simulate data
set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)
## estimate the empirical mutual information
mi(x, y)

In this example, the result is:

# [1] 0.3320034
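As a sanity check on this value, the simulated pair is bivariate Gaussian, so its population mutual information has a closed form. The sketch below assumes the estimate is reported in nats (natural logarithm), as in Kraskov et al. (2004); `rho` and `mi_theory` are illustrative names.

```r
# Closed-form mutual information (in nats) of a bivariate Gaussian pair.
# x ~ N(0, 1) and y = x + e with e ~ N(0, 1), so Var(y) = 2 and Cov(x, y) = 1.
rho <- 1 / sqrt(1 * 2)              # population correlation of x and y
mi_theory <- -0.5 * log(1 - rho^2)  # equals 0.5 * log(2)
mi_theory
```

The theoretical value is about 0.347, so an empirical estimate of 0.332 from 100 observations is in the expected range.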

  • mi.test function
## simulate data
set.seed(1)
error <- runif(50, min = -0.3, max = 0.3)
x <- runif(50, 0, 4*pi)
y <- cos(x) + error

## perform independence test via mutual information
mi.test(x, y)

In this example, the result is:

	Mutual Information test of independence

data:  x and y
number of observations = 50
replicates = 99
p-value = 0.01
alternative hypothesis: random variables are dependent
sample estimates:
       MI 
0.6953105 
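The reported p-value of 0.01 with 99 replicates is what the conventional add-one permutation formula gives when no permuted statistic reaches the observed one: (0 + 1) / (99 + 1). Whether mi.test uses exactly this formula is an assumption. The base R sketch below shows the general procedure; `perm_test` and `abs_cor` are hypothetical names, and the absolute correlation stands in for a mutual information estimate so that the code runs without fastmit.

```r
# Generic permutation test of independence (base R only).
# perm_test and abs_cor are hypothetical names; abs_cor is a stand-in
# statistic used here in place of a mutual information estimate.
perm_test <- function(x, y, stat, R = 99) {
  observed <- stat(x, y)
  # shuffling y breaks any dependence on x while keeping both marginals
  permuted <- replicate(R, stat(x, sample(y)))
  # add-one formula: the observed statistic counts as one of R + 1 draws
  p_value <- (1 + sum(permuted >= observed)) / (R + 1)
  list(statistic = observed, replicates = R, p.value = p_value)
}

abs_cor <- function(a, b) abs(cor(a, b))

set.seed(1)
x <- rnorm(100)
y <- x + rnorm(100)
res <- perm_test(x, y, abs_cor)
res$p.value
```

With R replicates the smallest attainable p-value is 1 / (R + 1), so more replicates are needed to resolve smaller p-values. A linear statistic such as the correlation would miss the cosine relationship in the example above, which is the kind of case a mutual information statistic is meant to handle.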

If you find any bugs or experience any crashes, please report them to us. If you have any questions, feel free to ask.

License

GPL-3

Metadata

Version

0.1.1

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows