Description

Optimization on manifolds of probability distributions with Goal.

goal-probability provides tools for implementing and applying basic statistical models. The core concept of goal-probability is the statistical manifold, i.e. a manifold of probability distributions, with a focus on exponential family distributions.

This library provides tools for implementing and applying statistical and machine learning algorithms. The core concept of goal-probability is that of a statistical manifold, i.e. a manifold of probability distributions, with a focus on exponential family distributions. What follows is a brief introduction to how we define and work with statistical manifolds in Goal.

The core definition of this library is that of a Statistical Manifold.

class Manifold x => Statistical x where
    type SamplePoint x :: Type

A Statistical Manifold is a Manifold of probability distributions, such that each point on the manifold is a probability distribution over the specified space of SamplePoints. We may evaluate the probability mass/density of a SamplePoint under a given distribution as long as the distribution is AbsolutelyContinuous.

class Statistical x => AbsolutelyContinuous c x where
    density :: Point c x -> SamplePoint x -> Double
    densities :: Point c x -> Sample x -> [Double]

Similarly, we may generate a Sample from a given distribution as long as it is Generative.

type Sample x = [SamplePoint x]

class Statistical x => Generative c x where
    samplePoint :: Point c x -> Random r (SamplePoint x)
    sample :: Int -> Point c x -> Random r (Sample x)

In both of these cases, class methods are defined for both single and bulk evaluation, so that instances can potentially take advantage of bulk linear algebra operations.
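The single/bulk pattern can be illustrated with a minimal, self-contained sketch. It does not use Goal itself: the class `HasDensity`, the `Normal` data type, and the omission of coordinate systems and of the `Manifold` superclass are all simplifying assumptions made for illustration. The bulk method gets a default defined in terms of the single-point method, which an instance could override with a vectorised implementation.

```haskell
{-# LANGUAGE TypeFamilies #-}

import Data.Kind (Type)

-- Hypothetical, simplified stand-in for the classes described above:
-- no coordinate systems and no Manifold superclass.
class Statistical x where
    type SamplePoint x :: Type

type Sample x = [SamplePoint x]

-- Hypothetical class pairing a single-point method with a bulk method.
class Statistical x => HasDensity x where
    density :: x -> SamplePoint x -> Double
    -- Default bulk evaluation; an instance may override it with a
    -- faster, vectorised implementation.
    densities :: x -> Sample x -> [Double]
    densities p = map (density p)

-- A normal distribution given by its mean and variance (assumed layout).
data Normal = Normal Double Double

instance Statistical Normal where
    type SamplePoint Normal = Double

instance HasDensity Normal where
    density (Normal mu vr) x =
        exp (negate ((x - mu) ^ (2 :: Int)) / (2 * vr)) / sqrt (2 * pi * vr)
```

Here `densities (Normal 0 1) [0, 1, -1]` simply maps `density` over the sample, since the `Normal` instance does not override the default.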

Let us now look at some example distributions that we may define; for the sake of brevity, I will not introduce every bit of necessary code. In Goal we create a normal distribution by writing

nrm :: Source # Normal
nrm = fromTuple (0,1)

where 0 is the mean and 1 is the variance. For each Statistical Manifold, the Source coordinate system represents some standard parameterization, e.g. as one typically finds on Wikipedia. Similarly, we can create a binomial distribution with

bnm :: Source # Binomial 5
bnm = Point $ S.singleton 0.5

which is a binomial distribution over 5 fair coin tosses.
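To make the binomial example concrete, the following standalone sketch computes the probability mass of each outcome of 5 fair coin tosses. It is written in plain Haskell rather than against Goal's API; the names `choose`, `binomialMass`, and `fairTosses` are hypothetical helpers introduced only for this illustration.

```haskell
-- Binomial coefficient "n choose k", computed exactly with Integer arithmetic.
choose :: Integer -> Integer -> Integer
choose n k = product [n - k + 1 .. n] `div` product [1 .. k]

-- Probability of k successes in n trials with success probability p.
-- Outcomes outside 0..n have mass zero.
binomialMass :: Integer -> Double -> Integer -> Double
binomialMass n p k
    | k < 0 || k > n = 0
    | otherwise = fromInteger (choose n k) * p ^ k * (1 - p) ^ (n - k)

-- Masses of 0, 1, ..., 5 heads in 5 fair coin tosses.
fairTosses :: [Double]
fairTosses = map (binomialMass 5 0.5) [0 .. 5]
```

The six masses are 1/32, 5/32, 10/32, 10/32, 5/32, and 1/32, which sum to 1.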

Exponential families are also a core part of this library. An ExponentialFamily is a kind of Statistical Manifold defined in terms of a sufficientStatistic and a baseMeasure.

class Statistical x => ExponentialFamily x where
    sufficientStatistic :: SamplePoint x -> Mean # x
    baseMeasure :: Proxy x -> SamplePoint x -> Double

Exponential families may always be parameterized in terms of the so-called Natural and Mean parameters. Mean coordinates are equal to the average value of the sufficientStatistic under the given distribution. The Natural coordinates are arguably less intuitive, but they are critical for evaluating exponential family distributions numerically. For example, the unnormalized density function of an ExponentialFamily distribution is given by

unnormalizedDensity :: forall x . ExponentialFamily x => Natural # x -> SamplePoint x -> Double
unnormalizedDensity p x =
    exp (p <.> sufficientStatistic x) * baseMeasure (Proxy @x) x
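As a concrete instance of these ideas, the sketch below specialises them to the Poisson family in plain Haskell, without Goal's types. All functions here are hypothetical, monomorphic stand-ins: coordinates are bare `Double`s rather than `Natural # x` or `Mean # x` points, and the dot product reduces to ordinary multiplication because the family is one-dimensional. For the Poisson family with rate λ, the sufficient statistic is the count k, the base measure is 1/k!, the natural parameter is log λ, and the mean parameter is λ.

```haskell
-- Sufficient statistic of the Poisson family: the count itself.
sufficientStatistic :: Int -> Double
sufficientStatistic = fromIntegral

-- Base measure of the Poisson family: 1 / k!.
baseMeasure :: Int -> Double
baseMeasure k = 1 / product (map fromIntegral [1 .. k])

-- Unnormalized mass at natural parameter theta = log rate.
unnormalizedDensity :: Double -> Int -> Double
unnormalizedDensity theta k =
    exp (theta * sufficientStatistic k) * baseMeasure k

-- Log-partition function of the Poisson family: psi(theta) = exp theta.
logPartition :: Double -> Double
logPartition = exp

-- Normalized mass: the unnormalized mass divided by exp (psi theta).
poissonMass :: Double -> Int -> Double
poissonMass theta k =
    unnormalizedDensity theta k * exp (negate (logPartition theta))

-- Natural to mean coordinates: the derivative of psi, i.e. the rate.
toMean :: Double -> Double
toMean = exp

-- Mean to natural coordinates: the inverse of toMean.
toNatural :: Double -> Double
toNatural = log
```

With `theta = toNatural 2`, summing `poissonMass theta k` over a long enough range of `k` gives approximately 1, and the average of `sufficientStatistic k` under those masses is approximately `toMean theta`, matching the description of Mean coordinates above.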

For in-depth tutorials visit my blog.

Metadata

Version

0.20

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows