Optimization of latent variable and dynamical models with Goal.
goal-graphical provides tools for working with dynamical and graphical models. Various graphical models are defined here, e.g. mixture models and restricted Boltzmann machines, as well as dynamical models such as hidden Markov models and Kalman filters, and in both cases algorithms for fitting them, e.g. expectation maximization and contrastive divergence minimization.

Underlying all of these models is a generalized linear object known as a `Harmonium`, which I will briefly introduce in what follows.
The core definition of this library is a `Manifold` of joint distributions I call an `AffineHarmonium`

```haskell
newtype AffineHarmonium f y x z w = AffineHarmonium (Affine f y z x, w)
```

which is a product `Manifold` composed of a `Manifold` of likelihood functions `Affine f y z x`, and a `Manifold` of distributions `w` that partially define the space of priors. `AffineHarmonium`s provide a bit more flexibility than what I call a `Harmonium`

```haskell
type Harmonium f z w = AffineHarmonium f z w z w
```

which is a simpler object. Nevertheless, from a theoretical point of view, an `AffineHarmonium` is a special case of a `Harmonium`, and we may think of them more or less equivalently.
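For intuition, well-known models arise as particular choices of these type parameters. Below is a hypothetical sketch, not the library's actual synonym, of a mixture model expressed as a `Harmonium`; the `Tensor` and `Categorical` types and the module names are assumed from Goal's companion libraries:

```haskell
import Goal.Geometry
import Goal.Probability
import Goal.Graphical

-- A hypothetical sketch (not goal-graphical's actual definition): a
-- mixture model with k+1 components over observables z, viewed as a
-- Harmonium whose latent variables are Categorical and whose
-- interaction is a dense bilinear Tensor.
type MixtureSketch z k = Harmonium Tensor z (Categorical k)
```

Here the categorical latent variable indexes the mixture component, and the interaction component shifts the parameters of the observable distribution depending on the active component.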
A `Harmonium` is a model over observable variables and latent variables, and represents a sort of generalized linear joint distribution over the two of them. The theory of `Harmonium`s is summarized well by this paper and this paper. Although `Harmonium`s might seem like a little-studied, esoteric object, various well-known models, such as mixture models and restricted Boltzmann machines, are in fact `Harmonium`s, and various other models, such as factor analysis, Kalman filters, and hidden Markov models, can be expressed in terms of them.
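Concretely, the joint density of an exponential family Harmonium has the standard log-linear form from the harmonium literature (the notation here is mine, not the library's):

$$ p(z, x) \propto \exp\big( \theta_z \cdot s_z(z) + s_z(z) \cdot \Theta_{zx} \cdot s_x(x) + \theta_x \cdot s_x(x) \big), $$

where $s_z$ and $s_x$ are the sufficient statistics of the observable and latent exponential families, and $\theta_z$, $\theta_x$, and the interaction matrix $\Theta_{zx}$ are the natural parameters. The conditionals $p(x \mid z)$ and $p(z \mid x)$ are then themselves exponential families whose natural parameters are affine functions of the other variable's sufficient statistics, which is the sense in which `Harmonium`s are generalized linear.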
All of the aforementioned models can be fit with Expectation-Maximization (EM), and EM can be expressed in an entirely general manner for `Harmonium`s. Firstly, the expectation step of a `Harmonium` is implemented by
```haskell
expectationStep
    :: ( ExponentialFamily z, Map Natural f x y, Bilinear f y x
       , Translation z y, Translation w x, LegendreExponentialFamily w )
    => Sample z -- ^ Model Samples
    -> Natural # AffineHarmonium f y x z w -- ^ Harmonium
    -> Mean # AffineHarmonium f y x z w -- ^ Harmonium expected sufficient statistics
expectationStep zs hrm =
    let mzs = sufficientStatistic <$> zs -- sufficient statistics of the observations
        mys = anchor <$> mzs -- observable statistics that interact with the latent variables
        pstr = fst . split $ transposeHarmonium hrm -- posterior affine map
        mws = transition <$> pstr >$> mys -- latent posteriors, transitioned to Mean coordinates
        mxs = anchor <$> mws -- latent statistics that interact with the observations
        myx = (>$<) mys mxs -- average interaction statistics
     in joinHarmonium (average mzs) myx $ average mws
```
In summary, what we do is

- take some observations,
- compute their `sufficientStatistic`s,
- map these statistics into the predictions of the latent variables,
- transition these latent predictions from `Natural` coordinates to `Mean` coordinates,
- and assemble the results into the `Mean` sufficient statistics of the joint distribution.
The maximization step then consists simply of mapping the whole joint distribution from `Mean` back to `Natural` coordinates, such that `expectationMaximization` may be expressed as
```haskell
expectationMaximization
    :: ( DuallyFlatExponentialFamily (AffineHarmonium f y x z w)
       , ExponentialFamily z, Map Natural f x y, Bilinear f y x
       , Translation z y, Translation w x, LegendreExponentialFamily w )
    => Sample z
    -> Natural # AffineHarmonium f y x z w
    -> Natural # AffineHarmonium f y x z w
expectationMaximization zs hrm = transition $ expectationStep zs hrm
```
As such, for a wide variety of models, we may reduce implementing expectation maximization to instantiating the class requirements of the `expectationMaximization` function. This is rarely trivial, but in some sense much more straightforward and well-defined than deriving EM algorithms from scratch.
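As a minimal sketch of how this comes together, fitting a model then amounts to iterating `expectationMaximization`. The helper `fitEM` below is hypothetical, not part of goal-graphical's API; it assumes only the `expectationMaximization` function shown above:

```haskell
-- A minimal sketch, not part of goal-graphical: run a fixed number of
-- EM iterations on a set of observations, starting from an initial model.
fitEM
    :: ( DuallyFlatExponentialFamily (AffineHarmonium f y x z w)
       , ExponentialFamily z, Map Natural f x y, Bilinear f y x
       , Translation z y, Translation w x, LegendreExponentialFamily w )
    => Int -- ^ Number of EM iterations
    -> Sample z -- ^ Observations
    -> Natural # AffineHarmonium f y x z w -- ^ Initial model
    -> Natural # AffineHarmonium f y x z w -- ^ Fitted model
fitEM nEpochs zs hrm0 = iterate (expectationMaximization zs) hrm0 !! nEpochs
```

Since each EM iteration never decreases the log-likelihood, a fixed iteration count as above, or a convergence check on the parameters, both make reasonable stopping criteria.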
For in-depth tutorials, visit my blog.