Open Access Posted Content

Products of Hidden Markov Models: It Takes N>1 to Tango

TLDR
It is demonstrated how the partition function can be estimated reliably via Annealed Importance Sampling, and suggested that advances in learning and evaluation for undirected graphical models and recent increases in available computing power make PoHMMs worth considering for complex time-series modeling tasks.
Abstract
Products of Hidden Markov Models (PoHMMs) are an interesting class of generative models which have received little attention since their introduction. This may be in part due to their more computationally expensive gradient-based learning algorithm, and the intractability of computing the log likelihood of sequences under the model. In this paper, we demonstrate how the partition function can be estimated reliably via Annealed Importance Sampling. We perform experiments using contrastive divergence learning on rainfall data and data captured from pairs of people dancing. Our results suggest that advances in learning and evaluation for undirected graphical models and recent increases in available computing power make PoHMMs worth considering for complex time-series modeling tasks.
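As a rough, self-contained sketch of the evaluation idea in the abstract (not the paper's implementation), the following estimates a partition function with AIS on a toy 1-D Gaussian target where the exact answer is known; the constants, schedule, and Metropolis kernel are all illustrative choices.

```python
# Minimal AIS sketch: anneal from a tractable base N(0,1) with known
# normalizer to an unnormalized Gaussian target, accumulating importance
# weights along a geometric path of intermediate distributions.
import numpy as np

rng = np.random.default_rng(0)

MU, SIGMA = 3.0, 2.0                       # target N(MU, SIGMA^2), unnormalized
log_f0 = lambda x: -0.5 * x**2             # base density, log Z0 known
log_fT = lambda x: -0.5 * ((x - MU) / SIGMA) ** 2
log_Z0 = 0.5 * np.log(2 * np.pi)

def log_f(x, beta):
    """Geometric intermediate distribution f_beta = f0^(1-beta) * fT^beta."""
    return (1 - beta) * log_f0(x) + beta * log_fT(x)

n_chains, betas = 1000, np.linspace(0, 1, 200)
x = rng.standard_normal(n_chains)          # exact samples from the base
log_w = np.zeros(n_chains)

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += log_f(x, b) - log_f(x, b_prev)     # AIS weight update
    # One Metropolis step targeting f_b keeps chains near equilibrium.
    prop = x + rng.standard_normal(n_chains)
    accept = np.log(rng.random(n_chains)) < log_f(prop, b) - log_f(x, b)
    x = np.where(accept, prop, x)

# Estimate: log Z_T = log Z0 + log mean(exp(log_w))
log_ZT = log_Z0 + np.logaddexp.reduce(log_w) - np.log(n_chains)
print("AIS estimate:", log_ZT, " exact:", 0.5 * np.log(2 * np.pi) + np.log(SIGMA))
```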


Citations
Proceedings Article

Annealing between distributions by averaging moments

TL;DR: A novel sequence of intermediate distributions for exponential families, defined by averaging the moments of the initial and target distributions, is presented, and an asymptotically optimal piecewise linear schedule is derived.
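As a hypothetical minimal illustration of the moment-averaging idea in this TL;DR (not the authors' code), the sketch below computes the moment-averaged path between two 1-D Gaussians: each intermediate distribution matches a convex combination of the endpoint moments E[x] and E[x^2].

```python
# Moment-averaged annealing path for the 1-D Gaussian exponential family.
import numpy as np

def moment_avg_path(m0, v0, m1, v1, betas):
    """Intermediate (mean, variance) pairs along the moment-averaged path."""
    out = []
    for b in betas:
        m = (1 - b) * m0 + b * m1                        # averaged E[x]
        s = (1 - b) * (v0 + m0**2) + b * (v1 + m1**2)    # averaged E[x^2]
        out.append((m, s - m**2))                        # back to (mean, var)
    return out

for m, v in moment_avg_path(0.0, 1.0, 3.0, 4.0, np.linspace(0, 1, 5)):
    print(f"mean={m:.2f}  var={v:.2f}")
```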
Proceedings Article

Learning and Model-Checking Networks of I/O Automata

Hua Mao, +1 more
TL;DR: A new statistical relational learning (SRL) approach in which models for structured data are constructed as networks of communicating finite probabilistic automata, which adds a dimension of model analysis not usually available for traditional SRL modeling frameworks.

Deep Discriminative and Generative Models for Pattern Recognition

TL;DR: This chapter proposes ways in which deep generative models can be beneficially integrated with deep discriminative models based on their respective strengths and examines the recent advances in end-to-end optimization.
Dissertation

Model selection in compositional spaces

Roger Grosse
TL;DR: A novel method for obtaining ground truth marginal likelihood values on synthetic data is presented, which enables the rigorous quantitative comparison of marginal likelihood estimators.
Dissertation

The Emergence of Multimodal Concepts: From Perceptual Motion Primitives to Grounded Acoustic Words

TL;DR: This thesis studies the question of how a developmental cognitive agent can discover a dictionary of primitive patterns from its multimodal perceptual flow and specifies its links with Quine's indeterminacy of translation and with blind source separation.
References
Journal Article

Training products of experts by minimizing contrastive divergence

TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary; because it is hard even to approximate the derivatives of the renormalization term in the combination rule, the experts are instead trained by minimizing contrastive divergence.
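For context, the PoE combination rule and its log-likelihood gradient (standard material from this reference) show which term is hard: the derivative of the renormalizing denominator is an expectation under the model itself.

```latex
p(\mathbf{x}\mid\theta_1,\dots,\theta_M)
  = \frac{\prod_m p_m(\mathbf{x}\mid\theta_m)}
         {\sum_{\mathbf{x}'} \prod_m p_m(\mathbf{x}'\mid\theta_m)},
\qquad
\frac{\partial \log p(\mathbf{x})}{\partial \theta_m}
  = \frac{\partial \log p_m(\mathbf{x}\mid\theta_m)}{\partial \theta_m}
  - \mathbb{E}_{\mathbf{x}'\sim p}\!\left[
      \frac{\partial \log p_m(\mathbf{x}'\mid\theta_m)}{\partial \theta_m}\right].
```

Contrastive divergence approximates the intractable expectation with samples obtained after only a few Markov chain steps started at the data.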
Journal Article

Factorial Hidden Markov Models

TL;DR: A generalization of HMMs in which the hidden state is factored into multiple state variables and is therefore represented in a distributed manner, together with a structured approximation in which the state variables are decoupled, yielding a tractable algorithm for learning the parameters of the model.
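A minimal sketch of the factorial HMM generative process this TL;DR describes, assuming K independent binary chains with a shared transition matrix and a linear-Gaussian emission (one common parameterization; the paper's exact setup may differ):

```python
# Factorial HMM sampler: K chains evolve independently, but the emission
# at each time step depends on all K state variables jointly.
import numpy as np

rng = np.random.default_rng(1)
K, T = 3, 10                                   # number of chains, sequence length
trans = np.array([[0.9, 0.1], [0.2, 0.8]])     # shared 2-state transition matrix
W = rng.standard_normal(K)                     # each chain's additive effect

state = rng.integers(0, 2, size=K)             # distributed state: one var per chain
obs = []
for _ in range(T):
    # Each chain transitions independently of the others...
    state = np.array([rng.choice(2, p=trans[s]) for s in state])
    # ...while the observation couples them all.
    obs.append(W @ state + 0.1 * rng.standard_normal())
print(np.round(obs, 2))
```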
Proceedings Article

Training restricted Boltzmann machines using approximations to the likelihood gradient

TL;DR: A new algorithm for training Restricted Boltzmann Machines is introduced, which is compared to some standard Contrastive Divergence and Pseudo-Likelihood algorithms on the tasks of modeling and classifying various types of data.
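A hedged sketch contrasting the two likelihood-gradient approximations this entry compares: CD-1 restarts the negative Gibbs chain at the data each update, while the persistent variant (PCD) carries fantasy particles across updates. Biases and many practical details are omitted, and all names and constants are illustrative.

```python
# Toy binary RBM trained with CD-1 or PCD (biases omitted for brevity).
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

n_vis, n_hid, lr = 6, 4, 0.05
W = 0.01 * rng.standard_normal((n_vis, n_hid))
data = rng.integers(0, 2, size=(100, n_vis)).astype(float)   # toy binary data
persistent = data[:10].copy()                                 # PCD fantasy particles
use_pcd = True                                                # False -> plain CD-1

def gibbs_step(v):
    """One alternating Gibbs sweep v -> h -> v'."""
    h = (rng.random((len(v), n_hid)) < sigmoid(v @ W)).astype(float)
    return (rng.random((len(v), n_vis)) < sigmoid(h @ W.T)).astype(float)

for step in range(500):
    batch = data[rng.choice(len(data), 10, replace=False)]
    neg = gibbs_step(persistent if use_pcd else batch)
    if use_pcd:
        persistent = neg          # the chain survives into the next update
    # Positive phase uses data; negative phase uses approximate model samples.
    grad = batch.T @ sigmoid(batch @ W) / len(batch) \
         - neg.T @ sigmoid(neg @ W) / len(neg)
    W += lr * grad
```

Setting use_pcd = False in this sketch recovers plain CD-1, where the negative chain is re-initialized at the data on every update.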

Annealed Importance Sampling

TL;DR: In this article, it is shown how the Markov chain transitions used in an annealing sequence can be combined to define an importance sampler, which is a generalization of a recently proposed variant of sequential importance sampling.
Proceedings Article

On Contrastive Divergence Learning.

TL;DR: The properties of CD learning are studied and it is shown that it provides biased estimates in general, but that the bias is typically very small.