Open Access Posted Content

Products of Hidden Markov Models: It Takes N>1 to Tango

TLDR
It is demonstrated how the partition function can be estimated reliably via Annealed Importance Sampling, and suggested that advances in learning and evaluation for undirected graphical models and recent increases in available computing power make PoHMMs worth considering for complex time-series modeling tasks.
Abstract
Products of Hidden Markov Models (PoHMMs) are an interesting class of generative models which have received little attention since their introduction. This may be in part due to their more computationally expensive gradient-based learning algorithm, and the intractability of computing the log likelihood of sequences under the model. In this paper, we demonstrate how the partition function can be estimated reliably via Annealed Importance Sampling. We perform experiments using contrastive divergence learning on rainfall data and data captured from pairs of people dancing. Our results suggest that advances in learning and evaluation for undirected graphical models and recent increases in available computing power make PoHMMs worth considering for complex time-series modeling tasks.
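As a rough, self-contained sketch of the evaluation idea in the abstract (not the paper's implementation), the following estimates a partition function with AIS on a toy 1-D Gaussian target where the exact answer is known; the constants, schedule, and Metropolis kernel are all illustrative choices.

```python
# Minimal AIS sketch: anneal from a tractable base N(0,1) with known
# normalizer to an unnormalized Gaussian target, accumulating importance
# weights along a geometric path of intermediate distributions.
import numpy as np

rng = np.random.default_rng(0)

MU, SIGMA = 3.0, 2.0                       # target N(MU, SIGMA^2), unnormalized
log_f0 = lambda x: -0.5 * x**2             # base density, log Z0 known
log_fT = lambda x: -0.5 * ((x - MU) / SIGMA) ** 2
log_Z0 = 0.5 * np.log(2 * np.pi)

def log_f(x, beta):
    """Geometric intermediate distribution f_beta = f0^(1-beta) * fT^beta."""
    return (1 - beta) * log_f0(x) + beta * log_fT(x)

n_chains, betas = 1000, np.linspace(0, 1, 200)
x = rng.standard_normal(n_chains)          # exact samples from the base
log_w = np.zeros(n_chains)

for b_prev, b in zip(betas[:-1], betas[1:]):
    log_w += log_f(x, b) - log_f(x, b_prev)     # AIS weight update
    # One Metropolis step targeting f_b keeps chains near equilibrium.
    prop = x + rng.standard_normal(n_chains)
    accept = np.log(rng.random(n_chains)) < log_f(prop, b) - log_f(x, b)
    x = np.where(accept, prop, x)

# Estimate: log Z_T = log Z0 + log mean(exp(log_w))
log_ZT = log_Z0 + np.logaddexp.reduce(log_w) - np.log(n_chains)
print("AIS estimate:", log_ZT, " exact:", 0.5 * np.log(2 * np.pi) + np.log(SIGMA))
```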


Citations
Proceedings Article

Annealing between distributions by averaging moments

TL;DR: A novel sequence of intermediate distributions for exponential families, defined by averaging the moments of the initial and target distributions, is presented, and an asymptotically optimal piecewise linear schedule is derived.
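As a hypothetical minimal illustration of the moment-averaging idea in this TL;DR (not the authors' code), the sketch below computes the moment-averaged path between two 1-D Gaussians: each intermediate distribution matches a convex combination of the endpoint moments E[x] and E[x^2].

```python
# Moment-averaged annealing path for the 1-D Gaussian exponential family.
import numpy as np

def moment_avg_path(m0, v0, m1, v1, betas):
    """Intermediate (mean, variance) pairs along the moment-averaged path."""
    out = []
    for b in betas:
        m = (1 - b) * m0 + b * m1                        # averaged E[x]
        s = (1 - b) * (v0 + m0**2) + b * (v1 + m1**2)    # averaged E[x^2]
        out.append((m, s - m**2))                        # back to (mean, var)
    return out

for m, v in moment_avg_path(0.0, 1.0, 3.0, 4.0, np.linspace(0, 1, 5)):
    print(f"mean={m:.2f}  var={v:.2f}")
```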
Proceedings Article

Learning and Model-Checking Networks of I/O Automata

Hua Mao, +1 more
TL;DR: A new statistical relational learning (SRL) approach in which models for structured data are constructed as networks of communicating finite probabilistic automata, which adds a dimension of model analysis not usually available for traditional SRL modeling frameworks.

Deep Discriminative and Generative Models for Pattern Recognition

TL;DR: This chapter proposes ways in which deep generative models can be beneficially integrated with deep discriminative models based on their respective strengths and examines the recent advances in end-to-end optimization.
Dissertation

Model selection in compositional spaces

Roger Grosse
TL;DR: A novel method for obtaining ground truth marginal likelihood values on synthetic data is presented, which enables the rigorous quantitative comparison of marginal likelihood estimators.
Dissertation

The Emergence of Multimodal Concepts: From Perceptual Motion Primitives to Grounded Acoustic Words

TL;DR: This thesis studies the question of how a developmental cognitive agent can discover a dictionary of primitive patterns from its multimodal perceptual flow and specifies its links with Quine's indeterminacy of translation and with blind source separation.
References
Journal Article

Training products of experts by minimizing contrastive divergence

TL;DR: A product of experts (PoE) is an interesting candidate for a perceptual system in which rapid inference is vital and generation is unnecessary; because it is hard even to approximate the derivatives of the renormalization term in the combination rule, the experts are instead trained by minimizing contrastive divergence.
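For context, the PoE combination rule and its log-likelihood gradient (standard material from this reference) show which term is hard: the derivative of the renormalizing denominator is an expectation under the model itself.

```latex
p(\mathbf{x}\mid\theta_1,\dots,\theta_M)
  = \frac{\prod_m p_m(\mathbf{x}\mid\theta_m)}
         {\sum_{\mathbf{x}'} \prod_m p_m(\mathbf{x}'\mid\theta_m)},
\qquad
\frac{\partial \log p(\mathbf{x})}{\partial \theta_m}
  = \frac{\partial \log p_m(\mathbf{x}\mid\theta_m)}{\partial \theta_m}
  - \mathbb{E}_{\mathbf{x}'\sim p}\!\left[
      \frac{\partial \log p_m(\mathbf{x}'\mid\theta_m)}{\partial \theta_m}\right].
```

Contrastive divergence approximates the intractable expectation with samples obtained after only a few Markov chain steps started at the data.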
Journal Article

Factorial Hidden Markov Models

TL;DR: A generalization of HMMs in which the hidden state is factored into multiple state variables and is therefore represented in a distributed manner, together with a structured approximation in which the state variables are decoupled, yielding a tractable algorithm for learning the parameters of the model.
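A minimal sketch of the factorial HMM generative process this TL;DR describes, assuming K independent binary chains with a shared transition matrix and a linear-Gaussian emission (one common parameterization; the paper's exact setup may differ):

```python
# Factorial HMM sampler: K chains evolve independently, but the emission
# at each time step depends on all K state variables jointly.
import numpy as np

rng = np.random.default_rng(1)
K, T = 3, 10                                   # number of chains, sequence length
trans = np.array([[0.9, 0.1], [0.2, 0.8]])     # shared 2-state transition matrix
W = rng.standard_normal(K)                     # each chain's additive effect

state = rng.integers(0, 2, size=K)             # distributed state: one var per chain
obs = []
for _ in range(T):
    # Each chain transitions independently of the others...
    state = np.array([rng.choice(2, p=trans[s]) for s in state])
    # ...while the observation couples them all.
    obs.append(W @ state + 0.1 * rng.standard_normal())
print(np.round(obs, 2))
```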
Proceedings Article

Training restricted Boltzmann machines using approximations to the likelihood gradient

TL;DR: A new algorithm for training Restricted Boltzmann Machines is introduced, which is compared to some standard Contrastive Divergence and Pseudo-Likelihood algorithms on the tasks of modeling and classifying various types of data.
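A hedged sketch contrasting the two likelihood-gradient approximations this entry compares: CD-1 restarts the negative Gibbs chain at the data each update, while the persistent variant (PCD) carries fantasy particles across updates. Biases and many practical details are omitted, and all names and constants are illustrative.

```python
# Toy binary RBM trained with CD-1 or PCD (biases omitted for brevity).
import numpy as np

rng = np.random.default_rng(2)
sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

n_vis, n_hid, lr = 6, 4, 0.05
W = 0.01 * rng.standard_normal((n_vis, n_hid))
data = rng.integers(0, 2, size=(100, n_vis)).astype(float)   # toy binary data
persistent = data[:10].copy()                                 # PCD fantasy particles
use_pcd = True                                                # False -> plain CD-1

def gibbs_step(v):
    """One alternating Gibbs sweep v -> h -> v'."""
    h = (rng.random((len(v), n_hid)) < sigmoid(v @ W)).astype(float)
    return (rng.random((len(v), n_vis)) < sigmoid(h @ W.T)).astype(float)

for step in range(500):
    batch = data[rng.choice(len(data), 10, replace=False)]
    neg = gibbs_step(persistent if use_pcd else batch)
    if use_pcd:
        persistent = neg          # the chain survives into the next update
    # Positive phase uses data; negative phase uses approximate model samples.
    grad = batch.T @ sigmoid(batch @ W) / len(batch) \
         - neg.T @ sigmoid(neg @ W) / len(neg)
    W += lr * grad
```

Setting use_pcd = False in this sketch recovers plain CD-1, where the negative chain is re-initialized at the data on every update.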

Annealed Importance Sampling

TL;DR: In this article, it is shown how the Markov chain transitions used in an annealing sequence can be combined to define an importance sampler, which is a generalization of a recently proposed variant of sequential importance sampling.
Proceedings Article

On Contrastive Divergence Learning.

TL;DR: The properties of CD learning are studied and it is shown that it provides biased estimates in general, but that the bias is typically very small.