Open Access Proceedings Article

Diffusion of Credit in Markovian Models

TLDR
Using results from Markov chain theory, it is shown that the problem of diffusion is reduced if the transition probabilities approach 0 or 1, and under this condition, standard HMMs have very limited modeling capabilities, but input/output HMMs can still perform interesting computations.
Abstract
This paper studies the problem of diffusion in Markovian models, such as hidden Markov models (HMMs), and how it makes the task of learning long-term dependencies in sequences very difficult. Using results from Markov chain theory, we show that the problem of diffusion is reduced if the transition probabilities approach 0 or 1. Under this condition, standard HMMs have very limited modeling capabilities, but input/output HMMs can still perform interesting computations.
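
The contraction the abstract refers to can be illustrated with a short numerical sketch (not taken from the paper; the matrices and function name are illustrative): repeated multiplication by a stochastic transition matrix washes out the influence of the initial state, unless the transition probabilities are close to 0 or 1.

```python
import numpy as np

def influence_after_t_steps(A, t):
    """Total-variation distance between the state distributions reached
    from two different starting states after t steps, i.e. how much the
    initial state still matters (a proxy for how much credit survives)."""
    P = np.linalg.matrix_power(A, t)
    # Rows of P are the distributions over states at time t given each start state.
    return 0.5 * np.abs(P[0] - P[1]).sum()

# A "soft" transition matrix: probabilities far from 0 and 1.
A_soft = np.array([[0.6, 0.4],
                   [0.4, 0.6]])

# A nearly deterministic transition matrix: probabilities close to 0 or 1.
A_hard = np.array([[0.99, 0.01],
                   [0.01, 0.99]])

for t in (1, 10, 50):
    print(t, influence_after_t_steps(A_soft, t), influence_after_t_steps(A_hard, t))
# The soft chain forgets its initial state almost immediately (values go to 0),
# while the near-deterministic chain preserves that information far longer.
```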



Citations
Proceedings Article

Hierarchical Recurrent Neural Networks for Long-Term Dependencies

TL;DR: This paper proposes to use a more general type of a priori knowledge, namely that the temporal dependencies are structured hierarchically, which implies that long-term dependencies are represented by variables with a long time scale.
Journal ArticleDOI

Discovery and segmentation of activities in video

TL;DR: In this article, Hidden Markov Models (HMMs) are used to organize observed activity into meaningful states by minimizing the entropy of the joint distribution of the HMMs' internal state machine.
Journal ArticleDOI

Input-output HMMs for sequence processing

TL;DR: It is demonstrated on a benchmark grammatical inference problem that IOHMMs are well suited for this task and are able to map input sequences to output sequences, using the same processing style as recurrent neural networks.
Journal ArticleDOI

Structure learning in conditional probability models via an entropic prior and parameter extinction

TL;DR: An entropic prior is introduced for multinomial parameter estimation problems, with a guarantee that deleting parameters driven to extinction increases the posterior probability of the model; the resulting models show superior generalization to held-out test data.
Journal ArticleDOI

Clifford Support Vector Machines for Classification, Regression, and Recurrence

TL;DR: It is shown that CSVMs can be applied to classification and regression and can also be used to build a recurrent CSVM, an attractive approach for multiple-input multiple-output processing of high-dimensional geometric entities.
References
Journal ArticleDOI

Learning long-term dependencies with gradient descent is difficult

TL;DR: This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.
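
The trade-off mentioned above can be sketched numerically: if the recurrent Jacobian that lets a network latch information has spectral radius below 1, the gradient flowing back over T steps shrinks geometrically. The values below are illustrative and do not come from the cited paper.

```python
import numpy as np

# Recurrent Jacobian with spectral radius 0.9 (illustrative values).
W = np.array([[0.9, 0.0],
              [0.0, 0.5]])

grad = np.eye(2)
for T in range(1, 101):
    grad = grad @ W                      # product of Jacobians over T steps
    if T in (1, 10, 50, 100):
        print(T, np.linalg.norm(grad))   # norm decays roughly like 0.9**T
```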
Book

Non-negative Matrices and Markov Chains

Eugene Seneta
TL;DR: Finite non-negative matrices are a generalization of finite stochastic matrices and have been studied extensively in the literature.
Proceedings Article

An Input Output HMM Architecture

TL;DR: A recurrent architecture with a modular structure is introduced; it has similarities to hidden Markov models but supports a recurrent-network processing style and allows the supervised learning paradigm to be exploited while using maximum likelihood estimation.
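
A minimal sketch of the forward recursion such an input/output architecture implies is given below; the function names and the tabular emission model are assumptions for illustration, not the paper's implementation. The key difference from a standard HMM is that the transition and emission probabilities depend on the current input.

```python
import numpy as np

def iohmm_forward(alpha0, transition_fn, emission_fn, inputs, outputs):
    """Forward recursion of an input/output HMM.

    transition_fn(u) returns a row-stochastic state-transition matrix for
    input u; emission_fn(u) returns a matrix of P(y | state, u). The model
    therefore maps an input sequence to an output sequence, unlike a
    standard HMM whose dynamics are fixed."""
    alpha = alpha0.copy()
    log_lik = 0.0
    for u, y in zip(inputs, outputs):
        alpha = alpha @ transition_fn(u)        # input-conditioned transition
        alpha = alpha * emission_fn(u)[:, y]    # weight by output likelihood
        norm = alpha.sum()
        log_lik += np.log(norm)
        alpha /= norm                           # keep the recursion stable
    return alpha, log_lik
```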
Journal ArticleDOI

Unified integration of explicit knowledge and learning by example in recurrent networks

TL;DR: A novel unified approach for integrating explicit knowledge and learning by example in recurrent networks is proposed; the integration is accomplished using a technique based on linear programming instead of learning from random initial weights.
Proceedings Article

Credit Assignment through Time: Alternatives to Backpropagation

TL;DR: This work considers and compares alternative algorithms and architectures on tasks for which the span of the input/output dependencies can be controlled and shows performance qualitatively superior to that obtained with backpropagation.