Proceedings ArticleDOI
Decoder selection based on cross-entropies
TLDR
The authors generalize the maximum likelihood and related optimization criteria for training and decoding with a speech recognizer by considering weighted linear combinations of the logarithms of the likelihoods of words, of acoustics, and of (word, acoustic) pairs.

Abstract
The authors generalize the maximum likelihood and related optimization criteria for training and decoding with a speech recognizer. The generalizations are constructed by considering weighted linear combinations of the logarithms of the likelihoods of words, of acoustics, and of (word, acoustic) pairs. The utility of various patterns of weights is examined.
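The weighted-combination criterion described in the abstract can be sketched as follows; the weight names, probability values, and hypothesis labels here are illustrative assumptions, not taken from the paper.

```python
import math

def combined_score(log_p_word, log_p_acoustic, log_p_joint,
                   w_word=1.0, w_acoustic=0.0, w_joint=0.0):
    """Weighted linear combination of log-likelihoods.

    With suitable weights this recovers standard criteria
    (e.g. word likelihood alone, or word + acoustic scores).
    A sketch; the weight parametrization is an assumption.
    """
    return (w_word * log_p_word
            + w_acoustic * log_p_acoustic
            + w_joint * log_p_joint)

# Decode by picking the hypothesis with the highest combined score.
# (log P(word), log P(acoustic | word), log P(word, acoustic)) per hypothesis:
hypotheses = {
    "word_a": (math.log(0.6), math.log(0.2), math.log(0.12)),
    "word_b": (math.log(0.4), math.log(0.5), math.log(0.20)),
}
best = max(hypotheses,
           key=lambda w: combined_score(*hypotheses[w],
                                        w_word=1.0, w_acoustic=1.0))
```

Changing the weights changes the decoder: with only `w_word` nonzero the prior-favored hypothesis wins, while including the acoustic term flips the decision here.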
Citations
Proceedings ArticleDOI
Minimum Phone Error and I-smoothing for improved discriminative training
Daniel Povey, Philip C. Woodland +1 more
TL;DR: The Minimum Phone Error (MPE) and Minimum Word Error (MWE) criteria are smoothed approximations to the phone and word error rates, respectively; I-smoothing is a novel technique for smoothing discriminative training criteria using statistics from maximum likelihood estimation (MLE).
Journal ArticleDOI
Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error
TL;DR: This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary speech recognition tasks.
Proceedings Article
Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters
TL;DR: It is shown that once the output layer of a multilayer perceptron is modified to provide mathematically correct probability distributions, and the usual squared error criterion is replaced with a probability-based score, the result is equivalent to Maximum Mutual Information training.
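The equivalence claimed in this TL;DR rests on a simple identity: the cross-entropy loss on a softmax output is the negative conditional log-probability of the correct class, which is exactly the quantity MMI training maximizes (up to prior terms). A minimal sketch, with illustrative scores:

```python
import math

def log_softmax(scores, k):
    """Log of the k-th softmax output, computed stably via log-sum-exp.

    The softmax output is a mathematically correct posterior
    distribution, so its log is a conditional log-probability.
    """
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[k] - lse

# Cross-entropy loss for the true class (index 0) equals the negative
# conditional log-likelihood that MMI training maximizes.
scores = [2.0, 0.5, -1.0]
loss = -log_softmax(scores, 0)
```

Minimizing this loss over the network parameters is therefore the same as maximizing the mutual-information criterion on the posteriors.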
Journal ArticleDOI
An inequality for rational functions with applications to some statistical estimation problems
TL;DR: The well-known Baum-Eagon inequality provides an effective iterative scheme for finding a local maximum for homogeneous polynomials with positive coefficients over a domain of probability values.
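The iterative scheme behind the Baum-Eagon inequality is the growth transform: replace each probability by its value times the partial derivative of the polynomial, renormalized over the simplex. A minimal sketch with an illustrative polynomial (not one from the paper):

```python
def baum_eagon_step(x, grad):
    """One growth-transform update: x_i <- x_i * dP/dx_i / sum_j x_j * dP/dx_j.

    For a homogeneous polynomial P with positive coefficients over
    probability values, the Baum-Eagon inequality guarantees this
    update never decreases P.
    """
    g = grad(x)
    z = sum(xi * gi for xi, gi in zip(x, g))
    return [xi * gi / z for xi, gi in zip(x, g)]

# Illustrative example: P(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2 on the simplex.
P = lambda x: x[0] ** 2 + 2 * x[0] * x[1] + 3 * x[1] ** 2
grad = lambda x: [2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]]

x = [0.5, 0.5]
for _ in range(20):
    x = baum_eagon_step(x, grad)  # P(x) is monotonically non-decreasing
```

Each step stays on the probability simplex, and the iterates climb toward a local maximum of P (here the vertex x = (0, 1), where P = 3).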
Journal ArticleDOI
Towards increasing speech recognition error rates
TL;DR: In this article, the authors discuss some research directions for ASR that may not always yield an immediate and guaranteed decrease in error rate but which hold some promise for ultimately improving performance in the end applications, including discrimination between rival utterance models, the role of prior information in speech recognition, merging the language and acoustic models, feature extraction and temporal information, and decoding procedures reflecting human perceptual properties.
References
Journal ArticleDOI
A Maximum Likelihood Approach to Continuous Speech Recognition
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Proceedings ArticleDOI
Maximum mutual information estimation of hidden Markov model parameters for speech recognition
TL;DR: A method for estimating the parameters of hidden Markov models of speech is described and recognition results are presented comparing this method with maximum likelihood estimation.
Journal ArticleDOI
A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood
TL;DR: The currently used method of maximum likelihood, while heuristic, is shown to be superior under certain assumptions to another heuristic: the method of conditional maximum likelihood.
Journal ArticleDOI
On a model-robust training method for speech recognition
TL;DR: For minimizing the decoding error rate of the (optimal) maximum a posteriori probability (MAP) decoder, it is shown that the CMLE (or maximum mutual information estimate, MMIE) may be preferable when the model is incorrect.
Journal ArticleDOI
Optimal solution of a training problem in speech recognition
TL;DR: This correspondence presents the optimal Bayes solution to this optimization problem by maximizing the expected payoff: conditionally on the given training data, decode the acoustic signal for a word as any word that maximizes the a posteriori expected joint probability of the word and the acoustic signal.