Proceedings ArticleDOI
Decoder selection based on cross-entropies
TLDR
The authors generalize the maximum likelihood and related optimization criteria for training and decoding with a speech recognizer by considering weighted linear combinations of the logarithms of the likelihoods of words, of acoustics, and of (word, acoustic) pairs.

Abstract
The authors generalize the maximum likelihood and related optimization criteria for training and decoding with a speech recognizer. The generalizations are constructed by considering weighted linear combinations of the logarithms of the likelihoods of words, of acoustics, and of (word, acoustic) pairs. The utility of various patterns of weights is examined.
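The weighted-combination criterion described in the abstract can be sketched as follows; the weight names, probability values, and hypothesis labels here are illustrative assumptions, not taken from the paper.

```python
import math

def combined_score(log_p_word, log_p_acoustic, log_p_joint,
                   w_word=1.0, w_acoustic=0.0, w_joint=0.0):
    """Weighted linear combination of log-likelihoods.

    With suitable weights this recovers standard criteria
    (e.g. word likelihood alone, or word + acoustic scores).
    A sketch; the weight parametrization is an assumption.
    """
    return (w_word * log_p_word
            + w_acoustic * log_p_acoustic
            + w_joint * log_p_joint)

# Decode by picking the hypothesis with the highest combined score.
# (log P(word), log P(acoustic | word), log P(word, acoustic)) per hypothesis:
hypotheses = {
    "word_a": (math.log(0.6), math.log(0.2), math.log(0.12)),
    "word_b": (math.log(0.4), math.log(0.5), math.log(0.20)),
}
best = max(hypotheses,
           key=lambda w: combined_score(*hypotheses[w],
                                        w_word=1.0, w_acoustic=1.0))
```

Changing the weights changes the decoder: with only `w_word` nonzero the prior-favored hypothesis wins, while including the acoustic term flips the decision here.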
Citations
Proceedings ArticleDOI
Minimum Phone Error and I-smoothing for improved discriminative training
Daniel Povey, Philip C. Woodland +1 more
TL;DR: The Minimum Phone Error (MPE) and Minimum Word Error (MWE) criteria are smoothed approximations to the phone and word error rates, respectively; I-smoothing is a novel technique for smoothing discriminative training criteria using statistics from maximum likelihood estimation (MLE).
Journal ArticleDOI
Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error
TL;DR: This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary speech recognition tasks.
Proceedings Article
Training Stochastic Model Recognition Algorithms as Networks can Lead to Maximum Mutual Information Estimation of Parameters
TL;DR: It is shown that once the output layer of a multilayer perceptron is modified to provide mathematically correct probability distributions, and the usual squared error criterion is replaced with a probability-based score, the result is equivalent to Maximum Mutual Information training.
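The equivalence claimed in this TL;DR rests on a simple identity: the cross-entropy loss on a softmax output is the negative conditional log-probability of the correct class, which is exactly the quantity MMI training maximizes (up to prior terms). A minimal sketch, with illustrative scores:

```python
import math

def log_softmax(scores, k):
    """Log of the k-th softmax output, computed stably via log-sum-exp.

    The softmax output is a mathematically correct posterior
    distribution, so its log is a conditional log-probability.
    """
    m = max(scores)
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return scores[k] - lse

# Cross-entropy loss for the true class (index 0) equals the negative
# conditional log-likelihood that MMI training maximizes.
scores = [2.0, 0.5, -1.0]
loss = -log_softmax(scores, 0)
```

Minimizing this loss over the network parameters is therefore the same as maximizing the mutual-information criterion on the posteriors.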
Journal ArticleDOI
An inequality for rational functions with applications to some statistical estimation problems
TL;DR: The well-known Baum-Eagon inequality provides an effective iterative scheme for finding a local maximum for homogeneous polynomials with positive coefficients over a domain of probability values.
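The iterative scheme behind the Baum-Eagon inequality is the growth transform: replace each probability by its value times the partial derivative of the polynomial, renormalized over the simplex. A minimal sketch with an illustrative polynomial (not one from the paper):

```python
def baum_eagon_step(x, grad):
    """One growth-transform update: x_i <- x_i * dP/dx_i / sum_j x_j * dP/dx_j.

    For a homogeneous polynomial P with positive coefficients over
    probability values, the Baum-Eagon inequality guarantees this
    update never decreases P.
    """
    g = grad(x)
    z = sum(xi * gi for xi, gi in zip(x, g))
    return [xi * gi / z for xi, gi in zip(x, g)]

# Illustrative example: P(x1, x2) = x1^2 + 2*x1*x2 + 3*x2^2 on the simplex.
P = lambda x: x[0] ** 2 + 2 * x[0] * x[1] + 3 * x[1] ** 2
grad = lambda x: [2 * x[0] + 2 * x[1], 2 * x[0] + 6 * x[1]]

x = [0.5, 0.5]
for _ in range(20):
    x = baum_eagon_step(x, grad)  # P(x) is monotonically non-decreasing
```

Each step stays on the probability simplex, and the iterates climb toward a local maximum of P (here the vertex x = (0, 1), where P = 3).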
Journal ArticleDOI
Towards increasing speech recognition error rates
TL;DR: In this article, the authors discuss some research directions for ASR that may not always yield an immediate and guaranteed decrease in error rate but which hold some promise for ultimately improving performance in the end applications, including discrimination between rival utterance models, the role of prior information in speech recognition, merging the language and acoustic models, feature extraction and temporal information, and decoding procedures reflecting human perceptual properties.
References
Journal ArticleDOI
A Maximum Likelihood Approach to Continuous Speech Recognition
TL;DR: This paper describes a number of statistical models for use in speech recognition, with special attention to determining the parameters for such models from sparse data, and describes two decoding methods appropriate for constrained artificial languages and one appropriate for more realistic decoding tasks.
Proceedings ArticleDOI
Maximum mutual information estimation of hidden Markov model parameters for speech recognition
TL;DR: A method for estimating the parameters of hidden Markov models of speech is described and recognition results are presented comparing this method with maximum likelihood estimation.
Journal ArticleDOI
A decision theoretic formulation of a training problem in speech recognition and a comparison of training by unconditional versus conditional maximum likelihood
TL;DR: The currently used method of maximum likelihood, while heuristic, is shown to be superior under certain assumptions to another heuristic: the method of conditional maximum likelihood.
Journal ArticleDOI
On a model-robust training method for speech recognition
TL;DR: For minimizing the decoding error rate of the (optimal) maximum a posteriori probability (MAP) decoder, it is shown that the CMLE (or maximum mutual information estimate, MMIE) may be preferable when the model is incorrect.
Journal ArticleDOI
Optimal solution of a training problem in speech recognition
TL;DR: This correspondence presents the optimal Bayes solution to this optimization problem by maximizing the expected payoff: conditionally on the given training data, decode the acoustic signal for a word as any word that maximizes the a posteriori expected joint probability of the word and the acoustic signal.