Proceedings ArticleDOI

Large vocabulary decoding and confidence estimation using word posterior probabilities

Gunnar Evermann, +1 more
Vol. 3, pp. 1655–1658
TLDR
The paper investigates the estimation of word posterior probabilities based on word lattices, presents applications of these posteriors in a large vocabulary speech recognition system, and describes a novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder.
Abstract
The paper investigates the estimation of word posterior probabilities based on word lattices and presents applications of these posteriors in a large vocabulary speech recognition system. A novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder is presented. The problem of the robust estimation of confidence scores from word posteriors is examined and a method based on decision trees is suggested. The effectiveness of these techniques is demonstrated on the broadcast news and the conversational telephone speech corpora where improvements both in terms of word error rate and normalised cross entropy were achieved compared to the baseline HTK evaluation systems.
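The abstract's starting point, estimating word posterior probabilities from a lattice, is conventionally done with a forward-backward pass over the lattice arcs. The sketch below is illustrative only, not the paper's implementation: the arc representation and the function name `word_arc_posteriors` are assumptions, plain probabilities are used instead of the log-domain arithmetic a real decoder would need, and the additional step of summing posteriors of same-word arcs that overlap in time is omitted.

```python
from collections import defaultdict

def word_arc_posteriors(arcs, start, end):
    """Posterior probability of each word arc in a lattice.

    `arcs` is a list of (from_node, to_node, word, score) tuples, where
    `score` is the combined acoustic/language-model probability of the arc.
    Nodes are assumed to be topologically ordered integers, with `start`
    the initial node and `end` the final node.
    """
    fwd = defaultdict(float)
    bwd = defaultdict(float)
    fwd[start] = 1.0
    bwd[end] = 1.0
    # Forward pass: total probability of all partial paths reaching a node.
    for a, b, _, s in sorted(arcs, key=lambda x: x[0]):
        fwd[b] += fwd[a] * s
    # Backward pass: total probability of reaching the end from each node.
    for a, b, _, s in sorted(arcs, key=lambda x: -x[1]):
        bwd[a] += bwd[b] * s
    total = fwd[end]  # probability mass of all complete paths
    # Arc posterior = (mass of paths through the arc) / (mass of all paths).
    return [(w, fwd[a] * s * bwd[b] / total) for a, b, w, s in arcs]

# Toy lattice: two competing words "a"/"b" followed by "c".
arcs = [(0, 1, "a", 0.6), (0, 1, "b", 0.4), (1, 2, "c", 1.0)]
posteriors = dict(word_arc_posteriors(arcs, 0, 2))
# posteriors["a"] == 0.6, posteriors["b"] == 0.4, posteriors["c"] == 1.0
```

These arc posteriors are the raw quantities that the paper's decision-tree method would then map to calibrated confidence scores.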



Citations
Book

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

Dan Jurafsky, +1 more
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Journal ArticleDOI

Finding consensus in speech recognition: word error minimization and other applications of confusion networks

TL;DR: A new framework for distilling information from word lattices is described to improve the accuracy of the speech recognition output and obtain a more perspicuous representation of a set of alternative hypotheses.
Book

Application of Hidden Markov Models in Speech Recognition

TL;DR: The aim of this review is first to present the core architecture of a HMM-based LVCSR system and then to describe the various refinements which are needed to achieve state-of-the-art performance.

Posterior probability decoding, confidence estimation and system combination

TL;DR: The word lattices produced by the Viterbi decoder were used to generate confusion networks, which provide a compact representation of the most likely word hypotheses and their associated word posterior probabilities.

Synthesis Lectures on Human Language Technologies

TL;DR: This book gives a comprehensive view of state-of-the-art techniques used to build spoken dialogue systems, presents dialogue modelling and system development issues relevant in both academic and industrial environments, and discusses requirements and challenges for advanced interaction management and future research.
References
Proceedings ArticleDOI

A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)

TL;DR: The Recognizer Output Voting Error Reduction (ROVER) system was developed at NIST to produce a composite automatic speech recognition (ASR) output when the outputs of multiple ASR systems are available; the composite output has a lower error rate than any of the individual systems.
Proceedings Article

Finding consensus among words: lattice-based word error minimization

TL;DR: A new algorithm is described for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER); it overcomes the mismatch between the word-based performance metric and the standard sentence-based MAP scoring paradigm.
Proceedings Article

Explicit word error minimization in N-Best list rescoring

TL;DR: A new algorithm is developed that explicitly minimizes the expected word error of recognition hypotheses; it approximates the posterior hypothesis probabilities using N-best lists and chooses the hypothesis with the lowest expected error.
Proceedings ArticleDOI

LVCSR log-likelihood ratio scoring for keyword spotting

TL;DR: A new scoring algorithm has been developed for generating wordspotting hypotheses and their associated scores that uses a large-vocabulary continuous speech recognition system to generate the N-best answers along with their Viterbi alignments.
Proceedings ArticleDOI

Using word probabilities as confidence measures

TL;DR: An approach is presented that estimates the confidence in a hypothesized word as its posterior probability given all acoustic feature vectors of the utterance, computed as the sum of the probabilities of all word hypotheses that represent the occurrence of the same word in roughly the same time segment.