Proceedings ArticleDOI

Large vocabulary decoding and confidence estimation using word posterior probabilities

Gunnar Evermann, +1 more
Vol. 3, pp. 1655–1658
TLDR
The paper investigates the estimation of word posterior probabilities based on word lattices, presents applications of these posteriors in a large vocabulary speech recognition system, and describes a novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder.
Abstract
The paper investigates the estimation of word posterior probabilities based on word lattices and presents applications of these posteriors in a large vocabulary speech recognition system. A novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder is presented. The problem of the robust estimation of confidence scores from word posteriors is examined and a method based on decision trees is suggested. The effectiveness of these techniques is demonstrated on the broadcast news and the conversational telephone speech corpora where improvements both in terms of word error rate and normalised cross entropy were achieved compared to the baseline HTK evaluation systems.
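The abstract's starting point, estimating word posterior probabilities from a lattice, is conventionally done with a forward-backward pass over the lattice arcs. The sketch below is illustrative only, not the paper's implementation: the arc representation and the function name `word_arc_posteriors` are assumptions, plain probabilities are used instead of the log-domain arithmetic a real decoder would need, and the additional step of summing posteriors of same-word arcs that overlap in time is omitted.

```python
from collections import defaultdict

def word_arc_posteriors(arcs, start, end):
    """Posterior probability of each word arc in a lattice.

    `arcs` is a list of (from_node, to_node, word, score) tuples, where
    `score` is the combined acoustic/language-model probability of the arc.
    Nodes are assumed to be topologically ordered integers, with `start`
    the initial node and `end` the final node.
    """
    fwd = defaultdict(float)
    bwd = defaultdict(float)
    fwd[start] = 1.0
    bwd[end] = 1.0
    # Forward pass: total probability of all partial paths reaching a node.
    for a, b, _, s in sorted(arcs, key=lambda x: x[0]):
        fwd[b] += fwd[a] * s
    # Backward pass: total probability of reaching the end from each node.
    for a, b, _, s in sorted(arcs, key=lambda x: -x[1]):
        bwd[a] += bwd[b] * s
    total = fwd[end]  # probability mass of all complete paths
    # Arc posterior = (mass of paths through the arc) / (mass of all paths).
    return [(w, fwd[a] * s * bwd[b] / total) for a, b, w, s in arcs]

# Toy lattice: two competing words "a"/"b" followed by "c".
arcs = [(0, 1, "a", 0.6), (0, 1, "b", 0.4), (1, 2, "c", 1.0)]
posteriors = dict(word_arc_posteriors(arcs, 0, 2))
# posteriors["a"] == 0.6, posteriors["b"] == 0.4, posteriors["c"] == 1.0
```

These arc posteriors are the raw quantities that the paper's decision-tree method would then map to calibrated confidence scores.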



Citations
Book

Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition

Dan Jurafsky, +1 more
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation.
Journal ArticleDOI

Finding consensus in speech recognition: word error minimization and other applications of confusion networks

TL;DR: A new framework for distilling information from word lattices is described to improve the accuracy of the speech recognition output and obtain a more perspicuous representation of a set of alternative hypotheses.
Book

Application of Hidden Markov Models in Speech Recognition

TL;DR: The aim of this review is first to present the core architecture of a HMM-based LVCSR system and then to describe the various refinements which are needed to achieve state-of-the-art performance.

Posterior probability decoding, confidence estimation and system combination

TL;DR: The word lattices produced by the Viterbi decoder were used to generate confusion networks, which provide a compact representation of the most likely word hypotheses and their associated word posterior probabilities.

Synthesis Lectures on Human Language Technologies

TL;DR: This book gives a comprehensive view of state-of-the-art techniques used to build spoken dialogue systems, presents dialogue modelling and system development issues relevant in both academic and industrial environments, and discusses requirements and challenges for advanced interaction management and future research.
References
Proceedings ArticleDOI

A post-processing system to yield reduced word error rates: Recognizer Output Voting Error Reduction (ROVER)

TL;DR: The Recognizer Output Voting Error Reduction (ROVER) system was developed at NIST to produce a composite automatic speech recognition (ASR) output when the outputs of multiple ASR systems are available; the composite output has a lower error rate than any of the individual systems.
Proceedings Article

Finding consensus among words: lattice-based word error minimization

TL;DR: A new algorithm is described for finding the hypothesis in a recognition lattice that is expected to minimize the word error rate (WER); it overcomes the mismatch between the word-based performance metric and the standard sentence-based MAP scoring paradigm.
Proceedings Article

Explicit word error minimization in N-Best list rescoring

TL;DR: A new algorithm is developed that explicitly minimizes the expected word error of recognition hypotheses; it approximates the posterior hypothesis probabilities using N-best lists and chooses the hypothesis with the lowest expected error.
Proceedings ArticleDOI

LVCSR log-likelihood ratio scoring for keyword spotting

TL;DR: A new scoring algorithm has been developed for generating wordspotting hypotheses and their associated scores that uses a large-vocabulary continuous speech recognition system to generate the N-best answers along with their Viterbi alignments.
Proceedings ArticleDOI

Using word probabilities as confidence measures

TL;DR: An approach is presented that estimates the confidence in a hypothesized word as its posterior probability given all acoustic feature vectors of the utterance, computed as the sum of the probabilities of all word hypotheses that represent the occurrence of the same word in roughly the same time segment.