Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
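Word error rate (WER) is conventionally defined as the word-level Levenshtein distance between a reference transcript and a hypothesis, divided by the number of reference words. A minimal sketch of that computation (function name and example strings are illustrative, not from any paper listed here):

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance (substitutions + insertions +
    deletions) divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits turning the first i reference words
    # into the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

# One substitution (sat -> sit) and one deletion (the) over 6 words:
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why it is an error rate rather than a bounded accuracy score.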


Papers
Patent
Jung-Eun Kim1, Jae-won Lee1
05 Aug 2004
TL;DR: In this article, a meta-dialogue generation unit generates a question asking the user for additional information based on a content of a portion where the error exists and a type of the error.
Abstract: In order to handle portions of a recognized sentence having an error, a speaker or user is questioned about contents associated with those portions, and a result is obtained according to the user's answer. A speech recognition unit extracts a speech feature of a speech signal input by a user and finds the phoneme nearest to the speech feature to recognize a word. A recognition error determination unit finds a sentence confidence based on the confidence of the recognized word, examines the semantic structure of the recognized sentence, and determines whether or not an error exists in the recognized sentence according to a predetermined criterion based on both the sentence confidence and the result of the semantic-structure examination. Further, a meta-dialogue generation unit generates a question asking the user for additional information based on the content of the portion where the error exists and the type of the error.

92 citations

Proceedings ArticleDOI
23 Mar 1992
TL;DR: Preliminary senone modeling results are reported that significantly reduce the word error rate for speaker-independent continuous speech recognition by treating the state in phonetic hidden Markov models as the basic subphonetic unit, the senone.
Abstract: There will never be sufficient training data to model all the various acoustic-phonetic phenomena. How to capture important clues and estimate the needed parameters reliably is one of the central issues in speech recognition. Successful examples include subword models, fenones, and many other smoothing techniques. In comparison with subword models, subphonetic modeling may provide a finer level of detail. The authors propose to model subphonetic events with Markov states and treat the state in phonetic hidden Markov models as the basic subphonetic unit, the senone. Senones generalize fenones in several ways. A word model is a concatenation of senones, and senones can be shared across different word models. Senone models not only allow parameter sharing but also enable pronunciation optimization. The authors report preliminary senone modeling results, which have significantly reduced the word error rate for speaker-independent continuous speech recognition.

92 citations

Proceedings ArticleDOI
Bo Li1, Shuo-Yiin Chang1, Tara N. Sainath1, Ruoming Pang1, Yanzhang He1, Trevor Strohman1, Yonghui Wu1 
24 Apr 2020
TL;DR: This work proposes to reduce E2E model’s latency by extending the RNN-T endpointer (RNN- T EP) model with additional early and late penalties and achieves 8.0% relative word error rate (WER) reduction and 130ms 90-percentile latency reduction over [2] on a Voice Search test set.
Abstract: End-to-end (E2E) models fold the acoustic, pronunciation and language models of a conventional speech recognition model into one neural network with a much smaller number of parameters than a conventional ASR system, thus making it suitable for on-device applications. For example, recurrent neural network transducer (RNN-T) as a streaming E2E model has shown promising potential for on-device ASR [1]. For such applications, quality and latency are two critical factors. We propose to reduce E2E model’s latency by extending the RNN-T endpointer (RNN-T EP) model [2] with additional early and late penalties. By further applying the minimum word error rate (MWER) training technique [3], we achieved 8.0% relative word error rate (WER) reduction and 130ms 90-percentile latency reduction over [2] on a Voice Search test set. We also experimented with a second-pass Listen, Attend and Spell (LAS) rescorer [4]. Although it did not directly improve the first pass latency, the large WER reduction provides extra room to trade WER for latency. RNN-T EP+LAS, together with MWER training brings in 18.7% relative WER reduction and 160ms 90-percentile latency reductions compared to the original proposed RNN-T EP [2] model.
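The relative WER reductions quoted above (8.0% and 18.7%) follow the usual convention: the fraction of baseline errors eliminated, (WER_baseline − WER_new) / WER_baseline. A small sketch with hypothetical numbers (not taken from the paper):

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline's errors eliminated by the new system."""
    return (baseline_wer - new_wer) / baseline_wer

# Hypothetical illustration: a baseline WER of 6.00% improved to 5.52%
# corresponds to an 8.0% *relative* reduction, even though the
# absolute WER drop is only 0.48 percentage points.
print(relative_wer_reduction(0.0600, 0.0552))
```

Reporting relative rather than absolute reduction is common in ASR papers because it stays comparable across test sets with very different baseline error rates.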

92 citations

Book ChapterDOI
William W. Cohen1
09 Jul 1995
TL;DR: It is shown that FOIL usually forms classifiers with lower error rates and higher rates of precision and recall with a relational encoding than with a propositional encoding, and its performance can be improved by relation selection, a first order analog of feature selection.
Abstract: We evaluate the first order learning system FOIL on a series of text categorization problems. It is shown that FOIL usually forms classifiers with lower error rates and higher rates of precision and recall with a relational encoding than with a propositional encoding. We show that FOIL's performance can be improved by relation selection, a first order analog of feature selection. Relation selection improves FOIL's performance as measured by any of recall, precision, F-measure, or error rate. With an appropriate level of relation selection, FOIL appears to be competitive with or superior to existing propositional techniques.

92 citations

01 Jan 2002
TL;DR: Bionic Pattern Recognition uses neural networks that cover the high-dimensional geometrical distribution of the sample set in the feature space, exploiting the continuity of samples belonging to the same class.
Abstract: A new model of pattern recognition principles, which is based on "matter cognition" instead of the "matter classification" of traditional statistical pattern recognition, has been proposed. This new model is closer to the way humans recognize objects than traditional statistical pattern recognition, which uses "optimal separating" as its main principle. The new model is therefore called Bionic Pattern Recognition. Its mathematical basis is topological analysis of the sample set in the high-dimensional feature space, so it is also called Topological Pattern Recognition. The basic idea of the model rests on the continuity, in the feature space, of samples belonging to any one class. We carried out experiments on recognition of omnidirectionally oriented rigid objects on the same level, using Bionic Pattern Recognition with neural networks that cover the high-dimensional geometrical distribution of the sample set in the feature space. Many animal and vehicle models (even ones with rather similar shapes) were recognized omnidirectionally thousands of times. Over a total of 8800 tests, the correct recognition rate was 99.75%, while the error rate and the rejection rate were 0% and 0.25%, respectively.

91 citations


Network Information
Related Topics (5)
Deep learning
79.8K papers, 2.1M citations
88% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Convolutional neural network
74.7K papers, 2M citations
85% related
Artificial neural network
207K papers, 4.5M citations
84% related
Cluster analysis
146.5K papers, 2.9M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528