scispace - formally typeset
Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
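Concretely, word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a recognizer's hypothesis, normalized by the number of reference words. A minimal sketch using a standard Levenshtein dynamic program:

```python
def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N via word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why papers usually report relative reductions rather than absolute differences.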


Papers
Patent
29 Jan 2003
TL;DR: This patent presents a method for associating words and word strings in a language by analyzing the word formations around a word or word string to identify other words or word strings that are semantically equivalent or nearly equivalent.
Abstract: A method for creating and using a cross-idea association database (figure 1) that includes a method for associating words and word strings in a language by analyzing word formations around a word or word string to identify other words or word strings that are semantically equivalent or nearly equivalent. One method for associating words and word strings includes querying a collection of documents with a user-supplied word or word string (input device 210), determining a user-defined number of words or word strings to the left and right of the query string, determining the frequency of occurrence of words or word strings located on the left and right of the query string, and ranking the located words.
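The querying-and-ranking step the abstract describes can be sketched as follows; `rank_context_words`, its inputs, and the window size are hypothetical stand-ins for the patent's components, not its actual implementation:

```python
from collections import Counter

def rank_context_words(documents, query, window=2):
    """Count words appearing within `window` positions to the left or
    right of each occurrence of `query`, then rank them by frequency.
    `documents` is a list of plain-text strings (illustrative input)."""
    counts = Counter()
    q = query.split()
    for doc in documents:
        words = doc.split()
        for i in range(len(words) - len(q) + 1):
            if words[i:i + len(q)] == q:          # query match at position i
                left = words[max(0, i - window):i]
                right = words[i + len(q):i + len(q) + window]
                counts.update(left + right)
    return counts.most_common()

docs = ["the quick brown fox jumps", "a brown fox ran away"]
print(rank_context_words(docs, "brown fox", window=1))
```

Words that repeatedly occur in the same positions around the query would rise to the top of the ranking, which is the signal the patent uses to find near-equivalent strings.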

122 citations

Proceedings Article
01 Jan 1998
TL;DR: In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates that are the lowest the authors have seen reported on these tasks.
Abstract: This paper addresses the problem of acoustic phonetic modeling. First, heterogeneous acoustic measurements are chosen in order to maximize the acoustic-phonetic information extracted from the speech signal in preprocessing. Second, classifier systems are presented for successfully utilizing high-dimensional acoustic measurement spaces. The techniques used for achieving these two goals can be broadly categorized as hierarchical, committee-based, or a hybrid of these two. This paper presents committee-based and hybrid approaches. In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates of 18.3% and 24.4%, respectively. These error rates are the lowest we have seen reported on these tasks. In addition, experiments with a telephone-based weather information word recognition task led to word error rate reductions of 10–16%.
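In its simplest form, a committee-based classifier combines the decisions of several member classifiers by vote. The toy members below are invented for illustration and are not the paper's acoustic-measurement classifiers:

```python
from collections import Counter

def committee_predict(classifiers, x):
    """Majority-vote committee: each member classifies x, the most
    common label wins (ties broken by first vote seen)."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Toy member classifiers over a single scalar feature (hypothetical).
members = [lambda x: "vowel" if x > 0.5 else "consonant",
           lambda x: "vowel" if x > 0.4 else "consonant",
           lambda x: "vowel" if x > 0.7 else "consonant"]
print(committee_predict(members, 0.6))  # two of three members vote "vowel"
```

The appeal of committees in high-dimensional acoustic spaces is that each member can specialize in a different measurement set, with the vote smoothing over individual errors.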

121 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrated the crucial importance of the newly introduced iterations in improving on the earlier stochastic approximation technique, and showed the sensitivity of the noise estimation algorithm's performance to the forgetting factor embedded in the algorithm.
Abstract: We describe a novel algorithm for recursive estimation of nonstationary acoustic noise which corrupts clean speech, and a successful application of the algorithm in the speech feature enhancement framework of noise-normalized SPLICE for robust speech recognition. The noise estimation algorithm makes use of a nonlinear model of the acoustic environment in the cepstral domain. Central to the algorithm is an innovative iterative stochastic approximation technique that improves the piecewise linear approximation to the nonlinearity involved and thereby increases the accuracy of noise estimation. We report comprehensive experiments on SPLICE-based, noise-robust speech recognition for the AURORA2 task using the results of iterative stochastic approximation. The effectiveness of the new technique is demonstrated in comparison with a more traditional MMSE noise estimation algorithm under otherwise identical conditions. The word error rate reduction achieved by iterative stochastic approximation for recursive noise estimation in the framework of noise-normalized SPLICE is 27.9% for the multicondition training mode and 67.4% for the clean-only training mode, compared with the results using standard cepstra with no speech enhancement and the baseline HMM supplied by AURORA2. These represent the best performance in the clean-training category of the September-2001 AURORA2 evaluation. The relative error rate reductions achieved by using the same noise estimate increase to 48.40% and 76.86%, respectively, for the two training modes after using a better-designed HMM system. The experimental results demonstrated the crucial importance of the newly introduced iterations in improving on the earlier stochastic approximation technique, and showed the sensitivity of the noise estimation algorithm's performance to the forgetting factor embedded in the algorithm.
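Relative word error rate reductions like the 27.9% and 67.4% figures above follow the standard formula (baseline − new) / baseline; a small sketch with illustrative numbers, not the paper's actual WER values:

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction: fraction of the baseline error removed.
    Both arguments are error rates in [0, 1]."""
    return (baseline_wer - new_wer) / baseline_wer

# e.g. a baseline WER of 10% dropping to 5% is a 50% relative reduction
print(relative_wer_reduction(0.10, 0.05))  # 0.5
```

Reporting relative rather than absolute reductions makes results comparable across tasks with very different baseline error rates, which is why AURORA2 results are typically quoted this way.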

121 citations

Patent
11 Jun 1997
TL;DR: A portable speech signal preprocessing (SSP) device converts sound to feature vector data and channel characterization signals and couples them over a communication channel to a remote automatic speech/speaker recognition (ASSR) server, which uses them to reduce word error rate for speech recognition and to perform accurate speaker recognition.
Abstract: A portable acoustic signal (speech signal) preprocessing (SSP) device for accessing an automatic speech/speaker recognition (ASSR) server comprises a microphone for converting sound including speech, silence and background noise signals to analog signals; an analog signals to digital converter for converting the analog signals to digital signals; a digital signal processor (DSP) for generating feature vector data representing the digitized speech and silence/background noise, and for generating channel characterization signals; and an acoustic coupler for converting the feature vector data and the characterization signals to acoustic signals and coupling the acoustic signals to a communication channel to access the ASSR server to perform speech and speaker recognition at a remote location. The SSP device may also be configured to compress and encrypt data transmitted to the ASSR server via the DSP and encryption keys stored in a memory. The ASSR server receives the preprocessed acoustic signals to perform speech/speaker recognition by setting references, selecting appropriate decoding models and algorithms to decode the acoustic signals by modeling the channel transfer function from the channel characterization signals and processing the silence/background noise data to reduce word error rate for speech recognition and to perform accurate speaker recognition. A client/server system having the portable SSP device and the ASSR server can be used to remotely activate, reset, or change personal identification numbers (PINs) or user passwords for smartcards, magnetic cards, or electronic money cards.

121 citations

Journal ArticleDOI
TL;DR: The presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers, and the estimated formant frequencies were used in a number of variants for recognition.
Abstract: This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and the segment boundaries that optimally match the spectrum. We used this method in experimental tests that were carried out on the TI digit string data base. The main results of the experimental tests are: (1) the presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers; and (2) the estimated formant frequencies were used in a number of variants for recognition. The best set-up resulted in a string error rate of 4.2% on the adult corpus of the TI digit string data base.
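A digital resonator of the kind the paper connects in parallel is commonly modeled as a two-pole filter whose pole angle sets the resonance (formant) frequency and whose pole radius sets the bandwidth. A hedged sketch of its magnitude response (the sampling rate and parameter values are illustrative, not taken from the paper):

```python
import cmath
import math

def resonator_magnitude(freq_hz, center_hz, bandwidth_hz, fs=8000.0):
    """Magnitude response at freq_hz of a two-pole digital resonator
    with poles at r * exp(+-j * theta)."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * math.pi * center_hz / fs       # pole angle from center frequency
    z = cmath.exp(1j * 2.0 * math.pi * freq_hz / fs)
    p = r * cmath.exp(1j * theta)
    # H(z) = 1 / ((1 - p z^-1)(1 - p* z^-1))
    return abs(1.0 / ((1 - p / z) * (1 - p.conjugate() / z)))

# The magnitude should peak near the resonator's center frequency.
peak = max(range(100, 2000, 10),
           key=lambda f: resonator_magnitude(f, 500.0, 80.0))
print(peak)  # near 500 Hz
```

Fitting one such resonator per spectral segment, with segment boundaries chosen by dynamic programming as the abstract describes, yields both the formant frequencies and the segmentation in a single optimization.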

121 citations


Network Information
Related Topics (5)
Deep learning — 79.8K papers, 2.1M citations — 88% related
Feature extraction — 111.8K papers, 2.1M citations — 86% related
Convolutional neural network — 74.7K papers, 2M citations — 85% related
Artificial neural network — 207K papers, 4.5M citations — 84% related
Cluster analysis — 146.5K papers, 2.9M citations — 83% related
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528