scispace - formally typeset
Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
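Concretely, word error rate is the word-level edit distance (substitutions + deletions + insertions) between a reference transcript and a recognizer's hypothesis, normalized by the number of reference words. A minimal sketch using a standard Levenshtein dynamic program:

```python
def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N via word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = minimum edits to turn ref[:i] into hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))  # 1 deletion / 6 words
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why papers usually report relative reductions rather than absolute differences.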


Papers
Patent
29 Jan 2003
TL;DR: This patent presents a method for associating words and word strings in a language by analyzing the word formations around a word or word string to identify other words or word strings that are semantically equivalent or nearly equivalent.
Abstract: A method for creating and using a cross-idea association database (figure 1) that includes a method for associating words and word strings in a language by analyzing word formations around a word or word string to identify other words or word strings that are semantically equivalent or nearly equivalent. One method for associating words and word strings includes querying a collection of documents with a user-supplied word or word string (input device 210), determining a user-defined number of words or word strings to the left and right of the query string, determining the frequency of occurrence of words or word strings located on the left and right of the query string, and ranking the located words.
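The querying-and-ranking step the abstract describes can be sketched as follows; `rank_context_words`, its inputs, and the window size are hypothetical stand-ins for the patent's components, not its actual implementation:

```python
from collections import Counter

def rank_context_words(documents, query, window=2):
    """Count words appearing within `window` positions to the left or
    right of each occurrence of `query`, then rank them by frequency.
    `documents` is a list of plain-text strings (illustrative input)."""
    counts = Counter()
    q = query.split()
    for doc in documents:
        words = doc.split()
        for i in range(len(words) - len(q) + 1):
            if words[i:i + len(q)] == q:          # query match at position i
                left = words[max(0, i - window):i]
                right = words[i + len(q):i + len(q) + window]
                counts.update(left + right)
    return counts.most_common()

docs = ["the quick brown fox jumps", "a brown fox ran away"]
print(rank_context_words(docs, "brown fox", window=1))
```

Words that repeatedly occur in the same positions around the query would rise to the top of the ranking, which is the signal the patent uses to find near-equivalent strings.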

122 citations

Proceedings Article
01 Jan 1998
TL;DR: In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates that are the lowest the authors have seen reported on these tasks.
Abstract: This paper addresses the problem of acoustic phonetic modeling. First, heterogeneous acoustic measurements are chosen in order to maximize the acoustic-phonetic information extracted from the speech signal in preprocessing. Second, classifier systems are presented for successfully utilizing high-dimensional acoustic measurement spaces. The techniques used for achieving these two goals can be broadly categorized as hierarchical, committee-based, or a hybrid of these two. This paper presents committee-based and hybrid approaches. In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates of 18.3% and 24.4%, respectively. These error rates are the lowest we have seen reported on these tasks. In addition, experiments with a telephone-based weather information word recognition task led to word error rate reductions of 10–16%.
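In its simplest form, a committee-based classifier combines the decisions of several member classifiers by vote. The toy members below are invented for illustration and are not the paper's acoustic-measurement classifiers:

```python
from collections import Counter

def committee_predict(classifiers, x):
    """Majority-vote committee: each member classifies x, the most
    common label wins (ties broken by first vote seen)."""
    votes = [clf(x) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]

# Toy member classifiers over a single scalar feature (hypothetical).
members = [lambda x: "vowel" if x > 0.5 else "consonant",
           lambda x: "vowel" if x > 0.4 else "consonant",
           lambda x: "vowel" if x > 0.7 else "consonant"]
print(committee_predict(members, 0.6))  # two of three members vote "vowel"
```

The appeal of committees in high-dimensional acoustic spaces is that each member can specialize in a different measurement set, with the vote smoothing over individual errors.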

121 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrated the crucial importance of the newly introduced iterations in improving on the earlier stochastic approximation technique, and showed the sensitivity of the noise estimation algorithm's performance to the forgetting factor embedded in the algorithm.
Abstract: We describe a novel algorithm for recursive estimation of nonstationary acoustic noise which corrupts clean speech, and a successful application of the algorithm in the speech feature enhancement framework of noise-normalized SPLICE for robust speech recognition. The noise estimation algorithm makes use of a nonlinear model of the acoustic environment in the cepstral domain. Central to the algorithm is an innovative iterative stochastic approximation technique that improves the piecewise linear approximation to the nonlinearity involved and thereby increases the accuracy of noise estimation. We report comprehensive experiments on SPLICE-based, noise-robust speech recognition for the AURORA2 task using the results of iterative stochastic approximation. The effectiveness of the new technique is demonstrated in comparison with a more traditional MMSE noise estimation algorithm under otherwise identical conditions. The word error rate reduction achieved by iterative stochastic approximation for recursive noise estimation in the framework of noise-normalized SPLICE is 27.9% for the multicondition training mode and 67.4% for the clean-only training mode, compared with the results using standard cepstra with no speech enhancement and the baseline HMM supplied by AURORA2. These represent the best performance in the clean-training category of the September-2001 AURORA2 evaluation. The relative error rate reductions achieved by using the same noise estimate increase to 48.40% and 76.86%, respectively, for the two training modes after using a better-designed HMM system. The experimental results demonstrated the crucial importance of the newly introduced iterations in improving on the earlier stochastic approximation technique, and showed the sensitivity of the noise estimation algorithm's performance to the forgetting factor embedded in the algorithm.
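Relative word error rate reductions like the 27.9% and 67.4% figures above follow the standard formula (baseline − new) / baseline; a small sketch with illustrative numbers, not the paper's actual WER values:

```python
def relative_wer_reduction(baseline_wer, new_wer):
    """Relative WER reduction: fraction of the baseline error removed.
    Both arguments are error rates in [0, 1]."""
    return (baseline_wer - new_wer) / baseline_wer

# e.g. a baseline WER of 10% dropping to 5% is a 50% relative reduction
print(relative_wer_reduction(0.10, 0.05))  # 0.5
```

Reporting relative rather than absolute reductions makes results comparable across tasks with very different baseline error rates, which is why AURORA2 results are typically quoted this way.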

121 citations

Patent
11 Jun 1997
TL;DR: A portable speech signal preprocessing (SSP) device converts sound to feature vector data and channel characterization signals and couples them over a communication channel to a remote automatic speech/speaker recognition (ASSR) server, which uses them to reduce word error rate for speech recognition and to perform accurate speaker recognition.
Abstract: A portable acoustic signal (speech signal) preprocessing (SSP) device for accessing an automatic speech/speaker recognition (ASSR) server comprises a microphone for converting sound including speech, silence and background noise signals to analog signals; an analog signals to digital converter for converting the analog signals to digital signals; a digital signal processor (DSP) for generating feature vector data representing the digitized speech and silence/background noise, and for generating channel characterization signals; and an acoustic coupler for converting the feature vector data and the characterization signals to acoustic signals and coupling the acoustic signals to a communication channel to access the ASSR server to perform speech and speaker recognition at a remote location. The SSP device may also be configured to compress and encrypt data transmitted to the ASSR server via the DSP and encryption keys stored in a memory. The ASSR server receives the preprocessed acoustic signals to perform speech/speaker recognition by setting references, selecting appropriate decoding models and algorithms to decode the acoustic signals by modeling the channel transfer function from the channel characterization signals and processing the silence/background noise data to reduce word error rate for speech recognition and to perform accurate speaker recognition. A client/server system having the portable SSP device and the ASSR server can be used to remotely activate, reset, or change personal identification numbers (PINs) or user passwords for smartcards, magnetic cards, or electronic money cards.

121 citations

Journal ArticleDOI
TL;DR: The presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers, and the estimated formant frequencies were used in a number of variants for recognition.
Abstract: This paper presents a new method for estimating formant frequencies. The formant model is based on a digital resonator. Each resonator represents a segment of the short-time power spectrum. The complete spectrum is modeled by a set of digital resonators connected in parallel. An algorithm based on dynamic programming produces both the model parameters and the segment boundaries that optimally match the spectrum. We used this method in experimental tests that were carried out on the TI digit string data base. The main results of the experimental tests are: (1) the presented approach produces reliable estimates of formant frequencies across a wide range of sounds and speakers; and (2) the estimated formant frequencies were used in a number of variants for recognition. The best set-up resulted in a string error rate of 4.2% on the adult corpus of the TI digit string data base.
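A digital resonator of the kind the paper connects in parallel is commonly modeled as a two-pole filter whose pole angle sets the resonance (formant) frequency and whose pole radius sets the bandwidth. A hedged sketch of its magnitude response (the sampling rate and parameter values are illustrative, not taken from the paper):

```python
import cmath
import math

def resonator_magnitude(freq_hz, center_hz, bandwidth_hz, fs=8000.0):
    """Magnitude response at freq_hz of a two-pole digital resonator
    with poles at r * exp(+-j * theta)."""
    r = math.exp(-math.pi * bandwidth_hz / fs)   # pole radius from bandwidth
    theta = 2.0 * math.pi * center_hz / fs       # pole angle from center frequency
    z = cmath.exp(1j * 2.0 * math.pi * freq_hz / fs)
    p = r * cmath.exp(1j * theta)
    # H(z) = 1 / ((1 - p z^-1)(1 - p* z^-1))
    return abs(1.0 / ((1 - p / z) * (1 - p.conjugate() / z)))

# The magnitude should peak near the resonator's center frequency.
peak = max(range(100, 2000, 10),
           key=lambda f: resonator_magnitude(f, 500.0, 80.0))
print(peak)  # near 500 Hz
```

Fitting one such resonator per spectral segment, with segment boundaries chosen by dynamic programming as the abstract describes, yields both the formant frequencies and the segmentation in a single optimization.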

121 citations


Network Information
Related Topics (5)
Deep learning — 79.8K papers, 2.1M citations — 88% related
Feature extraction — 111.8K papers, 2.1M citations — 86% related
Convolutional neural network — 74.7K papers, 2M citations — 85% related
Artificial neural network — 207K papers, 4.5M citations — 84% related
Cluster analysis — 146.5K papers, 2.9M citations — 83% related
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528