
Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
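For reference, word error rate is conventionally computed from the minimum edit distance between a reference transcript and a recognizer hypothesis: WER = (substitutions + deletions + insertions) / number of reference words. A minimal Python sketch (not tied to any particular paper listed below):

```python
def wer(reference, hypothesis):
    """Word error rate via Levenshtein edit distance over word tokens:
    (substitutions + deletions + insertions) / len(reference words)."""
    r, h = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between first i reference words
    # and first j hypothesis words
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i  # deleting i reference words
    for j in range(len(h) + 1):
        d[0][j] = j  # inserting j hypothesis words
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = d[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(r)][len(h)] / len(r)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to a short reference.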


Papers
Journal ArticleDOI
TL;DR: This work uses a genetic algorithm, instead of the quasispecies model, as the underlying model of evolution; it investigates whether error thresholds arise in finite populations of bit strings evolving on complex landscapes, and finds that error thresholds depend mainly on selection pressure and genotype length.
Abstract: The error threshold of replication is an important notion in the quasispecies evolution model; it is a critical mutation rate (error rate) beyond which structures obtained by an evolutionary process are destroyed more frequently than selection can reproduce them. With mutation rates above this critical value, an error catastrophe occurs and the genomic information is irretrievably lost. Therefore, studying the factors that alter this magnitude has important implications for the study of evolution. Here we use a genetic algorithm, instead of the quasispecies model, as the underlying model of evolution, and explore whether the phenomenon of error thresholds is found in finite populations of bit strings evolving on complex landscapes. Our empirical results verify the occurrence of error thresholds in genetic algorithms. In this way, this notion is brought from molecular evolution to evolutionary computation. We also study the effect of modifying the most prominent evolutionary parameters on the magnitude of this critical value, and find that error thresholds depend mainly on the selection pressure and genotype length.

66 citations
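The error-threshold effect described above can be illustrated with a toy genetic algorithm on bit strings. This sketch is not the authors' experimental setup (the fitness function, selection scheme, and parameters here are illustrative assumptions): fitness is simply the count of ones, selection is binary tournament, and each bit flips independently with probability mut_rate. Past a high enough mutation rate, selection can no longer hold the population near the fittest string.

```python
import random

def evolve(mut_rate, length=20, pop_size=50, gens=100, seed=0):
    """Toy GA on bit strings: fitness = number of ones, binary tournament
    selection, independent per-bit mutation. Returns the best fitness in
    the final population as a fraction of the maximum."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    fit = lambda g: sum(g)
    for _ in range(gens):
        new_pop = []
        for _ in range(pop_size):
            # binary tournament: keep the fitter of two random picks
            a, b = rng.choice(pop), rng.choice(pop)
            parent = a if fit(a) >= fit(b) else b
            # each bit flips with probability mut_rate
            child = [bit ^ (rng.random() < mut_rate) for bit in parent]
            new_pop.append(child)
        pop = new_pop
    return max(fit(g) for g in pop) / length
```

With a tiny mutation rate the population converges on the all-ones string, while at mut_rate = 0.5 every child bit is effectively random and the inherited structure is lost, which is the qualitative signature of crossing the error threshold.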

Proceedings ArticleDOI
01 Dec 2007
TL;DR: In contrast to the common belief that "there is no data like more data", the authors find it possible to select a highly informative subset of the data that produces recognition performance comparable to a system that makes use of a much larger amount of data.
Abstract: This paper presents a strategy for efficiently selecting informative data from large corpora of transcribed speech. We propose to choose data uniformly according to the distribution of some target speech unit (phoneme, word, character, etc). In our experiment, in contrast to the common belief that "there is no data like more data", we found it possible to select a highly informative subset of data that produces recognition performance comparable to a system that makes use of a much larger amount of data. At the same time, our selection process is efficient and fast.

66 citations
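The selection strategy described above — choosing data uniformly with respect to the distribution of some target speech unit — can be sketched as a greedy procedure. The function below is a hypothetical illustration, not the paper's algorithm: it scores each utterance by how much it would improve coverage of currently under-represented units (words, in this sketch).

```python
from collections import Counter

def select_uniform(utterances, budget):
    """Greedy sketch of distribution-flattening data selection: repeatedly
    pick the utterance whose units are least represented so far, pushing
    the selected set toward a uniform unit distribution."""
    counts = Counter()   # occurrences of each unit in the selected set
    chosen = []
    pool = list(utterances)
    for _ in range(min(budget, len(pool))):
        def gain(u):
            # units already seen often contribute little to the score
            return sum(1.0 / (1 + counts[w]) for w in set(u.split()))
        best = max(pool, key=gain)
        pool.remove(best)
        chosen.append(best)
        counts.update(best.split())
    return chosen
```

In a real system the units would more likely be phonemes or characters drawn from transcriptions, and the scoring would be tuned to the recognizer's needs; the greedy structure is the part this sketch is meant to convey.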

Proceedings Article
01 Jan 1998
TL;DR: There is a substantial mismatch between real speech and the combination of the authors' acoustic models and the pronunciations in their recognition dictionary; the use of simulation appears to be a promising tool in efforts to understand and reduce the size of this mismatch.
Abstract: We present a study of data simulated using acoustic models trained on Switchboard data, and then recognized using various Switchboard-trained acoustic models. When we recognize real Switchboard conversations, simple development models give a word error rate (WER) of about 47 percent. If instead we simulate the speech data using word transcriptions of the conversation, obtaining the pronunciations for the words from our recognition dictionary, the WER drops by a factor of five to ten. In a third type of experiment, we use human-generated phonetic transcripts to fabricate data that more realistically represents conversational speech, and obtain WERs in the low 40s, rates that are fairly similar to those seen in actual speech data. Taken as a whole, these and other experiments we describe in the paper suggest that there is a substantial mismatch between real speech and the combination of our acoustic models and the pronunciations in our recognition dictionary. The use of simulation appears to be a promising tool in our efforts to understand and reduce the size of this mismatch, and may prove to be a generally valuable diagnostic in speech recognition research.

66 citations

Journal ArticleDOI
TL;DR: An implementation of a speaker-independent digit-recognition system based on segmenting the unknown word into three regions and then making categorical judgments as to which of six broad acoustic classes each segment falls into.
Abstract: This paper describes an implementation of a speaker-independent digit-recognition system. The digit classification scheme is based on segmenting the unknown word into three regions and then making categorical judgments as to which of six broad acoustic classes each segment falls into. The measurements made on the speech waveform include energy, zero crossings, two-pole linear predictive coding analysis, and normalized error of the linear predictive coding analysis. A formal evaluation of the system showed an error rate of 2.7 percent for a carefully controlled recording environment and a 5.6 percent error rate for on-line recordings in a noisy computer room.

66 citations
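Two of the waveform measurements mentioned in the abstract, short-time energy and zero crossings, are straightforward to compute per frame. The sketch below uses non-overlapping rectangular frames as a simplifying assumption (the paper's system additionally uses two-pole LPC analysis and its normalized error, which are not reproduced here):

```python
def frame_features(samples, frame_len=80):
    """Per-frame short-time energy and zero-crossing count over
    non-overlapping frames of `frame_len` samples."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(s * s for s in frame)
        # count sign changes between consecutive samples
        zero_crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
        )
        feats.append((energy, zero_crossings))
    return feats
```

High energy with few zero crossings suggests voiced speech, while low energy with many zero crossings suggests fricatives or silence with noise — the kind of evidence a broad-acoustic-class segmenter can exploit.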

20 Oct 1998
TL;DR: This paper documents the use of Broadcast News test materials in DARPA-sponsored Automatic Speech Recognition (ASR) Benchmark Tests conducted late in 1998, and results are reported on non-English language Broadcast News materials in Spanish and Mandarin.
Abstract: This paper documents the use of Broadcast News test materials in DARPA-sponsored Automatic Speech Recognition (ASR) Benchmark Tests conducted late in 1998. As in last year's tests [1], statistical selection procedures were used in selecting test materials. Two test epochs were used, each yielding (nominally) one and one-half hours of test material. One of the test sets was drawn from the same test epoch as was used for last year's tests, and the other was drawn from a more recent period. Results are reported for two types of systems: one (the "Hub", or "baseline" systems) for which there were no limits on computational resources, and another (the "less than 10X real-time spoke" systems) for systems that ran in less than 10 times real-time. The lowest word error rate reported this year for the "Hub" systems was 13.5%, contrasting with last year's lowest word error rate of 16.2%. For the "less than 10X real-time spoke" systems, the lowest reported word error rate was 16.1%. Results are also reported, for the second year, on non-English language Broadcast News materials in Spanish and Mandarin.

66 citations


Network Information

Related Topics (5)
- Deep learning — 79.8K papers, 2.1M citations, 88% related
- Feature extraction — 111.8K papers, 2.1M citations, 86% related
- Convolutional neural network — 74.7K papers, 2M citations, 85% related
- Artificial neural network — 207K papers, 4.5M citations, 84% related
- Cluster analysis — 146.5K papers, 2.9M citations, 83% related
Performance Metrics

No. of papers in the topic in previous years:

Year    Papers
2023       271
2022       562
2021       640
2020       643
2019       633
2018       528