
Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
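Word error rate is conventionally computed as (substitutions + deletions + insertions) divided by the number of reference words, obtained from a word-level Levenshtein alignment. A minimal sketch (an illustration of the metric itself, not tied to any paper listed below):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution or match
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))  # one deletion -> 1/6
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, since the denominator is the reference length.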


Papers
Patent
Nils Klarlund1, Michael Riley1
26 Mar 2003
TL;DR: In this patent, a QWERTY-based cluster keyboard is described, which comprises fourteen alphabet keys arranged so that all the letters of the alphabet are distributed across three rows of keys in the standard QWERTY positions.
Abstract: A QWERTY-based cluster keyboard is disclosed. In the preferred embodiment, the keyboard comprises fourteen alphabet keys arranged such that all the letters in the alphabet are distributed in three rows of keys and in the standard QWERTY positions. Stochastic language models are used to reduce the error rate for typing on the keyboards. The language models consist of probability estimates of occurrences of n-grams (sequences of n consecutive words), wherein n is preferably 1, 2 or 3. A delay parameter d, which is related to the period of time the system displays the predicted intended word upon entry of a word boundary, is preferably zero to immediately display the primary word choice at a word boundary and provide the user the option to select the secondary candidate if necessary. Two disambiguation keys enable the user to identify which letter is intended as a secondary option to the language model predictions.
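The n-gram prediction step the patent relies on can be sketched with a toy bigram model that ranks candidate next words by count, yielding a primary choice and secondary candidates. The corpus, function names, and ranking scheme here are illustrative, not taken from the patent:

```python
from collections import Counter, defaultdict

def train_bigram(corpus):
    """Collect bigram counts, a crude stand-in for n-gram probability estimates."""
    bigrams = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for w1, w2 in zip(words, words[1:]):
            bigrams[w1][w2] += 1
    return bigrams

def predict_next(bigrams, w1):
    """Return candidate next words ranked by count: primary choice first,
    secondary candidates after (the user may override the primary)."""
    return [w for w, _ in bigrams[w1].most_common()]

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram(corpus)
print(predict_next(model, "the"))  # ['cat', 'dog'] -- 'cat' is the primary choice
```

A real system would smooth these estimates and combine unigram, bigram, and trigram orders, as the abstract describes.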

86 citations

Proceedings Article
01 Jun 2007
TL;DR: It is shown how a BF containing n-grams can enable us to use much larger corpora and higher-order models complementing a conventional n-gram LM within an SMT system.
Abstract: A Bloom filter (BF) is a randomised data structure for set membership queries. Its space requirements are significantly below lossless information-theoretic lower bounds but it produces false positives with some quantifiable probability. Here we explore the use of BFs for language modelling in statistical machine translation. We show how a BF containing n-grams can enable us to use much larger corpora and higher-order models complementing a conventional n-gram LM within an SMT system. We also consider (i) how to include approximate frequency information efficiently within a BF and (ii) how to reduce the error rate of these models by first checking for lower-order sub-sequences in candidate n-grams. Our solutions in both cases retain the one-sided error guarantees of the BF while taking advantage of the Zipf-like distribution of word frequencies to reduce the space requirements.
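The one-sided-error membership test at the heart of this approach can be sketched as follows; the bit-array size, hash count, and hashing scheme are illustrative assumptions, not the paper's construction:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: no false negatives, tunable false-positive rate."""
    def __init__(self, size=1024, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = [False] * size

    def _positions(self, item):
        # Derive k independent positions by salting one hash function.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def __contains__(self, item):
        # True may be a false positive; False is always correct (one-sided error).
        return all(self.bits[pos] for pos in self._positions(item))

bf = BloomFilter()
for ngram in ["the cat", "cat sat", "sat on"]:
    bf.add(ngram)
print("the cat" in bf)  # True -- stored n-grams are always found
```

The space saving comes from storing only bits rather than the n-grams themselves; the price is a small, quantifiable chance that an unseen n-gram tests positive.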

86 citations

Proceedings ArticleDOI
06 Sep 2015
TL;DR: iVectors are used as an input to the neural network to perform instantaneous speaker and environment adaptation, providing a 10% relative improvement in word error rate; subsampling the outputs at TDNN layers across time steps reduces training time.
Abstract: In reverberant environments there are long term interactions between speech and corrupting sources. In this paper a time delay neural network (TDNN) architecture, capable of learning long term temporal relationships and translation invariant representations, is used for reverberation robust acoustic modeling. Further, iVectors are used as an input to the neural network to perform instantaneous speaker and environment adaptation, providing 10% relative improvement in word error rate. By subsampling the outputs at TDNN layers across time steps, training time is reduced. Using a parallel training algorithm we show that the TDNN can be trained on ∼ 5500 hours of speech data in 3 days using up to 32 GPUs. The TDNN is shown to provide results competitive with state of the art systems in the IARPA ASpIRE challenge, with 27.7% WER on the dev test set.
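The subsampling idea, splicing temporal context at each TDNN layer and then evaluating only a strided subset of time steps, can be sketched in plain Python. The feature dimensions, context offsets, and stride below are toy values; a real TDNN applies learned weights to each spliced window rather than raw concatenation:

```python
def tdnn_layer(frames, context, stride=1):
    """Splice each frame with its temporal context offsets, then keep only
    every `stride`-th time step (the subsampling trick)."""
    lo, hi = -min(context), max(context)
    return [sum((frames[t + c] for c in context), [])   # concatenate context frames
            for t in range(lo, len(frames) - hi, stride)]

frames = [[float(t)] * 4 for t in range(100)]        # 100 toy frames, 4 dims each
layer1 = tdnn_layer(frames, [-2, -1, 0, 1, 2])       # dense context at the input layer
layer2 = tdnn_layer(layer1, [-1, 2], stride=3)       # sparse context, subsampled outputs
print(len(layer1), len(layer2))  # 96 31
```

Because deeper layers need only a sparse subset of lower-layer outputs, most time steps never have to be computed during training, which is where the reported speedup comes from.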

86 citations

Journal ArticleDOI
TL;DR: A comparative performance assessment of ten state-of-the-art error-correction methods for long reads, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads.
Abstract: Third-generation sequencing technologies have advanced the progress of the biological research by generating reads that are substantially longer than second-generation sequencing technologies. However, their notorious high error rate impedes straightforward data analysis and limits their application. A handful of error correction methods for these error-prone long reads have been developed to date. The output data quality is very important for downstream analysis, whereas computing resources could limit the utility of some computing-intense tools. There is a lack of standardized assessments for these long-read error-correction methods. Here, we present a comparative performance assessment of ten state-of-the-art error-correction methods for long reads. We established a common set of benchmarks for performance assessment, including sensitivity, accuracy, output rate, alignment rate, output read length, run time, and memory usage, as well as the effects of error correction on two downstream applications of long reads: de novo assembly and resolving haplotype sequences. Taking into account all of these metrics, we provide a suggestive guideline for method choice based on available data size, computing resources, and individual research goals.
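Two of the benchmark quantities named above can be illustrated with toy definitions; these are simplified stand-ins, not necessarily the exact formulas used in the assessment:

```python
def correction_metrics(errors_before, errors_after, total_bases):
    """Toy versions of two benchmark-style metrics:
      sensitivity: fraction of original errors that correction removed
      accuracy:    fraction of bases that are correct after correction"""
    sensitivity = (errors_before - errors_after) / errors_before
    accuracy = (total_bases - errors_after) / total_bases
    return sensitivity, accuracy

# A read set with a 15% raw error rate corrected down to 1.5%:
print(correction_metrics(1500, 150, 10_000))  # (0.9, 0.985)
```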

86 citations

Proceedings ArticleDOI
14 Apr 1991
TL;DR: The authors present a novel technique for obtaining a phonetic transcription for a new word, which is needed to add the new word to the system, using DECtalk's text-to-sound rules.
Abstract: The authors report on the detection of new words for the speaker-dependent and speaker-independent paradigms. A useful operating point in a speaker-dependent paradigm is defined at 71% detection rate and 1% false alarm rate. The authors present a novel technique for obtaining a phonetic transcription for a new word, which is needed to add the new word to the system. The technique utilizes DECtalk's text-to-sound rules to obtain an initial phonetic transcription for the new word. Since these text-to-sound rules are imperfect, a probabilistic transformation technique is used that produces a phonetic pronunciation network of all possible pronunciations given DECtalk's transcription. The network is used to constrain a phonetic recognition process that results in an improved phonetic transcription for the new word. The resulting transcriptions are sufficient for speech recognition purposes.

86 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Artificial neural network: 207K papers, 4.5M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  271
2022  562
2021  640
2020  643
2019  633
2018  528