scispace - formally typeset
Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
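The metric itself is simple: WER is the Levenshtein (edit) distance between hypothesis and reference transcripts, counted in words and normalized by reference length, i.e. WER = (substitutions + deletions + insertions) / N. A minimal self-contained sketch:

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a standard Levenshtein alignment over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = minimum edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)
```

For example, `word_error_rate("the cat sat on the mat", "the cat sat mat")` gives 2/6: two deleted words against a six-word reference. Note that WER can exceed 1.0 when the hypothesis contains many insertions.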


Papers
Journal ArticleDOI
TL;DR: In this paper, some schemes that modify the currently known variations of ARQ (Stop-and-Wait and continuous systems) are suggested with a view to obtaining higher throughput under high block error rate conditions.
Abstract: Large round trip delay associated with satellite channels reduces the throughput of automatic repeat-request (ARQ) error-control systems rather drastically under high error rate conditions. Ground segments that usually accompany satellite circuits at both ends introduce bursts of errors, during which block error rates tend to be quite high, bringing the throughput down to very low values. In this paper, some schemes that modify the currently known variations of ARQ (Stop-and-Wait and continuous systems) are suggested with a view to obtaining higher throughput under high block error rate conditions. Specifically, the modified Go-Back-N system appears to be quite attractive, as it gives substantial improvement with little additional complexity in system implementation.

170 citations
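The throughput collapse the abstract describes can be illustrated with a small Monte Carlo sketch. This is not the paper's model: it assumes independent block errors with probability p and charges each error a full window of retransmissions, the usual idealization of continuous Go-Back-N ARQ (timeout and propagation delays are ignored):

```python
import random

def go_back_n_throughput(p, window, n_blocks=100_000, seed=0):
    """Monte Carlo throughput of idealized Go-Back-N ARQ: delivered blocks
    per transmitted block, when each block errors independently with
    probability p and every error forces the window in flight to be resent."""
    rng = random.Random(seed)
    sent, delivered = 0, 0
    while delivered < n_blocks:
        sent += 1
        if rng.random() >= p:      # block received correctly
            delivered += 1
        else:                      # error: window - 1 blocks already in
            sent += window - 1     # flight are discarded and retransmitted
    return delivered / sent
```

Under these assumptions the estimate tracks the classical approximation (1 - p) / (1 - p + Np), which makes plain how quickly throughput falls as the block error rate p rises for a large window N, i.e. for a long round-trip delay.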

Journal ArticleDOI
TL;DR: A speaker-independent phoneme and word recognition system based on a recurrent error propagation network is trained on the TIMIT database; analysis of the phoneme recognition results shows that information from bigram and durational constraints is adequately handled within the network, allowing efficient parsing of the network output.

170 citations

Proceedings ArticleDOI
01 Dec 2013
TL;DR: This paper replaces the filter bank with a filter bank layer that is learned jointly with the rest of a deep neural network, and shows that on a 50-hour English Broadcast News task, it can achieve a 5% relative improvement in word error rate using the filter bank learning approach.
Abstract: Mel-filter banks are commonly used in speech recognition, as they are motivated from theory related to speech production and perception. While features derived from mel-filter banks are quite popular, we argue that this filter bank is not really an appropriate choice as it is not learned for the objective at hand, i.e. speech recognition. In this paper, we explore replacing the filter bank with a filter bank layer that is learned jointly with the rest of a deep neural network. Thus, the filter bank is learned to minimize cross-entropy, which is more closely tied to the speech recognition objective. On a 50-hour English Broadcast News task, we show that we can achieve a 5% relative improvement in word error rate (WER) using the filter bank learning approach, compared to having a fixed set of filters.

169 citations
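As context for the fixed baseline the paper improves on, here is a pure-Python sketch of a conventional mel filter bank as a plain weight matrix applied to a power spectrum; in the learned-filter-bank setup described above, such a matrix would only be the initialization of a trainable layer. The parameter values (40 filters, 257 FFT bins, 16 kHz) are illustrative, not taken from the paper:

```python
import math

def mel(f):
    """Hz to mel, standard HTK-style formula."""
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_inv(m):
    """Mel back to Hz."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filter_bank(n_filters=40, n_fft_bins=257, sample_rate=16000):
    """Triangular mel filters as an (n_filters x n_fft_bins) weight matrix.
    Center frequencies are spaced uniformly on the mel scale."""
    f_max = sample_rate / 2
    mel_points = [i * mel(f_max) / (n_filters + 1) for i in range(n_filters + 2)]
    bin_points = [int(round(mel_inv(m) / f_max * (n_fft_bins - 1)))
                  for m in mel_points]
    bank = [[0.0] * n_fft_bins for _ in range(n_filters)]
    for i in range(n_filters):
        lo, mid, hi = bin_points[i], bin_points[i + 1], bin_points[i + 2]
        for b in range(lo, mid):        # rising edge of the triangle
            if mid > lo:
                bank[i][b] = (b - lo) / (mid - lo)
        for b in range(mid, hi):        # falling edge of the triangle
            if hi > mid:
                bank[i][b] = (hi - b) / (hi - mid)
    return bank

def apply_filter_bank(bank, power_spectrum, eps=1e-10):
    """Forward pass of the filter-bank layer: log of per-filter energies."""
    return [math.log(max(sum(w * p for w, p in zip(row, power_spectrum)), eps))
            for row in bank]
```

The paper's point is that nothing forces the triangular, mel-spaced shape: once the matrix is a network parameter, cross-entropy training is free to reshape it for the recognition objective.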

Journal ArticleDOI
TL;DR: This work presents simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size, and makes suggestions on how one should collect and analyze data bearing a hierarchical structure.
Abstract: Least squares analyses (e.g., ANOVAs, linear regressions) of hierarchical data lead to Type-I error rates that depart severely from the nominal Type-I error rate assumed. Thus, when least squares methods are used to analyze hierarchical data coming from designs in which some groups are assigned to the treatment condition and others to the control condition (i.e., the widely used "groups nested under treatment" experimental design), the Type-I error rate is seriously inflated, leading too often to the incorrect rejection of the null hypothesis (i.e., the incorrect conclusion of an effect of the treatment). To highlight the severity of the problem, we present simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size. For all simulations, the Type-I error rate after application of the popular Kish (1965) correction is also considered, and the limitations of this correction technique are discussed. We conclude with suggestions on how one should collect and analyze data bearing a hierarchical structure.

169 citations
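The inflation the abstract reports is easy to reproduce in miniature. The sketch below is illustrative, not the paper's simulation design: it draws "groups nested under treatment" data with no true effect, then applies a naive two-sample z-test that ignores the clustering. With intraclass correlation above zero, the rejection rate at alpha = .05 climbs far past .05:

```python
import math
import random

def simulate_type_one_error(n_groups=10, group_size=20, icc=0.3,
                            n_sims=2000, crit=1.96, seed=1):
    """Monte Carlo Type-I error rate of a naive two-sample z-test applied to
    'groups nested under treatment' data with NO true treatment effect.
    Total variance is fixed at 1; icc is the share due to group effects."""
    rng = random.Random(seed)
    group_sd = math.sqrt(icc)
    noise_sd = math.sqrt(1.0 - icc)

    def sample_condition():
        # n_groups groups per condition; members share their group's effect
        data = []
        for _ in range(n_groups):
            g = rng.gauss(0.0, group_sd)
            data.extend(g + rng.gauss(0.0, noise_sd)
                        for _ in range(group_size))
        return data

    def z_stat(a, b):
        na, nb = len(a), len(b)
        ma, mb = sum(a) / na, sum(b) / nb
        va = sum((x - ma) ** 2 for x in a) / (na - 1)
        vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
        return (ma - mb) / math.sqrt(va / na + vb / nb)

    rejections = sum(abs(z_stat(sample_condition(), sample_condition())) > crit
                     for _ in range(n_sims))
    return rejections / n_sims
```

The mechanism is the design effect 1 + (m - 1) * icc for groups of size m: the naive test underestimates the variance of the condition means by roughly that factor, so its test statistic is too large far too often.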

Proceedings ArticleDOI
25 Mar 2012
TL;DR: This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech, and investigates statistical machine translation (SMT)-based text generation approaches for building code-switched language models.
Abstract: This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME corpus [1] (South East Asia Mandarin-English). For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and Bhattacharyya distance in combination with discriminative training to improve accuracy. On the language model level, we investigated statistical machine translation (SMT)-based text generation approaches for building code-switching language models. Furthermore, we integrated the information provided by a language identification (LID) system into the decoding process by using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.

167 citations
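Mixed Error Rate is typically computed like WER but over mixed units: English scored in words, Mandarin in characters. A hedged sketch of that tokenization plus the usual edit-distance scoring (the paper's exact unit conventions may differ):

```python
def mixed_tokens(text):
    """English words as word tokens, CJK characters as single-character
    tokens -- the unit mix behind a Mixed Error Rate (MER)."""
    tokens, word = [], []
    for ch in text:
        if '\u4e00' <= ch <= '\u9fff':          # CJK unified ideograph
            if word:
                tokens.append(''.join(word)); word = []
            tokens.append(ch)
        elif ch.isspace():
            if word:
                tokens.append(''.join(word)); word = []
        else:
            word.append(ch)
    if word:
        tokens.append(''.join(word))
    return tokens

def mixed_error_rate(reference, hypothesis):
    """Levenshtein distance over the mixed token sequence, normalized by
    the reference token count (rolling single-row dynamic program)."""
    ref, hyp = mixed_tokens(reference), mixed_tokens(hypothesis)
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (r != h)))  # substitution / match
        prev = cur
    return prev[-1] / len(ref)
```

Scoring Mandarin at the character level sidesteps the ambiguity of Chinese word segmentation, which is why code-switching work reports MER rather than plain WER.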


Network Information

Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 88% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Artificial neural network: 207K papers, 4.5M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Performance Metrics

No. of papers in the topic in previous years:
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528