Topic
Word error rate
About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
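Word error rate (WER) is conventionally computed as (substitutions + deletions + insertions) divided by the number of words in the reference, using a Levenshtein alignment between reference and hypothesis word sequences. A minimal sketch (the function name and example sentences are illustrative, not from any of the papers below):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via word-level Levenshtein distance."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution / match
    return dp[len(ref)][len(hyp)] / len(ref)
```

For example, "the cat sit on mat" against the reference "the cat sat on the mat" has one substitution and one deletion, giving a WER of 2/6. Note that WER can exceed 1.0 when the hypothesis contains many insertions.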
Papers published on a yearly basis
Papers
TL;DR: In this paper, some schemes that modify the currently known variations of ARQ (Stop-and-Wait and continuous systems) are suggested with a view to obtaining higher throughput under high block error rate conditions.
Abstract: Large round trip delay associated with satellite channels reduces the throughput for automatic repeat-request (ARQ) system of error control rather drastically under high error rate conditions. Ground segments that usually accompany satellite circuits at both ends introduce bursts of errors, during which block error rates tend to be quite high, bringing down the throughput to very low values. In this paper, some schemes that modify the currently known variations of ARQ (Stop-and-Wait and continuous systems) are suggested with a view to obtaining higher throughput under high block error rate conditions. Specifically, the modified Go-Back-N system appears to be quite attractive, as it gives substantial improvement with little additional complexity in system implementation.
170 citations
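The throughput collapse the abstract describes can be illustrated with the standard textbook approximations for ARQ throughput (these formulas are common idealizations, not the paper's exact analysis; the function names are illustrative):

```python
def throughput_stop_and_wait(p, a):
    """Idealized Stop-and-Wait ARQ throughput.
    p : block error rate
    a : one-way propagation delay expressed in block transmission times.
    Each block occupies 1 + 2a block times, and a fraction (1 - p) succeed."""
    return (1 - p) / (1 + 2 * a)

def throughput_go_back_n(p, n):
    """Idealized Go-Back-N ARQ throughput with window size n covering the
    round trip: each block error forces retransmission of n blocks on
    average, so the expected transmissions per delivered block grow with
    n * p."""
    return (1 - p) / (1 - p + n * p)
```

With a geostationary satellite round trip (large `a`, hence large `n`) and a bursty block error rate of, say, 0.5, Go-Back-N throughput drops below a few percent, which is the regime the modified schemes target.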
TL;DR: A speaker-independent phoneme and word recognition system based on a recurrent error propagation network trained on the TIMIT database; analysis of the phoneme recognition results shows that information available from bigram and durational constraints is adequately handled within the network, allowing efficient parsing of the network output.
170 citations
01 Dec 2013
TL;DR: This paper replaces the filter bank with a filter-bank layer that is learned jointly with the rest of a deep neural network, and shows that on a 50-hour English Broadcast News task this filter bank learning approach achieves a 5% relative improvement in word error rate.
Abstract: Mel-filter banks are commonly used in speech recognition, as they are motivated from theory related to speech production and perception. While features derived from mel-filter banks are quite popular, we argue that this filter bank is not really an appropriate choice as it is not learned for the objective at hand, i.e. speech recognition. In this paper, we explore replacing the filter bank with a filter bank layer that is learned jointly with the rest of a deep neural network. Thus, the filter bank is learned to minimize cross-entropy, which is more closely tied to the speech recognition objective. On a 50-hour English Broadcast News task, we show that we can achieve a 5% relative improvement in word error rate (WER) using the filter bank learning approach, compared to having a fixed set of filters.
169 citations
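The core idea of the paper above is that a filter bank is just a weight matrix applied to the power spectrum, so it can be treated as a trainable layer. A minimal numpy sketch of the forward pass (the shapes, initialization, and function name are assumptions for illustration; the paper initializes from mel filters and trains with cross-entropy):

```python
import numpy as np

def filterbank_layer(power_spectrum, weights):
    """Apply a (n_filters x n_fft_bins) filter-bank matrix to a power
    spectrum, then log-compress. When `weights` is a trainable parameter,
    the filters are learned jointly with the network instead of being
    fixed mel triangles."""
    energies = weights @ power_spectrum          # (n_filters,)
    return np.log(np.maximum(energies, 1e-10))   # floor avoids log(0)

# Hypothetical setup: 40 filters over 257 bins (512-point FFT).
rng = np.random.default_rng(0)
weights = np.abs(rng.normal(size=(40, 257)))     # nonnegative init
spectrum = np.abs(rng.normal(size=257)) ** 2
features = filterbank_layer(spectrum, weights)
```

In a framework with automatic differentiation, `weights` would simply be registered as a parameter so that gradients of the recognition loss flow into the filter shapes.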
TL;DR: This work presents simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size, and makes suggestions on how one should collect and analyze data bearing a hierarchical structure.
Abstract: Least squares analyses (e.g., ANOVAs, linear regressions) of hierarchical data lead to Type-I error rates that depart severely from the nominal Type-I error rate assumed. Thus, when least squares methods are used to analyze hierarchical data coming from designs in which some groups are assigned to the treatment condition, and others to the control condition (i.e., the widely used "groups nested under treatment" experimental design), the Type-I error rate is seriously inflated, leading too often to the incorrect rejection of the null hypothesis (i.e., the incorrect conclusion of an effect of the treatment). To highlight the severity of the problem, we present simulations showing how the Type-I error rate is affected under different conditions of intraclass correlation and sample size. For all simulations the Type-I error rate after application of the popular Kish (1965) correction is also considered, and the limitations of this correction technique are discussed. We conclude with suggestions on how one should collect and analyze data bearing a hierarchical structure.
169 citations
25 Mar 2012
TL;DR: This paper presents first steps toward a large vocabulary continuous speech recognition (LVCSR) system for conversational Mandarin-English code-switching (CS) speech and investigates statistical machine translation (SMT)-based text generation approaches for building code-switching language models.
Abstract: This paper presents first steps toward a large vocabulary continuous speech recognition system (LVCSR) for conversational Mandarin-English code-switching (CS) speech. We applied state-of-the-art techniques such as speaker adaptive and discriminative training to build the first baseline system on the SEAME corpus [1] (South East Asia Mandarin-English). For acoustic modeling, we applied different phone merging approaches based on the International Phonetic Alphabet (IPA) and Bhattacharyya distance in combination with discriminative training to improve accuracy. On language model level, we investigated statistical machine translation (SMT) - based text generation approaches for building code-switching language models. Furthermore, we integrated the provided information from a language identification system (LID) into the decoding process by using a multi-stream approach. Our best 2-pass system achieves a Mixed Error Rate (MER) of 36.6% on the SEAME development set.
167 citations