scispace - formally typeset
Search or ask a question
Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.


Papers
More filters
PatentDOI
TL;DR: A system and method for performing speaker adaptation in a speech recognition system which includes a set of reference models corresponding to speech data from a plurality of speakers that is adapted to account for speech pattern idiosyncrasies of the new speaker, thereby reducing the error rate of thespeech recognition system.
Abstract: A system and method for performing speaker adaptation in a speech recognition system which includes a set of reference models corresponding to speech data from a plurality of speakers. The speech data is represented by a plurality of acoustic models and corresponding sub-events, and each sub-event includes one or more observations of speech data. A degree of lateral tying is computed between each pair of sub-events, wherein the degree of tying indicates the degree to which a first observation in a first sub-event contributes to the remaining sub-events. When adaptation data from a new speaker becomes available, a new observation from adaptation data is assigned to one of the sub-events. Each of the sub-events is then populated with the observations contained in the assigned sub-event based on the degree of lateral tying that was computed between each pair of sub-events. The reference models corresponding to the populated sub-events are then adapted to account for speech pattern idiosyncrasies of the new speaker, thereby reducing the error rate of the speech recognition system.

141 citations

Journal ArticleDOI
TL;DR: A Gram-Charlier expansion is used to compute the error rate in the presence of intersymbol interference and additive Gaussian noise and the method presented is very useful for numerical computations.
Abstract: The error rate or the probability of error is an important parameter in the design of digital communication systems. In this paper a Gram-Charlier expansion is used to compute the error rate in the presence of intersymbol interference and additive Gaussian noise. The method presented is very useful for numerical computations. We also present expressions for the truncation errors. Rigorous proofs are presented in the Appendix.

140 citations

Posted Content
TL;DR: This paper proposes a simple scaling method that scales the widths of ContextNet that achieves good trade-off between computation and accuracy and demonstrates that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate of 2.1%/4.6%.
Abstract: Convolutional neural networks (CNN) have shown promising results for end-to-end speech recognition, albeit still behind other state-of-the-art methods in performance. In this paper, we study how to bridge this gap and go beyond with a novel CNN-RNN-transducer architecture, which we call ContextNet. ContextNet features a fully convolutional encoder that incorporates global context information into convolution layers by adding squeeze-and-excitation modules. In addition, we propose a simple scaling method that scales the widths of ContextNet that achieves good trade-off between computation and accuracy. We demonstrate that on the widely used LibriSpeech benchmark, ContextNet achieves a word error rate (WER) of 2.1%/4.6% without external language model (LM), 1.9%/4.1% with LM and 2.9%/7.0% with only 10M parameters on the clean/noisy LibriSpeech test sets. This compares to the previous best published system of 2.0%/4.6% with LM and 3.9%/11.3% with 20M parameters. The superiority of the proposed ContextNet model is also verified on a much larger internal dataset.

140 citations

Journal ArticleDOI
TL;DR: This paper presents a new approach for the free text analysis of keystroke that combines monograph and digraph analysis, and uses a neural network to predict missing digraphs based on the relation between the monitored keystrokes.
Abstract: Accurate recognition of free text keystroke dynamics is challenging due to the unstructured and sparse nature of the data and its underlying variability As a result, most of the approaches published in the literature on free text recognition, except for one recent one, have reported extremely high error rates In this paper, we present a new approach for the free text analysis of keystrokes that combines monograph and digraph analysis, and uses a neural network to predict missing digraphs based on the relation between the monitored keystrokes Our proposed approach achieves an accuracy level comparable to the best results obtained through related techniques in the literature, while achieving a far lower processing time Experimental evaluation involving 53 users in a heterogeneous environment yields a false acceptance ratio (FAR) of 00152% and a false rejection ratio (FRR) of 482%, at an equal error rate (EER) of 246% Our follow-up experiment, in a homogeneous environment with 17 users, yields FAR=0% and FRR=501%, at EER=213%

138 citations

Proceedings ArticleDOI
13 May 2002
TL;DR: It is shown that one of the recent techniques used for speaker recognition, feature warping can be formulated within the framework of Gaussianization, and around 20% relative improvement in both equal error rate (EER) and minimum detection cost function (DCF) is obtained on NIST 2001 cellular phone data evaluation.
Abstract: In this paper, a novel approach for robust speaker verification, namely short-time Gaussianization, is proposed. Short-time Gaussianization is initiated by a global linear transformation of the features, followed by a short-time windowed cumulative distribution function (CDF) matching. First, the linear transformation in the feature space leads to local independence or decorrelation. Then the CDF matching is applied to segments of speech localized in time and tries to warp a given feature so that its CDF matches normal distribution. It is shown that one of the recent techniques used for speaker recognition, feature warping [l] can be formulated within the framework of Gaussianization. Compared to the baseline system with cepstral mean subtraction (CMS), around 20% relative improvement in both equal error rate(EER) and minimum detection cost function (DCF) is obtained on NIST 2001 cellular phone data evaluation.

138 citations


Network Information
Related Topics (5)
Deep learning
79.8K papers, 2.1M citations
88% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Convolutional neural network
74.7K papers, 2M citations
85% related
Artificial neural network
207K papers, 4.5M citations
84% related
Cluster analysis
146.5K papers, 2.9M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
2023271
2022562
2021640
2020643
2019633
2018528