
Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
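For reference, word error rate (WER) is the word-level edit distance between a hypothesis and a reference transcript, normalized by the reference length: WER = (substitutions + deletions + insertions) / N. A minimal sketch of the standard dynamic-programming computation (the function name and interface are illustrative):

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via word-level Levenshtein distance (dynamic programming)."""
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] holds the edit distance between the processed prefix of ref
    # and the first j words of hyp (one row of the DP table at a time)
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        cur = [i]
        for j, h in enumerate(hyp, 1):
            cur.append(min(prev[j - 1] + (r != h),  # substitution / match
                           prev[j] + 1,             # deletion
                           cur[j - 1] + 1))         # insertion
        prev = cur
    return prev[-1] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to the reference length.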


Papers
Proceedings ArticleDOI
12 Apr 2018
TL;DR: The authors explored different topologies and their variants of the attention layer, compared different pooling methods on the attention weights, and showed that attention-based models can improve the Equal Error Rate (EER) of a speaker verification system by a relative 14% over a non-attention LSTM baseline model.
Abstract: Attention-based models have recently shown great performance on a range of tasks, such as speech recognition, machine translation, and image captioning, due to their ability to summarize relevant information that extends through the entire length of an input sequence. In this paper, we analyze the usage of attention mechanisms for the problem of sequence summarization in our end-to-end text-dependent speaker recognition system. We explore different topologies and their variants of the attention layer, and compare different pooling methods on the attention weights. Ultimately, we show that attention-based models can improve the Equal Error Rate (EER) of our speaker verification system by a relative 14% compared to our non-attention LSTM baseline model.
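The core idea above is to collapse a variable-length sequence of frame-level features into one fixed-size utterance embedding using learned attention weights. A sketch of one simple variant (the scoring vector `w` and function names are assumptions for illustration, not the paper's exact layer):

```python
import numpy as np

def attention_pool(frames, w):
    """Collapse a (T, D) sequence of frame-level LSTM outputs into one
    (D,) utterance-level embedding. A single scoring vector w scores
    each frame; softmax turns the scores into pooling weights."""
    scores = frames @ w                    # (T,) one scalar score per frame
    alpha = np.exp(scores - scores.max())  # numerically stable softmax
    alpha /= alpha.sum()                   # attention weights sum to 1
    return alpha @ frames                  # weighted mean over frames
```

With all scores equal (e.g. `w = 0`), this degenerates to plain mean pooling, which makes the comparison against non-attention pooling baselines natural.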

174 citations

Posted Content
Xiangyu Zhang1, Jianhua Zou1, Xiang Ming1, Kaiming He2, Jian Sun2 
TL;DR: In this article, the reconstruction error of the nonlinear responses is minimized subject to a low-rank constraint, which helps to reduce the complexity of filters and reduces the accumulated error when multiple layers are approximated.
Abstract: This paper aims to accelerate the test-time computation of deep convolutional neural networks (CNNs). Unlike existing methods that are designed for approximating linear filters or linear responses, our method takes the nonlinear units into account. We minimize the reconstruction error of the nonlinear responses, subject to a low-rank constraint which helps to reduce the complexity of filters. We develop an effective solution to this constrained nonlinear optimization problem. An algorithm is also presented for reducing the accumulated error when multiple layers are approximated. A whole-model speedup ratio of 4x is demonstrated on a large network trained for ImageNet, while the top-5 error rate is only increased by 0.9%. Our accelerated model is about as fast as "AlexNet", but is 4.7% more accurate.
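The paper's method minimizes reconstruction error of the *nonlinear* responses under a rank constraint; a purely linear analogue of the rank-k idea can be sketched with a truncated SVD, which splits one large filter matrix into two thin ones (an illustrative simplification, not the paper's solver):

```python
import numpy as np

def low_rank_split(W, k):
    """Factor an (m, n) filter matrix W into A (m, k) @ B (k, n).
    Truncated SVD gives the best rank-k approximation in Frobenius
    norm, cutting multiply-adds from m*n to k*(m + n) per input vector.
    (Linear sketch only: the paper instead minimizes the error of the
    nonlinear responses and corrects accumulated multi-layer error.)"""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    return U[:, :k] * s[:k], Vt[:k]
```

The speedup comes from replacing one wide layer with two narrow ones; the accuracy cost grows as `k` shrinks.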

173 citations

Proceedings ArticleDOI
06 Jul 2002
TL;DR: A method for constructing a word graph that represents alternative translation hypotheses efficiently, so that these hypotheses can be rescored with a refined language or translation model.
Abstract: Statistical machine translation systems usually compute the single sentence that has the highest probability according to the models that are trained on data. We describe a method for constructing a word graph to represent alternative hypotheses in an efficient way. The advantage is that these hypotheses can be rescored using a refined language or translation model. Results are presented on the German-English Verbmobil corpus.

173 citations

Proceedings ArticleDOI
06 Sep 2015
TL;DR: Several integration architectures are proposed and tested, including a pipeline architecture of LSTM-based SE and ASR with sequence training, an alternating estimation architecture, and a multi-task hybrid LSTM network architecture.
Abstract: Long Short-Term Memory (LSTM) recurrent neural network has proven effective in modeling speech and has achieved outstanding performance in both speech enhancement (SE) and automatic speech recognition (ASR). To further improve the performance of noise-robust speech recognition, a combination of speech enhancement and recognition was shown to be promising in earlier work. This paper aims to explore options for consistent integration of SE and ASR using LSTM networks. Since SE and ASR have different objective criteria, it is not clear what kind of integration would finally lead to the best word error rate for noise-robust ASR tasks. In this work, several integration architectures are proposed and tested, including: (1) a pipeline architecture of LSTM-based SE and ASR with sequence training, (2) an alternating estimation architecture, and (3) a multi-task hybrid LSTM network architecture. The proposed models were evaluated on the 2nd CHiME speech separation and recognition challenge task, and show significant improvements relative to prior results.

173 citations

Proceedings ArticleDOI
03 Oct 1996
TL;DR: A new highly parallel approach to automatic recognition of speech, inspired by Fletcher's early research on the articulation index, and based on independent probability estimates in several sub-bands of the available speech spectrum, is presented.
Abstract: A new highly parallel approach to automatic recognition of speech, inspired by Fletcher's early research on the articulation index, and based on independent probability estimates in several sub-bands of the available speech spectrum, is presented. The approach is especially suitable for situations when part of the spectrum of speech is corrupted. In such cases, it can yield an order-of-magnitude improvement in the error rate over a conventional full-band recognizer.
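The multi-band idea above estimates class probabilities independently per sub-band and then merges them, so a corrupted band can be down-weighted without hurting the others. A hypothetical sketch of one common merging rule, a weighted sum of log-probabilities (the paper evaluates its own combiners; weights here are an assumption for illustration):

```python
import numpy as np

def combine_subbands(logprobs, weights=None):
    """Merge per-sub-band class log-probability estimates with a
    weighted sum (equivalently, a weighted product of probabilities).
    Weights for bands believed to be corrupted can be lowered or
    zeroed, leaving the clean bands to drive recognition."""
    L = np.asarray(logprobs)                  # (n_bands, n_classes)
    if weights is None:
        weights = np.full(len(L), 1.0 / len(L))  # uniform by default
    return np.asarray(weights) @ L            # (n_classes,) combined scores
```

Setting a band's weight to zero discards it entirely, which is the degenerate case the abstract's "part of the spectrum is corrupted" scenario points at.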

172 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Artificial neural network: 207K papers, 4.5M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:
2023: 271
2022: 562
2021: 640
2020: 643
2019: 633
2018: 528