Home
/
Topics
/
Word error rate

Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.

...read moreread less

Papers published on a yearly basis

2023
2022
2021
2020
2019
2018
2017
2016
2015
2014
2013
2012
2011
2010
2009
2008
2007
2006
2005
2004
2003
2002
2001
2000
1999
1998
1997
1996
1995
1994
1993
1992
1991
1990
1989
1988
1987
1986
1985
1984
1983
1982
1981
1980
1979
1978
1977
1976
1975
1974
1973
1972
1971
1970
1969

1 / 2

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Very Deep Convolutional Neural Networks for Noise Robust Speech Recognition

[...]

Yanmin Qian¹, Mengxiao Bi¹, Tian Tan¹, Kai Yu¹•Institutions (1)

Shanghai Jiao Tong University¹

01 Dec 2016-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: The proposed very deep CNNs can significantly reduce word error rate (WER) for noise robust speech recognition and are competitive with the long short-term memory recurrent neural networks (LSTM-RNN) acoustic model.

...read moreread less

Abstract: Although great progress has been made in automatic speech recognition, significant performance degradation still exists in noisy environments. Recently, very deep convolutional neural networks (CNNs) have been successfully applied to computer vision and speech recognition tasks. Based on our previous work on very deep CNNs, in this paper this architecture is further developed to improve recognition accuracy for noise robust speech recognition. In the proposed very deep CNN architecture, we study the best configuration for the sizes of filters, pooling, and input feature maps: the sizes of filters and poolings are reduced and dimensions of input features are extended to allow for adding more convolutional layers. Then the appropriate pooling, padding, and input feature map selection strategies are investigated and applied to the very deep CNN to make it more robust for speech recognition. In addition, an in-depth analysis of the architecture reveals key characteristics, such as compact model scale, fast convergence speed, and noise robustness. The proposed new model is evaluated on two tasks: Aurora4 task with multiple additive noise types and channel mismatch, and the AMI meeting transcription task with significant reverberation. Experiments on both tasks show that the proposed very deep CNNs can significantly reduce word error rate (WER) for noise robust speech recognition. The best architecture obtains a 10.0% relative reduction over the traditional CNN on AMI, competitive with the long short-term memory recurrent neural networks (LSTM-RNN) acoustic model. On Aurora4, even without feature enhancement, model adaptation, and sequence training, it achieves a WER of 8.81%, a 17.0% relative improvement over the LSTM-RNN. To our knowledge, this is the best published result on Aurora4.

...read moreread less

311 citations

Journal Article•DOI•

Continuous Sign Language Recognition: Towards Large Vocabulary Statistical Recognition Systems Handling Multiple Signers

[...]

Oscar Koller¹, Jens Forster¹, Hermann Ney¹•Institutions (1)

RWTH Aachen University¹

01 Dec 2015-Computer Vision and Image Understanding

TL;DR: This work presents a statistical recognition approach performing large vocabulary continuous sign language recognition across different signers, and is the first time system design on a large data set with true focus on real-life applicability is thoroughly presented.

...read moreread less

309 citations

Patent•

Methods and systems for identifying errors in a speech recognition system

[...]

Keith Braho, Jeffrey Pike, Lori Pike

17 Oct 2014

TL;DR: In this article, a method for identifying possible errors made by a speech recognition system without using a transcript of words input to the system is described. But this method does not consider the use of a word-to-word model.

...read moreread less

Abstract: Methods are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. A method for model adaptation for a speech recognition system includes determining an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The method may further include adjusting an adaptation, of the model for the word or various models for the various words, based on the error rate. Apparatus are disclosed for identifying possible errors made by a speech recognition system without using a transcript of words input to the system. An apparatus for model adaptation for a speech recognition system includes a processor adapted to estimate an error rate, corresponding to either recognition of instances of a word or recognition of instances of various words, without using a transcript of words input to the system. The apparatus may further include a controller adapted to adjust an adaptation of the model for the word or various models for the various words, based on the error rate.

...read moreread less

306 citations

Proceedings Article•

Identifying Word Translation in Non_Parallel Texts.

[...]

Reinhard Rapp

01 Jan 1995

306 citations

Proceedings Article•DOI•

Learning for Semantic Parsing with Statistical Machine Translation

[...]

Yuk Wah Wong¹, Raymond J. Mooney¹•Institutions (1)

University of Texas at Austin¹

04 Jun 2006

TL;DR: It is shown that WASP performs favorably in terms of both accuracy and coverage compared to existing learning methods requiring similar amount of supervision, and shows better robustness to variations in task complexity and word order.

...read moreread less

Abstract: We present a novel statistical approach to semantic parsing, WASP, for constructing a complete, formal meaning representation of a sentence. A semantic parser is learned given a set of sentences annotated with their correct meaning representations. The main innovation of WASP is its use of state-of-the-art statistical machine translation techniques. A word alignment model is used for lexical acquisition, and the parsing model itself can be seen as a syntax-based translation model. We show that WASP performs favorably in terms of both accuracy and coverage compared to existing learning methods requiring similar amount of supervision, and shows better robustness to variations in task complexity and word order.

...read moreread less

306 citations

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
…
22
23
24
25
26
27
28
…
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200

Collapse

Network Information

Performance

Metrics

12,777

Papers

335,740

Citations

No. of papers in the topic in previous years
Year	Papers
2023	271
2022	562
2021	640
2020	643
2019	633
2018	528

Word error rate

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics