
Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
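Word error rate itself is the word-level Levenshtein (edit) distance between a reference transcript and a hypothesis, normalized by the reference length: WER = (S + D + I) / N, where S, D, and I are substitutions, deletions, and insertions. A minimal sketch:

```python
def word_error_rate(reference, hypothesis):
    """Compute WER = (S + D + I) / N via Levenshtein distance over words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between the first i reference words
    # and the first j hypothesis words.
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to a short reference.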


Papers
Journal ArticleDOI
TL;DR: In this article, the authors discuss research directions for ASR that may not always yield an immediate, guaranteed decrease in error rate but hold some promise for ultimately improving performance in end applications. These include discrimination between rival utterance models, the role of prior information in speech recognition, merging the language and acoustic models, feature extraction and temporal information, and decoding procedures that reflect human perceptual properties.

182 citations

Proceedings ArticleDOI
30 Mar 2009
TL;DR: In this article, synchronous context-free grammars (SCFGs) are used for statistical machine translation, and the toolkit achieves state-of-the-art performance on the WMT09 French-English translation task.
Abstract: We describe Joshua, an open source toolkit for statistical machine translation. Joshua implements all of the algorithms required for synchronous context free grammars (SCFGs): chart-parsing, n-gram language model integration, beam-and cube-pruning, and k-best extraction. The toolkit also implements suffix-array grammar extraction and minimum error rate training. It uses parallel and distributed computing techniques for scalability. We demonstrate that the toolkit achieves state of the art translation performance on the WMT09 French-English translation task.

182 citations

Proceedings ArticleDOI
05 Jun 2000
TL;DR: The paper investigates the estimation of word posterior probabilities based on word lattices, presents applications of these posteriors in a large-vocabulary speech recognition system, and introduces a novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder.
Abstract: The paper investigates the estimation of word posterior probabilities based on word lattices and presents applications of these posteriors in a large vocabulary speech recognition system. A novel approach to integrating these word posterior probability distributions into a conventional Viterbi decoder is presented. The problem of the robust estimation of confidence scores from word posteriors is examined and a method based on decision trees is suggested. The effectiveness of these techniques is demonstrated on the broadcast news and the conversational telephone speech corpora where improvements both in terms of word error rate and normalised cross entropy were achieved compared to the baseline HTK evaluation systems.

182 citations
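The posterior of a word is obtained by summing the normalized probabilities of all hypotheses (lattice paths) that contain it. As a simplified illustration over an n-best list rather than a full lattice with forward-backward (the function name and score scaling below are assumptions, not from the paper):

```python
import math
from collections import defaultdict

def word_posteriors(nbest, scale=1.0):
    """Approximate word posterior probabilities from an n-best list.

    nbest: list of (hypothesis_words, log_score) pairs.
    Returns a dict mapping (position, word) to its posterior.
    A real system would run forward-backward over the full lattice;
    this sketch only illustrates the normalization and summation steps.
    """
    # Turn (scaled) log scores into normalized hypothesis posteriors.
    log_scores = [scale * s for _, s in nbest]
    m = max(log_scores)  # subtract max for numerical stability
    exp_scores = [math.exp(s - m) for s in log_scores]
    z = sum(exp_scores)
    hyp_post = [e / z for e in exp_scores]
    # Sum hypothesis posteriors for each (position, word) pair.
    word_post = defaultdict(float)
    for (words, _), p in zip(nbest, hyp_post):
        for i, w in enumerate(words):
            word_post[(i, w)] += p
    return dict(word_post)
```

Posteriors like these are what the paper feeds into confidence estimation; a word shared by all competing hypotheses gets posterior 1.0, while disputed words split the mass.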

Journal ArticleDOI
TL;DR: The experimental results show that the weighted cepstral distance measure works substantially better than both the Euclidean cepStral distance and the log likelihood ratio distance measures across two different databases.
Abstract: A weighted cepstral distance measure is proposed and is tested in a speaker-independent isolated word recognition system using standard DTW (dynamic time warping) techniques. The measure is a statistically weighted distance measure with weights equal to the inverse variance of the cepstral coefficients. The experimental results show that the weighted cepstral distance measure works substantially better than both the Euclidean cepstral distance and the log likelihood ratio distance measures across two different databases. The recognition error rate obtained using the weighted cepstral distance measure was about 1 percent for digit recognition. This result was less than one-fourth of that obtained using the simple Euclidean cepstral distance measure and about one-third of the results using the log likelihood ratio distance measure. The most significant performance characteristic of the weighted cepstral distance was that it tended to equalize the performance of the recognizer across different talkers.

181 citations
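The distance measure described above is straightforward to sketch: each squared cepstral difference is scaled by the inverse of that coefficient's variance before summing, so high-variance coefficients contribute less. A minimal sketch (function and argument names are illustrative, not from the paper):

```python
import numpy as np

def weighted_cepstral_distance(c1, c2, variances):
    """Weighted cepstral distance with weights equal to the inverse
    variance of the cepstral coefficients.

    c1, c2:    cepstral coefficient vectors for two frames.
    variances: per-coefficient variances estimated from training data.
    """
    w = 1.0 / np.asarray(variances, dtype=float)   # inverse-variance weights
    diff = np.asarray(c1, dtype=float) - np.asarray(c2, dtype=float)
    return float(np.sum(w * diff * diff))
```

In the recognizer this would serve as the local frame distance inside the DTW alignment; with unit variances it reduces to the squared Euclidean cepstral distance the paper compares against.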

Journal ArticleDOI
TL;DR: In this article, the authors proposed the use of temporal convolution, in the form of time-delay neural network (TDNN) layers, along with unidirectional LSTM layers to limit the latency to 200 ms.
Abstract: Bidirectional long short-term memory (BLSTM) acoustic models provide a significant word error rate reduction compared to their unidirectional counterpart, as they model both the past and future temporal contexts. However, it is nontrivial to deploy bidirectional acoustic models for online speech recognition due to an increase in latency. In this letter, we propose the use of temporal convolution, in the form of time-delay neural network (TDNN) layers, along with unidirectional LSTM layers to limit the latency to 200 ms. This architecture has been shown to outperform the state-of-the-art low frame rate (LFR) BLSTM models. We further improve these LFR BLSTM acoustic models by operating them at higher frame rates at lower layers and show that the proposed model performs similar to these mixed frame rate BLSTMs. We present results on the Switchboard 300 h LVCSR task and the AMI LVCSR task, in the three microphone conditions.

181 citations
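The 200 ms latency figure follows from the total future context the unidirectional stack is allowed to see: the summed per-layer lookahead in frames times the frame shift. A back-of-the-envelope sketch (the per-layer context values below are illustrative, not taken from the paper):

```python
def tdnn_lookahead_ms(right_contexts, frame_shift_ms=10):
    """Latency contributed by future context in a TDNN-LSTM stack.

    right_contexts: number of future frames each TDNN layer consumes,
    expressed at the input frame rate (illustrative values only).
    frame_shift_ms: analysis frame shift, commonly 10 ms.
    """
    return sum(right_contexts) * frame_shift_ms
```

With a 10 ms shift, any combination of layer lookaheads totalling 20 frames stays within the 200 ms budget, whereas a BLSTM must wait for the whole utterance.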


Network Information
Related Topics (5)
- Deep learning: 79.8K papers, 2.1M citations (88% related)
- Feature extraction: 111.8K papers, 2.1M citations (86% related)
- Convolutional neural network: 74.7K papers, 2M citations (85% related)
- Artificial neural network: 207K papers, 4.5M citations (84% related)
- Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023  271
2022  562
2021  640
2020  643
2019  633
2018  528