Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.

...read moreread less

Papers published on a yearly basis

1 / 2

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Two-level DP-matching--A dynamic programming-based pattern matching algorithm for connected word recognition

[...]

H. Sakoe¹•Institutions (1)

NEC¹

01 Dec 1979-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: A general principle of connected word recognition is given based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns and Computation time and memory requirement are both proved to be within reasonable limits.

...read moreread less

Abstract: This paper reports a pattern matching approach to connected word recognition. First, a general principle of connected word recognition is given based on pattern matching between unknown continuous speech and artificially synthesized connected reference patterns. Time-normalization capability is allowed by use of dynamic programming-based time-warping technique (DP-matching). Then, it is shown that the matching process is efficiently carried out by breaking it down into two steps. The derived algorithm is extensively subjected to recognition experiments. It is shown in a talker-adapted recognition experiment that digit data (one to four digits) connectedly spoken by five persons are recognized with as high as 99.6 percent accuracy. Computation time and memory requirement are both proved to be within reasonable limits.

...read moreread less

289 citations

Proceedings Article•

A New String-to-Dependency Machine Translation Algorithm with a Target Dependency Language Model

[...]

Libin Shen¹, Jinxi Xu¹, Ralph Weischedel¹•Institutions (1)

BBN Technologies¹

01 Jun 2008

TL;DR: A novel string-todependency algorithm for statistical machine translation that employs a target dependency language model during decoding to exploit long distance word relations, which are unavailable with a traditional n-gram language model.

...read moreread less

Abstract: In this paper, we propose a novel string-todependency algorithm for statistical machine translation. With this new framework, we employ a target dependency language model during decoding to exploit long distance word relations, which are unavailable with a traditional n-gram language model. Our experiments show that the string-to-dependency decoder achieves 1.48 point improvement in BLEU and 2.53 point improvement in TER compared to a standard hierarchical string-tostring system on the NIST 04 Chinese-English evaluation set.

...read moreread less

288 citations

Journal Article•DOI•

Recognition of isolated digits using hidden Markov models with continuous mixture densities

[...]

Lawrence R. Rabiner¹, Biing-Hwang Juang¹, Stephen E. Levinson¹, Man Mohan Sondhi¹•Institutions (1)

Bell Labs¹

08 Jul 1985-AT&T technical journal

TL;DR: This paper extends previous work on isolated-word recognition based on hidden Markov models by replacing the discrete symbol representation of the speech signal with a continuous Gaussian mixture density, thereby eliminating the inherent quantization error introduced by the discrete representation.

...read moreread less

Abstract: In this paper we extend previous work on isolated-word recognition based on hidden Markov models by replacing the discrete symbol representation of the speech signal with a continuous Gaussian mixture density. In this manner the inherent quantization error introduced by the discrete representation is essentially eliminated. The resulting recognizer was tested on a vocabulary of the ten digits across a wide range of talkers and test conditions and shown to have an error rate comparable to that of the best template recognizers and significantly lower than that of the discrete symbol hidden Markov model system. We discuss several issues involved in the training of the continuous density models and in the implementation of the recognizer.

...read moreread less

284 citations

Patent•DOI•

Method for interactive speech recognition and training

[...]

Jed M. Roberts, James K. Baker, Edward W. Porter

06 Dec 1988-Journal of the Acoustical Society of America

TL;DR: A method for creating word models for a large vocabulary, natural language dictation system that may be used for connected speech as well as for discrete utterances.

...read moreread less

Abstract: A method for creating word models for a large vocabulary, natural language dictation system. A user with limited typing skills can create documents with little or no advance training of word models. As the user is dictating, the user speaks a word which may or may not already be in the active vocabulary. The system displays a list of the words in the active vocabulary which best match the spoken word. By keyboard or voice command, the user may choose the correct word from the list or may choose to edit a similar word if the correct word is not on the list. Alternately, the user may type or speak the initial letters of the word. Then the recognition algorithm is called again satisfying the initial letters, and the choices displayed again. A word list is then also displayed from a large backup vocabulary. The best words to display from the backup vocabulary are chosen using a statistical language model and optionally word models derived from a phonemic dictionary. When the correct word is chosen by the user, the speech sample is used to create or update an acoustic model for the word, without further intervention by the user. As the system is used, it also constantly updates its statistical language model. The system gets more and more word models and keeps improving its performance the more it is used. The system may be used for connected speech as well as for discrete utterances.

...read moreread less

284 citations

Proceedings Article•DOI•

Extracting deep bottleneck features using stacked auto-encoders

[...]

Jonas Gehring¹, Yajie Miao², Florian Metze², Alex Waibel¹•Institutions (2)

Karlsruhe Institute of Technology¹, Carnegie Mellon University²

26 May 2013

TL;DR: It is found that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available.

...read moreread less

Abstract: In this work, a novel training scheme for generating bottleneck features from deep neural networks is proposed. A stack of denoising auto-encoders is first trained in a layer-wise, unsupervised manner. Afterwards, the bottleneck layer and an additional layer are added and the whole network is fine-tuned to predict target phoneme states. We perform experiments on a Cantonese conversational telephone speech corpus and find that increasing the number of auto-encoders in the network produces more useful features, but requires pre-training, especially when little training data is available. Using more unlabeled data for pre-training only yields additional gains. Evaluations on larger datasets and on different system setups demonstrate the general applicability of our approach. In terms of word error rate, relative improvements of 9.2% (Cantonese, ML training), 9.3% (Tagalog, BMMI-SAT training), 12% (Tagalog, confusion network combinations with MFCCs), and 8.7% (Switchboard) are achieved.

...read moreread less

282 citations

Collapse

Network Information

Performance

Metrics

12,777

Papers

335,740

Citations

No. of papers in the topic in previous years
Year	Papers
2023	271
2022	562
2021	640
2020	643
2019	633
2018	528

Word error rate

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics