scispace - formally typeset
Topic: Word error rate

About: Word error rate (WER), the standard accuracy metric in automatic speech recognition, is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
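As a brief illustration of the metric this topic covers: WER is the word-level Levenshtein distance between a reference transcript and a recognition hypothesis, divided by the number of reference words. A minimal sketch (the function name and texts are illustrative, not from any paper listed here):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: (substitutions + deletions + insertions) / reference length,
    computed as word-level Levenshtein distance via dynamic programming."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deleting all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j  # inserting all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution or match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion over 6 words
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why "relative WER reduction" (as reported in the papers below) is the usual way to compare systems.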


Papers
Proceedings ArticleDOI
25 Aug 2013
TL;DR: Two strategies for improving the context-dependent deep neural network hidden Markov model (CD-DNN-HMM) in low-resource speech recognition are investigated: dropout, which prevents overfitting during DNN fine-tuning and improves model robustness under data sparseness, and multilingual DNN training with shared hidden layers.
Abstract: We investigate two strategies to improve the context-dependent deep neural network hidden Markov model (CD-DNN-HMM) in low-resource speech recognition. Although it outperforms the conventional Gaussian mixture model (GMM) HMM on various tasks, CD-DNN-HMM acoustic modeling becomes challenging with limited transcribed speech, e.g., less than 10 hours. To resolve this issue, we first exploit dropout, which prevents overfitting during DNN fine-tuning and improves model robustness under data sparseness. Then, the effectiveness of multilingual DNN training is evaluated when additional auxiliary languages are available. The hidden layer parameters of the target language are shared and learned over multiple languages. Experiments show that both strategies boost the recognition performance significantly. Combining them results in a further reduction in word error rate, achieving 11.6% and 6.2% relative improvement on two limited data conditions.

69 citations
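The dropout technique the paper above applies during fine-tuning can be sketched in a few lines. This is a generic "inverted dropout" implementation, not the paper's exact training recipe; the function name, rate, and shapes are illustrative:

```python
import numpy as np

def dropout(x, rate, rng, training=True):
    """Inverted dropout: zero each activation with probability `rate` during
    training and rescale survivors by 1/(1 - rate), so the expected activation
    is unchanged; at inference time the input passes through untouched."""
    if not training or rate == 0.0:
        return x
    mask = rng.random(x.shape) >= rate  # keep each unit with probability 1 - rate
    return x * mask / (1.0 - rate)

# Example: a batch of hidden-layer activations during fine-tuning
rng = np.random.default_rng(0)
hidden = np.ones((4, 8))
dropped = dropout(hidden, rate=0.5, rng=rng)
```

Randomly silencing hidden units on each update discourages co-adaptation, which is why it acts as a regularizer when the transcribed training data is scarce.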

Proceedings ArticleDOI
05 Jun 2000
TL;DR: This paper describes the first experiments in a speaker-independent LVCSR engine for Modern Standard Turkish and proposes morpheme-based and Hypothesis Driven Lexical Adaptation approaches to overcome the OOV problem.
Abstract: The Turkish language belongs to the Turkic family. All members of this family are close to one another in terms of linguistic structure. Typological similarities are vowel harmony, verb-final word order and agglutinative morphology. This latter property causes very fast vocabulary growth, resulting in a large number of out-of-vocabulary (OOV) words. In this paper we describe our first experiments in a speaker-independent LVCSR engine for Modern Standard Turkish. First results on our Turkish speech recognition system are presented. The currently best system shows very promising results, achieving 16.9% word error rate. To overcome the OOV problem we propose a morpheme-based approach and Hypothesis Driven Lexical Adaptation. The final Turkish system is integrated into the multilingual recognition engine of the GlobalPhone project.

69 citations

Proceedings ArticleDOI
09 Dec 2001
TL;DR: It is shown that histogram normalization performs best if applied both in training and recognition, and that smoothing the target histogram obtained on the training data is also helpful.
Abstract: We describe a technique called histogram normalization that aims at normalizing feature space distributions at different stages in the signal analysis front-end, namely the log-compressed filterbank vectors, cepstrum coefficients, and LDA (linear discriminant analysis) transformed acoustic vectors. Best results are obtained at the filterbank, and in most cases there is a minor additional gain when normalization is applied sequentially at different stages. We show that histogram normalization performs best if applied both in training and recognition, and that smoothing the target histogram obtained on the training data is also helpful. On the VerbMobil II corpus, a German large-vocabulary conversational speech recognition task, we achieve an overall reduction in word error rate of about 10% relative.

69 citations

Proceedings ArticleDOI
12 May 2019
TL;DR: Three approaches are investigated to improve end-to-end speech recognition on Mandarin-English code-switching task and multi-task learning (MTL) is introduced which enables the language identity information to facilitate Mandarin- English code- Switching ASR.
Abstract: Code-switching is a common phenomenon in many multilingual communities and presents a challenge to automatic speech recognition (ASR). In this paper, three approaches are investigated to improve end-to-end speech recognition on a Mandarin-English code-switching task. First, multi-task learning (MTL) is introduced, which enables language identity information to facilitate Mandarin-English code-switching ASR. Second, we explore wordpieces, as opposed to graphemes, as English modeling units to reduce the modeling unit gap between Mandarin and English. Third, we employ transfer learning to utilize larger amounts of monolingual Mandarin and English data to compensate for the data sparsity of a code-switching task. Significant improvements are observed from all three approaches. With all three approaches combined, the final system achieves a character error rate (CER) of 6.49% on a real Mandarin-English code-switching task.

69 citations

Journal ArticleDOI
TL;DR: The paper addresses adaptation to the language model and speaking rate (SR) of individual speakers, two major problems in automatic transcription of spontaneous presentation speech, and proposes an SR-dependent decoding strategy that applies the most appropriate acoustic analysis, phone models, and decoding parameters according to the SR.
Abstract: The paper addresses adaptation to the language model and speaking rate (SR) of individual speakers, two major problems in automatic transcription of spontaneous presentation speech. To cope with the large variation in expression and pronunciation of words across speakers, we first investigate the effect of statistical, context-dependent pronunciation modeling. Second, we present unsupervised methods of language model adaptation to a specific speaker and topic by 1) selecting similar texts based on word perplexity and the TF-IDF measure and 2) making direct use of the initial recognition result to generate an enhanced model. We confirm that all proposed adaptation methods and their combinations reduce the perplexity and word error rate. We also present a decoding strategy adapted to the SR. In spontaneous speech, SR is generally fast and varies considerably, and we observe different error tendencies for fast and slow portions of presentations. Therefore, we propose an SR-dependent decoding strategy that applies the most appropriate acoustic analysis, phone models, and decoding parameters according to the SR. Several methods are investigated, and their selective application leads to improved accuracy. The combined effect of the two proposed adaptation methods is also confirmed in transcription of real academic presentations.

69 citations
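The perplexity-based text selection used above for language model adaptation can be sketched with a deliberately simple unigram model: score each candidate adaptation text by its perplexity under a model estimated from a seed text (e.g. the initial recognition result) and keep the best-matching texts. This is a toy illustration under add-one smoothing, not the paper's actual n-gram setup; all names are hypothetical:

```python
import math
from collections import Counter

def unigram_perplexity(text, counts, total, vocab_size):
    """Perplexity of `text` under an add-one-smoothed unigram model
    with word counts `counts` summing to `total`."""
    words = text.split()
    log_prob = sum(math.log((counts[w] + 1) / (total + vocab_size))
                   for w in words)
    return math.exp(-log_prob / len(words))

def select_texts(candidates, seed_text, k):
    """Rank candidate adaptation texts by perplexity under a unigram model
    of `seed_text` and return the k lowest-perplexity (most similar) ones."""
    counts = Counter(seed_text.split())
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # crude allowance for unseen words
    return sorted(candidates,
                  key=lambda t: unigram_perplexity(t, counts, total, vocab_size))[:k]
```

Texts that share vocabulary with the seed receive lower perplexity and are selected first; the paper combines this idea with a TF-IDF measure and higher-order language models.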


Network Information
Related Topics (5)
- Deep learning: 79.8K papers, 2.1M citations (88% related)
- Feature extraction: 111.8K papers, 2.1M citations (86% related)
- Convolutional neural network: 74.7K papers, 2M citations (85% related)
- Artificial neural network: 207K papers, 4.5M citations (84% related)
- Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance Metrics
Number of papers in the topic in previous years:

Year   Papers
2023   271
2022   562
2021   640
2020   643
2019   633
2018   528