Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.


Papers
Journal Article
TL;DR: Experimental results in speaker-independent, continuous speech recognition over Italian digit-strings validate the novel hybrid framework, allowing for improved recognition performance over HMMs with mixtures of Gaussian components, as well as over Bourlard and Morgan's paradigm.
Abstract: Acoustic modeling in state-of-the-art speech recognition systems usually relies on hidden Markov models (HMMs) with Gaussian emission densities. HMMs suffer from intrinsic limitations, mainly due to their arbitrary parametric assumption. Artificial neural networks (ANNs) appear to be a promising alternative in this respect, but they historically failed as a general solution to the acoustic modeling problem. This paper introduces algorithms based on a gradient-ascent technique for global training of a hybrid ANN/HMM system, in which the ANN is trained to estimate the emission probabilities of the states of the HMM. The approach is related to the major hybrid systems proposed by Bourlard and Morgan and by Bengio, with the aim of combining their benefits within a unified framework and overcoming their limitations. Several viable solutions to the "divergence problem", which may arise when training is accomplished over the maximum-likelihood (ML) criterion, are proposed. Experimental results in speaker-independent, continuous speech recognition over Italian digit-strings validate the novel hybrid framework, allowing for improved recognition performance over HMMs with mixtures of Gaussian components, as well as over Bourlard and Morgan's paradigm. In particular, it is shown that the maximum a posteriori (MAP) version of the algorithm yields a 46.34% relative word error rate reduction with respect to standard HMMs.

76 citations
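
For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate and a relative word error rate reduction (the kind of figure quoted in the abstract above) are commonly computed. It assumes the usual definition of WER as word-level edit distance divided by the number of reference words; the function names and the example strings and rates are hypothetical and are not taken from the paper, whose absolute error rates are not given here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline WER removed by the new system."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Hypothetical digit-string example: one substitution and one deletion
# against a four-word reference.
example_wer = word_error_rate("one two three four", "one too three")

# Hypothetical rates: going from 10% to 5% WER is a 50% relative reduction.
example_reduction = relative_wer_reduction(0.10, 0.05)
```

Note that a relative reduction says nothing about the absolute error rates involved: the same percentage can correspond to a large or a very small absolute change depending on the baseline.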

Proceedings Article
Christoph Tillmann
11 Jul 2003
TL;DR: A phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models and that has been successfully tested on a Chinese-English and an Arabic-English translation task.
Abstract: In this paper, we describe a phrase-based unigram model for statistical machine translation that uses a much simpler set of model parameters than similar phrase-based models. The units of translation are blocks, that is, pairs of phrases. During decoding, we use a block unigram model and a word-based trigram language model. During training, the blocks are learned from source interval projections using an underlying high-precision word alignment. The system performance is significantly increased by applying a novel block extension algorithm using an additional high-recall word alignment. The blocks are further filtered using unigram-count selection criteria. The system has been successfully tested on a Chinese-English and an Arabic-English translation task.

76 citations

Journal Article
TL;DR: An adaptive time-frequency (ATF) parameter is proposed for extracting both the time and frequency features of noisy speech signals and a new word boundary detection algorithm is proposed by using a neural fuzzy network for identifying islands of word signals in a noisy environment.
Abstract: This paper addresses the problem of automatic word boundary detection in the presence of noise. We first propose an adaptive time-frequency (ATF) parameter for extracting both the time and frequency features of noisy speech signals. The ATF parameter extends the TF parameter proposed by Junqua et al. (1994) from single-band to multiband spectrum analysis, where the frequency bands help to distinguish speech from noise signals clearly. The ATF parameter can extract useful frequency information by adaptively choosing proper bands of the mel-scale frequency bank. The ATF parameter increased the recognition rate by about 3% over a TF-based robust algorithm, which has been shown to outperform several commonly used algorithms for word boundary detection in the presence of noise. The ATF parameter also reduced the recognition error rate due to endpoint detection to about 20%. Based on the ATF parameter, we further propose a new word boundary detection algorithm that uses a neural fuzzy network (called SONFIN) to identify islands of word signals in a noisy environment. Owing to the self-learning ability of SONFIN, the proposed algorithm avoids the need to determine thresholds and ambiguous rules empirically, as normal word boundary detection algorithms require. Compared with ordinary neural networks, SONFIN can always find an economical network size for itself with high learning speed. Our results also showed that SONFIN's performance is not significantly affected by the size of the training set. The ATF-based SONFIN achieved a recognition rate about 5% higher than that of the TF-based robust algorithm. It also reduced the recognition error rate due to endpoint detection to about 10%, compared to an average of approximately 30% obtained with the TF-based robust algorithm, and 50% obtained with the modified version of the Lamel et al. (1981) algorithm.

76 citations

Journal Article
TL;DR: A new distance is proposed which permits tighter bounds to be set on the error probability of the Bayesian decision rule and which is shown to be closely related to several certainty or separability measures.
Abstract: An important measure concerning the use of statistical decision schemes is the error probability associated with the decision rule. Several methods giving bounds on the error probability are presently available, but, most often, the bounds are loose. Those methods generally make use of so-called distances between statistical distributions. In this paper a new distance is proposed which permits tighter bounds to be set on the error probability of the Bayesian decision rule and which is shown to be closely related to several certainty or separability measures. Among these are the nearest neighbor error rate and the average conditional quadratic entropy of Vajda. Moreover, our distance bears much resemblance to the information theoretic concept of equivocation. This relationship is discussed. Comparison is made between the bounds on the Bayes risk obtained with the Bhattacharyya coefficient, the equivocation, and the new measure, which we have named the Bayesian distance.

76 citations

Journal Article
TL;DR: In this paper, the effects of age-of-acquisition on word naming speed and auditory recognition of words presented at a low volume were investigated and the results are interpreted as supporting the view that the age of acquisition variable mainly affects word production and has little effect on word recognition processes.
Abstract: This paper reports two experiments concerning the effects of word age-of-acquisition on word naming speed and auditory recognition of words presented at a low volume. The first experiment found significant facilitating effects of word age-of-acquisition in word naming even when word length, frequency and familiarity were taken into account. The second experiment found no evidence of age-of-acquisition effects in auditory word recognition. The results are interpreted as supporting the view that the age-of-acquisition variable mainly affects word production and has little effect on word recognition processes.

76 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 88% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Artificial neural network: 207K papers, 4.5M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  271
2022  562
2021  640
2020  643
2019  633
2018  528