
Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
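Since the topic itself is word error rate, a brief worked definition may help: WER is the word-level Levenshtein (edit) distance between a reference transcript and a hypothesis transcript, normalized by the number of reference words. A minimal sketch (the function name and example sentences are illustrative, not taken from any paper below):

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j]: edit distance between the first i reference words
    # and the first j hypothesis words (classic Levenshtein DP).
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = d[i - 1][j] + 1
            insertion = d[i][j - 1] + 1
            d[i][j] = min(substitution, deletion, insertion)
    return d[len(ref)][len(hyp)] / len(ref)

# One deletion against a six-word reference: WER = 1/6.
print(wer("the cat sat on the mat", "the cat sat on mat"))
```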


Papers
Book Chapter
01 Jan 1999
TL;DR: This chapter provides an analytical framework to quantify the improvements in classification results due to combining, and derives expressions that indicate how much the median, the maximum and in general the ith order statistic can improve classifier performance.
Abstract: Several researchers have experimentally shown that substantial improvements can be obtained in difficult pattern recognition problems by combining or integrating the outputs of multiple classifiers. This chapter provides an analytical framework to quantify the improvements in classification results due to combining. The results apply to both linear combiners and order statistics combiners. We first show that, to a first-order approximation, the error rate obtained over and above the Bayes error rate is directly proportional to the variance of the actual decision boundaries around the Bayes optimum boundary. Combining classifiers in output space reduces this variance, and hence reduces the “added” error. If N unbiased classifiers are combined by simple averaging, the added error rate can be reduced by a factor of N if the individual errors in approximating the decision boundaries are uncorrelated. Expressions are then derived for linear combiners which are biased or correlated, and the effect of output correlations on ensemble performance is quantified. For order-statistics-based non-linear combiners, we derive expressions that indicate how much the median, the maximum and in general the ith order statistic can improve classifier performance. The analysis presented here facilitates the understanding of the relationships among error rates, classifier boundary distributions, and combining in output space. Experimental results on several public domain data sets are provided to illustrate the benefits of combining and to support the analytical results.

75 citations
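The factor-of-N variance reduction claimed in the abstract above is easy to check numerically. A small simulation sketch (the Gaussian noise model and all parameter values are illustrative assumptions, not the chapter's own code):

```python
import random

random.seed(0)

# Each unbiased classifier estimates the Bayes-optimal boundary (here 0.0)
# with independent zero-mean Gaussian noise; averaging N such estimates
# should shrink the variance around the optimum by roughly a factor of N.
N, trials, sigma = 10, 20_000, 1.0
single = [random.gauss(0.0, sigma) for _ in range(trials)]
averaged = [sum(random.gauss(0.0, sigma) for _ in range(N)) / N
            for _ in range(trials)]

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

ratio = variance(single) / variance(averaged)
print(round(ratio, 1))  # close to N
```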

Journal Article
01 Sep 1989
TL;DR: The error rates of linear classifiers that utilize various criterion functions are investigated for the case of two normal distributions with different variances and a priori probabilities, finding that the classifier based on the least-mean-squares criterion often yields an error rate considerably worse than the Bayes rate.
Abstract: The error rates of linear classifiers that utilize various criterion functions are investigated for the case of two normal distributions with different variances and a priori probabilities. It is found that the classifier based on the least mean squares (LMS) criterion often yields an error rate considerably worse than the Bayes rate. The perceptron criterion (with suitable safety margin) and the linearized sigmoid generally lead to lower error rates than the LMS criterion, with the sigmoid usually the better of the two. Also investigated are the exceptions to the general trends: only if one class is known to have much larger a priori probability or variance than the other should one expect the LMS or perceptron criteria to be slightly preferable as far as error rate is concerned. The analysis is related to the performance of the back-propagation (BP) classifier, giving some understanding of the success of BP. A neural-net classifier, the adaptive-clustering classifier, suggested by this analysis is compared with BP (modified by using a conjugate-gradient optimization technique) for two problems. It is found that BP usually takes significantly longer to train than the adaptive-clustering technique.

75 citations
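The gap between the LMS criterion and the Bayes rate for two normal classes with unequal variances and priors can be reproduced in a few lines. A sketch under assumed parameters (the specific means, variances, and priors are illustrative, not from the paper):

```python
import math
import random

random.seed(1)

# Class 0: N(0, 1) with prior 0.8; class 1: N(2, 3^2) with prior 0.2.
p0, mu0, s0 = 0.8, 0.0, 1.0
p1, mu1, s1 = 0.2, 2.0, 3.0

xs, ys = [], []
for _ in range(50_000):
    if random.random() < p0:
        xs.append(random.gauss(mu0, s0)); ys.append(-1.0)
    else:
        xs.append(random.gauss(mu1, s1)); ys.append(1.0)

# LMS: least-squares fit y ~ a*x + b, then classify by sign(a*x + b).
n = len(xs)
mx, my = sum(xs) / n, sum(ys) / n
a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
     / sum((x - mx) ** 2 for x in xs))
b = my - a * mx
lms_err = sum((1.0 if a * x + b > 0 else -1.0) != y
              for x, y in zip(xs, ys)) / n

# Bayes: pick the class with the larger prior-weighted log density.
# With unequal variances the Bayes boundary is quadratic in x, so a
# single linear LMS threshold cannot match it.
def bayes_label(x):
    l0 = math.log(p0) - math.log(s0) - (x - mu0) ** 2 / (2 * s0 * s0)
    l1 = math.log(p1) - math.log(s1) - (x - mu1) ** 2 / (2 * s1 * s1)
    return 1.0 if l1 > l0 else -1.0

bayes_err = sum(bayes_label(x) != y for x, y in zip(xs, ys)) / n
print(lms_err, bayes_err)  # LMS error rate noticeably above the Bayes rate
```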

Journal Article
TL;DR: An MCMC algorithm is presented that achieves significantly lower logical error rates than MWPM at the cost of a runtime complexity increased by a factor O(L^2); for depolarizing noise with error rate p, the logical error rate is suppressed as O((p/3)^(L/2)), an exponential improvement over all previously existing efficient algorithms.
Abstract: Minimum-weight perfect matching (MWPM) has been the primary classical algorithm for error correction in the surface code, since it is of low runtime complexity and achieves relatively low logical error rates [Phys. Rev. Lett. 108, 180501 (2012)]. A Markov chain Monte Carlo (MCMC) algorithm [Phys. Rev. Lett. 109, 160503 (2012)] is able to achieve lower logical error rates and higher thresholds than MWPM, but requires a classical runtime complexity which is super-polynomial in L, the linear size of the code. In this work we present an MCMC algorithm that achieves significantly lower logical error rates than MWPM at the cost of a runtime complexity increased by a factor O(L^2). This advantage is due to taking correlations between bit- and phase-flip errors (as they appear, for example, in depolarizing noise) as well as entropic factors (i.e., the numbers of likely error paths in different equivalence classes) into account. For depolarizing noise with error rate p, we present an efficient algorithm for which the logical error rate is suppressed as O((p/3)^(L/2)) as p → 0, an exponential improvement over all previously existing efficient algorithms. Our algorithm allows for tradeoffs between runtime and achieved logical error rates as well as for parallelization, and can also be used for correction in the case of imperfect stabilizer measurements.

75 citations
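To make the claimed suppression concrete, the O((p/3)^(L/2)) scaling from the abstract above can be tabulated for a small physical error rate p and increasing code distance L (illustrative numbers only; constant prefactors are ignored):

```python
# Logical error suppression ~ (p/3)^(L/2): each doubling of the code's
# linear size L squares the suppression factor, an exponential gain.
def suppression(p: float, L: int) -> float:
    return (p / 3.0) ** (L / 2.0)

for L in (4, 8, 16):
    print(L, suppression(0.01, L))
```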

Patent
09 Mar 2001
TL;DR: In this article, a system and a method for encoding and decoding ultra-wideband information are provided, where an ultra-wideband transmission is encoded by positioning bipolar pulse pairs, which assist in detecting errors in the transmission before the entire transmission has been received.
Abstract: A system and a method for encoding and decoding ultra-wideband information are provided. An ultra-wideband transmission is encoded by positioning bipolar pulse pairs. The bipolar pulse pairs assist in detecting errors in the ultra-wideband transmission, before the entire transmission has been received. The transmission is analyzed for errors and an error rate is calculated. The calculated error rate is compared to one or more predefined acceptable error rate levels to determine whether the calculated error rate of the transmission is within at least one of the predefined acceptable error rate levels.

75 citations
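The acceptance check described in the patent abstract reduces to comparing a calculated error rate against one or more predefined thresholds. A hypothetical sketch (the function name and threshold values are illustrative, not from the patent):

```python
def within_acceptable(calculated_rate: float, acceptable_levels) -> bool:
    # The transmission passes if its calculated error rate is within
    # (i.e. at or below) at least one predefined acceptable level.
    return any(calculated_rate <= level for level in acceptable_levels)

print(within_acceptable(0.01, [0.05, 0.001]))  # True: 0.01 <= 0.05
print(within_acceptable(0.10, [0.05, 0.001]))  # False: exceeds both levels
```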

Journal Article
TL;DR: This paper investigated syntactic and semantic features for factored language models for code-switching speech and their effect on automatic speech recognition (ASR) performance and found that the best language model can significantly reduce the perplexity on the SEAME evaluation set by up to 10.8% relative and the mixed error rate by up to 3.4% relative.
Abstract: This paper presents our latest investigations on different features for factored language models for Code-Switching speech and their effect on automatic speech recognition (ASR) performance. We focus on syntactic and semantic features which can be extracted from Code-Switching text data and integrate them into factored language models. Different possible factors, such as words, part-of-speech tags, Brown word clusters, open class words and clusters of open class word embeddings are explored. The experimental results reveal that Brown word clusters, part-of-speech tags and open-class words are the most effective at reducing the perplexity of factored language models on the Mandarin-English Code-Switching corpus SEAME. In ASR experiments, the model containing Brown word clusters and part-of-speech tags and the model also including clusters of open class word embeddings yield the best mixed error rate results. In summary, the best language model can significantly reduce the perplexity on the SEAME evaluation set by up to 10.8% relative and the mixed error rate by up to 3.4% relative.

74 citations
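For readers unfamiliar with "relative" reductions as reported above: a 10.8% relative perplexity reduction means the new value is 10.8% lower than the baseline, not 10.8 points lower. A one-line helper (names and example values are illustrative):

```python
def relative_reduction_pct(baseline: float, improved: float) -> float:
    # Percentage of the baseline value that was removed.
    return 100.0 * (baseline - improved) / baseline

# e.g. perplexity falling from 100.0 to 89.2 is a 10.8% relative reduction
print(relative_reduction_pct(100.0, 89.2))
```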


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Artificial neural network: 207K papers, 4.5M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528