SciSpace
Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
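Concretely, word error rate is the Levenshtein (edit) distance between the reference and hypothesis word sequences, normalized by the reference length: WER = (S + D + I) / N, where S, D, and I count substitutions, deletions, and insertions. A minimal self-contained sketch:

```python
def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / len(ref),
    computed via Levenshtein edit distance over word sequences."""
    r, h = ref.split(), hyp.split()
    # dp[i][j] = minimum edits to turn r[:i] into h[:j]
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i          # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j          # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / len(r)

# 1 substitution (sat -> sit) + 1 deletion (the) over 6 reference words
print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why it is a distance-based score rather than an accuracy.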


Papers
Proceedings ArticleDOI
07 May 1996
TL;DR: This work addresses efficiency issues associated with a search organization based on pronunciation prefix trees (PPTs) by presenting a mechanism that eliminates redundant computations in non-reentrant trees, a comparison of two methods for distributing language model probabilities in PPTs, and results on two look-ahead pruning strategies.
Abstract: The need for ever more efficient search organizations persists as the size and complexity of the knowledge sources used in continuous speech recognition (CSR) tasks continue to increase. We address efficiency issues associated with a search organization based on pronunciation prefix trees (PPTs). In particular, we present (1) a mechanism that eliminates redundant computations in non-reentrant trees, (2) a comparison of two methods for distributing language model probabilities in PPTs, and (3) results on two look-ahead pruning strategies. Using the 1994 DARPA 20k NAB word bigram for the male segment of si dev5m 92 (the 5k speaker-independent development test set for the WSJ), the error rate was 12.2% with a real-time factor of 1.0 on a 120 MHz Pentium.
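One common way to distribute LM probabilities in a prefix tree (the paper compares two methods; this sketch shows only the general idea, not its specific variants) is to smear each word's LM probability up the shared prefix, so every tree node stores the maximum probability over all words reachable below it and look-ahead pruning can act before the word identity is resolved. The phone strings, probabilities, and class names below are illustrative:

```python
# Toy pronunciation prefix tree with LM look-ahead (illustrative data).
class Node:
    def __init__(self):
        self.children = {}    # phone -> Node
        self.lookahead = 0.0  # max LM probability over words below this node

def build_tree(lexicon):
    """lexicon: dict mapping word -> (phone tuple, LM probability)."""
    root = Node()
    for word, (phones, prob) in lexicon.items():
        node = root
        for ph in phones:
            node = node.children.setdefault(ph, Node())
            # Smear the LM probability up the tree: each node keeps the
            # best probability of any word whose pronunciation passes it.
            node.lookahead = max(node.lookahead, prob)
    return root

lexicon = {
    "cat": (("k", "ae", "t"), 0.02),
    "can": (("k", "ae", "n"), 0.05),
    "dog": (("d", "ao", "g"), 0.03),
}
root = build_tree(lexicon)
# The shared prefix k-ae carries the max over {cat, can} = 0.05.
print(root.children["k"].children["ae"].lookahead)
```

During decoding, a path's score can then include the node's look-ahead value, so hypotheses leading only to improbable words are pruned early.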

66 citations

Proceedings Article
01 Jan 1997
TL;DR: Preliminary experiments show that sharing acoustic models across the two languages has not resulted in improved performance, while sharing a backoff node at the LM component provides flexibility and ease in recognizing bilingual sentences at the expense of a slight increase in word error rate in some cases.
Abstract: This paper describes our work in developing multilingual (Swedish and English) speech recognition systems in the ATIS domain. The acoustic component of the multilingual systems is realized through sharing Gaussian codebooks across Swedish and English allophones. The language model (LM) components are constructed by training a statistical bigram model, with a common backoff node, on bilingual texts, and by combining two monolingual LMs into a probabilistic finite state grammar. This system uses a single decoder for Swedish and English sentences, and is capable of recognizing sentences with words from both languages. Preliminary experiments show that sharing acoustic models across the two languages has not resulted in improved performance, while sharing a backoff node at the LM component provides flexibility and ease in recognizing bilingual sentences at the expense of a slight increase in word error rate in some cases. As a by-product, the bilingual decoder also achieves good performance on language identification (LID).
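The common-backoff construction can be pictured as a bigram model that falls back to a single shared unigram node whenever a word pair is unseen, which is what permits transitions between the two languages. The vocabulary, probabilities, and backoff weight below are illustrative, not taken from the paper:

```python
import math

# Toy bigram LM with one shared backoff node across two languages.
bigrams = {("show", "flights"): 0.4}                       # seen pairs
unigram = {"show": 0.1, "flights": 0.1, "visa": 0.05, "flyg": 0.05}
BACKOFF_WEIGHT = 0.2  # probability mass reserved at the common backoff node

def bigram_logprob(prev, word):
    if (prev, word) in bigrams:
        return math.log(bigrams[(prev, word)])
    # Unseen bigram: route through the shared backoff node, which lets a
    # Swedish word follow an English one (and vice versa).
    return math.log(BACKOFF_WEIGHT * unigram[word])

print(bigram_logprob("show", "flights"))  # seen, same-language bigram
print(bigram_logprob("show", "flyg"))     # cross-language, via backoff
```

Because both vocabularies hang off the same backoff node, no special bilingual bigrams need to be trained for language switches; they inherit the backoff mass, at the cost of the slight WER increase the authors report.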

66 citations

Journal ArticleDOI
TL;DR: Experimental results show that this hybrid method effectively simplifies feature selection by reducing the number of features needed, and could constitute a valuable tool for gene expression analysis in future studies.
Abstract: The purpose of gene expression analysis is to discriminate between classes of samples, and to predict the relative importance of each gene for sample classification. Microarray data with reference to gene expression profiles have provided some valuable results related to a variety of problems and contributed to advances in clinical medicine. Microarray data characteristically have a high dimension and a small sample size. This makes it difficult for a general classification method to obtain correct data for classification. However, not every gene is potentially relevant for distinguishing the sample class. Thus, in order to analyze gene expression profiles correctly, feature (gene) selection is crucial for the classification process, and an effective gene extraction method is necessary for eliminating irrelevant genes and decreasing the classification error rate. In this paper, correlation-based feature selection (CFS) and the Taguchi chaotic binary particle swarm optimization (TCBPSO) were combined into a hybrid method. The K-nearest neighbor (K-NN) method with leave-one-out cross-validation (LOOCV) served as a classifier for ten gene expression profiles. Experimental results show that this hybrid method effectively simplifies feature selection by reducing the number of features needed. The proposed method achieved the lowest classification error rate on all ten of the gene expression data set problems tested. For six of the gene expression profile data sets, a classification error rate of zero could be reached. The introduced method outperformed five other methods from the literature in terms of classification error rate. It could thus constitute a valuable tool for gene expression analysis in future studies.
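The classifier/validation pair used here, K-NN with leave-one-out cross-validation, can be sketched in a few lines. The toy data and the choice K = 3 are illustrative, not from the paper:

```python
# K-NN with leave-one-out cross-validation (LOOCV): each sample is held
# out in turn and classified by majority vote of its K nearest neighbors
# among the remaining samples; the fraction misclassified is the error rate.
def knn_loocv_error(X, y, k=3):
    errors = 0
    for i in range(len(X)):
        # Squared Euclidean distances from the held-out sample to all others.
        dists = sorted(
            (sum((a - b) ** 2 for a, b in zip(X[i], X[j])), y[j])
            for j in range(len(X)) if j != i
        )
        votes = [label for _, label in dists[:k]]
        predicted = max(set(votes), key=votes.count)  # majority vote
        errors += predicted != y[i]
    return errors / len(X)

# Two well-separated clusters: LOOCV error should be zero.
X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (5.0, 5.0), (5.1, 4.9), (4.9, 5.2)]
y = [0, 0, 0, 1, 1, 1]
print(knn_loocv_error(X, y))
```

LOOCV is a natural fit for microarray data precisely because the sample size is small: it uses all but one sample for training in every fold, at the cost of n classifier evaluations.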

66 citations

Patent
13 Jul 1994
TL;DR: In this article, a method of making a speech recognition model database is disclosed, which is formed based on a training string utterance signal and a plurality of sets of current speech recognition models.
Abstract: A method of making a speech recognition model database is disclosed. The database is formed based on a training string utterance signal and a plurality of sets of current speech recognition models. The sets of current speech recognition models may include acoustic models, language models, and other knowledge sources. In accordance with an illustrative embodiment of the invention, a set of confusable string models is generated, each confusable string model comprising speech recognition models from two or more sets of speech recognition models (such as acoustic and language models). A first scoring signal is generated based on the training string utterance signal and a string model for that utterance, wherein the string model for the utterance comprises speech recognition models from two or more sets of speech recognition models. One or more second scoring signals are also generated, wherein a second scoring signal is based on the training string utterance signal and a confusable string model. A misrecognition signal is generated based on the first scoring signal and the one or more second scoring signals. Current speech recognition models are modified, based on the misrecognition signal to increase the probability that a correct string model will have a rank order higher than other confusable string models.
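The scoring-and-misrecognition step resembles minimum classification error (MCE) training: the misrecognition signal compares the correct string model's score against a soft maximum over the confusable string models, and a sigmoid turns it into a differentiable loss that training can reduce. This is a hedged sketch of that general scheme; eta and the sigmoid slope are illustrative choices, not values from the patent:

```python
import math

def misrecognition_measure(correct_score, competitor_scores, eta=2.0):
    """MCE-style misrecognition signal: soft-max over confusable string
    models minus the correct string model's score (positive suggests a
    likely misrecognition)."""
    soft = math.log(sum(math.exp(eta * g) for g in competitor_scores)
                    / len(competitor_scores)) / eta
    return soft - correct_score

def smoothed_loss(d, slope=1.0):
    """Sigmoid-smoothed 0/1 loss of the misrecognition signal."""
    return 1.0 / (1.0 + math.exp(-slope * d))

# Correct string model scores higher (log-score -120) than its two
# confusable competitors, so d is negative and the loss is small.
d = misrecognition_measure(-120.0, [-125.0, -130.0])
print(d, smoothed_loss(d))
```

Model parameters would then be adjusted to decrease this loss, which is what pushes the correct string model's rank above the confusable ones.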

66 citations

Book ChapterDOI
31 Aug 2005
TL;DR: A zero-order local deformation model is employed to model the visual variability of video streams of American sign language (ASL) words and two possible ways of combining the model with the tangent distance used to compensate for affine global transformations are discussed.
Abstract: In this paper, we employ a zero-order local deformation model to model the visual variability of video streams of American sign language (ASL) words. We discuss two possible ways of combining the model with the tangent distance used to compensate for affine global transformations. The integration of the deformation model into our recognition system improves the error rate on a database of ASL words from 22.2% to 17.2%.

66 citations


Network Information
Related Topics (5)
- Deep learning: 79.8K papers, 2.1M citations (88% related)
- Feature extraction: 111.8K papers, 2.1M citations (86% related)
- Convolutional neural network: 74.7K papers, 2M citations (85% related)
- Artificial neural network: 207K papers, 4.5M citations (84% related)
- Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  271
2022  562
2021  640
2020  643
2019  633
2018  528