scispace - formally typeset
Topic

Word error rate

About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
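Word error rate (WER) is conventionally computed as the word-level Levenshtein (edit) distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. A minimal illustrative sketch:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = (substitutions + deletions + insertions) / reference length,
    computed via word-level Levenshtein distance with dynamic programming."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i                      # i deletions
    for j in range(len(hyp) + 1):
        dp[0][j] = j                      # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(ref)][len(hyp)] / len(ref)

# One insertion against a 3-word reference: WER = 1/3
print(word_error_rate("the cat sat", "the cat sat down"))  # → 0.3333333333333333
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions, which is why "relative WER improvement" is the usual comparison in the papers below.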


Papers
Proceedings ArticleDOI
07 May 2001
TL;DR: This paper proposes a new approach to training support vector machines, then trains and tests an SVM classifier for confidence measurement in speech recognition and compares the results with other statistical classification methods.
Abstract: Support vector machines represent a new approach to pattern classification developed from the theory of structural risk minimization. In this paper, we present an investigation into the application of support vector machines to the confidence measurement problem in speech recognition. Specifically, based on the results from an initial decoding of an utterance during speech recognition, we derive a feature vector consisting of parameters such as word score density, N-best word score density differences, relative word score and relative word duration as input to the confidence measurement process in which hypothetically correct utterances are accepted and utterances determined to be incorrect are rejected. We propose a new approach to training support vector machines. In this paper, we train and test a support vector machines classifier and compare the results with other statistical classification methods.
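The confidence features named above (word score density, N-best score differences, relative word score, relative word duration) feed a binary classifier that accepts or rejects each hypothesized utterance. The paper's own training method is not reproduced here; the sketch below trains a generic linear SVM with Pegasos-style subgradient descent on synthetic, invented feature vectors purely to illustrate the accept/reject decision:

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=100):
    """Pegasos-style stochastic subgradient descent on the hinge loss.
    Labels y must be in {-1, +1}: +1 = accept hypothesis, -1 = reject."""
    rng = np.random.default_rng(0)
    w, b, t = np.zeros(X.shape[1]), 0.0, 0
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            t += 1
            eta = 1.0 / (lam * t)
            if y[i] * (X[i] @ w + b) < 1:        # margin violated: step toward x_i
                w = (1 - eta * lam) * w + eta * y[i] * X[i]
                b += eta * y[i]
            else:                                 # margin satisfied: only shrink w
                w = (1 - eta * lam) * w
    return w, b

# Synthetic 4-dim "confidence" vectors; the feature names mirror the abstract,
# but every value here is made up for illustration.
rng = np.random.default_rng(1)
correct = rng.normal([0.8, 0.3, 0.7, 0.6], 0.05, size=(100, 4))
wrong = rng.normal([0.4, 0.1, 0.3, 0.6], 0.05, size=(100, 4))
X = np.vstack([correct, wrong])
y = np.array([1] * 100 + [-1] * 100)

w, b = train_linear_svm(X, y)
score = np.array([0.8, 0.3, 0.7, 0.6]) @ w + b   # positive score → accept
```

A kernel SVM, as used in the paper, would replace the inner product with a kernel evaluation; the linear case keeps the sketch short.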

62 citations

Journal ArticleDOI
TL;DR: The present research is focused on the recognition of five Spanish words corresponding to the English words "up," "down," "left," "right" and "select", with which a computer cursor could be controlled, and shows a dependence relationship between EEG data and imagined words.
Abstract: We searched for the minimal subset of channels for imagined speech. Channel selection was approached as a multi-objective problem to obtain a Pareto front. A fuzzy inference system was applied to find a promising solution from the Pareto front. Channel selection had a statistically similar performance to the use of all channels. A dependence between features and classes of imagined speech was observed. One of the main purposes of brain-computer interfaces (BCI) is to provide persons with an alternative communication channel. This objective was initially focused on handicapped subjects, but nowadays its scope has expanded to healthy persons. Usually, BCIs record brain activity using electroencephalograms (EEG), according to four main neuro-paradigms (slow cortical potentials, motor imagery, the P300 component and visual evoked potentials). These paradigms are not intuitive and are difficult to implement. Accordingly, this work investigates an alternative neuro-paradigm called imagined speech, which refers to the internal pronunciation of words without emitting sounds or making facial movements. Specifically, the present research focuses on the recognition of five Spanish words corresponding to the English words "up," "down," "left," "right" and "select", with which a computer cursor could be controlled. We perform an offline automatic classification procedure on a dataset of EEG signals from 27 subjects. The method implements a channel selection composed of two stages: the first obtains a Pareto front and is approached as a multi-objective optimization problem dealing with the error rate and the number of channels; the second selects a single solution (channel combination) from the front by applying a fuzzy inference system (FIS). We assess the method's performance on the selected channel combination using a test set not used to generate the front.
Several FIS configurations were explored to evaluate whether a FIS is able to select channel combinations that improve, or at least maintain, the accuracies obtained using all channels for each subject's data. We found that one configuration, FIS3×3 (three membership functions for both input variables, the error rate and the number of channels), obtained the best trade-off between the number of fuzzy rules and accuracy (68.18% using around 7 channels). The FIS3×3 also obtained a statistically similar accuracy compared to the use of all channels (70.33%). These results demonstrate the feasibility of using a FIS to automatically select a channel combination from the Pareto front for imagined speech classification. The presented method outperforms previous works in accuracy and showed a dependence relationship between EEG data and imagined words.
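The two-stage selection can be sketched in miniature: first keep the non-dominated (error rate, channel count) pairs, then pick one point from the front. The paper's second stage uses a fuzzy inference system; the weighted score below is a deliberately simplified stand-in, and every candidate number is invented:

```python
def pareto_front(solutions):
    """Keep solutions not dominated in (error, channels); both objectives minimized."""
    front = []
    for s in solutions:
        dominated = any(
            o["error"] <= s["error"] and o["channels"] <= s["channels"]
            and (o["error"] < s["error"] or o["channels"] < s["channels"])
            for o in solutions
        )
        if not dominated:
            front.append(s)
    return front

# Hypothetical channel-selection candidates (error rate, number of channels)
candidates = [
    {"error": 0.30, "channels": 4},
    {"error": 0.32, "channels": 2},
    {"error": 0.29, "channels": 10},
    {"error": 0.35, "channels": 3},   # dominated by (0.32, 2)
    {"error": 0.27, "channels": 22},  # all 22 channels
]
front = pareto_front(candidates)

# Stand-in for the FIS: a fixed trade-off between error and channel count,
# with channels normalized by the full montage size (22).
best = min(front, key=lambda s: 0.8 * s["error"] + 0.2 * s["channels"] / 22)
print(best)  # → {'error': 0.32, 'channels': 2}
```

The interesting property, mirrored in the paper's result, is that a small channel subset on the front can trade a slight accuracy loss for a large reduction in channels.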

62 citations

Proceedings Article
01 Jan 2000
TL;DR: This work develops a method for combining phoneme probabilities generated by different acoustic models trained on distinct feature extraction processes, obtaining relative word error rate improvements larger than 20% on a large-vocabulary speaker-independent continuous speech recognition task.
Abstract: The combination of multiple sources of information has been an attractive approach in different areas. That is the case in speech recognition, where several combination methods have been presented. Our hybrid MLP/HMM systems use acoustic models based on different sets of features and different MLP classifier structures. In this work we developed a method for combining the phoneme probabilities generated by the different acoustic models trained on distinct feature extraction processes. Two algorithms were implemented for combining the acoustic model probabilities: the first performs the combination in the probability domain, and the second in the log-probability domain. We combined two and three alternative baseline systems, where it was possible to obtain relative improvements in word error rate larger than 20% for a large-vocabulary speaker-independent continuous speech recognition task.
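The two combination domains described above reduce to a weighted arithmetic mean of posteriors versus a weighted geometric mean (a sum in log space). A minimal sketch; the phoneme posteriors and equal weights are invented for illustration:

```python
import math

def combine_linear(prob_sets, weights):
    """Probability-domain combination: weighted arithmetic mean per phoneme."""
    return [sum(w * p[k] for w, p in zip(weights, prob_sets))
            for k in range(len(prob_sets[0]))]

def combine_log(prob_sets, weights):
    """Log-probability-domain combination: weighted sum of log posteriors
    (a weighted geometric mean), renormalized to sum to 1."""
    raw = [math.exp(sum(w * math.log(p[k]) for w, p in zip(weights, prob_sets)))
           for k in range(len(prob_sets[0]))]
    z = sum(raw)
    return [r / z for r in raw]

# Posteriors over three phonemes from two hypothetical MLP acoustic models
mlp_a = [0.7, 0.2, 0.1]
mlp_b = [0.5, 0.4, 0.1]
lin = combine_linear([mlp_a, mlp_b], [0.5, 0.5])  # approximately [0.6, 0.3, 0.1]
log = combine_log([mlp_a, mlp_b], [0.5, 0.5])
```

The log-domain rule penalizes phonemes that any single model scores low, which is one common rationale for choosing it over the linear rule.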

62 citations

Proceedings ArticleDOI
15 May 2006
TL;DR: This paper describes an application of SVMs to speaker verification and shows a 9% absolute improvement in equal error rate and a 33% relative improvement in minimum detection cost function when compared to a comparable HMM baseline system.
Abstract: Support vector machines (SVM) have become a very popular pattern recognition algorithm for speech processing. In this paper we describe an application of SVMs to speaker verification. Traditionally, speaker verification systems have used hidden Markov models (HMM) and Gaussian mixture models (GMM). These classifiers are based on generative models, are prone to overfitting, and do not directly optimize discrimination. SVMs, which are based on the principle of structural risk minimization, are binary classifiers that maximize the margin between two classes. The power of SVMs lies in their ability to transform data to a higher-dimensional space and to construct a linear binary classifier in this space. Experiments were conducted on the NIST 2003 speaker recognition evaluation dataset. The SVM training was made computationally feasible by selecting only a small subset of vectors for building the out-of-class data. The results obtained using the SVMs showed a 9% absolute improvement in equal error rate and a 33% relative improvement in minimum detection cost function when compared to a comparable HMM baseline system.
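The equal error rate improved above is the operating point where the false-accept rate (impostors accepted) equals the false-reject rate (targets rejected). A minimal threshold sweep over invented score lists:

```python
def equal_error_rate(target_scores, impostor_scores):
    """Sweep candidate thresholds; return the error rate at the point where
    the false-accept and false-reject rates are closest (their mean)."""
    best_gap, eer = 1.0, None
    for t in sorted(target_scores + impostor_scores):
        far = sum(s >= t for s in impostor_scores) / len(impostor_scores)
        frr = sum(s < t for s in target_scores) / len(target_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

# Hypothetical verification scores: higher should mean "same speaker"
targets = [0.9, 0.8, 0.7, 0.4]
impostors = [0.6, 0.3, 0.2, 0.1]
print(equal_error_rate(targets, impostors))  # → 0.25
```

Real evaluations interpolate between thresholds on much larger score sets, but the crossing-point idea is the same.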

62 citations

Posted Content
TL;DR: The extension and optimization of previous work on very deep convolutional neural networks for effective recognition of noisy speech in the Aurora 4 task are described, and it is shown that state-level weighted log-likelihood score combination in a joint acoustic model decoding scheme is very effective.
Abstract: This paper describes the extension and optimization of our previous work on very deep convolutional neural networks (CNNs) for effective recognition of noisy speech in the Aurora 4 task. The appropriate number of convolutional layers, the sizes of the filters, the pooling operations and the input feature maps are all modified: the filter and pooling sizes are reduced and the dimensions of the input feature maps are extended to allow adding more convolutional layers. Furthermore, appropriate input padding and input feature map selection strategies are developed. In addition, an adaptation framework is developed that jointly trains the very deep CNN with auxiliary i-vector and fMLLR features. These modifications give substantial word error rate reductions over the standard CNN used as a baseline. Finally, the very deep CNN is combined with an LSTM-RNN acoustic model, and it is shown that state-level weighted log-likelihood score combination in a joint acoustic model decoding scheme is very effective. On the Aurora 4 task, the very deep CNN achieves a WER of 8.81%, which improves to 7.99% with auxiliary-feature joint training and to 7.09% with LSTM-RNN joint decoding.
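State-level weighted log-likelihood score combination reduces, per frame and HMM state, to a weighted sum of the two models' log scores, which the decoder then uses in place of either single model's score. A sketch with invented per-state numbers:

```python
import math

def combine_state_scores(loglik_a, loglik_b, weight=0.5):
    """Frame-level combination of two acoustic models' per-state log-likelihoods:
    log p = w * log p_a + (1 - w) * log p_b for each HMM state."""
    return [weight * a + (1 - weight) * b
            for a, b in zip(loglik_a, loglik_b)]

# Hypothetical per-state likelihoods for one frame from a CNN and an LSTM-RNN
cnn = [math.log(0.6), math.log(0.3), math.log(0.1)]
lstm = [math.log(0.5), math.log(0.4), math.log(0.1)]

combined = combine_state_scores(cnn, lstm, weight=0.7)
best_state = max(range(len(combined)), key=combined.__getitem__)
print(best_state)  # → 0 (both models favor state 0, so the combination does too)
```

In an actual decoder these combined scores would feed the Viterbi search rather than a per-frame argmax; the weight is typically tuned on held-out data.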

62 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 88% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Artificial neural network: 207K papers, 4.5M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528