Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
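For reference, word error rate (WER) is the word-level Levenshtein (edit) distance between a recognised hypothesis and its reference transcript, normalised by the reference length: WER = (S + D + I) / N, where S, D, and I count substitutions, deletions, and insertions against N reference words. A minimal Python sketch (the function name and example strings are illustrative):

```python
# Word error rate via word-level Levenshtein distance:
# WER = (S + D + I) / N over N reference words.
def wer(reference: str, hypothesis: str) -> float:
    ref, hyp = reference.split(), hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat mat"))  # 2 deletions / 6 words ≈ 0.33
```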


Papers
Journal ArticleDOI
28 Sep 2017-Sensors
TL;DR: Combining Savitzky-Golay and moving average filters with outlier detection and removal based on normalised cross-correlation and clustering enhances the unusually low quality of steering-wheel electrocardiogram signals, rendering ensemble heartbeats of significantly higher quality.
Abstract: Electrocardiogram signals acquired through a steering wheel could be the key to seamless, highly comfortable, and continuous human recognition in driving settings. This paper focuses on enhancing the unusually low quality of such signals through the combination of Savitzky-Golay and moving average filters, followed by outlier detection and removal based on normalised cross-correlation and clustering, which together render ensemble heartbeats of significantly higher quality. Discrete Cosine Transform (DCT) and Haar transform features were extracted and fed to decision methods based on Support Vector Machines (SVM), k-Nearest Neighbours (kNN), Multilayer Perceptrons (MLP), and Gaussian Mixture Models - Universal Background Models (GMM-UBM) classifiers, for both identification and authentication tasks. Additional techniques of user-tuned authentication and past score weighting were also studied. The method's performance was comparable to some of the best recent state-of-the-art methods (94.9% identification rate (IDR) and 2.66% authentication equal error rate (EER)), despite weaker results with scarce training data (70.9% IDR and 11.8% EER). It was concluded that the method was suitable for biometric recognition with driving electrocardiogram signals, and could, with future developments, be used on a continuous system in seamless and highly noisy settings.

86 citations
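A minimal sketch of the enhancement stage this abstract describes, using SciPy's savgol_filter followed by a moving average and normalised cross-correlation outlier rejection; the window sizes, polynomial order, and correlation threshold are illustrative assumptions, not the paper's parameters:

```python
# Sketch: Savitzky-Golay smoothing + moving average, then rejection of
# heartbeats that correlate poorly with the mean template (assumed threshold).
import numpy as np
from scipy.signal import savgol_filter

def enhance(ecg: np.ndarray, sg_window: int = 21, sg_order: int = 3,
            ma_window: int = 5) -> np.ndarray:
    smoothed = savgol_filter(ecg, window_length=sg_window, polyorder=sg_order)
    kernel = np.ones(ma_window) / ma_window
    return np.convolve(smoothed, kernel, mode="same")

def ensemble_heartbeat(beats: np.ndarray, threshold: float = 0.8) -> np.ndarray:
    """Average the segmented heartbeats (rows of `beats`) whose normalised
    cross-correlation with the mean template exceeds the threshold."""
    template = beats.mean(axis=0)
    def ncc(a, b):
        a, b = a - a.mean(), b - b.mean()
        return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
    kept = np.array([b for b in beats if ncc(b, template) >= threshold])
    return kept.mean(axis=0) if len(kept) else template
```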

Journal ArticleDOI
TL;DR: The neural syntactic language model achieves the best published perplexity and WER results for the given data sets; comparisons with standard and neural-net-based N-gram models with arbitrarily long contexts show that syntactic information is in fact very helpful in estimating the word string probability.
Abstract: This paper presents a study of using neural probabilistic models in a syntactic based language model. The neural probabilistic model makes use of a distributed representation of the items in the conditioning history, and is powerful in capturing long dependencies. Employing neural network based models in the syntactic based language model enables it to efficiently use the large amount of information available in a syntactic parse when estimating the next word in a string. Several scenarios of integrating neural networks in the syntactic based language model are presented, accompanied by the derivation of the training procedures involved. Experiments on the UPenn Treebank and the Wall Street Journal corpus show significant improvements in perplexity and word error rate over the baseline SLM. Furthermore, comparisons with the standard and neural net based N-gram models with arbitrarily long contexts show that the syntactic information is in fact very helpful in estimating the word string probability. Overall, our neural syntactic based model achieves the best published results in perplexity and WER for the given data sets.

86 citations
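The distributed-representation idea at the model's core can be sketched as a Bengio-style neural n-gram: each word in the conditioning history gets a learned embedding, and the concatenated embeddings predict the next word. The syntactic integration is omitted here, and all names and hyperparameters are illustrative:

```python
# Minimal neural probabilistic language model sketch (PyTorch): embeddings of
# the history are concatenated and mapped to next-word logits.
import torch
import torch.nn as nn

class NeuralNgramLM(nn.Module):
    def __init__(self, vocab_size: int, context_size: int = 3,
                 embed_dim: int = 64, hidden_dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.net = nn.Sequential(
            nn.Linear(context_size * embed_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, vocab_size),
        )

    def forward(self, context: torch.Tensor) -> torch.Tensor:
        # context: (batch, context_size) word ids -> (batch, vocab_size) logits
        e = self.embed(context).flatten(start_dim=1)
        return self.net(e)

model = NeuralNgramLM(vocab_size=10000)
logits = model(torch.randint(0, 10000, (8, 3)))
log_probs = torch.log_softmax(logits, dim=-1)  # next-word log-probabilities
```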

Journal ArticleDOI
TL;DR: This work describes a novel adaptation algorithm for language models with time and dialog-state varying parameters that allows for recognizing and understanding unconstrained speech at each stage of the dialog, enabling context-switching and error recovery.
Abstract: We are interested in adaptive spoken dialog systems for automated services. People's spoken language usage varies over time for a given task, and furthermore varies depending on the state of the dialog. Thus, it is crucial to adapt automatic speech recognition (ASR) language models to these varying conditions. We characterize and quantify these variations based on a database of 30K user transactions with AT&T's experimental How May I Help You? spoken dialog system. We describe a novel adaptation algorithm for language models with time and dialog-state varying parameters. Our language adaptation framework allows for recognizing and understanding unconstrained speech at each stage of the dialog, enabling context-switching and error recovery. These models have been used to train state-dependent ASR language models. We have evaluated their performance with respect to word accuracy and perplexity over time and dialog states. We have achieved a reduction of 40% in perplexity and of 8.4% in word error rate over the baseline system, averaged across all dialog states.

85 citations
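The adaptation idea can be sketched as interpolating a dialog-state-specific model with a state-independent background model and scoring perplexity per state. The unigram simplification, vocabulary, and interpolation weight below are assumptions for illustration, not the paper's algorithm:

```python
# Sketch: state-dependent LM = lam * state model + (1 - lam) * background model,
# with add-alpha smoothed unigrams standing in for the paper's richer models.
import math
from collections import Counter

def make_lm(sentences, vocab, alpha=1.0):
    counts = Counter(w for s in sentences for w in s)
    total = sum(counts.values())
    return {w: (counts[w] + alpha) / (total + alpha * len(vocab)) for w in vocab}

def interpolate(state_lm, background_lm, lam=0.6):
    return {w: lam * state_lm[w] + (1 - lam) * background_lm[w] for w in state_lm}

def perplexity(lm, sentences):
    logp = sum(math.log(lm[w]) for s in sentences for w in s)
    n = sum(len(s) for s in sentences)
    return math.exp(-logp / n)

vocab = {"yes", "no", "operator", "billing"}
background = make_lm([["yes", "no"], ["billing", "operator"]], vocab)
greeting = make_lm([["operator"], ["operator", "yes"]], vocab)  # one dialog state
print(perplexity(interpolate(greeting, background), [["operator", "yes"]]))
```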

Proceedings ArticleDOI
12 May 1998
TL;DR: This paper compares various category-based language models, each combined with a word-based trigram by linear interpolation, and finds the largest improvement with a model using automatically determined categories.
Abstract: This paper compares various category-based language models when used in conjunction with a word-based trigram by means of linear interpolation. Categories corresponding to parts-of-speech as well as automatically clustered groupings are considered. The category-based model employs variable-length n-grams and permits each word to belong to multiple categories. Relative word error rate reductions of between 2 and 7% over the baseline are achieved in N-best rescoring experiments on the Wall Street Journal corpus. The largest improvement is obtained with a model using automatically determined categories. Perplexities continue to decrease as the number of different categories is increased, but improvements in the word error rate reach an optimum.

85 citations
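The interpolation scheme can be written directly: P(w|h) = lam * P_tri(w|h) + (1 - lam) * sum_c P(w|c) * P(c|h), summing over the (possibly multiple) categories of w. A sketch with toy probability tables standing in for trained models:

```python
# Linear interpolation of a word trigram with a category-based model; each
# word may belong to several categories. All tables are toy placeholders.
def interpolated_prob(word, history, p_trigram, p_word_given_cat,
                      p_cat_given_history, categories_of, lam=0.7):
    p_cat = sum(p_word_given_cat[c][word] * p_cat_given_history[history][c]
                for c in categories_of[word])
    return lam * p_trigram[history][word] + (1 - lam) * p_cat

p_tri = {("the", "river"): {"bank": 0.02}}
p_w_c = {"NOUN": {"bank": 0.01}, "FIN": {"bank": 0.03}}
p_c_h = {("the", "river"): {"NOUN": 0.6, "FIN": 0.4}}
cats = {"bank": ["NOUN", "FIN"]}
print(interpolated_prob("bank", ("the", "river"), p_tri, p_w_c, p_c_h, cats))  # 0.0194
```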

Posted Content
TL;DR: Preliminary experiments indicate that wav2vec 2.0 can capture speaker and language information, and a single model achieves unified modeling of the two tasks through multi-task learning.
Abstract: Wav2vec 2.0 is a recently proposed self-supervised framework for speech representation learning. It follows a two-stage training process of pre-training and fine-tuning, and performs well in speech recognition tasks, especially ultra-low-resource cases. In this work, we attempt to extend the self-supervised framework to speaker verification and language identification. First, we use some preliminary experiments to indicate that wav2vec 2.0 can capture information about the speaker and language. Then we demonstrate the effectiveness of wav2vec 2.0 on the two tasks respectively. For speaker verification, we obtain a new state-of-the-art result, an Equal Error Rate (EER) of 3.61% on the VoxCeleb1 dataset. For language identification, we obtain an EER of 12.02% in the 1-second condition and an EER of 3.47% in the full-length condition of the AP17-OLR dataset. Finally, we utilize one model to achieve unified modeling of the two tasks via multi-task learning.

85 citations
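The Equal Error Rate reported above is the operating point where the false acceptance and false rejection rates coincide. A minimal sketch of computing it by sweeping a decision threshold over verification scores (the scores below are random placeholders, not wav2vec 2.0 outputs):

```python
# EER: sort trials by score, sweep the acceptance threshold, and find the
# point where false acceptance rate ~= false rejection rate.
import numpy as np

def equal_error_rate(scores: np.ndarray, labels: np.ndarray) -> float:
    # labels: 1 = target (same speaker) trial, 0 = impostor trial
    order = np.argsort(scores)[::-1]
    labels = labels[order]
    fa = np.cumsum(1 - labels) / (1 - labels).sum()  # false acceptance rate
    fr = 1 - np.cumsum(labels) / labels.sum()        # false rejection rate
    i = np.argmin(np.abs(fa - fr))
    return float((fa[i] + fr[i]) / 2)

rng = np.random.default_rng(0)
scores = np.concatenate([rng.normal(1, 1, 500), rng.normal(-1, 1, 500)])
labels = np.concatenate([np.ones(500), np.zeros(500)])
print(f"EER ~ {equal_error_rate(scores, labels):.3f}")
```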


Network Information

Related Topics (5)

Topic                          Papers    Citations   Relatedness
Deep learning                  79.8K     2.1M        88%
Feature extraction             111.8K    2.1M        86%
Convolutional neural network   74.7K     2M          85%
Artificial neural network      207K      4.5M        84%
Cluster analysis               146.5K    2.9M        83%
Performance Metrics

No. of papers in the topic in previous years:

Year   Papers
2023   271
2022   562
2021   640
2020   643
2019   633
2018   528