scispace - formally typeset
Topic

Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.


Papers
Journal ArticleDOI
22 May 2011
TL;DR: It is shown that it is possible to train a gender-independent discriminative model that achieves state-of-the-art accuracy, comparable to that of a gender-dependent system, saving memory and execution time both in training and in testing.
Abstract: This work presents a new and efficient approach to discriminative speaker verification in the i-vector space. We illustrate the development of a linear discriminative classifier that is trained to discriminate between the hypothesis that a pair of feature vectors in a trial belongs to the same speaker or to different speakers. This approach is an alternative to the usual discriminative setup that discriminates between a speaker and all the other speakers. We use a discriminative classifier based on a Support Vector Machine (SVM) that is trained to estimate the parameters of a symmetric quadratic function approximating a log-likelihood ratio score, without explicit modeling of the i-vector distributions as in the generative Probabilistic Linear Discriminant Analysis (PLDA) models. Training these models is feasible because it is not necessary to expand the i-vector pairs, which would be expensive or even infeasible for medium-sized training sets. The results of experiments performed on the tel-tel extended core condition of the NIST 2010 Speaker Recognition Evaluation are competitive with those obtained by generative models in terms of normalized Detection Cost Function and Equal Error Rate. Moreover, we show that it is possible to train a gender-independent discriminative model that achieves state-of-the-art accuracy, comparable to that of a gender-dependent system, saving memory and execution time both in training and in testing.
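The symmetric quadratic pairwise score described in this abstract can be illustrated with a small sketch. The parameter matrices `Lam` and `Gam`, the vector `c`, and the bias `k` below are hypothetical stand-ins for the parameters the SVM would learn; this is not the authors' implementation, only a demonstration that such a function is symmetric in the two i-vectors of a trial:

```python
import numpy as np

def pairwise_score(x, y, Lam, Gam, c, k):
    """Symmetric quadratic score for an i-vector pair (x, y):
    s = x'Lam y + y'Lam x + x'Gam x + y'Gam y + (x + y)'c + k.
    By construction it is invariant to swapping x and y, so it can
    approximate a log-likelihood ratio for a same/different-speaker trial."""
    return (x @ Lam @ y + y @ Lam @ x
            + x @ Gam @ x + y @ Gam @ y
            + (x + y) @ c + k)

# Toy 3-dimensional example with arbitrary (assumed) parameters.
rng = np.random.default_rng(0)
Lam = rng.standard_normal((3, 3))
Gam = rng.standard_normal((3, 3))
c = rng.standard_normal(3)
x, y = rng.standard_normal(3), rng.standard_normal(3)

s_xy = pairwise_score(x, y, Lam, Gam, c, 0.0)
s_yx = pairwise_score(y, x, Lam, Gam, c, 0.0)
assert np.isclose(s_xy, s_yx)  # symmetry: trial order does not matter
```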

110 citations

Proceedings ArticleDOI
Kartik Audhkhasi, Brian Kingsbury, Bhuvana Ramabhadran, George Saon, Michael Picheny
15 Apr 2018
TL;DR: In this paper, a joint word-character A2W model was proposed to learn to first spell the word and then recognize it, achieving a word error rate of 8.8%/13.9% on the Hub5-2000 Switchboard/CallHome test sets without any decoder, pronunciation lexicon, or externally-trained language model.
Abstract: Direct acoustics-to-word (A2W) models in the end-to-end paradigm have received increasing attention compared to conventional subword-based automatic speech recognition models using phones, characters, or context-dependent hidden Markov model states. This is because A2W models recognize words from speech without any decoder, pronunciation lexicon, or externally-trained language model, making training and decoding with such models simple. Prior work has shown that A2W models require orders of magnitude more training data in order to perform comparably to conventional models. Our work also showed this accuracy gap when using the English Switchboard-Fisher data set. This paper describes a recipe to train an A2W model that closes this gap and is on par with state-of-the-art sub-word based models. We achieve a word error rate of 8.8%/13.9% on the Hub5-2000 Switchboard/CallHome test sets without any decoder or language model. We find that model initialization, training data order, and regularization have the most impact on A2W model performance. Next, we present a joint word-character A2W model that learns to first spell the word and then recognize it. This model provides a rich output to the user instead of simple word hypotheses, making it especially useful in the case of words unseen or rarely seen during training.
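Word error rate itself, the metric quoted in results like the 8.8%/13.9% above, is the Levenshtein edit distance between the reference and hypothesis word sequences divided by the reference length. A minimal sketch of the standard computation:

```python
def wer(ref, hyp):
    """Word error rate: (substitutions + deletions + insertions) / len(ref),
    computed via Levenshtein edit distance over word sequences."""
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i          # delete all i reference words
    for j in range(len(hyp) + 1):
        d[0][j] = j          # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j - 1] + sub,  # match / substitution
                          d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1)        # insertion
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat".split(), "the cat sat".split()))
# 0.0 — a perfect hypothesis
print(wer("the cat sat".split(), "a cat sat down".split()))
# 1 substitution + 1 insertion over 3 reference words = 2/3
```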

110 citations

Proceedings ArticleDOI
03 Aug 2010
TL;DR: This work proposes and validates a novel physics-based method to detect images recaptured from printed material using only a single image, and shows that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of the approach.
Abstract: Face recognition is an increasingly popular method for user authentication. However, face recognition is susceptible to playback attacks. Therefore, a reliable way to detect malicious attacks is crucial to the robustness of the system. We propose and validate a novel physics-based method to detect images recaptured from printed material using only a single image. Micro-textures present in printed paper manifest themselves in the specular component of the image. Features extracted from this component allow a linear SVM classifier to achieve a 2.2% False Acceptance Rate and a 13% False Rejection Rate (6.7% Equal Error Rate). We also show that the classifier generalizes to contrast-enhanced recaptured images and LCD screen recaptured images without re-training, demonstrating the robustness of our approach.
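The Equal Error Rate quoted above is the operating point where the false-acceptance and false-rejection rates coincide. A sketch of how it can be read off from classifier scores, using hypothetical overlapping score distributions rather than the paper's data:

```python
import numpy as np

def equal_error_rate(genuine, impostor):
    """Sweep candidate thresholds over all observed scores and return the
    rate at the threshold where false-acceptance (impostors scoring at or
    above threshold) and false-rejection (genuines scoring below) are closest."""
    best_gap, best_rate = 1.0, None
    for t in np.sort(np.concatenate([genuine, impostor])):
        far = np.mean(impostor >= t)   # false acceptances
        frr = np.mean(genuine < t)     # false rejections
        if abs(far - frr) < best_gap:
            best_gap, best_rate = abs(far - frr), (far + frr) / 2
    return best_rate

# Hypothetical score distributions (assumed, not from the paper):
# genuine users score higher on average than recapture attacks.
rng = np.random.default_rng(1)
genuine = rng.normal(2.0, 1.0, 1000)
impostor = rng.normal(0.0, 1.0, 1000)
eer = equal_error_rate(genuine, impostor)
assert 0.0 < eer < 0.5  # well-separated but overlapping distributions
```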

110 citations

Journal ArticleDOI
TL;DR: An i-vector representation based on bottleneck (BN) features is presented for language identification (LID), and the resulting LID performance is significantly improved with the proposed BN-feature-based i-vector representation.
Abstract: An i-vector representation based on bottleneck (BN) features is presented for language identification (LID). In the proposed system, the BN features are extracted from a deep neural network, which can effectively mine the contextual information embedded in speech frames. The i-vector representation of each utterance is then obtained by applying a total variability approach on the BN features. The resulting performance of LID has been significantly improved with the proposed BN feature based i-vector representation. Compared with the state-of-the-art techniques, the equal error rate is relatively reduced by about 40% on the National Institute of Standards and Technology (NIST) 2009 evaluation sets.
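A "relative" reduction, as quoted above, compares the change to the baseline rather than subtracting percentage points. A tiny sketch with hypothetical EER values (the abstract does not state the absolute numbers):

```python
def relative_reduction(baseline, improved):
    """Fractional improvement over the baseline: (old - new) / old."""
    return (baseline - improved) / baseline

# Hypothetical EERs: a 40% relative reduction from a 5.0% baseline
# lands at 3.0% EER — an absolute drop of only 2 percentage points.
baseline_eer, new_eer = 5.0, 3.0
print(relative_reduction(baseline_eer, new_eer))  # 0.4
```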

110 citations

Journal ArticleDOI
TL;DR: An extension of the Central Limit Theorem based on Lindeberg condition is adopted here to derive the distribution of the number of design samples with wrong sign estimate and subsequently determine the maximum error rate for failure probability estimates.
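The Central Limit Theorem argument in this TL;DR underlies standard Monte Carlo estimation of a failure probability: the estimator is a sample mean of indicator variables, so its error is asymptotically normal. A hedged sketch with a toy limit-state function (the paper's actual setting concerns sign errors from surrogate models, which this does not reproduce):

```python
import math
import random

def mc_failure_probability(limit_state, sampler, n, z=1.96):
    """Monte Carlo estimate of P(limit_state(x) < 0), with a CLT-based
    95% confidence half-width z * sqrt(p * (1 - p) / n)."""
    failures = sum(1 for _ in range(n) if limit_state(sampler()) < 0)
    p = failures / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p, half_width

# Toy example: standard normal input, failure when x > 2 (g = 2 - x < 0).
# The true failure probability is 1 - Phi(2), about 0.0228.
rng = random.Random(42)
p, hw = mc_failure_probability(lambda x: 2.0 - x, lambda: rng.gauss(0, 1), 100_000)
assert 0.015 < p < 0.03   # estimate lands near the true value
assert hw < 0.01          # confidence interval shrinks as 1/sqrt(n)
```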

110 citations


Network Information
Related Topics (5)
Deep learning
79.8K papers, 2.1M citations
88% related
Feature extraction
111.8K papers, 2.1M citations
86% related
Convolutional neural network
74.7K papers, 2M citations
85% related
Artificial neural network
207K papers, 4.5M citations
84% related
Cluster analysis
146.5K papers, 2.9M citations
83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528