Topic
Word error rate
About: Word error rate is a research topic. Over the lifetime, 11939 publications have been published within this topic receiving 298031 citations.
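Word error rate itself is defined as the minimum number of word substitutions, deletions, and insertions needed to turn the hypothesis into the reference, divided by the reference length. A minimal sketch using Levenshtein distance over word tokens (function and example strings are illustrative, not from any paper below):

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length,
    computed as Levenshtein edit distance over word tokens."""
    ref = reference.split()
    hyp = hypothesis.split()
    # dp[i][j] = edit distance between ref[:i] and hyp[:j]
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i
    for j in range(len(hyp) + 1):
        dp[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,        # deletion
                           dp[i][j - 1] + 1,        # insertion
                           dp[i - 1][j - 1] + cost) # substitution or match
    return dp[-1][-1] / len(ref)

print(wer("the cat sat on the mat", "the cat sit on mat"))  # 2 errors / 6 words ≈ 0.333
```

Note that WER can exceed 1.0 when the hypothesis contains many insertions relative to the reference.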
Papers
TL;DR: This paper showed that competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition.
Abstract: Spoken utterances contain few reliable cues to word boundaries, but listeners nonetheless experience little difficulty identifying words in continuous speech. The authors present data and simulations that suggest that this ability is best accounted for by a model of spoken-word recognition combining competition between alternative lexical candidates and sensitivity to prosodic structure. In a word-spotting experiment, stress pattern effects emerged most clearly when there were many competing lexical candidates for part of the input. Thus, competition between simultaneously active word candidates can modulate the size of prosodic effects, which suggests that spoken-word recognition must be sensitive both to prosodic structure and to the effects of competition. A version of the Shortlist model (D. G. Norris, 1994b) incorporating the Metrical Segmentation Strategy (A. Cutler & D. Norris, 1988) accurately simulates the results using a lexicon of more than 25,000 words.
267 citations
TL;DR: This article describes the 2017 version of Microsoft's conversational speech recognition system, which adds a CNN-BLSTM acoustic model to the previously combined set of model architectures and includes character-based and dialog-session-aware LSTM language models in rescoring.
Abstract: We describe the 2017 version of Microsoft's conversational speech recognition system, in which we update our 2016 system with recent developments in neural-network-based acoustic and language modeling to further advance the state of the art on the Switchboard speech recognition task. The system adds a CNN-BLSTM acoustic model to the set of model architectures we combined previously, and includes character-based and dialog session aware LSTM language models in rescoring. For system combination we adopt a two-stage approach, whereby subsets of acoustic models are first combined at the senone/frame level, followed by word-level voting via confusion networks. We also added a confusion network rescoring step after system combination. The resulting system yields a 5.1% word error rate on the 2000 Switchboard evaluation set.
266 citations
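The final stage of the combination above is word-level voting over confusion networks. A toy sketch of the voting idea, assuming hypotheses are already word-aligned (real confusion networks are built by aligning lattices and weighting candidates by posterior probability, which this simplification omits):

```python
from collections import Counter

def vote(aligned_hypotheses):
    """Pick the majority word in each slot of pre-aligned, equal-length
    hypotheses. A toy stand-in for confusion-network voting: real systems
    align lattices and weight each candidate word by its posterior."""
    slots = zip(*[h.split() for h in aligned_hypotheses])
    return " ".join(Counter(words).most_common(1)[0][0] for words in slots)

print(vote(["the cat sat", "the bat sat", "the cat sat"]))  # -> "the cat sat"
```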
TL;DR: In this article, the authors proposed a "sending or not sending" protocol based on twin-field quantum key distribution (TF-QKD) that can tolerate large misalignment errors.
Abstract: Based on the novel idea of twin-field quantum key distribution [TF-QKD; Lucamarini et al., Nature (London) 557, 400 (2018)], we present a protocol named the "sending or not sending TF-QKD" protocol, which can tolerate large misalignment error. A revolutionary theoretical breakthrough in quantum communication, TF-QKD changes the channel-loss dependence of the key rate from linear to the square root of the channel transmittance. However, it demands the challenging technology of long-distance single-photon interference, and, as stated in the original paper, the security proof was not finalized there due to the possible effects of the later-announced phase information. Here we show by a concrete eavesdropping scheme that the later phase announcement does have important effects and that the traditional formulas of the decoy-state method do not apply to the original protocol. We then present our "sending or not sending" protocol. Our protocol does not take postselection for the bits in the Z basis (signal pulses), so the traditional decoy-state method applies directly and automatically resolves the issue of the security proof. Most importantly, our protocol has a negligibly small error rate in the Z basis because it does not require any single-photon interference in this basis. Thus our protocol greatly improves the tolerable threshold of misalignment error in single-photon interference, from a few percent in the original protocol to more than 45%. As shown numerically, our protocol exceeds a secure distance of 700, 600, 500, or 300 km even when the single-photon interference misalignment error rate is as large as 15%, 25%, 35%, or 45%, respectively.
266 citations
TL;DR: A simple end-to-end model for speech recognition is presented, combining a convolutional-network-based acoustic model and graph decoding, trained to output letters without the need for forced alignment of phonemes.
Abstract: This paper presents a simple end-to-end model for speech recognition, combining a convolutional-network-based acoustic model and graph decoding. It is trained to output letters from transcribed speech, without the need for forced alignment of phonemes. We introduce an automatic segmentation criterion for training from sequence annotation without alignment that is on par with CTC while being simpler. We show competitive word error rates on the Librispeech corpus with MFCC features, and promising results from the raw waveform.
266 citations
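For letter-output models like the one above, a decoder typically collapses the per-frame letter sequence into a transcript. A minimal sketch of greedy CTC-style collapsing (merge consecutive repeats, then drop the blank symbol), offered only to illustrate the general alignment-free decoding idea, not the paper's specific criterion:

```python
def ctc_greedy_collapse(frame_labels, blank="_"):
    """Greedy CTC-style decoding: merge consecutive repeated labels,
    then remove blank symbols."""
    out = []
    prev = None
    for label in frame_labels:
        if label != prev and label != blank:
            out.append(label)
        prev = label
    return "".join(out)

print(ctc_greedy_collapse(list("hh_e_ll_lo__")))  # -> "hello"
```

The blank symbol lets the model emit genuine doubled letters ("ll" in "hello") by separating them across frames.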
20 Jun 2005
TL;DR: An algorithm with a near-optimal time and error-rate trade-off, called WaldBoost, is proposed; it integrates the AdaBoost algorithm for measurement selection and ordering, and joint probability density estimation, with the optimal SPRT decision strategy.
Abstract: In many computer vision classification problems, both error and time characterize the quality of a decision. We show that such problems can be formalized in the framework of sequential decision-making. If the false positive and false negative error rates are given, the optimal strategy in terms of the shortest average time to decision (number of measurements used) is Wald's sequential probability ratio test (SPRT). We build on the optimal SPRT and enlarge its capabilities to problems with dependent measurements. We show how to overcome the requirements of SPRT: (i) a priori ordered measurements and (ii) known joint probability density functions. We propose an algorithm with a near-optimal time and error-rate trade-off, called WaldBoost, which integrates the AdaBoost algorithm for measurement selection and ordering, and the joint probability density estimation, with the optimal SPRT decision strategy. The WaldBoost algorithm is tested on the face detection problem. The results are superior to the state-of-the-art methods in average evaluation time and comparable in detection rates.
264 citations
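The SPRT underlying WaldBoost accumulates log-likelihood ratios and stops as soon as the running sum crosses thresholds set by the target false positive rate (alpha) and false negative rate (beta). A minimal sketch of the classic test (the function name and example values are illustrative):

```python
import math

def sprt(log_likelihood_ratios, alpha=0.01, beta=0.01):
    """Wald's sequential probability ratio test: accumulate log-likelihood
    ratios and decide as soon as the sum crosses a threshold.
    alpha = target false positive rate, beta = target false negative rate."""
    upper = math.log((1 - beta) / alpha)   # accept H1 at or above this
    lower = math.log(beta / (1 - alpha))   # accept H0 at or below this
    s = 0.0
    for t, llr in enumerate(log_likelihood_ratios, start=1):
        s += llr
        if s >= upper:
            return "H1", t  # decided for H1 after t measurements
        if s <= lower:
            return "H0", t  # decided for H0 after t measurements
    return "undecided", len(log_likelihood_ratios)

print(sprt([1.2, 0.8, 1.5, 1.3]))  # consistent positive evidence -> ('H1', 4)
```

WaldBoost replaces the known likelihood ratios assumed here with ratios estimated from AdaBoost's weak-classifier responses, which is what makes the test applicable to face detection.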