Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
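
For orientation, word error rate is conventionally computed as the word-level edit distance (substitutions + deletions + insertions) between a hypothesis transcript and a reference transcript, divided by the number of reference words. This page gives no formula, so the Python sketch below illustrates that standard definition and is not taken from any of the listed papers. The function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are obtained by splitting on whitespace, with no
    case folding or punctuation normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion (reference word missing)
                curr[j - 1] + 1,     # insertion (extra hypothesis word)
                prev[j - 1] + cost,  # substitution or exact match
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One deleted word out of six reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, the value can exceed 1.0 when the hypothesis is much longer than the reference.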


Papers
Proceedings Article
01 Jan 1999
TL;DR: This paper reports data from a study of three commercially available ASR systems, showing that initial users of speech systems tend to fixate on a single error-correction strategy; this tendency, coupled with application assumptions about how error correction features will be used, makes for a very frustrating and unsatisfying user experience.
Abstract: Automatic Speech Recognition (ASR) systems have improved greatly over the last three decades. However, even with 98% reported accuracy, error correction still consumes a significant portion of user effort in text creation tasks. We report on data collected during a study of three commercially available ASR systems that show how initial users of speech systems tend to fixate on a single strategy for error correction. This tendency coupled with application assumptions about how error correction features will be used, combine to make a very frustrating, and unsatisfying user experience. We observe two distinct error correction patterns: spiral depth (Oviatt & van Gent, 1996) and cascades. In contrast, users with more extensive experience learn to switch correction strategies more quickly.

84 citations

Journal ArticleDOI
TL;DR: This paper builds upon a state-of-the-art SED method that performs frame-by-frame detection using a bidirectional LSTM recurrent neural network, and incorporates a duration-controlled modeling technique based on a hidden semi-Markov model that makes it possible to model the duration of each sound event precisely and to perform sequence-by-sequence detection without having to resort to thresholding.
Abstract: This paper presents a new hybrid approach called duration-controlled long short-term memory (LSTM) for polyphonic sound event detection (SED). It builds upon a state-of-the-art SED method that performs frame-by-frame detection using a bidirectional LSTM recurrent neural network (BLSTM), and incorporates a duration-controlled modeling technique based on a hidden semi-Markov model. The proposed approach makes it possible to model the duration of each sound event precisely and to perform sequence-by-sequence detection without having to resort to thresholding, as in conventional frame-by-frame methods. Furthermore, to effectively reduce sound event insertion errors, which often occur under noisy conditions, we also introduce a binary-mask-based postprocessing that relies on a sound activity detection network to identify segments with any sound event activity, an approach inspired by the well-known benefits of voice activity detection in speech recognition systems. We conduct an experiment using the DCASE2016 task 2 dataset to compare our proposed method with typical conventional methods, such as nonnegative matrix factorization and standard BLSTM. Our proposed method outperforms the conventional methods both in an event-based evaluation, achieving a 75.3% F1 score and a 44.2% error rate, and in a segment-based evaluation, achieving an 81.1% F1 score, and a 32.9% error rate, outperforming the best results reported in the DCASE2016 task 2 Challenge.

84 citations

Journal ArticleDOI
TL;DR: This paper examines the relation between word frequency, repetition, and stimulus quality in a lexical decision experiment and finds that, in contrast to earlier studies, frequency and stimulus quality interact; the implications of this result for models of word recognition are discussed within the framework of Becker's verification model.
Abstract: This paper describes a lexical decision experiment, which examined the relation between word frequency, repetition and stimulus quality. In contrast to earlier studies (Stanners, Jastrzembski and Westbrook, 1975; Becker and Killion, 1977), frequency and stimulus quality were found to interact. The implications of this result for models of word recognition are discussed within the framework of Becker's verification model.

83 citations

Journal ArticleDOI
TL;DR: Adapted MFCC and PLP coefficients improve human activity recognition and segmentation accuracy while considerably reducing feature vector size; they significantly improve on baseline error rates and substantially reduce the segmentation error rate.

83 citations

Journal ArticleDOI
TL;DR: The bit-error rate (BER) performance of multilevel quadrature amplitude modulation with pilot-symbol-assisted modulation channel estimation in static and Rayleigh fading channels is derived, both for single branch reception and maximal ratio combining diversity receiver systems.
Abstract: The bit-error rate (BER) performance of multilevel quadrature amplitude modulation with pilot-symbol-assisted modulation channel estimation in static and Rayleigh fading channels is derived, both for single branch reception and maximal ratio combining diversity receiver systems. The effects of noise and estimator decorrelation on the received BER are examined. The high sensitivity of diversity systems to channel estimation error is investigated and quantified. The influence of the pilot-symbol interpolation filter windowing is also considered.

83 citations


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations, 88% related
Feature extraction: 111.8K papers, 2.1M citations, 86% related
Convolutional neural network: 74.7K papers, 2M citations, 85% related
Artificial neural network: 207K papers, 4.5M citations, 84% related
Cluster analysis: 146.5K papers, 2.9M citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    271
2022    562
2021    640
2020    643
2019    633
2018    528