
Word error rate

About: Word error rate is a research topic. Over its lifetime, 11,939 publications have been published within this topic, receiving 298,031 citations.
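As background, word error rate (WER) is conventionally computed as the word-level Levenshtein (edit) distance between a reference transcript and a recognizer hypothesis, divided by the number of reference words. A minimal sketch using plain dynamic programming (the function name is illustrative):

```python
def wer(reference, hypothesis):
    """Word error rate: (substitutions + deletions + insertions) / reference length,
    computed as word-level Levenshtein distance."""
    ref = reference.split()
    hyp = hypothesis.split()
    # d[i][j] = edit distance between ref[:i] and hyp[:j]
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i
    for j in range(len(hyp) + 1):
        d[0][j] = j
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            cost = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution / match
    return d[len(ref)][len(hyp)] / len(ref)

print(wer("the cat sat on the mat", "the cat sat on mat"))  # one deletion / 6 words ≈ 0.1667
```

Note that WER can exceed 100% when the hypothesis contains many insertions, since the denominator counts only reference words.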


Papers
Journal Article (DOI)
TL;DR: The next important step is to investigate fully automatic techniques for clustering multiple versions of a single word into a set of speaker‐independent word templates.
Abstract: Recent work at Bell Laboratories has demonstrated the utility of applying sophisticated pattern recognition techniques to obtain a set of speaker‐independent word templates for an isolated word recognition system [Levinson et al., IEEE Trans. Acoust. Speech Signal Process. ASSP‐27 (2), 134–141 (1979); Rabiner et al., IEEE Trans. Acoust. Speech Signal Process.(in press)]. In these studies, it was shown that a careful experimenter could guide the clustering algorithms to choose a small set of templates that were representative of a large number of replications for each word in the vocabulary. Subsequent word recognition tests verified that the templates chosen were indeed representative of a fairly large population of talkers. Given the success of this approach, the next important step is to investigate fully automatic techniques for clustering multiple versions of a single word into a set of speaker‐independent word templates. Two such techniques are described in this paper. The first method uses distance data (between replications of a word) to segment the population into stable clusters. The word template is obtained as either the cluster minimax, or as an averaged version of all the elements in the cluster. The second method is a variation of the one described by Rabiner [IEEE Trans. Acoust. Speech Signal Process. ASSP‐26 (3), 34–42 (1978)] in which averaging techniques are directly combined with the nearest neighbor rule to simultaneously define both the word template (i.e., the cluster center) and the elements in the cluster. Experimental data show the first method to be superior to the second method when three or more clusters per word are used in the recognition task.

71 citations
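The minimax cluster center mentioned in the abstract above can be sketched simply: given a pairwise distance matrix over the replications in a cluster, pick the element whose largest distance to any other element is smallest. This is an illustrative sketch, not the authors' exact implementation:

```python
import numpy as np

def minimax_center(distance_matrix):
    """Pick a cluster representative: the element whose maximum
    distance to any other cluster element is smallest."""
    D = np.asarray(distance_matrix, dtype=float)
    worst = D.max(axis=1)        # each element's farthest neighbour
    return int(worst.argmin())   # index of the minimax element

# Element 1 is within distance 2 of everything; 0 and 2 are distance 4 apart.
print(minimax_center([[0, 1, 4], [1, 0, 2], [4, 2, 0]]))  # -> 1
```

The alternative template mentioned in the abstract, an averaged version of the cluster's elements, corresponds to replacing this selection with a (time-aligned) mean over cluster members.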

Proceedings Article (DOI)
18 Aug 2008
TL;DR: A Bayesian semi-supervised Chinese word segmentation model which uses both monolingual and bilingual information to derive a segmentation suitable for MT is proposed and improves a state-of-the-art MT system in a small and a large data environment.
Abstract: Words in Chinese text are not naturally separated by delimiters, which poses a challenge to standard machine translation (MT) systems. In MT, the widely used approach is to apply a Chinese word segmenter trained from manually annotated data, using a fixed lexicon. Such word segmentation is not necessarily optimal for translation. We propose a Bayesian semi-supervised Chinese word segmentation model which uses both monolingual and bilingual information to derive a segmentation suitable for MT. Experiments show that our method improves a state-of-the-art MT system in a small and a large data environment.

71 citations

Proceedings Article
01 Jan 2011
TL;DR: This paper shows that session-independent training methods may be used to obtain robust EMG-based speech recognizers which cope well with unseen recording sessions as well as with speaking mode variations.
Abstract: This paper reports on our recent research in speech recognition by surface electromyography (EMG), which is the technology of recording the electric activation potentials of the human articulatory muscles by surface electrodes in order to recognize speech. This method can be used to create Silent Speech Interfaces, since the EMG signal is available even when no audible signal is transmitted or captured. Several past studies have shown that EMG signals may vary greatly between different recording sessions, even of one and the same speaker. This paper shows that session-independent training methods may be used to obtain robust EMG-based speech recognizers which cope well with unseen recording sessions as well as with speaking mode variations. Our best session-independent recognition system, trained on 280 utterances of 7 different sessions, achieves an average 21.93% Word Error Rate (WER) on a testing vocabulary of 108 words. The overall best session-adaptive recognition system, based on a session-independent system and adapted towards the test session with 40 adaptation sentences, achieves an average WER of 15.66%, which is a relative improvement of 21% compared to the baseline average WER of 19.96% of a session-dependent recognition system trained only on a single session of 40 sentences.

71 citations
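The relative-improvement figure quoted above follows the standard ASR convention: (baseline WER − new WER) / baseline WER. A one-line sketch using the paper's reported numbers:

```python
def relative_wer_improvement(baseline, new):
    """Relative WER reduction, as conventionally reported in ASR papers."""
    return (baseline - new) / baseline

# Adapted system (15.66%) vs. session-dependent baseline (19.96%):
print(round(100 * relative_wer_improvement(19.96, 15.66), 1))  # -> 21.5 (quoted as 21% in the abstract)
```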

Proceedings Article (DOI)
24 Apr 2017
TL;DR: In this article, the authors consider the multivariate linear regression model with shuffled data and additive noise, which arises in various correspondence estimation and matching problems and characterize the minimax error rate up to logarithmic factors.
Abstract: We consider the multivariate linear regression model with shuffled data and additive noise, which arises in various correspondence estimation and matching problems. We focus on the denoising problem and characterize the minimax error rate up to logarithmic factors. We also analyze the performance of two versions of a computationally efficient estimator that are consistent for a large range of input parameters. Finally, we provide an exact algorithm for the noiseless problem and demonstrate its performance on an image point-cloud matching task. Our analysis also extends to datasets with missing data.

71 citations

Proceedings Article (DOI)
05 Mar 2017
TL;DR: The core of the algorithm estimates a time-frequency mask which represents the target speech and uses masking-based beamforming to enhance corrupted speech; a masking-based post-filter is proposed to further suppress the noise in the output of the beamformer.
Abstract: We propose a speech enhancement algorithm based on single- and multi-microphone processing techniques. The core of the algorithm estimates a time-frequency mask which represents the target speech and uses masking-based beamforming to enhance corrupted speech. Specifically, in single-microphone processing, the received signals of a microphone array are treated as individual signals and we estimate a mask for the signal of each microphone using a deep neural network (DNN). With these masks, in multi-microphone processing, we calculate a spatial covariance matrix of noise and a steering vector for beamforming. In addition, we propose a masking-based post-filter to further suppress the noise in the output of beamforming. Then, the enhanced speech is sent back to the DNN for mask re-estimation. When these steps are iterated a few times, we obtain the final enhanced speech. The proposed algorithm is evaluated as a frontend for automatic speech recognition (ASR) and achieves a 5.05% average word error rate (WER) on the real environment test set of CHiME-3, outperforming the current best algorithm by 13.34%.

71 citations
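The mask-weighted spatial covariance step described above can be sketched generically: weight each time-frequency frame's channel outer product by the estimated mask and normalize. This is a generic recipe under standard mask-based beamforming assumptions, not necessarily the paper's exact formulation; the function name and array layout are illustrative:

```python
import numpy as np

def masked_covariance(Y, mask):
    """Mask-weighted spatial covariance estimate per frequency bin.
    Y:    (F, T, M) complex STFT of the M-channel mixture
    mask: (F, T) weights in [0, 1] marking frames dominated by the
          target (or, with an inverted mask, by the noise)
    Returns an (F, M, M) array of covariance matrices."""
    F, T, M = Y.shape
    Phi = np.zeros((F, M, M), dtype=complex)
    for f in range(F):
        w = mask[f]                                   # per-frame weights, shape (T,)
        Yf = Y[f]                                     # (T, M)
        # sum_t w[t] * y(t) y(t)^H, normalized by the total mask weight
        Phi[f] = (w[:, None] * Yf).T @ np.conj(Yf) / max(w.sum(), 1e-8)
    return Phi
```

A noise covariance estimated this way (with the noise mask) and a steering vector derived from the target covariance are the two ingredients of standard mask-based beamformers such as MVDR.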


Network Information
Related Topics (5)
Deep learning: 79.8K papers, 2.1M citations (88% related)
Feature extraction: 111.8K papers, 2.1M citations (86% related)
Convolutional neural network: 74.7K papers, 2M citations (85% related)
Artificial neural network: 207K papers, 4.5M citations (84% related)
Cluster analysis: 146.5K papers, 2.9M citations (83% related)
Performance Metrics
Number of papers in this topic in previous years:

Year  Papers
2023     271
2022     562
2021     640
2020     643
2019     633
2018     528