Heterogeneous measurements and multiple classifiers for speech recognition.

Open Access Proceedings Article

TLDR

In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates that are the lowest the authors have seen reported on these tasks.

Abstract:

This paper addresses the problem of acoustic-phonetic modeling. First, heterogeneous acoustic measurements are chosen in order to maximize the acoustic-phonetic information extracted from the speech signal in preprocessing. Second, classifier systems are presented for successfully utilizing high-dimensional acoustic measurement spaces. The techniques used for achieving these two goals can be broadly categorized as hierarchical, committee-based, or a hybrid of these two. This paper presents committee-based and hybrid approaches. In context-independent classification and context-dependent recognition on the TIMIT core test set using 39 classes, the system achieved error rates of 18.3% and 24.4%, respectively. These error rates are the lowest we have seen reported on these tasks. In addition, experiments with a telephone-based weather information word recognition task led to word error rate reductions of 10–16%.
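
The abstract does not say how the members of a committee are combined, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes each classifier was trained on a different acoustic measurement set and outputs per-token class posteriors. The two combination rules shown (majority voting, and a product of posteriors under an independence assumption) and all names (`combine_by_voting`, `combine_by_product`, `POSTERIORS`) are illustrative assumptions, and the numbers are invented toy values.

```python
import math
from collections import Counter

# Hypothetical toy data: POSTERIORS[c][t][k] is the posterior that
# classifier c assigns to class k for token t (3 classifiers, 2 tokens, 3 classes).
POSTERIORS = [
    [[0.6, 0.3, 0.1], [0.2, 0.5, 0.3]],
    [[0.5, 0.4, 0.1], [0.1, 0.3, 0.6]],
    [[0.2, 0.7, 0.1], [0.3, 0.4, 0.3]],
]

def combine_by_voting(posteriors):
    """Each classifier votes for its top class; ties go to the lowest class index."""
    decisions = []
    for t in range(len(posteriors[0])):
        votes = [max(range(len(clf[t])), key=clf[t].__getitem__) for clf in posteriors]
        counts = Counter(votes)
        best = max(counts.values())
        decisions.append(min(k for k, v in counts.items() if v == best))
    return decisions

def combine_by_product(posteriors, eps=1e-12):
    """Sum log-posteriors across classifiers (product rule, assumes independence)."""
    n_classes = len(posteriors[0][0])
    decisions = []
    for t in range(len(posteriors[0])):
        scores = [
            sum(math.log(clf[t][k] + eps) for clf in posteriors)
            for k in range(n_classes)
        ]
        decisions.append(max(range(n_classes), key=scores.__getitem__))
    return decisions

voting_decisions = combine_by_voting(POSTERIORS)
product_decisions = combine_by_product(POSTERIORS)
print(voting_decisions, product_decisions)
```

On this toy input the two rules disagree on the first token: two of three classifiers vote for class 0, but the third classifier is confident enough in class 1 that the product rule selects class 1. This illustrates why the choice of combination rule matters in a committee.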

Citations

Journal Article

Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups

Geoffrey E. Hinton, +10 more

- 18 Oct 2012 -

IEEE Signal Processing Magazine

TL;DR: This article provides an overview of progress and represents the shared views of four research groups that have had recent successes in using DNNs for acoustic modeling in speech recognition.

Journal Article

Deep Neural Networks for Acoustic Modeling in Speech Recognition

Geoffrey E. Hinton, +10 more

- 01 Nov 2012 -

IEEE Signal Processing Magazine

TL;DR: This paper provides an overview of this progress and represents the shared views of four research groups who have had recent successes in using deep neural networks for acoustic modeling in speech recognition.

Journal Article

Acoustic Modeling Using Deep Belief Networks

Abdelrahman Mohamed, +2 more

- 01 Jan 2012 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: It is shown that better phone recognition on the TIMIT dataset can be achieved by replacing Gaussian mixture models by deep neural networks that contain many layers of features and a very large number of parameters.

Journal Article

JUPITER: a telephone-based conversational interface for weather information

Victor W. Zue, +6 more

- 01 Jan 2000 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: The purpose of this paper is to describe the development effort of JUPITER in terms of the underlying human language technologies as well as other system-related issues such as utterance rejection and content harvesting.

Journal Article

Discriminative Training for Large-Vocabulary Speech Recognition Using Minimum Classification Error

Erik McDermott, +4 more

- 01 Jan 2007 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This article reports significant gains in recognition performance and model compactness as a result of discriminative training based on MCE training applied to HMMs, in the context of three challenging large-vocabulary speech recognition tasks.

References

Journal Article

Speaker-independent phone recognition using hidden Markov models

Kai-Fu Lee, +1 more

- 01 Nov 1989 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: The authors introduce the co-occurrence smoothing algorithm, which enables accurate recognition even with very limited training data, and report results that can be used as benchmarks to evaluate future systems.

Journal Article

Speech recognition by machines and humans

Richard P. Lippmann

- 01 Jul 1997 -

Speech Communication

TL;DR: Comparisons suggest that the human-machine performance gap can be reduced by basic research on improving low-level acoustic-phonetic modeling, on improving robustness with noise and channel variability, and on more accurately modeling spontaneous speech.

Proceedings Article

A probabilistic framework for feature-based speech recognition

James Glass, +2 more

TL;DR: This paper examines a maximum a-posteriori decoding strategy for feature-based recognizers and develops a normalization criterion that is useful for a segment-based Viterbi or A* search.

Proceedings Article

High performance speaker-independent phone recognition using CDHMM.

Lori Lamel, +1 more

TL;DR: It is shown that it is worthwhile to perform phone recognition experiments as opposed to only focusing attention on word recognition results, and high phone accuracies on three corpora: WSJ0, BREF and TIMIT are reported.

Proceedings Article

Improved phone recognition using Bayesian triphone models

Ji Ming, +1 more

TL;DR: A new statistical framework, derived from the Bayesian principle, is introduced to perform a triphone model from less context-dependent models, and the potential power of this new framework is explored, both algorithmically and experimentally, by an implementation with hidden Markov modeling techniques.
