Speech recognition using noise-adaptive prototypes

doi:10.1109/29.35387

Journal Article

A. Nadas, +2 more

- 01 Oct 1989 -

IEEE Transactions on Acoustics, Speech, ...

- Vol. 37, Iss: 10, pp 1495-1503

TLDR

A probabilistic mixture model is described for a frame (the short-term spectrum) of speech, for use in speech recognition; each mixture component is regarded as a prototype for the labeling phase of a hidden Markov model based speech recognition system.

Abstract:

A probabilistic mixture model is described for a frame (the short-term spectrum) of speech to be used in speech recognition. Each component of the mixture is regarded as a prototype for the labeling phase of a hidden Markov model based speech recognition system. Since the ambient noise during recognition can differ from that present in the training data, the model is designed for convenient updating in changing noise. Based on the observation that the energy in a frequency band is at any fixed time dominated either by signal energy or by noise energy, the energy is modeled as the larger of the separate energies of signal and noise in the band. Statistical algorithms are given for training this as a hidden variables model. The hidden variables are the prototype identities and the separate signal and noise components. Speech recognition experiments that successfully utilize this model are described.
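
The central modeling step in the abstract, treating the observed band energy as the larger of the signal and noise energies, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it covers a single frequency band and a single prototype, it assumes the speech and noise log-energies are independent Gaussians, and every function name, parameter name, and numeric value is hypothetical. Under independence, the density of y = max(s, n) is f_s(y) F_n(y) + F_s(y) f_n(y), where f and F denote the density and cumulative distribution of each component.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a Gaussian with the given mean and standard deviation."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(x, mean, std):
    """Cumulative distribution of a Gaussian, computed with math.erf."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def max_model_pdf(y, speech_mean, speech_std, noise_mean, noise_std):
    """Density of y = max(s, n) for independent Gaussian s and n (assumed)."""
    # Term 1: speech equals y and noise lies below y.
    speech_term = normal_pdf(y, speech_mean, speech_std) * normal_cdf(y, noise_mean, noise_std)
    # Term 2: noise equals y and speech lies below y.
    noise_term = normal_cdf(y, speech_mean, speech_std) * normal_pdf(y, noise_mean, noise_std)
    return speech_term + noise_term


def speech_dominates_posterior(y, speech_mean, speech_std, noise_mean, noise_std):
    """Posterior probability that the observed value y came from the speech component."""
    speech_term = normal_pdf(y, speech_mean, speech_std) * normal_cdf(y, noise_mean, noise_std)
    return speech_term / max_model_pdf(y, speech_mean, speech_std, noise_mean, noise_std)


if __name__ == "__main__":
    # Hypothetical speech prototype for one band, kept fixed after training.
    speech_mean, speech_std = 60.0, 5.0
    observed = 55.0
    # Only the noise parameters change when the ambient noise changes.
    for noise_mean in (30.0, 50.0):
        density = max_model_pdf(observed, speech_mean, speech_std, noise_mean, 3.0)
        posterior = speech_dominates_posterior(observed, speech_mean, speech_std, noise_mean, 3.0)
        print(f"noise mean {noise_mean}: density {density:.5f}, P(speech dominates) {posterior:.3f}")
```

In this sketch the speech prototype parameters stay fixed while only the noise parameters are replaced, which is one way to read the abstract's remark that the model is designed for convenient updating in changing noise. The posterior function corresponds to one of the hidden-variable questions the abstract mentions, namely whether signal or noise produced the observed value.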

Citations

Journal Article

Speech recognition in noisy environments: a survey

Yifan Gong

- 01 Apr 1995 -

Speech Communication

TL;DR: The survey indicates that the essential points in noisy speech recognition consist of incorporating time and frequency correlations, giving more importance to high SNR portions of speech in decision making, exploiting task-specific a priori knowledge both of speech and of noise, using class-dependent processing, and including auditory models in speech processing.

Proceedings Article

Hidden Markov model decomposition of speech and noise

Andrew Varga, +1 more

TL;DR: A technique of signal decomposition using hidden Markov models is described that provides an optimal method of decomposing simultaneous processes and has wide implications for signal separation in general and improved speech modeling in particular.

Journal Article

Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

Emre Cakir, +4 more

- 01 Jun 2017 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: A convolutional recurrent neural network (CRNN) is proposed for the polyphonic sound event detection task and compared with CNN, RNN, and other established methods; a considerable improvement is observed on four different datasets consisting of everyday sound events.

Journal Article

A maximum-likelihood approach to stochastic matching for robust speech recognition

Ananth Sankar, +1 more

- 01 May 1996 -

IEEE Transactions on Speech and Audio Pr...

TL;DR: A maximum-likelihood (ML) stochastic matching approach is presented to decrease the acoustic mismatch between a test utterance and a given set of speech models, so as to reduce the recognition performance degradation caused by distortions in the test utterance and/or the model set.

Journal Article

Statistical-model-based speech enhancement systems

Yariv Ephraim

TL;DR: A unified statistical approach for the three basic problems of speech enhancement is developed, using composite source models for the signal and noise and a fairly large set of distortion measures.

References

Journal Article

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977 -

Journal of the Royal Statistical Society...

Journal Article

Mixture densities, maximum likelihood, and the EM algorithm

Richard A. Redner, +1 more

- 01 Apr 1984 -

SIAM Review

TL;DR: This work discusses the formulation and theoretical and practical properties of the EM algorithm, a specialization to the mixture density context of a general algorithm used to approximate maximum-likelihood estimates for incomplete data problems.

Journal Article

Digital filtering using logarithmic arithmetic

Nick Kingsbury, +1 more

- 28 Jan 1971 -

Electronics Letters

TL;DR: A method of computation is described in which all signals are encoded logarithmically, giving a great improvement in dynamic range compared with fixed-point linearly encoded arithmetic.

Journal Article

Noise adaptation in a hidden Markov model speech recognition system

Dirk Van Compernolle

- 01 Apr 1989 -

Computer Speech & Language

TL;DR: Several ways are presented of making the signal processing in an isolated-word speech recognition system more robust against large variations in the background noise level.

Journal Article

Application of an auditory model to speech recognition

Jordan Cohen

- 01 Jun 1989 -

Journal of the Acoustical Society of Ame...

TL;DR: A new process includes adaptation, loudness scaling, and mel warping in a front end for the IBM speech-recognition system and tests show that the design is an improvement over previous algorithms.

Related Papers (5)

Hidden Markov model decomposition of speech and noise

Maximum likelihood from incomplete data via the EM algorithm

Suppression of acoustic noise in speech using spectral subtraction

Factorial Hidden Markov Models

An audio-visual corpus for speech perception and automatic speech recognition