Journal ArticleDOI

Speech recognition using noise-adaptive prototypes

A. Nadas, +2 more
- 01 Oct 1989 - 
- Vol. 37, Iss: 10, pp 1495-1503
Abstract
A probabilistic mixture model is described for a frame (the short-term spectrum) of speech to be used in speech recognition. Each component of the mixture is regarded as a prototype for the labeling phase of a hidden Markov model-based speech recognition system. Since the ambient noise during recognition can differ from that present in the training data, the model is designed for convenient updating in changing noise. Based on the observation that the energy in a frequency band is at any fixed time dominated either by signal energy or by noise energy, the energy is modeled as the larger of the separate energies of signal and noise in the band. Statistical algorithms are given for training this as a hidden variables model. The hidden variables are the prototype identities and the separate signal and noise components. Speech recognition experiments that successfully utilize this model are described.
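
The "larger of signal and noise" idea in the abstract can be illustrated with a short sketch. It is not the authors' implementation: it assumes that each band's signal and noise energies are independent Gaussians (for example in a log-energy domain) and that bands are independent given the prototype, neither of which the abstract states. Under those assumptions the density of y = max(signal, noise) is f_s(y) F_n(y) + F_s(y) f_n(y), and a frame is labeled with the prototype of highest posterior probability. All function and parameter names below are hypothetical.

```python
import math


def normal_pdf(y, mean, std):
    """Gaussian density at y."""
    z = (y - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def normal_cdf(y, mean, std):
    """Gaussian cumulative distribution at y."""
    return 0.5 * (1.0 + math.erf((y - mean) / (std * math.sqrt(2.0))))


def max_density(y, sig_mean, sig_std, noise_mean, noise_std):
    """Density of y = max(signal, noise) for independent Gaussian signal and noise.

    Either the signal equals y while the noise lies below it, or the reverse.
    """
    signal_dominates = normal_pdf(y, sig_mean, sig_std) * normal_cdf(y, noise_mean, noise_std)
    noise_dominates = normal_cdf(y, sig_mean, sig_std) * normal_pdf(y, noise_mean, noise_std)
    return signal_dominates + noise_dominates


def frame_log_likelihood(frame, prototype, noise):
    """Log-likelihood of one frame under one prototype (bands assumed independent)."""
    proto_means, proto_stds = prototype
    noise_means, noise_stds = noise
    total = 0.0
    for y, m, s, nm, ns in zip(frame, proto_means, proto_stds, noise_means, noise_stds):
        # Floor the density so log() never receives an underflowed zero.
        total += math.log(max(max_density(y, m, s, nm, ns), 1e-300))
    return total


def prototype_posteriors(frame, weights, prototypes, noise):
    """Posterior probability of each prototype (mixture component) given the frame."""
    log_scores = [
        math.log(w) + frame_log_likelihood(frame, proto, noise)
        for w, proto in zip(weights, prototypes)
    ]
    top = max(log_scores)  # subtract the maximum for numerical stability
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def label_frame(frame, weights, prototypes, noise):
    """Index of the most probable prototype for this frame."""
    posteriors = prototype_posteriors(frame, weights, prototypes, noise)
    return max(range(len(posteriors)), key=posteriors.__getitem__)


# Two hypothetical prototypes over two frequency bands: (means, standard deviations).
prototypes = [([10.0, 2.0], [1.0, 1.0]), ([2.0, 10.0], [1.0, 1.0])]
weights = [0.5, 0.5]
quiet_noise = ([0.0, 0.0], [1.0, 1.0])
frame = [9.5, 2.2]
label = label_frame(frame, weights, prototypes, quiet_noise)
```

Because the noise parameters enter only through the `noise` argument, adapting to a new environment in this sketch means re-estimating the noise means and deviations while leaving the speech prototypes untouched, which mirrors the "convenient updating in changing noise" that the abstract describes.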


Citations
Journal ArticleDOI

Speech recognition in noisy environments: a survey

TL;DR: The survey indicates that the essential points in noisy speech recognition consist of incorporating time and frequency correlations, giving more importance to high SNR portions of speech in decision making, exploiting task-specific a priori knowledge both of speech and of noise, using class-dependent processing, and including auditory models in speech processing.
Proceedings ArticleDOI

Hidden Markov model decomposition of speech and noise

TL;DR: A technique of signal decomposition using hidden Markov models is described that provides an optimal method of decomposing simultaneous processes and has wide implications for signal separation in general and improved speech modeling in particular.
Journal ArticleDOI

Convolutional Recurrent Neural Networks for Polyphonic Sound Event Detection

TL;DR: A convolutional recurrent neural network (CRNN) is proposed for the polyphonic sound event detection task and compared with CNN, RNN, and other established methods; a considerable improvement is observed on four different datasets consisting of everyday sound events.
Journal ArticleDOI

A maximum-likelihood approach to stochastic matching for robust speech recognition

TL;DR: A maximum-likelihood (ML) stochastic matching approach is presented that decreases the acoustic mismatch between a test utterance and a given set of speech models, reducing the recognition performance degradation caused by distortions in the test utterance and/or the model set.
Journal ArticleDOI

Statistical-model-based speech enhancement systems

TL;DR: A unified statistical approach for the three basic problems of speech enhancement is developed, using composite source models for the signal and noise and a fairly large set of distortion measures.
References
Journal ArticleDOI

Mixture densities, maximum likelihood, and the EM algorithm

Richard A. Redner, +1 more
- 01 Apr 1984 - 
TL;DR: This work discusses the formulation and theoretical and practical properties of the EM algorithm, a specialization to the mixture density context of a general algorithm used to approximate maximum-likelihood estimates for incomplete data problems.
Journal ArticleDOI

Digital filtering using logarithmic arithmetic

TL;DR: A method of computation is described in which all signals are encoded logarithmically, giving a great improvement in dynamic range compared with fixed-point linearly encoded arithmetic.
Journal ArticleDOI

Noise adaptation in a hidden Markov model speech recognition system

TL;DR: Several ways of making the signal processing in an isolated-word speech recognition system more robust against large variations in the background noise level are presented.
Journal ArticleDOI

Application of an auditory model to speech recognition

TL;DR: A new process includes adaptation, loudness scaling, and mel warping in a front end for the IBM speech-recognition system and tests show that the design is an improvement over previous algorithms.