RASTA processing of speech

doi:10.1109/89.326616

Journal Article

Hynek Hermansky, +1 more

- 01 Oct 1994 -

IEEE Transactions on Speech and Audio Pr...

- Vol. 2, Iss: 4, pp 578-589

TLDR

The theoretical and experimental foundations of the RASTA method are reviewed, the relationship with human auditory perception is discussed, the original method is extended to combinations of additive noise and convolutional noise, and an application is shown to speech enhancement.

Abstract:

Performance of even the best current stochastic recognizers severely degrades in an unexpected communications environment. In some cases, the environmental effect can be modeled by a set of simple transformations and, in particular, by convolution with an environmental impulse response and the addition of some environmental noise. Often, the temporal properties of these environmental effects are quite different from the temporal properties of speech. We have been experimenting with filtering approaches that attempt to exploit these differences to produce robust representations for speech recognition and enhancement and have called this class of representations relative spectra (RASTA). In this paper, we review the theoretical and experimental foundations of the method, discuss the relationship with human auditory perception, and extend the original method to combinations of additive noise and convolutional noise. We discuss the relationship between RASTA features and the nature of the recognition models that are required and the relationship of these features to delta features and to cepstral mean subtraction. Finally, we show an application of the RASTA technique to speech enhancement.
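The idea in the abstract can be illustrated with a short sketch. Convolution with a fixed environmental impulse response becomes, approximately, a constant additive offset in each band of a log spectrum, so a filter with zero gain at DC applied to each band's time trajectory should suppress it. The code below is a minimal illustration under stated assumptions, not the authors' implementation: the function name `rasta_filter`, the array layout (frames by bands), and the synthetic data are hypothetical, and the coefficients are the commonly quoted RASTA band-pass form (numerator 0.1 * [2, 1, 0, -1, -2], single pole near 0.98), which this page does not itself state.

```python
import numpy as np


def rasta_filter(log_spec, pole=0.98):
    """Band-pass filter the time trajectory of every band.

    log_spec: array of shape (frames, bands) holding log-spectral values.
    pole:     assumed pole position; 0.98 is the commonly quoted value.
    """
    log_spec = np.asarray(log_spec, dtype=float)
    # Assumed numerator taps; they sum to zero, so the DC gain is zero.
    numer = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])
    out = np.zeros_like(log_spec)
    for t in range(log_spec.shape[0]):
        acc = np.zeros(log_spec.shape[1])
        # FIR part: weighted sum of the current and four previous frames.
        for k, b in enumerate(numer):
            if t - k >= 0:
                acc += b * log_spec[t - k]
        # IIR part: leaky integration of the previous output frame.
        if t > 0:
            acc += pole * out[t - 1]
        out[t] = acc
    return out


# Synthetic stand-in for log-spectral trajectories (not real speech).
rng = np.random.default_rng(0)
clean = rng.normal(size=(600, 8))
# A fixed channel adds the same per-band offset to every frame.
channel_offset = rng.normal(size=8)
distorted = clean + channel_offset

residual = rasta_filter(distorted) - rasta_filter(clean)
print("largest residual, first 5 frames:", np.abs(residual[:5]).max())
print("largest residual, last 50 frames:", np.abs(residual[-50:]).max())
```

Because the filter is linear, the residual is just the filter's response to the constant offset: it is visible during the start-up transient and then decays at a rate set by the pole. Subtracting each band's mean over the utterance, as in cepstral mean subtraction, would remove the same offset without a transient but needs the whole utterance first.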

Citations

Journal Article

An overview of text-independent speaker recognition: From features to supervectors

Tomi Kinnunen, +1 more

- 01 Jan 2010 -

Speech Communication

TL;DR: This paper starts with the fundamentals of automatic speaker recognition, covering feature extraction and speaker modeling, and then elaborates on advanced computational techniques that address robustness and session variability.

Journal Article

Supervised Speech Separation Based on Deep Learning: An Overview

DeLiang Wang, +1 more

- 01 Oct 2018 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: A comprehensive overview of deep learning-based supervised speech separation can be found in this paper, where three main components of supervised separation are discussed: learning machines, training targets, and acoustic features.

Book

Application of Hidden Markov Models in Speech Recognition

Mark J. F. Gales, +1 more

TL;DR: The aim of this review is first to present the core architecture of a HMM-based LVCSR system and then to describe the various refinements which are needed to achieve state-of-the-art performance.

Journal Article

Multiresolution spectrotemporal analysis of complex sounds

Tai-Shih Chi, +2 more

- 04 Aug 2005 -

Journal of the Acoustical Society of Ame...

TL;DR: A computational model of auditory analysis is described that is inspired by psychoacoustical and neurophysiological findings in early and central stages of the auditory system and provides a unified multiresolution representation of the spectral and temporal features likely critical in the perception of sound.

Journal Article

Audio-visual speech modeling for continuous speech recognition

Stéphane Dupont, +1 more

- 01 Sep 2000 -

IEEE Transactions on Multimedia

TL;DR: A speech recognition system that uses both acoustic and visual speech information to improve recognition performance in noisy environments is described and demonstrated on a large multispeaker database of continuously spoken digits.

References

Journal Article

Suppression of acoustic noise in speech using spectral subtraction

S. Boll

- 01 Apr 1979 -

IEEE Transactions on Acoustics, Speech, ...

TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.

Journal Article

Perceptual linear predictive (PLP) analysis of speech

Hynek Hermansky

- 01 Apr 1990 -

Journal of the Acoustical Society of Ame...

TL;DR: A new technique for the analysis of speech, the perceptual linear predictive (PLP) technique, which uses three concepts from the psychophysics of hearing to derive an estimate of the auditory spectrum, and yields a low-dimensional representation of speech.

Journal Article

Blind deconvolution through digital signal processing

Thomas G. Stockham, +2 more

TL;DR: In this paper, the blind deconvolution problem of two signals when both are unknown is addressed and two related solutions which can be applied through digital signal processing in certain practical cases are discussed.

Journal Article

Differential Intensity Sensitivity of the Ear for Pure Tones

R. R. Riesz

- 01 May 1928 -

Physical Review

TL;DR: In this article, the authors measured the differential sensitivity of the ear as a function of frequency and intensity and found that the ear can distinguish 370 separate tones between the threshold of audition and threshold of feeling at about 1300 c.p.s.

Proceedings Article

RASTA-PLP speech analysis technique

Hynek Hermansky, +3 more

TL;DR: The authors have developed a technique that is more robust to steady-state spectral factors in speech and that is conceptually simple and computationally efficient.

Related Papers (5)

Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences

Suppression of acoustic noise in speech using spectral subtraction

Speaker Verification Using Adapted Gaussian Mixture Models

Fundamentals of speech recognition

A tutorial on hidden Markov models and selected applications in speech recognition