Accurate compensation in the log-spectral domain for noisy speech recognition

doi:10.1109/TSA.2005.845811

Journal Article

Mohamed Afify

- 18 Apr 2005 -

IEEE Transactions on Speech and Audio Pr...

- Vol. 13, Iss: 3, pp 388-398

TLDR

Experimental results for in-car digit recognition show that the proposed technique significantly outperforms both the baseline and first-order VTS, and the compensation algorithm is found to be more accurate and faster than an approximate numerical integration technique.

Abstract:

This paper presents a new algorithm for noise compensation in the log-spectral domain. We first note that, under a Gaussian mixture assumption, a compensation algorithm in the log-spectral domain is completely defined by three parameters for each Gaussian component: the noisy speech mean, the noisy speech variance, and the covariance of clean and noisy speech. Starting from a well-known mismatch function, we propose two new approximations that allow analytical expressions to be derived for these parameters, and hence develop a new noise compensation algorithm in the log-spectral domain. In addition to the theoretical derivations, we discuss implementation issues of the proposed method and analyze its computational complexity. Experimental results for in-car digit recognition show that the proposed technique significantly outperforms both the baseline and first-order VTS. For example, at a 10 dB signal-to-noise ratio, the baseline, first-order VTS, and the proposed method achieve recognition accuracies of 82.6%, 85.5%, and 90.1%, respectively. The superiority of the proposed method over VTS can be attributed to the accuracy of the employed approximations. The compensation algorithm is also found to be more accurate and faster than an approximate numerical integration technique.
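To make the three per-Gaussian parameters concrete, the following is a minimal sketch, not the paper's algorithm. It assumes the commonly used log-spectral mismatch function y = log(exp(x) + exp(n)) for a single frequency bin, with clean speech x and noise n modeled as independent scalar Gaussians (the abstract does not state which mismatch function is used, so this choice is an assumption). The function names `mismatch`, `monte_carlo_params`, and `vts_first_order_params` are hypothetical. The sketch estimates the noisy speech mean, the noisy speech variance, and the clean/noisy covariance by sampling, and compares them with a first-order VTS linearization.

```python
import math
import random


def mismatch(x, n):
    """Assumed mismatch function: y = log(exp(x) + exp(n)), in a stable form."""
    m = max(x, n)
    return m + math.log(math.exp(x - m) + math.exp(n - m))


def monte_carlo_params(mu_x, var_x, mu_n, var_n, num_samples=100_000, seed=0):
    """Sample-based estimates of (mean of y, variance of y, covariance of x and y)."""
    rng = random.Random(seed)
    sd_x, sd_n = math.sqrt(var_x), math.sqrt(var_n)
    xs = [rng.gauss(mu_x, sd_x) for _ in range(num_samples)]
    # Noise is drawn independently of the clean speech samples.
    ys = [mismatch(x, rng.gauss(mu_n, sd_n)) for x in xs]
    mean_x = sum(xs) / num_samples
    mean_y = sum(ys) / num_samples
    var_y = sum((y - mean_y) ** 2 for y in ys) / num_samples
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / num_samples
    return mean_y, var_y, cov_xy


def vts_first_order_params(mu_x, var_x, mu_n, var_n):
    """First-order VTS: linearize the mismatch function around (mu_x, mu_n)."""
    g = 1.0 / (1.0 + math.exp(mu_n - mu_x))  # dy/dx at the expansion point
    mean_y = mismatch(mu_x, mu_n)
    var_y = g * g * var_x + (1.0 - g) ** 2 * var_n  # dy/dn = 1 - g
    cov_xy = g * var_x
    return mean_y, var_y, cov_xy


if __name__ == "__main__":
    # Illustrative values only: speech and noise at comparable levels.
    args = (1.0, 4.0, 0.0, 1.0)
    print("Monte Carlo :", monte_carlo_params(*args))
    print("1st-order VTS:", vts_first_order_params(*args))
```

When speech and noise levels are comparable and the variances are large, the linearized values can drift from the sampled ones, which is the kind of gap that more accurate approximations aim to close. When the noise lies far below the speech, both reduce to y ≈ x.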

Citations

Normalization of the Speech Modulation Spectra for Robust Speech Recognition

Stereo-Based Stochastic Mapping for Robust Speech Recognition

A Study on the Generalization Capability of Acoustic Models for Robust Speech Recognition

Robust speech features and acoustic models for speech recognition

Related Papers (5)

Extended VTS for Noise-Robust Speech Recognition

ALGONQUIN: Iterating Laplace's Method to Remove Multiple Types of Acoustic Distortion for Robust Speech Recognition

Speech recognition in noisy environments

Speech recognition in noisy environments: a survey

On-line Gaussian mixture modeling in the log-power domain for signal-to-noise ratio estimation and speech enhancement