Journal Article

Suppression of acoustic noise in speech using spectral subtraction

S. Boll
01 Apr 1979-IEEE Transactions on Acoustics, Speech, and Signal Processing (IEEE)-Vol. 27, Iss: 2, pp 113-120
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital waveform. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
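The core step in the abstract, subtracting a noise bias estimated during nonspeech activity and then resynthesizing a waveform, can be sketched for a single frame as follows. This is a minimal illustration, not the paper's implementation: working on magnitudes, clamping negative differences to zero, and reusing the noisy phase are assumptions here, the function names are hypothetical, and the secondary residual-noise procedures are omitted. A naive DFT keeps the sketch dependency-free.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) discrete Fourier transform of one frame."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); returns the real part of each time sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


def average_noise_magnitude(noise_frames):
    """Average magnitude spectrum over frames assumed to contain no speech."""
    spectra = [[abs(c) for c in dft(f)] for f in noise_frames]
    n = len(spectra[0])
    return [sum(s[k] for s in spectra) / len(spectra) for k in range(n)]


def spectral_subtract(frame, noise_mag):
    """Subtract the noise bias per bin, keep the noisy phase, resynthesize."""
    cleaned = []
    for c, mu in zip(dft(frame), noise_mag):
        mag = max(abs(c) - mu, 0.0)  # assumption: negative results are set to zero
        cleaned.append(cmath.rect(mag, cmath.phase(c)))
    return idft(cleaned)


# Toy example: a "speech" tone in bin 1 plus stationary "noise" in bin 3.
n = 16
speech = [math.cos(2 * math.pi * 1 * t / n) for t in range(n)]
noise = [0.5 * math.cos(2 * math.pi * 3 * t / n) for t in range(n)]
noisy = [s + v for s, v in zip(speech, noise)]

noise_mag = average_noise_magnitude([noise])  # estimated from a noise-only frame
cleaned = spectral_subtract(noisy, noise_mag)
```

In this toy case the noise occupies bins that the tone does not, so the cleaned frame matches the clean tone almost exactly. Real noise overlaps speech in frequency and varies from frame to frame, which is why residual noise remains after subtraction.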
Citations
Journal Article
TL;DR: In this article, a system which utilizes a minimum mean square error (MMSE) estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm.
Abstract: This paper focuses on the class of speech enhancement systems which capitalize on the major importance of the short-time spectral amplitude (STSA) of the speech signal in its perception. A system which utilizes a minimum mean-square error (MMSE) STSA estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm. In this paper we derive the MMSE STSA estimator, based on modeling speech and noise spectral components as statistically independent Gaussian random variables. We analyze the performance of the proposed STSA estimator and compare it with a STSA estimator derived from the Wiener estimator. We also examine the MMSE STSA estimator under uncertainty of signal presence in the noisy observations. In constructing the enhanced signal, the MMSE STSA estimator is combined with the complex exponential of the noisy phase. It is shown here that the latter is the MMSE estimator of the complex exponential of the original phase, which does not affect the STSA estimation. The proposed approach results in a significant reduction of the noise, and provides enhanced speech with colorless residual noise. The complexity of the proposed algorithm is approximately that of other systems in the discussed class.

3,905 citations

09 Mar 2012
TL;DR: Artificial neural networks (ANNs) constitute a class of flexible nonlinear models designed to mimic biological neural systems; this entry introduces them using familiar econometric terminology and gives an overview of the modeling approach and its implementation methods.
Abstract: Artificial neural networks (ANNs) constitute a class of flexible nonlinear models designed to mimic biological neural systems. In this entry, we introduce ANNs using familiar econometric terminology and provide an overview of the ANN modeling approach and its implementation methods.
Correspondence: Chung-Ming Kuan, Institute of Economics, Academia Sinica, 128 Academia Road, Sec. 2, Taipei 115, Taiwan; ckuan@econ.sinica.edu.tw.

2,069 citations

Journal Article
TL;DR: The theoretical and experimental foundations of the RASTA method are reviewed, the relationship with human auditory perception is discussed, the original method is extended to combinations of additive noise and convolutional noise, and an application is shown to speech enhancement.
Abstract: Performance of even the best current stochastic recognizers severely degrades in an unexpected communications environment. In some cases, the environmental effect can be modeled by a set of simple transformations and, in particular, by convolution with an environmental impulse response and the addition of some environmental noise. Often, the temporal properties of these environmental effects are quite different from the temporal properties of speech. We have been experimenting with filtering approaches that attempt to exploit these differences to produce robust representations for speech recognition and enhancement and have called this class of representations relative spectra (RASTA). In this paper, we review the theoretical and experimental foundations of the method, discuss the relationship with human auditory perception, and extend the original method to combinations of additive noise and convolutional noise. We discuss the relationship between RASTA features and the nature of the recognition models that are required and the relationship of these features to delta features and to cepstral mean subtraction. Finally, we show an application of the RASTA technique to speech enhancement.

2,002 citations

Proceedings Article
02 Apr 1979
TL;DR: This paper describes a method for enhancing speech corrupted by broadband noise based on the spectral noise subtraction method, which can automatically adapt to a wide range of signal-to-noise ratios, as long as a reasonable estimate of the noise spectrum can be obtained.
Abstract: This paper describes a method for enhancing speech corrupted by broadband noise. The method is based on the spectral noise subtraction method. The original method entails subtracting an estimate of the noise power spectrum from the speech power spectrum, setting negative differences to zero, recombining the new power spectrum with the original phase, and then reconstructing the time waveform. While this method reduces the broadband noise, it also usually introduces an annoying "musical noise". We have devised a method that eliminates this "musical noise" while further reducing the background noise. The method consists in subtracting an overestimate of the noise power spectrum, and preventing the resultant spectral components from going below a preset minimum level (spectral floor). The method can automatically adapt to a wide range of signal-to-noise ratios, as long as a reasonable estimate of the noise spectrum can be obtained. Extensive listening tests were performed to determine the quality and intelligibility of speech enhanced by our method. Listeners unanimously preferred the quality of the processed speech. Also, for an input signal-to-noise ratio of 5 dB, there was no loss of intelligibility associated with the enhancement technique.
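The modification described in this abstract, subtracting an overestimate of the noise power spectrum and then holding each component above a spectral floor, can be sketched per frequency bin as below. The parameter names `alpha` and `beta`, their default values, and the choice to express the floor as a fraction of the noise power are assumptions for illustration; the abstract only says the minimum level is preset.

```python
def oversubtract_with_floor(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Power spectral subtraction with over-subtraction and a spectral floor.

    alpha (> 1 overestimates the noise) and beta (floor as a fraction of the
    noise power) are hypothetical parameters, not values from the source.
    """
    cleaned = []
    for p, n in zip(noisy_power, noise_power):
        diff = p - alpha * n   # subtract an overestimate of the noise power
        floor = beta * n       # minimum level allowed for this bin
        cleaned.append(max(diff, floor))
    return cleaned


# Three bins with unit noise power: strong speech, noise only, weak speech.
example = oversubtract_with_floor([10.0, 1.0, 4.0], [1.0, 1.0, 1.0], alpha=2.0, beta=0.1)
# example == [8.0, 0.1, 2.0]
```

Setting `alpha=1.0` and `beta=0.0` recovers the original method as the abstract describes it, where negative differences are set to zero.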

1,352 citations

Journal Article
Alan R. Jones

1,349 citations

References
Journal Article
24 Mar 1975
TL;DR: It is shown that in treating periodic interference the adaptive noise canceller acts as a notch filter with narrow bandwidth, infinite null, and the capability of tracking the exact frequency of the interference; in this case the canceller behaves as a linear, time-invariant system, with the adaptive filter converging on a dynamic rather than a static solution.
Abstract: This paper describes the concept of adaptive noise cancelling, an alternative method of estimating signals corrupted by additive noise or interference. The method uses a "primary" input containing the corrupted signal and a "reference" input containing noise correlated in some unknown way with the primary noise. The reference input is adaptively filtered and subtracted from the primary input to obtain the signal estimate. Adaptive filtering before subtraction allows the treatment of inputs that are deterministic or stochastic, stationary or time variable. Wiener solutions are developed to describe asymptotic adaptive performance and output signal-to-noise ratio for stationary stochastic inputs, including single and multiple reference inputs. These solutions show that when the reference input is free of signal and certain other conditions are met, noise in the primary input can be essentially eliminated without signal distortion. It is further shown that in treating periodic interference the adaptive noise canceller acts as a notch filter with narrow bandwidth, infinite null, and the capability of tracking the exact frequency of the interference; in this case the canceller behaves as a linear, time-invariant system, with the adaptive filter converging on a dynamic rather than a static solution. Experimental results are presented that illustrate the usefulness of the adaptive noise cancelling technique in a variety of practical applications. These applications include the cancelling of various forms of periodic interference in electrocardiography, the cancelling of periodic interference in speech signals, and the cancelling of broad-band interference in the side-lobes of an antenna array. In further experiments it is shown that a sine wave and Gaussian noise can be separated by using a reference input that is a delayed version of the primary input. Suggested applications include the elimination of tape hum or turntable rumble during the playback of recorded broad-band signals and the automatic detection of very-low-level periodic signals masked by broad-band noise.
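The primary/reference structure described above can be sketched with a short adaptive FIR filter. The abstract does not name an adaptation rule, so the least-mean-squares (LMS) update used here is an assumption, as are the function name, tap count, and step size.

```python
import math


def lms_noise_canceller(primary, reference, num_taps=4, step=0.05):
    """Adaptive noise canceller; returns the sequence of signal estimates.

    Assumption: weights are adapted with an LMS update. num_taps and step
    are illustrative values, not taken from the source.
    """
    weights = [0.0] * num_taps
    history = [0.0] * num_taps  # recent reference samples, newest first
    estimates = []
    for d, x in zip(primary, reference):
        history = [x] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        e = d - noise_estimate  # primary minus adaptively filtered reference
        weights = [w + 2.0 * step * e * h for w, h in zip(weights, history)]
        estimates.append(e)
    return estimates


# Toy case with no signal: the primary input is only a scaled, one-sample
# delayed copy of the periodic reference, so the output should decay to zero.
reference = [math.sin(0.3 * t) for t in range(2000)]
primary = [0.0] + [0.8 * r for r in reference[:-1]]
residual = lms_noise_canceller(primary, reference)
```

Because the filter output is subtracted from the primary input, the same error sequence both drives the weight update and serves as the signal estimate. When a signal is present and uncorrelated with the reference, it is what remains in that sequence after the correlated noise is cancelled.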

4,165 citations

Journal Article
Alan R. Jones

1,349 citations

Journal Article
Jont B. Allen
TL;DR: A theory of short-term spectral analysis, synthesis, and modification is presented, with attention to certain practical and theoretical questions; the methods are useful in designing filter banks whose outputs are used for synthesis after multiplicative modifications are made to the spectrum.
Abstract: A theory of short term spectral analysis, synthesis, and modification is presented with an attempt at pointing out certain practical and theoretical questions. The methods discussed here are useful in designing filter banks when the filter bank outputs are to be used for synthesis after multiplicative modifications are made to the spectrum.

899 citations

Journal Article
TL;DR: This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise and develops a procedure based on maximum a posteriori (MAP) estimation techniques which is related to linear prediction analysis of speech.
Abstract: This paper considers the estimation of speech parameters in an all-pole model when the speech has been degraded by additive background noise. The procedure, based on maximum a posteriori (MAP) estimation techniques, is first developed in the absence of noise and related to linear prediction analysis of speech. The modification in the presence of background noise is shown to be nonlinear. Two suboptimal procedures are suggested which have linear iterative implementations. A preliminary illustration and discussion based both on a synthetic example and real speech data are given.

590 citations

Journal Article
Linear Prediction of Speech (book)

517 citations