Author

S. Boll

Bio: S. Boll is an academic researcher from the University of Utah. The author has contributed to research in topics: Noise & Noise measurement. The author has an h-index of 4, co-authored 4 publications receiving 4721 citations.

Papers
Journal ArticleDOI
S. Boll
TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
Abstract: A stand-alone noise suppression algorithm is presented for reducing the spectral effects of acoustically added noise in speech. Effective performance of digital speech processors operating in practical environments may require suppression of noise from the digital waveform. Spectral subtraction offers a computationally efficient, processor-independent approach to effective digital speech analysis. The method, requiring about the same computation as high-speed convolution, suppresses stationary noise from speech by subtracting the spectral noise bias calculated during nonspeech activity. Secondary procedures are then applied to attenuate the residual noise left after subtraction. Since the algorithm resynthesizes a speech waveform, it can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.

4,862 citations
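
The core step described in the abstract above (estimate a spectral noise bias during nonspeech activity, subtract it, and resynthesize a waveform) can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the function name `spectral_subtract`, the 256-sample frame, the Hann window with 50% overlap, and the reuse of the noisy phase are all assumptions made for the example, and the secondary residual-noise procedures are omitted.

```python
import numpy as np

def spectral_subtract(noisy, noise_only, frame_len=256):
    """Magnitude spectral subtraction with overlap-add resynthesis (sketch).

    `noise_only` is assumed to be a stretch of signal containing no speech.
    """
    hop = frame_len // 2
    # Periodic Hann window: copies overlapped by 50% sum to exactly 1.
    window = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(frame_len) / frame_len)

    def frames(x):
        n = 1 + (len(x) - frame_len) // hop
        return np.stack([x[i * hop:i * hop + frame_len] * window for i in range(n)])

    # Spectral noise bias: average magnitude over the nonspeech frames.
    noise_bias = np.abs(np.fft.rfft(frames(noise_only), axis=1)).mean(axis=0)

    spectra = np.fft.rfft(frames(noisy), axis=1)
    # Subtract the bias and half-wave rectify (negative magnitudes become zero).
    magnitude = np.maximum(np.abs(spectra) - noise_bias, 0.0)
    # Rebuild each frame from the reduced magnitude and the noisy phase.
    cleaned = np.fft.irfft(magnitude * np.exp(1j * np.angle(spectra)),
                           n=frame_len, axis=1)

    # Overlap-add the frames back into a waveform.
    out = np.zeros(len(noisy))
    for i, frame in enumerate(cleaned):
        out[i * hop:i * hop + frame_len] += frame
    return out
```

With a zero noise estimate the function returns its input unchanged away from the first and last half frame, which is a convenient sanity check before trying it on real recordings.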

Journal ArticleDOI
TL;DR: Two approaches to adaptive noise cancellation, the LMS and lattice gradient algorithms, are compared; both reduce ambient noise power by at least 20 dB with minimal speech distortion and are thus potentially powerful noise suppression preprocessors for voice communication in severe noise environments.
Abstract: Acoustic noise with energy greater than or equal to the speech can be suppressed by adaptively filtering a separately recorded correlated version of the noise signal and subtracting it from the speech waveform. It is shown that for this application of adaptive noise cancellation, large filter lengths are required to account for a highly reverberant recording environment and that there is a direct relation between filter misadjustment and induced echo in the output speech. The second reference noise signal is adaptively filtered using the least mean squares (LMS) and the lattice gradient algorithms. These two approaches are compared in terms of degree of noise power reduction, algorithm convergence time, and degree of speech enhancement. Both methods were shown to reduce ambient noise power by at least 20 dB with minimal speech distortion and thus to be potentially powerful as noise suppression preprocessors for voice communication in severe noise environments.

151 citations
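
The LMS variant of the two-microphone scheme in the abstract above can be sketched as below: a reference noise recording is adaptively filtered and subtracted from the primary (speech plus noise) channel. The function name `lms_noise_canceller`, the tap count, and the step size `mu` are illustrative assumptions; the abstract notes that reverberant rooms call for much longer filters than the eight taps used here, and the lattice gradient algorithm is not shown.

```python
import numpy as np

def lms_noise_canceller(primary, reference, num_taps=8, mu=0.01):
    """Adaptive noise cancellation with a plain LMS filter (sketch).

    primary   : speech plus noise picked up at the main microphone
    reference : correlated noise recorded separately, ideally speech-free
    Returns the cancelled output and the final filter weights.
    """
    weights = np.zeros(num_taps)
    output = np.zeros(len(primary))
    for n in range(len(primary)):
        # Most recent reference samples, newest first, zero-padded at the start.
        recent = reference[max(0, n - num_taps + 1):n + 1][::-1]
        x = np.zeros(num_taps)
        x[:len(recent)] = recent
        # Filter the reference to estimate the noise in the primary channel.
        noise_estimate = weights @ x
        # The error signal is also the enhanced speech output.
        output[n] = primary[n] - noise_estimate
        # LMS update: step along the instantaneous gradient estimate.
        weights += mu * output[n] * x
    return output, weights
```

The step size trades convergence time against misadjustment; a value that is too large for the reference power and filter length makes the update diverge.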

Journal ArticleDOI
TL;DR: A parameterized family of constant-Q analysis-synthesis transform pairs is developed from a property of homogeneous functions that allows for a wide choice of selections for center frequencies, bandwidths, and filter shapes.
Abstract: The formal derivation of a transformation which models the frequency selective properties (critical bandwidths) of the auditory system is developed. A parameterized family of constant-Q analysis-synthesis transform pairs is developed from a property of homogeneous functions. This formulation allows for a wide choice of selections for center frequencies, bandwidths, and filter shapes. A particular member of the transform family is implemented to model the frequency selective properties of the peripheral auditory system. With this transform, short-time spectral analysis using critical band filter shapes can be implemented. In the absence of spectral modification, the analysis-synthesis transform can be made arbitrarily close to an identity system. This new approach to analysis-synthesis provides the necessary mathematical support needed to design and optimize both constant-Q and critical band analysis-synthesis transforms.

15 citations

Proceedings ArticleDOI
01 Apr 1981
TL;DR: A parameterized family of analysis-synthesis transform pairs which behave as identities in the absence of perceptual modification is developed from a property of homogeneous functions to facilitate a flexible choice of analysis frequencies and frequency selective response characteristics.
Abstract: The formal derivation of an integral transformation which can simulate certain frequency selective (critical bandwidth) properties of the auditory system is given. A parameterized family of analysis-synthesis transform pairs which behave as identities in the absence of perceptual modification is developed from a property of homogeneous functions. The formulation facilitates a flexible choice of analysis frequencies and frequency selective response characteristics. A particular member of the transform family is then implemented to simulate frequency selective properties of the peripheral auditory system.

9 citations


Cited by
Journal ArticleDOI
TL;DR: In this article, a system which utilizes a minimum mean square error (MMSE) estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm.
Abstract: This paper focuses on the class of speech enhancement systems which capitalize on the major importance of the short-time spectral amplitude (STSA) of the speech signal in its perception. A system which utilizes a minimum mean-square error (MMSE) STSA estimator is proposed and then compared with other widely used systems which are based on Wiener filtering and the "spectral subtraction" algorithm. In this paper we derive the MMSE STSA estimator, based on modeling speech and noise spectral components as statistically independent Gaussian random variables. We analyze the performance of the proposed STSA estimator and compare it with a STSA estimator derived from the Wiener estimator. We also examine the MMSE STSA estimator under uncertainty of signal presence in the noisy observations. In constructing the enhanced signal, the MMSE STSA estimator is combined with the complex exponential of the noisy phase. It is shown here that the latter is the MMSE estimator of the complex exponential of the original phase, which does not affect the STSA estimation. The proposed approach results in a significant reduction of the noise, and provides enhanced speech with colorless residual noise. The complexity of the proposed algorithm is approximately that of other systems in the discussed class.

3,905 citations

09 Mar 2012
TL;DR: Artificial neural networks (ANNs), a class of flexible nonlinear models designed to mimic biological neural systems, are introduced using familiar econometric terminology, with an overview of the ANN modeling approach and its implementation methods.
Abstract: Artificial neural networks (ANNs) constitute a class of flexible nonlinear models designed to mimic biological neural systems. In this entry, we introduce ANN using familiar econometric terminology and provide an overview of ANN modeling approach and its implementation methods.
Correspondence: Chung-Ming Kuan, Institute of Economics, Academia Sinica, 128 Academia Road, Sec. 2, Taipei 115, Taiwan; ckuan@econ.sinica.edu.tw.

2,069 citations

Journal ArticleDOI
TL;DR: The theoretical and experimental foundations of the RASTA method are reviewed, the relationship with human auditory perception is discussed, the original method is extended to combinations of additive noise and convolutional noise, and an application is shown to speech enhancement.
Abstract: Performance of even the best current stochastic recognizers severely degrades in an unexpected communications environment. In some cases, the environmental effect can be modeled by a set of simple transformations and, in particular, by convolution with an environmental impulse response and the addition of some environmental noise. Often, the temporal properties of these environmental effects are quite different from the temporal properties of speech. We have been experimenting with filtering approaches that attempt to exploit these differences to produce robust representations for speech recognition and enhancement and have called this class of representations relative spectra (RASTA). In this paper, we review the theoretical and experimental foundations of the method, discuss the relationship with human auditory perception, and extend the original method to combinations of additive noise and convolutional noise. We discuss the relationship between RASTA features and the nature of the recognition models that are required and the relationship of these features to delta features and to cepstral mean subtraction. Finally, we show an application of the RASTA technique to speech enhancement.

2,002 citations

Proceedings ArticleDOI
02 Apr 1979
TL;DR: This paper describes a method for enhancing speech corrupted by broadband noise based on the spectral noise subtraction method, which can automatically adapt to a wide range of signal-to-noise ratios, as long as a reasonable estimate of the noise spectrum can be obtained.
Abstract: This paper describes a method for enhancing speech corrupted by broadband noise. The method is based on the spectral noise subtraction method. The original method entails subtracting an estimate of the noise power spectrum from the speech power spectrum, setting negative differences to zero, recombining the new power spectrum with the original phase, and then reconstructing the time waveform. While this method reduces the broadband noise, it also usually introduces an annoying "musical noise". We have devised a method that eliminates this "musical noise" while further reducing the background noise. The method consists in subtracting an overestimate of the noise power spectrum, and preventing the resultant spectral components from going below a preset minimum level (spectral floor). The method can automatically adapt to a wide range of signal-to-noise ratios, as long as a reasonable estimate of the noise spectrum can be obtained. Extensive listening tests were performed to determine the quality and intelligibility of speech enhanced by our method. Listeners unanimously preferred the quality of the processed speech. Also, for an input signal-to-noise ratio of 5 dB, there was no loss of intelligibility associated with the enhancement technique.

1,352 citations
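
The modification described in the abstract above (subtract an overestimate of the noise power spectrum, then keep each component above a spectral floor) reduces to a short per-frame rule. The sketch below is only an illustration: the names `alpha` (oversubtraction factor) and `beta` (floor, expressed here as a fraction of the noise power), their default values, and the function name `oversubtract` are assumptions, since the abstract gives no specific settings.

```python
import numpy as np

def oversubtract(noisy_power, noise_power, alpha=4.0, beta=0.01):
    """Power spectral subtraction with oversubtraction and a spectral floor.

    noisy_power : power spectrum of one noisy speech frame
    noise_power : estimated noise power spectrum (same shape)
    alpha       : oversubtraction factor, assumed > 1 (hypothetical default)
    beta        : spectral floor relative to the noise power (hypothetical default)
    """
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    # Subtract an overestimate of the noise power.
    difference = noisy_power - alpha * noise_power
    # Prevent components from dropping below the preset minimum level.
    floor = beta * noise_power
    return np.where(difference > floor, difference, floor)
```

Setting `alpha=1.0` and `beta=0.0` recovers the original method summarized in the abstract, in which negative differences are simply set to zero.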

Journal ArticleDOI
Alan R. Jones

1,349 citations