Showing papers by "Ronald W. Schafer" published in 2009


Patent
15 Jun 2009
TL;DR: This patent describes a method for dereverberating reverberant digital signals: the signal is transformed into Fourier domain subbands, the magnitude in each subband is inverse filtered, and the filtered signals are inverse transformed into an approximate dereverberated digital signal.
Abstract: Various embodiments of the present invention are directed to methods for dereverberation of audio generated in a room. In one aspect, a method for dereverberating reverberant digital signals comprises transforming a reverberant digital signal from the time domain into Fourier domain signals using a computing device, each Fourier domain signal corresponding to a subband. For each subband of the Fourier domain signal, the method computes autoregressive model coefficients of the reverberation with the current and previous magnitudes of the Fourier digital signal, and inverse filters the magnitude of the Fourier domain signal using the computing device, based on the autoregressive model coefficients and previous magnitudes of the Fourier digital signal. The method includes inverse transforming the Fourier domain signals with filtered magnitudes into an approximate dereverberated digital signal.

12 citations
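
The per-subband step in the dereverberation abstract above (fit autoregressive coefficients to the magnitudes, then inverse filter using previous magnitudes) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the patented method: the model order is fixed at 1, the coefficient is fitted by ordinary least squares, and the function names `ar1_coefficient`, `inverse_filter_magnitudes`, and `dereverberate_magnitudes` are hypothetical. The forward and inverse Fourier transforms are left out; the input is assumed to be one list of frame magnitudes per subband.

```python
def ar1_coefficient(mags):
    """Least-squares fit of mags[t] ~ a * mags[t-1] for one subband.

    Assumption: a first-order model stands in for the autoregressive
    model of the reverberation mentioned in the abstract.
    """
    num = sum(cur * prev for prev, cur in zip(mags, mags[1:]))
    den = sum(prev * prev for prev in mags[:-1])
    return num / den if den > 0.0 else 0.0


def inverse_filter_magnitudes(mags, a):
    """Subtract the predicted reverberant part from each frame magnitude.

    The result is clamped at zero because magnitudes cannot be negative.
    The first frame has no predecessor and is passed through unchanged.
    """
    if not mags:
        return []
    out = [mags[0]]
    for prev, cur in zip(mags, mags[1:]):
        out.append(max(cur - a * prev, 0.0))
    return out


def dereverberate_magnitudes(subband_mags):
    """Apply the fit-then-filter step independently to every subband."""
    return [
        inverse_filter_magnitudes(mags, ar1_coefficient(mags))
        for mags in subband_mags
    ]
```

For a subband whose magnitudes decay as 1.0, 0.5, 0.25, 0.125, the fit gives a coefficient of 0.5 and the filtered magnitudes are 1.0, 0, 0, 0, so the decaying tail is removed. In a full pipeline the filtered magnitudes would be recombined with the original phases before the inverse transform.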


Patent
30 Apr 2009
TL;DR: This patent describes an adaptive method for reducing acoustic echoes in multichannel audio communication systems, which adapts to changes in the room by inferring approximate impulse responses that lie within a model of an impulse response space.
Abstract: Various embodiments of the present invention are directed to adaptive methods for reducing acoustic echoes in multichannel audio communication systems. Acoustic echo cancellation methods determine approximate impulse responses characterizing each echo path between loudspeakers and microphones within a room and improve performance based on previously determined impulse responses. In particular, the methods adapt to changes in the room by inferring approximate impulse responses that lie within a model of an impulse response space. Over time the method improves performance by evolving the model into a more accurate space from which to select subsequent approximate impulse responses.

10 citations


Proceedings Article
23 Oct 2009
TL;DR: This paper analyzes the ability of several measurements to quantify the reverberation effect in speech signals and considers an intrusive scheme, in which the clean and reverberated signals are available, allowing one to estimate the corresponding room impulse response signal.
Abstract: This paper analyzes the ability of several measurements to quantify the reverberation effect in speech signals. We consider an intrusive scheme, in which the clean and reverberated signals are available, allowing one to estimate the corresponding room impulse response (RIR) signal. An artificial neural network (ANN) is trained on all features and used in a regression approach to estimate the human perceptual evaluation on a 1–5 mean opinion score (MOS) scale. Dimensionality reduction approaches are applied to generate a simpler ANN regression, establishing the most representative features for the problem at hand. A correlation level of 85% with subjective test scores was achieved by reducing the input-vector dimension from 10 to 3, including only the features of reverberation time, room spectral variance, and direct-to-reverberant energy ratio.

9 citations
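
One of the three retained features named in the abstract above, the direct-to-reverberant energy ratio, can be computed from an estimated RIR roughly as follows. This is a sketch under stated assumptions rather than the paper's definition: the direct path is taken to be a short window centred on the largest RIR sample, everything after that window counts as reverberation, and both the function name `direct_to_reverberant_ratio_db` and the 2.5 ms default half-window are hypothetical choices.

```python
import math


def direct_to_reverberant_ratio_db(rir, sample_rate, direct_window_ms=2.5):
    """Ratio, in dB, of direct-path energy to reverberant energy in an RIR.

    Assumptions: the direct path is the window of +/- direct_window_ms
    around the sample with the largest absolute value, and the
    reverberant part is everything after that window.
    """
    if not rir:
        raise ValueError("empty impulse response")
    peak = max(range(len(rir)), key=lambda n: abs(rir[n]))
    half = int(round(direct_window_ms * 1e-3 * sample_rate))
    start = max(peak - half, 0)
    stop = min(peak + half + 1, len(rir))
    direct = sum(h * h for h in rir[start:stop])
    reverberant = sum(h * h for h in rir[stop:])
    if direct == 0.0 or reverberant == 0.0:
        raise ValueError("ratio undefined: one energy term is zero")
    return 10.0 * math.log10(direct / reverberant)
```

A value like this would be one entry of the three-dimensional input vector fed to the regression; the other two features (reverberation time and room spectral variance) are not covered by this sketch.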


Proceedings Article
23 Oct 2009
TL;DR: This work approaches quality evaluation of full-band (24 kHz) high-quality speech corrupted by echo by proposing a simple metric singled out from a standardized double-ended tool for audio QA as a solution for the problem at hand.
Abstract: Modern telepresence systems can deliver multimedia signals of unprecedentedly high quality of experience to the user. Setting up and maintaining such services calls for reliable and automatic tools for multimedia quality probing, in particular those targeted at speech data along the transmission path. Most of the objective methods for sound quality assessment (QA) in the literature are intended for either speech signals of 4- to 8-kHz bandwidth or general audio up to 24 kHz, but are not specifically designed for speech at high sampling rates. This work approaches quality evaluation of full-band (24 kHz) high-quality speech corrupted by echo. A simple metric singled out from a standardized double-ended tool for audio QA is proposed as a solution for the problem at hand. Quality measures from a set of speech stimuli corrupted by echo under controlled conditions were obtained via listening tests to allow calibration and evaluation of the proposed method. Experimental results reveal an overall correlation of 0.94 between objective and subjective scores, even in the presence of moderate additive noise.

4 citations


Proceedings Article
23 Oct 2009
TL;DR: A threshold-based hierarchical classification system is proposed and fully specified, from pre-processing of the input signals, through the estimation of a few characteristic features, to the data clustering criteria.
Abstract: This paper reports the design of a double-ended (intrusive) diagnostic tool for identifying five types of degradation in audio signals. The impairment types taken into consideration are additive contamination with pink noise, occurrences of signal mutes, distortion by magnitude clipping, and the previous two types mixed with pink noise. As a simple solution to accomplish the established goal, a threshold-based hierarchical classification system is proposed and fully specified, from pre-processing of the input signals, through the estimation of a few characteristic features, to the data clustering criteria. Performance evaluation of the classifier is carried out via a validation database containing 60 impaired signals for each type of impairment, with five distinct degradation intensity levels. For the types and range of degradation levels considered in this work, excellent results are achieved, with more than 96% of the data correctly classified in the worst case. System performance in identifying mixed impairment types tends to deteriorate as the strength of the noise component increases.

3 citations