Author

Ronald W. Schafer

Bio: Ronald W. Schafer is an academic researcher from Hewlett-Packard. The author has contributed to research in topics: Speech processing & Digital signal processing. The author has an h-index of 17 and has co-authored 53 publications receiving 16,192 citations. Previous affiliations of Ronald W. Schafer include Massachusetts Institute of Technology & Georgia Institute of Technology.


Papers
Proceedings ArticleDOI
22 May 2011
TL;DR: Experimental results show that the proposed correlogram-based time delay estimation method outperforms a traditional correlogram-based method as well as GCC-PHAT, especially for short analysis windows in a moderately reverberant environment.
Abstract: We propose a correlogram-based time delay estimation method using signals modeled as the output of the cochlea, where the low-level signal processing happens in the human auditory system. With a normalized correlogram that preserves time-delay patterns that are invariant to speech features such as formants, we employ two-dimensional template matching for time-delay estimation. Experimental results show that our method outperforms a traditional correlogram-based method as well as the GCC-PHAT, especially for short analysis windows in a moderately reverberant environment.
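
A hedged sketch of the general idea only, leaving out the cochlear front end and the two-dimensional template matching that the paper actually uses: band-pass the two microphone signals into a few channels (simple stand-ins for cochlear channels), compute normalized per-band cross-correlations, and pick the lag where the summed pattern peaks. The band edges, filter choice, and function names below are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.signal import butter, lfilter, correlate

def bandpass(x, lo, hi, fs, order=4):
    """Simple IIR band-pass filter standing in for a cochlear channel."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return lfilter(b, a, x)

def correlogram_tde(x_left, x_right, fs, bands, max_lag):
    """Estimate the delay (in samples) of x_right relative to x_left by summing
    normalized per-band cross-correlations and picking the peak lag."""
    lags = np.arange(-max_lag, max_lag + 1)
    acc = np.zeros(len(lags))
    for lo, hi in bands:
        xl = bandpass(x_left, lo, hi, fs)
        xr = bandpass(x_right, lo, hi, fs)
        full = correlate(xr, xl, mode="full")          # peaks at +d when x_right lags by d
        mid = len(xl) - 1                              # index of zero lag
        c = full[mid - max_lag: mid + max_lag + 1]
        norm = np.sqrt(np.sum(xl ** 2) * np.sum(xr ** 2)) + 1e-12
        acc += c / norm                                # normalization keeps bands comparable
    return lags[np.argmax(acc)]

# Toy usage: a 32-sample (2 ms at 16 kHz) delay between two noise channels.
fs, true_delay = 16000, 32
sig = np.random.default_rng(0).standard_normal(fs)
left = sig
right = np.concatenate([np.zeros(true_delay), sig[:-true_delay]])
bands = [(300, 800), (800, 1500), (1500, 3000)]
print(correlogram_tde(left, right, fs, bands, max_lag=64))   # expected: 32
```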

1 citation

Posted Content
TL;DR: This framework gives higher priority to removing spectral elements that strongly deviate from a typical spoken unit stored in the trained dictionary, while effectively preserving the underlying speech in more challenging noisy environments.

Abstract: Distortion of the underlying speech is a common problem for single-channel speech enhancement algorithms and hinders such methods from being used more extensively. A dictionary-based speech enhancement method that emphasizes preserving the underlying speech is proposed. Spectral patches of clean speech are sampled and clustered to train a dictionary. Given a noisy speech spectral patch, the best matching dictionary entry is selected and used to estimate the noise power at each time-frequency bin. The noise estimation step is formulated as an outlier detection problem, where the noise at each bin is assumed present only if it is an outlier with respect to the corresponding bin of the best matching dictionary entry. This framework gives higher priority to removing spectral elements that strongly deviate from a typical spoken unit stored in the trained dictionary. Even without the aid of a separate noise model, this method can achieve significant noise reduction for various non-stationary noises, while effectively preserving the underlying speech in more challenging noisy environments.
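
A minimal sketch of the pipeline described above, with plain k-means clustering and a simple magnitude threshold standing in for the paper's outlier-detection formulation; the function names, the threshold rule, and the gain function are illustrative assumptions rather than the authors' method.

```python
import numpy as np

def train_dictionary(clean_patches, n_entries=64, n_iter=20, seed=0):
    """Cluster clean magnitude-spectrum patches with plain k-means to form a dictionary.
    clean_patches: (num_patches, patch_dim) array."""
    rng = np.random.default_rng(seed)
    dictionary = clean_patches[rng.choice(len(clean_patches), n_entries, replace=False)]
    for _ in range(n_iter):
        # assign every patch to its nearest dictionary entry, then re-average
        dists = ((clean_patches[:, None, :] - dictionary[None, :, :]) ** 2).sum(-1)
        labels = dists.argmin(1)
        for k in range(n_entries):
            members = clean_patches[labels == k]
            if len(members):
                dictionary[k] = members.mean(0)
    return dictionary

def enhance_patch(noisy_patch, dictionary, outlier_factor=2.0):
    """Flag a time-frequency bin as noise-dominated ('outlier') when its magnitude
    exceeds the best-matching clean entry by outlier_factor, then attenuate it."""
    best = dictionary[((dictionary - noisy_patch) ** 2).sum(1).argmin()]
    noise_est = np.where(noisy_patch > outlier_factor * best, noisy_patch - best, 0.0)
    gain = np.clip(1.0 - noise_est / (noisy_patch + 1e-12), 0.1, 1.0)
    return gain * noisy_patch
```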

1 citation

Posted Content
TL;DR: In this paper, the capacity analysis for wireless mobile systems with multiple antenna architectures is considered, and the results of the first part are applied to a commonly known baseband, discrete-time multiple antenna system where both the transmitter and receiver know the channel's statistical law.
Abstract: In this part, we consider the capacity analysis for wireless mobile systems with multiple antenna architectures. We apply the results of the first part to a commonly known baseband, discrete-time multiple antenna system where both the transmitter and receiver know the channel's statistical law. We analyze the capacity for additive white Gaussian noise (AWGN) channels, fading channels with full channel state information (CSI) at the receiver, fading channels with no CSI, and fading channels with partial CSI at the receiver. For each type of channel, we study the capacity value as well as issues such as the existence, uniqueness, and characterization of the capacity-achieving measures for different types of moment constraints. The results are applicable to both Rayleigh and Rician fading channels in the presence of arbitrary line-of-sight and correlation profiles.
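
The paper's general treatment concerns the existence, uniqueness, and characterization of capacity-achieving input measures. As a hedged illustration of just one of the listed cases (i.i.d. Rayleigh fading with full CSI at the receiver and an equal-power Gaussian input), a Monte Carlo estimate of the standard ergodic-capacity expression C = E[log2 det(I + (SNR/n_tx) H Hᴴ)] might look like this; the function name and parameters are illustrative, not from the paper.

```python
import numpy as np

def ergodic_capacity_mimo(snr_db, n_tx, n_rx, n_trials=2000, seed=0):
    """Monte Carlo estimate of the ergodic capacity (bits/s/Hz) of an i.i.d.
    Rayleigh-fading MIMO channel with full CSI at the receiver and equal power
    per transmit antenna: C = E[ log2 det(I + (SNR / n_tx) * H H^H) ]."""
    rng = np.random.default_rng(seed)
    snr = 10.0 ** (snr_db / 10.0)
    caps = []
    for _ in range(n_trials):
        h = (rng.standard_normal((n_rx, n_tx)) +
             1j * rng.standard_normal((n_rx, n_tx))) / np.sqrt(2.0)
        m = np.eye(n_rx) + (snr / n_tx) * (h @ h.conj().T)
        caps.append(np.log2(np.linalg.det(m).real))
    return float(np.mean(caps))

print(ergodic_capacity_mimo(snr_db=10, n_tx=2, n_rx=2))   # roughly 5.5 bits/s/Hz
```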

1 citation


Cited by
Book
16 Mar 2001

7,058 citations

Journal ArticleDOI
John Makhoul
01 Apr 1975
TL;DR: This paper gives an exposition of linear prediction in the analysis of discrete signals, where the signal is modeled as a linear combination of its past values and the present and past values of a hypothetical input to a system whose output is the given signal.
Abstract: This paper gives an exposition of linear prediction in the analysis of discrete signals. The signal is modeled as a linear combination of its past values and present and past values of a hypothetical input to a system whose output is the given signal. In the frequency domain, this is equivalent to modeling the signal spectrum by a pole-zero spectrum. The major part of the paper is devoted to all-pole models. The model parameters are obtained by a least squares analysis in the time domain. Two methods result, depending on whether the signal is assumed to be stationary or nonstationary. The same results are then derived in the frequency domain. The resulting spectral matching formulation allows for the modeling of selected portions of a spectrum, for arbitrary spectral shaping in the frequency domain, and for the modeling of continuous as well as discrete spectra. This also leads to a discussion of the advantages and disadvantages of the least squares error criterion. A spectral interpretation is given to the normalized minimum prediction error. Applications of the normalized error are given, including the determination of an "optimal" number of poles. The use of linear prediction in data compression is reviewed. For purposes of transmission, particular attention is given to the quantization and encoding of the reflection (or partial correlation) coefficients. Finally, a brief introduction to pole-zero modeling is given.
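
As a concrete illustration of the autocorrelation (stationary) method the abstract refers to, here is a minimal sketch of all-pole analysis via the Levinson-Durbin recursion. It omits windowing, quantization, and the paper's spectral-matching machinery; the function name and the AR(2) example are illustrative only.

```python
import numpy as np

def lpc_autocorrelation(x, order):
    """All-pole (LPC) analysis by the autocorrelation (stationary) method.
    Returns (a, refl, err): predictor coefficients a[1..p] such that
    x_hat[n] = sum_k a[k] * x[n - k], the reflection (partial correlation)
    coefficients, and the final prediction-error energy."""
    r = np.array([np.dot(x[:len(x) - i], x[i:]) for i in range(order + 1)])
    a = np.zeros(order + 1)                 # a[0] is a placeholder, never used
    refl = np.zeros(order)
    err = r[0]
    for i in range(1, order + 1):
        # Levinson-Durbin recursion: grow the predictor one order at a time
        k = (r[i] - np.dot(a[1:i], r[i - 1:0:-1])) / err
        refl[i - 1] = k
        a_prev = a.copy()
        a[i] = k
        a[1:i] = a_prev[1:i] - k * a_prev[i - 1:0:-1]
        err *= (1.0 - k * k)
    return a[1:], refl, err

# Toy usage: recover the coefficients of a stable AR(2) process.
rng = np.random.default_rng(0)
x = np.zeros(4000)
e = rng.standard_normal(4000)
for n in range(2, 4000):
    x[n] = 1.3 * x[n - 1] - 0.6 * x[n - 2] + e[n]
a, refl, err = lpc_autocorrelation(x, order=2)
print(a)   # close to [1.3, -0.6]
```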

4,206 citations

Book
01 Jan 2000
TL;DR: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora, and demonstrates how the same algorithms can be used for tasks such as speech recognition and word-sense disambiguation.
Abstract: From the Publisher: This book takes an empirical approach to language processing, based on applying statistical and other machine-learning algorithms to large corpora. Methodology boxes are included in each chapter. Each chapter is built around one or more worked examples to demonstrate the main idea of the chapter. Covers the fundamental algorithms of various fields, whether originally proposed for spoken or written language, to demonstrate how the same algorithm can be used for speech recognition and word-sense disambiguation. Emphasis on web and other practical applications. Emphasis on scientific evaluation. Useful as a reference for professionals in any of the areas of speech and language processing.

3,794 citations

Journal ArticleDOI
TL;DR: This paper introduces complexity parameters for time series based on comparison of neighboring values, shows that the complexity behaves similarly to Lyapunov exponents, and finds it particularly useful in the presence of dynamical or observational noise.
Abstract: We introduce complexity parameters for time series based on comparison of neighboring values. The definition directly applies to arbitrary real-world data. For some well-known chaotic dynamical systems it is shown that our complexity behaves similarly to Lyapunov exponents, and is particularly useful in the presence of dynamical or observational noise. The advantages of our method are its simplicity, extremely fast calculation, robustness, and invariance with respect to nonlinear monotonic transformations.
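
A common realization of such an ordinal-pattern complexity measure is permutation entropy: count the rank-order patterns of short, possibly delayed windows and take the Shannon entropy of their distribution. The sketch below is a minimal, hedged version (function name and parameters are illustrative, and tie handling is simplified).

```python
import numpy as np
from math import factorial

def permutation_entropy(x, order=3, delay=1, normalize=True):
    """Complexity of a time series as the Shannon entropy of the distribution
    of ordinal patterns (rank orderings) of short windows."""
    x = np.asarray(x)
    n_windows = len(x) - (order - 1) * delay
    counts = {}
    for i in range(n_windows):
        window = x[i:i + order * delay:delay]
        pattern = tuple(np.argsort(window))        # ordinal pattern of this window
        counts[pattern] = counts.get(pattern, 0) + 1
    probs = np.array(list(counts.values()), dtype=float) / n_windows
    h = -np.sum(probs * np.log(probs))
    if normalize:
        h /= np.log(factorial(order))              # 0 = perfectly regular, 1 = fully random
    return h

rng = np.random.default_rng(0)
print(permutation_entropy(np.sin(np.linspace(0, 20 * np.pi, 2000))))   # low (regular)
print(permutation_entropy(rng.standard_normal(2000)))                  # close to 1 (noise)
```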

3,433 citations

Journal ArticleDOI
TL;DR: This work explores both traditional and novel techniques for addressing the data-hiding process and evaluates these techniques in light of three applications: copyright protection, tamper-proofing, and augmentation data embedding.
Abstract: Data hiding, a form of steganography, embeds data into digital media for the purpose of identification, annotation, and copyright. Several constraints affect this process: the quantity of data to be hidden, the need for invariance of these data under conditions where a "host" signal is subject to distortions, e.g., lossy compression, and the degree to which the data must be immune to interception, modification, or removal by a third party. We explore both traditional and novel techniques for addressing the data-hiding process and evaluate these techniques in light of three applications: copyright protection, tamper-proofing, and augmentation data embedding.
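
As a toy illustration of the embedding step only, and not of any of the traditional or novel techniques the paper evaluates, a least-significant-bit scheme on integer samples can be sketched as follows; it is far too fragile to survive lossy compression, and all names are illustrative.

```python
import numpy as np

def embed_lsb(host, bits):
    """Hide a bit string in the least-significant bits of integer host samples."""
    out = host.copy()
    out[:len(bits)] = (out[:len(bits)] & 0xFE) | np.asarray(bits, dtype=out.dtype)
    return out

def extract_lsb(stego, n_bits):
    """Recover the hidden bits from the least-significant bits."""
    return (stego[:n_bits] & 1).astype(np.uint8)

# Toy usage on 8-bit samples; a robust scheme would add keying, redundancy,
# and transform-domain or spread-spectrum embedding.
host = np.random.default_rng(0).integers(0, 256, size=64, dtype=np.uint8)
msg = np.array([1, 0, 1, 1, 0, 1, 0, 0], dtype=np.uint8)
stego = embed_lsb(host, msg)
assert np.array_equal(extract_lsb(stego, len(msg)), msg)
```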

3,037 citations