Journal Article
Relative Transfer Function Identification Using Convolutive Transfer Function Approximation
TL;DR: An unbiased RTF estimator is developed that exploits the nonstationarity and presence probability of the speech signal, and an analytic expression for the estimator variance is derived.
Abstract:
In this paper, we present a relative transfer function (RTF) identification method for speech sources in reverberant environments. The proposed method is based on the convolutive transfer function (CTF) approximation, which makes it possible to represent a linear convolution in the time domain as a linear convolution in the short-time Fourier transform (STFT) domain. Unlike the restrictive and commonly used multiplicative transfer function (MTF) approximation, which becomes more accurate when the length of a time frame increases relative to the length of the impulse response, the CTF approximation enables representation of long impulse responses using short time frames. We develop an unbiased RTF estimator that exploits the nonstationarity and presence probability of the speech signal and derive an analytic expression for the estimator variance. Experimental results show that the proposed method is advantageous compared to common RTF identification methods in various acoustic environments, especially when identifying long RTFs typical of real rooms.
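The difference between the two approximations can be written out explicitly. The following is a sketch in assumed notation that does not appear in the abstract: $s(n)$ is the source signal, $h(n)$ the impulse response, $x(n)$ the observed signal, $S(p,k)$ and $X(p,k)$ their STFT coefficients at frame $p$ and frequency bin $k$, and $L$ a hypothetical number of filter taps per band. Cross-band terms are assumed to be neglected.

```latex
% Time domain: linear convolution of the source with the impulse response
x(n) = \sum_{m} h(m)\, s(n-m)

% MTF approximation: a single multiplicative coefficient per frequency bin
X(p,k) \approx H(k)\, S(p,k)

% CTF approximation: a convolution along the frame index within each bin
X(p,k) \approx \sum_{p'=0}^{L-1} H(p',k)\, S(p-p',k)
```

With $L = 1$ the CTF form reduces to the MTF form. Allowing $L > 1$ lets the band filter span several frames, which is how a long impulse response can be represented with short time frames.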
Citations
Journal Article
Machine learning in acoustics: theory and applications
Michael J. Bianco, Peter Gerstoft, James Traer, Emma Ozanich, Marie A. Roch, Sharon Gannot, Charles-Alban Deledalle
TL;DR: In this paper, the authors survey the recent advances and transformative potential of machine learning (ML), including deep learning, in the field of acoustics and highlight ML developments in four acoustics research areas: source localization in speech processing, source localization in ocean acoustics, bioacoustics, and environmental sounds in everyday scenes.
Book Chapter
Acoustic Beamforming for Hearing Aid Applications
TL;DR: This chapter contains sections titled: Introduction; Overview of noise reduction techniques; Monaural beamforming; Binaural beamforming; Conclusion; References.
Journal Article
Multi-channel linear prediction-based speech dereverberation with sparse priors
TL;DR: This paper proposes to model the desired speech signal using a general sparse prior that can be represented in a convex form as a maximization over scaled complex Gaussian distributions, which can be interpreted as a generalization of the commonly used time-varying Gaussian model.
Journal Article
The binaural LCMV beamformer and its performance analysis
TL;DR: A theoretical analysis of the BLCMV beamformer is presented and several decompositions are introduced that reveal its capabilities in terms of interference and noise reduction, while controlling the binaural cues of the desired and the interfering sources.
Journal Article
Theoretical analysis of binaural transfer function MVDR beamformers with interference cue preservation constraints
TL;DR: Among all beamformers which are distortionless with respect to the desired source and preserve the binaural cues of the interfering source, the newly proposed BMVDR-RTF beamformer is optimal in terms of SINR.
References
Journal Article
Image method for efficiently simulating small‐room acoustics
Jont B. Allen, David A. Berkley
TL;DR: The paper discusses the theoretical and practical use of image techniques for simulating the impulse response between two points in a small rectangular room; the resulting impulse response, when convolved with any desired input signal, simulates room reverberation of that signal.
DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus CD-ROM (TIMIT) | NIST
John S. Garofolo, Lori Lamel, W. M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren
Dataset
TIMIT Acoustic-Phonetic Continuous Speech Corpus
John S. Garofolo, Lori Lamel, William M. Fisher, Jonathan G. Fiscus, David S. Pallett, Nancy L. Dahlgren, Victor W. Zue
TL;DR: The TIMIT corpus contains broadband recordings of 630 speakers of eight major dialects of American English, each reading ten phonetically rich sentences. It includes time-aligned orthographic, phonetic, and word transcriptions as well as a 16-bit, 16 kHz speech waveform file for each utterance.
Journal Article
Noise spectrum estimation in adverse environments: improved minima controlled recursive averaging
TL;DR: In this article, an improved minima controlled recursive averaging (IMCRA) approach is proposed for noise estimation in adverse environments involving nonstationary noise, weak speech components, and low input signal-to-noise ratio (SNR).