Under-Determined Reverberant Audio Source Separation Using a Full-Rank Spatial Covariance Model

doi:10.1109/TASL.2010.2050716

Ngoc Q. K. Duong, +2 more

- 01 Sep 2010 -

IEEE Transactions on Audio, Speech, and ...

- Vol. 18, Iss: 7, pp 1830-1840

TL;DR: In this article, the contribution of each source to all mixture channels in the time-frequency domain is modeled as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source.

Abstract:

This paper addresses the modeling of reverberant recording environments in the context of under-determined convolutive blind source separation. We model the contribution of each source to all mixture channels in the time-frequency domain as a zero-mean Gaussian random variable whose covariance encodes the spatial characteristics of the source. We then consider four specific covariance models, including a full-rank unconstrained model. We derive a family of iterative expectation-maximization (EM) algorithms to estimate the parameters of each model and propose suitable procedures adapted from the state-of-the-art to initialize the parameters and to align the order of the estimated sources across all frequency bins. Experimental results over reverberant synthetic mixtures and live recordings of speech data show the effectiveness of the proposed approach.
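As a rough illustration of the Gaussian model described in the abstract, the sketch below separates a single time-frequency bin with a multichannel Wiener filter. It assumes each source image has covariance `v[j] * R[j]`, where `v[j]` is a scalar variance and `R[j]` a full-rank spatial covariance matrix, and that the mixture is the sum of the source images. The function name `wiener_separate`, the variable names, and the random test data are illustrative assumptions, not taken from the paper; the EM parameter estimation, initialization, and permutation alignment mentioned in the abstract are not shown.

```python
import numpy as np

def wiener_separate(x, v, R):
    """Estimate source images at one time-frequency bin.

    x: (I,) complex mixture vector over I channels
    v: (J,) nonnegative scalar variances of the J sources (assumed known)
    R: (J, I, I) Hermitian positive-definite spatial covariances (assumed known)
    Returns a (J, I) array holding one estimated image per source.
    """
    # Covariance of each source image under the zero-mean Gaussian model.
    sigma_c = v[:, None, None] * R            # (J, I, I)
    # Mixture covariance is the sum of the source image covariances.
    sigma_x = sigma_c.sum(axis=0)             # (I, I)
    # Solve sigma_x @ y = x rather than forming an explicit inverse.
    y = np.linalg.solve(sigma_x, x)           # (I,)
    # Wiener estimate for each source: sigma_c[j] @ sigma_x^{-1} @ x.
    return sigma_c @ y                        # (J, I)

# Hypothetical under-determined setup: 3 sources, 2 channels.
rng = np.random.default_rng(0)
I, J = 2, 3
A = rng.standard_normal((J, I, I)) + 1j * rng.standard_normal((J, I, I))
# A @ A^H is Hermitian positive semi-definite; the small diagonal term keeps it full rank.
R = A @ A.conj().transpose(0, 2, 1) + 1e-3 * np.eye(I)
v = rng.uniform(0.5, 2.0, size=J)
x = rng.standard_normal(I) + 1j * rng.standard_normal(I)

c_hat = wiener_separate(x, v, R)
```

With this estimator the per-source images always sum back to the mixture, which gives a quick sanity check on any implementation.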

Citations

Journal Article

A Consolidated Perspective on Multimicrophone Speech Enhancement and Source Separation

Sharon Gannot, +3 more

- 01 Apr 2017 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper proposes to analyze a large number of established and recent techniques according to four transverse axes: 1) the acoustic impulse response model, 2) the spatial filter design criterion, 3) the parameter estimation algorithm, and 4) optional postfiltering.

Journal Article

An analysis of environment, microphone and data simulation mismatches in robust speech recognition

Emmanuel Vincent, +4 more

- 01 Nov 2017 -

Computer Speech & Language

TL;DR: It is found that training on different noise environments and different microphones barely affects the ASR performance, especially when several environments are present in the training data: only the number of microphones has a significant impact.

Journal Article

Multichannel Audio Source Separation With Deep Neural Networks

Aditya Arie Nugraha, +2 more

- 16 Jun 2016 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This article proposes a framework where deep neural networks are used to model the source spectra and combined with the classical multichannel Gaussian model to exploit the spatial information and presents its application to a speech enhancement problem.

Journal Article

A General Flexible Framework for the Handling of Prior Information in Audio Source Separation

Alexey Ozerov, +2 more

- 01 May 2012 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper introduces a general audio source separation framework based on a library of structured source models that enable the incorporation of prior knowledge about each source via user-specifiable constraints.

Journal Article

Determined blind source separation unifying independent vector analysis and nonnegative matrix factorization

Daichi Kitamura, +4 more

- 01 Sep 2016 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: This paper addresses the determined blind source separation problem and proposes a new effective method unifying independent vector analysis (IVA) and nonnegative matrix factorization (NMF) based on conventional multichannel NMF (MNMF), which reveals the relationship between MNMF and IVA.

References

Journal Article

Maximum likelihood from incomplete data via the EM algorithm

Arthur P. Dempster, +2 more

- 01 Sep 1977 -

Journal of the royal statistical society...

Book

The EM algorithm and extensions

Geoffrey J. McLachlan, +1 more

TL;DR: The EM Algorithm and Extensions describes the formulation of the EM algorithm, details its methodology, discusses its implementation, and illustrates applications in many statistical contexts, opening the door to the tremendous potential of this remarkably versatile statistical tool.

Journal Article

Maximum likelihood estimation from incomplete data via the EM algorithm

A. Dempster

- 01 Jan 1977 -

Journal of the Royal Statistical Society

Journal Article

Beamforming: a versatile approach to spatial filtering

B.D. Van Veen, +1 more

- 01 Apr 1988 -

IEEE Assp Magazine

TL;DR: An overview of beamforming from a signal-processing perspective is provided, with an emphasis on recent research.

Journal Article

Blind separation of speech mixtures via time-frequency masking

Ozgur Yilmaz, +1 more

- 01 Jul 2004 -

IEEE Transactions on Signal Processing

TL;DR: The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture and show that the W-disjoint orthogonality of speech can be approximated in the case where two anechoic mixtures are provided.
