Journal Article

A Binaural Scene Analyzer for Joint Localization and Recognition of Speakers in the Presence of Interfering Noise Sources and Reverberation

Abstract
In this study, we present a binaural scene analyzer that is able to simultaneously localize, detect, and identify a known number of target speakers in the presence of spatially positioned noise sources and reverberation. In contrast to many other binaural cocktail party processors, the proposed system does not require a priori knowledge of the azimuth positions of the target speakers. The proposed system consists of three main building blocks: binaural localization, speech source detection, and automatic speaker identification. First, a binaural front-end is used to robustly localize relevant sound source activity. Second, a speech detection module based on missing data classification is employed to determine whether detected sound source activity corresponds to a speaker or to an interfering noise source, using a binary mask that is based on spatial evidence supplied by the binaural front-end. Third, a second missing data classifier is used to recognize the speaker identities of all detected speech sources. The proposed system is systematically evaluated in simulated adverse acoustic scenarios. Compared to state-of-the-art MFCC recognizers, the proposed model achieves significant improvements in speaker recognition accuracy.
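The missing data classification step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not specify the classifier details, so everything below is an assumption made for illustration: diagonal-covariance Gaussian mixture speaker models, plain marginalization over unreliable features (no bounds), and the hypothetical names `masked_gmm_loglik`, `identify`, `models`, `frames`, and `mask`. The binary mask plays the role of the spatial evidence from the binaural front-end: a value of 1 marks a feature as dominated by the target source, and 0 marks it as unreliable.

```python
import numpy as np


def masked_gmm_loglik(frames, mask, weights, means, variances):
    """Mean per-frame log-likelihood under a diagonal GMM, using reliable features only.

    frames, mask: arrays of shape (T, D); mask is 1 for reliable, 0 for unreliable.
    weights: shape (K,); means, variances: shape (K, D).
    Assumption: unreliable features are marginalized out. With diagonal
    covariances this amounts to dropping their per-dimension terms.
    """
    frames = np.asarray(frames, dtype=float)
    mask = np.asarray(mask, dtype=float)
    weights = np.asarray(weights, dtype=float)
    means = np.asarray(means, dtype=float)
    variances = np.asarray(variances, dtype=float)

    diff = frames[:, None, :] - means[None, :, :]                # (T, K, D)
    log_pdf = -0.5 * (np.log(2.0 * np.pi * variances)[None]
                      + diff ** 2 / variances[None])             # (T, K, D)
    # Sum only the reliable dimensions for each mixture component.
    comp = np.log(weights)[None, :] + np.sum(mask[:, None, :] * log_pdf, axis=2)
    # Numerically stable log-sum-exp over components.
    peak = comp.max(axis=1, keepdims=True)
    frame_ll = peak[:, 0] + np.log(np.exp(comp - peak).sum(axis=1))
    return float(frame_ll.mean())


def identify(frames, mask, models):
    """Return the best-scoring speaker name and all scores."""
    scores = {name: masked_gmm_loglik(frames, mask, *params)
              for name, params in models.items()}
    return max(scores, key=scores.get), scores


# Toy data (hypothetical): two single-component speaker models in 2-D.
models = {
    "A": ([1.0], [[0.0, 0.0]], [[1.0, 1.0]]),
    "B": ([1.0], [[3.0, 3.0]], [[1.0, 1.0]]),
}
# Feature 0 matches speaker A; feature 1 is corrupted by an interferer.
frames = np.array([[0.1, 5.0], [-0.2, 6.0]])
mask = np.array([[1, 0], [1, 0]])

best_masked, _ = identify(frames, mask, models)
best_unmasked, _ = identify(frames, np.ones_like(mask), models)
print(best_masked, best_unmasked)
```

In this toy case the masked score relies only on the uncorrupted feature, while the unmasked score is pulled toward the wrong model by the corrupted one. Marginalization is only one possible missing data strategy; bounded variants additionally exploit the fact that the observed energy is an upper bound on the target energy.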

