Open Access Journal Article

Sound source segregation based on estimating incident angle of each frequency component of input signals acquired by multiple microphones

TLDR
A method of segregating desired speech from concurrent sounds received by two microphones that improved the signal-to-noise ratio by over 18 dB and clarified the effect of frequency resolution on the proposed method.
Abstract
We have developed a method of segregating desired speech from concurrent sounds received by two microphones. In this method, which we call SAFIA, signals received by two microphones are analyzed by discrete Fourier transformation. For each frequency component, differences in the amplitude and phase between channels are calculated. These differences are used to select frequency components of the signal that come from the desired direction and to reconstruct these components as the desired source signal. To clarify the effect of frequency resolution on the proposed method, we conducted three experiments. First, we analyzed the relationship between frequency resolution and the power spectrum's cumulative distribution. We found that the speech-signal power was concentrated on specific frequency components when the frequency resolution was about 10 Hz. Second, we determined whether a given frequency resolution decreased the overlap between the frequency components of two speech signals. A 10-Hz frequency resolution minimized the overlap. Third, we analyzed the relationship between sound quality and frequency resolution through subjective tests. The best frequency resolution in terms of sound quality corresponded to the frequency resolutions that concentrated the speech-signal power on specific frequency components and that minimized the degree of overlap. Finally, we demonstrated that this method improved the signal-to-noise ratio by over 18 dB.
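As a rough illustration of the band-selection step described in the abstract, the following NumPy sketch keeps only the frequency bins whose inter-channel level and phase differences point toward the desired direction and then resynthesizes them. It is an illustration under assumed parameters (window length, hop size, thresholds), not the authors' implementation.

```python
import numpy as np

def stft(x, n_fft=2048, hop=512):
    """Hann-windowed short-time Fourier transform (one-sided bins)."""
    win = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * win
              for i in range(0, len(x) - n_fft + 1, hop)]
    return np.fft.rfft(np.asarray(frames), axis=1)

def istft(X, n_fft=2048, hop=512):
    """Overlap-add inverse of stft(); analysis-window scaling is ignored."""
    frames = np.fft.irfft(X, n=n_fft, axis=1)
    out = np.zeros(hop * (len(frames) - 1) + n_fft)
    for k, frame in enumerate(frames):
        out[k * hop:k * hop + n_fft] += frame
    return out

def select_bins(x1, x2, level_thresh_db=0.0, phase_thresh=0.5):
    """Keep the bins that are louder on microphone 1 and nearly in phase
    across the two channels, i.e. the bins attributed to the desired
    direction, then resynthesize them as the desired source."""
    X1, X2 = stft(x1), stft(x2)
    level_diff_db = 20.0 * (np.log10(np.abs(X1) + 1e-12)
                            - np.log10(np.abs(X2) + 1e-12))
    phase_diff = np.angle(X1 * np.conj(X2))
    mask = (level_diff_db > level_thresh_db) & (np.abs(phase_diff) < phase_thresh)
    return istft(X1 * mask)

# Usage: x1, x2 are equal-length mono signals from the two microphones.
# At a 16-kHz sampling rate, n_fft = 2048 gives roughly 8-Hz resolution,
# close to the ~10-Hz resolution the paper finds favourable.
```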


Citations
Journal Article

Blind separation of speech mixtures via time-frequency masking

TL;DR: The results demonstrate that there exist ideal binary time-frequency masks that can separate several speech signals from one mixture, and show that the W-disjoint orthogonality of speech holds approximately in the case where two anechoic mixtures are provided.
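As an illustration of the ideal binary time-frequency mask discussed in this entry, a minimal sketch, assuming the complex STFTs of the two sources and of their mixture are available as NumPy arrays:

```python
import numpy as np

def ideal_binary_mask(S1, S2):
    """Ideal binary mask for source 1: keep each time-frequency cell in
    which source 1 dominates source 2 (the W-disjoint orthogonality
    assumption says such cells rarely carry both sources at once)."""
    return (np.abs(S1) > np.abs(S2)).astype(float)

# With X = S1 + S2, applying the mask to the mixture,
#   S1_hat = ideal_binary_mask(S1, S2) * X,
# recovers source 1 almost exactly when the sources seldom overlap
# in time-frequency.
```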
Journal Article

Underdetermined Convolutive Blind Source Separation via Frequency Bin-Wise Clustering and Permutation Alignment

TL;DR: A blind source separation method is presented for convolutive mixtures of speech/audio sources; it can be applied to an underdetermined case where there are fewer microphones than sources.
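A simplified sketch of the permutation-alignment step named in this entry; the greedy correlation criterion and the layout of `masks` are assumptions made for illustration, not the published algorithm:

```python
import numpy as np
from itertools import permutations

def _corr(a, b):
    """Correlation coefficient with a guard against zero variance."""
    a, b = a - a.mean(), b - b.mean()
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)

def align_permutations(masks):
    """Align the source ordering across frequency bins.
    masks: shape (n_bins, n_src, n_frames), per-bin source activities from
    bin-wise clustering.  Each bin's ordering is permuted to best correlate
    with a slowly updated reference activity profile."""
    n_bins, n_src, _ = masks.shape
    aligned = np.empty_like(masks)
    aligned[0] = masks[0]
    ref = masks[0].astype(float).copy()
    for b in range(1, n_bins):
        best = max(permutations(range(n_src)),
                   key=lambda p: sum(_corr(ref[i], masks[b, j])
                                     for i, j in enumerate(p)))
        aligned[b] = masks[b, list(best)]
        ref = 0.9 * ref + 0.1 * aligned[b]   # update the reference profile
    return aligned
```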
Journal Article

A time-frequency blind signal separation method applicable to underdetermined mixtures of dependent sources

TL;DR: A new blind source separation (BSS) method called Time-Frequency Ratio Of Mixtures (TIFROM) is presented; it uses time-frequency information to cancel source-signal contributions from a set of linear instantaneous mixtures of these sources.
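A rough sketch of the ratio-of-mixtures idea for two linear instantaneous mixtures; the zone detection below is deliberately crude and only cancels a single source, so it should be read as an illustration rather than the published TIFROM algorithm:

```python
import numpy as np

def cancel_one_source(X1, X2, zone_len=8):
    """X1, X2: complex STFTs of two instantaneous mixtures, shape
    (n_frames, n_freqs).  In a time-frequency zone where only one source
    is active, the ratio X1/X2 is nearly constant and equals that source's
    mixing ratio; estimating it there lets the source be cancelled."""
    ratio = X1 / (X2 + 1e-12)
    n_frames = ratio.shape[0] // zone_len * zone_len
    zones = ratio[:n_frames].reshape(-1, zone_len, ratio.shape[1])
    var = zones.var(axis=1)                       # ratio variance per zone
    z_idx, f_idx = np.unravel_index(np.argmin(var), var.shape)
    a = zones[z_idx, :, f_idx].mean()             # estimated mixing ratio
    return X1 - a * X2                            # that source cancelled
```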
Journal Article

Underdetermined blind sparse source separation for arbitrarily arranged multiple sensors

TL;DR: This paper presents a new method for blind sparse source separation in which time-frequency points are clustered by the k-means algorithm; the method is easily applied to more than three sensors arranged non-linearly, and it has obtained promising results for two- and three-dimensionally distributed speech separation with non-linear/non-uniform sensor arrays in a real room, even in underdetermined situations.
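The clustering step can be sketched loosely as follows, assuming NumPy and scikit-learn; the level and phase features below are simplified placeholders rather than the paper's exact feature normalization:

```python
import numpy as np
from sklearn.cluster import KMeans

def cluster_tf_points(X, n_src, freqs, d_max=0.05, c=343.0):
    """X: complex STFTs of shape (n_mics, n_frames, n_freqs); freqs: bin
    frequencies in Hz; d_max: assumed maximum microphone spacing in metres.
    Each time-frequency point is described by normalized level ratios and
    frequency-normalized phase differences (relative to microphone 0), and
    the points are clustered into n_src groups, giving binary masks."""
    n_mics, n_frames, n_freqs = X.shape
    level = np.abs(X) / (np.linalg.norm(np.abs(X), axis=0) + 1e-12)
    phase = np.angle(X * np.conj(X[0])) / (2 * np.pi * freqs * d_max / c + 1e-12)
    feats = np.concatenate([level, phase], axis=0)      # (2*n_mics, T, F)
    feats = feats.reshape(2 * n_mics, -1).T             # one row per T-F point
    labels = KMeans(n_clusters=n_src, n_init=10).fit_predict(feats)
    return np.stack([(labels == k).reshape(n_frames, n_freqs).astype(float)
                     for k in range(n_src)])            # masks per source
```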
Journal Article

Survey of sparse and non-sparse methods in source separation

TL;DR: This paper surveys recently developed sparse blind source separation methods and the previously existing non-sparse methods, providing insights and appropriate hooks into the literature along the way.
References
Journal Article

An information-maximization approach to blind separation and blind deconvolution

TL;DR: It is suggested that information maximization provides a unifying framework for problems in "blind" signal processing, and dependencies of information transfer on time delays are derived.
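The information-maximization rule is often written in a natural-gradient form; the following is a common textbook sketch rather than the exact update derived in the paper:

```python
import numpy as np

def infomax_ica(X, lr=0.01, n_iter=200):
    """X: zero-mean observations of shape (n_sources, n_samples).
    Returns an unmixing matrix W such that W @ X approximates the sources,
    using the natural-gradient information-maximization update with a
    tanh nonlinearity (suited to super-Gaussian sources such as speech)."""
    n, T = X.shape
    W = np.eye(n)
    for _ in range(n_iter):
        Y = W @ X
        g = np.tanh(Y)                              # score function
        W += lr * (np.eye(n) - g @ Y.T / T) @ W     # natural-gradient step
    return W
```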
Journal Article

Suppression of acoustic noise in speech using spectral subtraction

TL;DR: A stand-alone noise suppression algorithm that resynthesizes a speech waveform and can be used as a pre-processor to narrow-band voice communications systems, speech recognition systems, or speaker authentication systems.
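Magnitude spectral subtraction can be sketched in a few lines; the parameters are illustrative, and the published algorithm adds smoothing and residual-noise reduction steps:

```python
import numpy as np

def spectral_subtraction(noisy_stft, noise_frames=10, alpha=1.0, floor=0.01):
    """noisy_stft: complex STFT of shape (n_frames, n_freqs).  The first
    noise_frames frames are assumed noise-only; their average magnitude is
    subtracted from every frame, and the noisy phase is reused when
    resynthesizing the spectrum."""
    mag, phase = np.abs(noisy_stft), np.angle(noisy_stft)
    noise_mag = mag[:noise_frames].mean(axis=0)              # noise estimate
    clean_mag = np.maximum(mag - alpha * noise_mag, floor * mag)
    return clean_mag * np.exp(1j * phase)
```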
Journal Article

Introduction to the Psychology of Hearing

TL;DR: In this paper, the authors provide an account of current trends in auditory research on a level not too technical for the novice, relating psychological and perceptual aspects of sound to the underlying physiological mechanisms of hearing so that the material can be used as a text to accompany an advanced undergraduate or graduate-level course in auditory perception.
Journal Article

Enhancement and bandwidth compression of noisy speech

TL;DR: An overview of the variety of techniques that have been proposed for enhancement and bandwidth compression of speech degraded by additive background noise is provided; a unifying framework is suggested in terms of which the relationships between these systems are more visible and which hopefully provides a structure that will suggest fruitful directions for further research.
Journal Article

Inverse filtering of room acoustics

TL;DR: In this article, a novel method is proposed for realizing exact inverse filtering of acoustic impulse responses in a room, based on a principle called the multiple-input/output inverse theorem (MINT).
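The exact-inverse idea can be sketched as a linear system over two channels, assuming room impulse responses with no common zeros; this is an illustration of the principle, not the paper's derivation:

```python
import numpy as np
from scipy.linalg import toeplitz

def conv_matrix(g, m):
    """Toeplitz matrix G such that G @ h equals np.convolve(g, h) for a
    filter h of length m."""
    col = np.concatenate([np.asarray(g, float), np.zeros(m - 1)])
    row = np.zeros(m)
    row[0] = g[0]
    return toeplitz(col, row)

def mint_inverse(g1, g2, m=None):
    """Find filters h1, h2 with g1*h1 + g2*h2 close to a unit impulse.
    g1, g2: equal-length room impulse responses with no common zeros;
    m: filter length, defaulting to len(g1) - 1, which makes the system
    square and exactly solvable in the ideal case."""
    L = len(g1)
    m = m or (L - 1)
    G = np.hstack([conv_matrix(g1, m), conv_matrix(g2, m)])  # (L+m-1, 2m)
    d = np.zeros(L + m - 1)
    d[0] = 1.0                                   # target: a unit impulse
    h = np.linalg.lstsq(G, d, rcond=None)[0]
    return h[:m], h[m:]
```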