Separation of speech from interfering speech by means of harmonic selection

doi:10.1121/1.381172

Journal Article

Thomas W. Parsons

- 01 Oct 1976 -

Journal of the Acoustical Society of Ame...

- Vol. 60, Iss: 4, pp 911-918

TL;DR: This paper addresses the principal subproblem of separating two talkers on a single channel, the separation of vocalic speech, by selecting the harmonics of the desired voice in the Fourier transform of the input.

Abstract:

A common type of interference in speech transmission is that caused by the speech of a competing talker. Although the brain is adept at clarifying such speech, it relies heavily on binaural data. When voices interfere over a single channel, separation is much more difficult and intelligibility suffers. Clarifying such speech is a complex and varied problem whose nature changes with the moment‐to‐moment variation in the types of sound which interfere. This paper describes an attack on the principal subproblem, the separation of vocalic speech. Separation is done by selecting the harmonics of the desired voice in the Fourier transform of the input. In implementing this process, techniques have been developed for resolving overlapping spectrum components, for determining pitches of both talkers, and for assuring consistent separation. These techniques are described, their performance on test utterances is summarized, and the possibility of using this process as a basis for the solution of the general two‐tal...
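The core idea in the abstract, keeping only the Fourier components that lie on the desired talker's harmonic series, can be illustrated with a minimal sketch. This is not the paper's implementation: the function names, the fixed `half_width_hz` tolerance, the synthetic harmonic tones, and the assumption that the target pitch `f0` is already known are all illustrative choices. The paper itself estimates the pitches of both talkers and resolves overlapping components, which this sketch does not attempt.

```python
import numpy as np

def select_harmonics(frame, sample_rate, f0, half_width_hz=5.0):
    """Keep only spectral bins within half_width_hz of a multiple of f0.

    Hypothetical helper: f0 is assumed known here, not estimated.
    """
    spectrum = np.fft.rfft(frame)
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
    # Index of the nearest harmonic (k >= 1) for every frequency bin.
    k = np.maximum(np.round(freqs / f0), 1.0)
    mask = np.abs(freqs - k * f0) <= half_width_hz
    return np.fft.irfft(spectrum * mask, n=len(frame))

def harmonic_tone(f0, n_harmonics, sample_rate, n_samples):
    """Synthetic stand-in for a voiced sound: a sum of decaying harmonics."""
    t = np.arange(n_samples) / sample_rate
    return sum(np.sin(2 * np.pi * k * f0 * t) / k
               for k in range(1, n_harmonics + 1))

sample_rate = 8000
n_samples = 8000  # one second, so every harmonic falls on an exact FFT bin

target = harmonic_tone(100.0, 8, sample_rate, n_samples)
interferer = harmonic_tone(130.0, 8, sample_rate, n_samples)
mixture = target + interferer

recovered = select_harmonics(mixture, sample_rate, f0=100.0)
print(np.max(np.abs(recovered - target)))  # residual error after selection
```

The example stops at eight harmonics on purpose: the tenth harmonic of 130 Hz (1300 Hz) coincides with the thirteenth harmonic of 100 Hz, and a plain mask cannot separate coinciding components. That is the kind of overlap the abstract says needed dedicated techniques.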

Citations

A non-negative framework for joint modeling of spectral structure and temporal dynamics in sound mixtures

Gautham J. Mysore, +4 more

TL;DR: A new model of single sound sources, the non-negative hidden Markov model (N-HMM), is proposed that jointly models the spectral structure and temporal dynamics of a given source.

Journal Article

Separation of several speakers recorded by two microphones (cocktail-party processing)

Hans Werner Strube

- 01 Oct 1981 -

Signal Processing

TL;DR: In this article, a signal processing method for enhancing the directional separation of an ordinary (dummy-head) stereo speech recording is described that, after initial adaptation to a certain direction, simulates the human ability to concentrate on speech coming from this direction and to suppress disturbing speakers from other directions.

Journal Article

Monaural speech segregation based on fusion of source-driven with model-driven techniques

M.H. Radfar, +2 more

- 01 Jun 2007 -

Speech Communication

TL;DR: The results show that model-based separation delivers the best quality in the speaker-dependent case, whereas in a speaker-independent scenario the integrated model outperforms the individual approaches, supporting the idea that the human auditory system uses both grouping cues and a priori knowledge to segregate speech signals.

Journal Article

Capturing frequency components of glided tones: frequency separation, orientation, and alignment.

Ward Steiger, +1 more

- 01 Sep 1981 -

Attention Perception & Psychophysics

TL;DR: There appears to be no special capturing effect when the captor and target glides are aligned on a common trajectory.

Journal Article

BaNa: a noise resilient fundamental frequency detection algorithm for speech and music

Na Yang, +4 more

- 01 Dec 2014 -

IEEE Transactions on Audio, Speech, and ...

TL;DR: BaNa, a hybrid noise-resilient F0 detection algorithm that combines harmonic ratios and cepstrum analysis, is presented; it achieves the lowest Gross Pitch Error (GPE) rate among the algorithms compared.

Related Papers (5)

Computational auditory scene analysis

Some Experiments on the Recognition of Speech, with One and with Two Ears

Auditory Scene Analysis

Computational Auditory Scene Analysis: Principles, Algorithms, and Applications

Auditory Scene Analysis: The Perceptual Organization of Sound