Speech perception based algorithm for the separation of overlapping speech signal

doi:10.1109/ANZIIS.2001.974101

Proceedings Article

Michael Christopher Orr, +3 more

- pp 341-344

Abstract:

An algorithm for the analysis of speech that utilises the time-frequency properties of wavelets is introduced. The extracted wavelet coefficients are analysed using two techniques. First, a covariance matrix is generated to provide information about speaker characteristics. Second, the kurtosis of the wavelet coefficients is used to facilitate the detection of multiple speakers. Preliminary results show that some phonetic information, such as articulation placement and identification of voiced/unvoiced sections, can be extracted from the kurtosis analysis.
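The two analyses named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Haar wavelet, the frame-wise per-scale energy features, and the function names `haar_dwt`, `scale_energies`, `kurtosis`, and `covariance_matrix` are all assumptions made for illustration, since the abstract does not say which wavelet is used or how the covariance matrix is formed.

```python
import math


def haar_dwt(signal, levels):
    """Haar wavelet decomposition (an assumed wavelet choice).

    Returns a list of detail-coefficient lists, finest scale first.
    """
    approx = list(signal)
    details = []
    root2 = math.sqrt(2.0)
    for _ in range(levels):
        if len(approx) < 2:
            break
        n = len(approx) - len(approx) % 2  # drop a trailing odd sample
        detail = [(approx[i] - approx[i + 1]) / root2 for i in range(0, n, 2)]
        approx = [(approx[i] + approx[i + 1]) / root2 for i in range(0, n, 2)]
        details.append(detail)
    return details


def kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3, which is 0 for a Gaussian."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    if m2 == 0.0:
        return 0.0  # constant input: kurtosis is undefined, report 0
    m4 = sum((v - mean) ** 4 for v in values) / n
    return m4 / (m2 * m2) - 3.0


def scale_energies(frame, levels):
    """Mean squared detail coefficient per scale (hypothetical feature)."""
    return [sum(c * c for c in d) / len(d) for d in haar_dwt(frame, levels)]


def covariance_matrix(vectors):
    """Sample covariance of feature vectors, one vector per frame."""
    n = len(vectors)
    dim = len(vectors[0])
    means = [sum(v[j] for v in vectors) / n for j in range(dim)]
    return [
        [
            sum((v[i] - means[i]) * (v[j] - means[j]) for v in vectors) / (n - 1)
            for j in range(dim)
        ]
        for i in range(dim)
    ]


if __name__ == "__main__":
    # Toy frames standing in for short speech segments (made-up values).
    frames = [
        [1.0, -1.0, 1.0, -1.0],
        [2.0, 0.0, 0.0, 2.0],
        [0.0, 3.0, 3.0, 0.0],
    ]
    features = [scale_energies(f, 2) for f in frames]
    print("covariance of per-scale energies:", covariance_matrix(features))
    for f in frames:
        print("kurtosis of finest-scale details:", kurtosis(haar_dwt(f, 1)[0]))
```

In a sketch like this, the covariance matrix summarises how energy at different wavelet scales varies together across frames, which is one plausible way to capture speaker characteristics. The kurtosis is computed per scale; a sum of independent sources tends to be closer to Gaussian than each source alone, so lower kurtosis may hint at overlapping speakers.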

Citations

Journal Article

Comparison of techniques for environmental sound recognition

Michael A. Cowling, +1 more

- 01 Nov 2003 -

Pattern Recognition Letters

TL;DR: A comprehensive comparative study of artificial neural networks, learning vector quantization and dynamic time warping classification techniques combined with stationary/non-stationary feature extraction for environmental sound recognition shows 70% recognition using mel frequency cepstral coefficients or continuous wavelet transform with dynamic time warping.

Journal Article

Using One-Class SVMs and Wavelets for Audio Surveillance

Asma Rabaoui, +3 more

- 01 Dec 2008 -

IEEE Transactions on Information Forensi...

TL;DR: The 1-SVM-based multiclass classification approach outperforms the conventional hidden Markov model-based system in the experiments conducted; the improvement in the error rate can reach 50%.

Posted Content

Comparison of Time-Frequency Representations for Environmental Sound Classification using Convolutional Neural Networks.

Muhammad Huzaifah

- 22 Jun 2017 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This study supports the hypothesis that time-frequency representations are valuable in learning useful features for sound classification. It observes that the optimal window size during transformation depends on the characteristics of the audio signal and that, architecturally, 2D convolution yielded better results than 1D in most cases.

Dissertation

Non-Speech Environmental Sound Classification System for Autonomous Surveillance

Michael Cowling

TL;DR: This thesis investigates techniques to recognise environmental non-speech sounds and their direction, with the purpose of using these techniques in an autonomous mobile surveillance robot, and presents advanced methods to improve the accuracy and efficiency of these techniques.

Journal Article

Audio sounds classification using scattering features and support vectors machines for medical surveillance

Sameh Souli, +1 more

- 15 Jan 2018 -

Applied Acoustics

TL;DR: The method integrates the ability of PCA to de-correlate the coefficients by extracting a linear relationship with that of scatter transform analysis to derive feature vectors used for environmental sound classification, and shows the superiority of this novel sound recognition method.

References

Journal Article

Robust identification of a nonminimum phase system: Blind adjustment of a linear equalizer in data communications

A. Benveniste, +2 more

- 01 Jun 1980 -

IEEE Transactions on Automatic Control

TL;DR: In this paper, an unknown linear time-invariant system without control, driven by a white noise with known distribution, is considered, and the identification of both gain and phase of the system, observing only the output, is presented.

Journal Article

Description and generation of spherically invariant speech-model signals

H. Brehm, +1 more

- 01 Mar 1987 -

Signal Processing

TL;DR: In this paper, spherically invariant random processes (SIRPs) are used as stationary models for speech signals in telephone channels and a comprehensive mathematical treatment is achieved by means of Meijer's G-functions.

Proceedings Article

The Australian National Database of Spoken Language

J.B. Millar, +3 more

TL;DR: The novel collaborative structure and procedures for selecting speakers, the material recorded, the recording environment, and subsequent annotation and descriptive procedures are described.

Proceedings Article

Speech separation by kurtosis maximization

J.P. LeBlanc, +1 more

TL;DR: A computationally efficient method of separating mixed speech signals using a recursive adaptive gradient descent technique with the cost function designed to maximize the kurtosis of the output (separated) signals is presented.

Journal Article

Algorithms for separating the speech of interfering talkers: Evaluations with voiced sentences, and normal‐hearing and hearing‐impaired listeners

Richard J. Stubbs, +1 more

- 01 Jan 1990 -

Journal of the Acoustical Society of Ame...

TL;DR: Two signal-processing algorithms, derived from those described by Stubbs and Summerfield, were used to separate the voiced speech of two talkers speaking simultaneously, at similar intensities, in a single channel and gave significant increases in intelligibility to both groups of listeners.

Related Papers (5)

Wavelet-based voiced/unvoiced classification algorithm

Wavelet transform based automatic speaker recognition

The usage of wavelet packet transformation in automatic noisy speech recognition systems

The Speech Recognition System Based On Bark Wavelet MFCC

Wavelet based Cepstral Coefficients for neural network speech recognition