Robust speech/non-speech detection using LDA applied to MFCC

doi:10.1109/ICASSP.2001.940811

Proceedings Article

Vol. 1, pp. 237-240

Abstract:

In speech recognition, speech/non-speech detection must be robust to noise. This paper presents a method for speech/non-speech detection using linear discriminant analysis (LDA) applied to mel-frequency cepstral coefficients (MFCC). Energy is the most discriminative parameter between noise and speech, but a detection system that relies on this single parameter detects too many noise segments. Applying LDA to the MFCC, together with the associated test, reduces the detection of noise segments. The new algorithm is compared with one based on signal-to-noise ratio (Mauuary and Monne, 1993).
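The idea of projecting cepstral feature vectors onto a single discriminant axis can be illustrated with a minimal sketch. This is not the paper's implementation: it is a plain two-class Fisher LDA run on synthetic 3-dimensional vectors that merely stand in for MFCC frames. The function names (`fit_lda`, `is_speech`), the class means, and the midpoint decision threshold are all assumptions made for illustration; the abstract does not describe the "associated test" the authors use.

```python
import random

def mean_vector(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def scatter_matrix(rows, mu):
    """Sum of outer products (x - mu)(x - mu)^T over all rows."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for x in rows:
        diff = [xi - mi for xi, mi in zip(x, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s

def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_lda(noise, speech):
    """Two-class Fisher LDA: w = Sw^-1 (mu_speech - mu_noise)."""
    mu_n, mu_s = mean_vector(noise), mean_vector(speech)
    s_n, s_s = scatter_matrix(noise, mu_n), scatter_matrix(speech, mu_s)
    s_w = [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(s_n, s_s)]
    w = solve(s_w, [s - n for s, n in zip(mu_s, mu_n)])
    # Assumed decision rule: threshold at the projected midpoint of the means.
    midpoint = [(s + n) / 2 for s, n in zip(mu_s, mu_n)]
    threshold = sum(wi * mi for wi, mi in zip(w, midpoint))
    return w, threshold

def is_speech(frame, w, threshold):
    """Label a frame as speech if its projection exceeds the threshold."""
    return sum(wi * xi for wi, xi in zip(w, frame)) > threshold

def make_frames(mean, count):
    """Synthetic stand-in for MFCC frames: unit-variance Gaussian vectors."""
    return [[random.gauss(m, 1.0) for m in mean] for _ in range(count)]

random.seed(0)
noise = make_frames([0.0, 0.0, 0.0], 200)    # hypothetical noise class
speech = make_frames([4.0, 2.0, -2.0], 200)  # hypothetical speech class

w, threshold = fit_lda(noise, speech)
correct = sum(not is_speech(x, w, threshold) for x in noise)
correct += sum(is_speech(x, w, threshold) for x in speech)
accuracy = correct / (len(noise) + len(speech))
print(f"training accuracy on synthetic frames: {accuracy:.3f}")
```

In a real detector the vectors would come from an MFCC front end computed on labeled speech and noise frames, and the threshold would be tuned on held-out data rather than fixed at the midpoint.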

Citations

Proceedings Article

Mahimahi: accurate record-and-replay for HTTP

Ravi Netravali, +6 more

TL;DR: Mahimahi is a framework to record traffic from HTTP-based applications, and later replay it under emulated network conditions, designed as a set of composable shells, providing ease-of-use and extensibility.

Proceedings Article

Wishbone: profile-based partitioning for sensornet applications

Ryan R. Newton, +4 more

TL;DR: Wishbone is a system that takes a dataflow graph of operators and produces an optimal partitioning, which shows that the system can quickly identify good trade-offs given limitations in CPU and network capacity.

Proceedings Article

Robust Speech/Non-Speech Detection using LDA applied to MFCC for Continuous Speech Recognition

Arnaud Martin, +2 more

TL;DR: A method for speech/non-speech detection using a linear discriminant analysis (LDA) applied to mel frequency cepstrum coefficients (MFCC) is presented, which reduces the detection of noise segments.

Patent

Voice activity detection system and method

Zica Valsan

TL;DR: In this paper, a set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames by applying at least one weighting factor to each feature vector.

Proceedings Article

Real-time speaker identification.

Pasi Fränti, +2 more

TL;DR: Vector quantization (VQ) based speaker identification is optimized in two ways: the number of test vectors is reduced by pre-quantizing the test sequence prior to matching, and the number of speakers is reduced by pruning out unlikely speakers during the identification process.

References

Proceedings Article

A novel approach to robust speech endpoint detection in car environments

Liang-Sheng Huang, +1 more

TL;DR: A novel approach is proposed that finds robust features for endpoint detection in a noisy in-car environment by integrating the widely used energy and entropy measures into a new feature that keeps the advantages of each while compensating for the drawbacks of the other.

Proceedings Article

Speech/non-speech classification using multiple features for robust endpoint detection

Won-Ho Shin, +3 more

TL;DR: A new speech/non-speech classification method is proposed that improves endpoint detection performance for speech recognition in noisy environments; the classification and regression tree (CART) technique is applied to combine multiple features for the classification of each frame.

Journal Article

Towards improving ASR robustness for PSN and GSM telephone applications

Chafic Mokbel, +6 more

Speech Communication, 01 Oct 1997

TL;DR: The results obtained prove that HMM adaptation and preprocessing techniques can be advantageously combined to improve Automatic Speech Recognition (ASR) robustness and show that spectral subtraction improves speech detection under noisy GSM conditions.

Journal Article

Evaluation of a statistical approach to voiced-unvoiced-silence analysis for telephone-quality speech

Lawrence R. Rabiner, +2 more

Bell System Technical Journal, 01 Mar 1977

TL;DR: An investigation was undertaken to determine a suitable set of parameters that would provide a reliable voiced-unvoiced-silence decision across a variety of standard telephone connections, and the use of the Itakura two-pole spectral normalization was investigated to see its effect on the error scores.

Proceedings Article

Prosodic word boundary detection using statistical modeling of moraic fundamental frequency contours and its use for continuous speech recognition

Koji Iwano, +1 more

TL;DR: A new method for prosodic word boundary detection in continuous speech was developed based on the statistical modeling of moraic transitions of fundamental frequency (F0) contours, formerly proposed by the authors.

Related Papers (5)

Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences

A robust endpoint detection of speech for noisy environments with application to automatic speech recognition

Discrete-Time Processing of Speech Signals

Fundamentals of speech recognition

A statistical model-based voice activity detection