Proceedings Article

Robust speech/non-speech detection using LDA applied to MFCC

Arnaud Martin, +2 more
- Vol. 1, pp 237-240
TLDR
In this article, a method for speech/non-speech detection using a linear discriminant analysis (LDA) applied to mel frequency cepstrum coefficients (MFCC) is presented.
Abstract
In speech recognition, speech/non-speech detection must be robust to noise. This paper presents a method for speech/non-speech detection using linear discriminant analysis (LDA) applied to mel frequency cepstrum coefficients (MFCC). Energy is the most discriminant parameter between noise and speech, but with this single parameter the speech/non-speech detection system detects too many noise segments. The LDA applied to MFCC, together with the associated test, reduces the detection of noise segments. The new algorithm is compared to one based on signal-to-noise ratio (Mauuary and Monne, 1993).
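As a rough illustration of the idea in the abstract, the sketch below fits a two-class Fisher LDA to labelled feature vectors and classifies a frame by thresholding its projection. It is not the authors' implementation: the three-dimensional synthetic vectors merely stand in for MFCC frames, and the names `fit_lda` and `is_speech`, the midpoint threshold, and the class centers are all assumptions made for the example.

```python
import random


def mean(rows):
    """Column-wise mean of a list of equal-length feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def scatter(rows, mu):
    """Scatter matrix: sum over rows of (x - mu)(x - mu)^T."""
    d = len(mu)
    s = [[0.0] * d for _ in range(d)]
    for r in rows:
        diff = [x - m for x, m in zip(r, mu)]
        for i in range(d):
            for j in range(d):
                s[i][j] += diff[i] * diff[j]
    return s


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_lda(noise, speech):
    """Fisher direction w = Sw^-1 (mu_speech - mu_noise) and a midpoint threshold."""
    mu0, mu1 = mean(noise), mean(speech)
    s0, s1 = scatter(noise, mu0), scatter(speech, mu1)
    sw = [[a + b for a, b in zip(r0, r1)] for r0, r1 in zip(s0, s1)]
    w = solve(sw, [b - a for a, b in zip(mu0, mu1)])
    project = lambda v: sum(wi * vi for wi, vi in zip(w, v))
    # Assumption: decide at the midpoint of the projected class means.
    threshold = 0.5 * (project(mu0) + project(mu1))
    return w, threshold


def is_speech(frame, w, threshold):
    """Classify one feature vector by its projection onto w."""
    return sum(wi * xi for wi, xi in zip(w, frame)) > threshold


# Synthetic stand-ins for MFCC frames (hypothetical class centers).
rng = random.Random(0)


def frames(center, n=200):
    return [[rng.gauss(c, 1.0) for c in center] for _ in range(n)]


noise = frames([0.0, 0.0, 0.0])
speech = frames([3.0, 1.0, -1.0])
w, threshold = fit_lda(noise, speech)
```

In practice the feature vectors would come from an MFCC front end, and the decision threshold would be tuned on held-out noisy data rather than fixed at the midpoint.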

Citations
Proceedings Article

Mahimahi: accurate record-and-replay for HTTP

TL;DR: Mahimahi is a framework to record traffic from HTTP-based applications, and later replay it under emulated network conditions, designed as a set of composable shells, providing ease-of-use and extensibility.
Proceedings Article

Wishbone: profile-based partitioning for sensornet applications

TL;DR: Wishbone is a system that takes a dataflow graph of operators and produces an optimal partitioning, which shows that the system can quickly identify good trade-offs given limitations in CPU and network capacity.
Proceedings Article

Robust Speech/Non-Speech Detection using LDA applied to MFCC for Continuous Speech Recognition

TL;DR: A method for speech/non-speech detection using a linear discriminant analysis (LDA) applied to mel frequency cepstrum coefficients (MFCC) is presented, which reduces the detection of noise segments.
Patent

Voice activity detection system and method

TL;DR: In this paper, a set of frames containing an input signal is received, and at least two different feature vectors are determined for each of said frames by applying at least one weighting factor to each feature vector.
Proceedings Article

Real-time speaker identification.

TL;DR: Vector quantization (VQ) based speaker identification is optimized: the number of test vectors is reduced by pre-quantizing the test sequence prior to matching, and the number of speakers is reduced by pruning out unlikely speakers during the identification process.
References
Proceedings Article

A novel approach to robust speech endpoint detection in car environments

TL;DR: A novel approach is proposed that finds robust features for endpoint detection in a noisy in-car environment by integrating the widely used energy and entropy into a new feature that keeps the advantages of each while compensating for the drawbacks of the other.
Proceedings Article

Speech/non-speech classification using multiple features for robust endpoint detection

TL;DR: A new speech/non-speech classification method is proposed that improves endpoint detection performance for speech recognition in noisy environments; the classification and regression tree (CART) technique is applied to combine multiple features effectively for the classification of each frame.
Journal Article

Towards improving ASR robustness for PSN and GSM telephone applications

TL;DR: The results obtained prove that HMM adaptation and preprocessing techniques can be advantageously combined to improve Automatic Speech Recognition (ASR) robustness and show that spectral subtraction improves speech detection under noisy GSM conditions.
Journal Article

Evaluation of a statistical approach to voiced-unvoiced-silence analysis for telephone-quality speech

TL;DR: An investigation was undertaken to determine a suitable set of parameters that would provide a reliable voiced-unvoiced-silence decision across a variety of standard telephone connections, and the use of the Itakura two-pole spectral normalization was investigated to see its effect on the error scores.
Proceedings Article

Prosodic word boundary detection using statistical modeling of moraic fundamental frequency contours and its use for continuous speech recognition

TL;DR: A new method for prosodic word boundary detection in continuous speech was developed based on the statistical modeling of moraic transitions of fundamental frequency (F0) contours, formerly proposed by the authors.