Proceedings Article

Voice/Unvoice detection based on a composite-Gaussian source model of speech

V. Ramamoorthy
- Vol. 5, pp 57-60
TLDR
This detector approximates a maximum-a-posteriori sequence estimator and employs the Viterbi algorithm; the paper reports its performance on real speech.
Abstract
A composite-Gaussian source model for speech was suggested at ICASSP-79. Based upon this model, a voiced/unvoiced detector is derived. This detector is an approximation to a maximum-a-posteriori sequence estimator and employs the Viterbi algorithm. This paper deals with the performance of this detector with real speech.
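The abstract does not give the detector's equations, so the following is only a generic illustration of Viterbi decoding for a two-state voiced/unvoiced sequence, not the paper's method. It assumes zero-mean Gaussian samples whose variance depends on the state and a fixed probability of staying in the same state. The function name `viterbi_vuv` and all parameter values are hypothetical.

```python
import math


def viterbi_vuv(samples, var_voiced=1.0, var_unvoiced=0.05, p_stay=0.95):
    """Return the most probable state path ('V' or 'U') for the samples.

    Assumptions (not taken from the paper): zero-mean Gaussian emissions
    with a state-dependent variance, equal priors, and a symmetric
    transition matrix with self-transition probability p_stay.
    """
    if not samples:
        return []

    states = ("V", "U")
    variances = {"V": var_voiced, "U": var_unvoiced}

    def log_emission(x, var):
        # Log-density of a zero-mean Gaussian with the given variance.
        return -0.5 * (math.log(2.0 * math.pi * var) + x * x / var)

    def log_transition(prev, cur):
        return math.log(p_stay) if prev == cur else math.log(1.0 - p_stay)

    # Initialise with equal prior probability for both states.
    score = {s: math.log(0.5) + log_emission(samples[0], variances[s])
             for s in states}
    backpointers = []

    for x in samples[1:]:
        new_score = {}
        pointers = {}
        for s in states:
            # Pick the predecessor that maximises the accumulated log-score.
            best_prev = max(states, key=lambda p: score[p] + log_transition(p, s))
            new_score[s] = (score[best_prev] + log_transition(best_prev, s)
                            + log_emission(x, variances[s]))
            pointers[s] = best_prev
        score = new_score
        backpointers.append(pointers)

    # Trace back from the best final state.
    state = max(states, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(backpointers):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path
```

With these illustrative settings, a single low-amplitude sample between high-amplitude ones does not flip the decision, because two state switches cost more than the one sample gains. That is the smoothing effect a sequence estimator has over a sample-by-sample classifier.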

Citations
Journal Article

Robust voice activity detection using higher-order statistics in the LPC residual domain

TL;DR: The proposed VAD algorithm combines HOS metrics with second-order measures, such as SNR and LPC prediction error, to classify speech and noise frames and derives a voicing condition for speech frames based on the relation between the skewness and kurtosis of voiced speech.
Journal Article

A robust algorithm for accurate endpointing of speech signals

TL;DR: After an overview of the literature, this paper describes a robust new algorithm for accurate endpointing of speech signals, which uses simple measures based on energy and zero-crossing rate for speech/silence detection.
Journal Article

A spectral autocorrelation method for measurement of the fundamental frequency of noise-corrupted speech

TL;DR: A method is described for measuring the fundamental frequency of a voiced speech signal corrupted by high levels of additive white Gaussian noise, together with voiced/unvoiced classification that makes use of a two-dimensional, nearest-neighbor pattern recognition approach.
Proceedings Article

Neural networks for voiced/unvoiced speech classification

TL;DR: A small neural network performs well on the V/UV problem and is suitable for speech classification on the basis of features that are common and easily computed.
Journal Article

On the higher order distributions of speech signals

TL;DR: It is demonstrated that the third- and fourth-order distributions of the speech signal can be considered as mixtures of, rather than single, spherically invariant (Gaussian) distributions, where the mixing is controlled by the history of the process.
References
Journal Article

Maximum-likelihood sequence estimation of digital sequences in the presence of intersymbol interference

TL;DR: In this paper, a maximum likelihood sequence estimator for a digital pulse-amplitude-modulated sequence in the presence of finite intersymbol interference and white Gaussian noise is developed, which comprises a sampled linear filter, called a whitened matched filter, and a recursive nonlinear processor, called the Viterbi algorithm.
Journal Article

A pattern recognition approach to voiced-unvoiced-silence classification with applications to speech recognition

TL;DR: A pattern recognition approach for deciding whether a given segment of a speech signal should be classified as voiced speech, unvoiced speech, or silence, based on measurements made on the signal, which has been found to provide reliable classification with speech segments as short as 10 ms.
Journal Article

Optimal reception for binary partial response channels

TL;DR: This paper describes an exceptionally simple scheme for binary partial response signal formats of the form a_k ± a_{k-l} (for l ≥ 1 and a_k = ±1) that is not generalizable to multilevel signaling while still retaining its simplicity.
Journal Article

Application of an LPC distance measure to the voiced-unvoiced-silence detection problem

TL;DR: A novel approach to the voiced-unvoiced-silence detection problem is proposed in which a spectral characterization of each of the three classes of signal is obtained during a training session, and an LPC distance measure and an energy distance are nonlinearly combined to make the final discrimination.
Journal Article

A procedure for using pattern classification techniques to obtain a voiced/Unvoiced classifier

TL;DR: In the training procedure, covering and satisfaction were attained on successively larger sets of speakers, and a classifier was obtained which could correctly make the V/UV decision for all of the speakers used in testing, including those not used in the training process.