Cepstrum

About: Cepstrum is a research topic. Over its lifetime, 3,346 publications have been published within this topic, receiving 55,742 citations.


Papers
Proceedings Article
03 Apr 1990
TL;DR: It is shown that bicepstrum-based time delay estimation provides a larger performance gain when the noise sources are Gaussian and spatially correlated with an unknown correlation function.
Abstract: The authors introduce a novel approach for bicepstrum (i.e., the cepstrum of the bispectrum) estimation that combines the use of second- and third-order statistics, and apply this method to the problem of detecting and estimating the time delay between signals received in noise at two spatially separated sensors, as well as to nonminimum phase system identification. The approach for time delay estimation is parametric, in the sense that it provides the time delay estimates directly and explicitly instead of examining the results of some reference functions. The performance of this method is demonstrated for different noise conditions and lengths of data. The performance is also compared to that of the least-squares method, which is based entirely on second-order statistics. It is shown that bicepstrum-based time delay estimation provides a larger performance gain when the noise sources are Gaussian and spatially correlated with an unknown correlation function.
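
For orientation, the two-sensor delay problem described above can be sketched with plain cross-correlation, a purely second-order technique. This is not the bicepstrum method of the paper, nor its least-squares baseline. The signal model, the noise level, and the function name `estimate_delay` are assumptions made for illustration, and the noise here is independent at each sensor, so the spatially correlated case highlighted in the abstract is not modeled.

```python
import random


def estimate_delay(x, y, max_lag):
    """Return the lag d (in samples) that maximizes sum_n x[n] * y[n + d]."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for n, xn in enumerate(x):
            m = n + lag
            if 0 <= m < len(y):
                score += xn * y[m]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical test signal: a white Gaussian source seen by two sensors,
# the second one delayed by `true_delay` samples, each with independent noise.
rng = random.Random(0)
n_samples = 400
true_delay = 7
source = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
sensor_1 = [s + 0.1 * rng.gauss(0.0, 1.0) for s in source]
sensor_2 = [
    (source[n - true_delay] if n >= true_delay else 0.0) + 0.1 * rng.gauss(0.0, 1.0)
    for n in range(n_samples)
]

estimated_delay = estimate_delay(sensor_1, sensor_2, max_lag=20)
```

Because the estimate is read off the peak of a reference function (the cross-correlation), this sketch is an example of the non-parametric style that the abstract contrasts with its direct, parametric estimate.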

16 citations

Journal Article
TL;DR: The effectiveness of local vector transform (LVT) was examined in parameter determination for synthesized and naturally uttered speech signals, suggesting that the method is effective for analyzing time-varying voiced speech signals.
Abstract: A voiced speech signal can be expressed as a sum of sinusoidal components whose instantaneous frequency and amplitude vary continuously with time. When these parameters are determined from the input, their time-varying character is a crucial source of error for algorithms that assume stationarity within a local analysis segment. To overcome this problem, a new method is proposed, the local vector transform (LVT), which can determine instantaneous frequency and amplitude for nonstationary sinusoids. The method does not assume local stationarity. The effectiveness of LVT was examined in parameter determination for synthesized and naturally uttered speech signals. The instantaneous frequency of the first harmonic component was determined with an accuracy almost equal to that of the time-corrected instantaneous frequency method and higher than that of spectral peak-picking, autocorrelation, and cepstrum. The instantaneous amplitude was also determined accurately by LVT, while considerable errors remained in the other algorithms. The signal reconstructed from the parameters determined by LVT agreed well with the corresponding component of voiced speech. These results suggest that the method is effective for analyzing time-varying voiced speech signals.
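
The abstract above lists the cepstrum among the baseline methods for estimating the fundamental frequency. A minimal sketch of that baseline (not of LVT) follows: take the real cepstrum of one frame, then pick the strongest peak in a plausible pitch range. It assumes the frame is stationary, which is exactly the assumption the abstract criticizes. The synthetic signal, the search range, and all function names are illustrative assumptions.

```python
import cmath
import math
import random


def fft(a):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even, odd = fft(a[0::2]), fft(a[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def real_cepstrum(frame):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    n = len(frame)
    windowed = [
        s * (0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)))  # Hann window
        for i, s in enumerate(frame)
    ]
    log_mag = [math.log(abs(v) + 1e-12) for v in fft(windowed)]
    # log_mag is real and even, so the inverse DFT equals the forward DFT / n.
    return [v.real / n for v in fft(log_mag)]


def cepstral_pitch(frame, fs, f_min=60.0, f_max=400.0):
    """Return f0 in Hz from the largest cepstral peak between fs/f_max and fs/f_min."""
    ceps = real_cepstrum(frame)
    q_lo, q_hi = int(fs / f_max), int(fs / f_min)
    period = max(range(q_lo, q_hi + 1), key=lambda q: ceps[q])
    return fs / period


# Hypothetical stationary test frame: 39 harmonics of 100 Hz plus weak noise.
fs, f0, n = 8000, 100.0, 1024
rng = random.Random(0)
frame = [
    sum(math.cos(2 * math.pi * k * f0 * i / fs) / k for k in range(1, 40))
    + 0.01 * rng.gauss(0.0, 1.0)
    for i in range(n)
]
estimated_f0 = cepstral_pitch(frame, fs)
```

The estimate is quantized to integer sample periods, so its resolution is limited to `fs / period` steps, and a single value is returned for the whole frame rather than an instantaneous frequency.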

16 citations

Proceedings Article
22 May 2011
TL;DR: A model for estimating vocal tract variable (TV) trajectories trained on natural speech and a Dynamic Bayesian Network (DBN) based speech recognition architecture that treats vocal tract constriction gestures as hidden variables are proposed, eliminating the necessity for explicit gesture recognition.
Abstract: Previously we have proposed different models for estimating articulatory gestures and vocal tract variable (TV) trajectories from synthetic speech. We have shown that when deployed on natural speech, such models can help to improve the noise robustness of a hidden Markov model (HMM) based speech recognition system. In this paper we propose a model for estimating TVs trained on natural speech and present a Dynamic Bayesian Network (DBN) based speech recognition architecture that treats vocal tract constriction gestures as hidden variables, eliminating the necessity for explicit gesture recognition. Using the proposed architecture we performed a word recognition task for the noisy data of Aurora-2. Significant improvement was observed in using the gestural information as hidden variables in a DBN architecture over using only the mel-frequency cepstral coefficient based HMM or DBN backend. We also compare our results with other noise-robust front ends.

16 citations

Book Chapter
01 Jan 2014
TL;DR: The goal and novelty of this work was the analysis of applicability of the parameters selectively used to assess the pathology.
Abstract: Current developments in digital recording and in methods for processing recorded voice are useful for detecting most pathologies and diseases of the human vocal tract. Recognizing the condition of a voice requires a model composed of different acoustic parameters of the speech signal. In this study, a vector of 31 parameters for analysing the speech signal was created. The speech parameters were extracted from the time, frequency and cepstral domains. Using Principal Component Analysis, the number of parameters was reduced to 17. To validate the detection of pathological voice signals, tenfold cross-validation and a confusion matrix were used. The goal and novelty of this work was the analysis of the applicability of the parameters selectively used to assess the pathology.
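
The validation step mentioned above (tenfold cross-validation summarized in a confusion matrix) can be sketched as follows. The abstract does not name the classifier or the data, so the nearest-centroid classifier, the synthetic 17-dimensional features, and every function name here are assumptions for illustration. The PCA reduction from 31 to 17 parameters is not implemented.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Shuffle sample indices and split them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def nearest_centroid_predict(train_x, train_y, x):
    """Predict the label whose class mean is closest to x (squared Euclidean)."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_x, train_y) if y == label]
        centroid = [sum(col) / len(rows) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def cross_validated_confusion(features, labels, k=10):
    """Accumulate a 2x2 matrix [[TN, FP], [FN, TP]] over k held-out folds."""
    matrix = [[0, 0], [0, 0]]
    for fold in k_fold_indices(len(features), k):
        held_out = set(fold)
        train_x = [f for i, f in enumerate(features) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        for i in fold:
            predicted = nearest_centroid_predict(train_x, train_y, features[i])
            matrix[labels[i]][predicted] += 1  # row = true label, column = prediction
    return matrix


# Synthetic stand-in data: 17 features per sample; label 0 = healthy, 1 = pathological.
rng = random.Random(1)
labels = [i % 2 for i in range(100)]
features = [[rng.gauss(2.0 * y, 1.0) for _ in range(17)] for y in labels]

confusion = cross_validated_confusion(features, labels)
accuracy = (confusion[0][0] + confusion[1][1]) / len(labels)
```

Each sample is predicted exactly once, by a model that never saw it during training, so the four entries of `confusion` sum to the number of samples.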

16 citations

Book Chapter
05 Jun 2009
TL;DR: This work presents the development of an automatic infant cry recognition system whose objective is to classify two types of cry, normal cry and pathological cry from deaf babies, using acoustic characteristics obtained by the Mel-Frequency Cepstrum and Linear Prediction Coding techniques.
Abstract: This work presents the development of an automatic infant cry recognition system whose objective is to classify two types of cry: normal cry and pathological cry from deaf babies. In this study, we used acoustic characteristics obtained by the Mel-Frequency Cepstrum and Linear Prediction Coding techniques and, as a classifier, a feed-forward neural network trained with several learning methods, of which the Scaled Conjugate Gradient algorithm performed best. The results so far are very encouraging, with an accuracy of up to 97.43%.

16 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Robustness (computer science): 94.7K papers, 1.6M citations, 80% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Deep learning: 79.8K papers, 2.1M citations, 79% related
Support vector machine: 73.6K papers, 1.7M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    86
2022    206
2021    60
2020    96
2019    135
2018    130