Topic

Cepstrum

About: Cepstrum is a research topic. Over the lifetime, 3,346 publications have been published within this topic, receiving 55,742 citations.


Papers
Proceedings ArticleDOI
Mitchell McLaren, Victor Abrash, Martin Graciarena, Yun Lei, Jan Pesan
25 Aug 2013
TL;DR: It was found that robustness to compressed speech was marginally improved by exposing PLDA to noisy and reverberant speech, with little improvement from training PLDA on transcoded speech based on codecs mismatched to the evaluation conditions.
Abstract: The goal of this paper is to analyze the impact of codec-degraded speech on a state-of-the-art speaker recognition system and propose mitigation techniques. Several acoustic features are analyzed, including the standard Mel filterbank cepstral coefficients (MFCC), as well as the noise-robust medium duration modulation cepstrum (MDMC) and power normalized cepstral coefficients (PNCC), to determine whether robustness to noise generalizes to audio compression. Using a speaker recognition system based on i-vectors and probabilistic linear discriminant analysis (PLDA), we compared four PLDA training scenarios: the first involves training PLDA on clean data, the second adds noisy and reverberant speech, the third introduces transcoded data matched to the evaluation conditions, and the fourth uses codec-degraded speech mismatched to the evaluation conditions. We found that robustness to compressed speech was marginally improved by exposing PLDA to noisy and reverberant speech, with little improvement from transcoded speech in PLDA based on codecs mismatched to the evaluation conditions. Noise-robust features offered a degree of robustness to compressed speech, while more significant improvements occurred when PLDA had observed the codec matching the evaluation conditions. Finally, we tested i-vector fusion of the different features, which increased overall system performance but did not improve robustness to codec-degraded speech. Index Terms: speaker recognition, speech coding, codec degradation, speaker verification.
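As a rough illustration of the Mel filterbank cepstral front end referenced above, the sketch below extracts MFCCs with per-utterance cepstral mean and variance normalization. The sample rate, frame settings, and coefficient count are assumptions chosen for telephone-band speech, not the authors' exact configuration.

```python
# Minimal MFCC front-end sketch (assumed parameters; not the paper's exact setup).
# Requires: pip install librosa
import librosa

def mfcc_features(wav_path, sr=8000, n_mfcc=20):
    """Return mean/variance-normalized MFCCs, shape (n_mfcc, n_frames)."""
    y, sr = librosa.load(wav_path, sr=sr)            # resample to telephone rate
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc,
                                n_fft=512, hop_length=80, win_length=200)
    # Per-utterance cepstral mean and variance normalization (CMVN)
    mfcc = (mfcc - mfcc.mean(axis=1, keepdims=True)) / \
           (mfcc.std(axis=1, keepdims=True) + 1e-8)
    return mfcc
```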

19 citations

Proceedings ArticleDOI
22 May 2011
TL;DR: This paper proposes the use of acoustic voice source features extracted directly from the speech spectrum (or cepstrum) for cognitive load classification, along with pre- and post-processing techniques to improve the estimation of the cepstral peak prominence (CPP).
Abstract: Previous work in speech-based cognitive load classification has shown that the glottal source contains important information for cognitive load discrimination. However, the reliability of glottal flow features depends on the accuracy of the glottal flow estimation, which is a non-trivial process. In this paper, we propose the use of acoustic voice source features extracted directly from the speech spectrum (or cepstrum) for cognitive load classification. We also propose pre- and post-processing techniques to improve the estimation of the cepstral peak prominence (CPP). Three-class classification results on two databases showed CPP to be a promising cognitive load classification feature that outperforms glottal flow features. Score-level fusion of the CPP-based classification system with a formant frequency-based system yielded a final improved accuracy of 62.7%, suggesting that CPP contains useful voice source information that complements the information captured by vocal tract features.
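For context, cepstral peak prominence is commonly computed as the height of the dominant rahmonic peak above a regression line fitted to the cepstrum. The sketch below follows that generic (Hillenbrand-style) definition for a single frame; the pitch search range and the fitting span are assumptions, not the pre- and post-processing proposed in the paper.

```python
# Rough single-frame CPP sketch (generic definition; parameters are assumptions).
import numpy as np

def cepstral_peak_prominence(frame, sr, f0_min=60.0, f0_max=300.0):
    # Frame should span a few pitch periods, e.g. 40 ms or more.
    frame = frame * np.hamming(len(frame))
    spectrum_db = 20 * np.log10(np.abs(np.fft.rfft(frame)) + 1e-12)
    cepstrum = np.fft.irfft(spectrum_db)              # real cepstrum of log spectrum
    quefrency = np.arange(len(cepstrum)) / sr          # lag in seconds

    # Find the rahmonic peak in the plausible pitch-period range
    lo, hi = int(sr / f0_max), int(sr / f0_min)
    peak_idx = lo + np.argmax(cepstrum[lo:hi])

    # Linear trend over the searched range; CPP = peak height above the trend
    coeffs = np.polyfit(quefrency[lo:hi], cepstrum[lo:hi], 1)
    return cepstrum[peak_idx] - np.polyval(coeffs, quefrency[peak_idx])
```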

19 citations

Journal ArticleDOI
TL;DR: The experimental results demonstrate that the proposed method surpasses most previous studies in terms of classification accuracy and establishes the applicability and efficacy of cepstrum-based features in classifying sEMG signals of hand movements.
Abstract: It is of great importance to effectively process and interpret surface electromyogram (sEMG) signals to actuate a robotic and prosthetic exoskeleton hand needed by hand amputees. In this paper, we propose a cepstrum analysis-based method for classification of basic hand movement sEMG signals. Cepstral analysis, a technique primarily used for analyzing acoustic and seismological signals, is exploited to extract features from time-domain sEMG signals by computing mel-frequency cepstral coefficients (MFCCs). The extracted feature vector consisting of MFCCs is then fed to a generalized regression neural network (GRNN) to classify basic hand movements. The proposed method has been tested on the sEMG for Basic Hand movements Data Set and achieved an average accuracy of 99.34% for the five individual subjects and an overall mean accuracy of 99.23% for the collective (mixed) dataset. The experimental results demonstrate that the proposed method surpasses most previous studies in terms of classification accuracy. The discrimination ability of the cepstral features exploited in this study is quantified using the Kruskal-Wallis statistical test. As evidenced by the experimental results, this study explores and establishes the applicability and efficacy of cepstrum-based features in classifying sEMG signals of hand movements. Owing to the non-iterative training nature of the artificial neural network adopted in the study, the proposed method does not demand much time to build the model in the training phase.
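A toy sketch of the MFCC-plus-GRNN idea described above is given below. The helper names, the MFCC/STFT settings, and the spread parameter are illustrative assumptions rather than the paper's configuration, and the GRNN is written as the usual non-iterative kernel-weighted voting scheme.

```python
# Sketch: mean MFCC vector per sEMG recording, classified by a simple GRNN
# (all parameter values are assumptions, not the paper's settings).
import numpy as np
import librosa

def emg_mfcc_vector(signal, fs=500, n_mfcc=13):
    """Mean MFCC vector for one sEMG recording, treated as a 1-D signal."""
    mfcc = librosa.feature.mfcc(y=signal.astype(float), sr=fs, n_mfcc=n_mfcc,
                                n_fft=128, hop_length=32, n_mels=20, fmax=fs / 2)
    return mfcc.mean(axis=1)

class GRNNClassifier:
    """Generalized regression NN: non-iterative, RBF-weighted class voting."""
    def __init__(self, spread=0.5):
        self.spread = spread

    def fit(self, X, y):
        self.X, self.y = np.asarray(X), np.asarray(y)
        self.classes = np.unique(self.y)
        return self

    def predict(self, X):
        d2 = ((np.asarray(X)[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * self.spread ** 2))        # pattern-layer weights
        votes = np.stack([(w * (self.y == c)).sum(1) for c in self.classes], axis=1)
        return self.classes[np.argmax(votes, axis=1)]
```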

19 citations

Proceedings ArticleDOI
07 May 1996
TL;DR: It was shown that both LPCC and MFCC are effective representations: for a smaller number of parameters, the LPCC representation performs better, but it is surpassed by MFCC when the analysis order is larger.
Abstract: A large number of parameters, including pitch, LPCC, ΔLPCC, PARCOR, MFCC, ΔMFCC, and the residual cepstrum (RCEP), were extracted from speech signals and their effectiveness for text-independent speaker identification was evaluated. In addition, the usefulness of two signal processing techniques, preemphasis and cepstral weighting, was also studied. A VQ-based speaker recognition method with codebooks fine-tuned by the LVQ algorithm was used. It was shown that both LPCC and MFCC are effective representations: for a smaller number of parameters, the LPCC representation performs better, but it is surpassed by MFCC when the analysis order is larger. Pitch is an independent parameter, so it can be used jointly with other spectral features. In an evaluation experiment, the correct identification rate for 112 male speakers with test utterances of less than one second reached 98.2%.
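The LPC-derived cepstrum (LPCC) mentioned above is conventionally obtained from the prediction coefficients by a short recursion rather than an explicit FFT. A minimal sketch follows; the analysis order, cepstral order, and sign convention handling are assumptions, not this paper's settings.

```python
# LPCC via the standard LPC-to-cepstrum recursion (Rabiner & Juang style).
import numpy as np
import librosa

def lpcc(frame, order=12, n_ceps=16):
    # librosa.lpc returns [1, -a_1, ..., -a_p] for the model 1 - sum_k a_k z^{-k},
    # so flip the sign to recover a_1 ... a_p.
    a = -librosa.lpc(frame.astype(float), order=order)[1:]
    c = np.zeros(n_ceps + 1)                     # c[1..n_ceps] are used
    for n in range(1, n_ceps + 1):
        acc = sum((k / n) * c[k] * a[n - k - 1]
                  for k in range(max(1, n - order), n))
        c[n] = (a[n - 1] if n <= order else 0.0) + acc
    return c[1:]
```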

19 citations

Patent
27 Mar 2018
TL;DR: In this paper, a voice enhancement method based on multiresolution auditory cepstral features and a deep convolutional neural network is proposed. It consists of three main steps: establishing new characteristic parameters, namely MR-GFCC, capable of distinguishing voice from noise; establishing a self-adaptive masking threshold based on the ideal ratio mask (IRM) and ideal binary mask (IBM) according to noise variations; and training a seven-layer network with the newly extracted characteristic parameters and their first/second derivatives as input and the self-adaptive masking threshold as output.
Abstract: The invention discloses a voice enhancement method based on multiresolution auditory cepstral features and a deep convolutional neural network. The method comprises the following steps: firstly, establishing new characteristic parameters, namely the multiresolution auditory cepstral coefficients (MR-GFCC), capable of distinguishing voice from noise; secondly, establishing a self-adaptive masking threshold based on the ideal ratio mask (IRM) and ideal binary mask (IBM) according to noise variations; then training an established seven-layer neural network, using the newly extracted characteristic parameters and their first/second derivatives as input and the self-adaptive masking threshold as output of the deep convolutional neural network (DCNN); and finally enhancing the noisy speech using the self-adaptive masking threshold estimated by the DCNN. The method makes full use of the working mechanism of the human ear: speech characteristic parameters simulating the human auditory physiological model are designed, so that not only is a relatively large amount of speech information retained, but the extraction process is also simple and feasible.
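To illustrate the masking idea the abstract relies on, the sketch below shows generic IRM/IBM training targets and how an estimated time-frequency mask is applied to noisy speech. The STFT settings and mask definitions are common textbook choices, not the patent's adaptive threshold or MR-GFCC front end.

```python
# Generic mask-based enhancement sketch (assumed STFT settings and mask definitions).
import numpy as np
import librosa

def training_targets(clean, noise, n_fft=512, hop=128):
    """Ideal ratio mask (IRM) and ideal binary mask (IBM) from a clean/noise pair."""
    S = np.abs(librosa.stft(clean, n_fft=n_fft, hop_length=hop)) ** 2
    N = np.abs(librosa.stft(noise, n_fft=n_fft, hop_length=hop)) ** 2
    irm = np.sqrt(S / (S + N + 1e-12))        # soft target in [0, 1]
    ibm = (S > N).astype(float)               # hard 0/1 target
    return irm, ibm

def apply_mask(noisy, mask, n_fft=512, hop=128):
    """Apply an estimated mask (e.g., a network's output) to the noisy STFT."""
    Y = librosa.stft(noisy, n_fft=n_fft, hop_length=hop)
    return librosa.istft(Y * mask, hop_length=hop, length=len(noisy))
```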

19 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Robustness (computer science): 94.7K papers, 1.6M citations, 80% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Deep learning: 79.8K papers, 2.1M citations, 79% related
Support vector machine: 73.6K papers, 1.7M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    86
2022    206
2021    60
2020    96
2019    135
2018    130