Topic

Cepstrum

About: Cepstrum is a research topic. Over its lifetime, 3,346 publications have been published within this topic, receiving 55,742 citations.


Papers
Proceedings Article
22 Aug 1999
TL;DR: Intensive experimental results showed that the proposed technique outperforms both the autocorrelation and cepstral based techniques.
Abstract: In speech processing, methods based on glottal closure instances (GCI) for pitch period estimation have proven to give good results. We propose here a new method to estimate the GCIs based on the Teager energy function (TEF). The technique is simple and has a low computational load. The method is based on the peak detection of the TEF for each frame. A smoothing technique is then implemented to pick the right value for the pitch period. Intensive experimental results showed that the proposed technique outperforms both the autocorrelation and cepstral based techniques.
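The abstract above does not spell out the Teager energy function, so the following is only a minimal sketch, assuming the standard discrete Teager-Kaiser form psi[n] = x[n]^2 - x[n-1]*x[n+1]. The helper names `teager_energy` and `frame_peak_index` are hypothetical, and the paper's smoothing stage is not described in enough detail to reproduce, so it is left out.

```python
import math

def teager_energy(x):
    """Discrete Teager-Kaiser energy: psi[n] = x[n]^2 - x[n-1] * x[n+1].

    The two edge samples lack a neighbour on one side and are set to 0.
    """
    psi = [0.0] * len(x)
    for n in range(1, len(x) - 1):
        psi[n] = x[n] * x[n] - x[n - 1] * x[n + 1]
    return psi

def frame_peak_index(frame):
    """Index of the largest Teager-energy sample in one frame.

    Assumption: this peak is treated as a GCI candidate for the frame.
    """
    psi = teager_energy(frame)
    return max(range(len(psi)), key=psi.__getitem__)

# Sanity check on a pure tone A*cos(w*n): the operator output is
# constant and equal to A^2 * sin(w)^2 away from the edges.
amplitude, omega = 2.0, 0.3
tone = [amplitude * math.cos(omega * n) for n in range(200)]
tone_energy = teager_energy(tone)

# A single pulse inside an otherwise silent frame is picked as the peak.
pulse_frame = [0.0] * 50
pulse_frame[20] = 1.0
pulse_index = frame_peak_index(pulse_frame)
```

In a pitch estimator along the lines of the abstract, the distance between peaks picked in successive frames would give a raw pitch-period estimate that a smoothing step then cleans up.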

18 citations

Proceedings Article
21 Aug 2000
TL;DR: Text-independent and text-dependent speaker recognition systems suitable for verification and identification (open set and closed set) are presented, and a limited-vocabulary recognition system is developed using the vowel phonemes in the vocabulary.
Abstract: Speaker recognition systems attempt to recognize a speaker by his/her voice through measurements of the individual characteristics arising in the speaker's voice. Among transformations of LPC parameters, the adaptive component weighted (ACW) cepstrum has been shown to be less susceptible to channel effects than others. Text-independent and text-dependent speaker recognition systems suitable for verification and identification (open set and closed set) are presented. The system is based on locating the vowel phonemes of the test utterance. Preprocessing is applied to the speech signal. The centers of the vowel phonemes are located and identified as speech events using a three-step vowel phoneme locating process. The steps of the locating process are: (1) average magnitude function calculation; (2) vowel phoneme candidate location; and (3) ripple rejection. For each vowel phoneme (20 ms), 10 ACW cepstrum coefficients are calculated and used as inputs to neural networks, and the outputs are accumulated and averaged. The system hardware requirements are a microphone and a sound card. The system software is written in C++ for Windows. The system was tested with a population of 10 speakers (7 male and 3 female), and the statistics were taken (95.67% for text-dependent verification, 93% for text-dependent identification, 92.2% for text-independent verification and 88.95% for text-independent identification). These tests were done with utterances of one word having one vowel phoneme (20 ms used for recognizing the speaker). A vowel phoneme recognition application is also presented. A limited-vocabulary recognition system is developed using the vowel phonemes in the limited vocabulary. The feature vector calculation is the same as in the speaker recognition system; the only difference is in the neural network training and size (97.5% word recognition).

18 citations

Proceedings Article
01 Dec 2012
TL;DR: Speech-signal-based frequency cepstral coefficients (SFCC) are introduced in the speaker recognition domain, and a combination of the filter banks of both MFCC and SFCC is proposed for text-independent speaker identification.
Abstract: Over the past decade, the mel-frequency cepstral coefficient (MFCC) has been the most popular feature extraction method in the field of automatic speaker recognition. In robust speaker recognition systems, however, its performance is good for white noise contamination but not as good for other noises. We introduce speech-signal-based frequency cepstral coefficients (SFCC) in the speaker recognition domain. In this method, the frequency warping function is derived directly from the speech signal itself by considering equal-area portions of the logarithm of the ensemble average short-time power spectrum of the entire speech corpus. The speech-signal-based frequency warping function is very similar to the frequency scales obtained through psycho-acoustic experiments, known as the mel scale and the bark scale. We propose using a combination of the filter banks of both MFCC and SFCC in text-independent speaker identification. Speaker identification experiments are performed on the POLY-COST database. The proposed technique gives better performance than single-stream MFCC- or SFCC-based features for a robust speaker identification system.

18 citations

Journal Article
TL;DR: This paper uses a natural generalization of the logarithmic function instead of the logarithmic function and refers to the spectral representation parameter on the "generalized logarithmic" scale as the "generalized cepstrum," which corresponds to the cepstrum on the logarithmic scale.
Abstract: In this paper, we present a generalization of the cepstral method from the viewpoint of spectral smoothing for speech. We use a natural generalization of the logarithmic function instead of the logarithmic function and we refer to the spectral representation parameter on the "generalized logarithmic" scale as the "generalized cepstrum," which corresponds to the cepstrum on the logarithmic scale. A number of properties of the generalized cepstrum are shown, and are compared to the cepstrum.
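The abstract does not state which generalization of the logarithm is used. As an illustration only, the sketch below assumes the form commonly seen in the generalized-cepstrum literature, s_gamma(w) = (w^gamma - 1) / gamma for gamma != 0 and log(w) for gamma = 0. The function name `generalized_log` and the parameter name `gamma` are assumptions, not taken from the paper.

```python
import math

def generalized_log(w, gamma):
    """Generalized logarithm of a positive value w.

    Assumed form: (w**gamma - 1) / gamma for gamma != 0,
    and the natural logarithm for gamma == 0.
    """
    if w <= 0:
        raise ValueError("w must be positive")
    if gamma == 0:
        return math.log(w)
    return (w ** gamma - 1.0) / gamma

# gamma = 0 gives the ordinary logarithmic scale (ordinary cepstrum).
log_scale = generalized_log(2.0, 0)
# A small gamma stays close to the logarithm.
near_log_scale = generalized_log(2.0, 1e-6)
# gamma = -1 gives 1 - 1/w, and gamma = 1 gives w - 1.
inverse_scale = generalized_log(2.0, -1)
linear_scale = generalized_log(2.0, 1)
```

Under this assumed form, sweeping gamma from 0 toward -1 or 1 moves the spectral representation smoothly away from the logarithmic scale, which is how the ordinary cepstrum shows up as one special case.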

18 citations

Patent
23 Jul 2002
TL;DR: In this article, an LSP decoding section extracts and decodes only LSP information from coded speech data, which is read for each block, and a Cepstrum conversion section converts the obtained LPC information into an LPC Cepstrum which represents features of speech.
Abstract: A process of identifying a speaker in coded speech data and a process of searching for the speaker are efficiently performed with fewer computations and with a smaller storage capacity. In an information search apparatus, an LSP decoding section extracts and decodes only LSP information from coded speech data which is read for each block. An LPC conversion section converts the LSP information into LPC information. A Cepstrum conversion section converts the obtained LPC information into an LPC Cepstrum which represents features of speech. A vector quantization section performs vector quantization on the LPC Cepstrum. A speaker identification section identifies a speaker on the basis of the result of the vector quantization. Furthermore, the identified speaker is compared with a search condition in a condition comparison section, and based on the result, the search result is output.
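The abstract does not say how the Cepstrum conversion section works. One standard way to turn LPC coefficients into an LPC cepstrum is the recursion sketched below. It assumes the all-pole model 1/A(z) with A(z) = 1 + a_1 z^-1 + ... + a_p z^-p (sign conventions vary between texts), and the function name `lpc_to_cepstrum` is hypothetical.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n_ceps of the all-pole filter 1/A(z).

    Assumed convention: a = [a_1, ..., a_p] with
    A(z) = 1 + a_1 z^-1 + ... + a_p z^-p.
    Recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k},
    where a_n is taken as 0 for n > p.
    """
    p = len(a)
    c = [0.0] * (n_ceps + 1)  # c[0] (gain term) is not computed here
    for n in range(1, n_ceps + 1):
        # Only terms with 1 <= n - k <= p contribute to the sum.
        acc = sum(k * c[k] * a[n - k - 1] for k in range(max(1, n - p), n))
        a_n = a[n - 1] if n <= p else 0.0
        c[n] = -a_n - acc / n
    return c[1:]

# Check against a single real pole at z = alpha, where A(z) = 1 - alpha z^-1
# and the cepstrum is known in closed form: c_n = alpha^n / n.
alpha = 0.5
single_pole_cepstrum = lpc_to_cepstrum([-alpha], 5)
```

The resulting vectors are the kind of feature that a vector quantization stage, as described in the abstract, could then match against speaker codebooks.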

18 citations


Network Information
Related Topics (5)
Feature extraction
111.8K papers, 2.1M citations
82% related
Robustness (computer science)
94.7K papers, 1.6M citations
80% related
Feature (computer vision)
128.2K papers, 1.7M citations
79% related
Deep learning
79.8K papers, 2.1M citations
79% related
Support vector machine
73.6K papers, 1.7M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  86
2022  206
2021  60
2020  96
2019  135
2018  130