Cepstrum

About: Cepstrum is a research topic. Over its lifetime, 3,346 publications have appeared within this topic, receiving 55,742 citations.


Papers
Journal Article
TL;DR: The cepstrum was found to be the most effective, providing an identification accuracy of 70% for speech 50 msec in duration, which increased to more than 98% for a duration of 0.5 sec.
Abstract: Several different parametric representations of speech derived from the linear prediction model are examined for their effectiveness for automatic recognition of speakers from their voices. Twelve predictor coefficients were determined approximately once every 50 msec from speech sampled at 10 kHz. The predictor coefficients and other speech parameters derived from them, such as the impulse response function, the autocorrelation function, the area function, and the cepstrum function were used as input to an automatic speaker‐recognition system. The speech data consisted of 60 utterances, consisting of six repetitions of the same sentence spoken by 10 speakers. The identification decision was based on the distance of the test sample vector from the reference vector for different speakers in the population; the speaker corresponding to the reference vector with the smallest distance was judged to be the unknown speaker. In verification, the speaker was verified if the distance between the test sample vector and the reference vector for the claimed speaker was less than a fixed threshold. Among all the parameters investigated, the cepstrum was found to be the most effective, providing an identification accuracy of 70% for speech 50 msec in duration, which increased to more than 98% for a duration of 0.5 sec. Using the same speech data, the verification accuracy was found to be approximately 83% for a duration of 50 msec, increasing to 98% for a duration of 1 sec. In a separate study to determine the feasibility of text‐independent speaker identification, an identification accuracy of 93% was achieved for speech 2 sec in duration even though the texts of the test and reference samples were different.

984 citations
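
The decision rules described in the abstract above (nearest reference vector for identification, a fixed distance threshold for verification) can be sketched as follows. This is a minimal illustration, not the paper's system: the abstract does not name the distance measure, so plain Euclidean distance is assumed here, and the function names `identify` and `verify` are hypothetical.

```python
import numpy as np

def identify(test_vec, references):
    """Return the speaker whose reference vector is closest to test_vec.

    references: dict mapping speaker name -> reference feature vector.
    Euclidean distance is an assumption; the abstract does not specify it.
    """
    test_vec = np.asarray(test_vec, dtype=float)
    distances = {
        name: float(np.linalg.norm(test_vec - np.asarray(ref, dtype=float)))
        for name, ref in references.items()
    }
    best = min(distances, key=distances.get)
    return best, distances

def verify(test_vec, claimed_ref, threshold):
    """Accept the claimed identity if the distance is below a fixed threshold."""
    diff = np.asarray(test_vec, dtype=float) - np.asarray(claimed_ref, dtype=float)
    return float(np.linalg.norm(diff)) < threshold
```

In practice the feature vectors would be cepstrum coefficients averaged over the test segment, and the threshold would be tuned on held-out data.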

Journal Article
TL;DR: Algorithms were developed heuristically for picking those peaks corresponding to voiced‐speech segments and the vocal pitch periods, which were then used to derive the excitation for a computer‐simulated channel vocoder.
Abstract: The cepstrum, defined as the power spectrum of the logarithm of the power spectrum, has a strong peak corresponding to the pitch period of the voiced‐speech segment being analyzed. Cepstra were calculated on a digital computer and were automatically plotted on microfilm. Algorithms were developed heuristically for picking those peaks corresponding to voiced‐speech segments and the vocal pitch periods. This information was then used to derive the excitation for a computer‐simulated channel vocoder. The pitch quality of the vocoded speech was judged by experienced listeners in informal comparison tests to be indistinguishable from the original speech.

851 citations
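
The definition quoted in the abstract above (the power spectrum of the logarithm of the power spectrum) and the idea of reading the pitch period from the cepstral peak can be sketched as below. This is only an illustration under stated assumptions: the Hamming window, the 60-400 Hz search range, and the function names are choices made here, and a simple `argmax` stands in for the heuristic peak-picking algorithms the paper developed.

```python
import numpy as np

def power_cepstrum(frame):
    """Squared transform of the log power spectrum of one frame."""
    windowed = frame * np.hamming(len(frame))      # window choice is an assumption
    spectrum = np.fft.rfft(windowed)
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)  # small floor avoids log(0)
    # The log power spectrum of a real signal is real and even, so the
    # inverse transform is real; squaring gives a power-like quantity.
    ceps = np.fft.irfft(log_power)
    return ceps ** 2

def pitch_period(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the pitch period in seconds from the largest cepstral peak."""
    ceps = power_cepstrum(frame)
    lo = int(fs / fmax)   # shortest period considered, in samples
    hi = int(fs / fmin)   # longest period considered, in samples
    peak = lo + int(np.argmax(ceps[lo:hi]))
    return peak / fs
```

Restricting the search to a plausible quefrency range keeps the low-quefrency part of the cepstrum, which reflects the spectral envelope, from being mistaken for a pitch peak.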

Journal Article
TL;DR: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum that is shown to be highly effective in speaker-independent speech recognition.
Abstract: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum. This technique is shown to be highly effective in speaker-independent speech recognition. Spoken utterances are represented by time sequences of cepstrum coefficients and energy. Regression coefficients for these time functions are extracted for every frame over an approximately 50 ms period. Time functions of regression coefficients extracted for cepstrum and energy are combined with time functions of the original cepstrum coefficients, and used with a staggered array DP matching algorithm to compare multiple templates and input speech. Speaker-independent isolated word recognition experiments using a vocabulary of 100 Japanese city names indicate that a recognition error rate of 2.4 percent can be obtained with this method. Using only the original cepstrum coefficients the error rate is 6.2 percent.

812 citations
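
The regression coefficients mentioned in the abstract above are commonly computed as a least-squares slope over a short window of frames. The sketch below assumes that formulation; the window half-width of 2 frames, the edge padding at the sequence boundaries, and the function name are assumptions made here, since the abstract states only that the window spans roughly 50 ms.

```python
import numpy as np

def regression_coefficients(features, half_width=2):
    """First-order regression (slope) of each feature track over time.

    features: array of shape (num_frames, num_coeffs), e.g. cepstrum + energy.
    half_width: frames on each side of the centre frame (assumed value).
    """
    features = np.asarray(features, dtype=float)
    k = np.arange(-half_width, half_width + 1)
    denom = float(np.sum(k ** 2))
    # Repeat the first and last frames so every frame has a full window.
    padded = np.pad(features, ((half_width, half_width), (0, 0)), mode="edge")
    num_frames = features.shape[0]
    deltas = np.zeros_like(features)
    for offset, weight in enumerate(k):
        deltas += weight * padded[offset:offset + num_frames]
    return deltas / denom

def combine(features, half_width=2):
    """Stack instantaneous features with their regression coefficients."""
    features = np.asarray(features, dtype=float)
    return np.hstack([features, regression_coefficients(features, half_width)])
```

The combined vectors would then feed the template-matching stage; the DP matching itself is not sketched here.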

Journal Article
01 Oct 1977
TL;DR: The power, complex, and phase cepstra are shown to be easily related to one another, and the interpretation and processing of data in such areas as speech, seismology, and hydroacoustics is discussed.
Abstract: This paper is a pragmatic tutorial review of the cepstrum literature focusing on data processing. The power, complex, and phase cepstra are shown to be easily related to one another. Problems associated with phase unwrapping, linear phase components, spectrum notching, aliasing, oversampling, and extending the data sequence with zeros are discussed. The advantages and disadvantages of windowing the sampled data sequence, the log spectrum, and the complex cepstrum are presented. The influence of noise upon the data processing procedures is discussed throughout the paper, but is not thoroughly analyzed. The effects of various forms of liftering the cepstrum are described. The results obtained by applying whitening and trend removal techniques to the spectrum prior to the calculation of the cepstrum are discussed. We have attempted to synthesize the results, procedures, and information peculiar to the many fields that are finding cepstrum analysis useful. In particular we discuss the interpretation and processing of data in such areas as speech, seismology, and hydroacoustics. But we must caution the reader that the paper is heavily influenced by our own experiences; specific procedures that have been found useful in one field should not be considered as totally general to other fields. It is hoped that this review will be of value to those familiar with the field and reduce the time required for those wishing to become so.

607 citations
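
As one illustration of the liftering discussed in the abstract above, the sketch below applies a low-quefrency ("low-time") lifter to the real cepstrum and transforms back to obtain a smoothed log-magnitude spectrum. The rectangular lifter shape, the cutoff parameter, and the function names are assumptions made here, not procedures taken from the paper.

```python
import numpy as np

def real_cepstrum(frame):
    """Inverse transform of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)  # floor avoids log(0)
    return np.fft.ifft(log_mag).real

def lowpass_lifter(ceps, cutoff):
    """Keep quefrencies below `cutoff` samples, zero the rest."""
    n = len(ceps)
    lifter = np.zeros(n)
    lifter[:cutoff] = 1.0
    # Mirror the kept range so the liftered cepstrum stays symmetric.
    lifter[n - cutoff + 1:] = 1.0
    return ceps * lifter

def smoothed_log_spectrum(frame, cutoff):
    """Log-magnitude spectrum rebuilt from the low-quefrency cepstrum only."""
    ceps = real_cepstrum(frame)
    return np.fft.fft(lowpass_lifter(ceps, cutoff)).real
```

A small cutoff keeps only the slowly varying spectral envelope, while a larger one lets more fine structure back in.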


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Robustness (computer science): 94.7K papers, 1.6M citations, 80% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Deep learning: 79.8K papers, 2.1M citations, 79% related
Support vector machine: 73.6K papers, 1.7M citations, 78% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    86
2022    206
2021    60
2020    96
2019    135
2018    130