Topic

Cepstrum

About: Cepstrum is a research topic. Over its lifetime, 3346 publications have been published within this topic, receiving 55742 citations.

Papers

Journal Article

Short‐Time Spectrum and “Cepstrum” Techniques for Vocal‐Pitch Detection

A. Michael Noll

01 Feb 1964-Journal of the Acoustical Society of America

TL;DR: A short‐time spectrum analyzer produces high‐resolution spectra without heterodyning methods or bandpass filter banks, and cepstral techniques based on it appear to be even more reliable and efficient than visual methods for pitch detection.

Abstract: A spectrum analyzer based on a definition of short‐time power spectra has been designed and simulated on a digital computer. The analyzer is primarily intended for use in speech analysis. It has been designed to operate in real time, and to produce high‐resolution spectra without utilizing either heterodyning methods or bandpass filter banks. The logarithm of each consecutive amplitude spectrum thus obtained can be used as the input to a second similar spectrum analyzer. The output of this analyzer is then the “cepstrum” or power spectrum of the logarithm spectrum. The cepstrum of a speech signal has a peak corresponding to the fundamental period for voiced speech but no peak for unvoiced speech. Thus, a cepstrum analyzer can function both as a pitch and as a voiced‐unvoiced detector. Cepstral pitch detection has the important advantages that it is insensitive to phase distortion, and is also resistant to additive noise and amplitude distortion of the speech signal. The method does not require the presence of the fundamental frequency in the speech signal, and will give several separate cepstral peaks if several different pitch periods are present. Cepstral techniques appear to be even more reliable and efficient than visual methods for pitch detection. The short‐time spectrum and cepstrum analyzers described in this paper were simulated by a sampled‐data system on an IBM‐7090 digital computer. The simulation was programmed with the assistance of a special block‐diagram compiler.
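
The procedure this abstract describes (take the logarithm of a short‐time amplitude spectrum, take the power spectrum of that logarithm, then look for a peak at the fundamental period) can be sketched roughly as follows. This is an illustrative sketch only, not Noll's implementation: the function names, the Hamming window, the small radix‐2 FFT helper, and the 80 to 400 Hz search range are all assumptions chosen for the example.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT (assumed helper); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def power_cepstrum(frame):
    """Power spectrum of the log amplitude spectrum of one windowed frame."""
    n = len(frame)
    # Hamming window (an assumption; the abstract does not name a window).
    windowed = [
        s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
        for i, s in enumerate(frame)
    ]
    # Small offset avoids log(0) on exact spectral nulls.
    log_mag = [math.log(abs(c) + 1e-12) for c in fft(windowed)]
    return [abs(c) ** 2 for c in fft(log_mag)]


def cepstral_pitch(frame, sample_rate, fmin=80.0, fmax=400.0):
    """Return (f0 estimate in Hz, cepstral peak height) for one frame."""
    ceps = power_cepstrum(frame)
    q_lo = int(sample_rate / fmax)  # shortest period searched, in samples
    q_hi = int(sample_rate / fmin)  # longest period searched, in samples
    peak = max(range(q_lo, q_hi + 1), key=lambda q: ceps[q])
    return sample_rate / peak, ceps[peak]
```

For an idealized pulse train with a 64‐sample period sampled at 8 kHz, `cepstral_pitch(frame, 8000)` on a 1024‐sample frame picks the peak at 64 samples, i.e. 125 Hz. A voiced/unvoiced decision would additionally compare the returned peak height against a threshold, which is left out here because the abstract gives no value for it.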

219 citations

Journal Article

Vocal tract normalization equals linear transformation in cepstral space

Michael Pitz¹, Hermann Ney¹

RWTH Aachen University¹

15 Aug 2005-IEEE Transactions on Speech and Audio Processing

TL;DR: Vocal tract normalization is shown to be a linear transformation in the cepstral domain. The transformation matrix is calculated analytically for three typical warping functions, and the matrices are shown to be diagonally dominant, so they can be approximated by quindiagonal matrices.

Abstract: Vocal tract normalization (VTN) is a widely used speaker normalization technique which reduces the effect of different lengths of the human vocal tract and results in an improved recognition accuracy of automatic speech recognition systems. We show that VTN results in a linear transformation in the cepstral domain, which so far have been considered as independent approaches of speaker normalization. We are now able to compute the Jacobian determinant of the transformation matrix, which allows the normalization of the probability distributions used in speaker-normalization for automatic speech recognition. We show that VTN can be viewed as a special case of Maximum Likelihood Linear Regression (MLLR). Consequently, we can explain previous experimental results that improvements obtained by VTN and subsequent MLLR are not additive in some cases. For three typical warping functions the transformation matrix is calculated analytically and we show that the matrices are diagonal dominant and thus can be approximated by quindiagonal matrices.

217 citations

Book

Pitch Determination of Speech Signals: Algorithms and Devices

Wolfgang Hess

01 Apr 1983

TL;DR: This book surveys pitch determination algorithms (PDAs) and devices for speech signals, covering time-domain and short-term analysis methods (including cepstrum pitch determination), voicing determination, and the performance evaluation of PDAs.

Abstract: 1. Introduction.- 1.1 Voice Source Parameter Measurement and the Speech Signal.- 1.2 A Short Look at the Areas of Application.- 1.3 Organization of the Book.
2. Basic Terminology. A Short Introduction to Digital Signal Processing.- 2.1 The Simplified Model of Speech Excitation.- 2.2 Digital Signal Processing 1: Signal Representation.- 2.3 Digital Signal Processing 2: Filters.- 2.4 Time-Variant Systems. The Principle of Short-Term Analysis.- 2.5 Definition of the Task. The Linear Model of Speech Production.- 2.6 A First Categorization of Pitch Determination Algorithms (PDAs).
3. The Human Voice Source.- 3.1 Mechanism of Sound Generation at the Larynx.- 3.2 Operational Modes of the Larynx. Registers.- 3.3 The Glottal Source (Excitation) Signal.- 3.4 The Influence of the Vocal Tract Upon Voice Source Parameters.- 3.5 The Voiceless and the Transient Sources.
4. Measuring Range, Accuracy, Pitch Perception.- 4.1 The Range of Fundamental Frequency.- 4.2 Pitch Perception. Toward a Redefinition of the Task.- 4.2.1 Pitch Perception: Spectral and Virtual Pitch.- 4.2.2 Toward a Redefinition of the Task.- 4.2.3 Difference Limens for Fundamental-Frequency Change.- 4.3 Measurement Accuracy.- 4.4 Representation of the Pitch Information in the Signal.- 4.5 Calibration and Performance Evaluation of a PDA.
5. Manual and Instrumental Pitch Determination, Voicing Determination.- 5.1 Manual Pitch Determination.- 5.1.1 Time-Domain Manual Pitch Determination.- 5.1.2 Frequency-Domain Manual Pitch Determination.- 5.2 Pitch Determination Instruments (PDIs).- 5.2.1 Clinical Methods for Larynx Inspection.- 5.2.2 Mechanic PDIs.- 5.2.3 Electric PDIs.- 5.2.4 Ultrasonic PDIs.- 5.2.5 Photoelectric PDIs (Transillumination of the Glottis).- 5.2.6 Comparative Evaluation of PDIs.- 5.3 Voicing Determination - Selected Examples.- 5.3.1 Voicing Determination: Parameters.- 5.3.2 Voicing Determination - Simple Voicing Determination Algorithms (VDAs) Combined VDA-PDA Systems.- 5.3.3 Multiparameter VDAs. Voicing Determination by Means of Pattern Recognition Methods.- 5.3.4 Summary and Conclusions.
6. Time-Domain Pitch Determination.- 6.1 Pitch Determination by Fundamental-Harmonic Extraction.- 6.1.1 The Basic Extractor.- 6.1.2 The Simplest Pitch Determination Device - Low-Pass Filter and Zero (or Threshold) Crossings Analysis Basic Extractor.- 6.1.3 Enhancement of the First Harmonic by Nonlinear Means.- 6.1.4 Manual Preset and Tunable (Adaptive) Filters.- 6.2 The Other Extreme - Temporal Structure Analysis.- 6.2.1 Envelope Modeling - the Analog Approach.- 6.2.2 Simple Peak Detector and Global Correction.- 6.2.3 Zero Crossings and Excursion Cycles.- 6.2.4 Mixed-Feature Algorithms.- 6.2.5 Other PDAs That Investigate the Temporal Structure of the Signal.- 6.3 The Intermediate Device: Temporal Structure Transformation and Simplification.- 6.3.1 Temporal Structure Simplification by Inverse Filtering.- 6.3.2 The Discontinuity in the Excitation Signal: Event Detection.- 6.4 Parallel Processing in Fundamental Period Determination. Multichannel PDAs.- 6.4.1 PDAs with Multichannel Preprocessor Filters.- 6.4.2 PDAs with Several Channels Applying Different Extraction Principles.- 6.5 Special-Purpose (High-Accuracy) Time-Domain PDAs.- 6.5.1 Glottal Inverse Filtering.- 6.5.2 Determining the Instant of Glottal Closure.- 6.6 The Postprocessor.- 6.6.1 Time-to-Frequency Conversion Display.- 6.6.2 f0 Determination With Basic Extractor Omitted.- 6.6.3 Global Error Correction Routines.- 6.6.4 Smoothing Pitch Contours.- 6.7 Final Comments.
7. Design and Implementation of a Time-Domain PDA for Undistorted and Band-Limited Signals.- 7.1 The Linear Algorithm.- 7.1.1 Prefiltering.- 7.1.2 Measurement and Suppression of F1.- 7.1.3 The Basic Extractor.- 7.1.4 Problems with the Formant F2. Implementation of a Multiple Two-Pulse Filter (TPF).- 7.1.5 Phase Relations and Starting Point of the Period.- 7.1.6 Performance of the Algorithm with Respect to Linear Distortions, Especially to Band Limitations.- 7.2 Band-Limited Signals in Time-Domain PDAs.- 7.2.1 Concept of the Universal PDA.- 7.2.2 Once More: Use of Nonlinear Distortion in Time-Domain PDAs.- 7.3 An Experimental Study Towards a Universal Time-Domain PDA Applying a Nonlinear Function and a Threshold Analysis Basic Extractor.- 7.3.1 Setup of the Experiment.- 7.3.2 Relative Amplitude and Enhancement of First Harmonic.- 7.4 Toward a Choice of Optimal Nonlinear Functions.- 7.4.1 Selection with Respect to Phase Distortions.- 7.4.2 Selection with Respect to Amplitude Characteristics.- 7.4.3 Selection with Respect to the Sequence of Processing.- 7.5 Implementation of a Three-Channel PDA with Nonlinear Processing.- 7.5.1 Selection of Nonlinear Functions.- 7.5.2 Determination of the Parameter for the Comb Filter.- 7.5.3 Threshold Function in the Basic Extractor.- 7.5.4 Selection of the Most Likely Channel in the Basic Extractor.
8. Short-Term Analysis Pitch Determination.- 8.1 The Short-Term Transformation and Its Consequences.- 8.2 Autocorrelation Pitch Determination.- 8.2.1 The Autocorrelation Function and Its Relation to the Power Spectrum.- 8.2.2 Analog Realizations.- 8.2.3 "Ordinary" Autocorrelation PDAs.- 8.2.4 Autocorrelation PDAs with Nonlinear Preprocessing.- 8.2.5 Autocorrelation PDAs with Linear Adaptive Preprocessing.- 8.3 "Anticorrelation" Pitch Determination: Average Magnitude Difference Function, Distance and Dissimilarity Measures, and Other Nonstationary Short-Term Analysis PDAs.- 8.3.1 Average Magnitude Difference Function (AMDF).- 8.3.2 Generalized Distance Functions.- 8.3.3 Nonstationary Short-Term Analysis and Incremental Time-Domain PDAs.- 8.4 Multiple Spectral Transform ("Cepstrum") Pitch Determination.- 8.4.1 The More General Aspect: Deconvolution.- 8.4.2 Cepstrum Pitch Determination.- 8.5 Frequency-Domain PDAs.- 8.5.1 Spectral Compression: Frequency and Period Histogram Product Spectrum.- 8.5.2 Harmonic Matching. Psychoacoustic PDAs.- 8.5.3 Determination of f0 from the Distance of Adjacent Spectral Peaks.- 8.5.4 The Fast Fourier Transform, Spectral Resolution, and the Computing Effort.- 8.6 Maximum-Likelihood (Least-Squares) Pitch Determination.- 8.6.1 The Least-Squares Algorithm.- 8.6.2 A Multichannel Solution.- 8.6.3 Computing Complexity, Relation to Comb Filters, Simplified Realizations.- 8.7 Summary and Conclusions.
9. General Discussion: Summary, Error Analysis, Applications.- 9.1 A Short Survey of the Principal Methods of Pitch Determination.- 9.1.1 Categorization of PDAs and Definitions of Pitch.- 9.1.2 The Basic Extractor.- 9.1.3 The Postprocessor.- 9.1.4 Methods of Preprocessing.- 9.1.5 The Impact of Technology of the Design of PDAs and the Question of Computing Effort.- 9.2 Calibration, Search for Standards.- 9.2.1 Data Acquisition.- 9.2.2 Creating the Standard Pitch Contour Manually, Automatically, and by an Interactive PDA.- 9.2.3 Creating a Standard Contour by Means of a PDI.- 9.3 Performance Evaluation of PDAs.- 9.3.1 Comparative Performance Evaluation of PDAs: Some Examples from the Literature.- 9.3.2 Methods of Error Analysis.- 9.4 A Closer Look at the Applications.- 9.4.1 Has the Problem Been Solved?.- 9.4.2 Application in Phonetics, Linguistics, and Musicology.- 9.4.3 Application in Education and in Pathology.- 9.4.4 The "Technical" Application: Speech Communication.- 9.4.5 A Way Around the Problem in Speech Communication: Voice-Excited and Residual-Excited Vocoding (Baseband Coding).- 9.5 Possible Paths Towards a General Solution.
Appendix A. Experimental Data on the Behavior of Nonlinear Functions in Time-Domain Pitch Determination Algorithms.- A.1 The Data Base of the Investigation.- A.2 Examples for the Behavior of the Nonlinear Functions.- A.3 Relative Amplitude RA1 and Enhancement RE1 of the First Harmonic.- A.4 Relative Amplitude RASM of Spurious Maximum and Autocorrelation Threshold.- A.5 Processing Sequence, Preemphasis, Phase, Band Limitation.- A.6 Optimal Performance of Nonlinear Functions.- A.7 Performance of the Comb Filters.
Appendix B. Original Text of the Quotations in Foreign Languages Throughout This Book.
List of Abbreviations.- Author and Subject Index.

212 citations

Journal Article

Blind equalization using a tricepstrum-based algorithm

Dimitrios Hatzinakos¹, Chrysostomos L. Nikias²

University of Toronto¹, Northeastern University²

01 May 1991-IEEE Transactions on Communications

TL;DR: It is demonstrated, by means of extensive simulations, that the proposed tricepstrum-based equalization scheme performs well and outperforms other existing blind equalizers, at the expense of higher computational complexity.

Abstract: An adaptive blind equalization method is introduced for nonminimum phase communication channels. The method estimates the inverse channel impulse response, by using the complex cepstrum of the fourth-order cumulants (tricepstrum) of the synchronously sampled received signal. As such, the proposed adaptive method depends only on the statistics of the received sequence, and is capable of reconstructing separately both the minimum and maximum phase response of the channel. It is demonstrated, by means of extensive simulations, that the proposed tricepstrum-based equalization scheme performs well and outperforms other existing blind equalizers, at the expense of higher computational complexity.

211 citations

Proceedings Article

Robust voice activity detection using cepstral features

J. A. Haigh, John Mason

19 Oct 1993

TL;DR: It is shown that a cepstral-based algorithm exhibits a high degree of independence from background noise levels, and that successful speech end-pointing can be achieved by thresholding cepstral distance measures.

Abstract: This paper reviews algorithms which rely on the analysis of time domain samples to provide energy and zero-crossing rates, together with more recent algorithms that use different methods for speech detection. We then examine a different approach using cepstral analysis, showing a high degree of amplitude and noise level independence. We show that a cepstral-based algorithm exhibits a high degree of independence to levels of background noise and successful speech end-pointing can be achieved via thresholding cepstral distance measures. Through the use of a noise code-book we are able to provide a successful reference for Euclidean distance measures in the voice detection algorithm.
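
The thresholding idea in this abstract can be illustrated with a minimal sketch: compute a few real-cepstrum coefficients per frame, measure the Euclidean distance to the nearest entry of a noise code-book, and flag the frame as speech when that distance exceeds a threshold. This is not the authors' algorithm. The names `real_cepstrum` and `is_speech`, the number of coefficients, the omission of the zeroth coefficient, and the threshold value are all assumptions made for illustration.

```python
import cmath
import math


def real_cepstrum(frame, n_coeffs=12):
    """Return real-cepstrum coefficients c[1]..c[n_coeffs] of one frame.

    c[0] is skipped on purpose: it carries the overall gain, so leaving it
    out makes the feature insensitive to a constant amplitude scaling.
    """
    n = len(frame)
    # Naive O(n^2) DFT keeps the sketch dependency-free; use an FFT in practice.
    log_mag = []
    for k in range(n):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        log_mag.append(math.log(abs(acc) + 1e-12))
    # The log magnitude spectrum is real and even, so its inverse DFT
    # reduces to a cosine sum.
    return [
        sum(log_mag[k] * math.cos(2 * math.pi * k * q / n) for k in range(n)) / n
        for q in range(1, n_coeffs + 1)
    ]


def is_speech(frame, noise_codebook, threshold):
    """Flag a frame as speech if it is far from every noise codeword."""
    ceps = real_cepstrum(frame, len(noise_codebook[0]))
    nearest = min(math.dist(ceps, codeword) for codeword in noise_codebook)
    return nearest > threshold
```

In use, the code-book would be filled with cepstra of frames known to contain only background noise, for example `noise_codebook = [real_cepstrum(f) for f in noise_frames]`, where `noise_frames` is a hypothetical list of such frames. The threshold would have to be tuned on real data; the abstract does not give one.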

208 citations

Performance metrics

Papers: 3,645

Citations: 60,375

No. of papers in the topic in previous years
Year	Papers
2023	86
2022	206
2021	60
2020	96
2019	135
2018	130