Cepstrum

About: Cepstrum is a research topic. Over the lifetime, 3346 publications have been published within this topic receiving 55742 citations.
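The cepstrum itself is the inverse Fourier transform of the logarithm of the spectrum. A minimal real-cepstrum sketch in Python; the echo delay, signal length, and NumPy implementation below are illustrative assumptions, not taken from any paper on this page:

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    log_mag = np.log(np.abs(np.fft.fft(x)) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

# Classic use: an echo at lag d shows up as a peak at quefrency d.
rng = np.random.default_rng(0)
s = rng.standard_normal(4096)
d = 100                        # echo delay in samples (illustrative)
x = s.copy()
x[d:] += 0.5 * s[:-d]          # add an attenuated echo
c = real_cepstrum(x)
peak = 20 + np.argmax(c[20:len(c) // 2])   # skip the low-quefrency region
```

The low-quefrency bins are skipped because they carry the smooth spectral envelope rather than periodicity; the remaining maximum lands at the echo delay.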


Papers
Journal ArticleDOI
TL;DR: In this paper, the authors examined the spectrum and cepstrum content of vibration signals taken from a helicopter gearbox with two different configurations (3 and 4 planets) and presented a signal processing algorithm to separate synchronous and nonsynchronous components for complete extraction and removal of shaft harmonics.
Abstract: This paper examines the spectrum and cepstrum content of vibration signals taken from a helicopter gearbox with two different configurations (3 and 4 planets). It presents a signal processing algorithm to separate synchronous and nonsynchronous components for complete extraction and removal of shaft harmonics. The spectrum and cepstrum of the vibration signal for the two configurations are first analyzed and discussed. The effect of changing the number of planets on the fundamental gear mesh frequency (epicyclic mesh frequency) and its sidebands is then examined. The paper explains the differences between the two configurations and discusses, in particular, the asymmetry of the modulation sidebands about the epicyclic mesh frequency in the 4-planet arrangement. Finally, a separation algorithm based on resampling the order-tracked signal to an integer number of samples per revolution for a specific shaft is proposed for complete removal of the shaft harmonics. The results of the proposed separation algorithm are compared with other separation schemes such as discrete random separation (DRS) and time synchronous averaging (TSA), showing clear improvements.
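Time synchronous averaging (TSA), one of the baselines the paper compares against, can be sketched as follows. The signal is assumed to be already order-tracked to an integer number of samples per revolution; all signal parameters below are invented for illustration:

```python
import numpy as np

def tsa(x, samples_per_rev):
    """Time synchronous average: average whole revolutions, keeping
    only components synchronous with the chosen shaft."""
    n_revs = len(x) // samples_per_rev
    frames = x[:n_revs * samples_per_rev].reshape(n_revs, samples_per_rev)
    return frames.mean(axis=0)

spr = 64                                      # samples per revolution (assumed)
n = np.arange(spr * 200)                      # 200 revolutions
sync = np.cos(2 * np.pi * 8 * n / spr)        # 8th order: synchronous
nonsync = np.cos(2 * np.pi * 8.37 * n / spr)  # non-integer order
avg = tsa(sync + nonsync, spr)                # nonsynchronous part averages out
```

The integer-order component repeats identically each revolution and survives the average; the non-integer-order component accumulates a different phase each revolution and cancels.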

10 citations

Journal ArticleDOI
TL;DR: This study extracted a speech feature using an equivalent rectangular bandwidth (ERB) filter bank cepstrum and constructed a learning model using the acoustic model to improve the speech recognition rate.
Abstract: A range of speech extraction techniques have been applied to improve speech recognition when signals are mixed with noise. Speech recognition performance degrades when the model training environment differs from the recognition environment, because voice versus non-voice classification becomes inaccurate at low signal-to-noise ratios (SNRs). Problems also arise because voice activity detection is inaccurate when the noise changes inconsistently between the recognition environment and the learning model. One remedy is to extract a speech feature that is resistant to noise. This study extracted such a feature using an equivalent rectangular bandwidth (ERB) filter bank cepstrum and constructed a learning model using the acoustic model to improve the speech recognition rate. The ERB filter bank cepstrum was examined in a computational auditory scene analysis system, which analyzes the properties of the speech signal. The proposed model was evaluated with train and train-station noises. Distortion was measured by performing noise reduction at SNRs of -10 and -5 dB in noisy environments, showing respective improvements of 1.67 and 1.74 dB.
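The ERB scale mentioned above is commonly parameterized by the Glasberg and Moore (1990) formula. A sketch of computing ERB-spaced filter-bank center frequencies; the band edges and band count are arbitrary examples, not the paper's configuration:

```python
import numpy as np

def hz_to_erb_rate(f):
    """Glasberg & Moore (1990) ERB-rate scale, in Cams."""
    return 21.4 * np.log10(4.37 * f / 1000.0 + 1.0)

def erb_rate_to_hz(e):
    """Inverse of hz_to_erb_rate."""
    return (10.0 ** (e / 21.4) - 1.0) * 1000.0 / 4.37

def erb_center_freqs(f_lo, f_hi, n_bands):
    """Center frequencies spaced uniformly on the ERB-rate scale."""
    e = np.linspace(hz_to_erb_rate(f_lo), hz_to_erb_rate(f_hi), n_bands)
    return erb_rate_to_hz(e)

centers = erb_center_freqs(100.0, 8000.0, 32)  # arbitrary example band
```

Uniform spacing on the ERB-rate scale places filters densely at low frequencies and sparsely at high frequencies, mimicking cochlear frequency resolution.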

10 citations

Proceedings Article
01 Oct 1999
TL;DR: A higher-level approach to spectral envelopes is taken, which avoids the dilemma of how to control hundreds of partials and lets the residual noise part be treated with the same manipulations as the sinusoidal part by using the same representation.
Abstract: A spectral envelope is a curve in the frequency-magnitude plane which envelopes the short-time spectrum of a signal, e.g. connecting the peaks which represent sinusoidal partials, or modeling the spectral density of a noise signal. It describes the perceptually pertinent distribution of energy over frequency, which determines a large part of timbre for instruments, and the type of vowel for speech. Because of the importance of spectral envelopes for sound synthesis, a higher-level approach to their handling is taken here. We present programs developed using spectral envelopes for analysis, representation, manipulation, and synthesis. Spectral envelopes can be estimated by linear prediction, the cepstrum, or the discrete cepstrum. The strong and weak points of each are discussed relative to the requirements for estimation, such as robustness and regularity. Improvements of discrete cepstrum estimation (regularization, statistical smoothing, logarithmic frequency scale, adding control points) are presented. For speech signals, a composite envelope is shown to be advantageous. It is estimated from the sinusoidal partials and from the noise part above the maximum partial frequency. The representation of spectral envelopes is the central point for their handling. A good representation is crucial for the ease and flexibility with which they can be manipulated. Several requirements are laid out, such as stability, locality, and flexibility. The representations (filter coefficients, sampled, break-point functions, splines, formants) are then discussed relative to these requirements. The notion of fuzzy formants based on formant regions is introduced. Some general forms of manipulation and morphing are presented. For morphing between two or more spectral envelopes over time, linear interpolation and formant shifting, which preserves valid vocal tract characteristics, are considered.
For synthesis, spectral envelopes are applied to sinusoidal additive synthesis and are used for filtering the residual noise component. This is especially easy and efficient for both components in the FFT⁻¹ technique. Finally, in additive analysis, spectral envelopes can be generalized to apply not only to magnitude, but also to frequency and phase, while keeping the same representation. The frequency envelope expresses the harmonicity of partials over frequency; the phase envelope expresses phase relations between harmonic partials. With this high-level approach to spectral envelopes, additive synthesis can avoid the dilemma of how to control hundreds of partials, and the residual noise part can be treated with the same manipulations as the sinusoidal part by using the same representation. Also, high-quality singing voice synthesis can use morphing between sampled spectral envelopes and formants to combine natural-sounding transitions with a precisely modeled sustained part. The above-mentioned methods have been implemented in a C library using the SDIF standard for sound description data as the file format and are used in various real-time and non-real-time programs on Unix and Macintosh.
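Cepstral estimation of a spectral envelope, one of the estimation methods mentioned, amounts to low-pass liftering the real cepstrum. A minimal sketch; the frame length, lifter order, and harmonic test signal are illustrative assumptions:

```python
import numpy as np

def cepstral_envelope(frame, order):
    """Spectral envelope by cepstral smoothing: keep only the first
    `order` cepstral coefficients, then return to log magnitude."""
    log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-12)
    ceps = np.fft.irfft(log_mag)
    lifter = np.zeros_like(ceps)
    lifter[:order] = 1.0
    lifter[-(order - 1):] = 1.0          # keep the symmetric counterpart
    return np.fft.rfft(ceps * lifter).real

# A harmonic frame: the envelope should smooth over the partials.
fs, f0, n = 16000, 200, 1024
t = np.arange(n) / fs
frame = np.hanning(n) * sum(np.cos(2 * np.pi * f0 * k * t) / k
                            for k in range(1, 9))
log_mag = np.log(np.abs(np.fft.rfft(frame)) + 1e-12)
env = cepstral_envelope(frame, 20)
```

Discarding high-quefrency coefficients removes the fine harmonic ripple, leaving the smooth envelope; a low `order` gives a smoother but less detailed curve.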

10 citations

Proceedings ArticleDOI
02 Dec 1997
TL;DR: This paper presents an acoustic front-end that uses the properties of auditory masking to extract acoustic features from the speech signal, computing a masking threshold as a function of frequency for each speech frame from its power spectrum.
Abstract: This paper presents an acoustic front-end which uses the properties of auditory masking for extracting acoustic features from the speech signal. Using the properties of simultaneous masking found in the human auditory system, we compute a masking threshold as a function of frequency for a given speech frame from its power spectrum. All those portions of the power spectrum which are below the auditory threshold are not heard by the human auditory system due to masking effects and hence can be discarded. These portions are replaced by the corresponding portions in the masking threshold spectrum. This modified power spectrum is processed by the linear prediction analysis or homomorphic analysis procedure to derive cepstral features for each speech frame. We study the performance of this front-end for speech recognition in noisy environments. This front-end performs significantly better than the conventional linear-prediction-based or homomorphic-analysis-based front-ends for noisy speech. In terms of signal-to-noise ratio, simultaneous masking offers an advantage of more than 5 dB over the LPCC front-end in isolated word recognition experiments and 3 dB in continuous speech recognition experiments.
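The central operation described, discarding spectral components below the masking threshold and substituting the threshold itself, is an elementwise maximum. A sketch with a made-up flat threshold; real front-ends derive the threshold from the power spectrum with a bark-scale spreading function, which is omitted here:

```python
import numpy as np

def apply_masking_floor(power_spectrum, masking_threshold):
    """Replace components below the masking threshold (inaudible due
    to simultaneous masking) with the threshold itself."""
    return np.maximum(power_spectrum, masking_threshold)

# Made-up example: a noisy spectrum floored by a hypothetical threshold.
power = np.array([10.0, 0.2, 5.0, 0.05, 8.0])
threshold = np.full(5, 1.0)
modified = apply_masking_floor(power, threshold)
```

The modified spectrum then feeds the usual linear prediction or homomorphic (cepstral) analysis; low-level noise below the threshold can no longer perturb the derived features.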

10 citations

Proceedings ArticleDOI
12 May 1998
TL;DR: A novel speech feature, the modified Mellin transform of the log-spectrum of the speech signal (MMTLS), is presented, which is more appropriate for speaker-independent speech recognition than the widely used cepstrum.
Abstract: This paper presents a novel kind of speech feature: the modified Mellin transform of the log-spectrum of the speech signal (MMTLS for short). Because of the scale-invariance property of the modified Mellin transform, the new feature is insensitive to variation of the vocal tract length among individual speakers, and it is thus more appropriate for speaker-independent speech recognition than the widely used cepstrum. Preliminary experiments show that the performance of the MMTLS-based method is much better than that of the LPC- and MFC-based methods. Moreover, the error rate of this method is consistent across different outlier speakers.
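The scale-invariance argument can be illustrated: warping the spectrum onto a logarithmic frequency grid turns a scaling of the frequency axis (a vocal tract length change) into a shift, which the Fourier magnitude then discards. The sketch below is a simplified stand-in for the modified Mellin transform, not the paper's exact feature; the grid sizes and the Gaussian test envelope are invented for the demonstration:

```python
import numpy as np

def scale_invariant_feature(log_spectrum, freqs, n_out=64):
    """Interpolate the log-spectrum onto a log-frequency grid and keep
    the FFT magnitude; frequency-axis scaling becomes a discarded shift."""
    log_grid = np.geomspace(freqs[0], freqs[-1], 256)
    warped = np.interp(log_grid, freqs, log_spectrum)
    warped = warped - warped.mean()        # remove overall gain
    return np.abs(np.fft.rfft(warped))[:n_out]

# A smooth Gaussian envelope in log-frequency, and a version with the
# frequency axis stretched by 20% (a crude vocal-tract-length change).
freqs = np.arange(1.0, 4097.0)
env = lambda f: np.exp(-((np.log(f) - np.log(500.0)) ** 2) / (2 * 0.3 ** 2))
f1 = scale_invariant_feature(env(freqs), freqs)
f2 = scale_invariant_feature(env(freqs / 1.2), freqs)
```

The two feature vectors come out nearly identical even though the underlying spectra are stretched relative to each other, which is the property the abstract exploits.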

10 citations


Network Information
Related Topics (5)
Feature extraction: 111.8K papers, 2.1M citations, 82% related
Robustness (computer science): 94.7K papers, 1.6M citations, 80% related
Feature (computer vision): 128.2K papers, 1.7M citations, 79% related
Deep learning: 79.8K papers, 2.1M citations, 79% related
Support vector machine: 73.6K papers, 1.7M citations, 78% related
Performance Metrics
No. of papers in the topic in previous years:
Year   Papers
2023   86
2022   206
2021   60
2020   96
2019   135
2018   130