Topic

Cepstrum

About: Cepstrum is a research topic. Over its lifetime, 3346 publications have been published within this topic, receiving 55742 citations.

Papers

Proceedings Article

Mel-cepstrum modulation spectrum (MCMS) features for robust ASR

Vivek Tyagi, Iain McCowan, Hemant Misra, Hervé Bourlard

30 Nov 2003

TL;DR: These new dynamic features derived from the modulation spectrum of the cepstral trajectories of the speech signal yield a significant increase in the speech recognition performance in various noise conditions when compared directly to the standard temporal derivative features and C-JRASTA PLP features.

Abstract: In this paper, we present new dynamic features derived from the modulation spectrum of the cepstral trajectories of the speech signal. Cepstral trajectories are projected over a basis of sines and cosines, yielding the cepstral modulation frequency response of the speech signal. We show that the different sine and cosine basis vectors select different modulation frequencies, whereas the frequency responses of the delta and the double delta filters are only centered over 15 Hz. Therefore, projecting cepstral trajectories over the basis of sines and cosines yields a more complementary and discriminative range of features. In this work, the cepstrum reconstructed from the lower cepstral modulation frequency components is used as the static feature. In experiments, it is shown that, as well as providing an improvement in clean conditions, these new dynamic features yield a significant increase in the speech recognition performance in various noise conditions when compared directly to the standard temporal derivative features and C-JRASTA PLP features.
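The projection described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `cepstral_modulation_features`, the context window length `win`, the number of modulation bins `n_mod`, and the edge padding are all assumptions made for illustration, and the input is assumed to be a precomputed matrix of cepstral coefficients (frames by coefficients).

```python
import numpy as np

def cepstral_modulation_features(cepstra, win=11, n_mod=3):
    """Project each cepstral trajectory onto cosine and sine basis vectors.

    cepstra: array of shape (n_frames, n_coeffs), assumed precomputed.
    Returns an array of shape (n_frames, n_coeffs * 2 * n_mod).
    win and n_mod are illustrative choices, not values from the paper.
    """
    n_frames, n_coeffs = cepstra.shape
    half = win // 2
    # Repeat the edge frames so every frame has a full context window.
    padded = np.pad(cepstra, ((half, half), (0, 0)), mode="edge")
    p = np.arange(win)
    q = np.arange(1, n_mod + 1)[:, None]          # modulation bins 1..n_mod
    cos_basis = np.cos(2 * np.pi * q * p / win)   # shape (n_mod, win)
    sin_basis = np.sin(2 * np.pi * q * p / win)   # shape (n_mod, win)
    feats = np.empty((n_frames, n_coeffs, 2 * n_mod))
    for t in range(n_frames):
        segment = padded[t:t + win]               # shape (win, n_coeffs)
        feats[t, :, :n_mod] = (cos_basis @ segment).T
        feats[t, :, n_mod:] = (sin_basis @ segment).T
    return feats.reshape(n_frames, -1)
```

Each basis vector responds to one modulation frequency of the trajectory, so a cepstral coefficient that does not change over the window produces zero output in every bin.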

56 citations

Journal Article

An approximation to voice aperiodicity

O. Fujimura¹

University of Tokyo¹

01 Mar 1968 - IEEE Transactions on Audio and Electroacoustics

TL;DR: Experimental tests on a computer-simulated channel vocoder examine whether pitch perturbation can be effectively simulated by partially replacing voiced excitation with random noise in appropriate frequency-time portions. The results show that partial devoicing of the high-frequency ranges definitely improves speech quality.

Abstract: Aperiodicity in voiced segments of speech may be ascribed to different causes. The magnitude of pitch perturbation is different in different spectral ranges of the signal. To see whether pitch perturbation can be effectively simulated by partially replacing voiced excitation by random noise, in appropriate frequency-time portions, experimental tests have been made on a computer-simulated channel vocoder. The buzz-hiss decision was made separately for three different frequency portions of the signal. The cepstrum technique was used for pitch detection, and separate buzz-hiss switching decisions were made at the synthesizer for each frequency portion. The switching thresholds were controlled, and deliberately "devoiced" versions were compared with regular vocoded speech. The fundamental frequency was determined by the lowband cepstrum. The result shows that partial devoicing of the high-frequency ranges definitely improves speech quality. Further, a comparatively large amount of devoicing is perceptually tolerable.
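Cepstrum-based pitch detection, which the abstract names but does not spell out, can be sketched as follows. This is a generic illustration, not the vocoder described in the paper: the function name `cepstral_pitch`, the Hamming window, and the search range `fmin`/`fmax` are assumptions. The peak height returned alongside the pitch estimate is one possible cue for a buzz-hiss decision.

```python
import numpy as np

def cepstral_pitch(frame, fs, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame from its real cepstrum.

    Returns (f0_in_hz, cepstral_peak_height). The search range is an
    illustrative assumption, not a value from the paper.
    """
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.rfft(windowed)
    # Small constant avoids log(0) in empty spectral bins.
    log_mag = np.log(np.abs(spectrum) + 1e-12)
    # The inverse transform of the log magnitude spectrum is the real cepstrum.
    cepstrum = np.fft.irfft(log_mag)
    qmin = int(fs / fmax)   # shortest pitch period searched, in samples
    qmax = int(fs / fmin)   # longest pitch period searched, in samples
    peak = qmin + int(np.argmax(cepstrum[qmin:qmax + 1]))
    return fs / peak, cepstrum[peak]

# Usage: a synthetic pulse train with a period of 80 samples at 8 kHz.
fs = 8000
frame = np.zeros(1024)
frame[::80] = 1.0
f0, peak_height = cepstral_pitch(frame, fs)
```

A harmonic spectrum is periodic along the frequency axis, so its log magnitude produces a cepstral peak at the quefrency equal to the pitch period. A weak or absent peak suggests noise-like excitation.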

55 citations

Journal Article

A hybrid approach for fault diagnosis of planetary bearings using an internal vibration sensor

Zhiqi Fan¹, Huaizhong Li²

University of New South Wales¹, Griffith University²

01 Mar 2015 - Measurement

TL;DR: This article presents a hybrid approach for fault diagnosis of planetary bearings using an internal vibration sensor and novel signal processing strategies. An accelerometer is mounted internally on the planet carrier to address the variable transmission path, and the adverse effect of electromagnetic interference introduced by the slip ring is tackled by optimizing the spectral kurtosis (SK) technique for demodulation band selection.

55 citations

Patent

Speech recognition method and speech recognition apparatus

Shunsuke Ishimitsu

22 Oct 1998

TL;DR: A speech recognition method for recognizing input speech in a noisy environment using a plurality of clean speech models is provided, where each clean speech model has a clean speech feature parameter S representing a cepstrum parameter of its clean speech.

Abstract: A speech recognition method of recognizing an input speech in a noisy environment by using a plurality of clean speech models is provided. Each of the clean speech models has a clean speech feature parameter S representing a cepstrum parameter of a clean speech thereof. The speech recognition method has the processes of: detecting a noise feature parameter N representing a cepstrum parameter of a noise in the noisy environment, immediately before the input speech is input; detecting an input speech feature parameter X representing a cepstrum parameter of the input speech in the noisy environment; calculating a modified clean speech feature parameter Y according to the following equation: Y = k · S + (1-k) · N (0 < k ≦ 1), where "k" is a predetermined value corresponding to a signal-to-noise ratio in the noisy environment; comparing the input speech feature parameter X with the modified clean speech feature parameter Y; and recognizing the input speech by repeatedly carrying out the calculating process and the comparing process with respect to the plurality of clean speech models.
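The calculate-and-compare loop in the abstract can be sketched in a few lines. Only the equation Y = k · S + (1-k) · N comes from the source. The function name `recognize`, the dictionary of models, and the use of Euclidean distance for the comparing step are assumptions, since the abstract does not say how X and Y are compared.

```python
import numpy as np

def recognize(x, noise, clean_models, k):
    """Return the label of the clean model whose noise-adapted version is closest to x.

    x:            cepstrum parameter of the noisy input speech (X)
    noise:        cepstrum parameter of the noise (N)
    clean_models: dict mapping a label to its clean cepstrum parameter (S)
    k:            weight tied to the signal-to-noise ratio, 0 < k <= 1
    """
    if not 0.0 < k <= 1.0:
        raise ValueError("k must satisfy 0 < k <= 1")
    best_label, best_dist = None, np.inf
    for label, s in clean_models.items():
        y = k * s + (1.0 - k) * noise        # Y = k*S + (1-k)*N from the abstract
        dist = np.linalg.norm(x - y)         # assumed comparison: Euclidean distance
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Usage with made-up two-dimensional feature vectors.
models = {"yes": np.array([1.0, 0.0]), "no": np.array([0.0, 1.0])}
noise = np.array([0.5, 0.5])
label = recognize(np.array([0.9, 0.1]), noise, models, k=0.8)
```

With k = 1 the models are used unchanged, which corresponds to a clean environment. Smaller values of k pull every model toward the measured noise.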

54 citations

Investigation of Spectral Centroid Magnitude and Frequency for Speaker Recognition.

Jia Min Karen Kua¹, Tharmarajah Thiruvaran¹, Mohaddeseh Nosratighods¹, Eliathamby Ambikairajah¹, Julien Epps¹

University of New South Wales¹

01 Jan 2010

TL;DR: This study investigates the characterization of subband energy as a two-dimensional feature, comprising Spectral Centroid Magnitude (SCM) and Spectral Centroid Frequency (SCF), and provides an SCF implementation that improves on the speaker recognition performance of both subband spectral centroid and FM features.

Abstract: Most conventional features used in speaker recognition are based on spectral envelope characterizations such as Mel-scale filterbank cepstrum coefficients (MFCC), Linear Prediction Cepstrum Coefficients (LPCC) and Perceptual Linear Prediction (PLP). The MFCC's success has seen it become a de facto standard feature for speaker recognition. Alternative features that convey information other than the average subband energy have been proposed, such as frequency modulation (FM) and subband spectral centroid features. In this study, we investigate the characterization of subband energy as a two-dimensional feature, comprising Spectral Centroid Magnitude (SCM) and Spectral Centroid Frequency (SCF). Empirical experiments carried out on the NIST 2001 and NIST 2006 databases using SCF, SCM and their fusion suggest that the combination of SCM and SCF is somewhat more accurate than conventional MFCC, and that both fuse effectively with MFCCs. We also show that frame-averaged FM features are essentially centroid features, and provide an SCF implementation that improves on the speaker recognition performance of both subband spectral centroid and FM features.
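A rough sketch of per-subband centroid features follows. The abstract does not give the formulas, so the definitions used here are assumptions: SCF is taken as the magnitude-weighted mean frequency of a subband, and SCM as the frequency-weighted mean magnitude. The function name `spectral_centroids`, the rectangular equal-width subbands, and the Hamming window are likewise illustrative choices, not the paper's filterbank.

```python
import numpy as np

def spectral_centroids(frame, fs, n_bands=4):
    """Return (scf, scm), one value per subband, for a single frame.

    Assumed definitions, not taken from the abstract:
      SCF_b = sum(f * |S(f)|) / sum(|S(f)|)   over subband b
      SCM_b = sum(f * |S(f)|) / sum(f)        over subband b
    """
    mag = np.abs(np.fft.rfft(frame * np.hamming(len(frame))))
    freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
    # Equal-width rectangular subbands over the available bins.
    edges = np.linspace(0, len(mag), n_bands + 1).astype(int)
    scf = np.empty(n_bands)
    scm = np.empty(n_bands)
    for b in range(n_bands):
        f = freqs[edges[b]:edges[b + 1]]
        m = mag[edges[b]:edges[b + 1]]
        weighted = np.sum(f * m)
        scf[b] = weighted / (np.sum(m) + 1e-12)   # where the energy sits
        scm[b] = weighted / (np.sum(f) + 1e-12)   # how much energy there is
    return scf, scm

# Usage: a 1500 Hz tone sampled at 8 kHz falls in the second of four subbands.
fs = 8000
t = np.arange(512) / fs
scf, scm = spectral_centroids(np.sin(2 * np.pi * 1500 * t), fs)
```

Under these definitions the two values separate where the energy lies within a subband from how much energy the subband holds, which is the two-dimensional view of subband energy the abstract describes.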

54 citations

Metrics

Papers: 3,645
Citations: 60,375

No. of papers in the topic in previous years
Year	Papers
2023	86
2022	206
2021	60
2020	96
2019	135
2018	130
