Showing papers on "Cepstrum" published in 2006


01 Sep 2006
TL;DR: This report discusses methods for estimating linear motion blur, which is modeled as a convolution between the original image and an unknown point-spread function.
Abstract: This report discusses methods for estimating linear motion blur. The blurred image is modeled as a convolution between the original image and an unknown point-spread function. The angle of motion blur is estimated using three different approaches. The first employs the cepstrum, the second a Gaussian filter, and the third the Radon transform. To estimate the extent of the motion blur, two different cepstral methods are employed. The accuracy of these methods is evaluated using artificially blurred images with varying degrees of noise added. Finally, the best angle and length estimates are combined with existing deconvolution methods to see how well the image is deblurred.

87 citations
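
As a rough illustration of the cepstral length estimate mentioned in the abstract above, the sketch below blurs a synthetic image horizontally and reads the blur length off the 2-D cepstrum. It is not the report's implementation: the function names, the random test image, the circular (wrap-around) blur and the restriction to horizontal motion are all assumptions made for the example.

```python
import numpy as np

def cepstrum_2d(image, eps=1e-12):
    """Real 2-D cepstrum: inverse FFT of the log magnitude spectrum."""
    spectrum = np.fft.fft2(image)
    return np.fft.ifft2(np.log(np.abs(spectrum) + eps)).real

def estimate_horizontal_blur_length(blurred):
    """Return the horizontal lag with the most negative cepstral value.

    A uniform horizontal blur of length L puts periodic dips in the
    spectrum, which appear as a negative cepstral peak at lag L on row 0.
    """
    cep = cepstrum_2d(blurred)
    width = blurred.shape[1]
    lags = np.arange(1, width // 2)  # skip lag 0 (overall log gain)
    return int(lags[np.argmin(cep[0, 1:width // 2])])

# Synthetic check (assumed setup): circular 7-pixel blur of a random image.
rng = np.random.default_rng(0)
sharp = rng.random((64, 64))
blur_len = 7
blurred = sum(np.roll(sharp, s, axis=1) for s in range(blur_len)) / blur_len
estimated = estimate_horizontal_blur_length(blurred)
```

For blur at an arbitrary angle, the image would first have to be rotated (or the cepstrum searched in 2-D) using an angle estimate such as those the report compares; that step is omitted here.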


Journal ArticleDOI
TL;DR: Pressure transients recorded with a single pressure transducer are filtered using discrete wavelets to remove the dc offset and the low and high frequencies, and are then analysed with a cepstrum method to detect and locate leaks in pipeline networks.
Abstract: The detection and location of leaks in pipeline networks is a major problem and the reduction of these leaks has become a major priority for pipeline authorities around the world. Although the reasons for these leaks are well known, some of the current methods for locating and identifying them are either complicated or imprecise; most of them are time consuming. The work described here shows that cepstrum analysis is a viable approach to leak detection and location in pipeline networks. The method uses pressure waves caused by quickly opening and closing a solenoid valve. Due to their simplicity and robustness, transient analyses provide a plausible route towards leak detection. For this work, the time domain signals of these pressure transients were obtained using a single pressure transducer. These pressure signals were first filtered using discrete wavelets to remove the dc offset, and the low and high frequencies. They were then analysed using a cepstrum method which identified the time delay between the initial wave and its reflections. There were some features in the processed results which can be ascribed to features in the pipeline network such as junctions and pipe ends. When holes were drilled in the pipe, new peaks occurred which identified the presence of a leak in the pipeline network. When tested with holes of different sizes, the amplitude of the processed peak was seen to increase as the cube root of the leak diameter. Using this method, it is possible to identify leaks that are difficult to find by other methods as they are small in comparison with the flow through the pipe.

79 citations
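
The step in which the cepstrum identifies the time delay between a wave and its reflection can be sketched as follows. This is a minimal example under stated assumptions: the "pressure signal" is synthetic white noise with a single scaled echo, no wavelet pre-filtering is applied, and the names `real_cepstrum` and `estimate_echo_delay` are illustrative rather than taken from the paper.

```python
import numpy as np

def real_cepstrum(x, eps=1e-12):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    return np.fft.ifft(np.log(np.abs(np.fft.fft(x)) + eps)).real

def estimate_echo_delay(x, min_lag=10):
    """Return the lag (in samples) of the largest cepstral peak.

    An echo x[n] = s[n] + g * s[n - D] multiplies the spectrum by
    (1 + g * exp(-j*w*D)), whose log adds a peak of about g/2 at lag D.
    """
    cep = real_cepstrum(x)
    half = len(x) // 2
    return int(min_lag + np.argmax(cep[min_lag:half]))

# Synthetic check (assumed setup): one reflection, 200 samples late.
rng = np.random.default_rng(1)
source = rng.standard_normal(2048)
delay, gain = 200, 0.6
kernel = np.zeros(delay + 1)
kernel[0], kernel[delay] = 1.0, gain
observed = np.convolve(source, kernel)  # full linear convolution
estimated_delay = estimate_echo_delay(observed)
```

Dividing the estimated delay by the sampling rate and multiplying by half the wave speed would give a distance to the reflecting feature; both quantities depend on the rig and are not given in the abstract.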


Journal ArticleDOI
TL;DR: A number of results on the cepstrum of a stationary signal are discussed, which might also be of interest to researchers in spectral analysis and allied topics, such as speech processing.
Abstract: Cepstrum thresholding is shown to be an effective, automatic way of obtaining a smoothed nonparametric estimate of the spectrum of a stationary signal. In the process of introducing the cepstrum thresholding-based spectral estimator, we discuss a number of results on the cepstrum of a stationary signal, which might also be of interest to researchers in spectral analysis and allied topics, such as speech processing.

52 citations
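
The general idea of cepstrum thresholding can be sketched as below: compute the cepstrum of the periodogram, zero the coefficients that are small enough to be attributed to estimation noise, and transform back. This is only an illustration. The threshold rule (`mu` times `pi / sqrt(6 * n)`, an approximate standard deviation of a cepstral coefficient under the usual asymptotic periodogram assumptions) and the absence of any bias correction are assumptions of this sketch and may differ from the estimator in the paper.

```python
import numpy as np

def cepstrum_threshold_spectrum(x, mu=3.0, eps=1e-12):
    """Return (periodogram, smoothed spectrum) for a real signal x."""
    n = len(x)
    periodogram = np.abs(np.fft.fft(x)) ** 2 / n
    cep = np.fft.ifft(np.log(periodogram + eps)).real

    # Assumed threshold: mu times the approximate noise level of a
    # cepstral coefficient estimated from n samples.
    threshold = mu * np.pi / np.sqrt(6.0 * n)
    keep = np.abs(cep) > threshold
    keep[0] = True  # always keep the overall log-power term

    cep_thresholded = np.where(keep, cep, 0.0)
    smoothed = np.exp(np.fft.fft(cep_thresholded).real)
    return periodogram, smoothed

# White noise has a flat true spectrum, so nearly all coefficients are dropped.
rng = np.random.default_rng(2)
noise = rng.standard_normal(1024)
raw, smooth = cepstrum_threshold_spectrum(noise)
```

Because the discarded coefficients only carry fluctuation, the log of the smoothed estimate varies less across frequency than the log of the raw periodogram.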


Journal Article
TL;DR: The proposed technique outperforms the existing audio watermarking techniques against most of the asynchronous attacks and takes advantage of the attack-invariant feature of the cepstrum domain and the error-correction capability of BCH code to increase the robustness of audio watermarks.
Abstract: In this article, a BCH code-based audio watermarking approach performed in the cepstrum domain is proposed. The technique takes advantage of the attack-invariant feature of the cepstrum domain and the error-correction capability of BCH code to increase the robustness of audio watermarking. In addition, the watermarked audio has very high perceptual quality. A blind watermark detection technique is developed to identify the embedded watermark under various types of attacks. Experiment results demonstrate that the proposed technique outperforms the existing audio watermarking techniques against most of the asynchronous attacks.

48 citations


Proceedings ArticleDOI
14 May 2006
TL;DR: This paper proposes robust feature extraction based on kernel PCA instead of DCT, where the main speech element is projected onto low-order features, while the noise or reverberant element is projected onto high-order ones.
Abstract: We investigate a robust speech feature extraction method using kernel PCA (Principal Component Analysis). Kernel PCA has been suggested for various image processing tasks requiring an image model such as, e.g., denoising, where a noise-free image is constructed from a noisy input image [1]. Much research for robust speech feature extraction has been done, but it is difficult to completely remove the non-stationary noise or reverberation. The most commonly used noise-removal techniques are based on the spectral-domain operation, and then for the speech recognition, MFCC (Mel-Frequency Cepstral Coefficient) is computed, where DCT (Discrete Cosine Transform) is applied to the mel-scale filter bank output. In this paper, we propose robust feature extraction based on kernel PCA instead of DCT, where the main speech element is projected onto low-order features, while noise or reverberant element is projected onto high-order ones. Its effectiveness is confirmed by word recognition experiments on reverberant speech.

44 citations
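
For reference, the conventional MFCC step that this paper replaces, a DCT applied to the mel-scale filter bank output, can be sketched as follows. The sketch assumes the log filter bank energies are already available; the function name, the unnormalised DCT-II and the sizes (24 filters, 13 coefficients) are illustrative choices, not values from the paper.

```python
import numpy as np

def dct_ii(log_energies, n_coeffs):
    """Unnormalised DCT-II of a vector of log filter bank energies.

    c[k] = sum_m log_energies[m] * cos(pi * k * (m + 0.5) / M)
    """
    log_energies = np.asarray(log_energies, dtype=float)
    n_filters = len(log_energies)
    m = np.arange(n_filters)
    k = np.arange(n_coeffs)[:, None]
    basis = np.cos(np.pi * k * (m + 0.5) / n_filters)
    return basis @ log_energies

# A flat log spectrum puts all of its energy in the zeroth coefficient.
flat_log_energies = np.full(24, 2.0)
mfcc_like = dct_ii(flat_log_energies, n_coeffs=13)
```

The kernel PCA approach described above would substitute a learned, nonlinear projection for the fixed cosine `basis` used here.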


Journal ArticleDOI
TL;DR: This paper shows that under certain approximations, frequency warping of MFCC features with Mel-warped triangular filter banks equals a linear transformation in the cepstral space, and proposes a formant-like peak alignment algorithm to adapt adult acoustic models to children’s speech.

40 citations


Journal ArticleDOI
TL;DR: In this study, the differences in the effectiveness of using various Japanese sounds in identifying the speakers were investigated and the stimuli used in the experiment were analysed in order to explain these differences in terms of acoustical distances.
Abstract: Speech sounds convey not only linguistic or phonological information, but also nonlinguistic information, including the speakers' individualities [1]. It is known that the availability of the speech contents used for speaker identification differs depending on the types of sounds they contain, and it is reported that voiced sonorants, such as vowels and nasals, are most effective for speaker identification by both humans [2–4] and machines [5]. The speaker's individuality contained in speech sounds should have some acoustic correlations and their properties can be measured as acoustic parameters [6]. In this study, we conducted a human speaker identification test, and investigated the differences in the effectiveness of using various Japanese sounds in identifying the speakers. We also analysed the stimuli used in the experiment in order to explain these differences in terms of acoustical distances.

39 citations


Journal ArticleDOI
TL;DR: PPS signal processing procedures that attempt to align each individual frame to its natural cycle and avoid truncation of pitch cycles while still using constant frame size and frame offset are introduced in an effort to address the above problems.
Abstract: The fine spectral structure related to pitch information is conveyed in Mel cepstral features, with variations in pitch causing variations in the features. For speaker recognition systems, this phenomenon, known as “pitch mismatch” between training and testing, can increase error rates. Likewise, pitch-related variability may potentially increase error rates in speech recognition systems for languages such as English in which pitch does not carry phonetic information. In addition, for both speech recognition and speaker recognition systems, the parsing of the raw speech signal into frames is traditionally performed using a constant frame size and a constant frame offset, without aligning the frames to the natural pitch cycles. As a result, the power spectral estimation that is done as part of the Mel cepstral computation may include artifacts. Pitch synchronous methods have addressed this problem in the past, at the expense of adding some complexity by using a variable frame size and/or offset. This paper introduces Pseudo Pitch Synchronous (PPS) signal processing procedures that attempt to align each individual frame to its natural cycle and avoid truncation of pitch cycles while still using constant frame size and frame offset, in an effort to address the above problems. Text independent speaker recognition experiments performed on NIST speaker recognition tasks demonstrate a performance improvement when the scores produced by systems using PPS are fused with traditional speaker recognition scores. In addition, a better distribution of errors across trials may be obtained for similar error rates, and some insight regarding the role of the fundamental frequency in speaker recognition is revealed. Speech recognition experiments run on the Aurora-2 noisy digits task also show improved robustness and better accuracy for extremely low signal-to-noise ratio (SNR) data.

37 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: A novel approach of combining cepstral features and prosodic features in language identification is presented, which shows a significant improvement on a GMM-UBM based language identification (LID) system which utilizes modern shifted delta cepstrum (SDC) and feature warping techniques.
Abstract: A novel approach of combining cepstral features and prosodic features in language identification is presented in this paper. This combination approach shows a significant improvement on a GMM-UBM based language identification (LID) system which utilizes modern shifted delta cepstrum (SDC) and feature warping techniques. The proposed system achieves a high accuracy of 87.1% on a 10-language task, and outperforms the baseline system by 12%. The prosodic features are proven to be very effective in both tonal and non-tonal LID, as they deliver new language-discrimination information in addition to that from widely used cepstral features. Additionally, the performance of MFCC and PLP features with different numbers of coefficients in language identification tasks is investigated and compared. A smaller number of coefficients is likely to be sufficient, or even better, for language identification.

33 citations
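
The shifted delta cepstrum (SDC) features used by the baseline system above are commonly described by four parameters N-d-P-k: for each frame, k delta vectors computed P frames apart, each spanning plus or minus d frames, are stacked. A minimal sketch follows. The default values (d=1, P=3, k=7), the edge-replication padding and the function name are assumptions for illustration, not settings confirmed by the paper.

```python
import numpy as np

def shifted_delta_cepstra(ceps, d=1, P=3, k=7):
    """Stack k shifted delta blocks for every frame.

    ceps: array of shape (n_frames, N) of cepstral coefficients.
    Block i for frame t is ceps[t + i*P + d] - ceps[t + i*P - d].
    Returns an array of shape (n_frames, N * k).
    """
    ceps = np.asarray(ceps, dtype=float)
    n_frames = ceps.shape[0]
    # Replicate edge frames so every index used below is valid.
    padded = np.pad(ceps, ((d, d + (k - 1) * P), (0, 0)), mode="edge")
    blocks = []
    for i in range(k):
        start = i * P  # frame t sits at padded index t + d
        ahead = padded[start + 2 * d : start + 2 * d + n_frames]
        behind = padded[start : start + n_frames]
        blocks.append(ahead - behind)
    return np.hstack(blocks)

# A linear ramp in time has a constant delta of 2*d away from the edges.
ramp = np.arange(50, dtype=float)[:, None] * np.ones((1, 7))
sdc = shifted_delta_cepstra(ramp)
```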


Journal ArticleDOI
TL;DR: The results of a careful examination of the mean scatterer spacing parameter in normal and pathological breast tissues in vivo using the autoregressive (AR) cepstrum indicate good correlation with the microstructure of breast tissue, and hence the AR cepstrum holds promise as an effective method for signal analysis of ultrasonic scattering and characterization of breast tissue scatterers.

32 citations


Journal ArticleDOI
TL;DR: In this article, the real cepstrum is used to design an arbitrary length minimum-phase finite-impulse response filter from a mixed-phase prototype, and only two fast Fourier transforms and a recursive procedure are required to find the filter's impulse response.
Abstract: The real cepstrum is used to design an arbitrary length minimum-phase finite-impulse response filter from a mixed-phase prototype. There is no need to start with the odd-length equiripple linear-phase filter first. Neither the phase-unwrapping nor root-finding procedure is needed. Only two fast Fourier transforms and a recursive procedure are required to find the filter's impulse response from its real cepstrum. The resulting filter's magnitude response is exactly the same as the original one even when the filter is of very high order.
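
The link between the real cepstrum and a minimum-phase filter can be illustrated with the classic homomorphic "folding" construction shown below. Note that this is a swapped-in, FFT-only variant: the paper obtains the impulse response from the real cepstrum with a recursive procedure, which is not reproduced here. The function name, FFT size and test filter are assumptions of the sketch, and the prototype must have no zeros on the unit circle.

```python
import numpy as np

def minimum_phase_from_real_cepstrum(h, n_fft=1024):
    """Return a minimum-phase FIR with the same magnitude response as h."""
    magnitude = np.abs(np.fft.fft(h, n_fft))
    real_cep = np.fft.ifft(np.log(magnitude)).real

    # Fold the anti-causal half of the cepstrum onto the causal half.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0

    h_min = np.fft.ifft(np.exp(np.fft.fft(real_cep * fold))).real
    return h_min[:len(h)]

# Mixed-phase prototype with zeros at z = 2 and z = 0.5.
h_mixed = np.array([1.0, -2.5, 1.0])
h_min = minimum_phase_from_real_cepstrum(h_mixed)
```

For this prototype the zero at z = 2 is reflected to z = 0.5, so the result is close to 2 * (1 - 0.5 z^-1)^2, that is, taps of about 2, -2 and 0.5.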

Proceedings ArticleDOI
16 Oct 2006
TL;DR: A pipeline prevention monitoring and leak detecting system is designed, based on calculating LPCC and using HMM (hidden Markov models) to recognise damage acoustic signals, and the results show that the acoustic signal recognition rate is improved effectively and can be up to 97%.
Abstract: In order to protect pipeline transportation and prevent leakage incidents caused by man-made damage or natural factors, it is very important to carry out research on active protection and accurate positioning. A pipeline prevention monitoring and leak detecting system was designed, based on calculating LPCC (Linear Prediction Cepstrum Coefficient) and using HMM (Hidden Markov Models) to recognise damage acoustic signals. The continuous non-stationary, time-varying process was divided into frames and described as a series of short stationary sequences, based on an analysis of the acoustic signal characteristics. LPCC, which accurately represents each short-time acoustic signal, was selected as the acoustic signal characteristic parameter and extracted using the Durbin algorithm; an HMM was established to recognise damage types using the Baum-Welch re-estimation algorithm with the state-transition probabilities and the observed time sequences of characteristic parameters; and the Viterbi decoding algorithm was used to search for the best state-transition path and obtain the corresponding output probability. The results show that the acoustic signal recognition rate is improved effectively based on sound spectrum LPCC and HMM, and can be up to 97%.
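
Once linear prediction coefficients are available (for example from the Durbin recursion mentioned above), LPCC values are usually obtained with a standard recursion. The sketch below shows only that conversion step, under the assumed sign convention H(z) = 1 / (1 - sum_k a_k z^-k); the function name and the omission of the gain term are choices made for this example, not details from the paper.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c[1..n_ceps].

    a[k-1] holds a_k of the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k).
    Recursion: c[n] = a_n + sum_{k=1}^{n-1} (k/n) * c[k] * a_{n-k},
    with a_n taken as 0 for n greater than the model order.
    """
    order = len(a)
    c = np.zeros(n_ceps + 1)  # c[0] (gain term) is left unused
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= order else 0.0
        for k in range(1, n):
            if n - k <= order:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]

# Single-pole check: H(z) = 1 / (1 - 0.5 z^-1) has c[n] = 0.5**n / n.
lpcc = lpc_to_cepstrum([0.5], n_ceps=4)
```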

Journal ArticleDOI
TL;DR: A theoretical description of rahmonic analysis of voiced speech containing aspiration noise is provided, leading to a characterization of R1 and it is shown that R1 (estimated from speech) is directly proportional to R1 taken from the glottal signal.
Abstract: Rahmonics comprise the prominent peaks in the cepstrum of voiced speech; their locations correspond to the fundamental period and its multiples. The amplitude of the first rahmonic, R1, has previously been used to indicate voice quality. Although a correspondence between R1 and the richness of the harmonic spectrum for voiced speech is well recognized, a formal description has remained absent. A theoretical description of rahmonic analysis of voiced speech containing aspiration noise is provided, leading to a characterization of R1. The theory suggests that R1 is directly proportional to the geometric mean harmonics-to-noise ratio (gmHNR), where the gmHNR is defined as the mean of the individual spectral (i.e. at specific frequency locations) harmonics-to-noise ratios in dB. This hypothesis is validated using synthetically generated voice signals. R1 is shown to be directly proportional to gmHNR (measured directly from the dB spectrum). It is shown that R1 (estimated from speech) is directly proportional to R1 taken from the glottal signal. R1 and gmHNR (measured spectrally) underestimate the actual gmHNR when (averaged) noise levels exceed harmonic levels. Limiting the number of harmonics in the analysis window overcomes this problem and also alleviates the (temporal) window length/f0 dependence of R1 when estimated period synchronously.

Proceedings ArticleDOI
01 Nov 2006
TL;DR: In this paper, a complex speech analysis for an analytic speech signal is introduced to HMM speech recognition, and the estimated complex parameters are converted to LPCCs and MFCCs as a feature vector for HTK (HMM Tool Kit) in order to realize the HMM speech recognition.
Abstract: In speech recognition, LPC cepstrum based on LPC or MFCC based on a Mel-frequency filter bank are widely used for feature extraction, which determines the performance. However, these are not regarded as the best feature extraction methods. In this paper, we introduce a complex speech analysis for an analytic speech signal to HMM speech recognition. Complex speech analysis can estimate the speech spectrum more accurately at low frequencies; as a result, it is expected that the speech analysis can perform well as a feature extractor in speech recognition. The MMSE-based time-varying complex AR speech analysis is adopted and the estimated complex parameters are converted to LPCCs and MFCCs as a feature vector for HTK (HMM Tool Kit) in order to realize the HMM speech recognition. Through continuous speech recognition experiments with the converted LPCCs and MFCCs, it was found that the complex speech analysis method did not perform better than the real one.

Journal ArticleDOI
TL;DR: An implementation of SPADE that operates in the frequency domain is introduced, and the validity of combining SPADE with speech enhancement methods is examined, confirming that SPADE combined with noise reduction methods can increase robustness in the presence of noise.

Proceedings ArticleDOI
14 May 2006
TL;DR: This paper investigates the use of the recently proposed modified group delay function (MODGDF) coefficients in combination with traditional magnitude-based features in a Gaussian mixture model (GMM) based system and finds that the addition of a modified regression-based shifted delta cepstrum (SDC) further improves system performance beyond that obtained by a more standard SDC configuration.
Abstract: To date, systems for the identification of spoken languages have normally used magnitude-based parameterization methods such as the MFCC and PLP. This paper investigates the use of the recently proposed modified group delay function (MODGDF) coefficients in combination with traditional magnitude-based features in a Gaussian Mixture Model (GMM) based system. We also examine the application of feature warping to magnitude-based features and the MODGDF and find that it can offer a significant cumulative improvement. We find that the addition of a modified regression-based Shifted Delta Cepstrum (SDC) further improves system performance beyond that obtained by a more standard SDC configuration. The combination of PLP, feature warping and the proposed regression-based SDC achieved an accuracy of 88.4% in tests on 10 languages in the OGI TS Corpus, which compares very favourably with alternative language identification systems reported in the literature.

Proceedings Article
27 May 2006
TL;DR: This paper presents two techniques of formant estimation based on LPC and cepstral analysis, implemented with Matlab and applied to the problem of accurate measurement of formant frequencies, and results show the efficiency of the LP based technique and the limitation of the cepstral technique in the estimation of formants at high frequencies.
Abstract: This paper presents two techniques of formant estimation based on LPC and cepstral analysis. These methods are implemented with Matlab and applied to the problem of accurate measurement of formant frequencies. The first algorithm estimates formant frequencies from the all-pole model of the vocal tract transfer function. The approach relies on the source-filter model, supposing that the speech signal can be considered to be the output of a linear system. The spectral peaks in the spectrum are the resonances of the vocal tract and are commonly referred to as formants. The cepstral algorithm picks formant frequencies from the smoothed spectrum. The approach relies on decomposing the speech signal by homomorphic deconvolution into two components: the first component represents the excitation, while the second component is intended to represent vocal tract resonances. The result, called the cepstrum, is then used to estimate the smoothed spectrum. Formant picking is achieved by localizing the spectral maxima from the envelope. Results show the efficiency of the LP based technique and the limitation of the cepstral technique in the estimation of formants at high frequencies.

Book ChapterDOI
13 Dec 2006
TL;DR: A novel pitch mean based frequency warping (PMFW) method is proposed to reduce the pitch variability in speech signals at the front-end of speech recognition to solve the problem of mismatch in bandwidth between the original and the warped spectra.
Abstract: In this paper, a novel pitch mean based frequency warping (PMFW) method is proposed to reduce the pitch variability in speech signals at the front-end of speech recognition. The warp factors used in this process are calculated based on the average pitch of a speech segment. Two functions to describe the relations between the frequency warping factor and the pitch mean are defined and compared. We use a simple method to perform frequency warping in the Mel-filter bank frequencies based on different warping factors. To solve the problem of mismatch in bandwidth between the original and the warped spectra, the Mel-filters selection strategy is proposed. At last, the PMFW mel-frequency cepstral coefficient (MFCC) is extracted based on the regular MFCC with several modifications. Experimental results show that the new PMFW MFCCs are more distinctive than the regular MFCCs.

Proceedings ArticleDOI
18 Sep 2006
TL;DR: Two novel features that can be concatenated with Mel frequency cepstral coefficients are presented and a significant improvement in the error rate was found using delta cepstral energy (DCE) and power spectrum deviation (PSDev).
Abstract: Speech and music discrimination has gained much popularity in recent years for efficient coding and automatic retrieval of multimedia sources and Automated Speech Recognition (ASR). Two novel features that can be concatenated with Mel frequency cepstral coefficients are presented in this paper: Delta Cepstral Energy (DCE) and Power Spectrum Deviation (PSDev). Employing a Gaussian mixture model for classification as a back-end to the system, a significant improvement in the error rate was found using these features. The effects of different musical instruments on error rates were also analyzed. Low frequency musical instruments like piano and electric bass guitar were found to be more difficult to discriminate from speech, however, the proposed features are also able to reduce such errors significantly.

Proceedings ArticleDOI
18 Dec 2006
TL;DR: Experiments indicate that this novel method combining LPC-based Cepstrum and HPS is effective and valuable for application in pitch detection, since it robustly handles different frequency domain noise and pitch errors.
Abstract: A novel method of pitch detection, combining LPC-based Cepstrum and Harmonic Product Spectrum (HPS), is proposed. The interaction between the vocal tract and the glottal excitation disturbs the detection of the glottal excitation, so we use the Linear Prediction Residual to eliminate the vocal tract information and the high frequency noise, which can improve the accuracy to some extent. In the real world, when the speech signal has been transmitted through the telephone system, low frequencies including pitch information have been cut off, which can significantly degrade the detection of the fundamental pitch frequency. In this paper, we use the novel method combining LPC-based Cepstrum and HPS to deal with this problem and pitch errors. Experiments indicate that this novel method is effective and valuable for application in pitch detection, since it robustly handles different frequency domain noise and pitch errors.
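
The Harmonic Product Spectrum half of the method can be sketched on its own as follows: the magnitude spectrum is decimated by 2, 3, ... and the copies are multiplied, so that only the fundamental bin, where all harmonics line up, stays large. This sketch does not include the LPC-based cepstrum or the linear prediction residual stage described above; the function name, the number of harmonics and the synthetic test tone are assumptions for illustration.

```python
import numpy as np

def harmonic_product_spectrum_f0(x, fs, n_harmonics=4):
    """Estimate the fundamental frequency (Hz) of x with a basic HPS."""
    magnitude = np.abs(np.fft.rfft(x))
    n_bins = len(magnitude) // n_harmonics
    hps = np.ones(n_bins)
    for r in range(1, n_harmonics + 1):
        # magnitude[::r][k] is the spectrum at bin k * r (the r-th harmonic).
        hps *= magnitude[::r][:n_bins]
    hps[0] = 0.0  # ignore the DC bin
    peak_bin = int(np.argmax(hps))
    return peak_bin * fs / len(x)

# Synthetic tone (assumed setup): 200 Hz fundamental plus four harmonics,
# with an integer number of cycles so there is no spectral leakage.
fs = 8000
t = np.arange(4000) / fs
tone = sum(np.sin(2 * np.pi * 200 * h * t) for h in range(1, 6))
f0_estimate = harmonic_product_spectrum_f0(tone, fs)
```

On real speech a window function and some handling of octave errors would normally be needed; neither is shown here.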

Proceedings ArticleDOI
14 May 2006
TL;DR: This work explores how two spectral noise estimation approaches can be applied in the context of model-based feature enhancement and shows that the resulting system achieves an accuracy on the Aurora2 task that is comparable to MBFE with prior knowledge on noise.
Abstract: Many compensation techniques, both in the model and feature domain, require an estimate of the noise statistics to compensate for the clean speech degradation in adverse environments. We explore how two spectral noise estimation approaches can be applied in the context of model-based feature enhancement. The minimum statistics method and the improved minima controlled recursive averaging method are used to estimate the noise power spectrum based only on the noisy speech. The noise mean and variance estimates are nonlinearly transformed to the cepstral domain and used in the Gaussian noise model of MBFE. We show that the resulting system achieves an accuracy on the Aurora2 task that is comparable to MBFE with prior knowledge on noise. Finally, this performance can be significantly improved when the MS or IMCRA noise mean is reestimated based on a clean speech model.

Proceedings ArticleDOI
14 May 2006
TL;DR: A supervised approach is proposed to learn the non linear transformation of the uncertainty from the linear spectral domain to the cepstral domain and shows substantial improvement over the baseline performance of the Aurora4 task.
Abstract: Recently several algorithms have been proposed to enhance noisy speech by estimating a binary mask that can be used to select those time-frequency regions of a noisy speech signal that contain more speech energy than noise energy. This binary mask encodes the uncertainty associated with enhanced speech in the linear spectral domain. The use of the cepstral transformation leads to a smearing of this uncertainty. We propose a supervised approach to learn the nonlinear transformation of the uncertainty from the linear spectral domain to the cepstral domain. This uncertainty is used by a decoder that exploits the variance associated with the enhanced cepstral features to improve robust speech recognition. Systematic evaluations on a subset of the Aurora4 task using the estimated uncertainty show substantial improvement over the baseline performance.

Book ChapterDOI
01 Jan 2006
TL;DR: It is well known that discontinuities in pipe networks give reflections to pressure waves that can be analysed to find the time delay between the original signal and the reflected one, but more complicated methods are required to extract data about further reflections from the end of the pipe which has a leak in it.
Abstract: It is well known that discontinuities in pipe networks give reflections to pressure waves that can be analysed to find the time delay between the original signal and the reflected one. A leak in a pipe will also give a reflection point, though possibly a more diffuse one. It is a reasonably straightforward task (using, say, a cross correlation) to measure the time delay of the first reflection, but more complicated methods are required to extract data about further reflections from, for example, the end of the pipe which has a leak in it.

Proceedings ArticleDOI
01 Oct 2006
TL;DR: A study is presented to apply order cepstrum and radial basis function (RBF) artificial neural network (ANN) for gear fault detection during speed-up process and the results show the effectiveness of order cepstrum and RBF in detection of the gear condition.
Abstract: A study is presented to apply order cepstrum and radial basis function (RBF) artificial neural network (ANN) for gear fault detection during speed-up process. This method combines computed order tracking and cepstrum analysis with ANN. Firstly, the vibration signal during the speed-up process of the gearbox is sampled at constant time increments and then resampled at constant angle increments. Secondly, the resampled signals are processed by cepstrum analysis. The order cepstra of gears in normal, wear and crack fault conditions are processed for feature extraction. In the end, the extracted features are used as inputs to the RBF for recognition. The RBF is trained with a subset of the experimental data for known machine conditions. The ANN is tested using the remaining set of data. The procedure is illustrated using the experimental vibration data of a gearbox. The results show the effectiveness of order cepstrum and RBF in detection of the gear condition.

Proceedings ArticleDOI
14 May 2006
TL;DR: A bilateral time spread echo kernel is proposed to improve robustness against malicious attacks as well as enhance the detection performance by increasing the peak value of the kernel cepstrum at the echo delay time.
Abstract: Echo hiding is an audio watermarking method where information is embedded in the echo delay. A bilateral time spread echo kernel is proposed to improve robustness against malicious attacks as well as enhance the detection performance by increasing the peak value of the kernel cepstrum at the echo delay time. Based on the spreading property of PN sequences and expansions of logarithm and binomial, a closed-form detection gain formula is derived analytically. Computer simulation confirms the proposed method is effective and outperforms the previous unilateral method.

Journal ArticleDOI
01 Dec 2006
TL;DR: Both measures of the periodicity of the voice signal decrease with increasing aperiodicity; CPP is shown to be relatively f0-independent, but appears to be less sensitive when compared against SRA, a new measure utilising all rahmonics in the cepstrum.
Abstract: Aperiodicity in sustained phonation can result from temporal, amplitude and waveshape perturbations, turbulent noise, nonlinear phenomena and non-stationarity of the vocal tract. General measures of the periodicity of the voice signal are of interest in, for example, quantifying voice quality and in the assessment of pathological voice. High and low quefrency cepstral techniques are employed to supply an index of the degree of voice signal periodicity. In the high quefrency region, the first rahmonic is used to provide an indication of the periodicity of the signal. A new measure, SRA (sum of rahmonic amplitudes), utilising all rahmonics in the cepstrum, is tested against synthesis data (six levels of random jitter, cyclic jitter, shimmer and random noise). In addition, an existing popular technique using the first rahmonic (cepstral peak prominence, CPP) is assessed with synthesis data for the first time. Both measures decrease with increasing aperiodicity levels of the glottal source, decreasing more noticeably for noise and random jitter than for shimmer and cyclic jitter. CPP is shown to be relatively f0-independent; however, the index appears to be less sensitive when compared against SRA.

Patent
06 Apr 2006
TL;DR: A noise detector is configured to execute a so-called "cepstrum" transform of a captured signal exposed to cell phone noise; the noise can then easily be detected from the cepstrum transform as peaks at known samples, and noise elimination or attenuation may then be executed on the captured signal when cell phone noise is detected.
Abstract: The present invention relates to detecting cell phone noise induced in telecommunication equipment, especially in microphones and other unshielded electronic units connected to a communication terminal. A noise detector is configured to execute a so-called 'cepstrum' transform of a captured signal exposed to cell phone noise. Due to the characteristics of cell phone radio signals using TDMA, cell phone induced noise can then easily be detected from the cepstrum transform as peaks at known samples, and noise elimination or attenuation may then be executed on the captured signal when cell phone noise is detected.

Journal Article
TL;DR: A blind blur estimation technique based on the low rank approximation of the cepstrum of degraded images is proposed, and it is shown that the proposed technique can correctly estimate commonly used blur types both in noiseless and noisy cases.
Abstract: The quality of image restoration from degraded images is highly dependent upon a reliable estimate of blur. This paper proposes a blind blur estimation technique based on the low rank approximation of cepstrum. The key idea that this paper presents is that the blur functions usually have low ranks when compared with ranks of real images and can be estimated from cepstrum of degraded images. We extend this idea and propose a general framework for estimation of any type of blur. We show that the proposed technique can correctly estimate commonly used blur types both in noiseless and noisy cases. Experimental results for a wide variety of conditions i.e., when images have low resolution, large blur support, and low signal-to-noise ratio, have been presented to validate our proposed method.

Journal Article
TL;DR: Two feature extraction methods based on differential power spectrum (DPS) and differential cepstrum, originally used in the research areas of speech signal processing and homomorphic signal processing respectively, are introduced to the radar target recognition community.
Abstract: The problem of target recognition using the high resolution radar range profiles is discussed. Two feature extraction methods based on differential power spectrum (DPS) and differential cepstrum, originally used in the research areas of speech signal processing and homomorphic signal processing respectively, are introduced to the radar target recognition community. Two differential power spectrum based features are applied to target classification. A multi-layered feed forward neural network with the SARPROP (simulated annealing resilient propagation) algorithm is selected as classifier. The range profiles are obtained with the step-frequency technique and the two-dimension backscatter distribution data of four different scaled aircraft models. Simulations are presented to evaluate the classification performance with the above features. The results show that the differential power spectrum based feature is effective and robust for the radar target recognition.

Proceedings ArticleDOI
20 Aug 2006
TL;DR: A set of new dynamic features for speaker verification system are introduced that can compactly represent the information in the delta and delta-delta cepstra and it is shown theoretically that DCE carries the same information as the delta cepstrum using an entropy criterion.
Abstract: Dynamic cepstral features such as delta and delta-delta cepstra have been shown to play an essential role in capturing the transitional characteristics of the speech signal. In this paper, a set of new dynamic features for speaker verification systems is introduced. These new features, known as delta cepstral energy (DCE) and delta-delta cepstral energy (DDCE), can compactly represent the information in the delta and delta-delta cepstra. Furthermore, it is shown theoretically that DCE carries the same information as the delta cepstrum using an entropy criterion. Experimental speaker verification results on the TIMIT database support the theoretical result, showing a significant improvement in terms of equal error rate compared with conventional feature extraction methods using delta and delta-delta cepstra.