
Showing papers on "Cepstrum" published in 1994


Book
01 Jan 1994
TL;DR: Digital signals and systems z-transforms digital filter design discrete and fast Fourier transform algorithms periodogram and Blackman-Tukey CEPSTRUM adaptive noise cancelling adaptive line enhancer adaptive zero tracking methods autoregressive (AR) method autoregressive moving average (ARMA) method Prony's method.
Abstract: Digital signals and systems z-transforms digital filter design discrete and fast Fourier transform algorithms periodogram and Blackman-Tukey CEPSTRUM adaptive noise cancelling adaptive line enhancer adaptive zero tracking methods autoregressive (AR) method autoregressive moving average (ARMA) method Prony's method.

123 citations


Journal ArticleDOI
TL;DR: An algorithm for nonparametric estimation of 1D ultrasound pulses in echo sequences from human tissues is derived, and its ability to recover low variance attenuation estimates in the normal liver from in vivo pulse-echo data is demonstrated.
Abstract: An algorithm for nonparametric estimation of 1D ultrasound pulses in echo sequences from human tissues is derived. The technique is a variation of the homomorphic filtering technique using the real cepstrum, and the underlying basis of the method is explained. The algorithm exploits a priori knowledge about the structure of RF line echo data and can employ a number of adjacent RF lines from an image. The prime application of the algorithm is to yield a pulse suitable for deconvolution algorithms. This will enable these algorithms to properly take into account the frequency dependence of the attenuation and its variation within a patient and among patients. It is also possible to use the estimated pulse for attenuation estimation, and the consistency of the assumptions underlying the proposed technique is demonstrated by its ability to recover low variance attenuation estimates in the normal liver from in vivo pulse-echo data. Estimates are given for 8 different patients.

85 citations
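
The real-cepstrum homomorphic filtering mentioned in the entry above can be sketched as follows. This is a minimal illustration of the general technique, not the authors' algorithm: the function names and the `cutoff` parameter are assumptions introduced here, and the paper's use of a priori RF-line structure and adjacent lines is not modelled. Keeping only the low-quefrency part of the real cepstrum gives a smoothed log magnitude spectrum, which is the usual starting point for a pulse estimate.

```python
import numpy as np

def real_cepstrum(x):
    """Real cepstrum: inverse FFT of the log magnitude spectrum."""
    x = np.asarray(x, dtype=float)
    log_mag = np.log(np.abs(np.fft.fft(x)) + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_mag).real

def low_quefrency_log_spectrum(x, cutoff):
    """Smoothed log magnitude spectrum from cepstral samples below `cutoff`.

    `cutoff` (in samples) is a hypothetical tuning parameter, not a value
    taken from the paper.
    """
    c = real_cepstrum(x)
    lifter = np.zeros_like(c)
    lifter[:cutoff] = 1.0
    if cutoff > 1:
        lifter[-(cutoff - 1):] = 1.0  # keep the mirrored half so the result stays real
    return np.fft.fft(c * lifter).real
```

With `cutoff` set to half the frame length plus one, nothing is removed and the function returns the original log magnitude spectrum; smaller values progressively smooth it.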


Journal ArticleDOI
TL;DR: This paper addresses the problem of estimating mean-scatterer spacing from the backscattered ultrasound signal using spectral redundancy characterized by the spectral autocorrelation (SAC) function, and results indicate that SAC-based estimates converge more reliably over smaller amounts of data than cepstrum-based estimates.
Abstract: An ultrasonic backscattered signal from material comprised of quasiperiodic scatterers exhibits redundancy over both its phase and magnitude spectra. This paper addresses the problem of estimating mean-scatterer spacing from the backscattered ultrasound signal using spectral redundancy characterized by the spectral autocorrelation (SAC) function. Mean-scatterer spacing estimates are compared for techniques that use the cepstrum and the SAC function. A-scan models consist of a collection of regular scatterers with Gamma distributed spacings embedded in diffuse scatterers with uniformly distributed spacings. The model accounts for attenuation by convolving the frequency dependent scattering centers with a time-varying system response. Simulation results indicate that SAC-based estimates converge more reliably over smaller amounts of data than cepstrum-based estimates. A major reason for the performance advantage is the use of phase information by the SAC function, while the cepstrum uses a phaseless power spec...

83 citations


Proceedings ArticleDOI
19 Apr 1994
TL;DR: Noise-masking is considered, through the addition of a constant offset to the linear spectral estimates, which provides a feature space far more stable to changes in noise statistics, which leads to performance equivalent to that achieved by explicit modelling.
Abstract: This paper examines the effects of additive Gaussian noise on the short-term cepstral analysis of speech. We identify three distinct modifications to the long-term statistics of the cepstrum that cause a gross mismatch after the addition of noise, namely: a mean shift, a change of variance and a distribution distorted from normal, with distinct bimodal characteristics. We assess the importance of each of these, and demonstrate the limitations of simple cepstral mappings. We then consider noise-masking, through the addition of a constant offset to the linear spectral estimates, which provides a feature space far more stable to changes in noise statistics. This leads to performance equivalent to that achieved by explicit modelling.

70 citations


Journal ArticleDOI
TL;DR: A new set of features is introduced that has been found to improve the performance of automatic speaker identification systems, known as the adaptive component weighting (ACW) cepstral coefficients, which provides an adaptively weighted version of the LP cepstrum.
Abstract: A new set of features is introduced that has been found to improve the performance of automatic speaker identification systems. The new set of features is referred to as the adaptive component weighting (ACW) cepstral coefficients. The new features emphasize the formant structure of the speech spectrum while attenuating the broad-bandwidth spectral components. The attenuated components correspond to the variations in spectral tilt of transmission and recording environment, and other characteristics that are irrelevant to speaker identification. The resulting ACW spectrum introduces zeros into the usual all-pole linear prediction (LP) spectrum. This is equivalent to applying a finite impulse response (FIR) filter that normalizes the narrow-band modes of the spectrum. Unlike existing fixed cepstral weighting schemes, the ACW cepstrum provides an adaptively weighted version of the LP cepstrum. The adaptation results in deemphasizing the irrelevant variations of the LP cepstral coefficients on a frame-by-frame basis. The ACW features are evaluated for text-independent speaker identification and are shown to yield improved performance.

65 citations
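
For orientation, the plain LP cepstrum that the ACW entry above takes as its baseline can be computed from the predictor coefficients with a standard recursion. The sketch below shows only that baseline, not the ACW weighting itself; the function name and the sign convention H(z) = 1 / (1 - sum_k a_k z^-k) are assumptions chosen for the illustration.

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of an all-pole model.

    Assumed convention: H(z) = 1 / (1 - sum_k a[k-1] * z**-k).
    The gain term c[0] is not computed.
    """
    a = np.asarray(a, dtype=float)
    p = len(a)
    c = np.zeros(n_ceps + 1)
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                # recursion: c_n = a_n + sum_k (k/n) * c_k * a_{n-k}
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

A quick sanity check is the one-pole model with coefficient `a`, whose cepstrum is known in closed form as `a**n / n`.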


Patent
19 Jul 1994
TL;DR: In this article, the authors used a plurality of correlators to improve an estimate of direct signal arrival time by identifying features of a correlation function at and adjacent to the correlation peak.
Abstract: Method and apparatus for using a plurality of correlators to improve an estimate of direct signal arrival time by identifying features of a correlation function at and adjacent to the correlation peak. In a first embodiment, the errors in location of the center point of a correlation function R(τ), formed by the incoming composite signal and a stored copy of the expected signal, are assumed to be strongly correlated for narrow sample spacing and wide sample spacing of the correlation function. In a second embodiment, multipath signal strengths and phases are estimated, using multiple sampling of the correlation function R(τ). This approach assumes that path delays of the direct signal and of the multipath signals can be determined separately. Path delays can be determined by any of at least three approaches: (1) identification of slope transition points in the correlation function; (2) Cepstrum processing of the received signal, using Fourier transform analysis; and (3) use of a grid of time points on the correlation function domain, and identification of time values, associated with certain solution parameters of the least mean squares analysis that have the largest absolute values, as times of arrival of the direct and multipath signals. Separate identification of multipath time delays reduces the least mean squares analysis to a linear problem. A modified signal is constructed, with the multipath signal(s) approximately removed from the incoming composite signal. This modified signal allows a better estimate of the arrival time of the direct signal.

41 citations
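
Approach (2) in the patent entry above, cepstrum processing to find path delays, rests on the fact that a delayed copy of a signal adds a ripple to the log spectrum, which shows up as a peak in the cepstrum at the delay. The sketch below illustrates that idea on synthetic data only; the function name, the white-noise test signal, and the single-echo model are assumptions and do not reproduce the patented receiver processing.

```python
import numpy as np

def echo_delay_from_cepstrum(x, min_lag=1):
    """Lag (in samples) of the largest power-cepstrum peak above `min_lag`."""
    x = np.asarray(x, dtype=float)
    power = np.abs(np.fft.fft(x)) ** 2
    cepstrum = np.fft.ifft(np.log(power + 1e-12)).real
    half = len(x) // 2  # the cepstrum of a real signal is symmetric
    return min_lag + int(np.argmax(cepstrum[min_lag:half]))

# Synthetic composite signal: direct path plus one attenuated, delayed copy.
rng = np.random.default_rng(0)
direct = rng.standard_normal(4096)
delay, gain = 37, 0.5  # illustrative values
received = direct.copy()
received[delay:] += gain * direct[:-delay]

estimated = echo_delay_from_cepstrum(received)
```

With several multipath components the cepstrum contains peaks at each delay and at their multiples, so a practical detector would need more than a single `argmax`.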


Journal Article
TL;DR: In this article, a new algorithm is proposed which reduces the vector order of individual sound localization transfer functions (SLTFs) using cepstrum parameters, and the SLTF vectors are classified into several clusters using the Linde, Buzo, and Gray method.
Abstract: A new algorithm is proposed which prepares a set of sound localization transfer functions (SLTFs) for a sound localization system with binaural earphones. A listener in a teleconference system can choose the best SLTFs from the set prepared in advance. This avoids the trouble of measuring SLTFs individually and maintains good perceptual localization performance. This algorithm reduces the vector order of the individual SLTFs, using cepstrum parameters. The SLTF vectors are classified into several clusters using the Linde, Buzo, and Gray method, and the SLTF nearest each cluster's centroid is chosen as the typical SLTF the cluster

39 citations


Proceedings ArticleDOI
19 Apr 1994
TL;DR: The proposed scheme processes the outputs of two microphones using cepstra operations and the theory of signal reconstruction from phase only and reconstructs the room impulse response associated with each microphone and restores the speech signal.
Abstract: We propose an algorithm for the restoration of speech that has been degraded through addition of multiple echoes. The proposed scheme processes the outputs of two microphones using cepstra operations and the theory of signal reconstruction from phase only. Under mild assumptions it reconstructs the room impulse response associated with each microphone and restores the speech signal. We demonstrate the performance of the proposed scheme using speech reverberated by simulated room acoustics.

39 citations


Journal ArticleDOI
TL;DR: New discrete time blind deconvolution methods for nonminimum phase linear channels driven by cyclo-stationary inputs rely exclusively on second-order statistics and do not impose any constraints on the distribution of the channel input as in the case of methods based on higher-order statistics.
Abstract: New discrete time blind deconvolution methods are proposed for nonminimum phase linear channels driven by cyclo-stationary inputs. The methods rely exclusively on second-order statistics and do not impose any constraints on the distribution of the channel input as in the case of methods based on higher-order statistics. The output of the channel is fractionally sampled and then the complex cepstrum of the cyclic autocorrelation is obtained. It is shown that this complex cepstrum preserves nonminimum phase information and thus the identification of nonminimum phase channels is possible. Practical constraints in the implementation of the methods and channel identifiability conditions are discussed. The applicability of the methods to both channel identification and fractionally spaced linear and DFE equalization is described and verified by means of computer simulations.

36 citations



Journal ArticleDOI
TL;DR: In this paper, the poles and zeros of the frequency response function (FRF) from a measured response autospectrum were extracted by curve-fitting analytical expressions to selected regions of the response power cepstrum which are known to be dominated by the FRF.

Proceedings ArticleDOI
19 Apr 1994
TL;DR: Experimental results show that the proposed algorithm, combined with cepstrum mean subtraction, improves the recognition accuracy when the system is tested on a microphone with different characteristics than the one on which it was trained.
Abstract: In this paper, we present several approaches designed to increase the robustness of BYBLOS, the BBN continuous speech, hidden Markov model (HMM) recognition system. We address the problem of increased degradation in performance when there is mismatch in the characteristics of the training and the test microphones. First we compare RASTA processing and cepstrum mean subtraction as preprocessing methods, to compensate for unknown channel transfer function effects, when we have no information about the new microphone. Then we introduce a new algorithm that computes a probabilistic transformation from the training microphone codebook to that of a new microphone, given some information about the new microphone. We test this algorithm in supervised mode and, combined with a microphone selection method, in unsupervised mode. We present experimental results which show that the proposed algorithm, combined with cepstrum mean subtraction, improves the recognition accuracy when the system is tested on a microphone with different characteristics than the one on which it was trained.
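
Cepstrum mean subtraction, one of the two preprocessing methods compared in the entry above, is simple enough to sketch directly. A fixed linear channel (such as a microphone) multiplies the spectrum, so it adds a constant vector to every cepstral frame; subtracting the per-utterance mean removes that constant. The function name and the frames-by-coefficients array layout below are assumptions for the illustration, not details of the BYBLOS system.

```python
import numpy as np

def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over time.

    `frames` is assumed to have shape (num_frames, num_coefficients).
    """
    frames = np.asarray(frames, dtype=float)
    return frames - frames.mean(axis=0, keepdims=True)

# Illustration with made-up data: the same features seen through a fixed
# channel offset normalise to the same values as the clean features.
rng = np.random.default_rng(0)
clean = rng.standard_normal((50, 13))
channel_offset = rng.standard_normal(13)
normalised_clean = cepstral_mean_subtraction(clean)
normalised_channel = cepstral_mean_subtraction(clean + channel_offset)
```

The normalisation also removes the long-term average of the speech itself, which is one reason it is usually applied over a whole utterance rather than a few frames.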

Proceedings ArticleDOI
19 Apr 1994
TL;DR: A speech spectrum transformation method by interpolating spectral patterns between pre-stored multiple speakers for speech synthesis, which uses only a small amount of training data to generate a new speech spectrum sequence close to the target speaker's.
Abstract: Proposes a speech spectrum transformation method by interpolating spectral patterns between pre-stored multiple speakers for speech synthesis. The interpolation is carried out using spectral parameters such as cepstrum and log area ratio to generate new spectrum patterns. The spectral patterns can be transformed smoothly as the interpolation ratio is gradually changed, and speech individuality can easily be controlled between interpolated speakers. Adaptation to a target speaker can be performed by this interpolation, which uses only a small amount of training data to generate a new speech spectrum sequence close to the target speaker's. An adaptation experiment was carried out in the case of using only one word spoken by the target speaker for learning. It was shown that the distance between the target speaker's spectrum and the spectrum generated by the proposed interpolation method is reduced by about 40% compared with distance between the target speaker's spectrum and spectrum of the speaker closest to the target among pre-stored ones.

Proceedings ArticleDOI
19 Apr 1994
TL;DR: The adaptive component weighting (ACW) cepstral coefficients represent an adaptively weighted version of the LP cepstrum and are shown to offer improved speaker identification performance as compared to other common methods of cepstral weighting.
Abstract: In this paper we introduce a new set of features that provides improved performance for speaker identification. This feature set is referred to as the adaptive component weighting (ACW) cepstral coefficients. The ACW scheme modifies the linear predictive (LP) spectral components (resonances) so as to emphasize the formant structure by attenuating the broad-bandwidth spectral components. Such components are found to introduce undesired variability in the LP spectra of speech signals due to environmental factors. The ACW cepstral coefficients represent an adaptively weighted version of the LP cepstrum. The adaptation results in deemphasizing the irrelevant variations of the LP cepstral coefficients on a frame-by-frame basis. Experiments are presented using the San Diego portion of the King database. The ACW cepstrum is shown to offer improved speaker identification performance as compared to other common methods of cepstral weighting.

Proceedings ArticleDOI
19 Apr 1994
TL;DR: The aim of this paper is to show that OSALPC also achieves good performance in a case of real noisy speech (in a car environment), and to explore its combination with several robust similarity measuring techniques, showing that its performance improves by using cepstral liftering, dynamic features and multilabeling.
Abstract: The performance of the existing speech recognition systems degrades rapidly in the presence of background noise. The OSALPC (one-sided autocorrelation linear predictive coding) representation of the speech signal has been shown to be attractive for speech recognition because of its simplicity and its high recognition performance with respect to the standard LPC in severe conditions of additive white noise. The aim of this paper is twofold: (1) to show that OSALPC also achieves good performance in a case of real noisy speech (in a car environment), and (2) to explore its combination with several robust similarity measuring techniques, showing that its performance improves by using cepstral liftering, dynamic features and multilabeling.

Journal ArticleDOI
TL;DR: In this paper, the authors used cepstrum analysis to identify harmonics and sideband families for fault diagnosis in gears, bearings, and turbine blades of ships and submarines.
Abstract: Conventional frequency analysis of machinery vibration is not adequate for accurately finding defects in gears, bearings, and blades where sidebands and harmonics are present. Such an approach is also dependent on the transmission path. Cepstrum analysis, on the other hand, accurately identifies harmonics and sideband families and is a better technique for fault diagnosis in gears, bearings, and turbine blades of ships and submarines. The cepstrum represents the global power content of a whole family of harmonics and sidebands when more than one family of sidebands is present at the same time. It is also insensitive to transmission path effects, since source and transmission path effects are additive and can be separated in the cepstrum. The concept, underlying theory, and the measurement and analysis involved in using the technique are briefly outlined. Two cases were taken to demonstrate the advantage of the cepstrum technique over spectrum analysis. An LP compressor was chosen to study the transmission path effects, and a marine gearbox having two sets of sideband families was studied to diagnose the problematic sideband and its severity.

Journal ArticleDOI
TL;DR: The degree of aperiodicity and the excess of high-frequency noise appear to be frequently influenced in a very different way by the surgical treatment, and the magnitude of the main cepstrum peak reaches the best statistical significance score for demonstrating functional improvement after surgery.
Abstract: Objective and quantitative acoustic parameters are useful—in addition to perceptual evaluation—for assessing the results of voice surgery. Thirty-two patients with different kinds of benign vocal fold lesions and ten patients who had Teflon injection in a paralysed vocal fold were investigated just before and a few months after surgery. In each case we measured in a sustained /a/: relative high-frequency noise, jitter ratio, and magnitude of dominant cepstrum peak. Paired values of all three parameters demonstrate a statistically significant improvement. The degree of aperiodicity and the excess of high-frequency noise appear to be frequently influenced in a very different way by the surgical treatment. However, the magnitude of the main cepstrum peak, which is sensitive to both components, reaches the best statistical significance score for demonstrating functional improvement after surgery.

Journal ArticleDOI
TL;DR: Using pitch-normalized phone models in an HMM-LR speech recognition system improved the phrase recognition accuracy for the top 5 candidates from 96% to 97.5%, i.e. the error rate was nearly halved.
Abstract: This paper proposes a novel method of incorporating pitch information into an HMM speech recognition system by exploiting the correlation between pitch and spectral parameters, e.g. cepstrum. Pitch patterns are not used explicitly; instead, spectral parameters are normalized framewise according to the pitch value. Evidence is given to show that the use of pitch information consistently improves the recognition performance. Experiments with 24 phoneme labels showed that the phoneme error rate for fast continuous speech could be improved by about 10%. Using these pitch-normalized phone models in an HMM-LR speech recognition system improved the phrase recognition accuracy for the top 5 candidates from 96% to 97.5%, i.e. the error rate was nearly halved.

Journal ArticleDOI
TL;DR: Several distance measures (Euclidean distances between complex autoregressive coefficients or complex partial correlation coefficients, log-likelihood distance, and complex power cepstrum distance) between planar shapes are presented on the basis of a complex autoregressive model and are suitable for classification, identification, or clustering of planar shapes, including unsupervised applications.

Journal ArticleDOI
TL;DR: A new technique is developed for the estimation of interpath time delay applying the multiple signal classification (MUSIC) superresolution spectral estimation method, which samples the signals received by two spatially separated antennas to compute the normalized MUSIC cepstrum.
Abstract: This paper considers the problem of passive geolocation for the case of HF multipath propagation. A new technique is developed for the estimation of interpath time delay applying the multiple signal classification (MUSIC) superresolution spectral estimation method. The technique samples the signals received by two spatially separated antennas to compute the normalized MUSIC cepstrum. The method is applied to experimental data in a preliminary proof-of-concept analysis.


Proceedings ArticleDOI
19 Apr 1994
TL;DR: A new algorithm for optimizing the dynamic cepstrum lifter array successfully improved the speech recognition performance for the speech spoken even in a different speaking style.
Abstract: The dynamic cepstrum parameter representing a masked spectrum performed extremely well in continuous speech recognition. This paper proposes a new algorithm for optimizing the dynamic cepstrum lifter array. The masking filter is represented by a set of Gaussian-shaped lifters. The standard deviation and the gain of the Gaussians are trained in order to improve the performance of the time-frequency filter. Parameterizing the lifter shape provides robustness against unknown speech samples. Because of the parameterized lifter's small degree of freedom, it can avoid over-learning. The gradient descent optimizing algorithm is formulated for both a neural network classifier and an HMM classifier. The optimized dynamic cepstrum successfully improved the speech recognition performance for the speech spoken even in a different speaking style.

Proceedings ArticleDOI
19 Apr 1994
TL;DR: Experimental results indicate that cepstral-time features, and spectral-time noise processing, provide an effective framework for robust speech recognition in noisy environments.
Abstract: This paper explores the advantages of using cepstral-time feature matrices, and spectral-time filters, for noisy speech recognition within a hidden Markov model framework. The use of cepstral-time features with spectral subtraction and state-based time-varying Wiener filters is investigated. Experimental results indicate that cepstral-time features, and spectral-time noise processing, provide an effective framework for robust speech recognition in noisy environments.

Proceedings ArticleDOI
09 Oct 1994
TL;DR: The homomorphic deconvolution method gave substantial improvement in the radial resolution of B-scan images of a tissue mimicking phantom and of human tissues in vivo without significant amplification of the image noise.
Abstract: Describes how homomorphic deconvolution can be used to improve the radial resolution of in vitro and in vivo medical ultrasound images. Each of the recorded radiofrequency ultrasound beams used to form the image was considered as a finite depth sequence of length N, and was weighted with the same exponential depth sequence to create at least some minimum phase sequences. The mean value at each depth sample of the complex cepstrum sequences was computed, and the low depth portion of this mean sequence was taken as the complex cepstrum representation of the ultrasound pulse. It was transformed back to the Fourier frequency domain, and was used to compute the deconvolved echo depth sequence. The method gave substantial improvement in the radial resolution of B-scan images of a tissue mimicking phantom and of human tissues in vivo without significant amplification of the image noise.

Proceedings ArticleDOI
19 Apr 1994
TL;DR: A comparison of normalisation functions demonstrates that the root-based cepstral representation can still be significantly enhanced by an appropriate reinforcement of the most energetic speech portions, leading to a 10-15% increase in performance on raw data.
Abstract: In this contribution we address the problem of speech recognition in noise and mismatch (there is mismatch when the conditions of training are different from those of testing). We extend our previously reported work in two directions. First, a comparison of normalisation functions demonstrates that the root-based cepstral representation can still be significantly enhanced by an appropriate reinforcement of the most energetic speech portions, leading to a 10-15% increase in performance on raw data. Second, we extend this concept and propose two root-adaptive schemes. Recognition tests demonstrate the noise robustness of the proposed analysis that further improves the results by a significant amount.

Journal ArticleDOI
TL;DR: In this article, an experimental evaluation of the ability of sound pressure microphones to diagnose different machinery conditions in noisy environments was performed and an adaptive filtering (ANC) routine was incorporated to reduce the noise.

Proceedings ArticleDOI
30 Sep 1994
TL;DR: In this paper, the bicepstrum was used for the identification of linear motion and defocus blurs, and the performance of the blur identification methods based on the spectrum, the cepstrum, the bispectrum and the bicepstrum was compared for different blur sizes and signal-to-noise ratio levels.
Abstract: The identification of the point spread function (PSF) from the degraded image data constitutes an important first step in image restoration that is known as blur identification. Though a number of blur identification algorithms have been developed in recent years, two of the earlier methods based on the power spectrum and power cepstrum remain popular, because they are easy to implement and have proved to be effective in practical situations. Both methods are limited to PSFs which exhibit spectral nulls, such as those due to a defocused lens and linear motion blur. Another limitation of these methods is the degradation of their performance in the presence of observation noise. The central slice of the power bispectrum has been employed as an alternative to the power spectrum which can suppress the effects of additive Gaussian noise. In this paper, we utilize the bicepstrum for the identification of linear motion and defocus blurs. We present simulation results where the performance of the blur identification methods based on the spectrum, the cepstrum, the bispectrum and the bicepstrum is compared for different blur sizes and signal-to-noise ratio levels. © (1994) COPYRIGHT SPIE--The International Society for Optical Engineering.


Proceedings ArticleDOI
13 Apr 1994
TL;DR: Two different approaches are here presented for converting speech into graphic animation, suitable to lipreading, by associating incoming acoustic frames with correspondent consistent graphic primitives, pictures or parameters, suitably visualized.
Abstract: Two different approaches are here presented for converting speech into graphic animation, suitable to lipreading. Coarticulation modeling has been taken into account by means of an original algorithm for multistep statistical processing of digital speech in the cepstrum space. Two different visualization methods have been used, the first based on key reference pictures and the second based on a parametric flexible structure. Speech is converted, in real-time, into lipreadable visual animation by associating incoming acoustic frames with correspondent consistent graphic primitives, pictures or parameters, suitably visualized. Multiple fields of application are foreseen, mainly in rehabilitation, training, education and communication among hard of hearing people.

Proceedings ArticleDOI
28 Nov 1994
TL;DR: Simulation of the proposed method shows that the difference frequency is estimated to an accuracy within 4 Hz, which will contribute to AFC of SCSSB.
Abstract: In order to get distortion-free speech in suppressed-carrier SSB (SCSSB), a receiver should tune to the carrier frequency accurately. Manual tuning is very difficult due to the lack of a carrier component. If fine tuning fails at reception, the harmonic structure of the demodulated voiced sound is lost. The fundamental frequency Fo, however, can be calculated from the cepstrum of the received speech even if the voiced spectrum is shifted. The difference between the carrier frequency and the receiving frequency is estimated from peaks in the spectrum of the received speech, which correspond to harmonics of Fo. Simulation of the proposed method shows that the difference frequency is estimated to an accuracy within 4 Hz. This method will contribute to AFC of SCSSB.
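
The first step described in the entry above, reading Fo from the cepstrum, can be illustrated with a minimal sketch. It covers only cepstral pitch estimation on a synthetic, correctly tuned pulse train; the carrier-offset estimation from spectral peaks is not shown. The function name, the 50 to 400 Hz search range, and the test signal are assumptions made for the example, and a real speech frame would normally be windowed first.

```python
import numpy as np

def f0_from_cepstrum(x, fs, f0_min=50.0, f0_max=400.0):
    """Estimate the fundamental frequency from the main cepstral peak.

    The search range (f0_min, f0_max) is an illustrative assumption.
    """
    x = np.asarray(x, dtype=float)
    log_mag = np.log(np.abs(np.fft.fft(x)) + 1e-12)
    cepstrum = np.fft.ifft(log_mag).real
    q_lo = int(fs / f0_max)  # shortest pitch period considered, in samples
    q_hi = int(fs / f0_min)  # longest pitch period considered, in samples
    peak = q_lo + int(np.argmax(cepstrum[q_lo:q_hi + 1]))
    return fs / peak

# Synthetic voiced excitation: one pulse every 80 samples at 8 kHz,
# i.e. a pitch period of 10 ms.
fs = 8000
frame = np.zeros(1024)
frame[::80] = 1.0
f0 = f0_from_cepstrum(frame, fs)
```

Because the cepstral peak position depends on the spacing between harmonics rather than on their absolute frequencies, the estimate is unaffected when the whole spectrum is shifted, which is the property the abstract relies on.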