
Showing papers on "Cepstrum" published in 1988


Journal ArticleDOI
TL;DR: A computationally efficient identification procedure is proposed for a non-Gaussian white-noise-driven linear, time-invariant, nonminimum-phase system; it is flexible enough to be applied to autoregressive (AR), moving average (MA), or ARMA systems without a priori knowledge of the type of system.
Abstract: A computationally efficient identification procedure is proposed for a non-Gaussian white-noise-driven linear, time-invariant, nonminimum-phase system. The method is based on the idea of computing the complex cepstrum of higher-order cumulants of the system output. In particular, the differential cepstrum parameters of the nonminimum-phase impulse response are estimated directly from higher-order cumulants by least-squares solution or two-dimensional FFT operations. The method reconstructs the minimum-phase and maximum-phase impulse response components separately. It is flexible enough to be applied to autoregressive (AR), moving average (MA), or ARMA systems without a priori knowledge of the type of system. Benchmark simulation examples demonstrate the effectiveness of the method even with short data records.

189 citations


Journal ArticleDOI
TL;DR: An LPC (linear predictive coding) cepstrum distance measure (CD) is introduced as an objective measure for estimating the subjective quality of speech signals, and good correspondence between LPC CD and the subjective quality, expressed in terms of both opinion equivalent Q and mean opinion score, is shown.
Abstract: An LPC (linear predictive coding) cepstrum distance measure (CD) is introduced as an objective measure for estimating the subjective quality of speech signals. Good correspondence between LPC CD and the subjective quality, expressed in terms of both opinion equivalent Q and mean opinion score, is shown. Good repeatability of objective quality evaluation using LPC CD is also shown. A method for generating an artificial voice signal that reflects the characteristics of real speech signals is described. The LPC CD values calculated using this artificial voice are almost the same as those calculated using real speech signals. The speaker dependency of the coded-speech quality is shown to be an important factor in low-bit-rate speech coding. Even taking this factor into consideration, LPC CD is shown to be effective for estimating the subjective quality.
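The abstract does not give the formula for the LPC cepstrum distance, so the following is only a minimal sketch under stated assumptions: the LPC cepstrum is obtained from the predictor polynomial A(z) = 1 + a_1 z^-1 + ... + a_p z^-p of the all-pole model 1/A(z) by the standard recursion, and the distance uses the commonly quoted form CD = (10 / ln 10) * sqrt(2 * sum_k (c_k - c'_k)^2) in dB. The function names are hypothetical.

```python
import math


def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1/A(z).

    `a` holds a_1..a_p of A(z) = 1 + sum_k a_k z^-k (sign convention assumed).
    Recursion: c_n = -a_n - (1/n) * sum_{k=1}^{n-1} k * c_k * a_{n-k}.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Only terms with 1 <= n - k <= p contribute, since a_m = 0 for m > p.
        acc = sum(k * c[k - 1] * a[n - k - 1] for k in range(max(1, n - p), n))
        a_n = a[n - 1] if n <= p else 0.0
        c.append(-a_n - acc / n)
    return c


def cepstral_distance_db(c_ref, c_test):
    """Cepstrum distance in dB between two equally long cepstral vectors (assumed form)."""
    squared = sum((x - y) ** 2 for x, y in zip(c_ref, c_test))
    return (10.0 / math.log(10.0)) * math.sqrt(2.0 * squared)
```

For a single-pole model with a_1 = -0.5 the recursion reproduces the known series c_n = 0.5^n / n, which is a convenient sanity check before comparing an original and a coded frame.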

151 citations


Journal ArticleDOI
TL;DR: The use of power cepstrum analysis in image registration is explored, and a new algorithm can work very fast and accurately compared to conventional correlation techniques.
Abstract: The use of power cepstrum analysis in image registration is explored. Rotational shifts and translational shifts are corrected separately. The technique involves two main ideas. First, after preprocessing to remove extraneous information and information which could result in false registration parameters, a rotational shift is changed into a translational shift by using the shift-invariant property of the power spectrum. Second, power cepstrum analysis is used to correct the translational shift. Because of the introduction of these ideas, this new algorithm can work very fast and accurately compared to conventional correlation techniques. This registration technique is applied to sequential fundus images with potential application in detecting changes in fundus anomalies.
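The translational step can be illustrated with a small sketch. It is not the authors' algorithm: the preprocessing and the rotation handling are omitted, and summing the two images before taking the power cepstrum is an assumption about how the shift is exposed as a cepstral peak. The function names and the `eps` floor are hypothetical.

```python
import numpy as np


def power_cepstrum_2d(image, eps=1e-12):
    """2-D power cepstrum: inverse FFT of the log power spectrum."""
    spectrum = np.fft.fft2(image)
    log_power = np.log(np.abs(spectrum) ** 2 + eps)  # eps guards against log(0)
    return np.real(np.fft.ifft2(log_power))


def estimate_shift(reference, shifted):
    """Estimate a circular translation between two equally sized images.

    The sum reference + shifted behaves like an image plus its echo, so the
    cepstrum shows a peak at the shift (and, by symmetry, at its negative).
    """
    cep = power_cepstrum_2d(reference + shifted)
    cep[0, 0] = -np.inf  # the origin carries the mean log power, not the shift
    row, col = np.unravel_index(np.argmax(cep), cep.shape)
    return int(row), int(col)
```

Because the power cepstrum is symmetric, the peak identifies the shift only up to sign (modulo the image size); resolving the sign needs an extra check, for example comparing the two candidate alignments directly.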

48 citations


Proceedings ArticleDOI
D. Mansour1, Biing-Hwang Juang1
11 Apr 1988
TL;DR: Experimental results show that the new measures cause no degradation in recognition accuracy at high SNR, but perform significantly better when tested under noisy conditions using only clean reference templates.
Abstract: The authors aim at the formulation of similarity measures for robust speech recognition. Their consideration focuses on the speech cepstrum derived from linear prediction coefficients (the LPC cepstrum). By using common models for noisy speech, they analytically and empirically show how the ambient noise can affect some important attributes of the LPC cepstrum such as the vector norm, coefficient order, and the direction perturbation. The new findings led them to propose a family of distortion measures based on the projection between two cepstral vectors. Performance evaluation of these measures has been conducted in both speaker-dependent and speaker-independent isolated word recognition tasks. Experimental results show that the new measures cause no degradation in recognition accuracy at high SNR, but perform significantly better when tested under noisy conditions using only clean reference templates. At an SNR of 5 dB, the new measures are shown to be able to achieve a recognition rate equivalent to that obtained by the filtered cepstral measure at 20 dB SNR, demonstrating a gain of 15 dB.
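The abstract does not spell out the family of projection measures. As an illustration only, the sketch below shows one plausible projection-style distortion, assumed here rather than taken from the paper: the test vector's norm minus its projection onto the reference direction. It is insensitive to a pure change of norm, the attribute the abstract reports as affected by noise, while still penalizing a change of direction.

```python
import math


def projection_distortion(c_ref, c_test):
    """Illustrative projection-style distortion (assumed form, not the paper's).

    Returns |c_test| * (1 - cos(beta)), where beta is the angle between the
    reference and test cepstral vectors.
    """
    dot = sum(r * t for r, t in zip(c_ref, c_test))
    norm_ref = math.sqrt(sum(r * r for r in c_ref))
    norm_test = math.sqrt(sum(t * t for t in c_test))
    if norm_ref == 0.0 or norm_test == 0.0:
        raise ValueError("cepstral vectors must be non-zero")
    cos_beta = dot / (norm_ref * norm_test)
    return norm_test * (1.0 - cos_beta)
```

A test vector that is a shrunken copy of the reference gives a distortion of essentially zero, whereas the Euclidean distance between the same two vectors is not zero.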

36 citations


Proceedings ArticleDOI
11 Apr 1988
TL;DR: Analysis parameters and various distance measures are investigated for a template matching scheme for speaker identity verification (SIV); performance varies significantly across vocabulary, and average performance is approximately 5% EER for the better algorithms on telephone speech.
Abstract: Analysis parameters and various distance measures are investigated for a template matching scheme for speaker identity verification (SIV). Two parameters are systematically varied: the length of the signal analysis window, and the order of the linear predictive coding/cepstrum analysis. Computational costs associated with the choice of parameters are also considered. The distance measures tested are the Euclidean, inverse variance weighting, differential mean weighting, Kahn's simplified weighting, the Mahalanobis distance, and the Fisher linear discriminant. Using the equal error rate (EER) of pairwise utterance dissimilarity distributions, performance is estimated for prespecified and (a simulation of) user-determined input vocabulary. Performance varies significantly across vocabulary, and average performance is approximately 5% EER for the better algorithms on telephone speech.

30 citations


Book ChapterDOI
24 Jun 1988
TL;DR: In this article, a complete Bayes hypothesis test on second-order autoregressive power density spectrum poles is described for determining the kind of propulsion a vessel uses; the nature of gearbox noise is also described, and the cepstrum is proposed as an algorithm to detect this kind of noise.
Abstract: Concentrating mainly on the signal processing and physical models behind the algorithms used to classify ships by their underwater-radiated noise, the physical model for cavitation is expanded to include the losses by acoustical radiation and the heat transfer from the vapor to the fluid. The resulting equation allows one to find the characteristics of cavitation through simulation. Five algorithms for estimating the propeller speed have been found. The performance of the three most promising ones is judged with respect to the ratio of the expected value to the variance of the estimator. A complete Bayes hypothesis test on second-order autoregressive power density spectrum poles is then described for determining the kind of propulsion a vessel uses. The nature of gearbox noise is described, and the cepstrum is proposed as an algorithm to detect this kind of noise.

27 citations


Proceedings ArticleDOI
18 Jan 1988
TL;DR: An accurate, yet fast, algorithm for image registration is developed using a combination of power spectrum and power cepstrum analyses; it works quickly and accurately compared to conventional techniques.
Abstract: Precise registration techniques are essential for quantitative evaluation of sequential fundus images to make early detection of fundus anomalies feasible. The familiar correlation techniques for achieving such image registration are computationally intensive and suffer from non-uniqueness of solution. We have developed an accurate, yet fast, algorithm for image registration by using a combination of power spectrum and power cepstrum analyses. In this new algorithm rotational shifts and translational shifts are corrected separately. The technique involves two main ideas. First, a rotational shift is corrected and changed into a translational shift by computing Fourier power spectra. After the rotational shift has been corrected, i.e., images are parallel, the remaining translational shifts are handled. Because of the accuracy characteristics of the power cepstrum and the speed of the FFT, this new algorithm can work very fast and accurately compared to conventional techniques. Also, the cepstrum technique has better tolerance of image noise than the traditional correlation measures. The accuracy obtained and computational time required for the cepstrum-based registration techniques will be illustrated by operating on sequential fundus images used in early detection of glaucoma.

23 citations


PatentDOI
TL;DR: In this paper, an inputted sound signal is sampled at intervals over a period and cepstrum coefficients are calculated from the sampled values to distinguish voice (vowel) intervals and noise intervals.
Abstract: An inputted sound signal is sampled at intervals over a period and cepstrum coefficients are calculated from the sampled values. Cepstrum sum, distance and/or power are calculated and compared with appropriately preselected threshold values to distinguish voice (vowel) intervals and noise intervals. The ratio of the length of the voice intervals to the sampling period is considered to determine whether the sampled inputted sound signal represents voice or noise.

17 citations


Proceedings ArticleDOI
11 Apr 1988
TL;DR: A high-resolution estimation method using the cumulant cepstra (polycepstra) of the received sensor data as the basic tool for reconstruction is introduced, and the effectiveness of the method is demonstrated for different noise conditions and lengths of data.
Abstract: The authors address the problem in which time delays are closely spaced, i.e. the distance between two consecutive time delays is significantly less than the duration of the autocorrelation of the FM signal. A high-resolution estimation method using the cumulant cepstra (polycepstra) of the received sensor data as the basic tool for reconstruction is introduced. The effectiveness of the method is demonstrated for different noise conditions and lengths of data. The results apply to sonar signal processing problems in which the acoustic FM signal is embedded in reverberation noise due to the presence of multipath and observation noise.

16 citations


Journal ArticleDOI
TL;DR: This paper describes the extraction of individual information by vector quantization and the text-independent speaker information based on that method; a feature vector is proposed for the first time, namely the quantized distribution given by the frequency of the vector-quantization codes, to represent the individual features of the speaker.
Abstract: At present, one of the most important problems in speech recognition and speaker recognition is the extraction of individual information from the speech waveform. This paper describes the extraction of individual information by the vector-quantization and the text-independent speaker information based on that method. A feature vector is proposed for the first time which is the quantized distribution by the frequency of the vector-quantization code to represent the individual features of the speaker. The properties of the feature vector are investigated, and effectiveness is verified by an actual speaker-identification experiment. The quantization distribution is a feature representing the distribution density in the space for the acoustic features, e.g., the spectrum uttered by the individual. As the acoustic feature parameters, the cepstrum for stationary part, and the change of the cepstrum, are used to construct the quantization distribution. The identification rates are compared. As a result of the identification experiment for 10 speakers, an identification rate of 100 percent was achieved by the quantization distribution of cepstrum for 10 input words, which are different from the training samples. In the experiment using 200 speakers, an identification rate of 88 percent was achieved for the first candidates, and a cumulative identification rate of 95 percent was achieved for up to the second candidate.

11 citations


Proceedings ArticleDOI
11 Apr 1988
TL;DR: The radar backscattered signals are analyzed by the evaluation of the spectral power density, the pole distribution in the complex plane, and the power cepstrum, and results on simulated and real data are presented and contrasted with conventional processing.
Abstract: The radar backscattering from rotating objects is assumed to be an autoregressive (AR) process of order M. As a practical example, reflections from helicopter rotor blades are considered. Within this model the backscattered signals are analyzed by the evaluation of the spectral power density, the pole distribution in the complex plane, and the power cepstrum. Results on simulated and real data are presented and contrasted with conventional processing, i.e. Fourier analysis.

Proceedings ArticleDOI
14 Nov 1988
TL;DR: The experimental results indicate that cepstrum is the best speech feature and that the relative distance measure methods are the absolute value distance and bandpass weighted cepstral distance methods.
Abstract: A speech feature analysis system, based on a dynamic programming match algorithm, is described. The system evaluates speech feature extraction methods and distance measure methods by two kinds of error-rate: overstep boundary error-rate and nearest neighbor error-rate. By means of the system, the performances of some speech feature representation methods and distance measure methods are evaluated for two different speech sets. The experimental results indicate that cepstrum is the best speech feature and that the relative distance measure methods are the absolute value distance and bandpass weighted cepstral distance methods.

Proceedings ArticleDOI
24 Jun 1988
TL;DR: It was found that a threshold 1/σ-weighted distance metric, where σ is the standard deviation of a given cepstral coefficient, is generally the best speech distance metric to use in most types of noisy environments.
Abstract: A description is given of a number of different distance measures between speech segments commonly used in the analysis and recognition of speech. The measures considered include spectral slope, correlation coefficients, log likelihood ratio, cepstral, weighted cepstral, and modified distance measures. These metrics were tested on either the linear predictive coding (LPC) or the frequency spectrum depending on the type of measurement. Work reported elsewhere was also considered and experimentally verified. The tests were performed on speech in a noisy background in Gaussian and in high-frequency noise. All the measures were compared using the same speech database. These evaluations show that by eliminating unwanted information in speech segments, the distance metric can be made more robust in noisy environments. It was found that a threshold 1/σ-weighted distance metric, where σ is the standard deviation of a given cepstral coefficient, is generally the best speech distance metric to use in most types of noisy environments. Some of the other metrics work better in isolated areas, but do not show the same high general recognition result.
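A 1/σ weighting of cepstral coefficients can be sketched as follows. This is a minimal illustration, assuming that σ is estimated per coefficient from a set of training vectors and that the weighted distance is the Euclidean norm of the σ-normalized differences; the thresholding step mentioned in the abstract is not described there and is not implemented. The function names are hypothetical.

```python
import math
import statistics


def coefficient_sigmas(training_vectors):
    """Population standard deviation of each cepstral coefficient across training vectors."""
    return [statistics.pstdev(column) for column in zip(*training_vectors)]


def inverse_sigma_weighted_distance(c_a, c_b, sigmas):
    """Euclidean distance after scaling each coefficient difference by 1/sigma."""
    return math.sqrt(
        sum(((a - b) / s) ** 2 for a, b, s in zip(c_a, c_b, sigmas))
    )
```

The weighting keeps a coefficient with a large natural spread from dominating the distance: a difference of one standard deviation contributes the same amount whichever coefficient it occurs in.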

Proceedings ArticleDOI
11 Apr 1988
TL;DR: An approximate maximum-likelihood estimator is derived for ARMA (autoregressive moving-average) processes and is shown to correspond to least-squares fitting of the estimated cepstrum of the process by the model cepstrum.
Abstract: An approximate maximum-likelihood estimator is derived for ARMA (autoregressive moving-average) processes and is shown to correspond to least-squares fitting of the estimated cepstrum of the process by the model cepstrum. Experiments with several simple ARMA

Journal ArticleDOI
TL;DR: In this paper, the authors used Kalman filtering and functional expansion to determine the coefficients in the series expansion for the event and the system cepstra, and these become the unknown parameters to be determined.
Abstract: The operation of a machine generates signatures that carry information about its physical condition and operating parameters. In the case of vibration signatures, these signatures may be so complicated that system identification procedures may converge very slowly, if at all. This is due to the large number of physical degrees of freedom which the system has, which translates into a large number of parameters needed to describe the signal. Our research has focused on ways to reduce the required parameter set so that quick and accurate estimates of the source event and the structural signal path can be made. We shall discuss two procedures which we have used for achieving these aims. Our examples are drawn from studies of combustion pressure in a diesel engine. The available signal is the casing vibration of the engine, due to combustion onset in a cylinder. The signal is smoothed by windowing the cepstrum, which reduces the number of parameters needed to describe it to N. Once smoothed, there are two ways of separating the contributions from the combustion event itself and the structural path. The first method is Kalman filtering, which describes the “system” by an adjustable impulse response illustrated by m parameters. The input event is estimated by a small number of N-m parameters. The optimization continues to provide a best estimate of the input and system response. Without the cepstral smoothing that precedes this step, however, it is essentially impossible to achieve this optimization. The second method uses a functional expansion to describe the cepstra of the event itself and the transmission path. The functions are Hermite polynomials, which combined with a Gaussian window are called Hermite functions, and have very useful properties. Using this procedure, we want to determine the coefficients in the series expansion for the event and the system cepstra, and these become the unknown parameters to be determined.
The procedures described here have application to diagnostic monitoring of machines and structures, to security systems, and to noise control. Some examples of prospective applications will be described.

Proceedings ArticleDOI
11 Apr 1988
TL;DR: Speech recognition results using dynamic time warping template matching with this new distance measure indicate that recognition error rate can be reduced to less than half compared with the conventional Euclidean cepstrum distance measure.
Abstract: The authors propose a new mathematical method for extracting spectral movement in a time-sequence of speech spectrum. The spectral movement is characterized by the time and frequency derivative of a time-sequence of log spectrum envelopes. Spectral movement direction, the movement toward a higher or lower frequency region, can be identified by the sign of the proposed function. A parameter which can be used for speech segmentation is derived from this function. A distance measure for speech recognition is also derived as the Euclidean distance between two spectral movement patterns extracted by the proposed function. This distance is easily calculated using cepstrum coefficients. Speech recognition results using dynamic time warping template matching with this new distance measure indicate that recognition error rate can be reduced to less than half compared with the conventional Euclidean cepstrum distance measure.

Journal ArticleDOI
TL;DR: Speaker‐dependent word recognition experiments using 10 Japanese digits uttered by six male speakers indicate that recognition error rates lower than 2% can be obtained for noise‐free speech and white‐noise‐added speech of 20 and 10 dB SNR.
Abstract: This paper presents a word recognition method using a two‐dimensional mel‐cepstrum (TDMC) in noisy environments. The TDMC is defined as the two‐dimensional Fourier transform of mel‐frequency scaled log spectra in the frequency and time domains, and it consists of the average features and dynamic features of two‐dimensional mel‐log spectra. In comparison with the standard method using a one‐dimensional mel‐cepstrum, this method gives better recognition rates. Furthermore, the selection of the size of the TDMC used to describe dynamic features, the optimum weighting between the average and dynamic features, and reference noise patterns are taken into consideration in order to improve this method. Speaker‐dependent word recognition experiments using 10 Japanese digits uttered by six male speakers indicate that recognition error rates lower than 2% can be obtained for noise‐free speech and white‐noise‐added speech of 20 and 10 dB SNR.
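The definition of the TDMC as a two-dimensional Fourier transform over frequency and time can be sketched in a few lines. This is a hedged illustration, not the paper's feature extractor: the input is assumed to be an already mel-scaled log spectrogram (rows are time frames, columns are mel bands), the split into "average" and "dynamic" parts by time-modulation index is an assumption, and the names `tdmc_features`, `n_quef` and `n_mod` are hypothetical.

```python
import numpy as np


def tdmc_features(log_mel, n_quef, n_mod):
    """2-D Fourier transform of a (frames x mel bands) log spectrogram.

    Returns (average, dynamic):
      average - row with time-modulation index 0, i.e. the time-averaged features
      dynamic - rows 1..n_mod, describing how the log spectra change over time
    Only the first n_quef coefficients along the frequency axis are kept.
    """
    coeffs = np.fft.fft2(log_mel) / log_mel.size
    average = coeffs[0, :n_quef]
    dynamic = coeffs[1:n_mod + 1, :n_quef]
    return average, dynamic
```

For an utterance whose log spectrum does not change over time, every dynamic coefficient is zero and only the average part is populated, which matches the reading of the two feature groups given in the abstract.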

Proceedings ArticleDOI
14 Nov 1988
TL;DR: French stop and nasal consonants are shown, through statistical analysis, to be recognizable by machine using features of the spectrum adjacent to the stop burst or the release under speaker- and vowel-independent conditions.
Abstract: French stop and nasal consonants are shown, through statistical analysis, to be recognizable by machine using features of the spectrum adjacent to the stop burst or the release under speaker- and vowel-independent conditions. The fact that human perception can recognize most syllables which are misclassified by machine suggests that the acoustic processing of speech signals can be improved for more precise pattern recognition.

01 Sep 1988
TL;DR: In this paper, the received signal is modelled as the convolution between the transmitted pulse and the medium impulse response, and a homomorphic filter (complex cepstrum) is applied to deconvolve the wavelet.
Abstract: This study is concerned with deconvolution methods applied to underwater propagation in shallow water, whereby the received signal is modelled as the convolution between the transmitted pulse and the medium impulse response. The aim of the method is to extract information on backscattering, travel time delays, boundary reflection and refraction from the received signal on a point receiver or an array for both seismic and active sonar data. Since experimental data are generally mixed phase, due in part to the multiple reflections (bottom and surface), conventional linear filtering, which assumes the minimum phase property, loses efficacy. In order to handle this mixed phase characteristic of the data, we proceed in two steps. We first apply a homomorphic filter (complex cepstrum) to deconvolve the wavelet. Then we deconvolve the medium impulse response by means of a Wiener filter. The efficacy of the method is shown on both simulated and real data for explosive and active sonar data. Keywords: Acoustic sonar signals; Scattering; Seismic waves; Cepstrum technique; Bottom reflection; Low frequency; Wave propagation; Seismic data; Towed array.

Patent
Masako Ichikawa1, Yukio Mitome1
11 Mar 1988
TL;DR: In this paper, the optimum pole and zero parameters for an input signal with reference to an input cepstrum of the signal are stored in advance in a table, and the candidate poles and zeros are selected from the table by a set pair of a pole parameter set and a zero parameter set.
Abstract: For use in deciding optimum pole and zero parameters for an input signal with reference to an input cepstrum of the signal, candidate pole and zero parameters are stored in advance in a table. Supplied with an impulse and controlled by a candidate set pair of a pole parameter set and a zero parameter set selected from the candidate pole and zero parameters of the table, first and second filters produce first and second outputs defined by terms of up to a certain order. Responsive to the first and second outputs and to factors of multiplication which are given by inverse numbers of time intervals related to the respective terms, an analysis filter produces a converted signal which is equivalent to a model output cepstrum of a model output signal produced by a pole-zero model defined by the candidate set pair. A cepstrum subtracter calculates a cepstrum difference between the input cepstrum and the converted signal. In connection with a few candidate set pairs, cepstrum differences and then squares of the respective cepstrum differences are calculated to decide the optimum pole and zero parameters by a focussing technique. Alternatively, a square of the cepstrum difference is used together with partial derivatives of the square to decide the optimum pole and zero parameters.

Proceedings ArticleDOI
02 Oct 1988
TL;DR: An ultrasonic thickness measurement system for the 20- to 110-μm range has been developed for thin paint layers on metallic or nonmetallic substrates; it is based on a power cepstrum analysis, defined as the Fourier transform of the logarithm of the power spectral density of the radio-frequency signal.
Abstract: An ultrasonic thickness measurement system for the 20- to 110-μm range has been developed for thin paint layers on metallic or nonmetallic substrates. In the case of a multilayered sample this system generally furnishes only the sum of the different thicknesses. Thus, a digital signal-processing method has been developed to extract the different values of thickness from the power spectral density. This method is based on a power cepstrum analysis, defined as the Fourier transform of the logarithm of the power spectral density of the radio-frequency signal. This technique gives, in the case of three layers, the three thicknesses and their linear combinations. Experimental results confirm the effectiveness of the proposed method.
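The power cepstrum definition quoted in the abstract can be tried on a synthetic single-echo signal. The sketch below is an illustration under assumptions, not the measurement system: it uses one echo instead of several layers, a circularly delayed copy as the echo, and the inverse FFT (which, for the real and symmetric log power spectrum, differs from the forward transform only by a scale factor). The names and the `eps` floor are hypothetical.

```python
import numpy as np


def power_cepstrum(signal, eps=1e-12):
    """Power cepstrum: (inverse) Fourier transform of the log power spectrum."""
    power = np.abs(np.fft.fft(signal)) ** 2
    return np.real(np.fft.ifft(np.log(power + eps)))  # eps guards against log(0)


def echo_delay_samples(signal, min_delay=1):
    """Quefrency (in samples) of the largest cepstral peak in the first half."""
    cep = power_cepstrum(signal)
    half = len(signal) // 2
    return min_delay + int(np.argmax(cep[min_delay:half]))
```

With a known sampling rate the returned quefrency converts to an echo delay in seconds; turning that delay into a layer thickness additionally needs the sound velocity in the layer, which the abstract does not give.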

Patent
17 Oct 1988
TL;DR: In this paper, two accelerometers are mounted in order to locate the place generating an abnormal sound; when the search button of a controller 8 is pushed, the arrival time difference Δt' is calculated, the sonic velocity C (m/s) is read from the storage device 16, and the distances from the accelerometers to the abnormal sound generating place are calculated according to formulae by the circuit 14.
Abstract: PURPOSE: To simultaneously perform the correction of the sonic velocity of the propagating wave motion and the search for the place generating an abnormal sound, by mounting a processing circuit that applies cepstrum analysis to the abnormal sound and a time difference detection circuit, and detecting the arrival time difference of the wave motion between measuring points from the maximum peak value of the signal subjected to cepstrum processing. CONSTITUTION: When the arrival time difference Δt of the wave motion is obtained by detecting, with a time difference detection apparatus 13, the quefrency of the maximum value from a cepstrum processing circuit 12, the sonic velocity C = l3/Δt (m/s) (where l3 is the distance between the accelerometers) is preliminarily stored in a storage device 16 from an operation circuit 14. This value C is stored again in the device 16 and subsequently displayed on a display part 15 to finish the correction of the sonic velocity. Next, two accelerometers are mounted in order to locate the abnormal sound generating place, and the search button of a controller 8 is pushed to calculate the arrival time difference Δt'; at this time, the sonic velocity C (m/s) is read from the device 16 and the distances from the accelerometers to the abnormal sound generating place 4 are calculated according to formulae by the circuit 14. When the calculated distance l1 or l2 is 0, a message is issued to the display part 15 to prompt a change of the mounting positions of the accelerometers. When both distances l1 and l2 are nonzero, the distances l1 and l2 from the accelerometers to the abnormal sound generating place are displayed on the display part 15.
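The abstract gives the calibration formula C = l3/Δt but not the "formulae" used to locate the source. The sketch below fills that gap with an assumption that is labeled as such: the source lies on the straight path between the two accelerometers, so that l1 + l2 equals the sensor spacing and l1 - l2 equals C·Δt'. The function names are hypothetical.

```python
def sonic_velocity(sensor_spacing_m, delay_s):
    """Calibration step from the abstract: C = l3 / delta_t, in m/s."""
    if delay_s <= 0:
        raise ValueError("arrival time difference must be positive")
    return sensor_spacing_m / delay_s


def locate_source(sensor_spacing_m, velocity_m_s, delay_s):
    """Distances (l1, l2) from each accelerometer to the source.

    Assumed geometry (not stated in the abstract): the source lies between
    the sensors, so l1 + l2 = spacing and l1 - l2 = C * delta_t'.
    """
    path_difference = velocity_m_s * delay_s
    l1 = (sensor_spacing_m + path_difference) / 2.0
    l2 = (sensor_spacing_m - path_difference) / 2.0
    return l1, l2
```

Under this assumption a result of l1 = 0 or l2 = 0 means the path difference equals the full sensor spacing, i.e. the source is not between the sensors, which is consistent with the abstract's instruction to remount the accelerometers in that case.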

Journal ArticleDOI
TL;DR: In this paper, the echo time delay, given by the complex cepstrum, is used in conjunction with the bearing estimation and measurement array height to determine the distance to the source.
Abstract: The complex cepstrum is used to correct bearing estimations of acoustic sources in the presence of a reflective surface. The echo time delay, given by the complex cepstrum, then is used in conjunction with the bearing estimation and measurement array height to determine the distance to the source. Consequently, the location of the acoustic source is determined with one detection array. An automated liftering procedure is used that zeros out a block portion of the cepstrum including the echo information. The problem of the resulting distortion is alleviated by applying a coherence criterion to the recovered direct signals at each microphone. Thus, to a large degree, the operator interactive nature of cepstral processing is overcome for this application. For the test signals and geometries considered, the cepstrum is shown to accurately correct for bearing errors in acoustic signals contaminated with reflections from nearby surfaces as well as provide the necessary information to determine the source location.

Proceedings ArticleDOI
03 Aug 1988
TL;DR: Monte Carlo simulation results demonstrate the effectiveness of the proposed discrete-time method, its low sensitivity to observation noise, and its improved performance in terms of probability of error of the reconstructed transmitted sequence.
Abstract: A discrete-time method is proposed for the estimation and cancellation of intersymbol interference in a digital communication channel. The received signal is first demodulated and sampled and then the fourth-order cumulants of the resulting discrete-time sequence are estimated. The method estimates the channel impulse response from the complex cepstrum of the aforementioned fourth-order cumulants (i.e., tricepstrum). As such, the proposed method depends only on the second- and fourth-order statistics of the transmitted sequence and is capable of reconstructing nonminimum-phase impulse responses. Monte Carlo simulation results demonstrate the effectiveness of the method, its low sensitivity to observation noise, and its improved performance in terms of probability of error of the reconstructed transmitted sequence. Performance comparisons are also given using existing equalization techniques.

Journal ArticleDOI
TL;DR: This study compares the performance of a neural net approach with conventional linear methods as applied to the problem of feature combination in the domain of speaker identity verification (SIV) and shows that, when feature combination is done by the neural net, the SIV task is performed significantly better than when the feature combination (i.e., weighting) is done by the FLD.
Abstract: This study compares the performance of a neural net approach with conventional linear methods as applied to the problem of feature combination in the domain of speaker identity verification (SIV). The experiment endeavors to combine features consisting of LPC‐cepstral coefficient differences and pitch differences for isolated words in a template‐matching scenario. The signal features are analyzed for 30‐ms frames every 10 ms. The pitch estimate is based on the cepstrum of the LPC residual. Previous work [G. Velius, ICASSP 88, 583–586 (1988)] showed that the Fisher linear discriminant (FLD) was better at feature weighting (for cepstral coefficients only) than several other common linear methods. Results show that, when feature combination is done by the neural net, the SIV task is performed significantly better than when the feature combination (i.e., weighting) is done by the FLD. The neural network architecture used in this experiment was in no way “optimized” for the specific task at hand. An additional finding is that the pitch feature used here, in conjunction with the cepstral coefficients, contributes significantly to the SIV task; that is, the error rate is reduced by 13%.

Journal ArticleDOI
TL;DR: It is shown that improvements in robustness of the recognizer in noise can be achieved by a proper selection of analysis method, and the weighted Euclidean distance with different weightings was applied in the cepstrum domain.
Abstract: In automatic speech recognition (ASR) of speech corrupted by noise, the performance tends to deteriorate rapidly depending on the choice of analysis method and distance measure. In order to evaluate the recognition performance for several analysis methods and distance measures, a series of isolated word recognition experiments was performed. Analysis methods selected are critical‐band filtering, perceptually based linear prediction (PLP), linear prediction (LP), and time synchronous linear prediction (SLP). The weighted Euclidean distance with different weightings [unity, root power sums (RPS), and exponential filtering] was applied in the cepstrum domain. Experiments were carried out for clean speech and for two noise conditions (white and low‐pass filtered white, added to the clean speech) at different SNR ratios (25 to 5 dB), using an alphanumeric vocabulary (ten speakers). It is shown that improvements in robustness of the recognizer in noise can be achieved by a proper selection of analysis method an...

01 Jan 1988
TL;DR: A method for extracting spectral movement in a time-sequence of speech spectrum is proposed.
Abstract: In this paper we propose a new mathematical method for extracting spectral movement in a time-sequence of speech spectrum. The spectral movement is characterized by the time and frequency derivative of a time-sequence of log spectrum envelopes. Spectral movement direction, the movement toward a higher or a lower frequency region, can be identified by the sign of the proposed function. A parameter which can be used for speech segmentation is derived from this function. A distance measure for speech recognition is also derived as the Euclidean distance between two spectral movement patterns extracted by the proposed function. This distance is easily calculated using cepstrum coefficients. Speech recognition results using Dynamic Time Warping (DTW) template matching with this new distance measure indicate that recognition error rate can be reduced to less than half compared with the conventional Euclidean cepstrum distance measure.

01 Jan 1988
TL;DR: In this paper, the backscattered signals are analyzed by the evaluation of the spectral power density, the pole distribution in the complex plane, and the power cepstrum.
Abstract: The radar backscattering from rotating objects is assumed to be an autoregressive (AR) process of order M. As a practical example reflections from helicopter rotor blades are considered. Within this model the backscattered signals are analyzed by the evaluation of the spectral power density, the pole distribution in the complex plane, and the power cepstrum. Results on simulated and real data are presented and contrasted with conventional processing, i.e. Fourier Analysis.