Showing papers on "Cepstrum published in 1986"


Journal ArticleDOI
TL;DR: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum that is shown to be highly effective in speaker-independent speech recognition.
Abstract: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum. This technique is shown to be highly effective in speaker-independent speech recognition. Spoken utterances are represented by time sequences of cepstrum coefficients and energy. Regression coefficients for these time functions are extracted for every frame over an approximately 50 ms period. Time functions of regression coefficients extracted for cepstrum and energy are combined with time functions of the original cepstrum coefficients, and used with a staggered array DP matching algorithm to compare multiple templates and input speech. Speaker-independent isolated word recognition experiments using a vocabulary of 100 Japanese city names indicate that a recognition error rate of 2.4 percent can be obtained with this method. Using only the original cepstrum coefficients the error rate is 6.2 percent.

812 citations
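The regression coefficients described in the abstract above can be illustrated with a least-squares slope taken over a short window of frames. This is a minimal sketch, not the authors' implementation: the function names, the edge handling (clamping to the first and last frame), and the window half-width are assumptions. With a hypothetical 10 ms frame shift, `half_width=2` spans five frames, which roughly matches the "approximately 50 ms" period.

```python
def delta_features(frames, half_width=2):
    """Least-squares slope of each coefficient track over 2*half_width+1 frames.

    frames: list of per-frame feature vectors (cepstrum coefficients, energy).
    half_width: hypothetical window half-width in frames (an assumption).
    """
    num_frames = len(frames)
    dim = len(frames[0])
    # Denominator of the slope estimate: sum of k^2 for k = -K..K.
    norm = 2 * sum(k * k for k in range(1, half_width + 1))
    deltas = []
    for t in range(num_frames):
        row = []
        for d in range(dim):
            acc = 0.0
            for k in range(1, half_width + 1):
                # Clamp at the utterance edges (an assumed boundary rule).
                ahead = frames[min(t + k, num_frames - 1)][d]
                behind = frames[max(t - k, 0)][d]
                acc += k * (ahead - behind)
            row.append(acc / norm)
        deltas.append(row)
    return deltas


def append_deltas(frames, half_width=2):
    """Concatenate instantaneous features with their regression coefficients."""
    deltas = delta_features(frames, half_width)
    return [list(f) + d for f, d in zip(frames, deltas)]
```

Each output frame then carries both the instantaneous and the dynamic features, which is the combination the abstract feeds into template matching.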


Proceedings ArticleDOI
07 Apr 1986
TL;DR: Speaker-independent word recognition experiments using time functions of the dynamics-emphasized cepstrum and the polynomial coefficient for energy indicate that the error rate can be largely reduced by this method.
Abstract: A new speech analysis technique applicable to speech recognition is proposed considering the auditory mechanism of speech perception which emphasizes spectral dynamics and which compensates for the spectral undershoot associated with coarticulation. A speech wave is represented by the LPC cepstrum and logarithmic energy sequences, and the time sequences over short periods are expanded by the first- and second-order polynomial functions at every frame period. The dynamics of the cepstrum sequences are then emphasized by the linear combination of their polynomial expansion coefficients, that is, derivatives, and their instantaneous values. Speaker-independent word recognition experiments using time functions of the dynamics-emphasized cepstrum and the polynomial coefficient for energy indicate that the error rate can be largely reduced by this method.

99 citations


Patent
15 Oct 1986
TL;DR: A speech analysis apparatus is described, comprising a transforming section for receiving a spectrum envelope, an integrator for receiving the transformed spectrum envelope output from the transforming section, and a projection circuit for projecting the spectrum envelope with respect to the integrated data.
Abstract: A speech analysis apparatus according to the invention, comprising a transforming section for receiving a spectrum envelope, for transforming the spectrum envelope such that magnitude data thereof becomes suitable, and for generating a transformed spectrum envelope, an integrator for receiving the transformed spectrum envelope output from the transforming section, for integrating the input spectrum envelope with respect to a predetermined variable, and for outputting an integrated spectrum envelope, and a projection circuit for receiving the transformed spectrum envelope from the transforming section and the integrated spectrum envelope from the integrator, and for projecting the spectrum envelope with respect to the integrated data. Therefore, the analysis result inherent to the phoneme can be obtained regardless of vocal tract lengths. The spectrum envelope to be projected can be integrated by the integrator, along the frequency axis or the mel axis. The analysis apparatus further includes a spectrum envelope-extractor for obtaining the spectrum envelope, by using cepstrum analysis and smoothing the resultant spectrum envelope. A spectrum envelope in the transition from a consonant to a vowel can be obtained.

40 citations


Journal ArticleDOI
TL;DR: A more elaborate model is introduced in which the influence of window length is approximated and the spectral sampling inherent in voiced speech is explicitly represented and reasonable deconvolution approximations are obtained.
Abstract: Traditionally, a very simple model for short-time homomorphic analysis has been used. It is shown that there is no theoretical justification for applying this model to voiced speech and that the model is of limited value for improving cepstral deconvolution procedures. Consequently, a more elaborate model is introduced in which the influence of window length is approximated and the spectral sampling inherent in voiced speech is explicitly represented. As a result, this new model shows that the vocal tract contribution to the complex cepstrum is repeated at every multiple of the pitch quefrency (n_p) and is multiplied by a double sinc-like distortion (D(n)). It is shown that in order to achieve deconvolution with a low-time gating system, a cepstral lifter of length n_p/2 should be used (instead of the usual length "less than n_p"). Furthermore, the lifter should compensate for the distortion D(n). Unfortunately, the accuracy of straightforward homomorphic deconvolution approximations is limited by aliasing distortion which results from the repeated nature of the vocal tract contribution. Nevertheless, reasonable deconvolution approximations are obtained.

38 citations
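The low-time gating step mentioned in the abstract above can be sketched as a rectangular lifter that keeps only quefrencies below half the pitch quefrency. This is an illustration under assumptions: the function name is hypothetical, the cepstrum is assumed to be stored in DFT order (negative quefrencies wrapped to the end of the list), and the compensation for the distortion D(n) is left out because the abstract does not give its form.

```python
def low_time_lifter(cepstrum, pitch_quefrency):
    """Zero every cepstral sample whose |quefrency| is n_p/2 or more.

    cepstrum: list of cepstral samples in DFT order (an assumed layout).
    pitch_quefrency: pitch period n_p in samples.
    """
    n = len(cepstrum)
    cutoff = pitch_quefrency // 2  # lifter length n_p/2, per the abstract
    out = [0.0] * n
    for i in range(n):
        # Index i and index n - i hold the same lag with opposite sign.
        lag = min(i, n - i)
        if lag < cutoff:
            out[i] = cepstrum[i]
    return out
```

Keeping both signs of the quefrency is a choice made here for the complex cepstrum; a one-sided lifter would drop the wrapped negative part.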


Journal ArticleDOI
TL;DR: It is demonstrated that the presence of a cepstral peak depends on the form of the probability density function (pdf) of the separation between reflectors, and in the case where the pdf is uniform from 0 to S_M, the cepstral peak is found to occur at the quefrency corresponding to S_M.

34 citations


Journal ArticleDOI
TL;DR: The technique, based on the cepstrum, has the great advantage of requiring only the use of fast Fourier transforms in the fitting process, thus, unlike the fitting of two-dimensional autoregressions, no iteration is necessary.
Abstract: A method is presented for parametric modeling of stationary random fields. The class of parametric models considered allows the most general elliptic field, and by linear constraints can include such special cases as isotropic, quarter plane, and separable fields. The technique, based on the cepstrum, has the great advantage of requiring only the use of fast Fourier transforms in the fitting process. Thus, unlike the fitting of two-dimensional autoregressions, no iteration is necessary. Other advantages are that any (Wiener) filters constructed from the fitted spectrum are guaranteed to be stable, and that the spectrum is guaranteed to be positive. Statistical tests for determining various special types of field from data are developed. The choice of model order is discussed as well.

32 citations



Journal ArticleDOI
TL;DR: In this paper, the differential cepstrum is introduced to design equiripple minimum-phase FIR filters by cepstral deconvolution; this fast procedure takes only three FFT computations and avoids the complicated phase unwrapping and polynomial root-finding algorithms.
Abstract: The differential cepstrum is introduced to design equiripple minimum-phase FIR filters by cepstral deconvolution; this fast procedure takes only three FFT computations and avoids the complicated phase unwrapping and polynomial root-finding algorithms.

14 citations


Journal ArticleDOI
TL;DR: The cepstrum technique was applied both to the acoustic speech wave and to the residual signal derived from an inverse filter analysis, in order to study the short-term perturbations in the periodicity of the voice source.

12 citations


Journal ArticleDOI
TL;DR: In this paper, a method to design an equiripple minimum-phase FIR filter using the cepstrum is described, which avoids the complicated polynomial root-finding algorithm of Herrmann and Schuessler (1970) and the phase-unwrapping algorithm associated with the complex cepstrum of Mian and Nainar (1982).
Abstract: A method to design an equiripple minimum-phase FIR filter using the cepstrum is described. This method avoids the complicated polynomial root-finding algorithm of Herrmann and Schuessler (1970), or the phase-unwrapping algorithm associated with the complex cepstrum of Mian and Nainar (1982). The differential cepstrum method proposed by Pei and Lu (1986) has aliasing problems and requires the computation of three FFTs. The proposed method requires only two FFT computations and avoids the processing of phase.

10 citations
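To show why a cepstral route avoids both root finding and phase unwrapping, the sketch below recovers a minimum-phase sequence from the magnitude response alone by folding the real cepstrum onto positive quefrencies. This is the textbook folding procedure, not the two-FFT method of the abstract above, whose details are not given here. The helper names and `n_fft` are assumptions, a plain O(N^2) DFT stands in for an FFT to keep the block dependency-free, and the input is assumed to have no zeros on the unit circle.

```python
import cmath
import math


def dft(x, inverse=False):
    """Plain O(N^2) discrete Fourier transform (stand-in for an FFT)."""
    n = len(x)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(sign * 2j * math.pi * k * m / n)
        out.append(acc / n if inverse else acc)
    return out


def minimum_phase_from_magnitude(h, n_fft=64):
    """Return a minimum-phase sequence with the same magnitude response as h."""
    padded = list(h) + [0.0] * (n_fft - len(h))
    # Only the log magnitude is used, so the phase is never unwrapped.
    log_mag = [math.log(abs(v)) for v in dft(padded)]
    cep = [v.real for v in dft(log_mag, inverse=True)]  # real cepstrum (even)
    half = n_fft // 2
    folded = [0.0] * n_fft
    folded[0] = cep[0]
    for i in range(1, half):
        folded[i] = 2.0 * cep[i]  # fold negative quefrencies onto positive ones
    folded[half] = cep[half]
    min_spectrum = [cmath.exp(v) for v in dft(folded)]
    return [v.real for v in dft(min_spectrum, inverse=True)][: len(h)]
```

For example, `minimum_phase_from_magnitude([1.0, 2.0])` moves the zero at z = -2 to z = -0.5 and returns a sequence close to `[2.0, 1.0]`. A larger `n_fft` reduces the cepstral aliasing that the abstract names as a limitation of such methods.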


Journal ArticleDOI
TL;DR: The quality of synthesized speech in the speech analysis-synthesis system based on the mel-cepstrum method is described and, using this system, 1.7 kbit/s, high-quality synthesized speech is obtained.
Abstract: The quality of synthesized speech in the speech analysis-synthesis system based on the mel-cepstrum method is described. Mel-cepstrum is defined as the Fourier coefficients of the log spectrum on a nonlinear frequency scale approximating the mel scale. The true mel-log spectral envelope is estimated in this system by an improved cepstral method in the analysis part and a mel-log spectrum approximating filter is used in the synthesis part. The preference score by pair comparison tests is used for subjective evaluation and spectral distortion on a nonlinear frequency scale is used for an objective evaluation. Using this system, 1.7 kbit/s, high-quality synthesized speech is obtained.

Journal ArticleDOI
TL;DR: In this article, the authors describe laboratory acoustical measurements of the normal incidence reflection coefficient of an absorbent material; emphasis is placed on practical aspects of the technique and means of reducing it.

Proceedings ArticleDOI
01 Apr 1986
TL;DR: This paper proposes a new method to estimate the fundamental frequency in a short observation interval using the complex spectrum, and obtains the peak frequency with negligible error using the characteristics of the complex spectrum.
Abstract: The fundamental frequency of a signal is an important piece of information. The frequency of a periodic wave can be obtained from the peak of the spectrum calculated by FFT, but the resolution of the peak frequency is not very high. Several methods have been proposed to get higher resolution, but they require a great deal of calculation. To improve the calculation speed, this paper proposes a new method to estimate the fundamental frequency in a short observation interval using the complex spectrum. The phase component of the complex spectrum gives useful information for increasing the resolution. Using the characteristics of the complex spectrum, we obtained the peak frequency with negligible error. This method requires little calculation. The frequency of the signal can be estimated in real time when an array processor is used.

01 Jun 1986
TL;DR: An alternate procedure, which does not require cepstral editing, is proposed which allows the complete correction of a contaminated spectrum through use of both the transfer function and delay time of the echo process.
Abstract: The application of the power cepstrum to measured helicopter-rotor acoustic data is investigated. A previously applied correction to the reconstructed spectrum is shown to be incorrect. For an exact echoed signal, the amplitude of the cepstrum echo spike at the delay time is linearly related to the echo relative amplitude in the time domain. If the measured spectrum is not entirely from the source signal, the cepstrum will not yield the desired echo characteristics and a cepstral aliasing may occur because of the effective sample rate in the frequency domain. The spectral analysis bandwidth must be less than one-half the echo ripple frequency or cepstral aliasing can occur. The power cepstrum editing technique is a useful tool for removing some of the contamination because of acoustic reflections from measured rotor acoustic spectra. The cepstrum editing yields an improved estimate of the free field spectrum, but the correction process is limited by the lack of accurate knowledge of the echo transfer function. An alternate procedure, which does not require cepstral editing, is proposed which allows the complete correction of a contaminated spectrum through use of both the transfer function and delay time of the echo process.
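The claim that the cepstrum echo spike scales linearly with the relative echo amplitude can be checked on a toy signal. The sketch below is an illustration only: it takes the power cepstrum to be the inverse DFT of the log power spectrum (one of several conventions, assumed here), uses a unit impulse as the source, and relies on hypothetical helper names. A plain O(N^2) DFT keeps it dependency-free.

```python
import cmath
import math


def power_cepstrum(x):
    """Inverse DFT of the log power spectrum of x (assumed convention)."""
    n = len(x)
    log_power = []
    for k in range(n):
        acc = 0j
        for m in range(n):
            acc += x[m] * cmath.exp(-2j * math.pi * k * m / n)
        log_power.append(math.log(abs(acc) ** 2))
    cep = []
    for q in range(n):
        acc = 0j
        for k in range(n):
            acc += log_power[k] * cmath.exp(2j * math.pi * k * q / n)
        cep.append(acc.real / n)
    return cep


def impulse_with_echo(n, delay, relative_amplitude):
    """Unit impulse followed by one scaled copy after `delay` samples."""
    x = [0.0] * n
    x[0] = 1.0
    x[delay] = relative_amplitude
    return x
```

With `n = 64` and `delay = 8`, the value of `power_cepstrum(...)[8]` tracks the relative amplitude closely for small echoes. Smaller spikes of alternating sign appear at multiples of the delay and wrap around the frame, which is a small-scale picture of the cepstral aliasing discussed in the abstract.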

Journal ArticleDOI
TL;DR: In this paper, it was shown that the real part of the estimated surface normal impedance is very nearly maximized when the spurious delay is eliminated; this has suggested a new way of determining the extraction delay itself.

Journal ArticleDOI
TL;DR: In this article, a technique for measuring acoustic reflection coefficients using the power cepstrum is presented and discussed; the power cepstrum is used to obtain the impulse response of the reflector, which is Fourier transformed to yield the reflection coefficient.
Abstract: A technique for measuring acoustic reflection coefficients using the power cepstrum is presented and discussed. The power cepstrum is used to obtain the impulse response of the reflector, which is Fourier transformed to yield the reflection coefficient. Constraints on the acoustic source, aliasing, and noise effects are examined, and procedures for reducing these effects are presented. An example computation of a reflection coefficient is included.
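The link between the power cepstrum and the reflector impulse response can be written out under a simple assumed model: the measured signal is the source plus one reflected copy, x(t) = s(t) + (r * s)(t), where r is the reflector impulse response including the propagation delay. The symbols s, r, S, and R are introduced here for illustration and do not come from the abstract.

```latex
\begin{align}
X(\omega) &= S(\omega)\,\bigl[1 + R(\omega)\bigr], \\
\log |X(\omega)|^{2} &= \log |S(\omega)|^{2}
    + 2\,\operatorname{Re}\,\log\bigl[1 + R(\omega)\bigr], \\
\log\bigl[1 + R(\omega)\bigr] &= R(\omega) - \tfrac{1}{2} R(\omega)^{2}
    + \tfrac{1}{3} R(\omega)^{3} - \cdots, \qquad |R(\omega)| < 1 .
\end{align}
```

Inverse transforming the second line, the first-order term contributes r at positive quefrencies (and its mirror image at negative quefrencies), while the higher powers appear near multiples of the delay. If the source cepstrum is confined to quefrencies shorter than the delay, r can be read off and Fourier transformed to give the reflection coefficient.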

Journal ArticleDOI
TL;DR: The experimental verification of the developed FPR (fundamental peak recognition) method also included an objective comparison to Miller's data reduction method, and a "subjective" performance evaluation using the standard cepstrum method.
Abstract: In this correspondence, we present a method of detecting the fundamental frequency based on the recognition of "fundamental peaks" of a low-pass filtered (f_c = 0.7 kHz) speech signal. Average overall recognition accuracy of approximately 99 percent for the training speech sample and 97.5 percent for a test speech sample were achieved. The experimental verification of the developed FPR (fundamental peak recognition) method also included an objective comparison to Miller's data reduction method, and a "subjective" performance evaluation using the standard cepstrum method. The FPR classification scheme is computationally efficient and easily implementable with relatively slow 8-bit microprocessors.


Journal ArticleDOI
TL;DR: In this paper, a new method for analysis of the group delay characteristics derived from spectral envelopes was proposed, which derives the complex cepstrum from the spectral envelope connecting peaks of fine structure of the power spectrum.
Abstract: All-pole models have been used in speech signal analysis to represent speech-producing systems [1]. This paper introduces a new method for analysis of the group delay characteristics derived from spectral envelopes. The method derives the complex cepstrum from the spectral envelope connecting peaks of fine structure of the power spectrum. The effectiveness of the method is shown by application to synthesized and actual nasals.

Book ChapterDOI
01 Jan 1986
TL;DR: The cepstrum technique is evaluated as a tool for the interpretation of gearbox spectra, and it is shown that the method can be used effectively if some practical considerations and some limitations are kept in mind.
Abstract: Cepstrum analysis is used frequently in data processing and signal analysis. The use of this new technique in monitoring the mechanical condition of gearboxes, however, is still rather unknown. This paper presents the results of an evaluation of the cepstrum technique as a tool for the interpretation of gearbox spectra. The origin of vibrations in gearboxes and a useful definition of the cepstrum will be presented. The method will be applied for the condition evaluation of a transmission train of 4 gears and it will be shown that the method can be used effectively if some practical considerations and some limitations are kept in mind.

Proceedings ArticleDOI
01 Apr 1986
TL;DR: The performance of the complex cepstrum as used to recover wavelets in the presence of distorted echoes and noise is studied using simulated signals.
Abstract: The performance of the complex cepstrum as used to recover wavelets in the presence of distorted echoes and noise is studied using simulated signals. The distortions and noise are chosen to be representative of practical applications encountered in acoustic source location. On the whole the complex cepstrum is seen to perform well for the situations considered.

Patent
19 Aug 1986
TL;DR: In this patent, the transmission frequency of a carrier suppression single side band communication system is measured by using a cepstrum of a received voice to detect the difference between the transmission frequency and the reception frequency.
Abstract: PURPOSE: To measure accurately the transmission frequency of a carrier suppression single side band communication system by using a cepstrum of a received voice to detect the difference between the transmission frequency and reception frequency and measuring the transmission frequency with the frequency difference and the reception frequency. CONSTITUTION: A waveform signal with a prescribed length of a voice signal received at an SSB receiver 7 is extracted and transformed into a short time frequency spectrum G(f) by Fourier transformation to calculate the cepstrum. A fundamental period (1/f0) is obtained from the cepstrum. Then a detuning frequency measuring section 9 calculates a correlation function R(h), with respect to the frequency difference (h), between the amplitude spectrum comprising harmonics of the fundamental frequency f0 and log G(f), and the frequency difference hp between the two spectra is obtained as the value of (h) that maximizes R(h). The transmission frequency is measured from the frequency of the receiver and the frequency hp. Thus, the transmission frequency of the carrier suppression single side band communication system is measured accurately.

Journal ArticleDOI
TL;DR: In this paper, a recursive procedure to reconstruct a given sequence from its group delay or phase derivative is given, based on the relationships between minimum, maximum phase sequences and their cepstra, and on the modified least squares (MLS) rational approximation.
Abstract: A recursive procedure to reconstruct a given sequence from its group delay or phase derivative is given. The procedure is based on the relationships between minimum, maximum phase sequences and their cepstra, and on the modified least squares (MLS) rational approximation. To avoid unwrapping of the phase, the cepstrum of the sequence is calculated from the group delay function. Using a recursive procedure, we find from the cepstrum values a minimum phase sequence with a phase equal to that of the original sequence. The reconstructed sequence is obtained using the MLS procedure to find recursively a rational approximation of the minimum phase sequence. The constraints under which the phase reconstruction is possible are checked with a root distribution algorithm, and we indicate how to modify the sequence when the constraints are not satisfied. Examples illustrate the efficiency of the proposed procedure.
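The step that obtains the cepstrum from the group delay without phase unwrapping rests on a standard relation, shown here for a real minimum-phase sequence. The abstract does not state its exact formulas, so the notation below (complex cepstrum \hat{h}[n], phase \theta, group delay \tau) is an assumption made for illustration.

```latex
\begin{align}
\log H(e^{j\omega}) &= \sum_{n=0}^{\infty} \hat{h}[n]\, e^{-j\omega n}, \\
\theta(\omega) &= \operatorname{Im}\,\log H(e^{j\omega})
    = -\sum_{n=1}^{\infty} \hat{h}[n]\,\sin(\omega n), \\
\tau(\omega) &= -\frac{d\theta(\omega)}{d\omega}
    = \sum_{n=1}^{\infty} n\,\hat{h}[n]\,\cos(\omega n), \\
\hat{h}[n] &= \frac{1}{n\pi} \int_{-\pi}^{\pi} \tau(\omega)\,\cos(\omega n)\, d\omega ,
    \qquad n \ge 1 .
\end{align}
```

The group delay is therefore a cosine series whose n-th coefficient is n times the cepstral value, so the cepstrum follows from one inverse transform of the group delay and a division by n, with no need to unwrap the phase itself.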

Book ChapterDOI
01 Jan 1986
TL;DR: In-process measurement of cutting tool wear was investigated by applying cepstrum analysis to vibration signals taken from the machine tool structure.
Abstract: In-process measurement of cutting tool wear was investigated by applying cepstrum analysis to vibration signals taken from the machine tool structure.

01 Jan 1986
TL;DR: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum that is shown to be highly effective in speaker-independent speech recognition.
Abstract: This paper proposes a new isolated word recognition technique based on a combination of instantaneous and dynamic features of the speech spectrum. This technique is shown to be highly effective in speaker-independent speech recognition. Spoken utterances are represented by time sequences of cepstrum coefficients and energy. Regression coefficients for these time functions are extracted for every frame over an approximately 50 ms period. Time functions of regression coefficients extracted for cepstrum and energy are combined with time functions of the original cepstrum coefficients, and used with a staggered array DP matching algorithm to compare multiple templates and input speech. Speaker-independent isolated word recognition experiments using a vocabulary of 100 Japanese city names indicate that a recognition error rate of 2.4 percent can be obtained with this method. Using only the original cepstrum coefficients the error rate is 6.2 percent.

Proceedings ArticleDOI
01 Apr 1986
TL;DR: Preliminary results indicate a digit recognition accuracy of close to 80% when using digit strings of unknown length with 15-20 connected digits recorded through a telephone handset.
Abstract: This paper describes an approach towards speaker-independent recognition of connected Swedish digits. Each word is modelled as a sequence of acoustic-phonetic events that must be identified in order for the word to be recognized. The events are characterized by extremal values or rapid changes in formant, cepstrum, power and zero crossing contours. Preliminary results indicate a digit recognition accuracy of close to 80% when using digit strings of unknown length with 15-20 connected digits recorded through a telephone handset.

Proceedings ArticleDOI
01 Apr 1986
TL;DR: The cepstral method may be extended to dispersive waves by incorporating the dispersion law into the procedure: the 'usual' operation of (inverse) Fourier transformation (with respect to frequency) of the log of the Fourier transform is changed to an (inverse) Fourier transform with respect to wave number.
Abstract: Cepstral methods have been used for the determination of reflection coefficients in situations where the propagating waves are nondispersive. However, if the propagating waves are dispersive then the reflection properties cannot be estimated by this technique and this paper describes how the cepstral method may be extended to accommodate this. This is achieved by incorporating the dispersion law into the procedure. Specifically, the 'usual' operation of (inverse) Fourier transformation (with respect to frequency) of the log of the Fourier transform is changed to an (inverse) Fourier transform with respect to wave number. This results in a cepstrum in which the independent variable is space (and not time) such that, at a point determined by the distance between measurement point and reflector a function related to the reflection coefficient appears, which may then be recast as a usual reflection coefficient.

Journal ArticleDOI
01 Jan 1986-Frequenz
TL;DR: This article investigates multidimensional homomorphic deconvolution systems for processing multivariable signals.
Abstract: Research on multidimensional homomorphic deconvolution systems for processing multivariable signals.

01 Jan 1986
TL;DR: In this paper, the power cepstrum technique is used to remove unwanted acoustic reflections from model scale helicopter rotor noise spectra; it is most effective when the reflections are exact duplicates of the signal, i.e., when the echo transfer function is linear.
Abstract: The application of the power cepstrum technique to the analysis of model scale helicopter rotor noise spectra is investigated. The power cepstrum technique emerges as a useful tool for the removal of undesired acoustic reflections from measured rotor spectra. It is most effective when the reflections are exact duplicates of the signal, i.e., when the echo transfer function is linear. Moreover, the technique is best applied to a measured spectrum which is due entirely to the signal and its echoes, when no non-source-related signals are present. It is noted that this technique can be applied without prior knowledge of the echo characteristics.

Journal ArticleDOI
TL;DR: In this article, the authors describe the analysis of fluctuation signals from a nuclear reactor. The methods of analysis differ from the usual Fourier transform methods and include digital filtration, envelope generation by the use of Hilbert transforms, deconvolution, cepstrum analysis, autocovariances and simulation.