Showing papers on "Cepstrum" published in 1989


Journal ArticleDOI
D. Mansour, Biing-Hwang Juang
TL;DR: It is found that the orientation (or direction) of the cepstral vector is less susceptible to noise perturbation than the vector norm, and a family of distortion measures based on the projection between two cepstral vectors is proposed, which have the same computational efficiency as the band-pass cepstral distortion measure.
Abstract: Consideration is given to the formulation of speech similarity measures, a fundamental component in recognizer designs, that are robust to the change of ambient conditions. The authors focus on the speech cepstrum derived from linear prediction coefficients (the LPC cepstrum). By using some common models for noisy speech, they show analytically that additive white noise reduces the norm (length) of the LPC cepstral vectors. Empirical observations on the parameter histograms not only confirm the analytical results through the use of noise models but further reveal that at a given (global) signal-to-noise ratio (SNR), the norm reduction on cepstral vectors with larger norms is generally less than on vectors with smaller norms, and that lower order coefficients are more affected than higher order terms. In addition, it is found that the orientation (or direction) of the cepstral vector is less susceptible to noise perturbation than the vector norm. As a consequence of the above results, a family of distortion measures based on the projection between two cepstral vectors is proposed. The new measures have the same computational efficiency as the band-pass cepstral distortion measure.

166 citations
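
The Mansour and Juang abstract above contrasts the norm of a cepstral vector, which shrinks under additive white noise, with its orientation, which is more stable. The following is a minimal sketch of that contrast, not the authors' actual family of measures: the function names, the toy vector, and the idealized assumption that noise acts as a pure scaling by `shrink` are all illustrative.

```python
import numpy as np

def euclidean_distance(c_ref, c_test):
    """Plain cepstral distance; sensitive to changes in vector norm."""
    return float(np.linalg.norm(c_ref - c_test))

def angle_between(c_ref, c_test):
    """Angle (radians) between two cepstral vectors; ignores their norms."""
    cos = np.dot(c_ref, c_test) / (np.linalg.norm(c_ref) * np.linalg.norm(c_test))
    return float(np.arccos(np.clip(cos, -1.0, 1.0)))

def projection(c_ref, c_test):
    """Length of c_test projected onto the direction of c_ref."""
    return float(np.dot(c_ref, c_test) / np.linalg.norm(c_ref))

# Hypothetical clean cepstral vector and an idealized "noisy" copy whose
# norm is reduced but whose direction is unchanged (assumption for the demo).
c_clean = np.array([1.2, -0.7, 0.4, -0.2, 0.1])
shrink = 0.6
c_noisy = shrink * c_clean

dist = euclidean_distance(c_clean, c_noisy)   # grows as the norm shrinks
angle = angle_between(c_clean, c_noisy)       # stays near zero
proj = projection(c_clean, c_noisy)           # shrink * ||c_clean||
```

Under this idealization the Euclidean distance reports a mismatch even though both vectors describe the same spectral shape, while the angle does not, which is the motivation for measures built on the projection.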


PatentDOI
Kazunori Ozawa
TL;DR: A speech analysis and synthesis system operates to determine a sound source signal for the entire interval of each speech unit which is to be used for speech synthesis, according to a spectrum parameter obtained from each speech unit based on cepstrum.
Abstract: A speech analysis and synthesis system operates to determine a sound source signal for the entire interval of each speech unit which is to be used for speech synthesis, according to a spectrum parameter obtained from each speech unit based on cepstrum. The sound source signal and the spectrum parameter are stored for each speech unit. Speech is synthesized according to the spectrum parameter while controlling prosody of the sound source signal. The spectrum of the synthesized speech is compensated through filtering based on cepstrum.

130 citations


Proceedings ArticleDOI
23 May 1989
TL;DR: Several acoustic representations have been compared in speaker-dependent and independent connected and isolated-word recognition tests with undegraded speech and with speech degraded by adding white noise and by applying a 6-dB/octave spectral tilt.
Abstract: Several acoustic representations have been compared in speaker-dependent and independent connected and isolated-word recognition tests with undegraded speech and with speech degraded by adding white noise and by applying a 6-dB/octave spectral tilt. The representations comprised the output of an auditory model, cepstrum coefficients derived from an FFT-based mel-scale filter bank with various weighting schemes applied to the coefficients, cepstrum coefficients augmented with measures of their rates of change with time, and sets of linear discriminant functions derived from the filter-bank output and called IMELDA. The model outperformed the cepstrum representations except in noise-free connected-word tests, where it had a high insertion rate. The best cepstrum weighting scheme was derived from within-class variances. Its behavior may explain the empirical adjustments found necessary with other schemes. IMELDA outperformed all other representations in all conditions and is computationally simple.

118 citations


Patent
Stuart Jardine
27 Mar 1989
TL;DR: A method for determining the state of wear of a multicone drill bit is presented, in which the vibrations generated by the working drill bit are converted into a time oscillatory signal from which a frequency spectrum is derived.
Abstract: A method is provided for determining the state of wear of a multicone drill bit. Vibrations generated by the working drill bit are detected and converted into a time oscillatory signal from which a frequency spectrum is derived. The periodicity of the frequency spectrum is extracted. The rate of rotation of at least one cone is determined from the periodicity and the state of wear of the drill bit is derived from the rate of cone rotation. The oscillatory signal represents the variation in amplitude of the vertical or torsional force applied to the drill bit. To extract periodicity, a set of harmonics in the frequency spectrum is given prominence by computing the cepstrum of the frequency spectrum or by obtaining a harmonic-enhanced spectrum. The fundamental frequency in the set of harmonics is determined and the rate of cone rotation is derived from the fundamental frequency.

82 citations
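
The drill-bit patent above extracts the periodicity of a frequency spectrum by computing a cepstrum and reading off a fundamental frequency. A rough sketch of that general idea on a synthetic signal follows; it is not the patented procedure. The sampling rate `fs`, the decaying test pulse, the circular delay of 100 samples, and the search floor `min_q` are all assumptions chosen so the result is easy to check.

```python
import numpy as np

def power_cepstrum(x):
    """Inverse FFT of the log power spectrum of a real signal."""
    spectrum = np.fft.fft(x)
    log_power = np.log(np.abs(spectrum) ** 2 + 1e-12)  # small floor avoids log(0)
    return np.fft.ifft(log_power).real

fs = 1000.0                      # assumed sampling rate in Hz
n = 1024
delay = 100                      # repetition interval in samples (assumption)

# A decaying pulse plus a weaker, circularly delayed copy of itself.
# The delayed copy puts a ripple with period fs / delay into the spectrum.
base = 0.9 ** np.arange(n)
signal = base + 0.5 * np.roll(base, delay)

ceps = power_cepstrum(signal)

# Skip the low-quefrency region, which describes the pulse shape itself,
# and pick the strongest remaining peak.
min_q = 20
peak = min_q + int(np.argmax(ceps[min_q:n // 2]))
fundamental_hz = fs / peak       # spacing of the spectral ripple in Hz
```

The peak quefrency equals the repetition interval, so its reciprocal gives the spacing of the regularly spaced components in the spectrum.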


Journal ArticleDOI
TL;DR: A novel discrete-time method is proposed for estimating the impulse response of a frequency-selective digitally modulated communication channel; Monte Carlo simulations demonstrate its low sensitivity to observation noise and its improved performance in terms of probability of error of the reconstructed transmitted sequence.
Abstract: A novel discrete-time method is proposed for estimating the impulse response of a frequency-selective digitally modulated communication channel. The received signal is first demodulated and sampled and then the fourth-order cumulants of the resulting discrete-time sequence are estimated. The method estimates the channel impulse response from the complex cepstrum of the aforementioned fourth-order cumulants (i.e. tricepstrum). The method depends only on the second- and fourth-order statistics of the transmitted sequence and is capable of reconstructing nonminimum-phase impulse responses. Monte Carlo simulation results demonstrate the effectiveness of the method, its low sensitivity to observation noise, and its improved performance in terms of probability of error of the reconstructed transmitted sequence.

72 citations
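
The tricepstrum abstract above starts from estimated fourth-order cumulants of the received sequence. As a small illustration of what such a statistic measures, the sketch below estimates only the zero-lag fourth-order cumulant of a real sequence; the full method needs cumulants at many lags, which is not shown. The function name and the two test sequences are assumptions. The comparison relies on the standard fact that a Gaussian sequence has zero fourth-order cumulants, while a binary symbol sequence does not.

```python
import numpy as np

def fourth_order_cumulant_zero_lag(x):
    """Sample estimate of c4(0,0,0) = E[x^4] - 3 (E[x^2])^2 after mean removal."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    m2 = np.mean(x ** 2)
    m4 = np.mean(x ** 4)
    return float(m4 - 3.0 * m2 ** 2)

rng = np.random.default_rng(0)

# Balanced +/-1 symbols in random order (a stand-in for a transmitted sequence).
symbols = rng.permutation(np.repeat([1.0, -1.0], 5000))

# Gaussian samples (a stand-in for observation noise).
gaussian = rng.standard_normal(200_000)

c4_symbols = fourth_order_cumulant_zero_lag(symbols)    # close to -2
c4_gaussian = fourth_order_cumulant_zero_lag(gaussian)  # close to 0
```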


Book
31 Jul 1989
TL;DR: Topics in linear and nonlinear filtering, spectral analysis, generalized correlation, cepstrum and complex demodulation, Cramer-Rao Bounds, maximum likelihood, weighted least-squares, Kalman filtering, expert systems, wave propagation and their use, as well as their performance in applications to canonical ocean problems are presented.
Abstract: A systematic and integrated account of signal and data processing with emphasis on the distinctive marks of the ocean environment is provided in this informative text. Underwater problems such as space-time processing relations vs. disjointed ones, processing of passive observations vs. active ones, time delay estimation vs. frequency estimation, channel effects vs. transparent ones, integrated study of signal, data, and channel processing vs. separate ones, are highlighted. The book provides the beginner with a concise presentation of the essential concepts, defines the basic computational steps, and gives the mature reader an advanced view of underwater systems and the relationships among their building blocks. It presents the needed topics on applied estimation theory within the underwater systems context. Included are topics in linear and nonlinear filtering, spectral analysis, generalized correlation, cepstrum and complex demodulation, Cramer-Rao Bounds, maximum likelihood, weighted least-squares, Kalman filtering, expert systems, wave propagation and their use, as well as their performance in applications to canonical ocean problems. The applications center on the definition, analysis, and solution implementations to representative underwater signal analysis problems dealing with signals estimation, their location and motion. The potential limitations and pitfalls of the implementations are delineated in homogeneous, noisy, interfering, inhomogeneous, multipath, distortions, and/or dispersive channels.

64 citations


Journal ArticleDOI
01 Apr 1989
TL;DR: Using word and monosyllable recognition experiments based on dynamic programming (DP) matching of a time sequence of the TDC, it is confirmed that the global static features (spectral envelope) and global dynamic features are both effective for speech recognition.
Abstract: In the paper, two-dimensional cepstrum (TDC) analysis and its application to word and monosyllable recognition are described. The TDC can simultaneously represent several different kinds of information contained in the speech waveform: static and dynamic features, as well as global and fine frequency structure. Noise reduction and speech enhancement can be easily performed using the TDC. Using word and monosyllable recognition experiments based on dynamic programming (DP) matching of a time sequence of the TDC, it is confirmed that the global static features (spectral envelope) and global dynamic features are both effective for speech recognition. A speaker-independent (noisy) word recognition algorithm is also proposed which recognises the words based on the similarity of dynamic features. The algorithm employs linear matching instead of DP nonlinear matching, requires a small amount of memory, and shows high speed and high accuracy in recognition. At present, the recognition rate is 89.0% at ∞ dB and 70.0% at 0 dB signal-to-noise ratio.

40 citations


Journal ArticleDOI
TL;DR: In this article, a method for estimating vowel formant frequencies from linear combinations of cepstral coefficients is described, which is robust in that large errors from misidentified peaks are rare.
Abstract: A method is described for estimating vowel formant frequencies from linear combinations of cepstral coefficients. The method is robust in that large errors from misidentified peaks are rare. It can be explained in terms of a Fourier cosine series of the desired identity between formant estimates and their true values. This analysis also sheds some light on the high correlations between the formants and dimensions of the log spectrum reported by Pols et al. [J. Acoust. Soc. Am. 53, 1093–1101 (1973)].

36 citations


Proceedings ArticleDOI
04 Jun 1989
TL;DR: A mechanism for verging the cameras of the Rochester Robot in real time is described and it is shown that qualitatively similar filters have similar effects, with the limiting case being equivalent to deconvolution.
Abstract: Binocular robots whose cameras can be independently directed require some mechanism for aiming both cameras at the same world point. The authors describe a mechanism for verging the cameras of the Rochester Robot in real time. The mechanism consists of a discrete control loop driven by an algorithm that estimates a single disparity from the two cameras. Two algorithms for disparity estimation are presented. The first uses the cepstral-transform approach of Y. Yeshurun and E.L. Schwartz (1987), and it is argued that, in this application, the cepstrum is best understood as autocorrelation with an adaptive filter that acts to sharpen peaks in the autocorrelation image. It is also shown that qualitatively similar filters have similar effects, with the limiting case being equivalent to deconvolution. Efficient real-time implementations of the cepstral and deconvolution approaches are described.

33 citations


Journal ArticleDOI
TL;DR: The general procedure of using feature-extraction techniques first and then registering and analyzing images by using power-spectrum and two-dimensional cepstrum techniques provides an unambiguous, accurate, and fast technique for the analysis of a broad range of sequential complex images.
Abstract: The analysis of a class of complex images has been simplified by extracting the edge-dominated features before matching the sequential images. The consecutive images are then registered by a frequency-domain technique, specifically by a combination of two-dimensional power spectrum and cepstrum techniques to correct for rotational and translational shifts, respectively. The cepstrum technique is found to be more accurate for correction of a translational shift than are the commonly used phase-correlation techniques and spatial-domain-correlation techniques, particularly for noisy and nonuniformly featured sequential images. The change in sequential images is expressed quantitatively in terms of the mean and the variance of the computed two-dimensional histogram representing the difference of two consecutive images. Such quantitative measures of change in sequential images have been applied to a class of complex medical images, namely, retinal (fundus) images, to provide a diagnostic measure for early detection of glaucoma. However, the general procedure of using feature-extraction techniques first and then registering and analyzing images by using power-spectrum and two-dimensional cepstrum techniques provides an unambiguous, accurate, and fast technique for the analysis of a broad range of sequential complex images.

26 citations


Journal ArticleDOI
TL;DR: The use of the Hartley transform (HT) in cepstrum analysis, as a substitute for the more commonly used Fourier transform (FT), is examined.
Abstract: The use of the Hartley transform (HT) in cepstrum analysis, as a substitute for the more commonly used Fourier transform (FT), is examined. With this substitution, the input to the cepstrum must be in the real domain only. The benefits of using the HT are approximately 50% less data memory required and approximately 40% faster program execution, at no loss in accuracy.
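
To make the substitution in the Hartley-transform abstract above concrete, here is a minimal sketch of a power cepstrum computed from discrete Hartley transforms of a real input. It is an illustration under assumptions, not the paper's implementation: the Hartley transform is obtained from NumPy's FFT for brevity (so it shows the algebra, not the memory or speed savings), and the function names and the `floor` parameter are hypothetical.

```python
import numpy as np

def hartley(x):
    """Discrete Hartley transform, H[k] = sum_n x[n] * cas(2*pi*k*n/N), via the FFT."""
    spectrum = np.fft.fft(x)
    return spectrum.real - spectrum.imag

def power_cepstrum_hartley(x, floor=1e-12):
    """Power cepstrum of a real sequence using only Hartley-domain quantities."""
    h = hartley(x)
    h_rev = np.roll(h[::-1], 1)            # H[(N - k) mod N]
    power = 0.5 * (h ** 2 + h_rev ** 2)    # equals |X[k]|^2 for real x
    log_power = np.log(power + floor)
    return hartley(log_power) / len(x)     # inverse DHT is the forward DHT divided by N

rng = np.random.default_rng(1)
x = rng.standard_normal(128)

ceps_hartley = power_cepstrum_hartley(x)
ceps_fourier = np.fft.ifft(np.log(np.abs(np.fft.fft(x)) ** 2 + 1e-12)).real
```

Both routes give the same cepstrum for real input, and every intermediate array in the Hartley route is real.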

Proceedings ArticleDOI
23 May 1989
TL;DR: A novel homomorphic vocoder producing good quality speech at 4.8 kb/s is presented, where the real cepstra containing vocal tract information are vector quantized and the excitation signals are obtained applying an analysis-by-synthesis method using a weighting filter which is derived from the cepstrum.
Abstract: A novel homomorphic vocoder producing good quality speech at 4.8 kb/s is presented. In this homomorphic vocoder framework, the real cepstra containing vocal tract information are vector quantized and the excitation signals are obtained applying an analysis-by-synthesis method using a weighting filter which is derived from the cepstrum. Informal listening tests show that this vocoder is fully competitive with the code-excited linear predictive coder in producing good-quality speech at 4.8 kb/s.

Proceedings ArticleDOI
25 Apr 1989
TL;DR: In this article, a signal processing algorithm is developed to estimate the location of a discontinuity, e.g., a fault, on an electrical line, which is applied to fast sampled data and performs pecstral analysis of pulsed signals.
Abstract: A signal processing algorithm is developed to estimate the location of a discontinuity, e.g. a fault, on an electrical line. It is applied to fast sampled data and performs pecstral analysis of pulsed signals. The pecstrum algorithm is based on a transform similar to the one used in the cepstrum technique. It is demonstrated both theoretically and experimentally that the performance of the pecstrum estimator is significantly better than that of previously proposed methods. The proposed solution is operational in industrial digital reflectometers and has proved to be robust and successful in practice. With respect to cable-fault location, accuracies up to 30 cm are obtained when a fast, 20-MHz, 8-bit sampler is used for the acquisition of pulsed signals in a reflectogram.

Patent
25 Sep 1989
TL;DR: In this article, the authors proposed a method to obtain a cepstrum with high accuracy by finishing interference luminous flux into a parallel beam with a specific beam diameter and making it incident on the surface of a sample.
Abstract: PURPOSE: To obtain a cepstrum with high accuracy by finishing interference luminous flux into a parallel beam with a specific beam diameter and making it incident on the surface of a sample. CONSTITUTION: The interference luminous flux guided out of a Michelson interferometer 13 is finished into the parallel beam with the specific beam diameter by an optical system 26 for lighting and the light is made incident on the surface of the sample 11, so variation in the angle θ of incidence and variance of the incidence surface are reduced greatly. Consequently, the optical path of transmitted light in a multi-layered film is put as close to an ideal system as possible and the obtained cepstrum contains accurate information on the thin multi-layered film. Therefore, the thicknesses of the respective layers of the multi-layered film, the states of borders between the respective layers, etc., can accurately be evaluated. Further, when a Fourier spectral analysis is taken, data sampling intervals for the travel of a moving mirror 16 are made short and a data arithmetic wave number area is expanded into a wide area, so separate analyses are taken for the measurement of the respective layer thickness of an extremely thin multi-layered film. COPYRIGHT: (C)1991,JPO&Japio

Journal ArticleDOI
TL;DR: In this article, exact analytical formulas for the 2D logarithmic cepstrum and 2D differential cepstral signal were derived for a class of 2D signals.

Proceedings ArticleDOI
23 May 1989
TL;DR: An adaptive filtering method is proposed for the identification of a non-Gaussian, white-noise-driven, linear, generally non-minimum-phase system and the use and effectiveness of the method in blind linear-equalization applications is demonstrated.
Abstract: An adaptive filtering method is proposed for the identification of a non-Gaussian, white-noise-driven, linear, generally non-minimum-phase system. The non-minimum-phase impulse response of the system is estimated from the updated differential cepstrum parameters of the higher order cumulants (polycepstra) of the system output. The updated cepstrum parameters are obtained by utilizing higher order cumulants and an LMS (least mean squares) type algorithm. It is shown, using Monte-Carlo simulation examples, that the proposed method would perform very well at the expense of more computations. The use and effectiveness of the method in blind linear-equalization applications is demonstrated.

Proceedings ArticleDOI
23 May 1989
TL;DR: Listening tests for uniform and nonuniform pitch modification reveal that the proposed algorithm can synthesize high-quality speech and that it is applicable to a synthesis-by-rule system.
Abstract: The authors propose a new speech modification algorithm using short-time Fourier transform (STFT) synthesis. This algorithm is developed using the criterion that the mean-square-error signals between the STFT spectra of the estimated and the modified should be minimized. The most important and unique ideas of the algorithm are liftering that passes all cepstra except cepstra in the pitch frequency region, and phase control by a signal reconstruction algorithm and window-shift adjustment. Listening tests for uniform and nonuniform pitch modification reveal that the proposed algorithm can synthesize high-quality speech and that it is applicable to a synthesis-by-rule system.
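
The liftering step named in the speech-modification abstract above, which passes all cepstra except those in the pitch region, can be sketched as a notch applied in the quefrency domain. This is only an illustration of that one step, not the authors' algorithm: the pitch range of 60 to 400 Hz, the sampling rate, the synthetic frame, and the function name are assumptions, and the phase control and window-shift adjustment are not shown.

```python
import numpy as np

def lifter_out_pitch(frame, fs, f_lo=60.0, f_hi=400.0):
    """Zero the cepstral bins in the assumed pitch range.

    Returns the liftered log-magnitude spectrum, the original real cepstrum,
    and the lifter that was applied.
    """
    n = len(frame)
    log_mag = np.log(np.abs(np.fft.fft(frame)) + 1e-12)  # floor avoids log(0)
    ceps = np.fft.ifft(log_mag).real

    q_lo = int(fs / f_hi)              # shortest pitch period in samples
    q_hi = int(np.ceil(fs / f_lo))     # longest pitch period in samples
    lifter = np.ones(n)
    lifter[q_lo:q_hi + 1] = 0.0
    lifter[n - q_hi:n - q_lo + 1] = 0.0  # mirrored bins keep the cepstrum symmetric

    liftered_log_mag = np.fft.fft(ceps * lifter).real
    return liftered_log_mag, ceps, lifter

# Synthetic voiced-like frame: pulses every 80 samples (100 Hz at 8 kHz)
# driving a short decaying response, then windowed.
fs = 8000.0
n = 512
pulses = np.zeros(n)
pulses[::80] = 1.0
frame = np.convolve(pulses, 0.9 ** np.arange(50))[:n] * np.hamming(n)

liftered_log_mag, ceps, lifter = lifter_out_pitch(frame, fs)
```

Cepstral bins outside the notch are untouched, so the spectral envelope carried by the low quefrencies survives while the pitch-related ripple is removed from the log spectrum.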

Proceedings ArticleDOI
23 May 1989
TL;DR: Improvement of vowel recognition is extended to the more general phoneme recognition task by use of a hierarchical feature integration method, which utilizes the vowel recognition results in formant feature space together with consonant recognition based on the LPC-based cepstral feature space.
Abstract: A report is presented of comparative results for vowel classification using hidden Markov models based on linear predictive coding (LPC)-based cepstral vectors and formant features. The classification accuracy is shown to be significantly improved by using time duration constraints in formant feature space, especially for the formant mel-frequency representation and its time derivative. The highest vowel recognition accuracy is obtained by integrating the two feature spaces, multiplying the probabilities computed in the separate feature spaces. This improvement of vowel recognition is extended to the more general phoneme recognition task by use of a hierarchical feature integration method, which utilizes the vowel recognition results in formant feature space together with consonant recognition based on the LPC-based cepstral feature space.

Patent
07 Apr 1989
TL;DR: In this article, the authors proposed a circuit consisting of a linear predictive coding (LPC) analytic part 11, a 2nd LPC spectrum coefficient standard vector arithmetic part 13, a statistical distance arithmetic part 14, a statistic arithmetic part 15, a sound/silence decision part 16, a power buffer memory part 17, an LPC coefficient vector buffer memory part 18, and a 1st LPC cepstrum distance arithmetic part 19.
Abstract: PURPOSE: To securely detect the sound section of power smaller than background noise power by making a sound/silence decision in consideration of spectrum structure characteristics. CONSTITUTION: This circuit consists of a linear predictive coding (LPC) analytic part 11, a 2nd LPC spectrum coefficient standard vector arithmetic part 13, a statistical distance arithmetic part 14, a statistic arithmetic part 15, a sound/silence decision part 16, a power buffer memory part 17, an LPC coefficient vector buffer memory part 18, and a 1st LPC cepstrum distance arithmetic part 19. Then a feature parameter vector representing spectrum structure is used for decision and the difference quantity (distance) between the mean feature parameter vector in a past analyzed section decided as a silence section and the feature parameter vector in a current object analyzed section is found to decide the object analyzed section. Consequently, even the sound section where speech signal power is smaller than background noise power can accurately be decided. COPYRIGHT: (C)1990,JPO&Japio

09 Oct 1989
TL;DR: The Hartley transform is introduced as a substitute for the Fourier transform in the computation of the cepstrum and it has the additional advantage that its forward and inverse forms are identical, thus simplifying implementation.
Abstract: Homomorphic deconvolution has some appealing properties for speech coding applications, yet has failed to achieve any significant popularity. This is due, in part, to the mathematical complexity of the deconvolution process and the difficulty in handling the spectral phase information in the formation of the cepstrum. The paper introduces the application of the Hartley transform as a substitute for the Fourier transform in the computation of the cepstrum. This real transform avoids the need to handle the phase explicitly and thus removes the need for phase unwrapping. The Hartley transform has similar deconvolution properties as the Fourier transform and also has a fast implementation. It has the additional advantage that its forward and inverse forms are identical, thus simplifying implementation. The difficulties with the complex Fourier cepstrum are discussed, the alternative form, the Hartley cepstrum is introduced and some of its useful properties are examined. The use of the Hartley cepstrum in a homomorphic coder is also discussed and some preliminary results are given.

Proceedings ArticleDOI
01 Jan 1989
TL;DR: A noisy cepstral signal model for speech processing and two singular value decomposition (SVD) based approaches that greatly enhance cepstral-based pitch estimation performance in noisy environments are proposed.
Abstract: Visual Information Technologies, Plano, Texas 75075, (214) 985-2267. Erik Jonsson School of Engineering and Computer Science, The University of Texas at Dallas, Richardson, Texas 75083-0688, (214) 690-2894, degroat@utdallas.edu. Joseph Picone, Speech and Image Understanding Laboratory, Texas Instruments Inc., P.O. Box 655474, MS 238, Dallas, Texas 75265, (214) 995-6627.

The FFT-based cepstral method of human speech pitch (or fundamental frequency) determination is known to be accurate and reliable in studio-quality environments; however, it leaves much to be desired at lower signal-to-noise ratios. Cepstral pitch determination techniques, which are a special case of the more general theory of homomorphic signal processing, rely on the log operation to deconvolve the pitch sequence from the vocal tract response sequence. Classical cepstral processing models do not account for noise added to the signal. In this paper, we develop a noisy cepstral signal model for speech processing and we propose two Singular Value Decomposition (SVD) based approaches which greatly enhance cepstral-based pitch estimation performance in noisy environments.

Speech Production and Cepstral Pitch Determination

Voiced speech production can be modeled reasonably well as a pseudo pulse train (pitch sequence) convolved with a linear system (vocal tract impulse response). Speech is considered wide-sense stationary over short time segments (20-40 msec) [1], which makes analysis possible over short time windows (M frames). We assume that the z-domain description of the speech signal is modeled by [2], [3]

$$S(z) = H(z)\,P(z) \tag{1}$$

where H(z) is the z-transform of the vocal tract response sequence and P(z) is the z-transform of the pitch sequence. Analytical expressions for H(z) and P(z) may be found in [2] or [3].

We may use homomorphic filtering techniques to separate the multiplicative relationship in (1) using the complex log operation, thereby causing the pitch cepstrum and the vocal tract response cepstrum to occupy approximately disjoint quefrency spaces [2], [4]. Practical implementations of cepstral pitch determination may be obtained from [4], in which it is shown that the inverse FFT of the log of the magnitude of the FFT provides us with the real version of the cepstrum. The connections between the complex cepstrum and the real cepstrum (usually denoted by just cepstrum) are shown in [2] and [3].

The Noise Problem

It is easy to see that homomorphic filtering (cepstral) techniques will not offer good performance in noise. Returning to (1) and taking the complex log operation, we find that

$$\log[S(z)] = \log[H(z)\,P(z)] = \log[H(z)] + \log[P(z)]. \tag{2}$$

The separation of S(z) into its constituent parts works out very neatly assuming that no noise is added to the system. On the other hand, if noise is added to the system, we obtain

$$\log[S(z) + N(z)] = \log[H(z)\,P(z) + N(z)]. \tag{3}$$

A Cepstral Model for Speech Signals in Noise

Manipulating (3) yields a noisy cepstral signal model

$$\log[H(z)\,P(z) + N(z)] = \log[H(z)\,P(z)] + \log\left[1 + \frac{N(z)}{H(z)\,P(z)}\right] \tag{4}$$

which clearly exposes the desired signal component in the first term of the right-hand side. We shall find great utility in going to vector and matrix notation at this point following a discretization of equation (4). The appropriate discrete Fourier transform (DFT) equivalent of (4) is

$$\log[H(k)\,P(k) + N(k)] = \log[H(k)\,P(k)] + \log\left[1 + \frac{N(k)}{H(k)\,P(k)}\right] \tag{5}$$

where k = 0, ..., M - 1 is the discrete normalized frequency variable. We shall also stay consistent with the notation found in [2] and [3] for representing the log of a general function, X(k), as $\hat{X}(k)$. Thus, we represent (5) in vector form as

$$P = Z + \log[1 + D^{-1} n] \tag{6}$$

23ACSSC-12/89/0744, © 1989 Maple Press

Journal ArticleDOI
TL;DR: In this article, the cepstrum analysis of Doppler signals for detecting vibrations of periodically vibrating structures, such as aortic valves in children with innocent heart murmurs, was carried out in two stages.

Proceedings ArticleDOI
08 May 1989
TL;DR: An optimum design method is presented for a homomorphic deconvolution system using an exponentially weighted complex cepstrum with a band-pass filter and a precise compression is derived for the deconvolved signal by introducing a transmission factor.
Abstract: An optimum design method is presented for a homomorphic deconvolution system using an exponentially weighted complex cepstrum with a band-pass filter. A precise compression is derived for the deconvolved signal by introducing a transmission factor, which is defined as the ratio of the deconvolved signal to the desired signal. An optimal deconvolution system is defined as the system which maximizes the energy of the maximum distortion factor (=1-transmission factor). The performance of the proposed optimum system is greatly superior to that of the conventional system.

Patent
22 Sep 1989
TL;DR: In this article, the authors proposed to stably extract a voice fundamental period even if there is much change in the voice fundamental period by providing plural voice analyzing means, a selective coupling section for the outputs thereof and a voice fundamental period extracting section.
Abstract: PURPOSE: To stably extract a voice fundamental period even if there is much change in the voice fundamental period by providing plural voice analyzing means, a selective coupling section for the outputs thereof and a voice fundamental period extracting section. CONSTITUTION: A microphone 1 is led via an A/D converter 2 to the voice fundamental period extractor by a cepstrum method. A spectral analyzing section 3-1 is shortest in analysis section length and 3-n is longest. The analysis section length of the analyzing section 5-5 past a logarithmic conversion section 4 is shortest and is longest in 5-n. The adequate cepstra of the analysis section length are selected and coupled according to the frequency part in a cepstrum selective coupling section 6. The maximum value of the cepstra obtained in the selective coupling section 6 is determined and the position of said maximum value is determined as the fundamental period in the fundamental period extraction section 7. The flexible follow-up to the fundamental frequencies of the heavily changing voices is enabled and the stable extraction of the fundamental frequencies is possible according to this constitution.

Proceedings ArticleDOI
23 May 1989
TL;DR: The authors compare a special case of these results and find the range of SNR in which they expect the bicepstrum method to perform better than the complex cepstrum method.
Abstract: The purpose of this study is to present an analytic performance evaluation of the complex cepstrum and bicepstrum (i.e. cepstrum of the bispectrum) methods by providing explicit expressions of the bias and variance of cepstrum parameters. A model consisting of a deterministic signal in additive white Gaussian noise and finite length data is assumed. The authors compare a special case of these results and find the range of SNR in which they expect the bicepstrum method to perform better than the complex cepstrum method.

Journal ArticleDOI
TL;DR: In this paper, the authors examined the mechanism of sound production in tettigoniids by applying the method of "cepstrum" analysis to insect calls; the power cepstrum is defined as the inverse Fourier transform of the logarithmic power spectrum.
Abstract: The mechanism of sound production in tettigoniids is examined by applying the method of ‘cepstrum’ analysis to insect calls. The power cepstrum is defined as the inverse Fourier transform of the logarithmic power spectrum. This analysis shows that the tettigoniid sound signal is a convolution in time of probably two components. The first is caused by the initial impact of teeth of the stridulatory file on the left wing against the plectrum on the right wing (termed the input pulse); the second is caused by the oscillating properties of the tegmina (these being a function of the intrinsic frequencies of dorsal fields and mirror and their damping properties). In the cepstrum each component appears as a varying number of peaks. The tooth impacts cause a very low quefrency peak probably representing the time in which the two tegmina are in contact during each impact and high quefrency peaks representing the impulse repetition rate. The oscillating properties of the tegmina cause two major quefrency p...
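
The tettigoniid abstract above treats the call as a convolution in time of two components and separates them with the power cepstrum. The property being used is that convolution becomes addition in the cepstrum, which the sketch below checks numerically. The two synthetic components (an "input pulse" with one repeat and a damped oscillation standing in for the tegmina), their parameters, and the use of circular convolution are assumptions made to keep the check exact; no insect data is involved.

```python
import numpy as np

def power_cepstrum(x):
    """Inverse Fourier transform of the logarithmic power spectrum."""
    return np.fft.ifft(np.log(np.abs(np.fft.fft(x)) ** 2)).real

n = 256
t = np.arange(n)

# Component 1: an impact followed by a weaker repeat 40 samples later.
input_pulse = np.zeros(n)
input_pulse[0] = 1.0
input_pulse[40] = 0.6

# Component 2: a damped oscillation (hypothetical resonator response).
resonance = 0.95 ** t * np.cos(0.3 * np.pi * t)

# Circular convolution of the two components, computed in the frequency domain.
call = np.fft.ifft(np.fft.fft(input_pulse) * np.fft.fft(resonance)).real

ceps_call = power_cepstrum(call)
ceps_sum = power_cepstrum(input_pulse) + power_cepstrum(resonance)
```

The repeat interval of the pulse component shows up as a peak at quefrency 40, while the resonance contributes mainly at low quefrencies, so the two components occupy different parts of the cepstrum.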

Journal ArticleDOI
TL;DR: In this paper, two methods are shown for characterizing the shape of a power spectrum in order to estimate the distribution (randomness) of the spacings of the scatterers: FFT of the cepstrum (FFTC), and fractal-based analysis, which is useful for characterizing complex figures such as geometry.
Abstract: Two methods are shown for characterizing shapes of a power spectrum to estimate distribution (randomness) of the spacings of the scatterers. One method is FFT of cepstrum (FFTC). The other is fractal-based analysis which is useful for characterization of complex figures such as geometry. Using a one-dimensional scattering model, trial estimations of the randomness of the scatterers were performed, and it was found that the randomness could be discriminated by both methods.

Patent
Hattori Hiroaki1
29 Nov 1989
TL;DR: In this paper, speech is recognized with high accuracy even under high noise by using low-band information, where spectral fluctuations are few, in the parts of the speech that seem to be vowels, and full-band information in the remaining parts.
Abstract: PURPOSE: To recognize speech with high accuracy even under high noise by using low-band information, where spectral fluctuations are few, in the parts of the speech that seem to be vowels, and full-band information in the remaining parts. CONSTITUTION: The speech input to a terminal 401 is received in a full band analysis section 402, where cepstrum coefficients are determined using the full-band information and a full band characteristic cepstrum time series is formed. In a low band analysis section 403, cepstrum coefficients are similarly determined using the information at ≤2.5 kHz, and a low band characteristic cepstrum time series is formed. The full band characteristic vectors of M words are stored in advance in a full band standard pattern memory section 404, and the low band characteristic vectors of M words are stored in a low band standard pattern memory section 406. Weighting coefficients for the M words are stored in advance in a coefficient memory section 408. The distances for the respective bands are calculated by a full band distance calculating section 405 and a low band distance calculating section 407. The inter-frame distances are calculated using the weighting coefficients in an inter-frame distance calculating part, and the matching is recognized in a recognition section 410.
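
The abstract does not say how the weighting coefficients combine the two band distances, so the following is only a sketch under an assumed convex blend of a low-band and a full-band Euclidean distance. The function names, the Euclidean metric, and the convention that `weight` lies in [0, 1] are hypothetical choices, not details from the patent.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length cepstral vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def combined_frame_distance(full_in, full_ref, low_in, low_ref, weight):
    """Blend full-band and low-band distances for one input/reference frame pair.

    weight is an assumed per-word coefficient in [0, 1]:
    1.0 trusts only the low band, 0.0 trusts only the full band.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must lie in [0, 1]")
    d_full = euclidean(full_in, full_ref)
    d_low = euclidean(low_in, low_ref)
    return weight * d_low + (1.0 - weight) * d_full


# Hypothetical frame pair: full-band distance is 1.0, low-band distance is 5.0.
distance = combined_frame_distance(
    full_in=[1.0, 0.0], full_ref=[0.0, 0.0],
    low_in=[3.0, 4.0], low_ref=[0.0, 0.0],
    weight=0.25,
)
print(distance)
```

Under this assumption, a word whose stored weight is large is matched mostly on the low band, where the abstract says spectral fluctuations are few.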


Patent
02 May 1989
TL;DR: In this paper, a voice is inputted from a microphone connected to a logarithmic amplifier and a converter receives an output from the amplifier and samples a voice waveform and a feature extracting circuit calculates a linear prediction encoding parameter.
Abstract: PURPOSE: To provide a voice recognition method capable of being sufficiently accurately operated by an inexpensive system of which memory and processing functions are restricted by encoding a received voice frame, comparing the code with a reference template and determining optimum matching. CONSTITUTION: A voice is inputted from a microphone 12 connected to a logarithmic amplifier 14. A converter 16 receives an output from the amplifier 14 and samples a voice waveform, and a feature extracting circuit 18 calculates a linear prediction encoding parameter. Then a feature coefficient and an energy coefficient are obtained, a cepstrum parameter is calculated and compared with a threshold, and a voice frame is converted into a single byte of data. The converted data are transferred to a time registration device 20 and compared with a reference template 22 for the words of defined vocabularies, and the latest two optimum matching states are sent to a determination logic part 24. The logic part 24 combines the optimum matching states with energy information to determine optimum matching and outputs the determined result to a response control part 26.
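
One way to read "compared with a threshold, and a voice frame is converted into a single byte" is one bit per cepstrum parameter. The sketch below assumes exactly eight parameters per frame, one threshold per parameter, and a Hamming distance for comparing a frame against a template byte. None of these specifics are stated in the abstract, and the names `frame_to_byte` and `hamming_distance` are hypothetical.

```python
def frame_to_byte(cepstral_coeffs, thresholds):
    """Pack eight threshold comparisons into one byte (bit i set if coeff i > threshold i)."""
    if len(cepstral_coeffs) != 8 or len(thresholds) != 8:
        raise ValueError("this sketch assumes exactly eight coefficients and thresholds")
    byte = 0
    for i, (coeff, threshold) in enumerate(zip(cepstral_coeffs, thresholds)):
        if coeff > threshold:
            byte |= 1 << i
    return byte


def hamming_distance(a, b):
    """Number of differing bits between two byte codes."""
    return bin(a ^ b).count("1")


# Hypothetical frame and template; thresholds of zero are an assumption.
frame = [0.3, -0.1, 0.2, -0.4, 0.05, -0.02, 0.6, -0.7]
code = frame_to_byte(frame, [0.0] * 8)
template_code = 0b01010111
print(code, hamming_distance(code, template_code))
```

A one-byte code per frame keeps both template storage and the per-frame comparison very small, which fits the abstract's stated goal of running on a system with restricted memory and processing.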