scispace - formally typeset

Showing papers on "Cepstrum" published in 1975


01 Apr 1975
TL;DR: Theoretical and experimental studies were done to improve the performance of INTEL, a process for improving the signal-to-noise ratio of speech corrupted by wideband noise. The theoretical study shows how the process works, why its effect on speech differs from its effect on noise, and why the cepstrum does not provide similar benefits.
Abstract: This report describes theoretical and experimental studies done to improve the performance of INTEL, a process for improving the signal-to-noise ratio of speech which has been corrupted by wideband noise. The theoretical study is a detailed statistical study of the process, showing how it works, why its effect on speech is different from its effect on noise, and why the cepstrum, a closely similar process, does not provide similar benefits. The experimental studies were explorations suggested by previous studies. These studies implemented various modifications to the basic process in the hope that they would improve its performance. The modifications were: threshold clipping, center clipping, harmonic emphasis, and adaptation to narrow-band speech. In addition, a number of experiments were conducted to study the effect of phase on speech intelligibility.

68 citations


Journal Article
TL;DR: In this paper, a technique for semiautomatically determining the pitch contour of an utterance is described, which is significantly more sophisticated than the standard technique of hand tracking of pitch periods from a waveform display of the utterance and leads to a fairly robust measurement of the pitch period.
Abstract: The purpose of this paper is to describe a technique for semiautomatically determining the pitch contour of an utterance. The method is significantly more sophisticated than the standard technique of hand tracking of pitch periods from a waveform display of the utterance and leads to a fairly robust measurement of the pitch period. This technique utilizes a simultaneous display (on a 10 ms section-by-section basis) of the low-pass filtered waveform, the autocorrelation of a 400-point segment of the low-pass filtered waveform, and the cepstrum of the same 400-point segment of the wideband recording. For each of the separate displays (i.e., waveform, autocorrelation, and cepstrum) an independent estimate of the pitch period is made on an interactive basis with the computer, and the final pitch period decision is made by the user based on the results of each of the measurements. The technique has been tested on a large number of utterances spoken by a variety of speakers with very good results. Formal tests of the method were made in which four people were asked to use the method on three different utterances, and their results were then compared. During voiced regions, the standard deviation in the value of the pitch period was about 0.5 samples across the four people. The standard deviation of the location of the time at which voiced regions became unvoiced, and vice versa, was on the order of half a section duration, or 5 ms. The major limitation of the proposed method is that it requires about 30 min to analyze 1 s of speech. However, the increased accuracy and robustness of the results indicate that the tradeoff of time for accuracy is a good one for many applications.
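The two automatic measurements named in this abstract, an autocorrelation estimate and a cepstrum estimate taken from one 400-point segment, can be illustrated with a minimal sketch. This is not the authors' interactive system: the synthetic test signal, the search range of 25 to 150 samples, the Hamming window, and the function names are all assumptions made for illustration, the low-pass filtering step is omitted, and a naive standard-library DFT stands in for an FFT.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform using only the stdlib."""
    n = len(x)
    sign = 1 if inverse else -1
    tw = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(x[m] * tw[(k * m) % n] for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def synth_voiced(n, period, r=0.9, w0=0.6):
    """Hypothetical test signal: impulse train driving a two-pole resonator."""
    y = [0.0] * n
    for i in range(n):
        excitation = 1.0 if i % period == 0 else 0.0
        y1 = y[i - 1] if i >= 1 else 0.0
        y2 = y[i - 2] if i >= 2 else 0.0
        y[i] = excitation + 2 * r * math.cos(w0) * y1 - r * r * y2
    return y


def pitch_autocorr(seg, lo, hi):
    """Lag in [lo, hi] that maximizes the short-time autocorrelation."""
    def acf(lag):
        return sum(seg[i] * seg[i + lag] for i in range(len(seg) - lag))
    return max(range(lo, hi + 1), key=acf)


def pitch_cepstrum(seg, lo, hi):
    """Quefrency in [lo, hi] that maximizes the real cepstrum."""
    n = len(seg)
    win = [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]
    spec = dft([s * w for s, w in zip(seg, win)])
    # Small floor avoids log(0) at bins with almost no energy.
    log_mag = [math.log(abs(v) + 1e-9) for v in spec]
    ceps = [v.real for v in dft(log_mag, inverse=True)]
    return max(range(lo, hi + 1), key=lambda q: ceps[q])


# 400-point segment cut from a longer signal with an assumed 80-sample period.
segment = synth_voiced(1200, 80)[400:800]
p_ac = pitch_autocorr(segment, 25, 150)
p_cep = pitch_cepstrum(segment, 25, 150)
```

In the method described above the user, not the program, makes the final decision after comparing the separate estimates; the sketch only produces the two candidate values. At an assumed 10 kHz sampling rate (not stated in the abstract), an 80-sample period would correspond to 125 Hz.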

45 citations


Journal Article
R. Rom
TL;DR: Some properties of the two-dimensional cepstrum (especially those absent in the one-dimensional) appear to make it an important tool in image processing.
Abstract: Cepstral analysis has been used in speech processing for some time, but in the field of image processing very little attention has been paid to it. Some properties of the two-dimensional cepstrum (especially those absent in the one-dimensional) appear to make it an important tool in image processing. In this correspondence these properties are summarized. Applications to image deblurring (blur identifying) and image classification are mentioned as some possible uses.

44 citations


Journal Article
TL;DR: In this paper, the application of homomorphic filtering in marine seismic reflection work is investigated with the aim of achieving the estimation of the basic wavelet, the wavelet deconvolution, and the elimination of multiples.
Abstract: The application of homomorphic filtering in marine seismic reflection work is investigated with the aim of achieving the estimation of the basic wavelet, the wavelet deconvolution, and the elimination of multiples. Each of these deconvolution problems can be subdivided into two parts. The first part is the detection of those parts of the cepstrum which ought to be suppressed in processing. The second part includes the actual filtering process and the problem of minimizing the random noise, which generally is enhanced during the homomorphic procedure. The application of homomorphic filters to synthetic seismograms and air-gun measurements shows the possibilities for the practical application of the method as well as the critical parameters which determine the quality of the results. These parameters are: a) the signal-to-noise ratio (SNR) of the input data, b) the window width and the cepstrum components for the separation of the individual parts, and c) the time invariance of the signal in the trace. In the presence of random noise the power cepstrum is most efficient for the detection of wavelet arrival times. For wavelet estimation, overlapping signals can be detected with the power cepstrum up to an SNR of three. In comparison with this, the detection of long-period multiples is much more complicated. While the exact determination of the water reverberation arrival times can be realized with the power cepstrum up to a multiples-to-primaries ratio of three to five, the detection of the internal multiples is generally not possible, since for these multiples this threshold value of detectability and arrival time determination is generally not realized. For wavelet estimation, comb filtering of the complex cepstrum is most valuable. The wavelet estimation gives no problems up to an SNR of ten. Even in the presence of larger noise a reasonable estimation can be obtained up to an SNR of five by filtering the phase spectrum during the computation of the complex cepstrum. In contrast to this, the successful application of the method for the multiple reduction is confined to an SNR of ten, since the filtering of the phase spectrum for noise reduction cannot be applied. Even if the threshold results are empirical, they show the limits for the successful application of the method.
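The use of the power cepstrum to detect an arrival time can be sketched on a noise-free toy trace. Everything below is an assumption for illustration rather than the paper's procedure: the damped-cosine wavelet, the single echo with delay 60 and relative amplitude 0.5, the 256-sample trace, and the function names. The cepstrum here is taken as the inverse DFT of the log power spectrum without the final squaring used in the classical definition, so that the sign of the peak is kept.

```python
import cmath
import math


def dft(x, inverse=False):
    """Naive O(N^2) discrete Fourier transform using only the stdlib."""
    n = len(x)
    sign = 1 if inverse else -1
    tw = [cmath.exp(sign * 2j * math.pi * k / n) for k in range(n)]
    out = [sum(x[m] * tw[(k * m) % n] for m in range(n)) for k in range(n)]
    return [v / n for v in out] if inverse else out


def power_cepstrum(x):
    """Inverse DFT of the log power spectrum (result left unsquared)."""
    spec = dft(x)
    log_pow = [math.log(abs(v) ** 2 + 1e-12) for v in spec]
    return [v.real for v in dft(log_pow, inverse=True)]


def wavelet(length=40, r=0.8, w0=0.9):
    """Hypothetical basic wavelet: a short damped cosine."""
    return [r ** n * math.cos(w0 * n) for n in range(length)]


def trace_with_echo(s, delay, gain, n):
    """Trace of length n holding the wavelet plus one delayed, scaled copy."""
    x = [0.0] * n
    for i, v in enumerate(s):
        x[i] += v
        x[i + delay] += gain * v
    return x


def detect_arrival(x, min_q):
    """Largest cepstral value between min_q and half the trace length."""
    c = power_cepstrum(x)
    q = max(range(min_q, len(x) // 2), key=lambda i: c[i])
    return q, c[q]


trace = trace_with_echo(wavelet(), delay=60, gain=0.5, n=256)
arrival, peak = detect_arrival(trace, min_q=20)
```

The lower search bound `min_q` skips the low-quefrency region where the wavelet's own cepstrum is concentrated. For this echo model the peak height approximates the echo's relative amplitude. Added random noise, which the abstract identifies as the limiting factor, is not modeled.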

33 citations


Journal Article
TL;DR: In this paper, it is shown that distortion in the channel modulates the time delay carrier and sets definite limits on the performance of the complex cepstrum, and that the effect of the noise is additive for a noise spectrum which is pointwise weak relative to that of the signal.

25 citations


Journal Article
R. Smith
TL;DR: An expression is derived for the function that governs the discrimination by the power cepstrum against components at large time delays; the function has been found to be useful in normalizing cepstrum displays.
Abstract: An expression is derived for the function that governs the discrimination by the power cepstrum against components at large time delays. The function has been found to be useful in normalizing cepstrum displays.

5 citations


Journal Article
TL;DR: The results of the study suggest that 20‐msec time resolution is adequate for vocoder applications, and that adapting to better time resolution in unvoiced regions and in regions of voiced–unvoiced and unvoiced–voiced transitions leads to improved speech quality in systems that do not normally maintain 20 msec or better time resolution.
Abstract: This paper reports the results of a study of the perceptual consequences of the time and frequency resolution loss inherent in vocoded speech, and an evaluation of an adaptive resolution scheme. A cepstrum vocoder which adapted its time and frequency resolution according to the voiced–unvoiced nature of the input speech was computer simulated. Speech processed by the vocoder was subjectively evaluated and several tentative conclusions regarding time‐frequency resolution and speech quality were drawn. The results of the study suggest that (1) 20‐msec time resolution is adequate for vocoder applications, (2) adapting to better time resolution in unvoiced regions and regions of voiced–unvoiced and unvoiced–voiced transitions leads to improved speech quality in systems that do not normally maintain 20 msec or better time resolution, (3) frequency resolution may be reduced considerably in unvoiced and transition regions with no noticeable degradation in speech quality, and (4) time‐frequency resolution trading...

4 citations



01 May 1975
TL;DR: An investigation into the use of cepstrum analysis in connection with the processing of communication signals which have been corrupted by multipath effects and the design of signal waveforms whose cepstra have desirable properties is summarized.
Abstract: The report summarizes an investigation into the use of cepstrum analysis in connection with the processing of communication signals which have been corrupted by multipath effects. Following up on an earlier phase of work, the primary objective consisted of the development and performance evaluation of an experimental setup, incorporating a minicomputer, for on-line cepstrum analysis of baseband communication signals. The purpose of the cepstrum analysis is the extraction of multipath information which can then be used to maintain a multipath cancellation filter in correct adjustment. Another objective was the design of signal waveforms whose cepstra have desirable properties for this type of processing. The basic experimental setup was completed and is described in this report. Test data is presented which was obtained with signals from various modems and with simulated simple discrete multipath interference. Various sources of error are discussed and analyzed.
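How extracted multipath information could drive a cancellation filter can be sketched for the simplest case of one discrete echo. The report does not give its filter structure, so the recursive form below, the echo model, the test signal, and the function names are assumptions. The delay and gain are taken as already known, standing in for values that a cepstrum analysis would supply.

```python
import math


def add_multipath(s, delay, gain):
    """Simulate one discrete echo: x[n] = s[n] + gain * s[n - delay]."""
    x = list(s) + [0.0] * delay
    for i, v in enumerate(s):
        x[i + delay] += gain * v
    return x


def cancel_multipath(x, delay, gain):
    """Recursive inverse of the echo model: y[n] = x[n] - gain * y[n - delay]."""
    y = [0.0] * len(x)
    for i, v in enumerate(x):
        y[i] = v - (gain * y[i - delay] if i >= delay else 0.0)
    return y


# Hypothetical baseband signal and echo parameters.
signal = [math.sin(0.3 * n) for n in range(200)]
received = add_multipath(signal, delay=25, gain=0.6)
restored = cancel_multipath(received, delay=25, gain=0.6)

# Largest difference between the restored and the original samples.
residual = max(abs(a - b) for a, b in zip(restored, signal))
```

The recursion is stable only when the echo gain has magnitude below one. Errors in the estimated delay or gain would leave a residual echo, which is presumably why the report stresses keeping the filter in correct adjustment.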

2 citations