Showing papers on "Linear predictive coding" published in 1970


Journal ArticleDOI
TL;DR: Application of this method for efficient transmission and storage of speech signals as well as procedures for determining other speech characteristics, such as formant frequencies and bandwidths, the spectral envelope, and the autocorrelation function, are discussed.
Abstract: A method of representing the speech signal by time‐varying parameters relating to the shape of the vocal tract and the glottal‐excitation function is described. The speech signal is first analyzed and then synthesized by representing it as the output of a discrete linear time‐varying filter, which is excited by a suitable combination of a quasiperiodic pulse train and white noise. The output of the linear filter at any sampling instant is a linear combination of the past output samples and the input. The optimum linear combination is obtained by minimizing the mean‐squared error between the actual values of the speech samples and their predicted values based on a fixed number of preceding samples. A 10th‐order linear predictor was found to represent the speech signal band‐limited to 5 kHz with sufficient accuracy. The 10 coefficients of the predictor are shown to determine both the frequencies and bandwidths of the formants. Two parameters relating to the glottal‐excitation function and the pitch period are determined from the prediction error signal. Speech samples synthesized by this method will be demonstrated.

1,124 citations
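
The mean-squared-error minimization described in the abstract above can be illustrated with a small sketch. The abstract does not say how the minimization is solved, so the autocorrelation method with the Levinson-Durbin recursion is used here purely as an assumption; the function names and the test signal are hypothetical and not taken from the paper.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of x[n] * x[n - k]."""
    n = len(x)
    return [sum(x[i] * x[i - k] for i in range(k, n)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for a linear predictor of the given order.

    The prediction is x_hat[n] = sum_{k=1..order} a[k] * x[n - k].
    Returns (a, error): a[0] is unused and left at 0.0, and error is the
    energy of the prediction residual after the final recursion step.
    """
    a = [0.0] * (order + 1)
    error = r[0]
    for i in range(1, order + 1):
        if error <= 0.0:
            break  # degenerate input (e.g. all zeros): stop early
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error  # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        error *= (1.0 - k * k)
    return a, error


# Hypothetical test signal: the impulse response of a known 2nd-order
# all-pole filter, x[n] = 1.3 * x[n-1] - 0.6 * x[n-2] for n >= 2.
true_a = [1.3, -0.6]
signal = [1.0, 1.3]
for n in range(2, 400):
    signal.append(true_a[0] * signal[n - 1] + true_a[1] * signal[n - 2])

r = autocorrelation(signal, 2)
coeffs, residual_energy = levinson_durbin(r, 2)
print("estimated predictor coefficients:", coeffs[1:])
print("residual energy:", residual_energy)
```

On this synthetic signal the estimated coefficients should come out close to the 1.3 and -0.6 used to generate it. For speech, the abstract reports that order 10 was sufficient for a signal band-limited to 5 kHz, and the analysis would be repeated on short frames because the parameters vary with time.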


Journal ArticleDOI
TL;DR: Preliminary studies suggest that the binary difference signal and the predictor parameters together can be transmitted at approximately 10 kilobits/second which is several times less than the bit rate required for log-PCM encoding with comparable speech quality.
Abstract: We describe in this paper a method for efficient encoding of speech signals, based on predictive coding. In this coding method, both the transmitter and the receiver estimate the signal's current value by linear prediction on the previously transmitted signal. The difference between this estimate and the true value of the signal is quantized, coded and transmitted to the receiver. At the receiver, the decoded difference signal is added to the predicted signal to reproduce the input speech signal. Because of the nonstationary nature of the speech signals, an adaptive linear predictor is used, which is readjusted periodically to minimize the mean-square error between the predicted and the true value of the signals. The predictive coding system was simulated on a digital computer. The predictor parameters, comprising one delay and nine other coefficients related to the signal spectrum, were readjusted every 5 milliseconds. The speech signal was sampled at a rate of 6.67 kHz, and the difference signal was quantized by a two-level quantizer with variable step size. Subjective comparisons with speech from a logarithmic PCM encoder (log-PCM) indicate that the quality of the synthesized speech signal from the predictive coding system is approximately equal to that of log-PCM speech encoded at 6 bits/sample. Preliminary studies suggest that the binary difference signal and the predictor parameters together can be transmitted at approximately 10 kilobits/second which is several times less than the bit rate required for log-PCM encoding with comparable speech quality.

291 citations
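
The coding loop and the bit-rate comparison in the abstract above can be sketched as follows. This is a deliberately simplified illustration: the predictor coefficients and the step size are fixed, whereas the paper readjusts the predictor every 5 milliseconds and uses a variable step size. All names are hypothetical, and the log-PCM rate assumes the same 6.67 kHz sampling rate, which the abstract does not state.

```python
import math


def reconstruct(prediction, bit, step):
    """Add the two-level quantized difference (+step or -step) to the prediction."""
    return prediction + (step if bit else -step)


def encode(samples, coeffs, step):
    """Closed-loop predictive encoder with a two-level (1-bit) quantizer.

    coeffs[k] weights the reconstructed sample k + 1 steps in the past, so the
    encoder predicts from the same data the decoder will have.
    """
    history = [0.0] * len(coeffs)  # most recent reconstructed sample first
    bits = []
    for s in samples:
        prediction = sum(c * h for c, h in zip(coeffs, history))
        bit = 1 if s - prediction >= 0.0 else 0
        history = [reconstruct(prediction, bit, step)] + history[:-1]
        bits.append(bit)
    return bits


def decode(bits, coeffs, step):
    """Mirror of the encoder loop: rebuild the signal from the bit stream."""
    history = [0.0] * len(coeffs)
    output = []
    for bit in bits:
        prediction = sum(c * h for c, h in zip(coeffs, history))
        value = reconstruct(prediction, bit, step)
        history = [value] + history[:-1]
        output.append(value)
    return output


# Bit-rate arithmetic using the figures quoted in the abstract.
sample_rate_hz = 6670                       # 6.67 kHz sampling rate
difference_rate = sample_rate_hz * 1        # two-level quantizer: 1 bit/sample
log_pcm_rate = sample_rate_hz * 6           # log-PCM at 6 bits/sample (same rate assumed)
parameter_budget = 10000 - difference_rate  # what remains of about 10 kilobits/second

# Hypothetical slowly varying test signal and a first-order predictor.
step = 0.05
signal = [0.5 * math.sin(2.0 * math.pi * n / 200.0) for n in range(400)]
bits = encode(signal, [1.0], step)
decoded = decode(bits, [1.0], step)
print("difference signal:", difference_rate, "bits/s")
print("log-PCM at 6 bits/sample:", log_pcm_rate, "bits/s")
print("left for predictor parameters:", parameter_budget, "bits/s")
```

Under these assumptions the binary difference signal alone takes about 6.67 kilobits/second, leaving roughly 3.3 kilobits/second for the predictor parameters, against about 40 kilobits/second for 6-bit log-PCM.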


PatentDOI
TL;DR: Audio response units that select stored speech sounds as the excitation for a speech synthesizer are widely used, for example in telephone audio announcement terminals, but the speech most of them produce sounds noticeably artificial and mechanical.
Abstract: Audio response units that select speech sounds, stored in analog or coded digital form, as the excitation for a speech synthesizer are widely used, for example in telephone audio announcement terminals. The speech produced by most units is noticeably artificial and mechanical sounding.

196 citations


Patent
09 Oct 1970
TL;DR: A speech response apparatus consisting of a memory that stores speech parameters, readout means that read out of the memory the speech parameters designated by an electronic computer, and a speech synthesizer that reconstructs the speech signal from the output of the readout means.
Abstract: The audio response apparatus comprises means for storing speech parameters including partial autocorrelation coefficients between two closely adjacent time instants of the speech signal, which are derived by removing the redundant components from the actual speech signal levels of the two adjacent instants in consideration of the effect of intermediate sample levels between them, and excitation source information determined from sampled values at remotely spaced time instants, a memory to store the speech parameters, readout means to read out the speech parameters from the memory which are designated by an electronic computer, and a speech synthesizer to reconstruct the speech signal from the output of the readout means. The synthesizer comprises high-speed logic elements and operates to synthesize multichannel audio outputs on a time-division basis.

28 citations
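
To illustrate how a synthesizer can rebuild a signal from partial autocorrelation (PARCOR) coefficients, here is a minimal sketch of an all-pole lattice filter. The abstract gives neither the exact coefficient definition nor a sign convention, so this uses one common convention in which the coefficients are the reflection coefficients of a linear predictor x_hat[n] = sum_k a[k] * x[n - k]. All names and numeric values are hypothetical and not taken from the patent.

```python
def step_up(reflection):
    """Convert reflection coefficients k_1..k_p to predictor coefficients a_1..a_p."""
    a = []
    for k in reflection:
        # a_m(new) = a_m(old) - k * a_(i-m)(old) for m < i, and a_i(new) = k
        a = [a[j] - k * a[len(a) - 1 - j] for j in range(len(a))] + [k]
    return a


def lattice_synthesize(excitation, reflection):
    """Run an excitation sequence through an all-pole lattice filter.

    reflection[i - 1] holds k_i. The filter is stable when every |k_i| < 1.
    """
    order = len(reflection)
    delayed = [0.0] * order  # delayed[i] holds the backward error b_i[n-1]
    output = []
    for e in excitation:
        f = e  # forward error at the top stage, f_p[n]
        new_b = [0.0] * (order + 1)
        for i in range(order, 0, -1):
            k = reflection[i - 1]
            f = f + k * delayed[i - 1]         # f_(i-1)[n]
            new_b[i] = delayed[i - 1] - k * f  # b_i[n]
        new_b[0] = f                           # b_0[n] equals the output sample
        delayed = new_b[:order]
        output.append(f)
    return output


# Hypothetical 2nd-order example: a unit impulse as excitation.
reflection = [0.8125, -0.6]
impulse = [1.0] + [0.0] * 7
response = lattice_synthesize(impulse, reflection)
print("equivalent predictor coefficients:", step_up(reflection))
print("impulse response:", response)
```

The lattice output matches a direct-form recursion x[n] = e[n] + sum_k a[k] * x[n - k] with the coefficients returned by `step_up`. In a complete synthesizer the excitation would come from the stored excitation source information instead of a single impulse.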


Journal ArticleDOI
01 May 1970
TL;DR: This experience in speech processing should serve as a reminder that a thorough understanding of the signal is paramount for successful analyses of real-world processes.
Abstract: The accurate estimation of both discrete and continuous parameters of speech signals has played a central role in speech processing and research. Interestingly, the most successful estimation procedures have often relied on intuition based on knowledge of speech signals and their production in the human vocal apparatus rather than routine applications of well-established theoretical methods. This experience in speech processing should serve as a reminder that a thorough understanding of the signal is paramount for successful analyses of real-world processes.

16 citations


Journal ArticleDOI
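B. Lobanov
TL;DR: The speech signal is examined as part of a specific speech communication system, which yields two principles for its analysis: process the signal simultaneously and separately for each type of modulation, then extract the different forms of information from each detection result and combine them.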
Abstract: The speech signal is examined as part of a specific speech communication system. The results of such examination yield the following principles of this analysis. 1) The speech signal must be simultaneously and separately processed for each type of modulation. 2) The different forms of information must be extracted from the result of each detection process, and then combined.

2 citations


Journal ArticleDOI
L.C. Kelly
TL;DR: The paper describes speech production principles, and their application to speech synthesis; the operation of various types of vocoder and the problems of pitch extraction.
Abstract: Speech signals are produced by relatively slow articulatory movements. This suggests that the information rate of the speech signal is much less than would be expected by considering the bandwidth of the acoustic signal. Vocoders attempt to exploit the redundancy in the speech waveform by extracting and transmitting the information bearing parameters of the speech signal. At the receiver, these parameters are used to control a speech synthesizer that reproduces the original signal without any serious loss of intelligibility but with some degradation of quality. The paper describes speech production principles, and their application to speech synthesis; the operation of various types of vocoder and the problems of pitch extraction.

1 citation