
Showing papers on "Linear predictive coding" published in 1984


Proceedings ArticleDOI
F.K. Soong, Biing-Hwang Juang
19 Mar 1984
TL;DR: An expression for spectral sensitivity with respect to single LSP frequency deviation is derived such that some insight on their quantization effects can be obtained and results on multi-pulse LPC using LSP for spectral information compression are presented.
Abstract: Line Spectrum Pair (LSP) was first introduced by Itakura [1,2] as an alternative LPC spectral representation. It was found that this new representation has such interesting properties as (1) all zeros of LSP polynomials are on the unit circle, (2) the corresponding zeros of the symmetric and anti-symmetric LSP polynomials are interlaced, and (3) the reconstructed LPC all-pole filter preserves its minimum phase property if (1) and (2) are kept intact through a quantization procedure. In this paper we prove all these properties via a "phase function." The statistical characteristics of LSP frequencies are investigated by analyzing a speech data base. In addition, we derive an expression for spectral sensitivity with respect to single LSP frequency deviation such that some insight on their quantization effects can be obtained. Results on multi-pulse LPC using LSP for spectral information compression are finally presented.

506 citations
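
As a rough illustration of properties (1) and (2) in the abstract above, the sketch below builds the symmetric and anti-symmetric LSP polynomials from a small, made-up second-order predictor and locates their unit-circle zeros numerically. The function names, the example coefficients, and the grid-plus-bisection root search are assumptions chosen for illustration; they are not the "phase function" proof used in the paper.

```python
import cmath
import math


def lsp_polynomials(a):
    """Return coefficients (in powers of z^-1) of P(z) and Q(z).

    a = [1, a1, ..., ap] represents A(z) = 1 + a1 z^-1 + ... + ap z^-p.
    P(z) = A(z) + z^-(p+1) A(1/z) is symmetric,
    Q(z) = A(z) - z^-(p+1) A(1/z) is anti-symmetric.
    """
    ext = list(a) + [0.0]
    rev = ext[::-1]
    p_poly = [x + y for x, y in zip(ext, rev)]
    q_poly = [x - y for x, y in zip(ext, rev)]
    return p_poly, q_poly


def unit_circle_zeros(coeffs, symmetric, n_grid=2000):
    """Find zeros at angles strictly inside (0, pi) on the unit circle."""
    m = len(coeffs) - 1

    def f(w):
        # Removing the linear-phase factor leaves a purely real (symmetric)
        # or purely imaginary (anti-symmetric) function of w.
        v = sum(c * cmath.exp(-1j * w * k) for k, c in enumerate(coeffs))
        v *= cmath.exp(1j * w * m / 2)
        return v.real if symmetric else v.imag

    zeros = []
    grid = [math.pi * i / n_grid for i in range(1, n_grid)]
    for lo, hi in zip(grid, grid[1:]):
        if f(lo) * f(hi) < 0:  # a sign change brackets a zero
            for _ in range(60):  # refine by bisection
                mid = 0.5 * (lo + hi)
                if f(lo) * f(mid) <= 0:
                    hi = mid
                else:
                    lo = mid
            zeros.append(0.5 * (lo + hi))
    return zeros


# Hypothetical minimum-phase predictor A(z) = 1 - 0.9 z^-1 + 0.5 z^-2.
a = [1.0, -0.9, 0.5]
p_poly, q_poly = lsp_polynomials(a)
lsf_p = unit_circle_zeros(p_poly, symmetric=True)
lsf_q = unit_circle_zeros(q_poly, symmetric=False)
print("P zeros (rad):", lsf_p)
print("Q zeros (rad):", lsf_q)
```

For this even-order example, P(z) also has a trivial zero at z = -1 and Q(z) one at z = 1. The search above reports only the zeros strictly between 0 and pi, which are the LSP frequencies, and for this predictor the single P frequency falls below the single Q frequency, consistent with the interlacing property.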


Journal ArticleDOI
TL;DR: Several algorithms are presented for the design of shape-gain vector quantizers based on a training sequence of data or a probabilistic model, and their performance is compared to that of previously reported vector quantization systems.
Abstract: Memory and computation requirements imply fundamental limitations on the quality that can be achieved in vector quantization systems used for speech waveform coding and linear predictive voice coding (LPC). One approach to reducing storage and computation requirements is to organize the set of reproduction vectors as the Cartesian product of a vector codebook describing the shape of each reproduction vector and a scalar codebook describing the gain or energy. Such shape-gain vector quantizers can be applied both to waveform coding using a quadratic-error distortion measure and to voice coding using an Itakura-Saito distortion measure. In each case, the minimum distortion reproduction vector can be found by first selecting a shape codeword, and then, based on that choice, selecting a gain codeword. Several algorithms are presented for the design of shape-gain vector quantizers based on a training sequence of data or a probabilistic model. The algorithms are used to design shape-gain vector quantizers for both the waveform coding and voice coding application. The quantizers are simulated, and their performance is compared to that of previously reported vector quantization systems.

305 citations
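
The two-step search described in the abstract above (shape codeword first, then gain codeword) can be sketched for the quadratic-error case as follows. This is a minimal illustration under stated assumptions: the shape codewords have unit norm, the gains are positive, and the tiny codebooks and function names are invented for the example. The Itakura-Saito case and the codebook design algorithms are not covered.

```python
def encode_shape_gain(x, shapes, gains):
    """Pick the shape first, then the gain given that shape.

    With unit-norm shapes s and gain g, the squared error is
    |x|^2 - 2*g*(x.s) + g^2, so for any positive g the best shape is the
    one with the largest inner product x.s, and the best gain is then the
    codebook value closest to that inner product.
    """
    dots = [sum(xi * si for xi, si in zip(x, s)) for s in shapes]
    shape_idx = max(range(len(shapes)), key=lambda i: dots[i])
    target = dots[shape_idx]
    gain_idx = min(range(len(gains)), key=lambda j: (gains[j] - target) ** 2)
    return shape_idx, gain_idx


def decode_shape_gain(shape_idx, gain_idx, shapes, gains):
    """Reproduction vector = gain codeword times shape codeword."""
    return [gains[gain_idx] * v for v in shapes[shape_idx]]


def squared_error(x, y):
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


# Hypothetical codebooks: three unit-norm shapes and four positive gains.
shapes = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
gains = [0.5, 1.0, 2.0, 4.0]
x = [1.0, 1.5]

indices = encode_shape_gain(x, shapes, gains)
reproduction = decode_shape_gain(*indices, shapes, gains)
print(indices, reproduction)
```

The product structure means only len(shapes) + len(gains) codewords are stored, while the set of possible reproduction vectors has len(shapes) * len(gains) members.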


Proceedings ArticleDOI
Sharad Singhal, B. S. Atal
19 Mar 1984
TL;DR: This paper focuses on problems encountered in attempting to maintain speech quality while synthesizing speech using multi-pulse excitation at lower bit rates.
Abstract: The multi-pulse excitation model provides a method for producing natural-sounding speech at medium to low bit rates. Multi-pulse analysis obtains the all-pole filter excitation by minimizing a spectrally-weighted mean-squared error between the original and synthetic speech signals. Although the method provides high quality speech around 10 kbits/sec, speech quality suffers if the bit rate is lowered. In this paper, we focus on problems encountered in attempting to maintain speech quality while synthesizing speech using multi-pulse excitation at lower bit rates.

163 citations


Patent
02 Apr 1984
TL;DR: In this paper, a technique for transmitting an entire analog speech signal (S(t)) and a modulated data signal (D(t)) over a transmission channel (20) such as a common analog telephone speech channel was proposed.
Abstract: A technique for transmitting an entire analog speech signal (S(t)) and a modulated data signal (D(t)) over a transmission channel (20) such as a common analog telephone speech channel. The present technique multiplexes the entire modulated data signal within the normal analog speech signal frequency band where the speech is present and its signal power density characteristic is at a low level. Separation of the speech and data signals at the receiver (30) is effected by recovering the modulation carrier frequency (fc) and demodulating (33) the receiver signal (X(t)) to recover the data signal. The data signal is then remodulated (34) with the recovered carrier and is convolved with an arbitrary channel impulse response in an adaptive filter (35) whose output signal is subtracted (37) from the received composite data and speech signal (X(t)) to generate the recovered speech signal (S(t)). To improve the recovered speech signal, a least mean square algorithm (36) is used to update the arbitrary channel impulse response output signal of the adaptive filter (35).

151 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Harmonic Coding is synthesized in the time domain, as a superimposition of "harmonics" whose instantaneous frequency varies continuously along an interpolation curve, within each frame, so that fast pitch variations can be tracked with no difficulty.
Abstract: The Harmonic Coding concept has already shown its potential for efficiently coding speech. Previous implementations have used a frame rate of one every 16 ms. This was mainly due to the fact that, with longer frames, even a nonstationary spectral model (of low order) cannot reproduce the zones of fast-varying pitch with the desirable quality. However, the high framing rate is a limitation, since it implies that fewer bits will be available for encoding each frame. A solution for this problem has been devised: the signal is synthesized in the time domain, as a superimposition of "harmonics" whose instantaneous frequency varies continuously along an interpolation curve, within each frame. In this way, fast pitch variations can be tracked with no difficulty. Experimental results are presented, confirming these facts. The integration of this synthesis scheme in a speech coder is discussed.

98 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: It is found that the additional glottal parameters can be coded effectively such that the total bit rate is in the same range as for conventional LPC.
Abstract: A procedure is suggested for improving LPC speech quality. The central theme is to introduce a parametric model of voiced excitation - a glottal source model. In the analysis this allows for a different method than the AR-estimation used in conventional LPC. Here, a method known as AR-X-estimation is used. A complete analysis and coding method is presented. It is found that the additional glottal parameters can be coded effectively such that the total bit rate is in the same range as for conventional LPC. The glottal LPC-vocoder does significantly improve synthesis quality as compared to standard LPC. It should be emphasized, however, that the glottal vocoder requires high quality speech as input, recorded in a phase linear system. Moreover, the computational complexity is high.

56 citations



Proceedings ArticleDOI
01 Mar 1984
TL;DR: A computationally efficient formulation is derived for both covariance and correlation type analyses for multipulse coding of speech, ranging from a purely sequential one to one which reoptimizes pulse amplitudes at each step.
Abstract: This paper discusses the analysis techniques used to derive the excitation waveform for multipulse coding of speech. A computationally efficient formulation is derived for both covariance and correlation type analyses. These methods differ in the way block edges are treated. Several methods for pulse amplitude and position determination are given, ranging from a purely sequential one to one which reoptimizes pulse amplitudes at each step. It is shown that the reoptimization scheme has a nested structure that allows a reduction in the computations. An efficient method for pulse position coding is given. This method can essentially achieve the entropy limit for randomly placed pulses. Experimental results are given for typical configurations including computational requirements and speech quality assessments.

48 citations
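
The "purely sequential" end of the range of pulse-determination methods mentioned in the abstract above can be sketched as a greedy search: place one pulse at a time where it removes the most error energy, subtract its contribution, and repeat. The sketch below is an assumption-laden illustration, not the paper's formulation. It omits perceptual weighting and amplitude reoptimization, truncates the synthesis-filter impulse response at the block edge, and uses a made-up impulse response and target.

```python
def multipulse_sequential(target, h, n_pulses):
    """Greedy pulse search: returns [(position, amplitude), ...] and the residual.

    target : samples to be matched within one block
    h      : impulse response of the synthesis filter (hypothetical here)
    """
    n = len(target)
    residual = list(target)
    pulses = []
    for _ in range(n_pulses):
        best = None
        for pos in range(n):
            seg = h[: n - pos]  # response truncated at the block edge
            energy = sum(v * v for v in seg)
            if energy == 0.0:
                continue
            corr = sum(residual[pos + k] * v for k, v in enumerate(seg))
            reduction = corr * corr / energy  # error energy removed by this pulse
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)
        _, pos, amp = best
        pulses.append((pos, amp))
        for k, v in enumerate(h[: n - pos]):
            residual[pos + k] -= amp * v
    return pulses, residual


# Hypothetical example: the target is exactly two filtered pulses.
h = [1.0, 0.5, 0.25]
target = [0.0, 0.0, 2.0, 1.0, 0.5, 0.0, -1.0, -0.5, -0.25, 0.0]
pulses, residual = multipulse_sequential(target, h, n_pulses=2)
print(pulses)
```

In this contrived case the greedy search recovers both pulses exactly. With overlapping pulse responses a sequential search is generally suboptimal, which is the motivation for the amplitude reoptimization the abstract describes.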


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Algorithms based on spectral subtraction are developed for improving the intelligibility of speech that has been interfered by a second talker's voice, and significant gain in intelligibility for low signal-to-noise ratio conditions is achieved.
Abstract: Algorithms based on spectral subtraction are developed for improving the intelligibility of speech that has been interfered by a second talker's voice. A number of new properties of spectral subtraction are shown, including the effects of phase on the output speech intelligibility, and the choice of magnitude spectral differences for best results. A harmonic extraction algorithm is also developed. Results of formal testing on the final system show that significant gain in intelligibility for low signal-to-noise ratio conditions is achieved.

45 citations


PatentDOI
Peter F. Brown
TL;DR: In this article, a speech recognition method and apparatus employ a speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters.
Abstract: A speech recognition method and apparatus employ a speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuitries are connected to a system bus, along with the speech processing circuitry, for determining, or identifying, the speech units in the input speech, by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus thereby increasing the speech recognition capacity of the apparatus. The speech processing circuitry establishes overlapping time durations for generating the acoustic parameters and further employs a sinc-Kaiser smoothing function in combination with a folding technique for providing a discrete Fourier transform. The Fourier spectra are transformed using a biased principal component analysis which optimizes the across class variance. The template matching and cost processing circuitries provide distributed processing, on demand, of the acoustic parameters for generating through a dynamic programming technique the recognition decision.

42 citations


Patent
Ira A. Gerson
28 Dec 1984
TL;DR: In this paper, an improved method and means of determining reflection coefficients that characterize an electrical signal that obtains characteristics of an all-zero inverse lattice filter was proposed, where the reflection coefficients were obtained by filtering the signal, sampling the filtered signal, obtaining the elements of a correlation array from the samples, and initializing values of arrays of forward residuals, backward residuals, and cross-correlation of residuals.
Abstract: An improved method and means of determining reflection coefficients that characterize an electrical signal that obtains characteristics of an all-zero inverse lattice filter. The reflection coefficients are obtained by filtering the signal, sampling the filtered signal, obtaining the elements of a correlation array from the samples, initializing values of arrays of forward residuals, backward residuals, and cross-correlation of residuals, combining array elements to obtain a first reflection coefficient, removing from the forward, backward and cross-correlation arrays the effect of the first reflection coefficient, calculating from the revised arrays a second coefficient, and repeating the calculations to the desired order. In a second embodiment of the present invention, samples are selected from the digitized signal and multiplied by a windowing function. The windowed samples are used to derive values of an autocorrelation array, which eliminates the need for both forward and backward arrays as in the first embodiment of the invention.
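
For orientation only, the sketch below computes reflection coefficients from an autocorrelation array with the textbook Levinson-Durbin recursion. This is a swapped-in, well-known technique and not the method claimed in the patent above. The sign convention (A(z) = 1 + a1 z^-1 + ...) and the example autocorrelation values are assumptions.

```python
def levinson_durbin(r, order):
    """Textbook Levinson-Durbin recursion.

    r     : autocorrelation values r[0..order]
    Returns (a, ks, err) where a = [1, a1, ..., ap] are the inverse-filter
    coefficients, ks are the reflection coefficients, and err is the final
    prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    ks = []
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for stage i
        ks.append(k)
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k  # error energy shrinks at each stage
    return a, ks, err


# Hypothetical autocorrelation of a first-order process, r[k] = 0.5 ** k.
r = [1.0, 0.5, 0.25]
a, ks, err = levinson_durbin(r, order=2)
print(a, ks, err)
```

Because the example autocorrelation comes from a first-order process, the second reflection coefficient comes out as zero and the error energy stops decreasing after the first stage.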

Proceedings ArticleDOI
19 Mar 1984
TL;DR: It is shown, on analyses of both synthetic and natural speech, that the averaged parabolic approximation between harmonic peaks of voiced speech spectrum reduces the sensitivity of the LP analysis to changes in the fundamental frequency Fo and to noise.
Abstract: In spite of its extensive use, speech analysis based on linear prediction (LP) is liable to various causes of inaccuracy. This paper presents a novel approach to improve the accuracy in the estimation of the voiced speech production model based on the LP method. The presented method uses interpolation between spectral points which are least influenced by artifacts in the spectral analysis and by noise in the signal. We show, on analyses of both synthetic and natural speech, that the averaged parabolic approximation between harmonic peaks of voiced speech spectrum reduces the sensitivity of the LP analysis to changes in the fundamental frequency Fo and to noise. The method is well suited for combination with the Spectral Transform LP method, previously proposed by the authors [1].

Proceedings ArticleDOI
01 Mar 1984
TL;DR: Simulation results demonstrate that vector quantization offers a distinct perceptual improvement compared with scalar quantization of the same subband signals and side information for the same total bit rate.
Abstract: Vector quantization (VQ) is examined as a technique to enhance performance in subband coding of speech at 9.6 kb/s. The set of short-term subband power levels is vector quantized, providing low-rate side information to control the coding of the subband signals. Each subband signal is then vector quantized with variable size codebooks that are dynamically assigned by the quantized side information. Two versions are described, a 7-band coder and a 14-band coder. Simulation results demonstrate that vector quantization offers a distinct perceptual improvement compared with scalar quantization of the same subband signals and side information for the same total bit rate.

Journal ArticleDOI
TL;DR: A modified LPC system (LPC plus) which requires no voiced/unvoiced switch at the synthesizer is presented in this paper and produces synthesized speech which is more natural and intelligible than that produced by conventional LPC.
Abstract: A modified LPC system (LPC plus) which requires no voiced/unvoiced switch at the synthesizer is presented in this paper. The excitation functions of the synthesizer filter are modeled to be the sum of the conventional pulse and noise sources. The mixture ratio is estimated from the LPC residual error signal, and this parameter controls the amplitudes of the pulse and noise sources. Since the V/UV switch is eliminated, this system produces robust speech in highly noisy environments, while a conventional LPC system produces degraded speech due to voicing errors. In addition, this technique has been applied to the speech of two simultaneous speakers and produces synthesized speech which is more natural and intelligible than that produced by conventional LPC.

Proceedings ArticleDOI
K. Oh, C. Un
19 Mar 1984
TL;DR: It has been found that for pitch detection of noisy speech the algorithms that use an AMDF or an autocorrelation function yield relatively better performance than the others.
Abstract: Results of a performance comparison study of eight pitch extraction algorithms for noisy as well as clean speech are presented. These algorithms are the autocorrelation method with center clipping, the autocorrelation method with modified center clipping, the simplified inverse filter tracking (SIFT) method, the average magnitude difference function (AMDF) method, the pitch detection method based on LPC inverse filtering and AMDF, the data reduction method, the parallel processing method and the cepstrum method. It has been found that for pitch detection of noisy speech the algorithms that use an AMDF or an autocorrelation function yield relatively better performance than the others. A pitch detector that uses center clipped speech as an input signal is effective in pitch extraction of noisy speech. In general, preprocessing such as LPC inverse filtering or center clipping of input speech yields remarkable improvement in pitch detection.
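
As a minimal sketch of one of the compared methods, the average magnitude difference function (AMDF) can be evaluated over a range of candidate lags, with the lag at its minimum taken as the pitch period. The frame length, lag range, sampling rate, and synthetic test signal below are assumptions for illustration. No center clipping, inverse filtering, or voicing decision is included.

```python
import math


def amdf(frame, lag, n_terms):
    """Average magnitude difference between the frame and its lagged copy."""
    return sum(abs(frame[n] - frame[n + lag]) for n in range(n_terms)) / n_terms


def amdf_pitch(frame, fs, min_lag, max_lag):
    """Return (best_lag, f0_hz): the lag that minimizes the AMDF."""
    n_terms = len(frame) - max_lag  # same number of terms for every lag
    best_lag = min(
        range(min_lag, max_lag + 1),
        key=lambda lag: amdf(frame, lag, n_terms),
    )
    return best_lag, fs / best_lag


# Hypothetical voiced-like frame: period of 40 samples at fs = 8000 Hz.
fs = 8000
frame = [
    math.sin(2 * math.pi * n / 40) + 0.5 * math.sin(4 * math.pi * n / 40)
    for n in range(240)
]
lag, f0 = amdf_pitch(frame, fs, min_lag=20, max_lag=70)
print(lag, f0)
```

The lag range is kept below twice the true period here so that the minimum at the period itself is not confused with the equally deep minimum at its multiple.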

Journal ArticleDOI
TL;DR: In this article, the authors used the maximum likelihood (ML) method to derive a spectral matching criterion for autoregressive (i.e., all-pole) random processes.
Abstract: Itakura and Saito [1] used the maximum likelihood (ML) method to derive a spectral matching criterion for autoregressive (i.e., all-pole) random processes. In this paper, their results are generalized to periodic processes having arbitrary model spectra. For the all-pole model, Kay's [2] covariance domain solution to the recursive ML (RML) problem is cast into the spectral domain and used to obtain the RML solution for periodic processes. When applied to speech, this leads to a method for solving the joint pitch and spectrum envelope estimation problems. It is shown that if the number of frequency power measurements greatly exceeds the model order, then the RML algorithm reduces to a pitch-directed, frequency domain version of linear predictive (LP) spectral analysis. Experiments on a real-time vocoder reveal that the RML synthetic speech has the quality of being heavily smoothed.

PatentDOI
TL;DR: In this article, a speech recognition method and apparatus employ a speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters.
Abstract: A speech recognition method and apparatus employ a speech processing circuitry for repetitively deriving from a speech input, at a frame repetition rate, a plurality of acoustic parameters. The acoustic parameters represent the speech input signal for a frame time. A plurality of template matching and cost processing circuitries are connected to a system bus, along with the speech processing circuitry, for determining, or identifying, the speech units in the input speech, by comparing the acoustic parameters with stored template patterns. The apparatus can be expanded by adding more template matching and cost processing circuitry to the bus thereby increasing the speech recognition capacity of the apparatus. Template pattern generation is advantageously aided by using a "joker" word to specify the time boundaries of utterances spoken in isolation, by finding the beginning and ending of an utterance surrounded by silence.

Patent
11 May 1984
TL;DR: In this article, Markov models are applied to quantized speech parameters to represent their time behavior in a probabilistic manner, which is accomplished by representing the quantised speech parameters as finite state machines having predetermined matrices of transitional probabilities from which the conditional probabilities as to the quantization of successive speech data frames are established.
Abstract: Method and system for encoding digital speech information to characterize spoken human speech with an optimally reduced speech data rate while retaining speech quality in the audible reproduction of the encoded digital speech information. Markov modeling is applied to quantized speech parameters to represent their time behavior in a probabilistic manner. This is accomplished by representing the quantized speech parameters as finite state machines having predetermined matrices of transitional probabilities from which the conditional probabilities as to the quantized speech parameter values of successive speech data frames are established. The probabilistic description as so obtained is then used to represent the respective quantized values of the speech parameters by a digital code through Huffman coding in which digital codewords of variable length represent the quantized speech parameter values in accordance with their probability of occurrence such that more probable quantized values are assigned digital codewords of a shorter bit length while less probable quantized values are assigned digital codewords of a longer bit length.
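
The idea in the abstract above, one variable-length code per previous quantized value with shorter codewords for more probable transitions, can be sketched as follows. The transition matrix, the four-level quantizer, and the assumption that the decoder already knows the first value are all invented for illustration and are not taken from the patent.

```python
import heapq


def huffman_code(probs):
    """Build a Huffman code; returns {symbol_index: bit_string}."""
    if len(probs) == 1:
        return {0: "0"}
    # Heap entries are (probability, tie_breaker, partial_code_table).
    heap = [(p, i, {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    counter = len(probs)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, counter, merged))
        counter += 1
    return heap[0][2]


# Hypothetical transition probabilities: row = previous quantized value,
# column = current quantized value. Staying in place is the most likely.
transitions = [
    [0.5, 0.25, 0.125, 0.125],
    [0.125, 0.5, 0.25, 0.125],
    [0.125, 0.125, 0.5, 0.25],
    [0.25, 0.125, 0.125, 0.5],
]
codebooks = [huffman_code(row) for row in transitions]


def encode(values, codebooks):
    """Code each value with the table selected by the previous value."""
    return "".join(codebooks[prev][cur] for prev, cur in zip(values, values[1:]))


values = [0, 0, 1, 1, 2]
bits = encode(values, codebooks)
print(bits, len(bits))
```

For this sequence the four transitions cost 6 bits in total, compared with 8 bits for a fixed 2-bit code per value.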

Proceedings ArticleDOI
01 Mar 1984
TL;DR: It is demonstrated that the inclusion of a pitch detector significantly improves the perceived quality of the synthetic speech and a modification of the original algorithm is described, resulting in a lower complexity, and a speech quality close to the results obtained with the original algorithm.
Abstract: We report on the results obtained from simulations of the Multi-Pulse Excitation Coder as proposed by Atal and Remde [4]. We investigated the effects of the different analysis parameters on the resulting synthetic speech signals, using objective and subjective tests. We compared the sub-optimal solutions proposed in [4] with another sub-optimal solution based on an orthogonalization of the solution space and found that the originally proposed solutions are reasonable choices. We demonstrate that the inclusion of a pitch detector significantly improves the perceived quality of the synthetic speech. We also describe a modification of the original algorithm, resulting in a lower complexity, and a speech quality close to the results obtained with the original algorithm.

Journal ArticleDOI
TL;DR: This paper presents a method of incorporating LPC spectral shape and energy into the code-book entries of the vector quantizer using a distortion measure for comparing two LPC vectors that uses the weighted sum of an LPC shape distortion and a log energy distortion.
Abstract: The theory of vector quantization (VQ) of linear predictive coding (LPC) coefficients has established a wide variety of techniques for quantizing LPC spectral shape to minimize overall spectral distortion. Such vector quantizers have been widely used in the areas of speech coding and speech recognition. The conventional vector quantizer utilizes only spectral shape information and essentially disregards the energy or gain term associated with the optimal LPC fit to the signal being modeled. In this paper we present a method of incorporating LPC spectral shape and energy into the code-book entries of the vector quantizer. To do this, we postulate a distortion measure for comparing two LPC vectors that uses the weighted sum of an LPC shape distortion and a log energy distortion. Based on this combined distortion measure, we have designed and studied vector quantizers of several sizes for use in isolated word speech recognition experiments. We found that a fairly significant correlation exists between LPC shape and signal energy. Hence, an LPC shape combined with energy vector quantizer with a given distortion requires far fewer code-book entries than one in which LPC shape and energy are quantized separately. Based on isolated word recognition tests on both a 10-digit and a 129-word airlines vocabulary, we found improvements in recognition accuracy by using the VQ with both LPC shape and energy over that obtained using a VQ with LPC shape alone.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: A system for speech analysis and enhancement which combines signal processing and symbolic processing in a closely coupled manner and attempts to reconstruct the original speech waveform using symbolic processing to help model the signal and to guide reconstruction.
Abstract: This paper describes a system for speech analysis and enhancement which combines signal processing and symbolic processing in a closely coupled manner. The system takes as input both a noisy speech signal and a symbolic description of the speech signal. The system attempts to reconstruct the original speech waveform using symbolic processing to help model the signal and to guide reconstruction. The system uses various signal processing algorithms for parameter estimation and reconstruction.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: This study was prompted by direct observation of the structure of the excitation, where patterns of pulses may be found which are associated with phase-correcting mechanisms of the LPC impulse response, and developed a new multipulse technique based on a special ARMA model formed by cascading an all-pole with an all-pass network.
Abstract: Multipulse LPC, as the model proposed by Atal and Remde is often designated, has been directed towards 9.6 kb/s speech coding. However, at such a bit rate, the speech quality is not yet generally acceptable. This paper has a double purpose - one is to investigate the role of short-time phase in multipulse LPC. The other is to look for different modelling structures to be used with this method. This study was prompted by direct observation of the structure of the excitation, where patterns of pulses may be found which are associated with phase-correcting mechanisms of the LPC impulse response. Consequently, a new multipulse technique was developed, based on a special ARMA model formed by cascading an all-pole with an all-pass network. This new model will be referred to as the MAPAP (Multipulse All-Pole All-Pass) method. Another one was tried in which the synthetic speech is formed by combination of several MAPAP signals. We therefore denoted it the "multichannel multipulse" method. The potential advantages of both single and multichannel models seem rather promising.

Proceedings ArticleDOI
D. Fischell, C. Coker
19 Mar 1984
TL;DR: The speech direction finder described here is a relatively simple device based on an off-the-shelf microcomputer which can provide the direction to a talker to within 3 degrees of azimuth angle on a single spoken syllable.
Abstract: The speech direction finder described here is a relatively simple device based on an off-the-shelf microcomputer. It can provide the direction to a talker to within 3 degrees of azimuth angle on a single spoken syllable, will only respond to speech, and when used with Wallace linear array microphones can provide this at distances of 50 feet or more. There are numerous applications for the device which may enhance the quality of audio and video teleconferences.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: This paper presents a method of incorporating LPC spectral shape and energy into the codebook entries of the vector quantizer, and finds improvements in recognition accuracy by using the VQ with both LPC shape and energy over that obtained using a VQ with LPC shape alone.
Abstract: The theory of vector quantization (VQ) of linear predictive coding (LPC) coefficients has established a wide variety of techniques for quantizing LPC spectral shape to minimize overall spectral distortion. Such vector quantizers have been widely used in the areas of speech coding and speech recognition. The conventional vector quantizer utilizes only spectral shape information and essentially disregards the energy or gain term associated with the optimal LPC fit to the signal being modelled. In this paper we present a method of incorporating LPC spectral shape and energy into the codebook entries of the vector quantizer. To do this we postulate a distortion measure for comparing two LPC vectors which uses a weighted sum of an LPC shape distortion and a log energy distortion. Based on this combined distortion measure we have designed and studied vector quantizers of several sizes for use in isolated word speech recognition experiments. We have found that a fairly significant correlation exists between LPC shape and signal energy; hence a combined LPC shape plus energy vector quantizer with a given distortion requires far fewer codebook entries than one in which LPC shape and energy are quantized separately. Based on isolated word recognition tests on both a 10-digit and a 129 word airlines vocabulary, we have found improvements in recognition accuracy by using the VQ with both LPC shape and energy over that obtained using a VQ with LPC shape alone.

Proceedings ArticleDOI
19 Mar 1984
TL;DR: For the applications of speech synthesis from speech model parameters, time-scale modification of clean speech, speech enhancement by spectral subtraction, and helium speech enhancement, significant improvement is not gained by using the LSEE-MSTFTM algorithm.
Abstract: In this paper, speech synthesis directly from the processed Short-Time Fourier Transform Magnitude (STFTM) using the LSEE-MSTFTM algorithm [6,7] is compared to more conventional algorithms for several speech processing applications. For the applications considered, the most improvement occurs for time-scale modification of multiple speaker speech and noisy speech since these input signals are not well modeled by the analysis/synthesis system used for comparison. However, for the applications of speech synthesis from speech model parameters, time-scale modification of clean speech, speech enhancement by spectral subtraction, and helium speech enhancement, significant improvement is not gained by using the LSEE-MSTFTM algorithm. Significantly better results are not obtained since a good STFT phase estimate is available and employed in the conventional approaches to these applications.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: A further application for time-alignment algorithms is described, in which replacement dialogue for a film soundtrack may be automatically synchronized to reference dialogue recorded during filming, in a digital signal processing system that uses a DP algorithm.
Abstract: A number of applications exist in basic speech research for Dynamic Programming (DP) algorithms that can produce accurate time registration data for aligning one speech signal with a similar speech signal. In this paper, a further application for time-alignment algorithms is described, in which replacement dialogue for a film soundtrack may be automatically synchronized to reference dialogue recorded during filming. This is being carried out in a digital signal processing system that uses a DP algorithm capable of aligning utterances of indeterminate length accurately and efficiently in real-time. The main features of this system and the DP algorithm will be described.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: An end-point detector for LPC speech using squared prediction error look-ahead and automatic/manual threshold determination is described, which is relatively immune to transient pulses and various low-level noises, yet preserves low- level speech sounds such as weak fricatives to a significant extent under moderate noise conditions.
Abstract: An end-point detector for LPC speech using squared prediction error look-ahead and automatic/manual threshold determination is described. The detector is algorithmically simple, computationally efficient, and uses only one decision parameter. Preliminary tests indicate that it is relatively immune to transient pulses and various low-level noises, yet preserves low-level speech sounds such as weak fricatives to a significant extent under moderate noise conditions. Tests indicate that 93.8% of automatically determined endpoints agree to within two frames of manually determined endpoints. The detector is especially suitable for use in vector-quantization based LPC systems, where the squared prediction error is easily available.

Proceedings ArticleDOI
01 Mar 1984
TL;DR: A simple method is presented for extracting the amplitudes and locations of a multiple impulse excitation model which allows a more accurate recomputation of the autoregressive coefficients based upon incorporating the multipulse excitation.
Abstract: One of the sources of degradation in LPC-synthesized speech is the mechanical quality due to a single impulse excitation per pitch period. This paper presents a simple method for extracting the amplitudes and locations of a multiple impulse excitation model. These multipulse parameters are obtained very easily from the autoregressive (LPC) residual. Additionally, a method is developed which allows a more accurate recomputation of the autoregressive coefficients based upon incorporating the multipulse excitation.

Book ChapterDOI
01 Jan 1984
TL;DR: If one approximates the vocal tract as a series of fixed length tubes (which is equivalent to representing it as an all-pole digital filter) it becomes possible to predict successive samples of the speech wave as linear combinations of previous samples.
Abstract: If one approximates the vocal tract as a series of fixed length tubes (which is equivalent to representing it as an all-pole digital filter) it becomes possible to predict successive samples of the speech wave as linear combinations of previous samples. The coefficients in the linear combination characterize the shape of the vocal tract. A sequence of sets of coefficients can be used to characterize the changing shape of the vocal tract over time. This representation is widely used because of the particularly efficient algorithms associated with it.
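
The statement above, that each sample can be predicted as a linear combination of previous samples, can be illustrated with a minimal sketch. The second-order coefficients and the synthetic signal are made up for the example; with real speech the coefficients would have to be estimated frame by frame and the residual would not vanish.

```python
def predict(samples, coeffs):
    """Predict s[n] as sum_k coeffs[k-1] * s[n-k] for n >= len(coeffs)."""
    p = len(coeffs)
    predictions = []
    for n in range(p, len(samples)):
        predictions.append(
            sum(c * samples[n - k] for k, c in enumerate(coeffs, start=1))
        )
    return predictions


# Hypothetical all-pole signal: s[n] = 1.5 s[n-1] - 0.75 s[n-2].
coeffs = [1.5, -0.75]
samples = [1.0, 1.5]
for _ in range(8):
    samples.append(coeffs[0] * samples[-1] + coeffs[1] * samples[-2])

predictions = predict(samples, coeffs)
residual = [s - q for s, q in zip(samples[len(coeffs):], predictions)]
print(residual)
```

Since the signal here is generated by exactly the same recursion that the predictor uses, the prediction error is zero from the third sample on. Transmitting the coefficients plus a small residual in place of the samples themselves is the basic economy behind linear predictive coding.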

Patent
05 Mar 1984
TL;DR: In this article, a digital arrangement utilizing linear predictive coding for equalizing over a desired frequency spectrum the variable attenuation of a voice-frequency message signal transmitted on a communication line is presented.
Abstract: Disclosed is a digital arrangement utilizing linear predictive coding for equalizing over a desired frequency spectrum the variable attenuation of a voice-frequency message signal transmitted on a communication line. The arrangement comprises a digital signal processor, program memories for storing program instruction sets, and a data memory for storing samples of a spectrally white test signal that has been variably attenuated by the line. Under control of one instruction set that incorporates linear predictive coding, the processor uses the stored test samples to calculate the reflection coefficients of the line that characterize the variable attenuation of a signal on the line. Under the control of the other instruction set, the processor functions as a digital inverse filter employing the calculated reflection coefficients for equalizing over the desired frequency spectrum the variable attenuation of a voice-frequency message signal transmitted on the line.