
Showing papers on "Code-excited linear prediction" published in 1985


Proceedings Article
26 Apr 1985
TL;DR: Describes a code-excited linear predictive coder in which the optimum innovation sequence is selected from a code book of stored sequences to optimize a given fidelity criterion; the results indicate that a random code book has a slight speech quality advantage at low bit rates.
Abstract: We describe in this paper a code-excited linear predictive coder in which the optimum innovation sequence is selected from a code book of stored sequences to optimize a given fidelity criterion. Each sample of the innovation sequence is filtered sequentially through two time-varying linear recursive filters, one with a long-delay (related to pitch period) predictor in the feedback loop and the other with a short-delay predictor (related to spectral envelope) in the feedback loop. We code speech, sampled at 8 kHz, in blocks of 5-msec duration. Each block consisting of 40 samples is produced from one of 1024 possible innovation sequences. The bit rate for the innovation sequence is thus 1/4 bit per sample. We compare in this paper several different random and deterministic code books for their effectiveness in providing the optimum innovation sequence in each block. Our results indicate that a random code book has a slight speech quality advantage at low bit rates. Examples of speech produced by the above method will be played at the conference.

1,343 citations
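
The bit-rate arithmetic and the exhaustive code book search described in the abstract above can be illustrated with a minimal sketch. Several things here are assumptions rather than the paper's method: a single fixed first-order recursive filter stands in for the two time-varying predictors, the filter starts each block from a zero state, the fidelity criterion is plain squared error with an optimal per-vector gain, and the function names (`bits_per_sample`, `synthesize`, `search_codebook`) are hypothetical.

```python
import math
import random

def bits_per_sample(codebook_size, block_len):
    """Bits spent on the innovation index, per speech sample."""
    return math.log2(codebook_size) / block_len

def synthesize(excitation, a=0.9):
    """Toy recursive filter y[n] = x[n] + a*y[n-1] (assumed stand-in for
    the long-delay and short-delay predictors in the abstract)."""
    out, prev = [], 0.0
    for x in excitation:
        prev = x + a * prev
        out.append(prev)
    return out

def search_codebook(target, codebook, a=0.9):
    """Exhaustive search: return (index, gain, error) of the stored
    sequence whose filtered, gain-scaled output is closest to target."""
    best = (None, 0.0, float("inf"))
    for i, code in enumerate(codebook):
        y = synthesize(code, a)
        energy = sum(v * v for v in y)
        if energy == 0.0:
            continue
        # Least-squares gain for this candidate.
        gain = sum(t * v for t, v in zip(target, y)) / energy
        err = sum((t - gain * v) ** 2 for t, v in zip(target, y))
        if err < best[2]:
            best = (i, gain, err)
    return best

# 1024 random (Gaussian) sequences of 40 samples: 10 bits per 5-msec block.
rng = random.Random(0)
codebook = [[rng.gauss(0.0, 1.0) for _ in range(40)] for _ in range(1024)]

# Synthetic target built from entry 123, so the search should recover it.
target = [2.5 * v for v in synthesize(codebook[123])]
index, gain, err = search_codebook(target, codebook)
print(bits_per_sample(1024, 40), index, round(gain, 6))
```

With 1024 entries and 40-sample blocks, `bits_per_sample` gives the 1/4 bit per sample quoted in the abstract. The search cost grows linearly with the code book size, since every stored sequence is filtered and compared once per block.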


Proceedings Article
01 Apr 1985
TL;DR: This paper describes an effective and efficient time domain speech encoding technique that has an appealingly low complexity, and produces (near) toll quality speech at rates below 16 kbit/s.
Abstract: This paper describes an effective and efficient time domain speech encoding technique that has an appealingly low complexity, and produces (near) toll quality speech at rates below 16 kbit/s. The proposed coder uses linear predictive techniques to remove the short-time correlation in the speech signal. The remaining (residual) information is then modeled by a regular (in time) excitation signal that, when inputted to the time-varying model filter, produces a signal that is "close" to the reference speech signal. The procedure for finding the appropriate excitation model parameters incorporates the solution of a few sets of linear equations and is of moderate complexity compared to competing coding systems such as Adaptive Transform Coding and Multi-Pulse Excitation Coding.

32 citations



Proceedings Article
01 Jan 1985
TL;DR: Experimental results show that a significant gain in segmental SNR can be obtained over nonadaptive VQ with a negligible increase in complexity.
Abstract: A class of adaptive vector quantizers (VQs) that can dynamically adjust the 'gain' of codevectors according to the input signal level is introduced. The encoder uses a gain estimator to determine a suitable normalization of each input vector prior to VQ coding. The normalized vectors have reduced dynamic range and can then be more efficiently coded. At the receiver, the VQ decoder output is multiplied by the estimated gain. Both forward and backward adaptation are considered and several different gain estimators are compared and evaluated. An approach to optimizing the design of gain estimators is introduced. Some of the more obvious techniques for achieving gain adaptation are substantially less effective than the use of optimized gain estimators. A novel design technique that is needed to generate the appropriate gain-normalized codebook for the vector quantizer is introduced. Experimental results show that a significant gain in segmental SNR can be obtained over nonadaptive VQ with a negligible increase in complexity.

16 citations
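
The normalize, quantize, and rescale steps in the abstract above can be sketched for the forward-adaptive case. This is an illustration under stated assumptions, not the paper's design: the gain estimator is a plain per-vector RMS (one of the simple estimators, not the optimized ones the abstract favors), the gain is passed to the decoder unquantized, and the four-entry codebook and all function names are hypothetical.

```python
import math

def rms(vec):
    """Root-mean-square level of a vector, used as a simple gain estimate."""
    return math.sqrt(sum(v * v for v in vec) / len(vec))

def nearest(vec, codebook):
    """Index of the codevector with minimum squared Euclidean distance."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(vec, codebook[i])),
    )

def encode(vec, codebook, floor=1e-9):
    """Normalize by the estimated gain, then quantize the normalized shape."""
    gain = max(rms(vec), floor)  # floor avoids division by zero on silence
    normalized = [v / gain for v in vec]
    return nearest(normalized, codebook), gain

def decode(index, gain, codebook):
    """Multiply the VQ decoder output by the estimated gain."""
    return [gain * c for c in codebook[index]]

# Hypothetical gain-normalized codebook: every entry has unit RMS.
codebook = [[1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]]

quiet = [0.02, -0.02]
loud = [200.0, -200.0]
quiet_index, quiet_gain = encode(quiet, codebook)
loud_index, loud_gain = encode(loud, codebook)
print(quiet_index, loud_index, decode(loud_index, loud_gain, codebook))
```

Both inputs map to the same codevector even though their levels differ by four orders of magnitude, which is the reduced dynamic range the abstract refers to. A backward-adaptive variant would instead estimate the gain from previously decoded vectors, so no gain side information would need to be sent.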


Proceedings Article
01 Apr 1985
TL;DR: A new high-quality speech information compression method which introduces techniques of eliminating unnecessary samples of prediction residual wave pulses to obtain a thinned-out residual and produces slightly higher quality speech than does the MPE method.
Abstract: A new high-quality speech information compression method is developed. This method introduces techniques of eliminating unnecessary samples of prediction residual wave pulses to obtain a thinned-out residual. First, a thinning-out procedure which minimizes the quality degradation is formulated. Next, a procedure which simplifies this thinning-out procedure under several hypotheses is defined. Subjective evaluation of this procedure using preference tests confirms that almost no quality degradation occurs. Pitch information is utilized. Adding the process of repetitive use of the thinned-out residual to the procedure, preference tests are carried out at a bit-rate of 9.6 kb/s for purposes of comparison with the newest MPE which includes the pitch prediction process. The results are that our proposed method produces slightly higher quality speech than does the MPE method. The number of processing steps is less than one-third that of MPE.

13 citations


Proceedings Article
01 Apr 1985
TL;DR: Several approaches for pulse amplitude and position determination are described, based on the insertion of a long term pitch predictor in the multi-pulse analysis, and the idea of a two stage modelization is introduced.
Abstract: Since the presentation of multi-pulse excitation concept for LPC coders, by Atal and Remde, many different analysis techniques have been proposed to derive the excitation waveform. This paper describes several approaches for pulse amplitude and position determination. The original solution is compared to procedures which work directly on the residual signal, or which compute again a jointly optimal set of amplitudes, or which improve the filter parameters by taking into account the computed multi-pulse excitation. In addition, other novel techniques, based on the insertion of a long term pitch predictor in the multi-pulse analysis are presented. Also, the idea of a two stage modelization is introduced. Results of experimental evaluations for typical configurations, with respect to implementation complexity as well as speech quality, are given.

11 citations


Proceedings Article
26 Apr 1985
TL;DR: This paper shows that a multipulse LPC synthesizer can also be used in a text-to-speech system based on diphone concatenation, and produces French synthetic speech of fairly good naturalness.
Abstract: Multipulse Linear Predictive Coding [1] has been shown to produce natural sounding speech at relatively low bit rates. So far, this technique has mostly been used for speech transmission or storage. In this paper, we show that a multipulse LPC synthesizer can also be used in a text-to-speech system based on diphone concatenation. The main problem is how to manipulate the prosodic parameters required for speech synthesis, and it is addressed here by a two-step procedure. First, a speech signal with relatively flat pitch contour is obtained by multipulse synthesis of concatenated diphones. Then the prosodic parameters of this signal are corrected using a special purpose phase vocoder. This method produces French synthetic speech of fairly good naturalness.

8 citations


Proceedings Article
01 Apr 1985
TL;DR: This paper shows a technique of encoding the LPC residual which allows the achievement of speech coding with residual excitation at a bit rate as low as 2400 bps, inspired by the multipulse coding approach introduced by Atal.
Abstract: This paper shows a technique of encoding the LPC residual which allows the achievement of speech coding with residual excitation at a bit rate as low as 2400 bps. The method is inspired by the multipulse coding approach introduced by Atal, associated with an irregular downsampling. The real time implementation of a 4800 bps vocoder on a single TMS 320 DSP is discussed.

5 citations


Proceedings Article
01 Oct 1985
TL;DR: This paper describes two systems that use VQ and transmit intelligible speech in the range of 300 to 600 b/s and presents the quantization algorithms and bit allocation for the two vocoders and compares their performance for varying bit rates and different noisy speech conditions.
Abstract: Vector quantization (VQ) has been used recently for developing vocoders operating below 800 b/s. We describe in this paper two systems that use VQ and transmit intelligible speech in the range of 300 to 600 b/s. The frame vocoder which uses VQ for quantizing the spectral parameters of a single frame of speech was found to be most effective at the higher rate of 600 b/s. The segment vocoder which uses VQ for quantizing the spectral parameters of a sequence of frames yielded better intelligibility at the lower 300 b/s rate. We present the quantization algorithms and bit allocation for the two vocoders and compare their performance for varying bit rates and different noisy speech conditions.

4 citations


Proceedings Article
01 Apr 1985
TL;DR: The strategy proposed here selects the all-pole parameters to concentrate the model excitation in a finite number of locations, with the goal of producing a maximally pulse-like residual from the all-pole parameter estimation.
Abstract: Multiple pulse excited linear predictive coding (MPLPC) has recently received a great deal of attention in the literature as an attractive means of speech coding at data rates below 10 Kbits/second. The existing approaches to MPLPC analysis arrive at the parameters for an all-pole model by minimizing the mean squared modeling error before attempting to find a set of pulses to excite the model. The strategy proposed here selects the all-pole parameters to concentrate the model excitation in a finite number of locations. The goal is then to produce a maximally pulse-like residual as a result of the all-pole parameter estimation.

3 citations


Proceedings Article
01 Apr 1985
TL;DR: A method for estimating the LPC input pulses using the L1-norm that renders a preselected signal-to-noise ratio for every analysis frame and includes a general framework capable of yielding multipulse sequences with different characteristics for a given speech signal.
Abstract: Based on the multipulse model for speech coding, we propose a method for estimating the LPC input pulses using the L1-norm. Our method renders a preselected signal-to-noise ratio for every analysis frame. Additionally, it includes a general framework capable of yielding multipulse sequences with different characteristics for a given speech signal.

Proceedings Article
06 Nov 1985

Proceedings Article
01 Apr 1985
TL;DR: It is demonstrated that Markov-Huffman coding can lead to average savings of more than 20% in bit rate and a suboptimal scheme is investigated, which can facilitate the implementation of the method on currently available signal processing chips.
Abstract: A post-quantization processing method is presented which reduces the bit rate of LPC-coded speech without any effect on the speech quality. A Markov model is applied to the quantization levels of the LPC parameters and the resulting transition probabilities are used to generate Huffman coding tables. The appropriate coding table is selected depending on the quantization level of the parameter in the previous frame. It is demonstrated that Markov-Huffman coding can lead to average savings of more than 20% in bit rate. A suboptimal scheme is also investigated, which can facilitate the implementation of the method on currently available signal processing chips.
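
The table-selection idea in the abstract above (one Huffman table per previous-frame quantization level) can be sketched as follows. Everything specific here is an assumption made for illustration: a single parameter with a 4-level quantizer, a synthetic slowly varying level sequence used for both training and coding, add-one smoothing of the transition counts so that every level stays encodable, a fixed-length code for the first frame, and hypothetical function names.

```python
import heapq
from collections import Counter, defaultdict

def huffman_code(freqs):
    """Build a prefix code {symbol: bitstring} from symbol counts."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    # Heap items are (weight, tie-breaker, partial code table).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}
        merged.update({s: "1" + b for s, b in c2.items()})
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]

def train_tables(levels, alphabet):
    """One Huffman table per previous level, from transition counts."""
    counts = defaultdict(Counter)
    for prev, cur in zip(levels, levels[1:]):
        counts[prev][cur] += 1
    # Add-one smoothing (an assumption) keeps unseen transitions encodable.
    return {p: huffman_code({s: counts[p][s] + 1 for s in alphabet})
            for p in alphabet}

def encode(levels, tables, fixed_bits):
    """First frame uses a fixed-length code; later frames use the table
    selected by the previous frame's level."""
    bits = format(levels[0], "0{}b".format(fixed_bits))
    for prev, cur in zip(levels, levels[1:]):
        bits += tables[prev][cur]
    return bits

def decode(bits, tables, fixed_bits, n):
    """Invert encode() for a sequence of n levels."""
    levels = [int(bits[:fixed_bits], 2)]
    pos = fixed_bits
    while len(levels) < n:
        inverse = {code: sym for sym, code in tables[levels[-1]].items()}
        code = ""
        while code not in inverse:
            code += bits[pos]
            pos += 1
        levels.append(inverse[code])
    return levels

# Synthetic, slowly varying 2-bit quantization levels (toy data).
levels = [0, 0, 1, 1, 1, 2, 2, 3, 3, 3, 2, 2, 1, 1, 0, 0] * 8
tables = train_tables(levels, alphabet=range(4))
bits = encode(levels, tables, fixed_bits=2)
decoded = decode(bits, tables, fixed_bits=2, n=len(levels))
print(len(bits), "bits versus", 2 * len(levels), "fixed-length bits")
```

Because decoding reproduces the levels exactly, the reduction comes with no change to the decoded parameters, matching the abstract's claim of no effect on speech quality. The saving on this toy sequence says nothing about the 20% figure reported for real LPC parameters.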

Proceedings Article
01 Apr 1985
TL;DR: Three different methods of codebook generation are presented, and their performance is discussed in terms of average distortion, signal-to-quantization-noise ratio, and speed of implementation.
Abstract: This paper discusses the results of some experiments that were performed to test the feasibility of a high-speed vector quantization scheme for low bit rate speech coding. Three different methods of codebook generation are presented, and their performance is discussed in terms of average distortion, signal-to-quantization-noise ratio, and speed of implementation.