
Showing papers on "Code-excited linear prediction" published in 1992


Journal ArticleDOI
J.-H. Chen, Richard V. Cox, Y.-C. Lin, Nuggehally Sampath Jayant, M.J. Melchner
TL;DR: The official CCITT laboratory tests revealed that the speech quality of this 16 kb/s LD-CELP coder is either equivalent to or better than that of the CCITT G.721 standard 32-kb/s ADPCM coder for almost all conditions tested.
Abstract: A low-delay code-excited linear prediction (LD-CELP) speech coder which is expected to be standardized in 1992 as a CCITT G Series Recommendation for universal applications of speech coding at 16 kb/s is presented. The coder achieves a one-way coding delay of less than 2 ms by making both the LPC predictor and the excitation gain backward-adaptive and by using a small excitation vector size of five samples. The official CCITT laboratory tests revealed that the speech quality of this 16 kb/s LD-CELP coder is either equivalent to or better than that of the CCITT G.721 standard 32-kb/s ADPCM coder for almost all conditions tested. A description of the LD-CELP algorithm, its implementation on the DSP32C for CCITT testing, and performance results from these tests are presented.

206 citations
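
The figures in the LD-CELP abstract above can be related with a little arithmetic. The sketch below assumes a sampling rate of 8 kHz, which is typical for telephone-band speech but is not stated in the abstract; the bit rate and vector size are taken from the abstract. Under that assumption, one five-sample vector spans well under the quoted 2 ms delay.

```python
# Assumption (not in the abstract): telephone-band speech sampled at 8 kHz.
SAMPLE_RATE_HZ = 8000
BIT_RATE_BPS = 16000   # 16 kb/s, from the abstract
VECTOR_SIZE = 5        # samples per excitation vector, from the abstract


def vector_duration_ms(vector_size, sample_rate_hz):
    """Time spanned by one excitation vector, in milliseconds."""
    return 1000.0 * vector_size / sample_rate_hz


def bits_per_vector(bit_rate_bps, vector_size, sample_rate_hz):
    """Bit budget available for each excitation vector."""
    return bit_rate_bps * vector_size / sample_rate_hz


if __name__ == "__main__":
    print(vector_duration_ms(VECTOR_SIZE, SAMPLE_RATE_HZ))            # 0.625
    print(bits_per_vector(BIT_RATE_BPS, VECTOR_SIZE, SAMPLE_RATE_HZ))  # 10.0
```

Because the predictor and gain are backward-adaptive, no side information has to fit into this per-vector budget; only the excitation index is transmitted.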


Patent
25 Jun 1992
TL;DR: In this article, a low-bit-rate speech decoder is proposed that operates in two modes depending on the received mode bit; pitch prefiltering and global postfiltering are employed to enhance the synthesized speech.
Abstract: Code excited linear prediction (CELP) is performed using two sets of windows, one voiced and one unvoiced; each set is used both for linear prediction and pitch determination. The accompanying degradation in voice quality is comparable to the IS54 standard 8.0 Kbps voice coder employed in U.S. digital cellular systems. This is accomplished by using the same parametric model used in traditional CELP coders but determining, quantizing, encoding, and updating these parameters differently. The low bit rate speech decoder is like most CELP decoders except that it operates in two modes depending on the received mode bit. Both pitch prefiltering and global postfiltering are employed for enhancement of the synthesized speech. In addition, built-in error detection and error recovery schemes are used that help mitigate the effects of any uncorrectable transmission errors.

113 citations


Patent
19 Nov 1992
TL;DR: In this article, information is transmitted in digital form over fading channels using DPSK coded modulation incorporating multi-level coding in order to provide unequal error protection for different classes of data such as generated by CELP or other speech encoders.
Abstract: Information is transmitted in digital form over fading channels using DPSK coded modulation incorporating multi-level coding in order to provide unequal error protection for different classes of data such as generated by CELP or other speech encoders.

78 citations


Journal ArticleDOI
I.A. Gerson, M.A. Jasiuk
TL;DR: Techniques for improving the performance of CELP (code excited linear prediction)-type speech coders while maintaining reasonable computational complexity are explored and a harmonic noise weighting function is introduced.
Abstract: Techniques for improving the performance of CELP (code excited linear prediction)-type speech coders while maintaining reasonable computational complexity are explored. A harmonic noise weighting function, which enhances the perceptual quality of the processed speech, is introduced. The combination of harmonic noise weighting and subsample pitch lag resolution significantly improves the coder performance for voiced speech. Strategies for reducing the speech coder's data rate, while maintaining speech quality, are presented. These include a method for efficient encoding of the long-term predictor lags, utilization of multiple gain vector quantizers, and a multimode definition of the speech coder frame. A 5.9-kb/s VSELP speech coder that incorporates these features is described. Complexity reduction techniques which allow the coder to be implemented using a single fixed-point DSP (digital signal processor) are discussed.

65 citations


PatentDOI
Mei Yong
TL;DR: In this article, a spectral interpolation (500, 600) and efficient excitation codebook search method (700) were developed for a Code-Excited Linear Predictive (CELP) speech coder.
Abstract: A novel spectral interpolation (500, 600) and efficient excitation codebook search method (700) developed for a Code-Excited Linear Predictive (CELP) speech coder (100) is set forth. The interpolation is performed on an impulse response of the spectral synthesis filter. As the result of using this new set of interpolation parameters, the computations associated with an excitation codebook search in a CELP coder are considerably reduced. Furthermore, a coder utilizing this new interpolation approach provides noticeable improvement in speech quality coded at low bit-rates.

61 citations


Patent
28 Dec 1992
TL;DR: In this article, a code excited linear prediction (CELP) type speech signal coding system is provided in which a code vector, obtained by applying linear prediction to a vector of a residual speech signal of white noise, is stored in a code book.
Abstract: A code excited linear prediction (CELP) type speech signal coding system is provided in which a code vector, obtained by applying linear prediction to a vector of a residual speech signal of white noise, is stored in a code book. A pitch prediction vector obtained by applying linear prediction to a residual signal of a preceding frame is given a delay corresponding to a pitch frequency and added to the code vector. Use is made of an impulse vector obtained by applying linear prediction to a residual signal vector of impulses having a predetermined relationship with the vectors of the white noise code book. Variable gains are given to at least the above code vector and impulse vector, a reproduced signal is produced, and this reproduced signal is used for identification of the input speech signal. Thus, a pulse series corresponding to the sound source of voiced speech sounds is created.

57 citations


Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors discuss the application of generalized analysis-by-synthesis coding to the pitch predictor of a code excited linear predictor (CELP) coder, which makes it possible to transmit the pitch prediction parameters at a much lower rate than conventional approaches, without compromising speech quality.
Abstract: Many modifications can be applied to a speech signal without changing its perceptual quality. For a particular speech coder, the coding efficiency will differ for distinct modifications. To exploit this, the authors introduced a generalized analysis-by-synthesis procedure. In this procedure, a search is performed over a multitude of modified original signals (on a blockwise basis), and the signal which can be encoded with the least distortion is selected for transmission. At the receiver, a quantized version of this modified original signal is constructed. The authors discuss the application of generalized analysis-by-synthesis coding to the pitch predictor of a code excited linear predictor (CELP) coder. The use of this technique makes it possible to transmit the pitch predictor parameters at a much lower rate than conventional approaches, without compromising speech quality.

56 citations


Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors' improved PS-VXC coder has a subjective performance closely matching that of the 4.8 kb/s DoD CELP coder.
Abstract: Several major modifications to the phonetically segmented vector excitation coding (PS-VXC) coder previously reported by the authors (1989, 1990) have resulted in enhanced speech quality while reducing the delay, complexity, and bit rate. Speech is segmented into variable-length phonetic classes and a VXC coding module is tailored to each class. Coding techniques include adaptive linear predictive coding (LPC) analysis and interpolation, two-stage excitation coding of onsets, comb filtering, modified perceptual weighting, and pitch contour smoothing. The improved PS-VXC coder operates at a peak rate of 3.4 kb/s with an average rate of 3.0 kb/s and has a subjective performance closely matching that of the 4.8 kb/s DoD CELP coder.

40 citations


Journal ArticleDOI
P. Kroon, Kumar Swaminathan
TL;DR: The design and implementation of a real-time CELP coder for mobile communication applications are discussed; in a formal listening test, the coder compared favorably with other (standardized) coders operating at similar or higher rates.
Abstract: The design and implementation of a real-time CELP coder for mobile communication applications are discussed. To realize a single-chip implementation, several tradeoffs were made without compromising speech quality. In addition, techniques that make the coder more robust under a variety of channel conditions are discussed. The real-time coder can be operated at different bit rates (8, 6.8, 4.6 kb/s) by simply changing the frame update rates. The speech quality was evaluated through a formal listening test, and it was found that this coder compares favorably with other (standardized) coders operating at similar or higher rates.

32 citations


PatentDOI
TL;DR: In this article, a speech coder utilizing CELP techniques includes a first filter for filtering out the spectral information from the speech signal and a second filter for filtering out the pitch information; both the spectral and the pitch information are provided for transmission.
Abstract: Methods and apparatus for speech coding are disclosed for converting analog speech signals to digital speech signals for transmission. The speech coder, utilizing CELP techniques, includes a first filter for filtering out the spectral information from the speech signal. The spectral information is provided for transmission. A second filter is provided for filtering out the pitch information from the speech signal and such pitch information is also provided for transmission. A codevector generator determines, in one embodiment, the characteristics of a bi-pulse codevector representative of the speech signal. In this embodiment the impulse response of the first filter is truncated for determining the codevector characteristics. In this embodiment it is also preferred to determine the codevector characteristics by conducting a numerator only search in relation to a traditional fraction used for determining codevectors. In another embodiment, the codevector generator includes a transformer for transforming codevector possibilities from being representative of pulse-like sound to being representative of noise-like sound. It is especially preferred for the transform to be a Hadamard transform. It is also preferred to scramble the transformed codevector to modify the sequency properties. In still another embodiment the bi-pulse codevector generator and the scrambled codevector generator are combined with a single pulse codevector generator. In such an embodiment, it is preferred to include a comparator for evaluating the characteristics determined by the three codebook generators and choosing the output of the one providing the best codebook vector.

28 citations


Journal ArticleDOI
01 Jun 1992
TL;DR: As the authors address new challenges in wideband speech technology, several strides in coding research are likely to occur, including refinements of existing models for auditory noise-masking, and a unification of linear prediction and frequency-domain coding.
Abstract: The technologies of ISDN teleconferencing, CD-ROM multimedia services, and High Definition Television are creating new opportunities and challenges for the digital coding of wideband audio signals, wideband speech in particular. In the coding of wideband speech, an important point of reference is the CCITT standard for 7 kHz speech at a rate of 64 kbit/s. Results of recent research are pointing to better capabilities — higher signal bandwidth at 64 kbit/s, and 7 kHz bandwidth at lower bit-rates such as 32 and 16 kbit/s. The coding of audio with a signal bandwidth of 20 kHz is receiving significant attention due to recent activity in the ISO (International Standards Organization), with a goal of storing a CD-grade monophonic audio channel at a bit-rate not exceeding 128 kbit/s. Prospects for accomplishing this are very good. As a side result, emerging algorithms will offer very attractive options at lower rates such as 96 and 64 kbit/s. As we address new challenges in wideband speech technology, several strides in coding research are likely to occur. Among these are refinements of existing models for auditory noise-masking, and a unification of linear prediction and frequency-domain coding.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: To improve the tandeming performance, the authors first tuned the perceptual weighting filter to optimize the coder's performance after three tandems, and added an adaptive postfilter which was also tuned for three tandems.
Abstract: The CCITT is in the process of standardizing a 16 kb/s speech coder submitted by AT&T called low-delay code excited linear prediction (LD-CELP). In the first phase of CCITT testing, the coder met all performance requirements except for the tandeming condition. To improve the tandeming performance, the authors first tuned the perceptual weighting filter to optimize the coder's performance after three tandems. They then added an adaptive postfilter which was also tuned for three tandems. The excitation codebook was also reoptimized using a multiple-language IRS-weighted training database. With these changes, in the second phase of CCITT testing, the coder significantly exceeded the tandeming performance requirement and also met all other requirements. In fact, the speech quality of LD-CELP was equivalent to or better than that of 32 kb/s ADPCM for all conditions tested. It is anticipated that this version of LD-CELP will be formally ratified as a CCITT standard in 1992. The authors describe the modifications made to improve the performance and discuss the CCITT test results.

PatentDOI
TL;DR: An exemplary CELP coder is described in which gain adaptation is performed using previous gain values in conjunction with an entry in a table comprising the logarithms of the root-mean-squared values of the codebook vectors, to predict the next gain value.
Abstract: An exemplary CELP coder is described in which gain adaptation is performed using previous gain values in conjunction with an entry in a table comprising the logarithms of the root-mean-squared values of the codebook vectors, to predict the next gain value. Not only is this method less complex because the table entries are determined off-line, but in addition the use of a table at both the encoder and the decoder allows fixed-point/floating-point interoperability requirements to be met.
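
The table-based gain adaptation described above might be sketched as follows. This is a minimal illustration, not the patented method: the function names, the predictor coefficients, and the choice of base-10 logarithms are all assumptions made for the example. The point it shows is that the log RMS of a gain-scaled codebook vector is the log gain plus a table entry, so no RMS has to be computed at run time.

```python
import math


def log_rms_table(codebook):
    """Off-line step: log10 of the RMS value of every (non-zero) codebook vector."""
    table = []
    for vec in codebook:
        rms = math.sqrt(sum(x * x for x in vec) / len(vec))
        table.append(math.log10(rms))
    return table


def update_history(history, log_gain, index, table):
    """Push the log RMS of the latest scaled excitation vector onto the history.

    log10(rms(gain * vec)) == log10(gain) + table[index], so a single
    addition and a table lookup replace the run-time RMS computation.
    """
    return [log_gain + table[index]] + history[:-1]


def predict_log_gain(history, coeffs):
    """Linear prediction of the next log gain from previous values.

    `coeffs` are hypothetical predictor coefficients chosen for illustration.
    """
    return sum(c * h for c, h in zip(coeffs, history))


if __name__ == "__main__":
    table = log_rms_table([[1.0, 1.0, 1.0, 1.0], [10.0, 10.0, 10.0, 10.0]])
    history = update_history([0.0, 0.0], log_gain=0.2, index=1, table=table)
    print(history, predict_log_gain(history, coeffs=[0.5, 0.3]))
```

Because encoder and decoder read the same precomputed table, both sides see identical values regardless of how each would have evaluated a logarithm, which is one plausible reading of the interoperability remark in the abstract.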

Proceedings ArticleDOI
23 Mar 1992
TL;DR: An algorithm for 2.4 kb/s speech coding is described, which results in a better compromise between bit allocation for short-term quantization and residual coding and an improved high-frequency regeneration.
Abstract: An algorithm for 2.4 kb/s speech coding is described. The main problem addressed is the coding of voiced speech. A way of coding the pitch structure is introduced. Compared with traditional coding schemes, it results in a better compromise between bit allocation for short-term quantization and residual coding. The coder uses vector quantization of the short-term parameters (line spectrum frequencies). The residual is lowpass filtered to obtain the baseband signal. Unvoiced frames are coded by means of a method based on repetition and interpolation of pitch pulses. The method exploits the high correlation between pitch pulses. Harmonic postfiltering is applied to obtain an improved high-frequency regeneration.

Patent
Juin-Hwey Chen
03 Sep 1992
TL;DR: In this article, a low-bitrate (typically 8 kbit/s or less), low-delay digital coder and decoder based on Code Excited Linear Prediction for speech and similar signals features backward adaptive adjustment for codebook gain and short-term synthesis filter parameters and forward adaptive adjustment of long-term (pitch) synthesis filter parameters.
Abstract: A low-bitrate (typically 8 kbit/s or less), low-delay digital coder and decoder based on Code Excited Linear Prediction for speech and similar signals features backward adaptive adjustment for codebook gain and short-term synthesis filter parameters and forward adaptive adjustment of long-term (pitch) synthesis filter parameters. A highly efficient, low delay pitch parameter derivation and quantization permits overall delay which is a fraction of prior coding delays for equivalent speech quality at low bitrates.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: Four voice packet reconstruction methods used for speech coded by code excited linear prediction (CELP)-type speech coders are described and their performance is discussed.
Abstract: Four voice packet reconstruction methods used for speech coded by code excited linear prediction (CELP)-type speech coders are described. In the first method, the authors generalize the waveform substitution technique originally developed for the PCM coded speech to the CELP speech coding. In the second method, a priority level is assigned to each speech frame to protect against those perceptually important and hard-to-reconstruct speech frames being lost. The third and fourth methods both split the information bits in a frame into two groups of different levels of importance. In method three, the bits for representing the filter parameters are given high priority and bits for representing the excitation signals are given low priority. Method four is an embedded coding technique based on two-stage CELP. The four methods were tested in combination with a simulated voice activity and queuing model and their performance is discussed.

Proceedings ArticleDOI
Toshiki Miyano, Masahiro Serizawa, J. Takizawa, S. Ikeda, Kazunori Ozawa
23 Mar 1992
TL;DR: The authors propose 4.8-kb/s LCELP (learned code excited LPC) coding, which uses a two-stage vector quantizer with multiple candidates to improve conventional CELP speech quality and produce high quality speech.
Abstract: The authors propose 4.8-kb/s LCELP (learned code excited LPC) coding in order to improve conventional CELP speech quality with a relatively small amount of computation at around 4.8 kb/s. In 4.8-kb/s LCELP, the excitation signal is quantized by a two-stage vector quantizer (VQ) with multiple candidates. The conventional two-stage VQ can significantly reduce the amount of computation for the excitation codebook search. However, the conventional two-stage VQ performance is usually lower than that for a single-stage VQ with the same number of bits. In order to overcome this problem and to produce high quality speech, the authors use a two-stage VQ with multiple candidates. The optimum combination of first and second code vectors is searched for among multiple candidates. Two excitation codebooks are designed iteratively, using a speech database. An experimental result shows that the 4.8-kb/s LCELP achieves an average of 11.8 dB segmental SNR, which is 1.6 dB higher than that for a 4.8 kb/s conventional CELP. Informal listening tests indicate that the 4.8 kb/s LCELP speech quality is significantly higher than that for the 4.8 kb/s conventional CELP coder.
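
The idea of a two-stage VQ with multiple candidates can be illustrated with a small sketch. It is a simplification under stated assumptions: it uses a plain squared-error criterion on raw vectors, whereas a CELP coder would compare perceptually weighted, synthesis-filtered signals with gains; the function names and the toy codebooks are hypothetical.

```python
def sq_err(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def two_stage_search(target, cb1, cb2, num_candidates):
    """Return (error, i, j) for the best pair found.

    Stage 1 keeps the `num_candidates` best first-stage vectors instead of
    only the single best; stage 2 quantizes each candidate's residual and
    the combination with the lowest overall error wins.
    """
    ranked = sorted(range(len(cb1)), key=lambda i: sq_err(target, cb1[i]))
    best = None
    for i in ranked[:num_candidates]:
        residual = [t - c for t, c in zip(target, cb1[i])]
        j = min(range(len(cb2)), key=lambda k: sq_err(residual, cb2[k]))
        err = sq_err(residual, cb2[j])
        if best is None or err < best[0]:
            best = (err, i, j)
    return best


if __name__ == "__main__":
    target = [2.0, 0.0]
    cb1 = [[1.5, 0.5], [1.0, 0.0]]
    cb2 = [[1.0, 0.0], [0.0, 0.0]]
    print(two_stage_search(target, cb1, cb2, num_candidates=1))  # (0.5, 0, 0)
    print(two_stage_search(target, cb1, cb2, num_candidates=2))  # (0.0, 1, 0)
```

In the toy data, the first-stage vector that looks best on its own leads to a worse final result than the runner-up, which is the weakness of the conventional (single-candidate) two-stage VQ that the abstract refers to.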

Proceedings ArticleDOI
A. De, Peter Kabal
06 Dec 1992
TL;DR: The output space of the cochlear model is explored using this measure, in order to verify the existence of the pitch and formant information in speech coding.
Abstract: The authors (1992) proposed a perceptual distortion measure for speech coders using an auditory (cochlear) model. This measure evaluates the neural-firing cross-entropy of the coded speech with respect to that of the original speech. Here the output space of the cochlear model is explored using this measure, in order to verify the existence of the pitch and formant information. A rate-distortion analysis for speech coding is provided. A lower bound to the rate-distortion function is evaluated based on the distortion measure, and the exact rate-distortion function is computed using the Blahut (1972) algorithm. Four state-of-the-art speech coders with rates ranging from 4.8 kb/s (CELP) to 32 kb/s (ADPCM) are studied from the viewpoint of their performance with respect to the rate-distortion limits.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors investigate three algorithms that orthogonalize codebooks in a multi-stage code excited linear prediction (CELP) speech coder and show that the extra computational cost of the recursive modified Gram-Schmidt algorithm is less than that of the other two.
Abstract: The authors investigate three algorithms that orthogonalize codebooks in a multi-stage code excited linear prediction (CELP) speech coder. They carry out the same processing, a locally optimal modeling of the perceptual signal, but the computational costs differ. The authors show that the extra computational cost of the recursive modified Gram-Schmidt algorithm is less than that of the other two. An orthogonal codebook is defined a priori and the authors observe an equivalence to orthogonal transform coding. Three methods based on the Karhunen-Loeve transform for designing this codebook are compared. A partitioned shape-gain VQ is applied in the transform domain.
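
For readers unfamiliar with the orthogonalization step, the following is a sketch of the textbook modified Gram-Schmidt procedure on a small set of vectors. It is not the recursive variant evaluated in the paper, whose details are not given in the abstract; the function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def modified_gram_schmidt(vectors, tol=1e-12):
    """Return an orthonormal basis spanning the input vectors.

    Unlike classical Gram-Schmidt, each new basis vector is removed from all
    remaining vectors immediately, which is numerically more stable.
    Vectors that become (numerically) zero are treated as dependent and skipped.
    """
    work = [list(v) for v in vectors]
    basis = []
    for k in range(len(work)):
        norm = dot(work[k], work[k]) ** 0.5
        if norm < tol:
            continue
        q = [x / norm for x in work[k]]
        basis.append(q)
        for j in range(k + 1, len(work)):
            proj = dot(q, work[j])
            work[j] = [x - proj * y for x, y in zip(work[j], q)]
    return basis


if __name__ == "__main__":
    for q in modified_gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]):
        print(q)
```

In a multi-stage coder, the same projection step would be applied to the later-stage codebook vectors with respect to the contributions already selected, so that each stage models only what the earlier stages left unexplained.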

Patent
28 May 1992
TL;DR: In this article, the recursive loop used to search vectors of an adaptive codebook is rearranged so that an impulse function of a short term perceptually weighted filter is first convolved with target speech and the result cross-correlated with each codebook vector to produce an error function for identifying the optimum adaptive code book vector.
Abstract: The computational effort and time for CELP coding of speech is reduced by rearranging the recursive loop used to search vectors of an adaptive codebook so that an impulse function of a short term perceptually weighted filter is first convolved with perceptually weighted target speech and the result cross-correlated with each codebook vector to produce an error function for identifying the optimum adaptive codebook vector. In addition, autocorrelation is initially performed for only a small number of autocorrelation coefficients in each codebook vector and the values found are used to scan through all the vectors to find those giving a better match to input speech. Autocorrelation using all the vector values is then performed on this subset of vectors to identify the best vector for representing the frame of speech. An end correction procedure is used for vectors shorter than the speech frame length to avoid copy-up errors common in the prior art. An improved means and method for obtaining correlation coefficients for the stochastic codebook vectors is also described.
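
The rearrangement described in the first sentence of the abstract above rests on a simple identity: correlating the target with a filtered codebook vector gives the same number as correlating a once-filtered ("backward filtered") target with the unfiltered vector. The sketch below checks that identity on toy data; the function names, the zero-state truncated filtering, and the sample values are assumptions for illustration rather than details from the patent.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def filter_zero_state(h, v):
    """Zero-state filtering: y[n] = sum_k h[k] * v[n - k], truncated to len(v)."""
    return [
        sum(h[k] * v[n - k] for k in range(min(n, len(h) - 1) + 1))
        for n in range(len(v))
    ]


def backward_filter(h, x):
    """d[m] = sum_{n >= m} x[n] * h[n - m]; computed once per target vector."""
    n_samples = len(x)
    return [
        sum(x[n] * h[n - m] for n in range(m, min(n_samples, m + len(h))))
        for m in range(n_samples)
    ]


if __name__ == "__main__":
    h = [1.0, 0.5, 0.25]              # toy weighted-filter impulse response
    x = [1.0, -2.0, 0.5, 3.0]         # toy weighted target
    codebook = [[0.0, 1.0, 0.0, -1.0], [1.0, 1.0, 0.0, 0.0]]

    d = backward_filter(h, x)         # one filtering pass for the whole search
    for v in codebook:
        direct = dot(x, filter_zero_state(h, v))  # filters every vector
        fast = dot(d, v)                          # one dot product per vector
        print(direct, fast)
```

The saving comes from replacing one filtering operation per codebook vector with a single filtering operation per target, after which each candidate costs only a dot product for the correlation term.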

Journal ArticleDOI
Claude Galand, J.E. Menez, M.M. Rosso
TL;DR: A novel way to use the code excited linear prediction (CELP) concept that decreases the processing load while keeping the same speech quality is discussed.
Abstract: A novel way to use the code excited linear prediction (CELP) concept that decreases the processing load while keeping the same speech quality is discussed. Rather than performing individual weighting of each candidate sequence, a global implementation of the perceptual weighting function at the codebook level is proposed. As a result, the analysis-by-synthesis procedure does not require the processing of all the candidate sequences through the synthesis and weighting filters; the complexity requirement of the algorithm is therefore much reduced. The concept is carried out with an adaptive codebook. Two fixed-point implementations of the adaptive CELP (ACELP) algorithm are reported: a 7.2 kb/s block coder (7 MIPS), and a 12 kb/s low-delay coder (11 MIPS). Both coders have been rated to provide the same quality as the 13 kb/s block coder adopted by the GSM for the European cellular telephone.

Journal ArticleDOI
01 Jun 1992
TL;DR: Results are presented on wideband 7 kHz speech coding at 16 kbit/s, where the proposed CELP algorithm is implementable on a single floating point DSP and the computational effort for the basic coder structure can be reduced to 12.4 MIPS with a 7 bit stochastic codebook.
Abstract: This paper presents results on wideband 7 kHz speech coding at 16 kbit/s where the proposed CELP algorithm is implementable on a single floating point DSP. As a basic coder structure, the long-term predictor is implemented as an adaptive codebook, while a sparse Gaussian codebook with non-overlapping vectors is used for the stochastic excitation. In order to meet the complexity requirements, several methods for efficient codebook search are adopted. With these methods, it is shown that the computational effort for the basic coder structure can be reduced to 12.4 MIPS with a 7 bit stochastic codebook. A two-stage hierarchical search through the adaptive codebook is investigated. This search method reduces the computational effort further although at the cost of a small degradation in coder performance. The coder is evaluated in an absolute category rating (MOS) test using both a hi-fi handset and a loudspeaker, and compared to the CCITT standard coder G.722 at 48–64 kbit/s. The speech quality with the basic CELP structure is judged to be comparable to the G.722 coder at 48 kbit/s.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: Several modifications to improve the quality of the standard CELP algorithm while simultaneously reducing its transmission rate are proposed, including a multimode excitation that increases the convergence rate of the adaptive (pitch) codebook and a low-complexity spectral vector quantization algorithm.
Abstract: While code excited linear prediction (CELP) speech coders can produce high levels of output speech quality at rates near 4 kb/s, they may not be suitable for toll quality communications. The authors propose several modifications to improve the quality of the standard CELP algorithm while simultaneously reducing its transmission rate. The modifications include a multimode excitation that increases the convergence rate of the adaptive (pitch) codebook. A low-complexity spectral vector quantization algorithm is also developed that reduces the coding rate and decreases the spectral distortion. SNR improvements of 2 dB and a significant reduction in perceptual artifacts have been observed.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: The results show that the vocoder with the new excitation model is capable of synthesizing more intelligible and more natural speech at 2.4 kb/s.
Abstract: A novel excitation model called the multicategory vector excitation (MCVE) model for a linear predictive coding (LPC) vocoder at 2.4 kb/s is proposed. In this model, the speech signal is classified into four categories: unvoiced, voiced, onset, and offset. For every category of speech, an excitation codebook is available. Different excitation codebooks hold different characteristics. The analysis-by-synthesis procedure is used to select the excitation vectors. A computer simulation has been carried out, and the results show that the vocoder with the new excitation model is capable of synthesizing more intelligible and more natural speech at 2.4 kb/s.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: It is shown that, by taking into account the nature of the MP-excitation signal into LPC parameter computation, it is possible to improve the effectiveness of the LPC model.
Abstract: An algorithm for LPC (linear predictive coding) parameter optimization in multipulse (MP)-LPC based speech coders is presented. It is shown that, by taking into account the nature of the MP-excitation signal into LPC parameter computation, it is possible to improve the effectiveness of the LPC model. This results in a better quality of the reconstructed signal in terms both of objective and subjective criteria. The implementation details of the algorithm are discussed and experimental results are presented. In particular a comparison with standard MP-LPC techniques is given.

Proceedings ArticleDOI
11 Oct 1992
TL;DR: The authors present a 2400-b/s speech coder based on a novel linear predictive coding (LPC) vocoder model that features a more flexible parameterization of the LPC excitation signal, which allows the synthesizer to generate more realistic output speech, and increases the robustness of the coder to acoustic background noise.
Abstract: The authors present a 2400-b/s speech coder based on a novel linear predictive coding (LPC) vocoder model. This model preserves the low bit rate capabilities of the traditional LPC vocoder, but it features a more flexible parameterization of the LPC excitation signal. This allows the synthesizer to generate more realistic output speech, and it also increases the robustness of the coder to acoustic background noise. The vocoder has been implemented in a real-time system. Formal subjective testing of the coder confirms that it produces natural-sounding speech even in a difficult noise environment. In fact, diagnostic acceptability measure (DAM) test scores show that its performance is close to that of the 4800-b/s Department of Defense CELP (code excited linear prediction) coder.

14 Apr 1992
TL;DR: An MB-LPC vocoder operating at 2.4 and 1.2 kb/s exploits the advantages of both time and frequency domain speech coding to produce natural sounding, good quality speech.
Abstract: At 4.8 kb/s and below, conventional code excited linear prediction (CELP) does not provide the appropriate degree of periodicity. It has been suggested that good quality low bit rate speech can be obtained from frequency domain techniques. An MB-LPC vocoder operating at 2.4 and 1.2 kb/s exploits the advantages of both time and frequency domain speech coding to produce natural sounding, good quality speech. This includes a new frequency domain post-filter which attenuates the noise in the formant nulls and enhances the synthesized speech.

Journal ArticleDOI
TL;DR: A very simple and efficient weighting filter with which the computational complexity of CELP coders can be considerably reduced is proposed, and several coders have been implemented showing that the perceptual quality of the simplified algorithm is equivalent to that of the original CELP.

Proceedings ArticleDOI
23 Mar 1992
TL;DR: The authors proposed an improvement to the excitation model of CELP coders by embedding a labeled-state finite state vector quantization (FSVQ) and a mixture density approach in constructing 5-ms-long excitation vectors.
Abstract: Code excited linear prediction (CELP) coding and its derivatives are currently the most frequently used techniques for speech compression at medium-to-low rate range. Until this study, the excitation vectors were always selected from a codebook generated by a Gaussian source or by the ensemble of residual signals collected from the used speech database. To the best of the authors' knowledge, there is no study in which the excitation vectors were formed from a mixture of sources, a notion very successfully used by the speech recognition community within the hidden Markov model (HMM) framework. The authors proposed an improvement to the excitation model of CELP coders by embedding a labeled-state finite state vector quantization (FSVQ) and a mixture density approach in constructing 5-ms-long excitation vectors.

Proceedings ArticleDOI
M. Mauc, Genevieve Baudoin
23 Mar 1992
TL;DR: The authors present a multistage method that preserves the qualities of the CELP vocoder with reduced complexity; the calculation gain can be improved up to 13 by introducing several stages of subsampling.
Abstract: Code excited linear predictor (CELP) coders enable speech coding with good quality at bit rates as low as 4 kbps. Several methods have been developed to lessen the computational task. The authors present a multistage method that preserves the qualities of the CELP vocoder with reduced complexity. The basic idea is to find a subset S_1 of candidate codewords by approximating the synthetic signal energy. This approximation is obtained by restricting the frequency range and subsampling. A full exact search is done on this subset. The procedure can be iterated. Typically, for an initial codebook of 1024 words and a single stage of subsampling by a factor q_1 = 5, the size of the subset S_1 is 70 and the calculation gain is 9. This performance can be improved up to a calculation gain of 13 by introducing several stages of subsampling with q_1 = 5 and q_2 = 2.
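
The preselect-then-search structure described above might look like the following sketch. It is a simplified illustration with assumptions: the band-limiting step mentioned in the abstract is omitted, the cheap score is simply the usual squared-correlation-over-energy criterion evaluated on every q-th sample, and the function names and toy data are hypothetical.

```python
def match_score(target, vec):
    """CELP-style selection criterion: squared correlation divided by energy."""
    energy = sum(v * v for v in vec)
    if energy == 0.0:
        return 0.0
    corr = sum(t * v for t, v in zip(target, vec))
    return corr * corr / energy


def preselect_then_search(target, filtered_codebook, q, subset_size):
    """Return the index of the best codeword found by a two-pass search.

    Pass 1 scores every codeword cheaply on every q-th sample and keeps the
    `subset_size` best; pass 2 runs the exact criterion on that subset only.
    """
    coarse_target = target[::q]
    ranked = sorted(
        range(len(filtered_codebook)),
        key=lambda i: match_score(coarse_target, filtered_codebook[i][::q]),
        reverse=True,
    )
    shortlist = ranked[:subset_size]
    return max(shortlist, key=lambda i: match_score(target, filtered_codebook[i]))


if __name__ == "__main__":
    target = [1.0, 0.0, 1.0, 0.0]
    codebook = [
        [1.0, 0.0, 1.0, 0.0],
        [0.0, 1.0, 0.0, 1.0],
        [1.0, 1.0, 1.0, 1.0],
        [1.0, 0.0, -1.0, 0.0],
    ]
    print(preselect_then_search(target, codebook, q=2, subset_size=2))  # 0
```

In the toy data, the subsampled score cannot tell codewords 0 and 2 apart, and the exact pass on the shortlist resolves the tie; this is why the subset has to be large enough (70 of 1024 in the abstract's example) for the exact search to recover the right codeword.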