
Showing papers on "Code-excited linear prediction" published in 1997


Proceedings ArticleDOI
21 Apr 1997
TL;DR: This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10.
Abstract: This paper describes the new US Federal Standard at 2400 bps. The mixed excitation linear prediction (MELP) coder was chosen by the DoD Digital Voice Processing Consortium to replace the existing 2400 bps Federal Standard FS 1015 (LPC-10). This new standard provides equal or improved performance over the 4800 bps Federal Standard FS 1016 (CELP) at a rate equivalent to LPC-10. The MELP coder is based on the traditional LPC model, but includes additional features to improve its performance.

212 citations


Patent
31 Jan 1997
TL;DR: In a CELP coder, a small number of excitation sequences is preselected by choosing those that most closely resemble a backward filtered target signal; the full-complexity search for the excitation with minimum error between the target signal and the synthetic signal then runs only on this subset.
Abstract: In a CELP coder a comparison between a target signal and a plurality of synthetic signals is made. The synthetic signals are derived by filtering a plurality of excitation sequences by a synthesis filter having parameters derived from the target signal. The excitation signal which results in a minimum error between the target signal and the synthetic signal is selected. The search for the best excitation signal requires substantial computational complexity. To reduce the complexity, a preselection is made by selecting a small number of excitation sequences that most closely resemble a backward filtered target signal. With this small number of excitation sequences a full complexity search is made. Due to the reduced number of excitation sequences involved in the final selection, the required computational complexity is reduced.
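A minimal sketch of the two-stage search described above, under stated assumptions: the synthesis filter is represented by a truncated impulse response `h`, "resemblance" is measured by the absolute inner product with the backward filtered target, and the helper names (`synth`, `backward_filter`, `preselect`, `full_search`) and the toy codebook are hypothetical, not taken from the patent.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def synth(h, c):
    """Zero-state synthesis: convolve excitation c with impulse response h,
    truncated to the length of c."""
    return [sum(h[k] * c[i - k] for k in range(min(i, len(h) - 1) + 1))
            for i in range(len(c))]

def backward_filter(h, t):
    """Backward filtered target d = H^T t, i.e. d[j] = sum_{i>=j} h[i-j] * t[i].
    With this d, dot(d, c) equals dot(t, synth(h, c)) for any excitation c."""
    n = len(t)
    return [sum(h[i - j] * t[i] for i in range(j, min(n, j + len(h))))
            for j in range(n)]

def preselect(codebook, d, keep):
    """Cheap stage: rank entries by |d . c| and keep the best `keep` indices."""
    ranked = sorted(range(len(codebook)),
                    key=lambda i: abs(dot(d, codebook[i])), reverse=True)
    return ranked[:keep]

def full_search(codebook, candidates, h, t):
    """Full-complexity stage: maximize (t . y)^2 / (y . y) with y = H c,
    which minimizes the error for the optimal gain."""
    best, best_score = None, -1.0
    for i in candidates:
        y = synth(h, codebook[i])
        energy = dot(y, y)
        if energy == 0.0:
            continue
        score = dot(t, y) ** 2 / energy
        if score > best_score:
            best, best_score = i, score
    return best

# Toy data (assumed): unit-pulse codebook and a short impulse response.
h = [1.0, 0.5, 0.25]
codebook = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
t = synth(h, codebook[1])          # target produced by entry 1
d = backward_filter(h, t)          # computed once per target
candidates = preselect(codebook, d, keep=2)
best = full_search(codebook, candidates, h, t)
```

In this sketch the backward filtering is done once per target, so each preselection score costs one inner product, while the filtering in `full_search` is only performed for the surviving candidates.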

82 citations


Patent
26 Jun 1997
TL;DR: In this article, a method for speech coding using Code-Excited Linear Prediction (CELP) produces toll-quality speech at data rates between 4 and 16 Kbit/s.
Abstract: The invention provides a method for speech coding using Code-Excited Linear Prediction (CELP) producing toll-quality speech at data rates between 4 and 16 Kbit/s. The invention uses a series of baseline, implied and adaptive codebooks, comprised of pulse and random codebooks, with associated gain vectors, to characterize the speech. Improved quantization and search techniques to achieve real-time operation, based on the codebooks and gains, are also provided.

82 citations


Patent
03 Oct 1997
TL;DR: In this article, a speech coding system was proposed to provide reconstructed voiced speech with a smoothly evolving pitch-cycle waveform, where a speech signal is represented by isolating and coding prototype waveforms.
Abstract: A speech coding system providing reconstructed voiced speech with a smoothly evolving pitch-cycle waveform. A speech signal is represented by isolating and coding prototype waveforms. Each prototype waveform is an exemplary pitch-cycle of voiced speech. A coded prototype waveform is transmitted at regular intervals to a receiver which synthesizes (or reconstructs) an estimate of the original speech segment based on the prototypes. The estimate of the original speech signal is provided by a prototype interpolation process which provides a smooth time-evolution of pitch-cycle waveforms in the reconstructed speech. Illustratively, a frame of original speech is coded by first filtering the frame with a linear predictive filter. Next a pitch-cycle of the filtered original is identified and extracted as a prototype waveform. The prototype waveform is then represented as a set of Fourier series (frequency domain) coefficients. The pitch-period and Fourier coefficients of the prototype, as well as the parameters of the linear predictive filter, are used to represent a frame of original speech. These parameters are coded by vector and scalar quantization and communicated over a channel to a receiver which uses information representing two consecutive frames to reconstruct the earlier of the two frames based on a continuous prototype waveform interpolation process. Waveform interpolation may be combined with conventional CELP techniques for coding unvoiced portions of the original speech signal.

66 citations


Proceedings ArticleDOI
21 Apr 1997
TL;DR: A variable-rate multimodal speech coder with an average bit rate of 3 kb/s for a speech activity factor of 80% and quality comparable to the GSM full rate coder is developed.
Abstract: In general, a variable rate coder can obtain the same speech quality as a fixed rate coder, while reducing the average bit rate. We have developed a variable-rate multimodal speech coder with an average bit rate of 3 kb/s for a speech activity factor of 80% and quality comparable to the GSM full rate coder. The coder has four coding modes and uses a robust classification method involving the pitch gain, zero crossings, and a peakiness measure. Also the coder employs a novel gain-matched analysis-by-synthesis technique for very low rate coding of unvoiced frames and an improved noise-level-dependent postfilter. This paper describes the details of our algorithm and presents the results from subjective listening tests.

64 citations


Proceedings ArticleDOI
02 Jul 1997
TL;DR: Two computationally efficient LSP-based processing methods designed to enhance the intelligibility of speech degraded by acoustic interference are described.
Abstract: CELP coders commonly use line spectral pairs (LSP) to represent linear prediction parameters, giving stable filters and efficient coding. However, manipulation of LSPs can alter frequencies within the represented signals. This paper describes two computationally efficient LSP-based processing methods designed to enhance the intelligibility of speech degraded by acoustic interference.

42 citations


Journal ArticleDOI
TL;DR: A new method for rate variation based on a measure of subband spectral flatness, called spectral entropy, which is a normalized indicator of the texture of the input spectrum and is thus less dependent on speech and background noise energy variations.
Abstract: Code-excited linear prediction (CELP) is the predominant methodology for communications quality speech coding below 8 kbps, and several variable-rate CELP schemes have been discussed in the literature, including QCELP, the variable-rate wideband digital cellular mobile radio speech coding standard specified in IS-95. A key component of these speech coders is the detection and classification of speech activity, and several cues for rate variation have been studied, such as measuring the short-term speech energy, deciding whether the speech is voiced or unvoiced, or making more sophisticated phonetic classifications. We present a new method for rate variation based on a measure of subband spectral flatness, called spectral entropy. Spectral entropy is a normalized indicator of the texture of the input spectrum and is thus less dependent on speech and background noise energy variations. We present some results on the use of spectral entropy for voice activity detection across subbands and then evaluate using spectral entropy for deriving mode and rate allocation cues for a variable-rate CELP coder operating at an average rate of 2 kbps. To achieve communications quality speech at this rate, we develop a new split-band vector quantization (VQ) technique for representing the line spectral pairs and a multiple codebook approach for efficiently quantizing the coefficients of a three-tap pitch predictor, called lag-indexed VQ.
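As a rough illustration of why an entropy-style measure is insensitive to signal energy, the sketch below computes a normalized Shannon entropy over per-band power values. The abstract does not give the paper's exact definition, so this formula, the function name `spectral_entropy`, and the sample band powers are assumptions.

```python
import math

def spectral_entropy(power):
    """Normalized entropy of a power distribution across bands.
    Returns a value in [0, 1]: near 1 for a flat (noise-like) spectrum,
    near 0 when the power is concentrated in a single band."""
    total = sum(power)
    if total <= 0 or len(power) < 2:
        return 0.0
    probs = [p / total for p in power]          # normalization removes overall level
    h = -sum(p * math.log(p) for p in probs if p > 0)
    return h / math.log(len(power))

flat = spectral_entropy([2.0, 2.0, 2.0, 2.0])       # flat spectrum
peaky = spectral_entropy([0.0, 0.0, 5.0, 0.0])      # one dominant band
quiet = spectral_entropy([1.0, 4.0, 2.0, 1.0])
loud = spectral_entropy([100.0, 400.0, 200.0, 100.0])  # same shape, 20 dB louder
```

Because the band powers are divided by their sum before the entropy is taken, `quiet` and `loud` agree up to rounding, which is the energy independence the abstract refers to.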

38 citations


Patent
Alan V. McCree
18 Jun 1997
TL;DR: In this paper, an improved filtering method for use in an enhancement filter in a mixed excitation linear prediction (MELP) speech coder or a postfilter in a codebook excitation linear prediction (CELP) speech coder, which includes two filters, is disclosed.
Abstract: An improved filtering method for use in an enhancement filter in a mixed excitation linear prediction (MELP) speech coder or a postfilter in a codebook excitation linear prediction (CELP) speech coder is disclosed which includes two filters. The first filter (62) has a transfer function of ##EQU1## where P is the set of prediction coefficients, α and β are scaling factors, z is the inverse of the unit delay operation used in the transform representation of the transfer functions, and sig-prob is the signal probability estimator value. The second filter (65) has a transfer function of 1 - μ z^-1 * sig-prob, where μ is a scaling factor. The sig-prob is the signal probability value based on a comparison of the power of the signals in the current frame to a long-term estimate of noise power in the signal probability estimator (63). The sig-prob value is 1 if the power of the signals is greater than the noise power plus 30 dB, and it is zero if the power is less than the noise power plus 12 dB. Between these two conditions, sig-prob is (log gain - 12 dB - noise gain)/18.
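The piecewise rule for sig-prob and the second filter can be sketched as follows. This is one reading of the abstract, assuming that "log gain" and "noise gain" are the frame power and the long-term noise power in dB and that the second filter is applied as y[n] = x[n] - μ · sig-prob · x[n-1]; the function names and the sample value of μ are hypothetical.

```python
def sig_prob(gain_db, noise_db):
    """Signal probability from frame power and noise power, both in dB.
    1 above noise + 30 dB, 0 below noise + 12 dB, linear ramp in between."""
    excess = gain_db - noise_db
    if excess > 30.0:
        return 1.0
    if excess < 12.0:
        return 0.0
    return (excess - 12.0) / 18.0

def tilt_filter(x, mu, p, prev=0.0):
    """Apply 1 - mu * p * z^-1 to samples x, where p is the sig-prob value
    and prev is the last sample of the previous frame."""
    out = []
    for s in x:
        out.append(s - mu * p * prev)
        prev = s
    return out

p_loud = sig_prob(50.0, 10.0)    # 40 dB above the noise estimate
p_mid = sig_prob(31.0, 10.0)     # 21 dB above: halfway up the ramp
p_noise = sig_prob(15.0, 10.0)   # 5 dB above: treated as noise
y_on = tilt_filter([1.0, 1.0, 1.0], mu=0.5, p=p_loud)
y_off = tilt_filter([1.0, 1.0, 1.0], mu=0.5, p=p_noise)
```

With sig-prob at zero the second filter passes the signal unchanged, so under this reading the filter is disabled in frames judged to be noise and applied fully only when the frame power is well above the noise estimate.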

37 citations


Proceedings ArticleDOI
21 Apr 1997
TL;DR: A novel low-delay wideband speech coder that employs a multi-band bank of off-line filtered excitation codebooks, fullband linear prediction synthesis, and minimization of the error between the original and the synthesized speech signal over the full frequency range is described.
Abstract: A novel low-delay wideband speech coder, called multiband CELP (MB-CELP), overcomes the major obstacles usually associated with two traditional CELP approaches to wideband speech coding, namely fullband CELP and split-band CELP. The new MB-CELP coder employs a multi-band bank of off-line filtered excitation codebooks, fullband linear prediction synthesis, and minimization of the error between the original and the synthesized speech signal over the full frequency range. A 16 kbps version of the MB-CELP coder with two equal bands is described. Subjective comparison test results show that this coder performs better than the G.722 coder at a bit-rate of 48 kbps.

37 citations


Proceedings ArticleDOI
21 Apr 1997
TL;DR: This paper compares the performance scores, diagnostic information, and complexity of MELP to the 4800 bps Federal Standard (FS1016) code excited linear prediction (CELP) algorithm, the 16 kbps continuously variable slope delta modulation (CVSD) algorithm, and the venerable Federal Standard (FIPS Pub. 137) 2400 bps linear predictive coding (LPC-10) algorithm.
Abstract: In 1996, the U.S. Department of Defense Digital Voice Processing Consortium (DDVPC) selected Texas Instruments' mixed excitation linear prediction (MELP) algorithm as the recommended new Federal Standard for 2400 bps voice communications. The algorithm selection process involved quality, intelligibility, communicability, and recognizability testing in many acoustic noise, error, and tandem conditions. Algorithm complexity was also measured. This paper compares the performance scores, diagnostic information, and complexity of MELP to the 4800 bps Federal Standard (FS1016) code excited linear prediction (CELP) algorithm, the 16 kbps continuously variable slope delta modulation (CVSD) algorithm, and the venerable Federal Standard (FIPS Pub. 137) 2400 bps linear predictive coding (LPC-10) algorithm.

37 citations


Patent
19 Feb 1997
TL;DR: In this article, a CELP-type voice coding/decoding device using the voice source vector generating device as the noise code book was proposed to improve the quality of the synthetic voice.
Abstract: PROBLEM TO BE SOLVED: To make it possible, in a voice source vector generating device and a voice coding/decoding device that efficiently compress, code, and decode voice information, to perform noise code book retrieval at a calculation cost nearly equal to that of using an algebraic structural voice source as the noise code book, while obtaining a high quality synthetic voice. SOLUTION: The voice source vector generating device stores (11) a plurality of fixed waveforms, arranges (12) each fixed waveform at its start position based on start position candidate information, and adds (13) these fixed waveforms to generate a voice source vector (14) that is close to a real voice. Further, by constituting a CELP (code excited linear prediction) type voice coding/decoding device that uses the voice source vector generating device as the noise code book, the quality of the synthetic voice is improved at a calculation cost nearly equal to that of using the algebraic structural voice source as the noise code book. COPYRIGHT: (C)1998,JPO
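The store, arrange, and add steps of the solution can be sketched as below. This is only an illustration of the idea of summing stored fixed waveforms at chosen start positions; the function name `build_excitation`, the waveform shapes, and the positions are hypothetical and not taken from the patent.

```python
def build_excitation(waveforms, starts, length):
    """Place each stored fixed waveform at its start position and sum them
    into one excitation (voice source) vector of the given length.
    Samples that would fall outside the vector are dropped."""
    out = [0.0] * length
    for wave, start in zip(waveforms, starts):
        for k, value in enumerate(wave):
            if 0 <= start + k < length:
                out[start + k] += value
    return out

# Two assumed fixed waveforms and one candidate set of start positions.
fixed_waveforms = [[1.0, -1.0], [0.5]]
excitation = build_excitation(fixed_waveforms, starts=[0, 1], length=4)
```

A codebook search in this scheme would try different start position candidates per waveform, much as an algebraic codebook tries different pulse positions, which is consistent with the comparable search cost the abstract claims.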

Journal ArticleDOI
TL;DR: The MFCELP method provides a significant visual improvement over the discrete cosine transform based Joint Photographic Experts Group (JPEG) method, the wavelet transform based embedded zero-tree wavelet (EZW) coding method, and the vector tree (VT) coding method, as well as the multispectral segmented autoregressive moving average (MSARMA) method the authors developed previously.
Abstract: This paper reports a multispectral code excited linear prediction (MCELP) method for the compression of multispectral images. Different linear prediction models and adaptation schemes have been compared. The method that uses a forward adaptive autoregressive (AR) model has been proven to achieve a good compromise between performance, complexity, and robustness. This approach is referred to as the MFCELP method. Given a set of multispectral images, the linear predictive coefficients are updated over nonoverlapping three-dimensional (3-D) macroblocks. Each macroblock is further divided into several 3-D microblocks, and the best excitation signal for each microblock is determined through an analysis-by-synthesis procedure. The MFCELP method has been applied to multispectral magnetic resonance (MR) images. To satisfy the high quality requirement for medical images, the error between the original image set and the synthesized one is further specified using a vector quantizer. This method has been applied to images from 26 clinical MR neuro studies (20 slices/study, three spectral bands/slice, 256×256 pixels/band, 12 b/pixel). The MFCELP method provides a significant visual improvement over the discrete cosine transform (DCT) based Joint Photographic Experts Group (JPEG) method, the wavelet transform based embedded zero-tree wavelet (EZW) coding method, and the vector tree (VT) coding method, as well as the multispectral segmented autoregressive moving average (MSARMA) method we developed previously.

Patent
29 Dec 1997
TL;DR: In this paper, a multimodal code-excited linear prediction (CELP) speech coder determines a pitch-lag-periodicity-independent peakiness measure from the input speech.
Abstract: A multimodal code-excited linear prediction (CELP) speech coder determines a pitch-lag-periodicity-independent peakiness measure from the input speech. If the measure is greater than a peakiness threshold the encoder classifies the speech in a first coding mode. In one embodiment only frames having an open-loop pitch prediction gain not greater than a threshold, a zero-crossing rate not less than a threshold, and a peakiness measure not greater than the peakiness threshold will be classified as unvoiced speech. Accordingly, the beginning or end of a voiced utterance will be properly coded as voiced speech and speech quality improved. In another embodiment, gain-match scaling matches coded speech energy to input speech energy. A target vector (the portion of input speech with any effects of previous signals removed) is approximated using the precomputed gain for excitation vectors while minimizing perceptually-weighted error. The correct gain value is perceptually more important than the shape of the excitation vector for most unvoiced signals.
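The unvoiced test in the first embodiment can be sketched as a three-way threshold check. The abstract does not define the peakiness measure, so the sketch assumes the ratio of RMS value to mean absolute value, which is one common choice; the function names and all threshold values are hypothetical.

```python
import math

def peakiness(frame):
    """Assumed peakiness measure: RMS divided by mean absolute value.
    It is 1.0 for a constant-magnitude signal and grows for isolated peaks."""
    mean_abs = sum(abs(s) for s in frame) / len(frame)
    if mean_abs == 0.0:
        return 0.0
    rms = math.sqrt(sum(s * s for s in frame) / len(frame))
    return rms / mean_abs

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)

def is_unvoiced(pitch_gain, zcr, peak,
                gain_max=0.3, zcr_min=0.3, peak_max=1.8):
    """Unvoiced only if all three conditions hold: pitch prediction gain not
    above its threshold, zero-crossing rate not below its threshold, and
    peakiness not above its threshold. Threshold values are placeholders."""
    return pitch_gain <= gain_max and zcr >= zcr_min and peak <= peak_max

noise_like = [1.0, -1.0] * 8          # flat envelope, many sign changes
onset = [8.0] + [0.0] * 15            # single strong pulse, e.g. a voicing onset
unvoiced_flag = is_unvoiced(0.1, zero_crossing_rate(noise_like), peakiness(noise_like))
onset_flag = is_unvoiced(0.1, 1.0, peakiness(onset))
```

The second call shows the point made in the abstract: a peaky frame is kept out of the unvoiced class even when its pitch gain is low and its zero-crossing rate is high, so the start or end of a voiced utterance is not coded as unvoiced.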

Patent
20 Feb 1997
TL;DR: In this article, a pitch peak position calculator was used to determine the pitch peak positions of an adaptive code vector, and an amplitude emphasizing window generator was used for emphasizing the amplitude of the peak position.
Abstract: PROBLEM TO BE SOLVED: To improve the sound quality of the sound source generating section in a CELP (code excited linear prediction) type voice coding device. SOLUTION: A pitch peak position calculator 12 determines the pitch peak position of an adaptive code vector, an amplitude emphasizing window generator 13 generates a window for emphasizing the amplitude at the pitch peak position, and an amplitude emphasizing window applicator 16 emphasizes the amplitude of the noise code vector at the position corresponding to the pitch peak. Alternatively, the device determines the pulse search positions so that they are dense near the pitch peak position and sparse in the other regions, and searches pulse positions based on the determined search positions. Alternatively, it switches the sound source configuration in a backward adaptive manner, using the pitch peak position and pitch period information of the immediately preceding subframe and the pitch period information of the present subframe, to improve sound quality and suppress the propagation of the effects of transmission line errors.

Patent
16 Sep 1997
TL;DR: In this article, a parametric speech codec, such as a CELP, RELP, or VSELP codec, is integrated with an echo canceler to provide the functions of parametric speech encoding, decoding, and echo cancellation in a single unit.
Abstract: A parametric speech codec, such as a CELP, RELP, or VSELP codec, is integrated with an echo canceler to provide the functions of parametric speech encoding, decoding, and echo cancellation in a single unit. The echo canceler (90) includes a convolution processor (116) or transversal filter that is connected to receive the synthesized parametric components, or codebook basis functions, of respective send and receive signals being decoded and encoded by respective decoding and encoding processors (130, 94). The convolution processor (116) produces an estimated echo signal for subtraction from the send signal. In order to process the synthesized parametric components having distinct basis functions in the convolution processor, conversion means (142, 145) are provided for providing the receive-side parametric component to the processor, or for providing the estimated echo signal, in terms of the send-side parameter. Plural convolution processors are provided for processing respective parametric components of the desired coding scheme.

Journal ArticleDOI
TL;DR: A technique for nonlinear prediction of speech via local linear prediction (LLP) is presented and applied to LD-CELP at 16 kbps; it gives better prediction gain and a remarkably "whiter" residual compared to a backward adaptive linear predictor.
Abstract: A technique for nonlinear prediction of speech via local linear prediction (LLP) is presented and applied to LD-CELP at 16 kbps. With 18th-order backward adaptive LLP for voiced frames, the hybrid LD-CELP coder gives higher segmental signal-to-noise ratio (SNR) compared to a reference version of the ITU-T G.728 LD-CELP algorithm, which has a 50th-order backward adaptive linear predictor. The computational complexity for LLP analysis is significantly less than that of a conventional one-step recursive LLP, and the LLP method gives better prediction gain and a remarkably "whiter" residual compared to a backward adaptive linear predictor. With an appropriate state space neighborhood for local linear analysis, the short-delay predictor is also able to effectively model long-term correlations without requiring pitch estimation.

Proceedings ArticleDOI
07 Sep 1997
TL;DR: Experiments on the audibility of stationary and cyclostationary narrow-band noise added to voiced speech generated by natural and synthetic excitation support the notion that sinusoidal coders work well for female speech and that CELP coders work well for male speech.
Abstract: This paper examines the audibility of stationary and cyclostationary narrow-band noise added to voiced speech generated by natural and synthetic excitation. Varying the temporal location of noise within a pitch cycle corresponds to varying its phase spectrum. Exploiting this fact, we find that a change of phase is more perceptible for a low-pitched sound than for a high-pitched sound. Our results support the notion that sinusoidal coders work well for female speech and that CELP coders work well for male speech.


Proceedings ArticleDOI
02 Dec 1997
TL;DR: Results of an investigation of the effect of three common speech coding systems (CELP, LPC and GSM) on the pitch and formant frequencies of speech extracted from several dialect regions of the TIMIT Speech Corpus are presented.
Abstract: The introduction of speech coding systems in the telephone network raises the question of their impact on formant frequencies, fundamental frequency trajectories and other acoustic features used for text-dependent speaker identification. This paper presents results of an investigation of the effect of three common speech coding systems (CELP, LPC and GSM) on the pitch and formant frequencies of speech extracted from several dialect regions of the TIMIT Speech Corpus. Voice pitch (F0) and formant frequencies (F1, F2, F3) extracted from time aligned, uncoded and coded speech samples are compared to establish the statistical distribution of error attributed to the coding system.

Patent
06 Nov 1997
TL;DR: In this paper, the noise vector reader and the noise code list of a conventional CELP voice encoder/decoder are replaced by an oscillator which outputs different vector sequences in accordance with the values of inputted seeds and a seed storage unit in which a plurality of seeds (seeds of oscillators) are stored respectively.
Abstract: The noise vector reader and the noise code list of a conventional CELP voice encoder/decoder are replaced by an oscillator which outputs different vector sequences in accordance with the values of inputted seeds and a seed storage unit in which a plurality of seeds (seeds of oscillators) are stored respectively. With this replacement, it is not necessary to store fixed vectors in a fixed code list (ROM), and the memory capacity is substantially reduced.

Proceedings ArticleDOI
07 Sep 1997
TL;DR: A low rate speech coding algorithm, harmonic vector excitation coding (HVXC), is proposed for MPEG-4 standardization; it employs an efficient coding scheme based on harmonic and stochastic vector representation of linear predictive coding (LPC) residuals.
Abstract: A low rate speech coding algorithm, harmonic vector excitation coding (HVXC), is proposed for MPEG-4 standardization, in which an efficient coding scheme based on harmonic and stochastic vector representation of linear predictive coding (LPC) residuals is employed. A combination of weighted vector quantization of the harmonic spectral envelope of the LPC residual signal for voiced segments and vector excitation coding for unvoiced segments provides good speech quality at very low bit rates. MPEG-4 formal listening tests in December 1995 showed that the subjective speech quality of HVXC at 2.0 kbps was better than that of FS1016 4.8 kbps CELP.

Patent
Bunkei Matsuoka
17 Oct 1997
TL;DR: In a variable rate speech coding method for a CELP speech coding system, an adaptive sound source vector and a first noise source vector are selected from a sound source code book and a noise source code book so that a first synthesized speech signal is obtained which has a minimum distortion relative to an input speech signal.
Abstract: In a variable rate speech coding method for a CELP speech coding system, an adaptive sound source vector and a first noise source vector are selected from a sound source code book and a noise source code book so that a first synthesized speech signal is obtained which has a minimum distortion relative to an input speech signal. A virtual reference speech signal is generated using a sound source signal which is produced using the adaptive sound source vector. A second noise source vector corresponding to the adaptive sound source vector is selected so that a second synthesized speech signal is obtained which has a minimum distortion relative to the virtual reference speech signal. The sending of a noise source code book index corresponding to the first noise source vector is suspended according to the quality of the second synthesized speech signal.

Proceedings ArticleDOI
04 May 1997
TL;DR: In this article, the physical layer of a 3.6864 Mcps wideband CDMA system for future public land mobile telecommunication systems (FPLMTS) is described.
Abstract: In this paper, the physical layer of a 3.6864 Mcps wideband CDMA system which has been proposed by ETRI for FPLMTS (future public land mobile telecommunication systems) is described. It is designed to fit within a 5 MHz bandwidth so that frequencies can easily be allocated to carriers in multiples of 5 MHz bands and so that the specification of the pulse shaping filter can be relaxed. 8 kbps CS-ACELP (conjugate structure algebraic CELP) is adopted as the main vocoder algorithm, and 32 kbps ADPCM can also be used. In the reverse link, the continuous pilot scheme is introduced to cope with discontinuous data transmission and to have a symmetrical H/W component to the forward link. In order to maintain the service quality with heavy signaling data, signaling activity with a dedicated signaling channel is introduced for spectrum efficiency. User information data of up to 128 kbps can be transmitted by using QPSK data/QPSK spreading, variable spreading factor, and code pair assignment. Based on the 3.6864 Mcps system, the multiband 0.9216/3.6864/14.7456 Mcps system for multilayered cell environments is under consideration for FPLMTS.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: Informal subjective testing (MOS) indicates that the proposed variable-rate CELP codec, at an average rate of less than 3.2 kb/s, achieves better quality than fixed rate standard codecs with rates in the range 4-4.8 kb/s.
Abstract: This paper presents a variable-rate CELP codec which achieves good communications speech quality at an average rate of about 3 kb/s. The codec operates as a source-controlled variable rate coder with rates of 4.9 kb/s for voiced and transition sounds, 3.0 kb/s for unvoiced sounds and 670 b/s for silent frames. New techniques used in the codec include prediction of the fixed codebook target vector and joint optimization of the adaptive and fixed codebook search. The prediction of the fixed codebook target vector is based on fixed codebook selections in previous subframes and a running estimate for the fundamental frequency. Informal subjective testing (MOS) indicates that the proposed codec, at an average rate of less than 3.2 kb/s, achieves better quality than fixed rate standard codecs with rates in the range 4-4.8 kb/s.

Proceedings ArticleDOI
02 Nov 1997
TL;DR: This study examines and compares two methods designed for explicit control of spectral dynamics in speech coding: one incorporates a constraint in the distortion measure at the encoder, and the other smooths the trajectory of output vectors at the decoder side.
Abstract: Taking the evolution of spectral parameters into consideration in speech coding has been shown to enhance the perceptual performance. In this study we examine and compare two methods that are designed for explicit control of spectral dynamics. One method operates on the encoder part of the coding system by incorporating a constraint in the distortion measure and the other method smoothes the trajectory of output vectors at the decoder side. The decoder method requires however an additional coding delay of one frame. By means of listening experiments it is demonstrated for three different vector quantizer structures that especially the decoder method gives significant improvements. For noisy channels, the preference for this method is even more emphasized.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: Simulation results indicate that the quality of the wideband enhanced speech is significantly improved over the narrowband CELP-coded speech.
Abstract: Results for improving the quality of narrowband CELP-coded speech by enhancing the pitch periodicity and by regenerating the high-band components of speech spectra are reported. Multiband excitation (MBE) analysis is applied to enhance the pitch periodicity by re-synthesizing the speech signal using a harmonic synthesizer. The high-band magnitude spectra are regenerated by matching to low-band spectra using a trained wideband spectral codebook. Information about the voiced/unvoiced (V/UV) excitation in the high-band is derived from a training procedure and recovered by using the matched low-band index. Simulation results indicate that the quality of the wideband enhanced speech is significantly improved over the narrowband CELP-coded speech.

Journal ArticleDOI
TL;DR: The authors use computationally efficient LSP manipulation to enhance the intelligibility of speech degraded by acoustic interference.
Abstract: Linear prediction parameters within CELP coders are commonly represented by line spectral pairs (LSP), giving stable filters and efficient coding. However, LSP manipulation can also alter the frequencies of the represented signals. The authors use computationally efficient LSP manipulation to enhance the intelligibility of speech degraded by acoustic interference.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: A CELP coder is implemented in which mel-generalized cepstral coefficients are quantized using MA prediction, which has a higher objective quality than conventional CELP.
Abstract: The performance of several algorithms for the quantization of the mel-generalized cepstral coefficients is studied. First, the objective and subjective performance of two-stage vector quantization (VQ) is measured. It is shown that the subjective quality for the mel-generalized cepstral coefficients is higher than that for LSP. Secondly, interframe prediction is introduced in the encoding of mel-generalized cepstral coefficients. By utilizing interframe moving average (MA) prediction, the mel-generalized cepstral coefficients can be encoded more efficiently than LSP in terms of cepstral distortion. Finally, we implement a CELP coder based on mel-generalized cepstral analysis in which mel-generalized cepstral coefficients are quantized using MA prediction. This coder has a higher objective quality than conventional CELP.

Proceedings ArticleDOI
Pasi Ojala
21 Apr 1997
TL;DR: A source controlled variable-rate CELP type speech codec that produces toll quality speech equal to that of the 32 kbit/s ADPCM (G.726) standard.
Abstract: This paper presents a source controlled variable-rate CELP type speech codec. First, a voice activity detection block distinguishes active speech frames from silence and background noise. The active speech is further classified into voiced and unvoiced frames. The voiced frames have variable bit-rate pitch-lag quantization based on the characteristics of the speech, whereas the unvoiced frames are coded without pitch information. A variable bit-rate fixed codebook excitation with a variable number of excitation pulses is determined for each speech frame. The performance of the linear analysis part of the codec as well as the input speech characteristics determine the excitation bit-rate. The average bit-rate of the codec is around 7.0 kbit/s for active speech, and the overall bit-rate ranges from 0 to 7.85 kbit/s. The described variable-rate codec produces toll quality speech equal to that of the 32 kbit/s ADPCM (G.726) standard.

Patent
06 Nov 1997
TL;DR: In this article, a random code vector reading section was replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds.
Abstract: A random code vector reading section and a random codebook of a conventional CELP type speech coder/decoder are respectively replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds. This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby considerably reducing the memory capacity.
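The memory saving described above comes from storing one seed per codebook entry instead of a full vector. The sketch below illustrates this, using Python's pseudo-random generator as a stand-in for the patent's oscillator; the seed values, vector length, and function names are hypothetical.

```python
import random

# Seed storage section: one small integer per codebook entry (assumed values).
SEEDS = [11, 23, 42, 97]

def excitation_from_seed(seed, length):
    """Stand-in for the oscillator: produce a vector that depends only on
    the seed, so encoder and decoder regenerate identical vectors."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(length)]

def codevector(index, length=40):
    """Regenerate codebook entry `index` on demand instead of reading it
    from a stored table of fixed vectors."""
    return excitation_from_seed(SEEDS[index], length)

first = codevector(2)
again = codevector(2)      # same seed, so the same vector is reproduced
other = codevector(0)      # different seed, different vector
```

In this sketch the stored data is four integers rather than four 40-sample vectors; the trade-off is that each vector must be regenerated during the codebook search.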