Topic
Code-excited linear prediction
About: Code-excited linear prediction is a research topic. Over its lifetime, 2,025 publications have appeared within this topic, receiving 28,633 citations. The topic is also known as CELP.
Papers
••
02 Oct 1994
TL;DR: An FIR QMF filter bank with two-band splitting is applied to compute the long-term prediction in a CELP coder, improving the quality of the synthetic speech.
Abstract: This paper applies an FIR QMF filter bank to compute the long-term prediction in a CELP coder. By splitting the frequency spectrum of the past excitation signal, a different pitch lag and gain factor can be attached to each frequency band. This strategy improves the quality of the synthetic speech, providing a much more natural-sounding voice. The filter bank must be designed to meet requirements on low and nearly constant group delay and on band aliasing, rather than to achieve perfect reconstruction of the output signal. In the design, which employs two-band splitting, the authors have relaxed the usual linear-phase condition in favor of less aliasing between bands.
7 citations
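The per-band pitch lag and gain idea in the abstract above can be illustrated with a minimal Python sketch. It shows only a conventional open-loop long-term predictor search that could be run independently on each band. The QMF filter bank itself is not implemented, and the function name `ltp_search`, the lag range, and the restriction to lags no shorter than the target length are assumptions made for illustration, not details taken from the paper.

```python
def ltp_search(past, target, min_lag, max_lag):
    """Return (lag, gain) minimising ||target - gain * delayed_past||^2.

    past   : past excitation samples of one band (oldest first)
    target : current subframe of the same band
    Only lags >= len(target) are searched (an assumption that keeps
    the delayed segment entirely inside the past buffer).
    """
    n = len(target)
    best_lag, best_gain, best_score = min_lag, 0.0, float("-inf")
    for lag in range(max(min_lag, n), max_lag + 1):
        start = len(past) - lag
        if start < 0:
            break  # not enough history for this lag
        cand = past[start:start + n]
        corr = sum(t * c for t, c in zip(target, cand))
        energy = sum(c * c for c in cand)
        if energy == 0.0:
            continue
        # corr^2 / energy is the error reduction obtained with the
        # optimal gain corr / energy for this lag.
        score = corr * corr / energy
        if score > best_score:
            best_lag, best_gain, best_score = lag, corr / energy, score
    return best_lag, best_gain
```

Calling `ltp_search` once with the low-band history and once with the high-band history would give each band its own lag and gain, which is the kind of freedom the abstract describes.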
••
02 Dec 1990
TL;DR: A backward-adaptive CELP coder, named low-delay CELP (LD-CELP), appears from preliminary CCITT laboratory tests to meet all CCITT requirements for the 16 kb/s speech coding standard.
Abstract: The coder is basically a backward-adaptive version of the code-excited linear prediction (CELP) coder, and is named low-delay CELP, or LD-CELP. The low coding delay (less than 2 ms) is achieved by using a backward-adaptive predictor and gain, and by using a small excitation vector size. This coder has been implemented in real time using the AT&T DSP32C floating-point DSP chips. Currently the encoder uses about 90% of the processor time while the decoder takes about 40%. Fixed-point implementation of the coder is also possible with some modifications of the algorithm. The speech quality of 16 kb/s LD-CELP is approximately equivalent to that of the CCITT G.721 32 kb/s ADPCM (adaptive differential PCM) standard. The coder can pass 300, 1200, and 2400 b/s modem signals as well as DTMF (dual-tone multifrequency) tones. From the preliminary results of CCITT laboratory tests, it appears that this coder can meet all CCITT requirements for the 16 kb/s speech coding standard.
7 citations
••
04 Sep 2005
TL;DR: This paper proposes an 8.7 kbps CELP speech coder that uses mel-frequency cepstral coefficients (MFCCs) instead of LPCs to improve the performance of a server-based speech recognition system in network environments.
Abstract: Existing standard speech coders can provide high quality speech communication. However, they tend to degrade the performance of automatic speech recognition (ASR) systems that use the reconstructed speech. The main cause of the degradation is that the linear predictive coefficients (LPCs), which are typical spectral envelope parameters in speech coding, are optimized for speech quality rather than for speech recognition performance. In this paper, we propose a speech coder using mel-frequency cepstral coefficients (MFCCs) instead of LPCs to improve the performance of a server-based speech recognition system in network environments. To develop the proposed speech coder with a low bit rate, we first explore the interframe correlation of MFCCs, which leads to predictive quantization of the MFCCs. Second, a safety-net scheme is proposed to make the MFCC-based speech coder robust to channel errors. As a result, we propose an 8.7 kbps MFCC-based CELP coder. It is shown that the proposed speech coder has speech quality comparable to that of 8 kbps G.729, and that the ASR system using the proposed speech coder gives a relative word error rate reduction of 6.8% compared to the ASR system using G.729 on a large vocabulary task (AURORA4).
7 citations
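The combination of predictive quantization and a safety net mentioned in the abstract above can be sketched, under strong simplifying assumptions, as follows. This Python example uses a single scalar parameter per frame, a first-order predictor, and clipped uniform quantizers; the paper's actual quantizer design is not given in the abstract, so the names `quantize` and `safety_net_encode`, the prediction coefficient `rho`, the step sizes, and the per-frame selection rule are all hypothetical.

```python
def quantize(value, step, levels):
    """Uniform scalar quantiser clipped to +/- levels * step."""
    index = max(-levels, min(levels, round(value / step)))
    return index * step


def safety_net_encode(frames, rho=0.8, step_pred=0.05, step_safe=0.2, levels=8):
    """For each frame, pick the better of two quantisers.

    "pred": quantise the residual after first-order prediction from
            the previous reconstruction (fine step, small range).
    "safe": quantise the value directly, with no memory (coarse step,
            wide range); such frames do not depend on earlier frames.
    Returns a list of (mode, reconstructed_value) pairs.
    """
    prev = 0.0
    decisions = []
    for x in frames:
        prediction = rho * prev
        rec_pred = prediction + quantize(x - prediction, step_pred, levels)
        rec_safe = quantize(x, step_safe, levels)
        if abs(x - rec_pred) <= abs(x - rec_safe):
            mode, rec = "pred", rec_pred
        else:
            mode, rec = "safe", rec_safe
        decisions.append((mode, rec))
        prev = rec
    return decisions
```

In this sketch a memoryless frame is chosen when prediction fails, for example at an abrupt change. Because such a frame is decoded without reference to earlier frames, it also stops any error carried over from a corrupted previous frame, which is the usual motivation for a safety net.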
••
12 May 2008
TL;DR: This paper proposes an efficient method for estimating the frame energy of speech from the enhanced variable rate coder (EVRC) bitstream for network-based speech processing applications in transcoder free operation (TrFO) environments, where speech signals are represented as speech coding parameters.
Abstract: This paper proposes an efficient method for estimating the frame energy of speech from the enhanced variable rate coder (EVRC) bitstream for network-based speech processing applications in transcoder free operation (TrFO) environments, where speech signals are represented as speech coding parameters. The energy of a speech frame is decomposed into the energy of the excitation and the energy of the vocal tract filter, and an estimation method is derived for each component. Among the many parameters of the EVRC bitstream, the fixed codebook gain and adaptive codebook gain are used to estimate the excitation energy, and line spectrum pair (LSP) information is used to estimate the energy of the vocal tract filter. Experimental results demonstrated the novelty of the proposed method. The correlation coefficient between the actual and estimated frame energy can be maintained at a value of 0.994 with just 5% of the multiplication operations of full decoding.
7 citations
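The decomposition described in the abstract above, excitation energy times vocal tract filter energy, can be illustrated with a rough Python sketch. It is not the paper's estimator: it assumes the LPC coefficients have already been obtained from the LSPs (that conversion is omitted), that the adaptive and fixed codebook contributions are uncorrelated, and that the excitation is white enough for the output energy to scale with the energy of the filter's impulse response. All function and parameter names are hypothetical.

```python
def synthesis_filter_energy(lpc, taps=64):
    """Energy of the truncated impulse response of 1 / A(z),
    with A(z) = 1 + lpc[0] z^-1 + lpc[1] z^-2 + ..."""
    h = []
    for n in range(taps):
        value = 1.0 if n == 0 else 0.0
        for k, a in enumerate(lpc, start=1):
            if n - k >= 0:
                value -= a * h[n - k]
        h.append(value)
    return sum(v * v for v in h)


def excitation_energy(g_adaptive, g_fixed, e_adaptive, e_fixed):
    """Excitation energy from the two codebook gains, assuming the
    adaptive and fixed codebook vectors (with energies e_adaptive and
    e_fixed) are uncorrelated."""
    return g_adaptive ** 2 * e_adaptive + g_fixed ** 2 * e_fixed


def estimate_frame_energy(g_adaptive, g_fixed, e_adaptive, e_fixed, lpc):
    """Approximate frame energy as excitation energy times filter energy."""
    return (excitation_energy(g_adaptive, g_fixed, e_adaptive, e_fixed)
            * synthesis_filter_energy(lpc))
```

The point of the sketch is that nothing here synthesizes the speech waveform, so the estimate needs far fewer operations than full decoding.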
••
01 Sep 1998
TL;DR: ADPCM schemes with a nonlinear predictor based on neural nets, which yield an increase of 1 to 2.5 dB in the SEGSNR over classical methods, are discussed.
Abstract: Many speech coders are based on linear prediction coding (LPC); nevertheless, LPC cannot model the nonlinearities present in the speech signal. Because of this, there is growing interest in nonlinear techniques. In this paper we discuss ADPCM schemes with a nonlinear predictor based on neural nets, which yield an increase of 1 to 2.5 dB in the SEGSNR over classical methods. The paper discusses both block-adaptive and sample-adaptive predictions.
7 citations
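The SEGSNR figure quoted in the abstract above refers to segmental signal-to-noise ratio, which is commonly computed as the average of per-frame SNR values in dB. A minimal Python sketch of that common definition follows; the frame length, the handling of silent or perfectly reconstructed frames, and the name `segmental_snr` are assumptions, since the abstract does not state how the authors computed it.

```python
import math


def segmental_snr(reference, decoded, frame_len=160):
    """Mean of per-frame SNR values in dB.

    Frames with zero signal energy or zero error energy are skipped
    (an assumption; other conventions clamp the per-frame SNR instead).
    A trailing partial frame is ignored.
    """
    snrs = []
    for start in range(0, len(reference) - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal = sum(r * r for r in ref)
        noise = sum((r - d) ** 2 for r, d in zip(ref, dec))
        if signal == 0.0 or noise == 0.0:
            continue
        snrs.append(10.0 * math.log10(signal / noise))
    if not snrs:
        raise ValueError("no frame with non-zero signal and error energy")
    return sum(snrs) / len(snrs)
```

Averaging in the dB domain gives quiet frames the same weight as loud ones, which is why this measure is often preferred over a single whole-signal SNR for speech.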