
Showing papers on "Code-excited linear prediction" published in 2001


Patent
30 Apr 2001
TL;DR: The random code vector reading section and random codebook of a conventional CELP speech coder/decoder are replaced, respectively, with an oscillator that outputs different vector streams according to the values of input seeds and a seed storage section that stores a plurality of seeds.
Abstract: A random code vector reading section and a random codebook of a conventional CELP type speech coder/decoder are respectively replaced with an oscillator for outputting different vector streams in accordance with values of input seeds, and a seed storage section for storing a plurality of seeds. This makes it unnecessary to store fixed vectors as they are in a fixed codebook (ROM), thereby considerably reducing the memory capacity.

92 citations
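
The seed-based idea in the entry above can be illustrated with a minimal sketch. It assumes a linear congruential generator stands in for the patent's oscillator, whose actual design is not given in the abstract; `lcg_vector`, `SEED_TABLE`, and `code_vector` are hypothetical names introduced only for this illustration.

```python
def lcg_vector(seed, length=40, modulus=2**31, a=1103515245, c=12345):
    """Generate a deterministic pseudo-random vector from an integer seed.

    Assumption: an LCG is used here as a stand-in oscillator; the patent
    abstract does not specify the generator.
    """
    state = seed % modulus
    samples = []
    for _ in range(length):
        state = (a * state + c) % modulus
        # Map the integer state to a float in [-1, 1).
        samples.append(state / (modulus / 2) - 1.0)
    return samples


# Hypothetical seed storage: a few integers take the place of stored vectors.
SEED_TABLE = [1, 17, 4242, 99991]


def code_vector(index, length=40):
    """Look up a seed by codebook index and regenerate its vector on demand."""
    return lcg_vector(SEED_TABLE[index], length)
```

In this sketch the table holds four integers rather than four 40-sample vectors, which is the kind of memory saving the abstract describes; encoder and decoder regenerate identical vectors as long as they share the seeds and the generator.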


Patent
Takahiro Unno
13 Nov 2001
TL;DR: Layered code-excited linear prediction speech encoders/decoders use progressively weakening perceptual weighting filters for the enhancement layers in the encoder, progressively weakening short-term postfilters for increased bit rates (enhancement layers), and a long-term postfilter for all bit rates.
Abstract: Layered code-excited linear prediction speech encoders/decoders with progressively weakening perceptual weighting filters for the enhancement layers in the encoder and progressively weakening short-term postfilters for increased bit rates (enhancement layers) and a long-term postfilter for all bit rates.

36 citations


Journal ArticleDOI
TL;DR: Cross-channel linear prediction in multichannel perceptual audio coding is found not to provide a net coding gain, because the whitening effect of the prediction filter increases either the energy or the energy peaks in the high-frequency region.
Abstract: We have studied and concluded that the time domain cross channel prediction is generally not applicable to perceptual audio coding. From the statistical analysis, correlation among certain channels seems to be enough to provide some coding gain and the prediction also successfully reduces the total energy of the signal to be coded. But this energy decrease and its resulting bit reduction mainly happens in the low frequency bands. In fact, the whitening effect of the prediction filter increases either the energy or the energy peaks in the high frequency region. As a result, more bits are required to code the high-frequency part of the signal and this increase outpaces the bit reduction realized in the low frequency region. Therefore, using cross channel linear prediction in the multichannel perceptual audio coding does not provide a net coding gain.

35 citations


Proceedings ArticleDOI
27 Nov 2001
TL;DR: An LDA-based method for extracting optimal feature sets from codec bitstreams is proposed and it is demonstrated that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs.
Abstract: Communication devices which perform distributed speech recognition (DSR) tasks currently transmit standardized coded parameters of speech signals. Recognition features are extracted from signals reconstructed using these on a remote server. Since reconstruction losses degrade recognition performance, proposals are being considered to standardize DSR-codecs which derive recognition features, to be transmitted and used directly for recognition. However, such a codec must be embedded on the transmitting device, along with its current standard codec. Performing recognition using codec bitstreams avoids these complications: no additional feature-extraction mechanism is required on the device, and there are no reconstruction losses on the server. We propose an LDA-based method for extracting optimal feature sets from codec bitstreams and demonstrate that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs. For GSM and CELP, we show that the performance is comparable to that with uncoded speech and standard DSR-codec features.

31 citations


Patent
Yang Gao
25 Jan 2001
TL;DR: A speech-coding device includes a fixed codebook, an adaptive codebook, a short-term enhancement circuit, and a summing circuit; the short-term enhancement circuit connects the fixed codebook output to the summing circuit and can also be connected to a synthesis filter to emphasize the spectral formants in an encoder and a decoder.
Abstract: A speech-coding device includes a fixed codebook, an adaptive codebook, a short-term enhancement circuit, and a summing circuit. The short-term enhancement circuit connects an output of the fixed codebook to a summing circuit. The summing circuit adds an adaptive codebook contribution to a fixed codebook contribution. The short-term enhancement circuit can also be connected to a synthesis filter to emphasize the spectral formants in an encoder and a decoder.

27 citations


Proceedings ArticleDOI
Yang Gao, Adil Benyassine, Jes Thyssen, Huan-Yu Su, Eyal Shlomot
07 May 2001
TL;DR: This paper presents the core technology of novel enhancements to traditional CELP coding, coined eXtended CELP (eX-CELP), centered on a combined and selective usage of closed-loop and open-loop approaches; experiments showed that the technology is also successful and suitable for both high and medium bit rates.
Abstract: This paper presents the core technology of novel enhancements to traditional CELP coding, coined eXtended CELP (eX-CELP). It is centered on a combined and selective usage of closed-loop/open-loop approach, and variant algorithm structure concept. The above two concepts are complemented by new features and refined existing technologies. The eX-CELP paradigm was used in several speech coding systems. It is the core technology of the recently chosen candidate for the 3G-CDMA speech codec standard. It was the best candidate for ITU-T 4 kbps codec qualification test, and became the basis technology for a consortium candidate to the ITU-T 4 kbps speech coding competition. [...] achieve toll quality at 4 kbps, our experiments and test results showed that this technology is also successful and suitable for both high and medium bit rates. Fig. 1 and Fig. 2 illustrate the basic structure of the eX-CELP encoder and decoder. One of the main themes of the eX-CELP technology is the judicious combination of the closed-loop approach and the open-loop approach, together with a careful selective usage of them. This mechanism is coined COLA, and its main objective is to intelligently employ the most appropriate approach for different types of input signals in order to preserve the perceptually important contents. Another important feature in the eX-CELP [...]

27 citations


Proceedings ArticleDOI
07 May 2001
TL;DR: Simulations show that FE robust coding with interpolation achieves average spectral distortions 0.7-1.8 dB smaller than that of the original coders.
Abstract: Frame erasure (FE) robustness is an important quality measure for voice over IP networks (VoIP). The recovery of the erased frames from the received information is crucial to realize this robustness. We allow the lost frames to be recovered from both the "previous" and "next" good frames. We first give quantitative distortion comparisons between predictive and interpolative frame recovery. Then we add FE-robust LSF coding modes to the popular ITU G.723.1 and G.729 CELP coders. These FE-robust modes utilize intraframe LSF VQ and invoke no bit-rate increase for the G.723.1 coder and a small increase (0.4 kb/s) for G.729. Simulations show that FE robust coding with interpolation achieves average spectral distortions 0.7-1.8 dB smaller than that of the original coders. Significant quality improvement was achieved by combined implementation of FE robust coding, LSF and pitch interpolation, and a proposed fixed codebook excitation recovery method.

26 citations
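
The contrast between predictive and interpolative frame recovery in the entry above can be sketched on a toy example. This is only an illustration under stated assumptions: "predictive" is taken to mean repeating the previous good frame, "interpolative" to mean linear interpolation between the previous and next good frames, and the LSF values are made-up numbers, not data from the paper.

```python
def recover_predictive(prev_lsf):
    """Repeat the previous good frame's LSF vector (simple extrapolation)."""
    return list(prev_lsf)


def recover_interpolative(prev_lsf, next_lsf, alpha=0.5):
    """Linearly interpolate between the previous and next good frames."""
    return [(1.0 - alpha) * p + alpha * n for p, n in zip(prev_lsf, next_lsf)]


def squared_error(a, b):
    """Sum of squared differences between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Hypothetical LSF vectors (normalized frequencies) for three frames.
prev_frame = [0.10, 0.30, 0.60]
lost_frame = [0.12, 0.33, 0.62]   # what the erased frame actually contained
next_frame = [0.14, 0.36, 0.64]

err_pred = squared_error(recover_predictive(prev_frame), lost_frame)
err_interp = squared_error(recover_interpolative(prev_frame, next_frame), lost_frame)
```

For smoothly evolving parameters such as these, the interpolated estimate lands closer to the erased frame than the repeated one, at the cost of waiting for the next good frame to arrive.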


Patent
13 Mar 2001
TL;DR: An encoding and decoding method for sound signals that suppresses quality deterioration under frame loss: in CELP encoding, the synthesizing filter determined by the current frame and the encoded codes for the driving excitation signal are stored in a buffer, and periodicity information, including pitch information obtained by analyzing the buffered signal, is sent together with them.
Abstract: PROBLEM TO BE SOLVED: To provide an encoding and a decoding method for a sound signal which suppress quality deterioration in case of frame loss. SOLUTION: In CELP encoding, a synthesizing filter determined by a current frame, encoded codes for constituting a driving excitation signal, and a sound signal to be encoded after a next frame are stored in a buffer, and periodicity information, including pitch information obtained by analyzing a signal stored in the buffer, is sent together with them. In CELP decoding, if encoded codes of the next frame are lost, the sound signal of the next frame is decoded by using the encoded codes determined by the current frame and the periodicity information including the pitch information of the next frame. COPYRIGHT: (C)2002, JPO

26 citations


Journal ArticleDOI
TL;DR: A new hybrid speech coding technique is presented that combines a frequency-domain parametric coder (for stationary voiced and stationary unvoiced speech) with a time-domain waveform coder (for transition speech); harmonic spectral magnitudes are quantized using a general nonsquare transform (dimension conversion) and a weighted vector quantization approach.
Abstract: A new hybrid speech coding technique is presented in this paper, which combines a frequency-domain parametric coder (for stationary voiced and stationary unvoiced speech) with a time-domain waveform coder (for transition speech). Our hybrid coder uses a parametric representation for the excitation of a linear-prediction filter. The excitation of stationary voiced speech is a sum of harmonic cosines with interpolated magnitudes and a synthetic phase model, the excitation for stationary unvoiced speech is a spectrally shaped noise, and the excitation for transition speech is a set of signed pulses. Signal alignment when switching between the harmonic excitation of stationary voiced speech and the pulse model used for transition speech is required, and achieved by special alignment procedures. A 4 kb/s hybrid coder, which achieves high-quality reconstructed speech, is described. The 4 kb/s hybrid coder employs a neural network classifier, and a novel pitch detection and harmonic bandwidth estimation algorithm. The locations of excitation pulses for coding transitions are determined by analysis-by-synthesis. A simple and efficient dimension conversion and quantization of the harmonic spectral magnitudes of voiced speech was devised, combining the general nonsquare transform (NST) or dimension conversion and a weighted vector quantization (VQ) approach. Subjective listening tests demonstrate that the 4 kb/s hybrid coding scheme competes favorably with CELP coders at low bit-rates.

24 citations


Journal ArticleDOI
TL;DR: A novel voice-driven adaptive packet loss recovery algorithm is proposed to lessen the possible voice degradation and error propagation for analysis-by-synthesis speech coders in Internet applications.
Abstract: In this paper, a novel voice-driven adaptive packet loss recovery algorithm is proposed to lessen the possible voice degradation and error propagation for analysis-by-synthesis speech coders in Internet applications. After voicing classification, we adaptively adopt random noise generation, multiresolution excitation generation, or a pulse tracking procedure to recover the lost packets. Applying the algorithm to the G.723.1 coder, simulation results show that the proposed algorithm is superior, in subjective evaluation, to the recovery algorithm embedded in the G.723.1 standard.

21 citations


Patent
Hong-Goo Kang, Hong Kook Kim
26 Oct 2001
TL;DR: A frame erasure concealment device and method based on reestimating gain parameters for a code excited linear prediction (CELP) coder; used with the IS-641 speech coder, it improves speech quality under various channel conditions compared with a conventional extrapolation-based concealment algorithm.
Abstract: The present invention provides a frame erasure concealment device and method that is based on reestimating gain parameters for a code excited linear prediction (CELP) coder. During operation, when a frame in a stream of received data is detected as being erased, the coding parameters, especially an adaptive codebook gain gp and a fixed codebook gain gc, of the erased and subsequent frames can be reestimated by a gain matching procedure. By using this technique with the IS-641 speech coder, it has been found that the present invention improves the speech quality under various channel conditions, compared with a conventional extrapolation-based concealment algorithm.

Proceedings ArticleDOI
01 Aug 2001
TL;DR: A speech content authentication scheme is proposed that is integrated with CELP speech coders to minimize the total computational cost; it is much faster than traditional cryptographic bitstream integrity algorithms and more compatible with a variety of applications.
Abstract: A speech content authentication scheme, which is integrated with CELP speech coders to minimize the total computational cost, is proposed in this research. Speech features relevant to semantic meaning are extracted, encrypted and attached as the header information. A low cost synchronization algorithm is used to resolve mis-synchronization caused by content preserving operations. Silent and tonal regions in the speech are identified with algorithms of low complexity to enhance the precision of integrity verification. This scheme is not only much faster than traditional cryptographic bitstream integrity algorithms, but also more compatible with a variety of applications. Experimental results are collected by using the GSM-AMR speech coder, and statistical analysis is performed to calculate the false positive rate of tamper detection.

Journal ArticleDOI
TL;DR: This paper approximates the WLSD measure by the quadratically weighted measure, or weighted mean squared error (WMSE) measure, and proposes an optimal error shaping technique for LSF vector quantization, where the optimal WMSE weights are determined based on a theoretical analysis of the WLSD measure.
Abstract: This paper presents an error shaping technique for line spectrum frequency (LSF) vector quantization. The error shaping technique based on the weighted logarithm spectral distortion (WLSD) measure can be used for shaping the spectral distortion distribution of quantization error into any different curve depending on what kind of weighting function is used. However, the high computational complexity of the WLSD measure deters this error shaping technique from practical use. To solve this problem, we approximate the WLSD measure by the quadratically weighted measure or the weighted mean squared error (WMSE) measure and propose an optimal error shaping technique of LSF vector quantization. In this proposed error shaping technique, the optimal WMSE weights (i.e., the optimal weights of LSF parameters) are determined based on the theoretical analysis of the WLSD measure. Three experiments are performed to check the performance of the proposed error shaping technique. One experiment is set up by incorporating human perception into the LSF quantization and another is set up by emphasizing the human-sensitivity frequency band in lower frequency bandwidth 0-3 kHz. In the third experiment, we apply the proposed error shaping technique to the LSF quantization of a CELP coder to test how it affects the overall speech quality in an actual speech coding algorithm.
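
A minimal sketch of a codebook search under a weighted mean squared error (WMSE) measure shows how the weights steer which codeword is chosen. The weights, the two-entry codebook, and the function names below are hypothetical; the paper derives its weights from an analysis of the WLSD measure, which is not reproduced here.

```python
def wmse(x, y, weights):
    """Weighted mean squared error between two LSF vectors."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, y)) / len(x)


def quantize(x, codebook, weights):
    """Return the index of the codeword with the smallest WMSE to x."""
    return min(range(len(codebook)), key=lambda i: wmse(x, codebook[i], weights))


# Hypothetical two-dimensional example.
target = [0.20, 0.50]
codebook = [[0.22, 0.40], [0.10, 0.49]]

uniform_choice = quantize(target, codebook, [1.0, 1.0])
# Emphasizing the first (lower-frequency) component changes the winner.
shaped_choice = quantize(target, codebook, [10.0, 1.0])
```

With uniform weights the second codeword wins by a small margin; weighting the first component ten times more heavily selects the first codeword instead, which is the error-shaping effect in miniature.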

Proceedings ArticleDOI
07 May 2001
TL;DR: This paper describes an adaptive multi-rate wideband (AMR-WB) speech codec proposed for the GSM system and for the evolving third generation (3G) mobile speech services.
Abstract: This paper describes an adaptive multi-rate wideband (AMR-WB) speech codec proposed for the GSM system and also for the evolving third generation (3G) mobile speech services. The speech codec is based on SB-CELP (splitband-code-excited linear prediction) with five modes operating at bit rates from 24 kbit/s down to 9.1 kbit/s. The respective channel coding schemes are based on RSC (recursive systematic code) and UEP (unequal error protection). Both source and channel codecs are designed to be as homogeneous as possible to guarantee robust transmission on current and future mobile radio channels.

Journal ArticleDOI
TL;DR: This paper considers vector quantization of excitation gains in code-excited linear predictive (CELP) speech coders, using the average error in reconstruction of the excitation signal as the distortion measure, and derives a generalized Lloyd algorithm to design a codebook for this quantization.
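
For orientation, a generic generalized Lloyd iteration is sketched below. It alternates nearest-neighbour partitioning with centroid updates on toy two-dimensional "gain pairs". Note the swap: plain squared error is used here, not the excitation-reconstruction distortion the paper derives, and all names and data are hypothetical.

```python
def nearest(point, codebook):
    """Index of the codeword closest to point under squared error."""
    return min(
        range(len(codebook)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, codebook[i])),
    )


def lloyd(training, codebook, iterations=20):
    """Alternate nearest-neighbour partitioning and centroid updates."""
    codebook = [list(c) for c in codebook]
    for _ in range(iterations):
        # Partition step: assign each training vector to its nearest codeword.
        cells = [[] for _ in codebook]
        for point in training:
            cells[nearest(point, codebook)].append(point)
        # Centroid step: replace each codeword by the mean of its cell.
        for i, cell in enumerate(cells):
            if cell:  # an empty cell keeps its previous codeword
                dim = len(cell[0])
                codebook[i] = [sum(p[d] for p in cell) / len(cell) for d in range(dim)]
    return codebook


# Hypothetical training set of (adaptive gain, fixed gain) pairs.
training = [(0.1, 0.9), (0.2, 1.0), (0.9, 0.2), (1.0, 0.1)]
trained = lloyd(training, [(0.0, 1.0), (1.0, 0.0)])
```

Replacing the squared-error terms in `nearest` and the mean in the centroid step with the ones matched to a different distortion measure is what turns this generic loop into a measure-specific design procedure.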

Proceedings ArticleDOI
07 May 2001
TL;DR: This work considers soft reconstruction of LSF parameters in the IS-641 CELP coder transmitted over a noisy channel and proposes two schemes to exploit the interframe residual redundancies in the sequence of received parameters.
Abstract: Exploiting the residual redundancy in a source coder output stream during the decoding process has been proven to be a bandwidth efficient way to combat the noisy channel degradations. We consider soft reconstruction of LSF parameters in the IS-641 CELP coder transmitted over a noisy channel. We propose two schemes. The first scheme attempts to exploit the interframe residual redundancies in the sequence of received parameters. The second approach exploits both interframe and intraframe residual redundancies. Simulation results are provided which demonstrate the efficiency of the algorithms. Another issue addressed here is a methodology to efficiently approximate and store the residual redundancies or the a priori transition probabilities. For quantizers with high rates, calculating these probabilities requires a huge number of source samples, and storing them also requires a large amount of memory. These issues can well make the decoder design process an impractical task. The proposed method is based on the classification of the signal domain. The presented schemes provide high quality error concealment solutions for CELP coders.

Journal ArticleDOI
TL;DR: A frame erasure concealment algorithm based on reestimating gain parameters for a code-excited linear prediction (CELP) coder is proposed and it is found that the proposed algorithm improves the speech quality under various channel conditions compared with the conventional extrapolation-based concealment algorithms.
Abstract: In this paper, we propose a frame erasure concealment algorithm based on reestimating gain parameters for a code-excited linear prediction (CELP) coder. When a frame is detected as being erased, the coding parameters, especially the adaptive codebook gain and fixed codebook gain, of the erased and subsequent frames, are reestimated by a gain-matching procedure. By doing this, we can reduce the abrupt change caused in the decoded excitation signal by a simple scaling down procedure. We have applied this technique to the IS-641 speech coder and found that the proposed algorithm improves the speech quality under various channel conditions compared with the conventional extrapolation-based concealment algorithm.

Proceedings ArticleDOI
25 Jun 2001
TL;DR: A neural-network-based synthesis system for the Arabic language is presented; neural networks have the potential to give better results than traditional methods thanks to their interpolation and generalisation properties, and CELP parameters are suggested for driving the network to obtain high quality speech.
Abstract: Speech is the most natural and widespread form of human communication. That is why speech synthesis has interested researchers for decades. It turns out that developing an unlimited text-to-speech system is an enormous task. The traditional methods (synthesis by rule and synthesis by concatenation of pre-recorded sounds) used for this have not given good results. In such a situation, neural networks (NNs) have the potential to give better results thanks to their property of interpolation and their capacity of generalisation. We present a synthesis system for the Arabic language. The choice of parameters which will be used to drive the NN is very important and has a direct influence on the quality of the produced speech. Work has been done to evaluate different methods based on linear predictive coding; the resulting speech was machine-like and not intelligible. We suggest using CELP to drive the NN, which provides high quality speech.

Patent
13 Sep 2001
TL;DR: CELP-based speech coding with fine grain scalability, in which a parameter encoder generates a basic bit-stream from LPC coefficients for a frame, pitch-related information for all the sub-frames obtained by searching an adaptive codebook, and first pulse-related information for even sub-frames obtained by searching a fixed codebook.
Abstract: Methods and systems for providing a CELP-based speech coding with fine grain scalability include a parameter encoder that generates a basic bit-stream from LPC coefficients for a frame, pitch-related information for all the sub-frames obtained by searching an adaptive codebook, and first pulse-related information for even sub-frames obtained by searching a fixed codebook. The parameter encoder also generates enhancement bits, which are preceded by the basic bit-stream, from second pulse-related information for odd sub-frames. The quality of synthesized speech is improved on a basis of one additional odd sub-frame pulse, as more of the second pulse-related information in the enhancement bits is received by a decoder.

Proceedings ArticleDOI
25 Nov 2001
TL;DR: An adaptive voice playout method for handling network delay jitter in voice over packet (VOP) receivers and a novel, voicing-classification based speech extension algorithm for CELP speech coders is developed.
Abstract: We propose an adaptive voice playout method for handling network delay jitter in voice over packet (VOP) receivers. Our method allows playout delay increase during both silence periods and active speech; however, it allows playout delay decrease during silence periods only. Since the playout delay is increased during active speech, a speech extension algorithm is required. Therefore we have developed a novel, voicing-classification based speech extension algorithm for CELP speech coders. Though the complementary speech truncation algorithm is not needed for our adaptive playout mechanism, it is suitable for applications such as voice synchronization with other media.
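
The asymmetric adjustment rule stated in the abstract above (delay may grow at any time, but may shrink only during silence) can be written as a small decision function. This is a sketch of that rule only; how the target delay is estimated from network jitter is not described in the abstract, so `target` is simply a hypothetical input, and the millisecond values in the usage lines are made up.

```python
def next_playout_delay(current, target, in_silence):
    """Move the playout delay toward target under the asymmetric rule."""
    if target > current:
        # Increases are allowed during both silence and active speech
        # (during speech this is where a speech extension step is needed).
        return target
    if in_silence:
        # Decreases are allowed only during silence periods.
        return target
    # During active speech a requested decrease is deferred.
    return current


# Hypothetical delays in milliseconds.
grown_in_speech = next_playout_delay(60, 80, in_silence=False)
held_in_speech = next_playout_delay(60, 40, in_silence=False)
shrunk_in_silence = next_playout_delay(60, 40, in_silence=True)
```

Because decreases never happen during active speech, no speech truncation step is required for playout itself, which matches the abstract's remark that the truncation algorithm is not needed for the playout mechanism.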

Proceedings ArticleDOI
07 May 2001
TL;DR: A layered speech coding structure is proposed that is universally compatible with all CELP-based coders; it encodes the reconstruction error signal from layer 1 using a low-delay, adaptive tree coder based upon the mean squared error (MSE) criterion.
Abstract: Many speech coding standards are based upon code-excited linear prediction (CELP), and it is desirable to develop layered coding methods that are compatible with this installed base of coders. We propose a layered speech coding structure that is universally compatible with all CELP-based coders. This structure encodes the reconstruction error signal from layer 1 using a low-delay, adaptive tree coder based upon the mean squared error (MSE) criterion. We note that rate distortion optimal successive refinement is achievable using two different distortion criteria and we derive expressions for the rate distortion function under autoregressive Gaussian assumptions on the source and the two different distortion measures. We demonstrate the universality of the approach by developing two-layer coders for a 3.65 kbps CELP coder, G.723.1, and G.729. We show that our layering method is favorably competitive with the MPEG-4 layering method at 8.7 kbps for both clean and noisy speech. Using tree coding and the MSE criterion in layer 2 improves speech naturalness when coding noisy speech.

Patent
Tanaka, Seiko
29 May 2001
TL;DR: A speech coding and decoding system whose coder comprises a low pass filter using an LPC parameter and an efficient coding processing unit that generates a coded speech signal by referring to a code book when coding speech, and generates a noise signal by referring to the code book for the low-pass-filtered signal when coding information other than speech.
Abstract: A speech coding and decoding system consists of a speech coding system and a speech decoding system. The speech coding system comprises a low pass filter using an LPC parameter, and an efficient coding processing unit that generates a coded speech signal by referring to a code book for a speech signal when coding speech, and generates a noise signal by referring to the code book for the signal filtered by the low pass filter when coding information other than speech. The speech decoding system comprises an efficient decoding processing unit that decodes the coded signal supplied from the speech coding system so as to reproduce a speech signal, and a high pass filter using the LPC parameter that filters a speech signal of an unvoiced sound area generated by the efficient decoding processing unit.

Patent
13 Sep 2001
TL;DR: CELP-based speech coding with fine grain scalability, in which a parameter encoder generates a basic bit-stream from LPC coefficients for a frame, pitch-related information for all the sub-frames obtained by searching an adaptive codebook, and first pulse-related information for even sub-frames obtained by searching a fixed codebook.
Abstract: Methods and systems for providing a CELP-based speech coding with fine grain scalability include a parameter encoder that generates a basic bit-stream from LPC coefficients for a frame, pitch-related information for all the sub-frames obtained by searching an adaptive codebook, and first pulse-related information for even sub-frames obtained by searching a fixed codebook. The parameter encoder also generates enhancement bits, which are preceded by the basic bit-stream, from second pulse-related information for odd sub-frames. The quality of synthesized speech is improved on a basis of one additional odd sub-frame pulse, as more of the second pulse-related information in the enhancement bits is received by a decoder.

Proceedings ArticleDOI
21 Oct 2001
TL;DR: New variations on CELP speech coders that specifically enhance the quality of encoded singing for individual singers are suggested that could be used in a low-bitrate singing voice codec which, in conjunction with multi-track structured coding schemes such as MPEG-4 structured audio, could provide a highly compressed yet high-quality representation of a complex audio scene.
Abstract: The technique of code excited linear prediction (CELP) has led to the development of voice coding systems that provide toll quality speech at very low bitrates. While speech and singing share many similarities in terms of production, standard speech coding implementations fall far short when transmitting the singing voice. This paper explores the reasons for this discrepancy and suggests new variations on CELP speech coders that specifically enhance the quality of encoded singing for individual singers. These modifications could be used in a low-bitrate singing voice codec which, in conjunction with multi-track structured coding schemes such as MPEG-4 structured audio, could provide a highly compressed yet high-quality representation of a complex audio scene.

Patent
Takahiro Unno
13 Nov 2001
TL;DR: Layered CELP speech encoders have progressively weaker perceptual weighting filters for each of the successive enhancement layers, and decoders have progressively weaker short-term postfilters for increased bit rates and a long-term postfilter for all bit rates.
Abstract: Layered code-excited linear prediction (CELP) speech encoders have progressively weaker perceptual weighting filters for each of the successive enhancement layers and decoders have progressively weaker short-term postfilters for increased bit rates (increased number of enhancement layers decoded) and a long-term postfilter for all bit rates.

Dissertation
01 Jan 2001
TL;DR: The aim of the research presented here is to improve the speech quality produced by low bit rate vocoders, ideally bringing it close to that of higher bit rate CELP coders while retaining a low bit rate.
Abstract: The past decade has seen a very fast growth of the telecommunications industry. Mobile telephony has evolved from a specialist application to being commonplace and affordable, and is now a mass-market industry. A similar evolution is expected from multimedia communications, where voice, video and data are all to be integrated into one device. These services require a large amount of bandwidth, which is a relatively cheap and expandable resource in wire based fixed networks. However it is at a premium in satellite or cellular radio systems. In order to cope with the growing demand and the increasing number of subscribers, it is necessary to make optimal use of the bandwidth available. This implies using efficient source coding technologies, including speech compression algorithms. Many of the recent cellular radio communication systems have used speech coders based upon the Code Excited Linear Prediction (CELP) model. These provide high speech quality at bit rates of 8 kb/s and above, however this reduces significantly when the bit rate is lowered. Vocoders on the other hand have been used for very low bit rate applications, where they provide low quality speech. This usually restricts their use to specialised applications such as private radio or military use. The aim of the research presented here is to improve the speech quality produced by low bit rate vocoders, ideally bringing it close to that of higher bit rates CELP coders while retaining a low bit rate. In order to achieve this it has been necessary to introduce new and refined parameter estimation and quantisation techniques, which were integrated in an improved vocoder model. The resulting coder was then adapted to a range of low and very low bit rate applications, and submitted as candidates to three major standardisation efforts.

Patent
28 Feb 2001
TL;DR: A coding parameter control circuit computes the frame length from the bit rate and coding delay, and provides the computed frame length data to a CELP coding circuit.
Abstract: A coding parameter control circuit 31 computes frame length from bit rate and coding delay, and provides the computed frame length data to a CELP coding circuit 32. On the basis of the computed frame length, the coding parameter control circuit 31 selects control parameters from a table, in which a plurality of control parameters for controlling the operation of the CELP coding circuit are set, on the basis of the bit rate, and provides the selected control parameters to the CELP coding circuit. The coding parameter control circuit provides the sub-frame length and the bit number distributed to the multi-pulse signal to the multi-pulse signal generation parameter setting circuit 33. The multi-pulse signal coding parameter setting circuit 33 computes the pulse number of the multi-pulse excitation signal, pulse position candidates of each pulse and candidate positions thereof from the sub-frame length and bit number of the multi-pulse signal.

Patent
Ajit V. Rao
29 Jun 2001
TL;DR: A signal modification technique for compact voice coding uses a continuous time warp contour; the warp contour is determined from only a subset of possible contours, whose relative correlation strengths are modeled as points on a polynomial trace, and the optimum contour is found by maximizing the modeling function.
Abstract: A signal modification technique facilitates compact voice coding by employing a continuous, rather than piece-wise continuous, time warp contour to modify an original residual signal to match an idealized contour, avoiding edge effects caused by prior art techniques. Warping is executed using a continuous warp contour lacking spatial discontinuities which does not invert or overly distend the positions of adjacent end points in adjacent frames. The linear shift implemented by the warp contour is derived via quadratic approximation or other method, to reduce the complexity of coding to allow for practical and economical implementation. In particular, the algorithm for determining the warp contour uses only a subset of possible contours contained within a sub-range of the range of possible contours. The relative correlation strengths from these contours are modeled as points on a polynomial trace and the optimum warp contour is calculated by maximizing the modeling function.
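
The step of modeling a few correlation values as points on a polynomial and maximizing the model can be illustrated with a three-point quadratic fit. This is a sketch under stated assumptions: three equally spaced candidate shifts, a quadratic model, and made-up correlation values; the patent's actual candidate set and polynomial order are not given in the abstract, and `parabolic_peak` is a hypothetical name.

```python
def parabolic_peak(x_center, step, y_left, y_center, y_right):
    """Location of the maximum of the parabola through three equally spaced points.

    The points are (x_center - step, y_left), (x_center, y_center),
    and (x_center + step, y_right).
    """
    denom = y_left - 2.0 * y_center + y_right
    if denom >= 0.0:
        # The fit is not concave, so it has no maximum: fall back to the
        # best of the sampled candidates.
        candidates = [
            (y_left, x_center - step),
            (y_center, x_center),
            (y_right, x_center + step),
        ]
        return max(candidates)[1]
    offset = 0.5 * (y_left - y_right) / denom
    return x_center + offset * step


# Hypothetical correlation strengths at shifts -1, 0, +1, sampled from
# the curve -(x - 0.3)**2, whose true maximum is at x = 0.3.
best_shift = parabolic_peak(0.0, 1.0, -1.69, -0.09, -0.49)
```

Only three correlations are evaluated here, yet the estimated optimum falls between the sampled shifts, which is how evaluating a subset of contours can reduce search complexity.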

Proceedings ArticleDOI
07 Nov 2001
TL;DR: The main result is that the system based on ANNs exceeds the best current performance standard (CELP); however, its speaker dependency hinders its potential standardization.
Abstract: This paper presents a comparative study between three voice compression systems for wireless telephony, CELP (code excited linear prediction), VSELP (vector sum excited linear prediction) and GSM 06.10 (Global System for Mobile Communications), and one system based on artificial neural networks (ANNs). The main result is that the system based on ANNs exceeds the best current performance standard (CELP). However, its speaker dependency hinders its potential standardization.

Journal ArticleDOI
TL;DR: Both the objective and subjective tests show that the subjective quality of the synthesized clean speech coded by the proposed 8.4-kbps wideband MELP coder is comparable to that of the ITU G.722 coding standard for digital transmission of wideband audio signals.