
Showing papers on "Adaptive Multi-Rate audio codec" published in 1989


PatentDOI
TL;DR: A speech codec operating at low data rates uses an iterative method to jointly optimize pitch and gain parameter sets, optionally with a 26-bit spectrum filter coding scheme, and allocates bits to the pitch and excitation signals dynamically, depending on whether those signals are significant.
Abstract: A speech codec operating at low data rates uses an iterative method to jointly optimize pitch and gain parameter sets. A 26-bit spectrum filter coding scheme may be used, involving successive subtractions and quantizations. The codec may preferably use a decomposed multipulse excitation model, wherein the multipulse vectors used as the excitation signal are decomposed into position and amplitude codewords. Multipulse vectors are coded by comparing each vector to a reference multipulse vector and quantizing the resulting difference vector. An expanded multipulse excitation codebook and associated fast search method, optionally with a dynamically-weighted distortion measure, allow selection of the best excitation vector without memory or computational overload. In a dynamic bit allocation technique, the number of bits allocated to the pitch and excitation signals depends on whether the signals are "significant" or "insignificant". Silence/speech detection is based on an average signal energy over an interval and a minimum average energy over a predetermined number of intervals. An adaptive post-filter and automatic gain control schemes are also provided. Interpolation is used for spectrum filter smoothing, and an algorithm is provided for ensuring stability of the spectrum filter. Specially designed scalar quantizers are provided for the pitch gain and excitation gain.

110 citations
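
The silence/speech detection described in the abstract above (an average energy per interval compared against a minimum average energy over a number of intervals) might be sketched as follows. This is only an illustration: the interval length, the history size, the threshold `ratio`, and the decision rule itself are assumptions, not details taken from the patent.

```python
from collections import deque


def detect_speech(samples, interval_len=160, history=50, ratio=4.0):
    """Return one bool per interval: True if the interval looks like speech.

    Hypothetical rule: an interval counts as speech when its average energy
    exceeds `ratio` times the minimum average energy seen over the last
    `history` intervals (used here as a noise-floor estimate).
    """
    recent = deque(maxlen=history)  # average energies of recent intervals
    flags = []
    for start in range(0, len(samples) - interval_len + 1, interval_len):
        frame = samples[start:start + interval_len]
        energy = sum(x * x for x in frame) / interval_len
        recent.append(energy)
        floor = min(recent)  # minimum average energy over the history
        flags.append(energy > ratio * floor + 1e-12)
    return flags


quiet = [0.01] * (160 * 3)
loud = [1.0] * (160 * 2)
print(detect_speech(quiet + loud + quiet))
```

Tracking the minimum rather than the mean lets the noise-floor estimate stay low during long stretches of speech, which is one plausible reason for the two-quantity design.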


Proceedings ArticleDOI
Karl Hellwig1, Peter Vary1, D. Massaloux, J. P. Petit, C. Galand, M. Rosso 
27 Nov 1989
TL;DR: The speech coding scheme selected by the CEPT Groupe Spécial Mobile (GSM) as the standard for the European mobile radio system, following formal subjective listening tests, is based on the regular-pulse excitation linear predictive coding technique (RPE-LPC) combined with long-term prediction (LTP).
Abstract: The speech coding scheme which will be used as the standard for the European mobile radio system has been selected by the CEPT Groupe Spécial Mobile (GSM) as a result of formal subjective listening tests. It is based on the regular-pulse excitation linear predictive coding technique (RPE-LPC) combined with long-term prediction (LTP). The solution is called the RPE-LTP codec. The codec algorithm and the error protection scheme are presented. The net bit rate is 13.0 kb/s, and the gross bit rate, including error protection, is 22.8 kb/s. The experimental implementation based on VLSI signal processors is described. The speech quality obtained with the technique considered is far superior to that obtainable with present-day analog mobile radio systems. A duplex speech codec including error protection can be implemented with two VLSI signal processors with external data memories of about 1K × 16 bits.

65 citations
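
The net and gross bit rates quoted for the RPE-LTP codec can be turned into per-frame bit budgets with a little arithmetic. The 20 ms frame duration used below is an assumption taken from the GSM full-rate standard; the abstract itself does not state it.

```python
FRAME_MS = 20  # assumption: 20 ms speech frame, as in the GSM full-rate standard


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Convert a bit rate in kb/s to bits per frame (kb/s x ms = bits)."""
    return round(rate_kbps * frame_ms)


net_bits = bits_per_frame(13.0)          # speech coder output per frame
gross_bits = bits_per_frame(22.8)        # after error protection
protection_bits = gross_bits - net_bits  # redundancy added per frame
code_rate = net_bits / gross_bits        # overall channel code rate

print(net_bits, gross_bits, protection_bits, round(code_rate, 3))
```

Under that assumption, each frame carries 260 speech bits and 456 bits in total, so a little over 40% of the gross rate is spent on error protection.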


Patent
Richard D. Gitlin1, John Hartung1
02 Mar 1989
TL;DR: A codec communicates its presence to another codec on its high-bit-rate side of the connection by inserting predetermined synchronization patterns into the signals it outputs in that direction, so that only one encoding/decoding cycle is performed across a tandem connection.
Abstract: In a private telecommunications network, a plurality of digital PBXs are interconnected via pairs of codecs. The codecs of each pair, or "tandem", are each operative to encode 64 kilobit/second (kbps) mu-law speech so as to compress it to 16 kbps speech for transmission to the other codec in the tandem. The latter is operative to thereafter decompress the 16 kbps speech back to 64 kbps. Each codec has a second mode of operation in which, rather than decode the encoded speech, it preserves the bits thereof in its own output signal. The codec transitions to this mode whenever it recognizes the presence of another codec on its high-bit-rate side of the connection. As a result, only one encoding/decoding cycle is performed across the connection, thereby minimizing the speech-coding-induced distortion and delay therein. The mechanism enabling a codec to communicate its presence to another codec on its high-bit-rate side of the connection is based on the transmission of predetermined synchronization patterns inserted in the signals it outputs in that direction. In a second embodiment, codecs of the above-described type are used in a cellular mobile radio telecommunications system.

56 citations


Proceedings ArticleDOI
23 May 1989
TL;DR: The authors explore the benefits of time-varying bit allocation to excitation and LPC (linear predictive coding) parameters for the case of codebook-excited LPC, finding that gains due to variable bit allocation were most noticeable in the 6.4 kb/s system, especially with female speakers.
Abstract: The authors explore the benefits of time-varying bit allocation to excitation and LPC (linear predictive coding) parameters for the case of codebook-excited LPC. The overall bit rate in the experiment was 4.8, 6.4, or 8.0 kb/s. In each case, permissible bit rates for the LPC component were 0, 24, 36, or 48 bits per frame, one of which was selected for each speech frame using a brute-force search for maximum performance. Average SNR gains over conventional time-invariant methods were modest, on the order of 1 to 2 dB, but gains for certain speech segments were as high as 3 to 5 dB. Perceptually, gains due to variable bit allocation were most noticeable in the 6.4 kb/s system, especially with female speakers. However, even in this case, the benefits of flexible bit allocation were somewhat offset by distortions due to other inadequacies in the coding algorithm.

36 citations


Proceedings ArticleDOI
23 May 1989
TL;DR: A novel approach to narrow- and medium-band speech coding that can dynamically balance the transmission rate between the excitation and the spectral parameters is introduced, improving the subjective speech quality.
Abstract: The authors introduce a novel approach to narrow- and medium-band speech coding that can dynamically balance the transmission rate between the excitation and the spectral parameters. The coding algorithm, called multimode coding, operates several coding blocks in parallel, each with a different bit assignment, and selects the optimum coding block frame by frame based on an evaluation of the reproduced speech quality. This coding algorithm is applied to 4.8 and 8.0 kb/s CELP coders, and 2.0-2.4 dB of SNRseg improvement is achieved over conventional CELP coders. The spectral distortion measure is added as an evaluation function, improving the subjective speech quality.

27 citations
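
The frame-by-frame selection idea behind multimode coding can be illustrated with a small sketch. The two "modes" below are toy stand-ins (a coarse quantizer and a sample-and-hold), not CELP coders, and plain per-frame SNR is used as the quality measure; the paper's actual coding blocks and its SNRseg and spectral distortion criteria are not reproduced here.

```python
import math


def snr_db(original, decoded):
    """Signal-to-noise ratio of one decoded frame, in dB."""
    signal = sum(x * x for x in original)
    noise = sum((x - y) ** 2 for x, y in zip(original, decoded))
    if noise == 0:
        return float("inf")
    if signal == 0:
        return float("-inf")
    return 10.0 * math.log10(signal / noise)


def multimode_encode(frames, modes):
    """Run every mode on every frame; keep (mode index, output) of the best."""
    choices = []
    for frame in frames:
        candidates = [mode(frame) for mode in modes]
        scores = [snr_db(frame, c) for c in candidates]
        best = max(range(len(modes)), key=lambda i: scores[i])
        choices.append((best, candidates[best]))
    return choices


# Hypothetical toy modes standing in for differently configured coders.
def coarse_quantizer(frame, step=0.5):
    return [step * round(x / step) for x in frame]


def sample_and_hold(frame, factor=2):
    return [frame[i - i % factor] for i in range(len(frame))]


frames = [[0.3, 0.3, 0.7, 0.7], [0.5, 1.0, 0.0, -0.5]]
for mode_index, decoded in multimode_encode(frames, [coarse_quantizer, sample_and_hold]):
    print(mode_index, decoded)
```

In a real coder the chosen mode index would have to be sent to the decoder as side information, which is part of the rate trade-off.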


Proceedings ArticleDOI
23 May 1989
TL;DR: The authors propose a split-band coder structure, where both bands are coded with analysis-by-synthesis techniques in order to take advantage of their high coding gain, and show that the obtained speech quality is close to the original.
Abstract: The problem of coding speech at 7 kHz is considered. A possible application of these codecs could be the videophone, especially for hands-free telephone use. The authors propose a split-band coder structure, where both bands are coded with analysis-by-synthesis techniques in order to take advantage of their high coding gain. Two techniques are considered: multipulse coding and codebook-excited linear prediction. The structures of two possible codecs are described, and indications are given of the considerations that led to their design. Their main characteristic is the use, as far as possible, of excitation model parameters optimized within the analysis-by-synthesis loop, while still maintaining a reasonable computational complexity. Subjective and objective results, obtained with high-quality speech, are reported. They show that the obtained speech quality is close to the original.

26 citations


Proceedings ArticleDOI
11 Jun 1989
TL;DR: A novel universal forward error correction (NUFEC) scheme, comprising direct high-coding-rate convolutional code generators and variable-rate Viterbi decoders, is proposed, and by using these NUFEC VLSIs, a compact, high-speed, and high-coding-gain FEC codec has been developed.
Abstract: A novel universal forward error correction (NUFEC) scheme, comprising direct high-coding-rate convolutional code generators and variable-rate Viterbi decoders, is proposed. To make this NUFEC scheme available in various communication systems, three kinds of CMOS VLSIs have been developed. They can operate up to 25 Mb/s and are easily applicable to various coding-rate convolutional codes. Furthermore, by using these NUFEC VLSIs, a compact, high-speed, and high-coding-gain FEC codec (NUFEC codec) has been developed. Experimental results on the Pe performance of this codec have shown powerful coding gains of 5.5, 4.5, 3.4, and 2.2 dB at Pe = 1 × 10^-6 in coding rates of 1/2, 3/4, 7/8, and 15/16, respectively.

25 citations



Proceedings ArticleDOI
27 Nov 1989
TL;DR: The implementation of a video codec with a programmable parallel digital signal processor (DSP) is described and the advantages of DSP implementation over dedicated VLSI implementation are discussed.
Abstract: The implementation of a video codec with a programmable parallel digital signal processor (DSP) is described. The advantages of DSP implementation over dedicated VLSI implementation are discussed. System design concepts and the architecture for a multicomputer-type DSP system called NOVI are reviewed. This system has been developed as a prototype for program development and performance evaluation of parallel processing systems. An intraframe DCT (discrete cosine transform) codec is implemented on NOVI as an example of a transform-type video codec. The processing speed of the implemented codec is examined. This codec can process 9 frames/s (one frame consists of a 256 × 240 pixel image) using 34 PEs (processing elements): 16 PEs for the coder, 16 PEs for the decoder, and 2 PEs for the video I/O. Packet assembly and multiplexing are also performed by the codec.

8 citations


Proceedings ArticleDOI
23 May 1989
TL;DR: A source coding scheme for high-quality audio signals based on adaptive transform coding is presented, that features psychoacoustic weighting and a joint bit allocation to stereo channels.
Abstract: A source coding scheme for high-quality audio signals based on adaptive transform coding is presented. Using the well-known ATC algorithm of R. Zelinski and P. Noll (1977) as a starting point, the authors have developed an algorithm that features psychoacoustic weighting and a joint bit allocation to stereo channels. The real-time implementation uses a four-node processor network, each node consisting of a T800 transputer and a DSP32 signal processor. Both the codec optimization and the evaluation of subjective quality are facilitated by the fact that many of the codec parameters can be changed online.

7 citations


Proceedings ArticleDOI
27 Nov 1989
TL;DR: A robust stereo HiFi audio codec designed for 384 kb/s ISDN (integrated services digital network) subrate (H0) channel applications has been developed and the coding algorithm brings about the robustness to withstand transmission bit errors in actual transmission environments such as satellite communications.
Abstract: A robust stereo HiFi audio codec designed for 384 kb/s ISDN (integrated services digital network) subrate (H0) channel applications has been developed. The codec is based on subband ADPCM (adaptive differential pulse code modulation). In order to realize high coding quality, a new adaptive bit assignment function is introduced which needs no side-information transmission. Moreover, the coding algorithm brings about the robustness to withstand transmission bit errors in actual transmission environments such as satellite communications, where conventional adaptive bit assignment is inherently inferior. The codec is implemented using floating-point LSI signal processors. Subjective test results show that the coding quality is comparable to that of compact discs. The proposed algorithm can be applied to a basic ISDN channel by limiting the signal bandwidth to 15 kHz and transmitting FM-quality monophonic signals at 128 kb/s.


Journal ArticleDOI
TL;DR: This is the first VLSI realization of a DPCM codec with adaptive quantizer, and correct operation has been verified up to 26 MHz.
Abstract: A differential pulse-code modulation (DPCM) video codec with two-dimensional intrafield prediction and adaptive quantizer is presented. An approach for the arithmetic implementation of the DPCM structure and the design of a test chip, fabricated in a 1.5 µm CMOS technology, is described. This is the first VLSI realization of a DPCM codec with adaptive quantizer. For the test chip, transmitter or receiver mode, application as part of a three-dimensional interframe codec, and processing of luminance or chrominance signals are optional. A line buffer and ten different quantizer characteristics are realized on-chip. Correct operation has been verified up to 26 MHz.


Proceedings ArticleDOI
01 Nov 1989
TL;DR: A specialized multiprocessor environment for hybrid coding of visual communications signals, covering the range from ISDN basic access to primary rate transmission channels and built around a proprietary 4-wide SIMD parallel video processor with 80 MIPS, is described together with the software philosophy of the codec.
Abstract: The first part of the paper describes a specialized multiprocessor environment for hybrid coding of visual communications signals in the range from ISDN basic access to primary rate transmission channels. Most important is a proprietary 4-wide SIMD parallel video processor with 80 MIPS. The second part deals with the software philosophy of the codec. It uses preanalysis and prebuffering in the first phase of coding a frame. In the second phase, limited processing power and available channel bits are distributed optimally over time and over changed areas of one frame. Codec delay is halved with respect to conventional codecs.

Proceedings Article
18 Jul 1989
TL;DR: This paper describes a single-chip video picture motion estimator that operates upon a CIF specification video frame, provides motion vectors for 8 × 8 pixel blocks for displacements of up to ±16 pixels, and can be cascaded for 16 × 16 block operation.
Abstract: Motion estimation is a key function in a video codec intended for use with low bit rate data transmission. Motion compensated codecs have received much attention in the literature recently since they provide a means by which high picture quality can be achieved with an economic bandwidth. This paper describes a single-chip video picture motion estimator that operates upon a CIF specification video frame and provides motion vectors for 8 × 8 pixel blocks for displacements of up to ±16 pixels and which can be cascaded for 16 × 16 block operation. The very close match between function, algorithm and architecture has allowed significant processing power to be implemented on a single chip.
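
Block-based motion estimation of the kind described above can be sketched as an exhaustive search. The abstract does not say which matching criterion or search strategy the chip uses, so the full search and the sum of absolute differences (SAD) below are assumptions chosen for simplicity; the function names are illustrative only.

```python
def sad(cur, ref, bx, by, dx, dy, n):
    """Sum of absolute differences between the current block at (bx, by)
    and the reference block displaced by (dx, dy)."""
    total = 0
    for y in range(n):
        for x in range(n):
            total += abs(cur[by + y][bx + x] - ref[by + dy + y][bx + dx + x])
    return total


def motion_vector(cur, ref, bx, by, n=8, max_disp=16):
    """Full search over displacements in [-max_disp, +max_disp] for one block.

    Returns ((dx, dy), cost) for the best match that lies inside `ref`.
    """
    height, width = len(ref), len(ref[0])
    best = (0, 0)
    best_cost = sad(cur, ref, bx, by, 0, 0, n)
    for dy in range(-max_disp, max_disp + 1):
        for dx in range(-max_disp, max_disp + 1):
            inside = (0 <= by + dy and by + dy + n <= height and
                      0 <= bx + dx and bx + dx + n <= width)
            if not inside:
                continue  # skip candidates that fall outside the frame
            cost = sad(cur, ref, bx, by, dx, dy, n)
            if cost < best_cost:
                best_cost, best = cost, (dx, dy)
    return best, best_cost


# Synthetic test pattern: the current frame is the reference shifted by (3, 2).
ref = [[(3 * x * x + 5 * y * y + x * y) % 251 for x in range(32)] for y in range(32)]
cur = [[ref[min(y + 2, 31)][min(x + 3, 31)] for x in range(32)] for y in range(32)]
print(motion_vector(cur, ref, 8, 8, n=8, max_disp=4))
```

A ±16 search over 8 × 8 blocks evaluates up to 33 × 33 candidate positions per block, which gives a sense of why a dedicated chip was attractive.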

Proceedings ArticleDOI
01 Nov 1989
TL;DR: The Block-list-transform (BLT) coding algorithm, a generalized DPCM, is used to encode the frame differences between the previously reconstructed picture and the current picture, and preliminary results show reasonable coding performance on typical videophone sequences.
Abstract: A simple but efficient video codec is built for low-bit-rate videophone applications. The goal of this project is to construct a video codec at a low cost and good performance. In order to reduce hardware complexity, a simple interframe predictive coding structure is selected. The input pictures are partitioned into 2-D blocks, and only the blocks with significant frame differences are coded and transmitted to the decoder. The Block-list-transform (BLT) coding algorithm, a generalized DPCM, is used to encode the frame differences between the previously reconstructed picture and the current picture. Preliminary results using this coding system at 64 kbits/sec show reasonable coding performance on typical videophone sequences.


01 Aug 1989
TL;DR: Testing of low rate voice digitizing coder/decoders (CODECs) for use with the Aeronautical Mobile Satellite Service (AMSS) shows that the intelligibility of the low rate 4.8 kilobits per second CODECs is essentially equivalent to the intelligibility of the 9.6 kbps CODEC.
Abstract: The FAA is currently evaluating low rate voice digitizing coder/decoders (CODECs) for use with the Aeronautical Mobile Satellite Service (AMSS). Phase II of this evaluation consisted of air traffic control (ATC) personnel participating in an objective intelligibility test of several CODECs under operational conditions. The results of the testing show that the intelligibility of the low rate 4.8 kilobits per second (kbps) CODECs is essentially equivalent to the intelligibility of the 9.6 kbps CODEC. The results also show that the 4.8 kbps CODECs can operate with high intelligibility under conditions of high bit error rates and operational background noise. Keywords: Air traffic controllers; Digital voice communications; Voice coding.

Journal ArticleDOI
TL;DR: The performance results of a low-bit-rate image codec for videophone and B/2B channel ISDN applications are presented, and the resulting video sequences indicate that this codec can be a good candidate for a videophone in an ISDN network.
Abstract: This article presents the performance results of a low-bit-rate image codec for videophone and B/2B channel ISDN applications. At the encoder, the codec skips two of every three frames, encodes the remaining frame using DCT-Huffman coding, and sends it together with motion vectors computed over overlapped blocks. At the receiver, the missing frames are constructed by linear interpolation using the coded frames and their respective motion vectors. The effectiveness of the codec is evaluated by employing two video sequences as input. The resulting video sequences indicate that, for videophone applications, an acceptable image quality is obtained. Therefore, this codec can be a good candidate for a videophone in an ISDN network. Finally, the implementation considerations of this codec are also given.
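
The receiver-side reconstruction of skipped frames can be illustrated with a deliberately simplified sketch: plain linear interpolation between two decoded key frames. The codec in the article also uses motion vectors during interpolation; that step is omitted here, and the function name and frame representation (lists of pixel rows) are assumptions.

```python
def interpolate_frames(prev_key, next_key, skipped=2):
    """Rebuild `skipped` missing frames between two decoded key frames.

    Each missing frame is a per-pixel linear blend of its two neighbours.
    Motion compensation is intentionally left out of this sketch.
    """
    frames = []
    for k in range(1, skipped + 1):
        t = k / (skipped + 1)  # temporal position between the key frames
        frames.append([
            [(1 - t) * a + t * b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(prev_key, next_key)
        ])
    return frames


prev_key = [[0, 30], [60, 90]]
next_key = [[30, 0], [90, 60]]
for frame in interpolate_frames(prev_key, next_key):
    print(frame)
```

With two of every three frames skipped, each reconstructed frame sits one third or two thirds of the way between its neighbouring key frames.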

09 Oct 1989
TL;DR: Describes a hybrid speech coding system, the vector transform coder (VTC), based on three of the most efficient coding techniques available, which is capable of producing very good quality coded speech at 16 and 9.6 kbits/s.
Abstract: Describes a hybrid speech coding system, the vector transform coder (VTC), based on three of the most efficient coding techniques available. These are transform coding, subband coding and vector quantisation. Transform coding and subband coding are both frequency-domain techniques. Vector quantisation is a block coding method and the major motivation for using it is a fundamental result of Shannon's rate-distortion theory which indicates that better performance is always achieved by coding data in blocks (vectors) rather than as scalars. This system is capable of producing very good quality coded speech at 16 and 9.6 kbits/s. In recent work, comb filtering and artificial neural network (ANN) methods have been employed in order to improve the efficiency of this system at 4.8 kbits/s. New methods have been developed for the design of vector quantiser codebooks as well as for the continuous updating of these quantisers. Due to the adaptive nature of these methods, considerable improvement has been achieved in the quality of the coded speech and a total increase of about 0.9 dB has been obtained in the signal-to-noise ratio at 4.8 kbits/s. The coder has recently been implemented in real time by using an AT&T DSP32C digital signal processor chip.
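
The vector quantisation step mentioned above amounts to replacing each block of values with the index of its nearest codebook entry. The sketch below assumes a fixed, hand-written codebook and a squared-error distance; the codebook design and adaptive updating methods from the paper are not shown.

```python
import math


def quantize(vector, codebook):
    """Return (index, codeword) of the codebook entry nearest to `vector`
    under squared Euclidean distance."""
    def distance(codeword):
        return sum((v - c) ** 2 for v, c in zip(vector, codeword))

    index = min(range(len(codebook)), key=lambda i: distance(codebook[i]))
    return index, codebook[index]


# Hypothetical 4-entry codebook for 2-dimensional vectors.
codebook = [[0.0, 0.0], [1.0, 1.0], [1.0, -1.0], [-1.0, 0.0]]
bits_per_vector = math.ceil(math.log2(len(codebook)))

index, codeword = quantize([0.9, 0.8], codebook)
print(index, codeword, bits_per_vector)
```

Only the index is transmitted, so a codebook of N entries costs about log2(N) bits per vector regardless of the vector dimension.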

Proceedings ArticleDOI
11 Apr 1989
TL;DR: CELP coding is a mature technique able to yield high-quality synthetic speech with some background noise at the mid-rate range, and has a tractable computational complexity and good robustness, making it the best present candidate.
Abstract: The authors present the results of a feasibility study comparing two basic classes of coders in the mid-rate range (6-8 kbps): CELP and sinusoid-based coders. It is concluded that CELP coding is a mature technique able to yield high-quality synthetic speech with some background noise at the mid-rate range. It has a tractable computational complexity and good robustness, making it the best present candidate. Postprocessing can enhance its output quality, though further research is needed in this field. Sinusoidal coding, together with the use of narrowband basis functions, can produce synthetic speech of higher quality than that produced by CELP. Its robustness, however, must be increased. This work suggests that a significant research effort into this technique is fully justified.

Proceedings ArticleDOI
11 Apr 1989
TL;DR: A novel method of reduced complexity is described which permits an efficient implementation on a commercial DSP (digital signal processor) in terms of computation time and code compactness.
Abstract: The regular pulse excitation technique for speech coding at low bit rates provides good speech quality but involves a heavy computational effort. A novel method of reduced complexity is described which permits an efficient implementation on a commercial DSP (digital signal processor) in terms of computation time and code compactness. The algorithm uses vector quantization for a vocal tract all-pole model. A complete implementation of an 8 kb/s codec working in full-duplex mode is performed on the Motorola mathematical µP DSP56000. According to many informal listening tests, the speech quality at 8 kb/s is very close to that achieved at 64 kb/s using standard pulse-code modulation.