
Showing papers on "Adaptive Multi-Rate audio codec" published in 1996


Journal Article
TL;DR: An Analysis/Synthesis Audio Codec (ASAC) is presented, which allows the coding of audio signals at very low bit rates for applications like mobile communication or multimedia database access via modem and analog telephone lines.
Abstract: An Analysis/Synthesis Audio Codec (ASAC) is presented, which allows the coding of audio signals at very low bit rates for applications like mobile communication or multimedia database access via modem and analog telephone lines. Compression with bit rates between 6 kbit/s and 24 kbit/s is addressed. Furthermore the implementation of special effects like independent pitch change and speed change in the decoder is described.

70 citations



Proceedings Article
07 May 1996
TL;DR: A split-band encoding scheme for 16 kbit/s wideband speech coding (50-7000 Hz), using 2 unequal subbands from 0-6 kHz and from 6-7 kHz, which was motivated by an experimental evaluation of the signal bandwidth of speech frames.
Abstract: We propose a split-band encoding scheme for 16 kbit/s wideband speech coding (50-7000 Hz), using 2 unequal subbands from 0-6 kHz and from 6-7 kHz. This approach was motivated by an experimental evaluation of the signal bandwidth of speech frames. The higher subband is simply represented by white noise with adjustment of the short term energy. For the lower subband, code-excited linear prediction (CELP) is used. The analysis filter bank, which performs the unequal band splitting combined with critical subsampling of the sub-bands, is described. A bit error concealment technique and the bit allocation are also presented. In informal listening tests, the speech quality was rated higher than that of the CCITT G.722 wideband codec operating at 48 kbit/s.

31 citations
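
The noise substitution described for the upper subband of the split-band coder above can be illustrated with a minimal sketch. It assumes that only the short-term energy of each upper-band frame is needed and that the decoder regenerates the band as scaled white noise. The function names, the 80-sample frame length, and the test signal are hypothetical and do not come from the paper.

```python
import math
import random


def frame_energy(frame):
    """Mean squared amplitude of one frame (the short-term energy)."""
    return sum(x * x for x in frame) / len(frame)


def noise_substitute(frame, rng):
    """Return white Gaussian noise scaled to match the frame's energy."""
    noise = [rng.gauss(0.0, 1.0) for _ in frame]
    noise_energy = frame_energy(noise)
    if noise_energy == 0.0:
        return [0.0] * len(frame)
    gain = math.sqrt(frame_energy(frame) / noise_energy)
    return [gain * n for n in noise]


rng = random.Random(0)  # fixed seed, only so the sketch is repeatable
upper_band = [0.1 * math.sin(0.3 * n) for n in range(80)]  # hypothetical frame
decoded = noise_substitute(upper_band, rng)
```

The decoded frame has a different waveform from the input but the same short-term energy, which is the only property this sketch preserves.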


Proceedings Article
16 Sep 1996
TL;DR: This work replaces the arithmetic coding portion of the Taubman codec with block coding, compares the encode/decode speed of this new coder with MPEG, and finds the new encoder to be one order of magnitude faster than MPEG-1.
Abstract: Scalable video compression is becoming increasingly important in today's diverse, heterogeneous networks. In a previous work, Taubman and Zakhor (see IEEE Transactions on Image Processing, vol.3, no.5, p.572-88, 1994) developed a scalable codec capable of generating bit rates from tens of kilobits per second to several megabits per second with fine granularity of the available bit rates. This codec is based on 3-D subband coding and multi-rate quantization of subband coefficients, followed by arithmetic coding. We replace the arithmetic coding portion of the Taubman codec with block coding, and compare the encode/decode speed of this new coder with MPEG. Unlike MPEG, this codec requires symmetric computational power at the decoder and encoder and as such is useful in software-only, real-time, interactive video applications. We have found the encoding speed of the new encoder to be one order of magnitude faster than MPEG-1, without significant loss in compression efficiency.

25 citations


Patent
31 Jan 1996
TL;DR: In this patent, a speech encoder/decoder (CODEC) converts the analog signal to a digital signal for digital FM audio processing, and later converts the digital FM audio signal back into an analog waveform for conversion to sound.
Abstract: A novel and improved digital FM audio processor for use in a dual-mode communication system selectively operative in either FM or code division multiple access (CDMA) modes. Analog voice or voice-band data is input to a speech encoder/decoder (CODEC) which converts the analog signal to a digital signal. The digital FM signal is read from the CODEC, filtered, compressed, up-sampled and combined with a transponded SAT signal and then modulated for RF transmission. On the receive side, the FM analog signal is received, demodulated, down-sampled, expanded, and filtered before being converted to the proper format (μ-law, a-law, or linear) for the speech CODEC. The CODEC then converts the digital FM audio signal into an analog waveform for conversion to sound. By performing the FM audio processing digitally, the same digital signal processing (DSP) firmware may be integrated on the same application specific integrated circuit (ASIC) which is capable of performing audio processing of both FM and CDMA audio signals.

20 citations


Proceedings Article
12 May 1996
TL;DR: A novel approach to a two-stage wavelet packet based scalable audio coding system is presented and two different structures have been designed and implemented, one in the time-domain, and its dual in the wavelet-domain; these are compared with an MPEG based scalable codec.
Abstract: Scalability, a well known concept in video coding, has only recently been introduced to audio coding. In this paper, a novel approach to a two-stage wavelet packet based scalable audio coding system is presented. Two different structures have been designed and implemented, one in the time-domain, and its dual in the wavelet-domain; these are compared with an MPEG based scalable codec. Results at different bit-rates are shown, while trade-offs and limitations together with future developments for further reduced bit-rates are discussed.

17 citations


Book Chapter
01 Jan 1996
TL;DR: Comparison with both the standard JPEG coder and the RM8 implementation of the standard H.261 video codec shows that the presented codec provides improvements in both the peak signal-to-noise ratio and the picture quality.
Abstract: Publisher Summary: This chapter discusses a coding method that is suitable for multimedia applications. This method is based on the efficient coding of image wavelet coefficients using zerotree multi-stage lattice vector quantization (VQ). This method is referred to as successive approximation wavelet VQ (SA-W-VQ). The basic idea in SA-W-VQ is that the original blocks of wavelet coefficients are successively refined based on vectors of progressively decreasing magnitude and a finite set of prototype orientations. Block zero-tree prediction and adaptive arithmetic coding are incorporated to improve the efficiency of the codec. The chapter explains that this coding scheme achieves high compression ratios with good picture quality, while maintaining a very simple implementation. Simulation results are provided to evaluate the coding performance of the described coding scheme for still image and low bit rate video coding. Comparison with both the standard JPEG coder and the RM8 implementation of the standard H.261 video codec shows that the presented codec provides improvements in both the peak signal-to-noise ratio and the picture quality.

16 citations


Book Chapter
James D. Johnston
01 Jan 1996
TL;DR: Recently, there has been a burst of work in coding of audio signals with an analog bandwidth of 20 Hz to 20 kHz and amplitude resolution of 16 bits or more, and these coders use knowledge of the perception process as their guide for lossy coding.
Abstract: Recently, there has been a burst of work in coding of audio signals. Digital audio signals are typically signals with an analog bandwidth of 20 Hz to 20 kHz and amplitude resolution of 16 bits or more. Digital audio coders are not, for the most part, lossless coders in the information-theoretic sense. In fact they are quite lossy, using knowledge of the human auditory system in order to remove parts of the signal that the human auditory system cannot distinguish. Because they use knowledge of the perception process as their guide for lossy coding, these coders are commonly referred to as “perceptual coders,” and the part of the signal they remove is referred to as “irrelevant.”

12 citations


Proceedings Article
05 May 1996
TL;DR: A low-power 16-bit DSP has been developed to realize a low bit-rate speech codec, and the PDC half-rate speech codec is implemented in the DSP with 36 mW at 1.8 V.
Abstract: A low-power 16-bit DSP has been developed to realize a low bit-rate speech codec. A dual datapath architecture and low-power circuit design techniques are employed to reduce power consumption. The PDC half-rate speech codec is implemented in the DSP with 36 mW at 1.8 V.

9 citations



Proceedings Article
28 Apr 1996
TL;DR: This work has developed a programmable 8-16 kbit/s low-delay speech codec, which is compatible with the G.728 16 kbit/s ITU codec at its top rate and offers a graceful trade-off between the speech quality and bit rate in the 8-16 kbit/s range.
Abstract: The intelligent, adaptively reconfigurable wireless systems of the near future require programmable source codecs in order to optimally configure the transceiver to adapt to time-variant channel and traffic conditions. Hence we developed a programmable 8-16 kbit/s low-delay speech codec, which is compatible with the G.728 16 kbit/s ITU codec at its top rate and offers a graceful trade-off between the speech quality and bit rate in the 8-16 kbit/s range. The issues of robustness against channel errors strongly influenced the algorithmic design of the 8-16 kbit/s speech codec, and hence special attention is devoted to these issues. Source-matched Bose-Chaudhuri-Hocquenghem (BCH) codecs combined with unequal protection pilot-assisted 4- and 16-level quadrature amplitude modulation (4-QAM, 16-QAM) are employed in order to transmit both the 8 and the 16 kbit/s coded speech bits at a signalling rate of 10.4 kBd. In a bandwidth of 1728 kHz, which is used by the Digital European Cordless Telephone (DECT) system, 55 duplex or 110 simplex time slots can be created. Good toll quality speech is delivered in an equivalent bandwidth of 15.71 kHz, if the channel signal-to-noise ratio (SNR) and signal-to-interference ratio (SIR) are in excess of about 18 and 26 dB for the lower and higher speech quality 4-QAM and 16-QAM modes, respectively.
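
The equivalent bandwidth quoted in the abstract above appears to follow from dividing the DECT bandwidth by the number of simplex time slots. The short calculation below reproduces that figure. Reading 15.71 kHz as 1728 kHz / 110 is an inference from the numbers in the abstract, and the constant and function names are hypothetical.

```python
DECT_BANDWIDTH_KHZ = 1728.0   # total bandwidth stated in the abstract
SIMPLEX_SLOTS = 110           # simplex time slots stated in the abstract
DUPLEX_SLOTS = SIMPLEX_SLOTS // 2  # two simplex slots form one duplex slot


def equivalent_bandwidth_khz(total_khz, slots):
    """Bandwidth share of one time slot, assuming slots share the band equally."""
    return total_khz / slots


per_simplex = equivalent_bandwidth_khz(DECT_BANDWIDTH_KHZ, SIMPLEX_SLOTS)
per_duplex = equivalent_bandwidth_khz(DECT_BANDWIDTH_KHZ, DUPLEX_SLOTS)
print(f"{per_simplex:.2f} kHz per simplex slot")  # prints 15.71 kHz per simplex slot
```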

Patent
30 Oct 1996
TL;DR: A handshaking procedure enables a codec (108, 128) using a new speech encoding method to be introduced into a speech transmitting transceiver (100) of a digital telecommunications system, so that the new codec (108, 128) and an old codec (106, 126) can be used in parallel in the system.
Abstract: The invention enables the introduction of a codec (108, 128) according to a new speech encoding method into a speech transmitting transceiver (100) of a digital telecommunications system, so that it is possible to use a 'new' codec (108, 128) and an 'old' codec (106, 126) in parallel in the system. A codec is selected by implementing a handshaking procedure according to the invention between the transceivers (100, 100'). The invention is based on handshaking in which a speech encoding method implemented in all the transceivers (100, 100') and previously used in the telecommunications system concerned is used first at the beginning of each connection. At the beginning of a phone call and after handover, the method checks whether both parties (100, 100') can also use the new speech encoding. The handshaking messages have been selected so that their effect on the quality of speech is minimal, and yet so that the probability of identifying the messages is maximal.
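
The codec selection logic summarized in this abstract can be sketched as a small state model: every connection starts on the old codec, and the new codec is selected only when both parties confirm support, with the check repeated after handover. This is an illustrative sketch only. The class names, the `supports_new_codec` flag, and the way handover is modelled as replacing the remote party are all assumptions, and the actual handshaking messages of the patent are not represented.

```python
from dataclasses import dataclass


@dataclass
class Transceiver:
    supports_new_codec: bool  # hypothetical capability flag


class Connection:
    def __init__(self, local, remote):
        self.local = local
        self.remote = remote
        self.codec = "old"  # the commonly implemented codec is used first
        self.handshake()

    def handshake(self):
        """Select the new codec only if both parties can use it."""
        both = self.local.supports_new_codec and self.remote.supports_new_codec
        self.codec = "new" if both else "old"

    def handover(self, new_remote):
        """Fall back to the old codec, then repeat the capability check."""
        self.remote = new_remote
        self.codec = "old"
        self.handshake()


call = Connection(Transceiver(True), Transceiver(True))
codec_at_start = call.codec
call.handover(Transceiver(False))
codec_after_handover = call.codec
```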

Journal Article
TL;DR: The performance obtainable with four-tap wavelet filters for low bit rate audio coding is presented and a codec model has been designed and implemented based on wavelet packet algorithm and the model of auditory perception.
Abstract: The performance obtainable with four-tap wavelet filters for low bit rate audio coding is presented. For the investigation and comparison of the performance of these wavelet filters, a codec model has been designed and implemented based on wavelet packet algorithm and the model of auditory perception.


Proceedings Article
M. Delprat, C.C. Evci
28 Apr 1996
TL;DR: The technological developments in GSM from a speech coding point of view are described, along with advanced speech transmission techniques which will take GSM to the turn of the century and even beyond.
Abstract: New features and technical improvements will help the world-leading, European-developed GSM system meet new worldwide customer demands and compete successfully with other existing and future cellular systems. Among these, the most significant enhancement is the current development of the enhanced full-rate (EFR) high-quality GSM speech codec. Both the PCS1900 standards and ETSI have selected the US1 codec for this purpose despite some drawbacks. This paper describes the technological developments in GSM from a speech coding point of view, along with advanced speech transmission techniques which will take GSM to the turn of the century and even beyond.

Proceedings Article
Cheng Deyuan
14 Oct 1996
TL;DR: An adaptive pitch prefilter before the LP synthesis filter and a spectral postfilter after the LP synthesis filter are adopted in the speech decoder to enhance the reconstructed speech quality.
Abstract: An 8 kb/s ACELP speech codec is proposed. The encoding algorithm is based on code-excited linear prediction (CELP) with a multi-level pulse amplitude algebraic codebook (ACELP). Only integer pitch delays are searched. Each algebraic codevector of 64 samples contains 8 non-zero pulses with fixed amplitudes, each +1, +0.5, -1, or -0.5; the remaining samples are zero. An efficient nonexhaustive codebook search method is developed. The codevector's gain and LSP frequencies are scalar quantized. An adaptive pitch prefilter before the LP synthesis filter and a spectral postfilter after the LP synthesis filter are adopted in the speech decoder to enhance the reconstructed speech quality.
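
The codevector structure described in this abstract (64 samples, 8 non-zero pulses, amplitudes drawn from ±1 and ±0.5) can be illustrated with a minimal sketch. The abstract does not say how pulse positions are constrained, so this sketch accepts any 8 distinct positions. The function name and the example pulse layout are hypothetical.

```python
SUBFRAME_LEN = 64
NUM_PULSES = 8
ALLOWED_AMPLITUDES = (1.0, 0.5, -0.5, -1.0)


def build_codevector(pulses):
    """Build a sparse codevector from (position, amplitude) pairs."""
    if len(pulses) != NUM_PULSES:
        raise ValueError("expected exactly 8 pulses")
    vec = [0.0] * SUBFRAME_LEN
    for pos, amp in pulses:
        if not 0 <= pos < SUBFRAME_LEN:
            raise ValueError(f"position {pos} is outside the subframe")
        if amp not in ALLOWED_AMPLITUDES:
            raise ValueError(f"amplitude {amp} is not allowed")
        if vec[pos] != 0.0:
            raise ValueError(f"position {pos} is used twice")
        vec[pos] = amp
    return vec


# Hypothetical pulse layout, chosen only to exercise the function.
example = build_codevector(
    [(0, 1.0), (9, -0.5), (18, 0.5), (27, -1.0),
     (36, 1.0), (45, 0.5), (54, -0.5), (63, -1.0)]
)
```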


Proceedings Article
16 Sep 1996
TL;DR: A new approach for an embedded image codec based on the wavelet transform and auto-adaptive block coding of binary position information is presented, which treats sub-bands independently and is scalable with no corresponding implementation loss of coding efficiency.
Abstract: A new approach for an embedded image codec based on the wavelet transform (WT) and auto-adaptive block coding of binary position information is presented. The method, which treats sub-bands independently, is simple and efficient. Experimental results show its performance to be close to that of the zero-tree codec even without further entropy coding. In addition, the codec is scalable with no corresponding implementation loss of coding efficiency.

Proceedings Article
01 Jan 1996
TL;DR: This paper proposes a new pitch alteration method that can change the pitch period in waveform coding by scaling the time-axis and compensating the spectrum.
Abstract: Waveform coding techniques are concerned simply with preserving the waveform shape of the speech signal through a redundancy reduction process. In speech synthesis, high-quality waveform coding is mainly used for synthesis by analysis. However, since the parameters of this coding are not classified as either excitation or vocal tract parameters, it is difficult to apply waveform coding to synthesis by rule. To apply waveform coding to synthesis by rule, a pitch alteration technique is required for prosody control. In this paper, we propose a new pitch alteration method that can change the pitch period in waveform coding by scaling the time axis and compensating the spectrum. This is a kind of time-frequency domain method in which the phase components of the waveform are preserved with a little spectrum distortion of 2.5

Proceedings Article
18 Nov 1996
TL;DR: By introducing an image shaping function before MPEG2 compression, 12-18 Mbit/s transmission is possible with less picture degradation, and this HDTV codec could be made compact by using standard TV processing units for MPEG2, which will be widely available in the near future.
Abstract: This paper describes the codec's specifications and results of subjective testing for decoded images at 12 Mbit/s to 36 Mbit/s. In this codec, the HDTV image is divided into four standard-TV-sized images, and multiple MPEG2 tools designed for standard TV are utilized. So, this codec could be made compact by using standard TV processing units for MPEG2, which will be widely available in the near future. Moreover, this HDTV codec has an architecture with which both HDTV processing and multiple standard TV processing are possible. As for the bit-rates, by introducing an image shaping function before MPEG2 compression, 12-18 Mbit/s transmission is possible with less picture degradation.

Proceedings Article
01 Sep 1996
TL;DR: This paper further extends a multiple layer video codec using affine motion compensation by incorporating a new block level and designing a coding control strategy, which makes the codec perform efficiently at very low bit rate and for small size image sequences.
Abstract: The performance of a very low bit rate video codec largely depends on the efficient use of motion compensated prediction techniques and on a good coding control strategy. In our previous approach [6], we proposed a multiple layer video codec using affine motion compensation. In this paper, we further extend our affine compensated multi-layer codec by incorporating a new block level and designing a coding control strategy. A measure of coherent motion is used in the decision process, which makes the codec perform efficiently at very low bit rate and for small size image sequences (QCIF and sub-QCIF format). The experimental results conducted on 15 MPEG test sequences in QCIF format show improvement in PSNR of 0.2 dB and reduction in bit rate of 0.9 kbits/second.
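
For readers unfamiliar with the PSNR figure reported in this abstract, the sketch below shows the usual definition for 8-bit samples, PSNR = 10·log10(peak² / MSE). The abstract does not state which PSNR variant or which colour components the authors used, so the peak value of 255 and the flat-list input format are assumptions.

```python
import math


def psnr_db(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if not reference or len(reference) != len(decoded):
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)


# Hypothetical 4-sample example with an error of 1 in every sample (MSE = 1).
example_psnr = psnr_db([10, 20, 30, 40], [11, 19, 31, 39])
```

On this scale, a 0.2 dB gain corresponds to roughly a 4.5% reduction in mean squared error, since 10^(-0.02) ≈ 0.955.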


Proceedings Article
01 Sep 1996
TL;DR: A new split-band LD-CELP wideband coder at 24 kbit/s is proposed and its performance and complexity are compared with those of the already known wideband LD-CELP.
Abstract: Nowadays, 7 kHz wideband speech coding requires at least 48 kbit/s, as it still depends on the ITU standard G.722. CELP coders have been developed for wideband systems, achieving high quality speech coding at rates from 16 kbit/s to 32 kbit/s, such as the wideband LD-CELP at 32 kbit/s. In this paper, a new split-band LD-CELP wideband coder at 24 kbit/s is proposed and its performance and complexity are compared with those of the already known wideband LD-CELP.

Proceedings Article
07 May 1996
TL;DR: A new low-complexity speech coding method called "Dual-Pulse CS-CELP (DP-CS-CELP)" at 7.8 kbit/s is proposed, based on ITU-T G.729, which achieves real-time speech coding and decoding on personal computers.
Abstract: Low-cost highly-efficient speech coding is important for personal multimedia communications. In this paper we propose a new low-complexity speech coding method called "Dual-Pulse CS-CELP (DP-CS-CELP)" at 7.8 kbit/s. This method is based on ITU-T G.729. To reduce the complexity, we applied a new excitation model to the random code vectors and simplified the LPC coding, adaptive codebook search, perceptual weighting, and the other structures. The number of operations for this encoder is 3.80 MOPS, which achieves real-time speech coding and decoding on personal computers. Although the efficiency of this coder is a little lower than that of G.729, the MOS listening test showed that the subjective quality was equivalent to or a little better than that of G.726 (32-kbit/s ADPCM).

Proceedings Article
05 May 1996
TL;DR: This paper presents a modified multiband excitation (M²BE) speech coding algorithm which can achieve high-quality synthesized speech at 4.8, 3.6, 2.4 and 1.2 kb/s.
Abstract: This paper presents a modified multiband excitation (M²BE) speech coding algorithm which can achieve high-quality synthesized speech at 4.8, 3.6, 2.4 and 1.2 kb/s. The computer simulation results show that the synthesized speech is quite natural even at rates as low as 2.4 kb/s. This algorithm has been successfully implemented on a single TMS320C31 floating point processor, and informal listening tests demonstrate that the speech quality is quite good between 1.2 and 4.8 kb/s.

Proceedings Article
14 Oct 1996
TL;DR: A general video codec is designed and implemented using generalized subband decomposition, vector quantization with variable bit allocation and motion compensation, capable of encoding color video in the QCIF format at 15.2 kbit/s and ten frames per second.
Abstract: A general video codec is designed and implemented using generalized subband decomposition, vector quantization with variable bit allocation and motion compensation. The resulting codec is capable of encoding color video in the QCIF format at 15.2 kbit/s and ten frames per second. Simulation results are presented and compared with other video coding methods, such as the H.261 and MPEG standards.
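
To put the quoted operating point in perspective, the following calculation derives the average bit budget per frame and per luma pixel. It assumes the standard QCIF luma resolution of 176×144 and ignores the chroma planes, so the per-pixel figure is only a rough indicator. The function name is hypothetical.

```python
QCIF_WIDTH, QCIF_HEIGHT = 176, 144  # standard QCIF luma resolution


def bit_budget(bit_rate_bps, frames_per_second):
    """Average bits available per frame and per luma pixel."""
    bits_per_frame = bit_rate_bps / frames_per_second
    bits_per_luma_pixel = bits_per_frame / (QCIF_WIDTH * QCIF_HEIGHT)
    return bits_per_frame, bits_per_luma_pixel


# 15.2 kbit/s at ten frames per second, as stated in the abstract.
bits_per_frame, bits_per_pixel = bit_budget(15_200, 10)
print(f"{bits_per_frame:.0f} bits per frame")        # prints 1520 bits per frame
print(f"{bits_per_pixel:.3f} bits per luma pixel")   # prints 0.060 bits per luma pixel
```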

Proceedings Article
21 Oct 1996
TL;DR: A speech code/decode algorithm which combines the MBE and LPC speech models is proposed, which can operate at 2.4 kbps with much higher quality of synthesised speech than LPC-10e and lower computational complexity than CELP, VSELP and so on.
Abstract: A speech code/decode algorithm which combines the MBE and LPC speech models is proposed. In this model, the spectral envelope is represented using Linear Prediction Coefficients, which are coded using Line Spectrum Frequencies (LSFs). It can operate at 2.4 kbps with much higher quality of synthesised speech than LPC-10e and lower computational complexity than CELP, VSELP and so on. Therefore it is particularly attractive for VLSI implementation.

Proceedings Article
29 Sep 1996
TL;DR: This paper proposes a new channel error protection scheme for a low-complexity speech codec suitable for Personal Handy-phone System (PHS) multimedia communication, in which deterioration of speech quality is suppressed by using CRC and parameter estimation for error protection.
Abstract: This paper proposes a new channel error protection scheme for a low complexity speech codec that is suitable for the Personal Handy-phone System (PHS) multimedia communication. Deterioration of speech quality is suppressed by using CRC and parameter estimation for error protection. Two types of codec are described: a 10-ms frame type that transmits 160 bits every 10 ms and a 15-ms frame type that transmits 160 bits every 15 ms. The computational complexity of these codecs is less than 5 MOPS. In a no-channel error environment, the speech quality is equal to that of G.726 at 32.0 kbit/s. With 0.3% channel error, both codecs offer more comfortable conversation than G.726. Moreover, at 1.0% channel error, the 10-ms frame type still provides comfortable conversation.
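
The two frame types described in this abstract imply different transmitted bit rates, which the calculation below makes explicit and compares with the 32 kbit/s G.726 reference. It treats the 160 bits per frame as the gross transmitted payload; whether that figure includes the CRC and other protection bits is not stated in the abstract. The function name is hypothetical.

```python
G726_KBPS = 32.0  # reference rate quoted in the abstract


def bit_rate_kbps(bits_per_frame, frame_ms):
    """Bits per millisecond is numerically equal to kbit/s."""
    return bits_per_frame / frame_ms


rate_10ms = bit_rate_kbps(160, 10)  # 10-ms frame type
rate_15ms = bit_rate_kbps(160, 15)  # 15-ms frame type
print(f"10-ms frames: {rate_10ms:.2f} kbit/s")  # prints 10-ms frames: 16.00 kbit/s
print(f"15-ms frames: {rate_15ms:.2f} kbit/s")  # prints 15-ms frames: 10.67 kbit/s
```

Under this reading, the 10-ms frame type runs at half the G.726 rate and the 15-ms frame type at one third of it.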