
Showing papers on "Adaptive Multi-Rate audio codec published in 1995"


Journal ArticleDOI
01 Jun 1995
TL;DR: Basic approaches to speech, wideband speech, and audio bit rate compression in audiovisual communications are explained, showing that knowledge of auditory perception helps minimize the perception of coding artifacts and leads to efficient low bit rate coding algorithms that achieve substantially more compression than was thought possible only a few years ago.
Abstract: Current and future visual communications for applications such as broadcasting, videotelephony, video- and audiographic-conferencing, and interactive multimedia services assume a substantial audio component. Even text, graphics, fax, still images, email documents, etc. will gain from voice annotation and audio clips. A wide range of speech, wideband speech, and wideband audio coders is available for such applications. In the context of audiovisual communications, the quality of telephone-bandwidth speech is acceptable for some videotelephony and videoconferencing services. Higher bandwidths (wideband speech) may be necessary to improve the intelligibility and naturalness of speech. High quality audio coding including multichannel audio will be necessary in advanced digital TV and multimedia services. This paper explains basic approaches to speech, wideband speech, and audio bit rate compression in audiovisual communications. These signal classes differ in bandwidth, dynamic range, and in listener expectation of offered quality. It will become obvious that the use of our knowledge of auditory perception helps minimize the perception of coding artifacts and leads to efficient low bit rate coding algorithms which can achieve substantially more compression than was thought possible only a few years ago. The paper concentrates on worldwide source coding standards beneficial for consumers, service providers, and manufacturers.

62 citations


Journal ArticleDOI
TL;DR: The design and performance of a range of wireless videophone transceivers are presented and the system's robustness was increased using automatic repeat requests (ARQ), inevitably reducing the number of users supported, which was between 6 and 19 for the various systems.
Abstract: The design and performance of a range of wireless videophone transceivers are presented. Highly bandwidth efficient, fixed but with arbitrarily programmable rate, perceptually weighted discrete cosine transform (DCT) based video codecs are proposed for quarter common intermediate format (QCIF) videophone sequences. Perceptually weighted cost/gain controlled motion compensation and quad-class DCT-based compression is applied with and without run-length coding. Specifically, we propose video codecs having transmission rates in the range of 5-11.36 kbps and preselected the 11.36 kbps codec 1, the 8.52 kbps codec 2 and the 8 kbps codec 2a, for which we designed the intelligent reconfigurable systems 1-5. After sensitivity-matched binary Bose-Chaudhuri-Hocquenghem (BCH) forward error correction (FEC) coding the data rate associated with codec 1 and codec 2a became 20.32 kbps, while that of codec 2 was 15.24 kbps. Throughout these systems a partial forced update (PFU) technique was invoked in order to keep transmitter and receiver aligned amongst hostile channel conditions. When using codec 1 in system 1 and coherent pilot symbol assisted 16-level quadrature amplitude modulation (16-PSAQAM), an overall signalling rate of 9 kBd was yielded. Over lower quality channels the 4QAM mode of operation had to be invoked, which required twice as many time slots to accommodate the resulting 18 kBd stream. The system's robustness was increased using automatic repeat requests (ARQ), inevitably reducing the number of users supported, which was between 6 and 19 for the various systems. In a bandwidth of 200 kHz, similarly to the Pan-European GSM mobile radio system's speech channel, using system 1 for example, 16 and 8 videophone users can be supported in the 16QAM and 4QAM modes, respectively.

60 citations
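
The rate figures quoted in the abstract above can be cross-checked with a little arithmetic. The following is a minimal Python sketch, not the authors' method: the function names are hypothetical, and it assumes only that 16QAM carries 4 bits per symbol and 4QAM carries 2, so that halving the bits per symbol doubles the symbol rate and halves the number of users that fit in the same time slots.

```python
import math


def fec_code_rate(source_kbps, coded_kbps):
    """Fraction of the FEC-coded stream that carries source (video) bits."""
    return source_kbps / coded_kbps


def bits_per_symbol(levels):
    """Bits carried by one symbol of an M-level QAM constellation."""
    return int(math.log2(levels))


def symbol_rate_after_mode_change(rate_kbd, from_levels, to_levels):
    """Symbol rate needed to carry the same bit stream in another QAM mode."""
    return rate_kbd * bits_per_symbol(from_levels) / bits_per_symbol(to_levels)


def users_after_mode_change(users, from_levels, to_levels):
    """Users supported when each one needs proportionally more time slots."""
    return users * bits_per_symbol(to_levels) // bits_per_symbol(from_levels)


if __name__ == "__main__":
    # Codec 1 and codec 2 end up with the same code rate (about 0.559),
    # while codec 2a is protected more heavily (about 0.394).
    for name, src, coded in [("codec 1", 11.36, 20.32),
                             ("codec 2", 8.52, 15.24),
                             ("codec 2a", 8.0, 20.32)]:
        print(name, round(fec_code_rate(src, coded), 3))

    # Falling back from 16QAM to 4QAM: 9 kBd becomes 18 kBd, 16 users become 8.
    print(symbol_rate_after_mode_change(9.0, 16, 4))
    print(users_after_mode_change(16, 16, 4))
```

The 9 kBd figure itself is taken from the abstract as given; it includes overheads (such as pilot symbols) that this sketch does not model.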


Journal ArticleDOI
15 Feb 1995
TL;DR: The integrated circuit combines a voiceband codec with a baseband codec plus auxiliary converters in an 80-pin TQFP package while running off a single 3 V power supply.
Abstract: Designed to satisfy the data conversion requirements of the Pan-European (GSM) cellular radio system, the chip is part of a four chip-set total GSM solution comprising the codec, the DSP, the digital ASIC, and the micro-controller. Fabricated in a 0.8 μm double-poly, double-metal CMOS process, the integrated circuit combines a voiceband codec with a baseband codec plus auxiliary converters in an 80-pin TQFP package while running off a single 3 V power supply.

28 citations


Patent
Lawrence F. Heyl
11 Apr 1995
TL;DR: In this article, an audio codec capable of handling complex control and routing of numerous sound inputs is described, which is obtained by weighting various sound inputs in accordance with weighting values and then digitally mixing the weighted sound inputs together.
Abstract: An audio codec capable of handling complex control and routing of numerous sound inputs is described. The complex control and routing is obtained by weighting various sound inputs in accordance with weighting values and then digitally mixing the weighted sound inputs together. The invention facilitates construction of the audio codec with mainly fixed gain amplifiers, instead of variable gain preamplifiers, thereby saving a large amount of die space and reducing time needed for testing.

27 citations
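
The weighting-and-mixing idea in the patent abstract above can be illustrated in a few lines. This is a minimal Python sketch under stated assumptions, not the patented circuit: the function name `mix_weighted` and the use of floating-point sample lists are hypothetical, chosen only to show that routing and gain control reduce to one weighted sum per output sample.

```python
def mix_weighted(inputs, weights):
    """Digitally mix several sound inputs after scaling each by its weight.

    inputs  -- list of equal-length sample sequences, one per sound input
    weights -- one weighting value per input (0.0 effectively mutes an input)
    """
    if len(inputs) != len(weights):
        raise ValueError("need exactly one weight per input")
    if not inputs:
        return []
    length = len(inputs[0])
    mixed = [0.0] * length
    for samples, weight in zip(inputs, weights):
        if len(samples) != length:
            raise ValueError("all inputs must have the same length")
        for i, sample in enumerate(samples):
            mixed[i] += weight * sample
    return mixed


if __name__ == "__main__":
    microphone = [1.0, 2.0]
    line_in = [10.0, 20.0]
    # Half-level microphone plus quarter-level line input.
    print(mix_weighted([microphone, line_in], [0.5, 0.25]))  # [3.0, 6.0]
```

Because the gain is applied as a digital multiplication, changing a weight changes the routing without any variable-gain analog stage, which is the saving the abstract describes.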



Proceedings ArticleDOI
15 May 1995
TL;DR: The method and results of a subjective evaluation are presented which was conducted in order to select a new speech codec for the Inmarsat mini-M system and it was concluded that one codec was able to deliver performance that is equivalent to, or better than, the IS-54 full-rate digital cellular 8 kbit/s VSELP codec.
Abstract: In this paper the method and results of a subjective evaluation are presented which was conducted in order to select a new speech codec for the Inmarsat mini-M system. The mini-M system is designed to provide the next generation of global, notebook-sized satellite terminals for transportable, land-mobile and maritime voice, facsimile, and data communications. Overall, six different codecs operating at a combined source and channel rate of 4.8 kbit/s were evaluated in a series of six subjective tests. From this, it was concluded that one codec was able to deliver performance that is equivalent to, or better than, the IS-54 full-rate digital cellular 8 kbit/s VSELP codec, and was selected for use in the mini-M system.

11 citations


Dissertation
01 Jan 1995
TL;DR: Results are presented which show that the variable-rate CELP speech coder for implementation on the TMS320C51 Digital Signal Processor obtains near equivalent quality compared with an 8 kbit/s fixed-rate system and significantly better quality than a fixed-rate system with the same average rate.
Abstract: In a typical voice codec application, we wish to maximize system capacity while at the same time maintain an acceptable level of speech quality. Conventional speech coding algorithms operate at fixed rates regardless of the input speech. In applications where the system capacity is determined by the average rate, better performance can be achieved by using a variable-rate codec. Examples of such applications are CDMA based digital cellular and digital voice storage. In order to achieve a high quality, low average bit-rate Code Excited Linear Prediction (CELP) system, it is necessary to adjust the output bit-rate according to an analysis of the immediate input speech statistics. This thesis describes a low-complexity variable-rate CELP speech coder for implementation on the TMS320C51 Digital Signal Processor. The system implementation is user-switchable between a fixed-rate 8 kbit/s configuration and a variable-rate configuration with a peak rate of 8 kbit/s and an average rate of 4-5 kbit/s based on a one-way conversation with 30% silence. In variable-rate mode, each speech frame is analyzed by a frame classifier in order to determine the desired coding rate. A number of techniques are considered for reducing the complexity of the CELP algorithm for implementation while minimizing speech quality degradation. In a fixed-point implementation, the limited dynamic range of the processor leads to a loss in precision and hence a loss in performance compared with a floating-point system. As a result, scaling is necessary to maintain signal precision and minimize speech quality degradation. A scaling strategy is described which offers no degradation in speech quality between the fixed-point and floating-point systems. We present results which show that the variable-rate system obtains near equivalent quality compared with an 8 kbit/s fixed-rate system and significantly better quality than a fixed-rate system with the same average rate.

9 citations
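
The relationship between the peak rate, the silence fraction, and the average rate described in the dissertation abstract above can be sketched as follows. This is an illustrative Python example only: the abstract does not say how the frame classifier works, so the energy thresholds, the three class names, and the per-class rates in `RATES_KBPS` are all hypothetical values, with only the 8 kbit/s peak taken from the abstract.

```python
# Hypothetical per-class coding rates in kbit/s; only the 8 kbit/s peak
# comes from the abstract.
RATES_KBPS = {"silence": 1.0, "unvoiced": 4.0, "voiced": 8.0}


def classify_frame(samples, silence_threshold=1e-4, voiced_threshold=1e-2):
    """Pick a class from the mean energy of one frame (assumed classifier)."""
    energy = sum(s * s for s in samples) / len(samples)
    if energy < silence_threshold:
        return "silence"
    if energy < voiced_threshold:
        return "unvoiced"
    return "voiced"


def average_rate_kbps(frames):
    """Average bit rate when every frame is coded at its class rate."""
    rates = [RATES_KBPS[classify_frame(frame)] for frame in frames]
    return sum(rates) / len(rates)


if __name__ == "__main__":
    frame_len = 160  # assumed frame length in samples
    frames = (
        [[0.0] * frame_len] * 3      # 30% silence
        + [[0.05] * frame_len] * 3   # low-energy frames
        + [[0.5] * frame_len] * 4    # high-energy frames
    )
    print(average_rate_kbps(frames))
```

With these assumed numbers the average lands between 4 and 5 kbit/s while no frame exceeds the 8 kbit/s peak, which is the kind of trade-off the abstract reports.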


Proceedings ArticleDOI
G. Schroder
20 Sep 1995

6 citations


Journal ArticleDOI
TL;DR: Simulation results show that the proposed method can reconstruct an image up to four times faster than H.261 alone, and the progressive build-up obtained with this method is very pleasant to the human observer.

4 citations


Proceedings ArticleDOI
21 Apr 1995
TL;DR: A description of a complete hybrid codec based on OLA (Optimal Level Allocation) and HVS (Human Visual System) based classification is given and a performance comparison is made between this codec and an MPEG-2 like codec.
Abstract: At this time, almost all (de-facto) video coding standards are DCT based. It would be wrong though to think that DCT is the only practical way to reach a reasonable compression ratio for a reasonable codec complexity. In this paper, a description of a complete hybrid codec based on OLA (Optimal Level Allocation) and HVS (Human Visual System) based classification is given. Then a performance comparison is made between this codec and an MPEG-2 like codec.

3 citations


Proceedings ArticleDOI
26 Mar 1995
TL;DR: A new digital codec is described, which can transmit 1125/60 HDTV signals at a bit-rate of 15 to 45 Mbps in order to cover a wide range of applications such as SNG, distribution or contribution.
Abstract: A new digital codec is described, which can transmit 1125/60 HDTV signals at a bit-rate of 15 to 45 Mbps in order to cover a wide range of applications such as SNG, distribution or contribution. To achieve satisfactory picture quality at such low bit rates, including high quality stereo sounds, a data channel, a forward error correction code, and new advanced coding techniques are introduced into a conventional motion compensated interframe and intrafield adaptive DCT coding scheme. By using these key techniques, this system provides a significantly better coding performance than MPEG-2. In order to verify the transmission performance over an actual transmission link, field trials were carried out between Japan and the USA, which demonstrated that this codec is suitable for practical use in a wide range of HDTV applications.

Journal ArticleDOI
TL;DR: A CELP algorithm in which the complexity of the search procedure in the adaptive codebook is greatly reduced by means of a modified model of the CELP synthesizer, while keeping the usual perceptual weighting of the synthesis error in the analysis procedure.
Abstract: In this letter, we propose a CELP algorithm in which the complexity of the search procedure in the adaptive codebook is greatly reduced. This is achieved by means of a modified model of the CELP synthesizer, while keeping the usual perceptual weighting of the synthesis error in the analysis procedure. Simulation results show that the proposed algorithm can provide speech quality comparable to the one obtained with the conventional CELP codec.

Proceedings ArticleDOI
20 Sep 1995
TL;DR: A variable bit-rate Low-Delay Code-Excited Linear Prediction (LD-CELP) speech coding algorithm that has been submitted to the ITU Telecommunication Standardization Sector and is recognized as the candidate algorithm for the expansion of ITU-T Recommendation G.728.
Abstract: This paper presents a variable bit-rate Low-Delay Code-Excited Linear Prediction (LD-CELP) speech coding algorithm that has been submitted to the ITU Telecommunication Standardization Sector (ITU-T) and is recognized as the candidate algorithm for the expansion of ITU-T Recommendation G.728 operating at the lower bit-rates of 12.8 kbit/s and 9.6 kbit/s. The main purpose of this expansion is to provide a variable bit-rate speech codec that will be used in the future Digital Circuit Multiplication Equipment (DCME).

Book ChapterDOI
01 Jan 1995
TL;DR: This chapter describes important parameters in CODEC (coder/decoder) testing, which include the total harmonic distortion (THD) and signal-to-noise ratio (SNR), which require the DSP process as well as the gain error.
Abstract: This chapter describes important parameters in CODEC (coder/decoder) testing. The CODEC device is the backbone of telephone communications. The important parameters in CODEC testing are the total harmonic distortion (THD) and signal-to-noise ratio (SNR), which require the DSP process as well as the gain error. The details of an on-bench test connection to test the above parameters in single-tone methodology along with the explanation of the circuit operation are presented. The CODEC also coordinates the timing between itself and the networks with which it interfaces. Data transmission is synchronized and multiplexed. It has been found that when the encoding section receives the analog signal, the signal is passed through filtering stages to suppress low-frequency noise and the anti-aliasing process is then applied to it. A process called “companding” is essential to the CODEC's performance. Digitization of the analog pulse is not uniform, because input pulses themselves are not uniform. Companding is used to upgrade the quality of the pulse, which increases the signal-to-noise ratio and reduces the peak power to prevent overloading. The most important noise affecting CODEC design is the idle channel noise (ICN), which exists even when there is no signal traffic. It has been found that ICN occurs as a result of the quantization of a 0 V input analog signal as the digital output bounces around the assigned binary value for zero along with any nonlinearity existing along the transmission line.
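
To make the companding step mentioned in the chapter abstract above concrete, here is a minimal Python sketch. It is an assumption-laden illustration: the abstract does not say which companding law the chapter uses, so the continuous μ-law curve with μ = 255 (the form associated with ITU-T G.711) is chosen here, and the function names are hypothetical. Real telephone codecs use a segmented 8-bit approximation of this curve rather than the floating-point formula.

```python
import math

MU = 255.0  # assumed companding parameter (continuous mu-law)


def mu_law_compress(x):
    """Map a sample in [-1, 1] to [-1, 1], boosting small amplitudes."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


if __name__ == "__main__":
    for sample in (0.01, 0.1, 0.5, 1.0):
        compressed = mu_law_compress(sample)
        restored = mu_law_expand(compressed)
        print(sample, round(compressed, 4), round(restored, 4))
```

Small inputs are stretched over a larger share of the output range before uniform quantization, so quiet signals get finer effective step sizes; this is how companding improves the signal-to-noise ratio for low-level speech.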

Proceedings ArticleDOI
05 Sep 1995
TL;DR: The authors compare the performance of different filter banks on the quality of the reconstructed speech signal and investigate the applicability of perceptual coding to speech coding.
Abstract: Investigates the applicability of perceptual coding to speech coding. The authors compare the performance of different filter banks on the quality of the reconstructed speech signal.

Proceedings ArticleDOI
14 Sep 1995
TL;DR: The main focus of this paper is to give an overview on the ongoing work towards the NBC (Non Backwards-Compatible) extension to MPEG-2 audio and MPEG-4 audio coding.
Abstract: Since 1988 the audio subgroup of ISO/IEC JTC1/SC29 WG11 (called MPEG) has been working on the standardization of high quality low bit-rate audio coding. Until now the standards 11172-3 (1) (MPEG-1 audio) and 13818-3 (2) (MPEG-2 audio) have been finished. The main focus of this paper is to give an overview on the ongoing work towards the NBC (Non Backwards-Compatible) extension to MPEG-2 audio and MPEG-4 audio coding.


Proceedings ArticleDOI
20 Sep 1995
TL;DR: The MT-CELP as mentioned in this paper is a carefully optimized analysis-by-synthesis speech coder with associated channel coding for a future enhanced full-rate speech mode in GSM based systems.
Abstract: In the near future there will be a need for mobile services which can provide quality close to wire-line. This paper describes the Ericsson candidate codec to meet the requirements for a future enhanced full-rate speech mode in GSM based systems. It is a carefully optimized analysis-by-synthesis speech coder called MT-CELP with associated channel coding. MOS tests show that all the requirements can be met.

Journal ArticleDOI
TL;DR: Entropy coding principles are applied to the ITU G.728 speech codec and it is shown that the average bit rate can be reduced to 14.5 kbit/s without a significant increase in the codec complexity.
Abstract: Entropy coding principles are applied to the 16 kbit/s ITU G.728 speech codec. It is shown that the average bit rate can be reduced to 14.5 kbit/s without a significant increase in the codec complexity. In very low bit rate audiovisual communication applications such as the videophone, the saved bits can be used to improve the output video quality.
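
The saving reported in the abstract above comes from the fact that codebook indices are not used equally often. The sketch below, in Python, is a hedged illustration rather than the paper's method: it relies on the standard G.728 framing of one 10-bit index per 5-sample vector at 8 kHz sampling (background knowledge, not stated in this listing), the index probabilities are invented for the example, and the Shannon entropy is only the bound an ideal entropy coder would approach.

```python
import math

SAMPLE_RATE_HZ = 8000
SAMPLES_PER_VECTOR = 5
VECTORS_PER_SECOND = SAMPLE_RATE_HZ / SAMPLES_PER_VECTOR  # 1600 indices/s


def entropy_bits(probabilities):
    """Shannon entropy in bits of a discrete index distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0.0)


def ideal_bit_rate(probabilities):
    """Average bit rate (bit/s) of an ideal entropy coder for the indices."""
    return entropy_bits(probabilities) * VECTORS_PER_SECOND


if __name__ == "__main__":
    codebook_size = 1024  # 10-bit index

    # Equally likely indices: entropy coding cannot help, 16 kbit/s remains.
    uniform = [1.0 / codebook_size] * codebook_size
    print(ideal_bit_rate(uniform))

    # Hypothetical skew: half of the indices are three times as likely
    # as the other half. The ideal rate drops below 16 kbit/s.
    skewed = [1.5 / codebook_size] * 512 + [0.5 / codebook_size] * 512
    print(ideal_bit_rate(skewed))
```

Reaching the 14.5 kbit/s quoted in the abstract corresponds to an average of about 9.06 bits per index instead of 10.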

Proceedings ArticleDOI
20 Sep 1995
TL;DR: The overall system is found to provide toll quality speech with only a graceful degradation at increasing error rates, and it meets the preliminary quality requirements for the new standard.
Abstract: This paper describes the basic performance requirements for a new enhanced GSM codec and a proposal for this application based on the ITU 8 kb/s coder standard. A powerful channel coding scheme using more than half the gross bit rate is designed and the performance of the codec is formally evaluated using MOS testing. The overall system is found to provide toll quality speech with only a graceful degradation at increasing bit error rates, and it meets the preliminary quality requirements for the new standard.


Proceedings ArticleDOI
07 Jun 1995
TL;DR: A low bit-rate audio coding method based on minimum noise loudness criterion of a psychoacoustic model is proposed and a transform coder based on the human auditory model is designed.
Abstract: A low bit-rate audio coding method based on the minimum noise loudness criterion of a psychoacoustic model is proposed. The lower bound of the bit rate versus the suggested perceptual distortion is also discussed. The problem encountered in processing wideband audio signals is their high bit rate over limited-bandwidth channels and their high capacity requirement on storage media. Most high quality audio compression methods rely heavily on human perception, or more precisely on simultaneous masking, to reduce the bit rate. In order to exert explicit and separate control over the encoding of different frequency regions in the auditory spectrum, many compression methods for wideband audio are based on spectral decomposition via linear transform or subband coding. A transform coder based on the human auditory model is designed.

Proceedings ArticleDOI
P. Usai, G. Cosier, D. Pascal, J. Sotscheck, M. Kappelan
18 Jun 1995
TL;DR: The paper describes the tests performed and gives an outline of the performance of the codec with voice signals under realistic network conditions, and the effects on the speech performance produced by the voice activity detector and related DTX system are not the main subject of the paper but information is given.
Abstract: The Pan-European cellular digital mobile radio system GSM uses a codec with a net bit rate of 13.0 kbit/s (gross bit rate including error protection 22.8 kbit/s), known as the "full rate" RPE-LTP (regular pulse excitation with long term prediction). GSM is now ready to double channel capacity with the adoption of a new algorithm as an ETSI Recommendation, the candidate codec being called appropriately "half-rate" (gross bit rate 11.4 kbit/s). Internationally coordinated series of subjective listening experiments were planned and carried out during the exercise. Four main phases were necessary, called qualification, selection(s), optimisation and characterisation. The paper describes the tests performed and gives an outline of the performance of the codec with voice signals under realistic network conditions. The effects on the speech performance produced by the voice activity detector and related DTX system are not the main subject of the paper but information on this topic is given.

Proceedings ArticleDOI
06 Nov 1995
TL;DR: In a bandwidth of 200 kHz, similarly to the Pan-European GSM mobile radio system's speech channel, using systems 1 and 3 for example, 16 and 8 videophone users can be supported in the 16QAM and 4QAM modes, respectively.
Abstract: A range of 5-11.36 kbps videophone codecs are proposed and the 11.36 kbps codec 1, the 8.52 kbps codec 2 and the 8 kbps codec 2a are embedded in the intelligent re-configurable systems 1-3. After sensitivity-matched binary Bose-Chaudhuri-Hocquenghem (BCH) forward error correction (FEC) coding the data rate associated with codec 1 and codec 2a became 20.32 kbps, while that of codec 2 was 15.24 kbps. When using codec 1 in system 1 and coherent pilot symbol assisted 16-level quadrature amplitude modulation (16-PSAQAM), an overall signalling rate of 9 kBd was yielded. Over lower quality channels the 4QAM mode of operation had to be invoked, which required twice as many time slots to accommodate the resulting 18 kBd stream. In a bandwidth of 200 kHz, similarly to the Pan-European GSM mobile radio system's speech channel, using systems 1 and 3 for example, 16 and 8 videophone users can be supported in the 16QAM and 4QAM modes, respectively.

Patent
11 Oct 1995
TL;DR: In this article, an automatic gain control circuit controlled the gain for decoding by the use of a maximum prediction value generated at the time of coding, and the accurate gain control of the output signal was thus made possible at decoding.
Abstract: A voice codec apparatus for predictive coding is disclosed, in which an automatic gain control circuit controls the gain for decoding by the use of a maximum prediction value generated at the time of coding. The accurate gain control of the output signal is thus made possible at the time of decoding.

Proceedings ArticleDOI
20 Sep 1995
TL;DR: The proposed MultiPulse coder makes use of a spiky deconvolution algorithm to obtain the sparse excitation, and has low computational complexity and the flexibility to operate at a wide range of bit rates, by simply selecting the suitable pulse rate.
Abstract: The wideband speech coder proposed in this paper is based on a MultiPulse scheme developed in our Department [1] in 1992. The MP coder is a full-band scheme which makes forward linear prediction analysis every frame of 16 ms. The bit rate for this coder is able to switch, every 2 msec, between five possible rates from 16 kbps to 33 kbps. The quality measures show a lower objective quality for the MP coder when comparing with the standard but a similar subjective quality.
1 Wideband Speech Coding
Wideband speech coding (50-7000 Hz) is required for the emerging high-quality communications. The channel capacity of these new teleservices allows to spend more than 16 kbps for the speech codec without the bandwidth restriction of the conventional telephony. The reference coder for the wideband speech signals is the ITU-T standard G.722, a two-bands ADPCM coder with three possible bit rates, 64 kbps, 56 kbps and 48 kbps [2]. The current activities are concentrated on coding at 32 kbps without decreasing the quality of the standard at 64 kbps and with the goal of wideband speech coding at 16 kbps [3]. Few schemes for wideband speech coding have been proposed until now, most of them based on the CELP algorithm [4, 5]. The main drawbacks of a traditional CELP, working with wideband speech, are the huge computational complexity and the memory requirements to store the codebooks. Our proposed MultiPulse coder makes use of a spiky deconvolution algorithm to obtain the sparse excitation. The main characteristics of this coder are the low computational complexity and the flexibility to operate at a wide range of bit rates, by simply selecting the suitable pulse rate. For telephonic speech, we get a Variable Rate version with six different operating rates from 9.1 kbps to 4.8 kbps, offering a high to toll speech quality [6, 7].
This work has been partially funded by Northern Telecom and by the Spanish Research National Plan under grant no. TIC92-0800-C05-02.