Journal Article

Low bit-rate speech coders for multimedia communication

R.V. Cox, +1 more
01 Dec 1996, Vol. 34, Iss. 12, pp. 34-41
TLDR
This article describes the key attributes of speech coders (bit rate, complexity, delay, and quality) and uses them to compare three ITU-standardized coders that are applicable to low-bit-rate multimedia communications.
Abstract
The International Telecommunication Union (ITU) has standardized three speech coders that are applicable to low-bit-rate multimedia communications. ITU Rec. G.729 8 kb/s CS-ACELP has a 15 ms algorithmic codec delay and provides network-quality speech. It was originally designed for wireless applications, but is applicable to multimedia communications as well. Annex A of Rec. G.729 is a reduced-complexity version of the CS-ACELP coder. It was designed explicitly for the simultaneous voice and data applications that are prevalent in low-bit-rate multimedia communications. These two coders use the same bitstream format and can interoperate. The ITU Rec. G.723.1 6.3 and 5.3 kb/s speech coder for multimedia communications was originally designed for low-bit-rate videophones. Its frame size of 30 ms and one-way algorithmic codec delay of 37.5 ms allow for a further reduction in bit rate compared to the G.729 coder. In applications where low delay is important, the delay of G.723.1 may be too large. However, if the delay is acceptable, G.723.1 provides a lower-complexity alternative to G.729 at the expense of a slight degradation in quality. This article first describes the attributes of speech coders: bit rate, complexity, delay, and quality. It then discusses the basic concepts of the three new ITU coders by comparing their specific attributes. The second part of the article describes the standardization process for each of these coders.
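The bit-rate and delay figures quoted in the abstract can be related with some simple arithmetic. The sketch below is only illustrative: it assumes that algorithmic delay is the frame size plus the look-ahead, and it takes the G.729 frame size (10 ms) and the look-ahead values (5 ms for G.729, 7.5 ms for G.723.1) from the ITU Recommendations rather than from the abstract, which states only the 30 ms G.723.1 frame and the two total delays. The `Coder` class and its field names are hypothetical, and the bits-per-frame values are nominal (bit rate times frame duration); the exact frame payloads defined in the Recommendations may differ slightly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coder:
    """Hypothetical container for the attributes discussed in the abstract."""
    name: str
    bit_rate_bps: int     # nominal bit rate, bits per second
    frame_ms: float       # frame size in milliseconds
    lookahead_ms: float   # look-ahead in milliseconds (assumed, see lead-in)

    def bits_per_frame(self) -> float:
        # Nominal bits per frame = bit rate * frame duration.
        return self.bit_rate_bps * self.frame_ms / 1000.0

    def algorithmic_delay_ms(self) -> float:
        # Assumption: algorithmic delay = frame size + look-ahead.
        return self.frame_ms + self.lookahead_ms


CODERS = [
    Coder("G.729 (8 kb/s)", 8000, 10.0, 5.0),
    Coder("G.723.1 (6.3 kb/s)", 6300, 30.0, 7.5),
    Coder("G.723.1 (5.3 kb/s)", 5300, 30.0, 7.5),
]

if __name__ == "__main__":
    for coder in CODERS:
        print(
            f"{coder.name}: "
            f"{coder.bits_per_frame():.0f} nominal bits/frame, "
            f"{coder.algorithmic_delay_ms():.1f} ms algorithmic delay"
        )
```

Under these assumptions the computed delays match the 15 ms and 37.5 ms figures given in the abstract, and the longer G.723.1 frame is what buys its lower bit rate at the cost of extra delay.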

