Adaptive Multi-Rate audio codec

About: Adaptive Multi-Rate audio codec is a research topic. Over its lifetime, 1,467 publications have appeared within this topic, receiving 19,736 citations. The topic is also known as AMR and Adaptive Multi-Rate.


Papers
Proceedings Article
14 Apr 1991
TL;DR: The exploitation of left-right correlation in a subband coder for stereophonic audio signals is investigated, and preliminary results of a stereo codec are promising: at 192 kb/s good coding results have been obtained.
Abstract: The exploitation of left-right correlation in a subband coder for stereophonic audio signals is investigated. A transform of left and right signals into decorrelated intensity and error signals is presented. Although this can be seen as the optimal exploitation of redundancy, it yields only marginal gain in bit rate. If the reduced phase-sensitivity of the human observer can be exploited by encoding only the intensity signal, a substantial gain can be obtained. Preliminary results of a stereo codec are promising: at 192 kb/s good coding results have been obtained.

111 citations
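
The abstract above describes transforming left and right signals into decorrelated intensity and error signals, but does not give the transform itself. As a rough illustration only, the sketch below swaps in the fixed sum/difference (mid/side) transform, which is a simpler and different technique; it shows why strongly correlated channels leave little energy in the difference signal. All function names are hypothetical.

```python
def to_mid_side(left, right):
    """Fixed sum/difference transform (a stand-in, not the paper's transform)."""
    mid = [(l + r) / 2.0 for l, r in zip(left, right)]
    side = [(l - r) / 2.0 for l, r in zip(left, right)]
    return mid, side


def from_mid_side(mid, side):
    """Invert the sum/difference transform."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def energy(signal):
    """Sum of squared samples."""
    return sum(v * v for v in signal)


# Made-up, highly correlated stereo samples for illustration.
left = [0.5, 0.25, -0.75, 1.0]
right = [0.5, 0.125, -0.625, 1.0]

mid, side = to_mid_side(left, right)
# Most of the energy ends up in `mid`; `side` carries only the small
# left/right difference and would be the cheaper signal to encode or drop.
```

Dropping or coarsely quantizing the `side` signal is loosely analogous to the abstract's idea of encoding only the intensity signal.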

Patent
TL;DR: A speech codec operating at low data rates uses an iterative method to jointly optimize pitch and gain parameter sets; a 26-bit spectrum filter coding scheme may be used, and the number of bits allocated to the pitch and excitation signals depends on whether the signals are significant or insignificant.
Abstract: A speech codec operating at low data rates uses an iterative method to jointly optimize pitch and gain parameter sets. A 26-bit spectrum filter coding scheme may be used, involving successive subtractions and quantizations. The codec may preferably use a decomposed multipulse excitation model, wherein the multipulse vectors used as the excitation signal are decomposed into position and amplitude codewords. Multipulse vectors are coded by comparing each vector to a reference multipulse vector and quantizing the resulting difference vector. An expanded multipulse excitation codebook and associated fast search method, optionally with a dynamically-weighted distortion measure, allow selection of the best excitation vector without memory or computational overload. In a dynamic bit allocation technique, the number of bits allocated to the pitch and excitation signals depends on whether the signals are "significant" or "insignificant". Silence/speech detection is based on an average signal energy over an interval and a minimum average energy over a predetermined number of intervals. Adaptive post-filter and automatic gain control schemes are also provided. Interpolation is used for spectrum filter smoothing, and an algorithm is provided for ensuring stability of the spectrum filter. Specially designed scalar quantizers are provided for the pitch gain and excitation gain.

110 citations
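
The patent abstract above says silence/speech detection is based on an average signal energy over an interval and a minimum average energy over a predetermined number of intervals. The decision rule is not given, so the sketch below assumes one: an interval counts as speech when its average energy exceeds a fixed multiple of the minimum over the most recent intervals. The function names, the `ratio` threshold, and the default parameter values are all assumptions for illustration.

```python
def frame_energies(samples, frame_len):
    """Average energy of each complete interval of `frame_len` samples."""
    return [
        sum(v * v for v in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def detect_speech(samples, frame_len=4, history=3, ratio=4.0):
    """Flag each interval as speech (True) or silence (False).

    Assumed rule: speech if the interval's average energy exceeds `ratio`
    times the minimum average energy over the last `history` intervals
    (including the current one).
    """
    energies = frame_energies(samples, frame_len)
    flags = []
    for i, e in enumerate(energies):
        window = energies[max(0, i - history + 1):i + 1]
        noise_floor = max(min(window), 1e-12)  # guard against an all-zero floor
        flags.append(e > ratio * noise_floor)
    return flags


# Made-up signal: two quiet intervals, one loud interval, one quiet interval.
quiet = [0.01] * 4
loud = [1.0] * 4
flags = detect_speech(quiet * 2 + loud + quiet)
```

Tracking the minimum over recent intervals lets the threshold follow the background noise level rather than relying on a fixed absolute energy.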

Journal Article
TL;DR: A toll-quality speech codec at 8 kb/s, suitable for the future personal communications system, is presented; it can support a frame erasure rate of up to 3%, although the resulting degradation is still worse than the ITU-T requirements.
Abstract: A toll-quality speech codec at 8 kb/s suitable for the future personal communications system is presented. The codec is currently under standardization by the ITU-T (successor of CCITT), where the codec terms of reference were mainly determined considering PCS application. The encoding algorithm is based on algebraic code-excited linear prediction (ACELP) and has a speech frame of 10 ms. Efficient pitch and codebook search strategies, along with efficient quantization procedures, have been developed to achieve toll-quality encoded speech with a complexity implementable on current fixed-point DSP chips. Formal subjective listening tests, performed by ITU-T SG 12, showed that the codec quality is equivalent to that of G.726 ADPCM at 32 kb/s in error-free conditions and that it outperforms G.726 under error conditions. The codec performs adequately under tandeming conditions and can support a frame erasure rate of up to 3%, although the resulting degradation is still worse than the ITU-T requirements; this is one subject of study for the next phase. The algorithm has been implemented on a single fixed-point DSP for the ITU-T subjective test, and required about 29 MIPS. An optimized version, however, requires 24 MIPS without any speech quality degradation.

110 citations

Journal Article
TL;DR: A new algorithm is proposed for steganography in low bit-rate VoIP audio streams by integrating information hiding into the process of speech encoding, thus maintaining synchronization between information hiding and speech encoding.
Abstract: Low bit-rate speech codecs have been widely used in audio communications such as VoIP and mobile communications, so steganography in low bit-rate audio streams would have broad practical applications. In this paper, the authors propose a new algorithm for steganography in low bit-rate VoIP audio streams by integrating information hiding into the process of speech encoding. The proposed algorithm performs data embedding while pitch period prediction is conducted during low bit-rate speech encoding, thus maintaining synchronization between information hiding and speech encoding. The steganography algorithm not only achieves high speech quality and resists detection by steganalysis, but is also highly compatible with a standard low bit-rate speech codec, and data embedding and extraction cause no further delay. Testing shows that, with the proposed algorithm, the data embedding rate of the secret message can reach 4 bits/frame (133.3 bits/second).

109 citations
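
The embedding rate quoted in the abstract above, 4 bits/frame (133.3 bits/second), can be checked with a line of arithmetic. The abstract does not state the frame length; the 30 ms value used below is inferred from the two quoted figures and should be read as an assumption. Function names are hypothetical.

```python
def embedding_rate_bps(bits_per_frame, frame_ms):
    """Embedding rate in bits/second for a given frame duration in ms."""
    return bits_per_frame * 1000.0 / frame_ms


def implied_frame_ms(bits_per_frame, rate_bps):
    """Frame duration in ms implied by a per-frame and per-second rate."""
    return bits_per_frame * 1000.0 / rate_bps


# 4 bits per frame at an assumed 30 ms frame gives about 133.3 bits/second.
rate = embedding_rate_bps(4, 30)

# Working backwards from the quoted figures gives roughly 30 ms per frame.
frame_ms = implied_frame_ms(4, 133.3)
```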

Proceedings Article
19 Apr 2009
TL;DR: A unified speech and audio codec that combines techniques from both worlds exhibits consistently high quality for speech, music and mixed audio content, and forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
Abstract: Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.

108 citations


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
79% related
Data compression
43.6K papers, 756.5K citations
78% related
Decoding methods
65.7K papers, 900K citations
78% related
Computational complexity theory
30.8K papers, 711.2K citations
76% related
Hidden Markov model
28.3K papers, 725.3K citations
75% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    10
2022    14
2020    1
2019    3
2018    3
2017    21