
Adaptive Multi-Rate audio codec

About: Adaptive Multi-Rate audio codec is a research topic. Over its lifetime, 1,467 publications have been published within this topic, receiving 19,736 citations. The topic is also known as AMR and Adaptive Multi-Rate.


Papers
Patent
Yang Gao
15 Sep 2000
TL;DR: In this paper, a speech compression system with a fixed codebook structure and a new search routine is proposed for speech coding, which is capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech.
Abstract: A speech compression system with a special fixed codebook structure and a new search routine is proposed for speech coding. The system is capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech. The codebook structure uses a plurality of subcodebooks. Each subcodebook is designed to fit a specific group of speech signals. A criterion value is calculated for each subcodebook to minimize an error signal in a minimization loop as part of the coding system. An external signal sets a maximum bitstream rate for delivering encoded speech into a communications system. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. Each codec is selectively activated to encode and decode the speech signals at different bit rates to enhance overall quality of the synthesized speech at a limited average bit rate.

33 citations
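
The subcodebook search described in the patent abstract above can be illustrated with a minimal sketch. The abstract does not give the actual criterion, so this example assumes a plain squared-error criterion with an optimal per-codevector gain; the function name `search_subcodebooks` and the toy codebooks are hypothetical and chosen only for illustration.

```python
def search_subcodebooks(target, subcodebooks):
    """Return (subcodebook index, codevector index, gain, error) with the lowest error.

    Assumption: the criterion is the squared error between the target and a
    gain-scaled codevector. The patent's real criterion is not in the abstract.
    """
    best = None
    for s, book in enumerate(subcodebooks):
        for k, vec in enumerate(book):
            energy = sum(v * v for v in vec)
            if energy == 0.0:
                continue  # an all-zero codevector cannot be gain-scaled
            # Least-squares optimal gain for this codevector.
            gain = sum(t * v for t, v in zip(target, vec)) / energy
            err = sum((t - gain * v) ** 2 for t, v in zip(target, vec))
            if best is None or err < best[3]:
                best = (s, k, gain, err)
    return best


# Toy data (hypothetical): one pulse-like and one noise-like subcodebook.
target = [0.0, 2.0, 0.0, -2.0]
pulse_like = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
noise_like = [[0.5, 0.5, 0.5, 0.5]]
result = search_subcodebooks(target, [pulse_like, noise_like])
```

Here the second pulse-like codevector matches the target exactly once scaled, so the search settles on the first subcodebook. A real coder would run this inside the analysis-by-synthesis loop on perceptually weighted signals rather than on raw vectors.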

Proceedings ArticleDOI
04 Aug 2002
TL;DR: In this article, a non-linear enhancement technique called audio-visual codebook dependent cepstral normalization (AVCDCN) was proposed for both audio-only and audio-visual speech recognition.
Abstract: We introduce a non-linear enhancement technique called audio-visual codebook dependent cepstral normalization (AVCDCN) and we consider its use with both audio-only and audio-visual speech recognition. AVCDCN is inspired by CDCN, an audio-only enhancement technique that approximates the nonlinear effect of noise on speech with a piecewise constant function. Our experiments show that the use of visual information in AVCDCN allows significant performance gains over CDCN.

33 citations

Proceedings ArticleDOI
A. Uvliden, S. Bruhn, R. Hagen
01 Nov 1998
TL;DR: This work reviews the general AMR system concept and discusses the capacity and quality benefits that can be achieved and an example solution for GSM is described including speech coding, channel coding, inband signaling, and the adaptation scheme.
Abstract: Adaptive multi-rate (AMR) is an emerging speech service currently being standardized in the ETSI for the GSM system. The new AMR standard will be flexible by adapting the error protection level and the allocated radio resources. A trade-off between speech quality and system capacity can be achieved for a variety of radio channel and operating conditions. The adaptation of the protection level will be fast and speech service specific. Besides the basic source and channel codec for speech signal payload, the AMR system concept further includes channel state tracking and inband transmission of adaptation data. We review the general AMR system concept and discuss the capacity and quality benefits that can be achieved. An example solution for GSM is described including speech coding, channel coding, inband signaling, and the adaptation scheme.

33 citations
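
The trade-off between source bit rate and error protection described in the abstract above can be sketched as a simple mode-selection rule. The eight source rates are those of the AMR narrowband codec as later standardized, and 22.8 kbit/s is the gross rate of a GSM full-rate traffic channel. The carrier-to-interference thresholds and the function names are hypothetical, and the sketch ignores the hysteresis and inband signaling details a real adaptation scheme would need.

```python
# Source bit rates of the AMR narrowband codec modes, in kbit/s.
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]

# Gross bit rate of a GSM full-rate traffic channel, in kbit/s.
GSM_FULL_RATE_GROSS_KBPS = 22.8

# Hypothetical C/I thresholds in dB, one per step up in codec mode.
# These values are illustrative only and are not taken from the paper.
HYPOTHETICAL_THRESHOLDS_DB = [4.0, 5.5, 7.0, 8.5, 10.0, 11.5, 13.0]


def select_mode(ci_db):
    """Pick a source rate: better channels get more speech bits."""
    index = sum(1 for t in HYPOTHETICAL_THRESHOLDS_DB if ci_db >= t)
    return AMR_MODES_KBPS[index]


def protection_budget(mode_kbps):
    """Bits per second left for channel coding and inband signaling."""
    return round(GSM_FULL_RATE_GROSS_KBPS - mode_kbps, 2)


poor = select_mode(3.0)     # lowest rate, most protection
medium = select_mode(9.0)
good = select_mode(20.0)    # highest rate, least protection
```

On a poor channel the rule falls back to the 4.75 kbit/s mode, leaving about 18 kbit/s of the gross rate for protection; on a good channel it selects 12.2 kbit/s and leaves about 10.6 kbit/s.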

Journal Article
TL;DR: This paper describes the delay sources and magnitude of the most common audio codecs and thus provides a guideline for the choice of the most suitable codec for a given application.
Abstract: Digital audio processing has been revolutionized by perceptual audio coding in the past decade. The main parameter to benchmark different codecs is the audio quality at a certain bit-rate. For many applications, however, delay is another key parameter which varies between only a few and hundreds of milliseconds depending on the algorithmic properties of the codec. Latest research results in low delay audio coding can significantly improve the performance of applications such as communications, digital microphones, and wireless loudspeakers with lip synchronicity to a video signal. This paper describes the delay sources and magnitude of the most common audio codecs and thus provides a guideline for the choice of the most suitable codec for a given application.

32 citations
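
As a rough illustration of how codec delay depends on algorithmic properties, the sketch below uses a common first-order accounting: the encoder must buffer one frame plus any look-ahead before it can emit output. This is an assumption, not the paper's own delay model, and it leaves out other contributions such as filterbank overlap, bit reservoirs, processing time, and transmission. The function name and the example parameters are hypothetical.

```python
def algorithmic_delay_ms(frame_samples, lookahead_samples, sample_rate_hz):
    """First-order algorithmic delay: frame buffering plus look-ahead, in ms."""
    return 1000.0 * (frame_samples + lookahead_samples) / sample_rate_hz


# Illustrative speech-codec-like setup: 160-sample frames and 40 samples
# of look-ahead at 8 kHz sampling.
speech_like = algorithmic_delay_ms(160, 40, 8000)

# Illustrative transform-codec-like setup: 1024-sample frames and 1024
# samples of look-ahead at 48 kHz sampling.
transform_like = algorithmic_delay_ms(1024, 1024, 48000)
```

With these illustrative numbers the speech-like configuration has 25 ms of algorithmic delay and the transform-like one roughly 43 ms, which shows how frame length and look-ahead, rather than bit rate, set the floor on delay.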

Journal ArticleDOI
TL;DR: This paper presents first-time empirical evidence for masking in the perception of wideband vibrotactile signals, and presents a bitrate-scalable haptic texture codec that incorporates the masking model, together with its subjective and objective performance evaluation.
Abstract: Applications involving indirect interpersonal communication, such as collaborative design/assembly/exploration of physical objects, can benefit strongly from the transmission of contact-based haptic media, in addition to the more traditional audiovisual media. Inclusion of haptic media has been shown to improve immersiveness, task performance, and the overall experience of task execution. While several decades of research have been dedicated to the acquisition, processing, coding, and display of audio and video streams, similar aspects for haptic streams have been addressed only recently. Simultaneous masking is a perceptual phenomenon widely exploited in the compression of audio data. In the first part of this paper, to the best of our knowledge, we present first-time empirical evidence for masking in the perception of wideband vibrotactile signals. Our results show that this phenomenon for haptics is very similar to its auditory analog. Signals closer in frequency to a powerful masker (25 dB above detection threshold) are masked more strongly (peak threshold-shifts of up to 28 dB) than those away from the masker (threshold-shifts of 15–20 dB). The masking curves approximately follow the masker's spectral profile. In the second part of this paper, we present a bitrate-scalable haptic texture codec, which incorporates the masking model, and describe its subjective and objective performance evaluation. Experiments show that we can drive down the codec output bitrate to a very low value of 2.3 kbps, without the subjects being able to reliably discriminate between the codec input and distorted output texture signals.

32 citations


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 79% related
Data compression: 43.6K papers, 756.5K citations, 78% related
Decoding methods: 65.7K papers, 900K citations, 78% related
Computational complexity theory: 30.8K papers, 711.2K citations, 76% related
Hidden Markov model: 28.3K papers, 725.3K citations, 75% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    10
2022    14
2020    1
2019    3
2018    3
2017    21