
Showing papers on "Adaptive Multi-Rate audio codec" published in 2018


Journal ArticleDOI
TL;DR: The presented experiments demonstrate that the proposed randomizations yield uncorrelated signals, that perceptual quality is competitive, and that the complexity of the proposed methods is feasible for practical applications.
Abstract: Efficient coding of speech and audio in a distributed system requires that quantization errors across nodes are uncorrelated. Yet, with conventional methods at low bitrates, quantization levels become increasingly sparse, which does not correspond to the distribution of the input signal and, importantly, also reduces coding efficiency in a distributed system. We have recently proposed a distributed speech and audio codec design, which applies quantization in a randomized domain such that quantization errors are randomly rotated in the output domain. Similar to dithering, this ensures that quantization errors across nodes are uncorrelated and coding efficiency is retained. In this paper, we improve this approach by proposing faster randomization methods, with a computational complexity of $\mathcal{O}(N \log N)$. The presented experiments demonstrate that the proposed randomizations yield uncorrelated signals, that perceptual quality is competitive, and that the complexity of the proposed methods is feasible for practical applications.

18 citations
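
The idea of quantizing in a randomized domain, as described in the abstract above, can be illustrated with a minimal sketch. This is not the paper's method: it assumes one well-known $\mathcal{O}(N \log N)$ orthogonal randomization, namely a seeded random sign flip followed by a fast Walsh-Hadamard transform, and a plain uniform quantizer. All function names (`fwht`, `random_signs`, `code_at_node`, and so on) are hypothetical and chosen only for this illustration.

```python
import math
import random


def fwht(x):
    """Orthonormal fast Walsh-Hadamard transform in O(N log N).

    The length of x must be a power of two. With the 1/sqrt(N) scaling
    the transform is its own inverse.
    """
    y = list(x)
    n = len(y)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = y[i], y[i + h]
                y[i], y[i + h] = a + b, a - b
        h *= 2
    scale = 1.0 / math.sqrt(n)
    return [v * scale for v in y]


def random_signs(n, seed):
    """Pseudo-random +/-1 diagonal; the seed stands in for a per-node key."""
    rng = random.Random(seed)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]


def randomize(x, signs):
    """Apply the orthogonal matrix H*D (sign flip, then Hadamard)."""
    return fwht([s * v for s, v in zip(signs, x)])


def derandomize(y, signs):
    """Apply the inverse D*H (H is symmetric and orthonormal, D*D = I)."""
    return [s * v for s, v in zip(signs, fwht(y))]


def quantize(y, step):
    """Uniform scalar quantizer (an assumption; any quantizer could be used)."""
    return [step * round(v / step) for v in y]


def code_at_node(x, seed, step):
    """Quantize x in the randomized domain of one node and map it back."""
    signs = random_signs(len(x), seed)
    return derandomize(quantize(randomize(x, signs), step), signs)
```

Calling `code_at_node` on the same frame with two different seeds gives two reconstructions whose errors come from differently rotated quantization grids. Because the randomization is orthogonal, the error energy in the output domain equals the error energy in the randomized domain.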


Book ChapterDOI
01 Jan 2018
TL;DR: Perceptual transform-based audio coding schemes developed up to now are briefly reviewed including the family of ISO/IEC MPEG audio coding standards, proprietary audio compression algorithms, broadcasting/speech/data communication codecs, as well as open-free, patent royalty-free audio/speech codecs.
Abstract: In general, audio coding or audio compression algorithms are used to obtain a compact digital representation of high-quality audio signals for their efficient transmission and storage. The central objective in audio coding is to represent the signal with a minimum number of bits while achieving its transparent reproduction. Besides speech coding schemes based on linear prediction methods especially tailored for efficient speech compression, perceptual transform-based audio coding schemes have gained greater attention, particularly for applications in consumer electronics. Typically, a transform-based audio coding scheme uses a near-perfect quadrature mirror filter (QMF) bank and/or a perfect reconstruction cosine-modulated filter bank to obtain a block-wise representation of the audio signal in the frequency domain. Perceptual transform-based audio coding schemes developed up to now are briefly reviewed including the family of ISO/IEC MPEG audio coding standards, proprietary audio compression algorithms, broadcasting/speech/data communication codecs, as well as open-free, patent royalty-free audio/speech codecs. The discussion concentrates on the adopted near-perfect QMF and perfect reconstruction cosine-modulated filter banks, processing methods, and specified transform block sizes.

2 citations
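
As a small illustration of the perfect reconstruction cosine-modulated filter banks mentioned in the abstract above, the sketch below implements the modified discrete cosine transform (MDCT) with a sine window and 50% overlap-add. It is a generic textbook construction, not taken from the chapter: the transforms are evaluated directly in $\mathcal{O}(N^2)$ for readability, the half-block size `N` is assumed to be even, and the function names are hypothetical.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, N):
    """Direct MDCT: 2N input samples -> N coefficients."""
    return [
        sum(
            block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N)
        )
        for k in range(N)
    ]


def imdct(coeffs, N):
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples.

    The 2/N scaling matches the case where the window is applied both
    before the MDCT and after the inverse MDCT.
    """
    return [
        (2.0 / N)
        * sum(
            coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for k in range(N)
        )
        for n in range(2 * N)
    ]


def analysis_synthesis(x, N):
    """Run x through windowed MDCT analysis and overlap-add synthesis."""
    if N <= 0 or N % 2:
        raise ValueError("N is assumed to be a positive even integer")
    w = sine_window(N)
    tail = (-len(x)) % N
    # N zeros on each side so that every input sample is covered by two blocks.
    padded = [0.0] * N + list(x) + [0.0] * (tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block, N), N)
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]  # aliasing cancels in the overlap
    return out[N:N + len(x)]
```

Each block of 2N samples yields only N coefficients, so a single block cannot be inverted on its own. The time-domain aliasing of neighbouring blocks has opposite sign and cancels in the overlap-add, which is why the output matches the input up to rounding error when no quantization is applied.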


Proceedings ArticleDOI
04 Jul 2018
TL;DR: The results revealed considerable shifts of formants when compressed by the AMR codec and indicate that the extent of the shifts differs not only for individual formants but also for the two genders, vowel qualities and the software used.
Abstract: Automatic formant measurement is generally reliable but can be affected by various factors, such as telephone transmission. As forensic speaker identification often involves comparison of direct (face-to-face) speech with a telephone recording, it is necessary to examine what effect telephony has on the speech signal. This study focuses on the impact of the AMR codec, the standard codec in mobile telephony, on formants. In comparison with previous studies, our study analyses the impact of both versions of the codec (narrowband and wideband) at all possible bit rates and on a large amount of data. Furthermore, the effect was examined in two processing tools, Praat and VoiceSauce. Our results revealed considerable shifts of formants when compressed by the codec and indicate that the extent of the shifts differs not only for individual formants but also for the two genders, vowel qualities and the software used.

2 citations