Topic

Adaptive Multi-Rate audio codec

About: Adaptive Multi-Rate audio codec is a research topic. Over its lifetime, 1,467 publications have been published within this topic, receiving 19,736 citations. The topic is also known as AMR or Adaptive Multi-Rate.


Papers
Proceedings Article
21 Oct 1996
TL;DR: A speech coding/decoding algorithm that combines the MBE and LPC speech models is proposed; it can operate at 2.4 kbps with much higher synthesised speech quality than LPC-10e and lower computational complexity than CELP, VSELP, and similar coders.
Abstract: A speech coding/decoding algorithm that combines the MBE and LPC speech models is proposed. In this model, the spectral envelope is represented using Linear Prediction Coefficients, which are coded using Line Spectrum Frequencies (LSFs). It can operate at 2.4 kbps with much higher synthesised speech quality than LPC-10e and lower computational complexity than CELP, VSELP, and similar coders. It is therefore particularly attractive for VLSI implementation.
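The abstract above does not say how the Linear Prediction Coefficients are obtained. As a minimal sketch of one common approach (the autocorrelation method solved with the Levinson-Durbin recursion, which is an assumption here and not necessarily the authors' procedure), the following Python shows how a frame of samples could be turned into LPC coefficients. The function names `autocorrelation` and `levinson_durbin` are hypothetical, and the conversion to LSFs is not shown.

```python
def autocorrelation(frame, max_lag):
    """Return autocorrelation values r[0..max_lag] of one frame of samples."""
    return [
        sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err) where a[0] == 1.0 and the prediction-error filter is
    A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order, and err is the
    residual prediction-error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor's error with lag i.
        acc = r[i]
        for j in range(1, i):
            acc += a[j] * r[i - j]
        k = -acc / err  # reflection coefficient for this stage
        updated = a[:]
        for j in range(1, i):
            updated[j] = a[j] + k * a[i - j]
        updated[i] = k
        a = updated
        err *= 1.0 - k * k
    return a, err


if __name__ == "__main__":
    # Illustrative autocorrelation of a first-order process (assumed values).
    coeffs, residual = levinson_durbin([1.0, 0.5, 0.25], order=2)
    print(coeffs, residual)
```

With the illustrative autocorrelation sequence in the usage example, the recursion finds that a single coefficient explains the signal and the second-order term is zero, which is the expected behaviour for a first-order process.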
Proceedings Article
12 May 2011
TL;DR: This work focuses on optimizing the C code using assembly in the Basic Operators (BASOP), which yields a significant improvement in the complexity and execution time of the G.711.1 codec on an ARM processor.
Abstract: The ITU-T wideband speech codec G.711.1 [1] extends and interoperates with the widely used narrowband G.711 codec. This allows G.711.1 to be deployed efficiently over existing G.711-based VoIP to provide higher voice quality. However, the processing capacity of commercial processors is very limited for implementing a wideband codec; the codec therefore needs to be optimized for real-time processing. This work focuses on optimizing the C code using assembly in the Basic Operators (BASOP) [2], which yields a significant improvement in the complexity and execution time of the G.711.1 codec on an ARM processor.
Proceedings Article
04 Jun 2023
TL;DR: In this article, the authors propose an open-source, streamable, real-time neural audio codec that performs strongly on compression, latency, and reconstruction quality: it can reconstruct highly natural-sounding 48 kHz speech signals while operating at only 12 kbps and running with less than 6 ms (GPU)/10 ms (CPU) latency.
Abstract: A good audio codec for live applications such as telecommunication is characterized by three key properties: (1) compression, i.e. the bitrate that is required to transmit the signal should be as low as possible; (2) latency, i.e. encoding and decoding the signal needs to be fast enough to enable communication without or with only minimal noticeable delay; and (3) reconstruction quality of the signal. In this work, we propose an open-source, streamable, and real-time neural audio codec that achieves strong performance along all three axes: it can reconstruct highly natural sounding 48 kHz speech signals while operating at only 12 kbps and running with less than 6 ms (GPU)/10 ms (CPU) latency. An efficient training paradigm is also demonstrated for developing such neural audio codecs for real-world scenarios. Both objective and subjective evaluations using the VCTK corpus are provided. To sum up, AudioDec is a well-developed plug-and-play benchmark for audio codec applications.
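To put the compression figure in perspective, the arithmetic below compares the 12 kbps operating point with uncompressed PCM at the same 48 kHz sampling rate. This is a rough sketch: the 16-bit mono PCM baseline is an assumption (the abstract does not state a reference format), and the helper names are hypothetical.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bitrate of uncompressed PCM audio in kbps (16-bit mono assumed by default)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0


def compression_ratio(sample_rate_hz, codec_kbps, bits_per_sample=16, channels=1):
    """Ratio of the uncompressed PCM bitrate to the codec bitrate."""
    return pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels) / codec_kbps


if __name__ == "__main__":
    print(pcm_bitrate_kbps(48_000))       # 768.0 kbps for 16-bit mono at 48 kHz
    print(compression_ratio(48_000, 12))  # 64.0, i.e. 64:1 under these assumptions
```

Under these assumptions, 12 kbps corresponds to roughly a 64:1 reduction relative to raw PCM; a stereo or 24-bit baseline would give a proportionally larger ratio.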
Posted Content
07 Jul 2022
TL;DR: In this article, the authors present Neural End-2-End Speech Codec (NESC), a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps.
Abstract: Neural networks have proven to be a formidable tool for tackling the problem of speech coding at very low bit rates. However, the design of a neural coder that can be operated robustly under real-world conditions remains a major challenge. Therefore, we present Neural End-2-End Speech Codec (NESC), a robust, scalable end-to-end neural speech codec for high-quality wideband speech coding at 3 kbps. The encoder uses a new architecture configuration, which relies on our proposed Dual-PathConvRNN (DPCRNN) layer, while the decoder architecture is based on our previous work, Streamwise-StyleMelGAN. Our subjective listening tests on clean and noisy speech show that NESC is particularly robust to unseen conditions and signal perturbations.
Proceedings Article
16 Apr 1990
TL;DR: A 4.8 kb/s voice codec based on a coding algorithm using pitch-synchronous DFT (discrete Fourier transform) is described, and adaptive postspectral emphasis was shown to be as effective as adaptive postfiltering for enhancing the output signal quality.
Abstract: A 4.8 kb/s voice codec based on a coding algorithm using pitch-synchronous DFT (discrete Fourier transform) is described. This algorithm combines waveform repetition with pitch-synchronous DFT spectral coding, in which the phase information and nonharmonic spectral components are ignored. This implementation introduces adaptive postspectral emphasis for the enhancement of the reproduced speech quality. The prototype codec uses three Fujitsu MB86232 digital signal processor (DSP) chips. The reproduced speech signal quality was evaluated, and the performance of adaptive postspectral emphasis was compared with that of adaptive postfiltering. The reproduced speech was found to be good. Pitch-synchronous DFT thus expresses the power spectra well. Adaptive postspectral emphasis was shown to be as effective as adaptive postfiltering for enhancing the output signal quality. This does not add complexity to frequency-domain coding.

Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
79% related
Data compression
43.6K papers, 756.5K citations
78% related
Decoding methods
65.7K papers, 900K citations
78% related
Computational complexity theory
30.8K papers, 711.2K citations
76% related
Hidden Markov model
28.3K papers, 725.3K citations
75% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    10
2022    14
2020    1
2019    3
2018    3
2017    21