
Showing papers on "Adaptive Multi-Rate audio codec" published in 2015


Proceedings Article
19 Apr 2015
TL;DR: An overview of the underlying architecture and the novel technologies in the EVS codec is given, and listening test results showing the performance of the new codec in terms of compression and speech/audio quality are presented.
Abstract: The recently standardized 3GPP codec for Enhanced Voice Services (EVS) offers new features and improvements for low-delay real-time communication systems. Based on a novel, switched low-delay speech/audio codec, the EVS codec contains various tools for better compression efficiency and higher quality for clean/noisy speech, mixed content and music, including support for wideband, super-wideband and full-band content. The EVS codec operates in a broad range of bitrates, is highly robust against packet loss and provides an AMR-WB interoperable mode for compatibility with existing systems. This paper gives an overview of the underlying architecture as well as the novel technologies in the EVS codec and presents listening test results showing the performance of the new codec in terms of compression and speech/audio quality.

91 citations


Journal Article
TL;DR: A set of steganalysis features based on the probability of same pulse position is presented for detecting adaptive multirate (AMR) audio steganography; a support vector machine is applied to the proposed features and used as the steganalyzer.
Abstract: This paper presents a method for detection of adaptive multirate (AMR) audio steganography. The AMR audio codec is an audio data compression scheme optimized for speech coding and widely used in some mobile telecommunications systems. AMR audio steganography schemes have emerged recently; they embed secret messages by modifying the nonzero pulse positions determined by the fixed codebook search in the AMR compression procedure. Those methods have high embedding capacity and good imperceptibility. We have observed that those steganography schemes cause the probability of same pulse positions in the same track to increase. Based on this phenomenon, this paper presents a set of steganalysis features of the probability of same pulse position. The support vector machine is applied to the proposed features and used as the steganalyzer. The performance of the scheme is tested on a database containing $\sim 140\,714$ audios. Experimental results show that the correct detection rate of our proposed method is >90% when the embedding bit rate is 30% or above, and can reach above 85% for cover audios.

47 citations
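
The feature described in the abstract above can be illustrated with a minimal sketch that estimates, per track, how often two pulses share the same position. The input layout (tuples of track index and two pulse positions, one tuple per track per subframe) and the function name are assumptions made for this example; this is not the authors' implementation, and the paper's full feature set and SVM classifier are not reproduced.

```python
from collections import defaultdict


def same_pulse_position_probability(pulse_pairs):
    """Estimate, per track, the probability that two pulses coincide.

    pulse_pairs: iterable of (track, pos_a, pos_b) tuples. This layout is
    a hypothetical one chosen for the example, not taken from the paper.
    """
    same = defaultdict(int)   # count of pairs with identical positions
    total = defaultdict(int)  # count of all pairs seen on the track
    for track, pos_a, pos_b in pulse_pairs:
        total[track] += 1
        if pos_a == pos_b:
            same[track] += 1
    return {track: same[track] / total[track] for track in sorted(total)}


# Toy input: two observations on track 0, two on track 1.
example_pairs = [(0, 3, 3), (0, 1, 5), (1, 2, 2), (1, 2, 2)]
features = same_pulse_position_probability(example_pairs)
print(features)
```

A feature vector built this way could be passed to any standard classifier; the abstract states that a support vector machine was used.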


Proceedings Article
19 Apr 2015
TL;DR: An in-depth insight is provided into 3GPP's rigorous and transparent processes that made it possible for the mobile industry, with its many competing players, to successfully develop and standardize a codec in an open, fair and constructive process.
Abstract: A new codec for Enhanced Voice Services (EVS), the successor of the current mobile HD voice codec AMR-WB, was standardized by the 3rd Generation Partnership Project (3GPP) in September 2014. The EVS codec addresses 3GPP's needs for cutting-edge technology enabling operation of 3GPP mobile communication systems in the most competitive means in terms of communication quality and efficiency. This paper provides an in-depth insight into 3GPP's rigorous and transparent processes that made it possible for the mobile industry, with its many competing players, to successfully develop and standardize a codec in an open, fair and constructive process. This paper also enables an understanding of this achievement by providing an overview of the EVS codec technology, the standard specifications, and the performance of the codec that will elevate HD voice services to the next quality level.

40 citations


Journal Article
TL;DR: This paper presents first-time empirical evidence for masking in the perception of wideband vibrotactile signals, and presents a bitrate scalable haptic texture codec, which incorporates the masking model and describes its subjective and objective performance evaluation.
Abstract: Applications involving indirect interpersonal communication, such as collaborative design/assembly/exploration of physical objects, can benefit strongly from the transmission of contact-based haptic media, in addition to the more traditional audiovisual media. Inclusion of haptic media has been shown to improve immersiveness, task performance, and the overall experience of task execution. While several decades of research have been dedicated to the acquisition, processing, coding, and display of audio and video streams, similar aspects for haptic streams have been addressed only recently. Simultaneous masking is a perceptual phenomenon widely exploited in the compression of audio data. In the first part of this paper, to the best of our knowledge, we present first-time empirical evidence for masking in the perception of wideband vibrotactile signals. Our results show that this phenomenon for haptics is very similar to its auditory analog. Signals closer in frequency to a powerful masker (25 dB above detection threshold) are masked more strongly (peak threshold-shifts of up to 28 dB) than those away from the masker (threshold-shifts of 15–20 dB). The masking curves approximately follow the masker's spectral profile. In the second part of this paper, we present a bitrate scalable haptic texture codec that incorporates the masking model, and describe its subjective and objective performance evaluation. Experiments show that we can drive down the codec output bitrate to a very low value of 2.3 kbps, without the subjects being able to reliably discriminate between the codec input and distorted output texture signals.

32 citations


Proceedings Article
19 Apr 2015
TL;DR: An LPC- and MDCT-based audio coder, part of the new 3GPP codec for Enhanced Voice Services, is presented; it aims to alleviate the constraints that arise when a time-domain speech coder is extended with a frequency-domain mode at low delay.
Abstract: Speech coders operating in time domain can be extended with a frequency domain mode to improve encoding of music, even though this is challenging at low delay. In such a scenario, the short analysis window limits the benefit of the transform coder, while a delayless switch between the two coders constrains the system further. The paper presents an LPC- and MDCT-based audio coder, part of the new 3GPP codec for Enhanced Voice Services, which aims to solve these issues. Several advanced coding tools are introduced to alleviate the constraints: transient handling is improved, harmonic structures are better preserved, and the modeling of the zero-quantized frequencies is enhanced. Test results show that the obtained low-delay switched coder brings a clear improvement over a speech coder and is competitive even in comparison to audio coders with higher delay.

32 citations


Proceedings Article
19 Apr 2015
TL;DR: All aspects of the advances brought during the EVS development on packet loss concealment are outlined, by presenting a high level description of all technical features present in the final standardized codec.
Abstract: EVS, the newly standardized 3GPP codec for Enhanced Voice Services, was developed for mobile services such as VoLTE, where error resilience is essential. The presented paper outlines all aspects of the advances brought during the EVS development on packet loss concealment, by presenting a high level description of all technical features present in the final standardized codec. Coupled with jitter buffer management, the EVS codec provides robustness against late or lost packets. The advantages of the new EVS codec over reference codecs are further discussed based on listening test results.

30 citations


Journal Article
TL;DR: Experimental results show that the self-embedding speech signal is recoverable with proper speech quality for high tampering rates, without significant loss in the quality of the original speech signal.
Abstract: Authentication and tampering detection of digital signals is one of the main applications of digital watermarking. Recently, watermarking algorithms for digital images have been developed not only to detect image tampering, but also to recover the lost content to some extent. In this paper, a new watermarking scheme is introduced to generate digital self-embedding speech signals enjoying the self-recovery feature. For this purpose, the compressed version of the speech signal, generated by a speech codec and protected against tampering by proper channel coding, is embedded into the original speech signal. Experimental results show that the self-embedding speech signal is recoverable with proper speech quality for high tampering rates, without significant loss in the quality of the original speech signal.

24 citations


Proceedings Article
19 Apr 2015
TL;DR: Subjective measurements show that the proposed method gives a statistically significant improvement in perceptual quality when the bit-rate is held constant, and the proposed method has been adopted into the 3GPP Enhanced Voice Services speech coding standard.
Abstract: Unified speech and audio codecs often use a frequency domain coding technique of the transform coded excitation (TCX) type. It is based on modeling the speech source with a linear predictor, spectral weighting by a perceptual model and entropy coding of the frequency components. While previous approaches have used neighbouring frequency components to form a probability model for the entropy coder of spectral components, we propose to use the magnitude of the linear predictor to estimate the variance of spectral components. Since the linear predictor is transmitted in any case, this method does not require any additional side info. Subjective measurements show that the proposed method gives a statistically significant improvement in perceptual quality when the bit-rate is held constant. Consequently, the proposed method has been adopted into the 3GPP Enhanced Voice Services speech coding standard.

23 citations


Patent
25 Feb 2015
TL;DR: A video encoding and decoding system that implements an adaptive transfer function method internally within the codec for signal representation is described. The transfer function may be the same as that of the input video data or may be internal to the codec, and the encoded video data may be decoded and expanded into the dynamic range of the display(s).
Abstract: A video encoding and decoding system that implements an adaptive transfer function method internally within the codec for signal representation. A focus dynamic range representing an effective dynamic range of the human visual system may be dynamically determined for each scene, sequence, frame, or region of input video. The video data may be cropped and quantized into the bit depth of the codec according to a transfer function for encoding within the codec. The transfer function may be the same as the transfer function of the input video data or may be a transfer function internal to the codec. The encoded video data may be decoded and expanded into the dynamic range of display(s). The adaptive transfer function method enables the codec to use fewer bits for the internal representation of the signal while still representing the entire dynamic range of the signal in output.

23 citations


Proceedings Article
Anssi Rämö, Henri Toukomaa
19 Apr 2015
TL;DR: EVS is compared to Opus, an IETF-driven open source codec, as well as to industry standard voice codecs (3GPP AMR and AMR-WB, and ITU-T G.718B, G.722.1C and G.719) and to direct signals at varying bandwidths.
Abstract: This paper discusses the voice and audio quality characteristics of EVS, the recently standardized 3GPP codec. EVS was compared to Opus, an IETF-driven open source codec, as well as to industry standard voice codecs (3GPP AMR and AMR-WB, and ITU-T G.718B, G.722.1C and G.719) and to direct signals at varying bandwidths. Voice and audio quality was evaluated with three subjective listening tests containing clean and noisy speech in Finnish language as well as a mixed condition test containing both speech and music intermixed. Nine-scale subjective mean opinion score was calculated for all tested conditions.

22 citations


Proceedings Article
19 Apr 2015
TL;DR: The time-domain bandwidth extension (TBE) framework employed to code wideband and super-wideband speech in the newly standardized 3GPP EVS codec shows significantly improved quality compared to the other standardized SWB codecs under both clean speech and speech with background noise.
Abstract: This paper describes the time-domain bandwidth extension (TBE) framework employed to code wideband and super-wideband speech in the newly standardized 3GPP EVS codec. The TBE algorithm uses a nonlinear harmonic modeling technique that incorporates principles of time-domain envelope-modulated noise mixing. At 13.2 kbps, the super-wideband coding of speech uses as low as 1.55 kbps for encoding the spectral content from 6.4–14.4 kHz. Subjective evaluation results from ITU-T P.800 Mean Opinion Score (MOS) tests are provided, showing significantly improved quality compared to the other standardized SWB codecs under both clean speech and speech with background noise.

Patent
11 Sep 2015
TL;DR: A method of sending and receiving (transceiving) an audio stream in a wireless communication system is described, in which a first device receives reference codec information from a second device and performs an audio codec negotiation procedure to determine the audio codec used to send and receive the audio stream to and from a third device.
Abstract: Disclosed herein is a method of sending and receiving (or transceiving) an audio stream in a wireless communication system. The method performed by a first device includes receiving reference codec information from a second device, performing an audio codec negotiation procedure for determining an audio codec to be used to send and receive the audio stream to and from a third device, checking whether the audio codec determined through the audio codec negotiation procedure is identical with the audio codec supportable by the second device, and operating in bypass mode or conversion mode based on a result of the check.

Journal Article
TL;DR: Based on the analysis, it is found that the sample repetition rate of the AMR decompressed waveform is significantly greater than that of the regular waveform, and the experimental results show that this feature is robust and effective.

Proceedings Article
28 Dec 2015
TL;DR: A reliable speech and music discriminator (SMD) is designed for switched speech/audio coding, and objective measures show that a more reliable switching decision is achievable.
Abstract: Switching between speech coding and generic audio coding schemes was recently proven to be very efficient for coding a large range of audio materials at low bit-rates. However, it strongly relies on a robust classification of the input signal. The aim of the paper is to design a reliable speech and music discriminator (SMD) for such an application. Particular attention was paid to getting a good tradeoff between accuracy, reactivity and stability of the decision while keeping the delay and complexity reasonably low. To this end, short-term and long-term features are dissociated before being conveyed to two different classifiers. The two classifier outputs are combined in a final decision using a hysteresis. Objective measures show that a more reliable switching decision is achievable. The SMD was successfully implemented in MPEG Unified Speech and Audio Coding (USAC). It allows the codec to show unprecedented audio quality.
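
The idea of combining two classifier outputs through a hysteresis can be sketched as follows. Everything specific here is an assumption made for illustration: the scores are taken to be music likelihoods in [0, 1], the two outputs are combined by a plain average, and the thresholds 0.4 and 0.6 are arbitrary. The abstract does not state how the paper's SMD combines its classifiers.

```python
def hysteresis_decisions(short_scores, long_scores,
                         low=0.4, high=0.6, initial="speech"):
    """Combine two per-frame score streams into a stable speech/music decision.

    The averaging rule and the thresholds are hypothetical choices for this
    sketch. The state only changes when the combined score leaves the
    [low, high] band, which suppresses rapid toggling between coders.
    """
    state = initial
    decisions = []
    for short_score, long_score in zip(short_scores, long_scores):
        combined = 0.5 * (short_score + long_score)
        if combined > high:
            state = "music"
        elif combined < low:
            state = "speech"
        # Otherwise keep the previous state (the hysteresis band).
        decisions.append(state)
    return decisions


short_term = [0.2, 0.55, 0.9, 0.5, 0.1]
long_term = [0.2, 0.55, 0.7, 0.5, 0.3]
decisions = hysteresis_decisions(short_term, long_term)
print(decisions)
```

In this toy run the second and fourth frames fall inside the band, so each simply inherits the decision of the frame before it.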

Proceedings Article
01 Feb 2015
TL;DR: This work characterizes a speech codec in a Compressive Sensing (CS) framework, demonstrating simultaneous compression and de-noising of speech by CS and appropriate quantization of CS measurements to design a medium bit-rate codec.
Abstract: Medium bit rate hybrid speech coding schemes have gained much interest in recent years and many of them have been standardized for various applications. This work characterizes a speech codec in a Compressive Sensing (CS) framework. We mainly demonstrate two aspects: 1) simultaneous compression and de-noising of speech by CS, and 2) appropriate quantization of CS measurements to design a medium bit-rate codec. The proposed scheme renders better quality speech compared to CELP, the widely used hybrid coding scheme, at the same bit rates. The CS speech codec has the added advantage of inherent noise suppression and easy scalability, without complex parameter extractions and voice activity detections.

Proceedings Article
28 Dec 2015
TL;DR: Focusing on the stereo coding aspect of a unified parametric coding block, it is demonstrated that, by using specially chosen spectral configurations when deriving the parametric side-information in the encoder, perceptual artifacts can be reduced and the spatial processing in the decoder can remain real-valued.
Abstract: Traditional audio codecs based on real-valued transforms utilize separate and largely independent algorithmic schemes for parametric coding of noise-like or high-frequency spectral components as well as channel pairs. It is shown that in the frequency-domain part of coders such as Extended HE-AAC, these schemes can be unified into a single algorithmic block located at the core of the modified discrete cosine transform path, enabling greater flexibility like semi-parametric coding and large savings in codec delay and complexity. This paper focuses on the stereo coding aspect of this block and demonstrates that, by using specially chosen spectral configurations when deriving the parametric side-information in the encoder, perceptual artifacts can be reduced and the spatial processing in the decoder can remain real-valued. Listening tests confirm the benefit of our proposal at intermediate bit-rates.

Proceedings Article
Stefan Bruhn, Tomas Frankkila, Frederic Gabin, Karl Hellwig, Maria Hultström
01 Dec 2015
TL;DR: System aspects relating to EVS codec introduction in VoLTE and CS networks as well as interworking and mobility with legacy systems and services are described.
Abstract: The Enhanced Voice Services (EVS) codec was standardized by 3GPP in 2014. This codec offers significant gains in voice quality, efficiency, channel error robustness over any other existing speech codec and far better music quality. Operators run voice services on a large installed base of 3GPP Circuit-Switched (CS) 2G (GERAN) or 3G (UTRAN) radio networks. These networks offer mobile voice service either as HD voice using the AMR-WB codec, or as traditional narrowband (NB) voice service, based on the AMR codec. Voice over LTE (VoLTE) is currently being deployed throughout the world with HD Voice. The EVS codec will be first introduced as a straightforward VoLTE upgrade. It is also expected that EVS will be deployed over 3G CS networks. This paper describes system aspects relating to EVS codec introduction in VoLTE and CS networks as well as interworking and mobility with legacy systems and services.

Journal Article
TL;DR: The rate-distortion performance of the proposed codec is superior to the state-of-the-art CS-based video codec, although there is still a considerable gap between it and traditional video codecs.
Abstract: This paper presents a compressive-sensing- (CS-) based video codec which is suitable for wireless video systems that require simple encoders but tolerate more complex decoders. At the encoder side, each video frame is independently measured by a block-based random matrix, and the resulting measurements are encoded into a compressed bitstream by entropy coding. Specifically, to reduce the quantization errors of measurements, a nonuniform quantization is integrated into the DPCM-based quantizer. At the decoder side, a novel joint reconstruction algorithm is proposed to improve the quality of reconstructed video frames. Firstly, the proposed algorithm uses the temporal autoregressive (AR) model to generate the Side Information (SI) of the video frame, and next it recovers the residual between the original frame and the corresponding SI. To exploit the sparse property of the residual with locally varying statistics, Principal Component Analysis (PCA) is used to learn online the transform matrix adapting to residual structures. Extensive experiments validate that the joint reconstruction algorithm in the proposed codec achieves much better results than many existing methods with consideration of the reconstructed quality and the computational complexity. The rate-distortion performance of the proposed codec is superior to the state-of-the-art CS-based video codec, although there is still a considerable gap between it and traditional video codecs.

Book Chapter
20 Sep 2015
TL;DR: This review surveys the dynamic functioning of the Opus codec within a Web Real-Time Communication (WebRTC) framework based on the Google Chrome browser and finds that WebRTC framework-coded speech achieves a similar MOS assessment compared to stand-alone Opus coding.
Abstract: The Internet Engineering Task Force (IETF) – the open Internet standards-development body – considers the Opus codec as a highly versatile audio codec for interactive voice and music transmission. In this review we survey the dynamic functioning of the Opus codec within a Web Real-Time Communication (WebRTC) framework based on the Google Chrome browser. The codec behavior and the effectively utilized features during the active communication process are tested and analyzed under various testing conditions. In the experiments, we verify the Opus performance and interactivity. Relevant codec parameters can easily be adapted in application development. In addition, WebRTC framework-coded speech achieves a similar MOS assessment compared to stand-alone Opus coding.


Proceedings Article
06 Jul 2015
TL;DR: This plenary session will cover speech processing research advances with the emphasis on speech and audio coding methods and how long-term speech parameters can be used as predictors of other diseases such as tremors, Alzheimer's etc.
Abstract: This plenary session will cover speech processing research advances with the emphasis on speech and audio coding methods. In the session, we will discuss the fundamental principles, techniques, and algorithms used in current coding applications including a summary of codecs for telecommunication standards. The session will start with a discussion on the basic speech representation methods, the performance measures used to evaluate coded speech, and the role of the standards. Brief algorithm descriptions include: ADPCM, sub-band coding, adaptive transform coding, sinusoidal transform coding (STC), linear predictive coding (LPC), and analysis-by-synthesis LPC (sparse excitation, code excited LPC, and ACELP). The presentation will feature audio and computer demonstrations of recent speech coding standards including voice-over IP algorithms. The plenary session will also cover wideband audio standards such as MPEG audio and other layers (e.g., MP3, AAC). Recent algorithms will also be described including the following: Variable-Rate Multimode Wideband (VMR-WB), Speex, G.722.1, OGG Vorbis 2012, iLBC, CELT, SILK, Opus 2013, Qualcomm wideband 5G codecs. At the end of the session, we will cover briefly recent applications that use voice features for detecting speech pathologies, and also discuss how long-term speech parameters can be used as predictors of other diseases such as tremors, Alzheimer's etc.

01 Jun 2015
TL;DR: This document defines the Real-time Transport Protocol payload format for packetization of Opus-encoded speech and audio data necessary to integrate the codec in the most compatible way and describes media type registrations for the RTP payload format.
Abstract: This document defines the Real-time Transport Protocol (RTP) payload format for packetization of Opus-encoded speech and audio data necessary to integrate the codec in the most compatible way. It also provides an applicability statement for the use of Opus over RTP. Further, it describes media type registrations for the RTP payload format.

Book Chapter
01 Jan 2015
TL;DR: This paper estimated the maximum number of LTE users that can be supported over an enhanced node B (eNodeB) using different bandwidth levels and observed that the AMR codec with a semi-persistent scheduling scheme uses fewer control channels and accommodates more users.
Abstract: The long-term evolution (LTE) network is fully IP-based and does not include a circuit-switched domain for voice communication as known from GSM and UMTS networks. Continuous switching from active to inactive state is one of the challenging tasks for VoIP in LTE. Because of this switching and retransmissions, control channels come into function. The control channel is one of the major limitations for capacity in VoIP. In this paper, we analyzed different scheduling techniques to reduce the number of control channels in the network. We estimated the maximum number of LTE users that can be supported over an enhanced node B (eNodeB) using different bandwidth levels. From the numerical results, we observed that the AMR codec with a semi-persistent scheduling scheme uses fewer control channels and accommodates more users.

Proceedings Article
01 Jan 2015
TL;DR: The audio quality, assessed both objectively and subjectively, shows that the proposed design outperforms the well-known commercial codec.
Abstract: Wireless audio service attracts increasing attention both in the mobile and in the home audio market. For high quality audio service, especially on wireless networks, an audio codec with low latency is indispensable since latency prevents a user from being immersed in the service. This paper presents a low latency audio coder design with a short length window. Side effects due to the short length window are mitigated by the proposed coding tools such as pitch coding and low frequency masking threshold control. The audio quality, assessed both objectively and subjectively, shows that the proposed design outperforms the well-known commercial codec.

Proceedings Article
01 Dec 2015
TL;DR: A frame interleaving algorithm is developed to reorder the stereo video frames into a monocular video, such that the proposed codec can gain advantage from inter-view and temporal correlations to improve its coding performance.
Abstract: Development of stereo video codecs in the latest multi-view extension of HEVC (MV-HEVC) with higher compression efficiency has been an active area of research. In this paper, a frame interleaved stereo video coding scheme based on the MV-HEVC standard codec is proposed. The proposed codec applies a reduced layer approach to encode the frame interleaved stereo sequences. A frame interleaving algorithm is developed to reorder the stereo video frames into a monocular video, such that the proposed codec can gain advantage from inter-view and temporal correlations to improve its coding performance. To evaluate the performance of the proposed codec, three standard multi-view test video sequences, named “Poznan_Street”, “Kendo” and “Newspaper1”, were selected and coded using the proposed codec and the standard MV-HEVC codec at different QPs and bitrates. Experimental results show that the proposed codec gives a significantly higher coding performance than that of the standard MV-HEVC codec at all bitrates.

Proceedings Article
19 Apr 2015
TL;DR: A novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay.
Abstract: In this paper a novel technique is presented to efficiently mix traditional ACELP time domain coding with a frequency domain coding model to improve the quality of generic audio signals coded at low bitrates without additional delay. The paper discusses how to integrate parts of a traditional Algebraic Code Excited Linear Prediction (ACELP) speech codec to create a time-domain contribution which coexists with a frequency based coding model. A mechanism to determine the value of the time-domain contribution is proposed and a method is described how the frequency-domain contribution might be added without increasing the overall delay of the codec. The proposed method forms part of the recently standardised 3GPP EVS codec.

Proceedings Article
19 Apr 2015
TL;DR: A low-complexity version of the closed-loop mode decision is proposed that estimates the coding distortion of each mode without encoding and decoding the input, and selects the mode with the lowest distortion; it yields similar performance at lower complexity.
Abstract: Several state-of-the-art switched audio codecs employ the closed-loop mode decision to select the best coding mode at every frame. The closed-loop mode selection is known to have good performance but also high complexity. The new approach we propose in this paper is a low-complexity version of the closed-loop approach, based on similar decisions which compute the coding distortion of each mode and select the one with the lowest distortion. Our approach differs mainly in the way the coding distortions are calculated. We are able to notably reduce the complexity by only estimating the distortions without encoding and decoding the input for each mode. The new approach was implemented in the EVS codec standard and evaluated both objectively and subjectively. Compared to the closed-loop approach, it yields similar performance and lower complexity.
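
The selection rule described above (estimate a distortion per mode, pick the lowest) can be sketched generically. The estimator functions below are arbitrary stand-ins invented for this example; the abstract does not say how the EVS codec actually estimates distortion, so only the selection step is illustrated.

```python
def select_mode(frame, estimators):
    """Pick the coding mode whose *estimated* distortion is lowest.

    estimators: dict mapping a mode name to a function that returns a
    distortion estimate for the frame without running the full
    encode/decode loop. All names here are hypothetical.
    """
    estimates = {mode: estimate(frame) for mode, estimate in estimators.items()}
    best_mode = min(estimates, key=estimates.get)
    return best_mode, estimates[best_mode]


# Toy estimators, chosen only so that the example runs.
toy_estimators = {
    "mode_a": lambda frame: sum(abs(x) for x in frame),
    "mode_b": lambda frame: 2 * max(abs(x) for x in frame),
}
chosen_mode, chosen_estimate = select_mode([0.1, -0.2, 0.4], toy_estimators)
print(chosen_mode)
```

A full closed-loop decision would replace each estimator with an actual encode, decode and distortion measurement, which is the cost the paper aims to avoid.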

30 Sep 2015
TL;DR: A major effort quantifies the speech intelligibility associated with a range of narrowband, wideband, and fullband digital audio coding algorithms in various acoustic noise environments and identifies codec modes that produce MRT intelligibility values that meet or exceed those of analog FM.
Abstract: We describe a major effort to quantify the speech intelligibility associated with a range of narrowband, wideband, and fullband digital audio coding algorithms in various acoustic noise environments. The work emphasizes the relationship between these intelligibility results and analogous ones for an analog FM land-mobile radio reference. The initial phase of this project includes 54 noise environments and 83 audio codec modes. We use an objective intelligibility estimator to narrow the scope and then design a practically sized modified rhyme test (MRT) covering 6 challenging yet relevant noise environments and 28 codec modes for a total of 168 conditions. The MRT used 36 subjects to produce 432 trials for each condition. Results show that intelligibility depends strongly on noise environment, data rate, and audio bandwidth. For each noise environment we identify codec modes that produce MRT intelligibility values that meet or exceed those of analog FM. We expect that these results can inform some of the design and provisioning decisions required in the development of mission-critical voice applications for LTE.
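
The test-size figures quoted in the abstract above can be checked with a few lines of arithmetic. The per-subject figure assumes the 432 trials per condition were split evenly across the 36 subjects, which the abstract does not state explicitly.

```python
noise_environments = 6
codec_modes = 28
subjects = 36
trials_per_condition = 432

# 6 noise environments x 28 codec modes gives the number of MRT conditions.
conditions = noise_environments * codec_modes

# Assumption: trials are divided evenly among subjects.
trials_per_subject_per_condition = trials_per_condition // subjects

# Total number of trials over the whole MRT.
total_trials = conditions * trials_per_condition

print(conditions)                        # 168
print(trials_per_subject_per_condition)  # 12
print(total_trials)                      # 72576
```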

Book Chapter
20 Sep 2015
TL;DR: The article establishes the general trends of speech coding algorithms based on linear prediction, examines the main procedures of their formation, and presents results of experimental studies of the developed adaptive low bit-rate coding algorithms.
Abstract: The article establishes the general trends of speech coding algorithms based on linear prediction. The task of adaptation of speech codec to the statistical characteristics of the coding parameters is set and accomplished. The main procedures of their forming are examined. The results of experimental studies of the developed adaptive low bit-rate coding algorithms are presented. The benefits of the quality of remade speech in comparison with algorithms on FS1015, FS1017 and FS1016 standards and Full-rate GSM are displayed.

Patent
12 May 2015
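TL;DR: A method is described in which, on receiving a request to change the coding rate of a multimode audio codec to a rate belonging to another mode of operation while the current input frame is an active region of the audio signal, the codec maintains its current operating mode and reduces its coding rate to one lower than the requested rate.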
Abstract: There is inter alia a method comprising: receiving a request to change the coding rate of a multimode audio codec; determining that the request corresponds to a coding rate of another mode of operation of the multimode audio codec; determining a frame of an input audio signal of the multimode audio codec to be an active region of the audio signal; maintaining a current operating mode of the multimode audio codec; and reducing the coding rate of the multimode audio codec to a coding rate lower than the requested coding rate.