Showing papers on "Enhanced Variable Rate Codec" published in 2004


01 Dec 2004
TL;DR: This memo defines an Experimental Protocol for the Internet community that enables graceful speech quality degradation in the case of lost frames, which occurs in connection with lost or delayed IP packets.
Abstract: This document specifies a speech codec suitable for robust voice communication over IP. The codec is developed by Global IP Sound (GIPS). It is designed for narrow band speech and results in a payload bit rate of 13.33 kbit/s for 30 ms frames and 15.20 kbit/s for 20 ms frames. The codec enables graceful speech quality degradation in the case of lost frames, which occurs in connection with lost or delayed IP packets. This memo defines an Experimental Protocol for the Internet community.

104 citations
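
The two bit rates quoted in the abstract above can be checked against the frame durations with simple arithmetic: payload bits per frame equal the bit rate multiplied by the frame length. The sketch below is only an illustration of that calculation; the function name is hypothetical and does not come from the memo.

```python
def payload_bytes_per_frame(bit_rate_kbps: float, frame_ms: float) -> int:
    """Return the payload size of one frame in bytes.

    kbit/s multiplied by ms gives bits, so no extra scaling is needed.
    The result is rounded because 13.33 kbit/s is itself a rounded figure.
    """
    bits = bit_rate_kbps * frame_ms
    return round(bits / 8)


# The two operating points named in the abstract.
for rate_kbps, frame_ms in [(13.33, 30), (15.20, 20)]:
    size = payload_bytes_per_frame(rate_kbps, frame_ms)
    print(f"{rate_kbps} kbit/s with {frame_ms} ms frames: {size} bytes per frame")
```

Under this calculation, the longer frame carries more bytes per packet but at a lower overall bit rate, which is the trade-off between the two modes.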


Journal ArticleDOI
TL;DR: A region-based video codec is presented, which is compatible with the H.263+ standard, and its associated rate control algorithm for low variable-bit-rate (VBR) video, which incorporates traditional block DCT coding as well as object-based coding.
Abstract: This paper presents a region-based video codec, which is compatible with the H.263+ standard, and its associated rate control algorithm for low variable-bit-rate (VBR) video. The proposed region-based coding scheme is a hybrid method that incorporates traditional block DCT coding as well as object-based coding. To achieve this, we adopt H.263+ as the platform, and develop a fast macroblock-based segmentation method to implement the new region-based codec. The associated rate control solution includes rate control in three levels: encoding frame selection, frame-layer rate control and macroblock-layer rate control. The goal is to improve human visual perceptual quality at low bit rates. The efficiency of the proposed rate control algorithm applied to the region-based H.263+ codec is demonstrated via several typical test sequences.

47 citations


PatentDOI
TL;DR: A method for preprocessing digital audio data is provided to prevent pauses in music signals in a cellular phone; AGC (Automatic Gain Control) preprocessing and PHE (Pitch Harmonics Enhancement) are applied to digital audio data having a low dynamic range.
Abstract: Recently, with the wider use of cellular phones, more and more users listen to music via their cellular phones, and thus the perceptual sound quality of music provided via cellular phones has become more critical. Since music signals are encoded by a voice encoding method optimized for human voice signals, such as EVRC (Enhanced Variable Rate Coding), in a cellular communication system, the music signals are often distorted by the encoding, and listeners experience pauses in the music caused by the voice-optimized encoding method. To improve the perceptual sound quality of music, a method for preprocessing digital audio data is provided to prevent pauses in music signals in a cellular phone. In particular, AGC (Automatic Gain Control) preprocessing and PHE (Pitch Harmonics Enhancement) are applied to digital audio data having a low dynamic range. By this method, the number of pauses in the music signal is reduced, and the perceptual sound quality of the music is improved.

31 citations


Patent
02 Dec 2004
TL;DR: A communication system is presented in which a profile of codecs is loaded at the originating end, and packets are coded and decoded using only the codec identified by a bitmap that is available to the DSP channel at both the originating and the terminating end.
Abstract: A communication system in which a profile of codecs is loaded at the originating end. The originating end sends the profile of codecs to the terminating end and receives in return an indication of which codec to use. The originating end and the terminating end create a reduced profile that includes the codec identified by the terminating side and its peers. The peers are codecs that use the same amount of resources as the selected codec or fewer resources than the selected codec. Even though the reduced profile includes more than one codec, packets are coded and decoded using only the codec identified by a bitmap that is available to the DSP channel at both the originating end and the terminating end. At a later time, a change can be made to a different codec in the reduced profile by changing the bitmap, without closing and re-opening the channel.

19 citations
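
The reduced-profile idea in the abstract above can be sketched in a few lines. This is a minimal illustration, not the patented implementation: the codec names, the integer resource costs, and the `Channel` class are all hypothetical, and the bitmap is modelled as an integer with exactly one bit set.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Codec:
    name: str
    cost: int  # hypothetical DSP resource units, invented for this sketch


def reduced_profile(profile, selected_name):
    """Keep the selected codec and its peers (same or lower resource cost)."""
    selected = next(c for c in profile if c.name == selected_name)
    return [c for c in profile if c.cost <= selected.cost]


class Channel:
    """One end of the connection; both ends would hold the same state."""

    def __init__(self, profile, selected_name):
        self.reduced = reduced_profile(profile, selected_name)
        self.bitmap = 0
        self.switch(selected_name)

    def switch(self, name):
        """Change the active codec by rewriting the bitmap only."""
        names = [c.name for c in self.reduced]
        if name not in names:
            raise ValueError(f"{name} is not in the reduced profile")
        self.bitmap = 1 << names.index(name)

    @property
    def active(self):
        # bit_length() - 1 is the index of the single set bit.
        return self.reduced[self.bitmap.bit_length() - 1].name


# Illustrative profile with made-up costs.
profile = [Codec("codec_a", 1), Codec("codec_b", 4), Codec("codec_c", 6)]
channel = Channel(profile, "codec_b")
channel.switch("codec_a")  # allowed: codec_a needs fewer resources
print(channel.active)
```

Because every codec in the reduced profile fits within the resources already reserved for the selected codec, a switch only touches the bitmap and the channel never has to be closed and re-opened.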


Proceedings ArticleDOI
17 May 2004
TL;DR: A new approach to reduce environmental background noise by modifying the codec parameters is discussed, which can be done as pre-processing before speech encoding or in the network by decoding the bitstream.
Abstract: The transmission of speech in mobile or packet networks requires the use of a speech codec. In order to improve the quality of speech in a noisy environment, a noise reduction algorithm is used. This noise reduction can either be done as pre-processing before speech encoding or in the network by decoding the bitstream, performing the speech enhancement in the time and/or frequency domain and re-encoding the speech. Both methods are computationally expensive. In this paper a new approach to reduce environmental background noise by modifying the codec parameters is discussed.

17 citations


Proceedings ArticleDOI
J. Makinen1, J. Vainio1
05 Apr 2004
TL;DR: A source signal based rate adaptation algorithm for the AMR codec in the GSM system can be used to increase the system capacity and further increase the robustness of the GSM AMR codec.
Abstract: The adaptive multirate (AMR) codec was standardised for GSM in 1999. AMR offers substantial improvement over previous GSM speech codecs in error robustness by adapting speech and channel coding depending on channel conditions. However, the current standard does not exploit the multirate capability of the AMR codec in source signal based adaptation that would optimise the average bit rate vs. quality trade-off. This paper presents a source signal based rate adaptation algorithm for the AMR codec in the GSM system. Together with fast power control, it can be used to increase the system capacity and further increase the robustness of the GSM AMR codec.

17 citations


Proceedings ArticleDOI
26 Sep 2004
TL;DR: Several optimization techniques are presented for efficient implementation of the ITU G.729 standard (CS-ACELP, conjugate structure algebraic code excited linear prediction) at a bit rate of 8 kbit/s on a real-time digital signal processor (DSP), with the aim of overcoming the limitation of computational burden and also scaling this application for enhanced speed to process more channels.
Abstract: Spectral efficiency is the most important aspect in wireless communication systems and cellular mobile radio. As speech transmissions are the most used form of communication in personal communication systems, low bit rate speech codecs play an important role in determining the system's spectral efficiency. A toll quality low bit rate speech codec that was proposed to meet the personal communication system's requirements is the CS-ACELP speech codec. The speech codec has high robustness to withstand high bit error rates and performs well in tandeming conditions, hence leading to efficient bandwidth utilization and increased channel capacity. In this paper, several optimization techniques are presented for efficient implementation of the ITU G.729 standard (CS-ACELP, conjugate structure algebraic code excited linear prediction) at a bit rate of 8 kbit/s on a real-time digital signal processor (DSP), with the aim of overcoming the limitation of computational burden and also scaling this application for enhanced speed to process more channels. These techniques are in general applicable to any speech codec and DSP platform.

14 citations


Patent
17 Nov 2004
TL;DR: A fast fixed-codebook search algorithm for voice encoding is proposed that performs four iterative circular searches at full rate and obtains the best code vector after each search; the four best code vectors are synthesized into four synthesized speech signals, and the final code vector is the one whose weighted mean square error against the original speech is minimal.
Abstract: The invention is a fast fixed-codebook search algorithm for voice encoding. At full rate it performs four iterative circular searches and obtains the best code vector after each one. The four best code vectors are synthesized into four synthesized speech signals, and the final best code vector is chosen as the one for which the weighted mean square error between the synthesized speech and the original speech is minimal. Each circular search has two steps: determining the position of each of the eight pulses on each track, and determining a pulse position on one track, a process that does not use iterative circular search. The method reduces the complexity of the EVRC algorithm and cuts the search time by about 25.8%, greatly improving efficiency.

12 citations


Journal ArticleDOI
TL;DR: The research presented here investigates the application of the narrow-band adaptive multirate speech codec and the wide-band AMR (WB-AMR) codec, both originally designed for the 200 kHz GSM channel, in the TDMA (TIA/EIA-136) 30-kHz system.
Abstract: A new system enhancement method is proposed for the EIA/TIA-136 system offering both channel operational range extension and improved performance within the current operational range. The existing time-division multiple-access (TDMA) (136) speech codec, the IS-641 enhanced full rate vocoder, operates at a fixed bit rate and does not allow the reallocation of bits to channel error protection as channel conditions degrade. The research presented here investigates the application of the narrow-band adaptive multirate (NB-AMR) speech codec and the wide-band AMR (WB-AMR) codec, both originally designed for the 200 kHz GSM channel, in the TDMA (TIA/EIA-136) 30-kHz system. In particular, we investigate adaptively allocating bits between NB/WB speech coding and error control coding within the limited channel bandwidth. Four modes out of 17 have been carefully chosen for the new TDMA/AMR system. Switching between codec rates as channel conditions change produces range extension below a C/I of 15 dB while also improving performance in the existing operational range above 15 dB. We keep the time slot formats unchanged so that our method is completely compatible with existing 136 systems.

11 citations


Proceedings ArticleDOI
06 Sep 2004
TL;DR: This paper investigates a gain loss control method in the speech parameter domain and finds that embedding noise and echo reduction into the speech codec and performing them on the codec parameters decreases complexity and permits integrating them in the network without delay or tandeming problems.
Abstract: When transmitting speech in mobile communication systems, speech is compressed using a speech codec. To improve the speech quality, disturbing background noise and acoustic echo are attenuated. A new approach consists of embedding these methods into the speech codec. The source coding performed by the encoder removes redundancy from the speech signal, and thus the bit rate is decreased. Accordingly, embedding the noise and echo reduction into the speech codec and performing them on the parameters decreases complexity and permits integrating them in the network without delay or tandeming problems. In this paper we investigate a gain loss control method in the speech parameter domain.

7 citations


Patent
Pankaj Rabha1
29 Sep 2004
TL;DR: Transcoding from EVRC to G.729ab uses LSP parameters interpolated from EVRC to G.729ab, the EVRC pitch as input to the G.729ab closed-loop pitch search, and a G.729ab fixed codebook pulse search limited to the positions of EVRC fixed codebook pulses together with positions of target-impulse correlation maxima on the subframe tracks.
Abstract: Transcoding from EVRC to G.729ab with LSP parameters interpolated from EVRC to G.729ab, EVRC pitch used as input to the G.729ab closed-loop pitch search, and G.729ab fixed codebook pulses found from a search limited to positions of EVRC fixed codebook pulses together with positions of target-impulse correlation maxima on the subframe tracks, or from a full track search if there are no EVRC pulses.

Proceedings ArticleDOI
Imre Varga1
01 Jan 2004
TL;DR: This contribution reports on the work in 3GPP release 6 on the standardization of a new audio codec for mobile multimedia applications including packet-switched streaming (PSS) and multimedia messaging (MMS).
Abstract: This contribution reports on the work in 3GPP release 6 on the standardization of a new audio codec for mobile multimedia applications including packet-switched streaming (PSS) and multimedia messaging (MMS). First, the design constraints, performance requirements, test plans, and selection rules were finalized for both PSS/MMS audio codecs and for the extended AMR-WB codec (AMR-WB+). The candidate codecs were as follows: the MPEG4 HE-AAC codec ("AAC+") for low and high bit-rate range, the Coding Technologies codec ("Enhanced AAC+") for low and high bit-rate range, and the Ericsson, Nokia and VoiceAge AMR-WB+ candidate codec for low bit-rate range. Next, extensive subjective listening testing was conducted. The test results showed good performance for the Enhanced AAC+ and for the AMR-WB+ candidates.

Proceedings ArticleDOI
J. Makinen1, P. Ojala1, H. Toukomaa1
18 Nov 2004
TL;DR: The adaptive multi-rate (AMR) speech codec offers substantial improvement over previous GSM speech codecs in error robustness by adapting speech and channel coding depending on channel conditions.
Abstract: The adaptive multi-rate (AMR) speech codec offers substantial improvement over previous GSM speech codecs in error robustness by adapting speech and channel coding depending on channel conditions. In GSM AMR, the trade-off between speech quality and average bit rate can be further improved by source signal based rate adaptation (SBRA). Together with fast power control, SBRA GSM AMR can be used as a variable rate codec, bringing reduced average bit rate and contributing to an increase in system capacity. SBRA GSM AMR was tested against the currently standardised SMV (selectable mode vocoder) variable rate speech codec. The paper also presents general descriptions of both the SBRA GSM AMR and SMV codecs.

Journal Article
TL;DR: The implemented AMR-WB system used 218 kbytes of program memory and 92 kbytes of data memory; its proper operation was confirmed by comparing a decoded speech signal sample-by-sample with that of a PC-based simulation, and its real-time operation was validated.
Abstract: This paper deals with the analysis and real-time implementation of a wide-band adaptive multirate speech codec (AMR-WB) using a fixed-point DSP, TI's TMS320C6201. In the AMR-WB codec, input speech is divided into two frequency bands, lower and upper, which are processed independently. The lower band signal is encoded based on the ACELP algorithm, and the upper band signal is processed using random excitation with a linear prediction synthesis filter. The implemented AMR-WB system used 218 kbytes of program memory and 92 kbytes of data memory. Its proper operation was confirmed by comparing a decoded speech signal sample-by-sample with that of a PC-based simulation. A maximum required time of 5.75 ms for processing a 20 ms frame of speech validates real-time operation of the implemented system.
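
The real-time claim in the abstract above rests on one comparison: the worst-case processing time per frame must not exceed the frame duration. The sketch below works through that check, reading the reported figure as 5.75 ms per 20 ms frame; the function names are hypothetical and the calculation ignores any other load on the DSP.

```python
def realtime_load(processing_ms: float, frame_ms: float) -> float:
    """Fraction of each frame period spent processing that frame."""
    return processing_ms / frame_ms


def is_realtime(processing_ms: float, frame_ms: float) -> bool:
    """A codec keeps up in real time if a frame is done before the next arrives."""
    return realtime_load(processing_ms, frame_ms) <= 1.0


load = realtime_load(5.75, 20.0)
print(f"worst-case load: {load:.2%}, real-time: {is_realtime(5.75, 20.0)}")
```

A load well below 100% also indicates headroom on the processor, although how that headroom could be used is not stated in the abstract.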

Patent
13 Oct 2004
TL;DR: In this paper, a method and system for extracting/combining/transmitting audio data by using a microphone of a mobile terminal are provided to allow a user to combine video data and a user's voice, and transmit the corresponding multimedia data to a receiver.
Abstract: PURPOSE: A method and system for extracting/combining/transmitting audio data by using a microphone of a mobile terminal are provided to allow a user to combine video data and a user's voice, and transmit the corresponding multimedia data to a receiver. CONSTITUTION: When a key input for photographing and recording is inputted from a key input unit(S310), a CPU(Central Processing Unit) transmits a record function for setting a sound media as a recording audio in order to record audio data to an EVRC(Enhanced Variable Rate Coding) vocoder(S320). When a key input for an audio volume is inputted from the key input unit(S330), the CPU transfers a volume setting function for setting a volume of EVRC audio data to be extracted and a volume set value designated by the key input unit to the EVRC vocoder(S340). When the EVRC audio data is received from the EVRC vocoder(S350), the CPU stores the EVRC audio data in a data storing unit(S360).

Patent
18 Dec 2004
TL;DR: A device for sharing sound data of a terminal having a plurality of modems, and a method therefor, are provided to process sound data to be shared by modem chips that differ in format, thereby using system resources efficiently.
Abstract: PURPOSE: A device for sharing sound data of a terminal having a plurality of modems, and a method therefor are provided to process sound data to be shared by modem chips different in format, thereby efficiently using system resources. CONSTITUTION: An AMR(Adaptive Multi-Rate) vocoder(220) receives an analog speaker signal output from a speaker output end of a synchronous EVRC(Enhanced Variable Rate Codec) vocoder(210) of a terminal through a microphone input end. The EVRC vocoder(210) receives an analog speaker signal output from a speaker output end of the AMR vocoder(220) through a microphone input end. An ADPCM(Adaptive Differential Pulse Code Modulation) vocoder(230) processes sound data of the AMR vocoder(220) and the EVRC vocoder(210).

Patent
09 Jan 2004
TL;DR: A method for preprocessing digital audio data is provided to prevent pauses in music signals in a cellular phone; in particular, AGC (Automatic Gain Control) preprocessing and PHE (Pitch Harmonics Enhancement) are applied to digital audio data having a low dynamic range.
Abstract: Since music signals are encoded by a voice encoding method optimized for human voice signals, such as EVRC (Enhanced Variable Rate Coding), in a cellular communication system, the music signals are often distorted by the encoding, and listeners experience pauses in the music caused by the voice-optimized encoding method. To improve the perceptual sound quality of music, a method for preprocessing digital audio data is provided to prevent pauses in music signals in a cellular phone. In particular, AGC (Automatic Gain Control) preprocessing and PHE (Pitch Harmonics Enhancement) are applied to digital audio data having a low dynamic range. By this method, the number of pauses in the music signal is reduced, and the perceptual sound quality of the music is improved.

Proceedings ArticleDOI
21 Nov 2004
TL;DR: The paper emphasizes the implementation methodology and optimization techniques commonly used to realize speech codecs at bit rates lower than 8 kbps where an all-assembly implementation is necessary.
Abstract: In the recent past there has been a lot of effort in the development of speech coding algorithms and their implementations at bit rates lower than 8 kbps. To support the development of such complex algorithms, the computational power of digital signal processors (DSPs) has increased tremendously. This has led to DSPs enriched with dedicated hardware support for various application-specific features. The software development tools and compiler support have also improved to a large extent. This has cut the time and effort needed to implement speech codecs. On the other hand, the cost of development tools may be prohibitive for non-vendors, and at times high-level code conversion tools may not be present at all. This paper emphasizes the implementation methodology and optimization techniques commonly used to realize such systems where codec implementation entirely in assembly is necessary. The codec chosen for implementing these techniques was the GSM half rate (GSM HR) speech codec. The techniques described here are not limited to one speech codec but are applicable to any kind of speech codec.