Showing papers on "Enhanced Variable Rate Codec" published in 2003


Book ChapterDOI
01 Jan 2003
TL;DR: In this chapter those parts of the H.263 standard that make this codec more efficient than its predecessors will be explained.
Abstract: The H.263 Recommendation specifies a coded representation that can be used for compressing the moving picture components of audio-visual services at low bit rates. Detailed specifications of the first generation of this codec under the test model (TM) to verify the performance and compliance of this codec were finalised in 1995. The basic configuration of the video source algorithm in this codec is based on ITU-T Recommendation H.261, which is a hybrid of interpicture prediction to utilise temporal redundancy and transform coding of the residual signal to reduce spatial redundancy. However, during the course of the development of H.261 and the subsequent advances on video coding in MPEG-1 and MPEG-2 video codecs, substantial experience was gained, which has been exploited to make H.263 an efficient encoder. In this chapter those parts of the H.263 standard that make this codec more efficient than its predecessors will be explained.

82 citations


Patent
Yang Gao1, Adil Benyassine1, Jes Thyssen1, Eyal Shlomot1, Huan-Yu Su1 
08 Apr 2003
TL;DR: A speech compression system is disclosed that encodes a speech signal into a bitstream for subsequent decoding into synthesized speech, and that optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate against the perceptual quality of the reconstructed speech.
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

64 citations
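
The interplay between the per-frame rate choice and the resulting average bit rate described in the abstract above can be illustrated with a small calculation. This is a minimal sketch, not the patented method: the function name `average_bit_rate_kbps`, the 8 kbit/s full rate, and the sample frame sequence are illustrative assumptions; only the four rate classes (full, half, quarter, eighth) come from the abstract.

```python
# Hypothetical illustration: the rate fractions mirror the four codecs named in
# the abstract; the full-rate value is an assumed example, not from the patent.
RATE_FRACTIONS = {"full": 1.0, "half": 0.5, "quarter": 0.25, "eighth": 0.125}


def average_bit_rate_kbps(frame_rates, full_rate_kbps=8.0):
    """Return the mean bit rate over a sequence of per-frame rate labels."""
    if not frame_rates:
        raise ValueError("frame_rates must not be empty")
    total = sum(RATE_FRACTIONS[label] * full_rate_kbps for label in frame_rates)
    return total / len(frame_rates)


# Two full-rate frames, one half-rate frame, one eighth-rate frame.
frames = ["full", "full", "half", "eighth"]
print(average_bit_rate_kbps(frames))  # prints 5.25
```

Sending more frames at the lower rates pulls the average down, which is the quantity a rate-selection rule would steer toward its target while trying to preserve perceived quality.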


Patent
Yang Gao1, Adil Benyassine1, Jes Thyssen1, Eyal Shlomot1, Huan-Yu Su1 
08 Apr 2003
TL;DR: A speech compression system is disclosed that encodes a speech signal into a bitstream for subsequent decoding into synthesized speech, and that optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate against the perceptual quality of the reconstructed speech.
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

25 citations


Patent
08 Jan 2003
TL;DR: A method for transcoding a CELP-based compressed voice bitstream from a source codec to a destination codec by unpacking the CELP parameters from the input bitstream and, where the source and destination codec parameters differ, interpolating the unpacked parameters.
Abstract: Embodiments of a system and method relate to transcoding a CELP-based compressed voice bitstream from a source codec to a destination codec. The method includes processing a source codec input bitstream to unpack (1) CELP parameters from the input CELP bitstream, and may interpolate (2) the unpacked CELP parameters if a difference between the destination codec parameters and the source codec parameters exists. When the method maps (4) CELP parameters from the source codec format to the destination codec format, the parameter mapping strategy may be preset or selected (3). The method includes encoding the CELP parameters for the destination codec and producing a destination CELP bitstream by packing (7) the CELP parameters for the destination codec.

24 citations


Patent
14 Oct 2003
TL;DR: AGC (Automatic Gain Control) preprocessing is applied to audio data with a low dynamic range in order to reduce the number of pauses in music played over a cellular phone.
Abstract: With the wider use of cellular phones, more and more users listen to music on their phones, and the sound quality of music delivered this way has become more critical. Because music signals in a cellular communication system are encoded by a voice encoding method optimized for human voice signals, such as EVRC (Enhanced Variable Rate Coding), they are often distorted by the encoding, and listeners experience pauses in the music. To improve the sound quality of music, a method for preprocessing audio data is provided that prevents pauses in music signals on a cellular phone. In particular, AGC (Automatic Gain Control) preprocessing is applied to audio data with a low dynamic range. With this method, the number of pauses in the music signal is reduced and the sound quality of the music is improved.

24 citations
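
The general idea of gain control as a preprocessing step, as mentioned in the abstract above, can be sketched as follows. This is a generic frame-wise AGC under stated assumptions, not the algorithm claimed in the patent: the function name `simple_agc`, the 160-sample frame length, the target RMS level, the gain cap, and the smoothing factor are all hypothetical choices made for illustration.

```python
import math


def simple_agc(samples, frame_len=160, target_rms=0.25, max_gain=8.0, smoothing=0.9):
    """Scale each frame toward target_rms with a slowly varying, capped gain.

    Samples are assumed to be floats in [-1.0, 1.0]. All parameter defaults
    are illustrative assumptions, not values from the patent.
    """
    out = []
    gain = 1.0
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        # For (near-)silent frames keep the current gain instead of dividing by ~0.
        desired = min(max_gain, target_rms / rms) if rms > 1e-9 else gain
        # Smooth the gain across frames to avoid audible jumps.
        gain = smoothing * gain + (1.0 - smoothing) * desired
        # Apply the gain and clip to the valid sample range.
        out.extend(max(-1.0, min(1.0, gain * x)) for x in frame)
    return out


# A quiet 440 Hz tone sampled at 8 kHz, 50 frames long.
quiet = [0.01 * math.sin(2 * math.pi * 440 * n / 8000) for n in range(160 * 50)]
louder = simple_agc(quiet)
print(max(abs(x) for x in quiet[-160:]), max(abs(x) for x in louder[-160:]))
```

Raising the level of quiet passages before encoding is one plausible way to keep a voice codec from treating them as background noise; whether the patent uses this exact mechanism is not stated in the abstract.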


01 Jul 2003
TL;DR: This document describes the RTP payload format for Enhanced Variable Rate Codec (EVRC) Speech and Selectable Mode Vocoder (SMV) Speech, where a bundled/interleaved format is included to reduce the effect of packet loss on speech quality and amortize the overhead of the RTP header over more than one speech frame.
Abstract: This document describes the RTP payload format for Enhanced Variable Rate Codec (EVRC) Speech and Selectable Mode Vocoder (SMV) Speech. Two sub-formats are specified for different application scenarios. A bundled/interleaved format is included to reduce the effect of packet loss on speech quality and amortize the overhead of the RTP header over more than one speech frame. A non-bundled format is also supported for conversational applications.

21 citations
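
The header amortization mentioned in the abstract above can be made concrete with a short calculation. This is a rough sketch under stated assumptions: it counts only the 12-byte fixed RTP header, ignores the payload format's own header and table of contents as well as UDP/IP overhead, and assumes a 22-byte full-rate EVRC frame. The function name `rtp_header_share` is hypothetical.

```python
RTP_HEADER_BYTES = 12  # fixed RTP header, no CSRC entries or extensions


def rtp_header_share(frame_bytes, frames_per_packet):
    """Fraction of the packet (RTP header + speech frames) taken by the RTP header."""
    if frame_bytes <= 0 or frames_per_packet <= 0:
        raise ValueError("frame_bytes and frames_per_packet must be positive")
    payload = frame_bytes * frames_per_packet
    return RTP_HEADER_BYTES / (RTP_HEADER_BYTES + payload)


# Assumed 22-byte full-rate frame; compare one frame per packet with bundles.
for n in (1, 2, 5):
    print(n, f"{rtp_header_share(22, n):.1%}")
```

Bundling more frames per packet lowers the header share but adds buffering delay, which is why a non-bundled format suits conversational use.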


Proceedings ArticleDOI
06 Jul 2003
TL;DR: A practical low-complexity real-time video codec for mobile devices that can significantly reduce the computational cost, including a predictive algorithm for motion estimation, the integer discrete cosine transform (IntDCT), and a DCT/quantizer bypass technique is developed.
Abstract: Real-time software-based video codecs are widely used on PCs with relatively strong computing capability. However, mobile devices, such as pocket PCs and handheld PCs, still suffer from weak computational power, short battery lifetime and limited display capability. We developed a practical low-complexity real-time video codec for mobile devices. Several methods that can significantly reduce the computational cost are adopted in this codec and described in this paper, including a predictive algorithm for motion estimation, the integer discrete cosine transform (IntDCT), and a DCT/quantizer bypass technique. A real-time video communication implementation of the proposed codec is also introduced. Experiments show that substantial computation reduction is achieved while the loss in video quality is negligible. The proposed codec is very suitable for scenarios where low-complexity computing is required.

19 citations


Patent
Jari Mäkinen1, Pasi Ojala1
30 Oct 2003
TL;DR: In this paper, a method for performing variable rate speech coding in the speech codec comprising a plurality of speech codec modes operating at different bit rates, the speech encoded by said speech codec being arranged for transmission in a telecommunications network is described.
Abstract: A method for performing variable rate speech coding in the speech codec comprising a plurality of speech codec modes operating at different bit rates, the speech encoded by said speech codec being arranged for transmission in a telecommunications network. Information on an active speech codec mode set to be supported is received from the telecommunications network, in response to which the supported speech codec modes that correspond to the active codec mode set determined in the telecommunications network will be activated. Thereafter, speech signals to be applied to the speech codec are encoded with the activated speech codec modes such that the speech codec mode of the substantially lowest bit rate is adapted to the speech frames comprised by the speech signals such that in view of the channel conditions in the telecommunications network the level of residual error in coding will be substantially minimized at the same time.

15 citations


Proceedings ArticleDOI
06 Apr 2003
TL;DR: A novel transcoding algorithm for the adaptive multi rate (AMR) codec and the enhanced variable rate codec (EVRC) is proposed, which transcodes the parameters of one codec to the other without synthesizing the speech.
Abstract: A novel transcoding algorithm for the adaptive multi-rate (AMR) codec and the enhanced variable rate codec (EVRC) is proposed. In contrast to the conventional tandem transcoding algorithm, the proposed algorithm transcodes the parameters of one codec to the other without synthesizing the speech. The proposed algorithm decodes the parameters of the source codec from the input bitstream and, based on frame classification and mode decision, transforms them into those of the target codec in the parametric domain. Finally, the transformed parameters are encoded into a bitstream that is decodable by the target codec. The parameters transcoded by the proposed algorithm are line-spectral pair (LSP), pitch delay, fixed codevector, codebook gains, and frame energy. Evaluation results show that, while reducing both the computational complexity and delay by 50%, the proposed algorithm produces speech quality equivalent to that produced by the tandem transcoding algorithm. The general idea is not restricted to AMR and EVRC but is applicable to various other code-excited linear prediction (CELP) based codecs.

12 citations


Proceedings ArticleDOI
30 Nov 2003
TL;DR: It is concluded that the use of dedicated speech recognition codecs, such as DSR, does not offer tangible benefits in real-world systems and services.
Abstract: In this paper, we investigate the usefulness of general-purpose speech codecs and dedicated speech recognition codecs for speech-enabled services. Specifically, we focus on 3rd generation WCDMA systems using the adaptive multi-rate (AMR) speech codec, in comparison with the distributed speech recognition (DSR) framework. Speech recognition experiments are carried out with the AMR speech codec in a simulated packet-switched network. The performance of the DSR codec is assumed to be unaffected by transmission errors. Experimental results in British English and Mandarin Chinese indicate that no significant performance difference can be observed between the AMR- and DSR-based recognition systems. The gain from using the dedicated DSR codec is unlikely to provide a perceptible improvement in terms of quality of service for the end-users. In the light of the experimental results achieved, and other implementation and economical issues, it is concluded that the use of dedicated speech recognition codecs, such as DSR, does not offer tangible benefits in real-world systems and services.

6 citations



Journal Article
TL;DR: In this article, a noise suppression algorithm with high speech quality based on weighted noise estimation and MMSE STSA (Minimum Mean Square Error, Short Time Spectral Amplitude) is proposed.
Abstract: A noise suppression algorithm with high speech quality based on weighted noise estimation and MMSE STSA (Minimum Mean Square Error, Short Time Spectral Amplitude) is proposed. The proposed algorithm continuously updates the estimated noise by weighted noisy speech in accordance with an estimated SNR (Signal to Noise Ratio). With a better noise estimate, a more correct SNR is obtained resulting in enhanced speech with low distortion. Subjective evaluation results show that five-grade mean opinion scores of the new algorithm with a speech codec is improved by as much as 0.35, compared with either the original MMSE STSA or the EVRC (Enhanced Variable Rate Codec) noise suppression algorithm. A later version of this noise suppressor satisfies all the 3GPP (3rd Generation Partnership Project) minimum performance requirements. The latest FOMA® terminal, N2102V produced by NEC, is the world's first 3G handset equipped with this 3GPP-endorsed noise suppressor.

01 Sep 2003
TL;DR: RFC 2658 specifies the streaming format for 3GPP2 13K vocoder (High Rate Speech Service Option 17 for Wideband Spread Spectrum Communications Systems) data, but does not specify a storage format.
Abstract: RFC 2658 specifies the streaming format for 3GPP2 13K vocoder (High Rate Speech Service Option 17 for Wideband Spread Spectrum Communications Systems, also known as QCELP 13K vocoder) data, but does not specify a storage format. Many implementations have been using the "QCP" file format (named for its file extension) for exchanging QCELP 13K data as well as Enhanced Variable Rate Coder (EVRC) and Selectable Mode Vocoder (SMV) data (for example, Eudora(r), QuickTime(r), and cdma2000(r) handsets).

Proceedings Article
01 Jan 2003
TL;DR: An efficient rate selection algorithm that can be used to transcode speech encoded by any code excited linear prediction (CELP)-type codec into a format compatible with selectable mode vocoder via direct parameter transformation is proposed.
Abstract: In this paper, we propose an efficient rate selection algorithm that can be used to transcode speech encoded by any code excited linear prediction (CELP)-type codec into a format compatible with selectable mode vocoder (SMV) via direct parameter transformation. The proposed algorithm performs rate selection using the CELP parameters. Simulation results show that while maintaining similar overall bit-rate compared to the rate selection algorithm of SMV, the proposed algorithm requires less computational load than that of SMV and does not degrade the quality of the transcoded speech.


Proceedings ArticleDOI
K.R. Pankaj1
09 Nov 2003
TL;DR: A direct speech transcoding scheme from the CDMA standard EVRC to the ITU standard G.729ab, which is the de facto workhorse for low bit rate speech coding over VOP (voice over packet) networks and results in considerable savings in computations.
Abstract: With the Internet and wireless networks now widespread, interoperable communication between the two has become increasingly important. A direct speech transcoding scheme is key to efficient and seamless speech transmission between them. This paper presents a direct speech transcoding scheme from the CDMA (code division multiple access) standard EVRC (enhanced variable rate codec) to the ITU (International Telecommunication Union) standard G.729ab. The EVRC was developed by Lucent and adopted as the TIA/IS-127 standard by the TIA (Telecommunications Industry Association). It is the most widely used speech codec in present CDMA mobile systems, and its high quality also makes it a strong candidate for speech coding in 3rd generation mobile systems. The ITU standard G.729ab is the de facto workhorse for low bit rate speech coding over VOP (voice over packet) networks. The motivation behind this transcoding scheme is to transform the EVRC parameters into the G.729ab parameters directly, without fully decoding the EVRC parameters and then re-encoding the resulting synthetic speech with the G.729ab encoder, so that both the delay characteristics and the quality of the speech improve. At the same time, this approach also yields considerable savings in computation.

01 Jan 2003
TL;DR: The results show that the UEP system outperforms the Equal Error Protection (EEP) one by 1.45 dB at a BER of 10^-5.
Abstract: The Enhanced Variable Rate Codec (EVRC) is a standard for Speech Service Option 3 for Wideband Spread Spectrum Digital Systems, which has been employed in both IS-95 cellular systems and ANSI J-STD-008 PCS (Personal Communications Systems). This paper investigates the combination of turbo codes with Unequal Error Protection (UEP) and 16-QAM modulation for the Rate 1 EVRC codec to obtain a power- and bandwidth-efficient coding scheme. The results show that the UEP system outperforms the Equal Error Protection (EEP) one by 1.45 dB at a BER of 10^-5.