Showing papers on "Adaptive Multi-Rate audio codec" published in 1999

Patent•

Speech codec employing speech classification for noise compensation

Jes Thyssen¹, Huan-Yu Su¹, Yang Gao¹, Adil Benyassine¹•Institutions (1)

24 Aug 1999

TL;DR: A method of encoding an input speech signal using a multi-rate encoder having a plurality of encoding rates is disclosed, in which a high-pass filter and then a perceptual weighting filter are applied to the signal to generate a first target signal.

Abstract: A method of encoding an input speech signal using a multi-rate encoder having a plurality of encoding rates is disclosed. A high-pass filter and then a perceptual weighting filter are applied to such signal to generate a first target signal. An adaptive codebook vector is identified from an adaptive codebook using the first target signal by filtering the vector to generate a filtered adaptive codebook vector. An adaptive codebook gain for the adaptive codebook vector is calculated and an error signal minimized. The adaptive codebook gain is adaptively reduced based on one encoding rate from the plurality of encoding rates to generate a reduced adaptive codebook gain. A second target signal based at least on the first target signal and the reduced adaptive codebook gain is generated. The input speech signal is converted into an encoded speech based on the second target signal.
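The gain computation and target update described in this abstract can be illustrated with a short sketch. It is not taken from the patent: the least-squares gain is the usual choice in analysis-by-synthesis coders and is assumed here, and the function names and per-rate reduction factors are hypothetical placeholders.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def adaptive_codebook_gain(target, filtered_vec):
    """Gain g that minimizes the error energy ||target - g * filtered_vec||^2."""
    energy = dot(filtered_vec, filtered_vec)
    return dot(target, filtered_vec) / energy if energy > 0.0 else 0.0


# Hypothetical reduction factors per encoding rate (not values from the patent).
REDUCTION_BY_RATE = {"high_rate": 1.0, "low_rate": 0.9}


def second_target(target, filtered_vec, gain, rate):
    """Subtract the reduced adaptive codebook contribution from the first target."""
    reduced_gain = REDUCTION_BY_RATE[rate] * gain
    residual = [t - reduced_gain * y for t, y in zip(target, filtered_vec)]
    return residual, reduced_gain


first_target = [1.0, 2.0, 3.0]
filtered = [0.5, 1.0, 1.5]
gain = adaptive_codebook_gain(first_target, filtered)
target2, reduced = second_target(first_target, filtered, gain, "low_rate")
```

With a reduction factor below 1, part of the adaptive codebook contribution stays in the second target and is left for the later coding stages to model.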

111 citations

Proceedings Article•DOI•

The adaptive multi-rate speech coder

E. Ekudden¹, R. Hagen, I. Johansson, J. Svedberg•Institutions (1)

Ericsson Radio Systems¹

20 Jun 1999

TL;DR: The adaptive multi-rate (AMR) speech coder, currently under standardization for GSM systems as part of the AMR speech service, is described; it provides seamless switching on 20 ms frame boundaries, and its quality on GSM channels is significantly higher than that of existing services.

Abstract: In this paper, we describe the adaptive multi-rate (AMR) speech coder currently under standardization for GSM systems as part of the AMR speech service. The coder is a multi-rate ACELP coder with 8 modes operating at bit-rates from 12.2 kbit/s down to 4.75 kbit/s. The coder modes are integrated in a common structure where the bit-rate scalability is realized mainly by altering the quantization schemes for the different parameters. The coder provides seamless switching on 20 ms frame boundaries. The quality when used on GSM channels is significantly higher than for existing services.
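As a small worked example of what 20 ms frames mean at these bit-rates, the sketch below converts each mode's rate into speech bits per frame. The abstract gives only the two endpoint rates; the six intermediate rates are those of the finalized narrowband AMR codec and are added here as an assumption about which eight modes are meant.

```python
FRAME_MS = 20  # frame length stated in the abstract

# Endpoints (4.75 and 12.2) come from the abstract; the intermediate
# rates are those of the finalized AMR standard (assumption, see lead-in).
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]


def bits_per_frame(kbps, frame_ms=FRAME_MS):
    """kbit/s multiplied by ms gives bits; round away float noise."""
    return round(kbps * frame_ms)


frame_bits = {rate: bits_per_frame(rate) for rate in AMR_MODES_KBPS}
```

Because every mode shares the same 20 ms framing, a mode switch only changes how many bits the next frame carries, which is consistent with switching on frame boundaries.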

85 citations

Proceedings Article•DOI•

A wideband speech and audio codec at 16/24/32 kbit/s using hybrid ACELP/TCX techniques

B. Bessette¹, R. Salami¹, Claude Laflamme¹, Roch Lefebvre¹•Institutions (1)

Université de Sherbrooke¹

20 Jun 1999

TL;DR: A hybrid ACELP/TCX algorithm for coding speech and music signals at 16, 24, and 32 kbit/s is presented, which switches between algebraic code excited linear prediction (ACELP) and transform coded excitation (TCX) modes on a 20-ms frame basis.

Abstract: A hybrid ACELP/TCX algorithm for coding speech and music signals at 16, 24, and 32 kbit/s is presented. The algorithm switches between algebraic code excited linear prediction (ACELP) and transform coded excitation (TCX) modes on a 20-ms frame basis. Applying TCX on 20 ms frames improved the quality for music signals. Special care was taken to alleviate the switching artifacts between the two modes resulting in a transparent switching process. Subjective test results showed that for speech signals, the performance at 16, 24, and 32 kbit/s, is equivalent to G.722 at 48, 56, and 64 kbit/s, respectively. For music signals, the quality at 24 kbit/s was found equivalent to G.722 at 56 kbit/s. However, at 16 kbit/s, the quality for music was slightly lower than G.722 at 48 kbit/s.

76 citations

Proceedings Article•DOI•

A modular approach to speech enhancement with an application to speech coding

A.J. Accardi¹, R.V. Cox•Institutions (1)

AT&T Labs¹

15 Mar 1999

TL;DR: A generalization of Ephraim and Malah's MMSE-LSA speech enhancement design is suggested; when a modified version of Ephraim and Van Trees's (see IEEE Trans. Speech and Audio Proc., vol.3, p.251-66, 1995) spectral domain constrained signal subspace estimator is used within it, the result is a system with greater flexibility and similar performance.

Abstract: Ephraim and Malah's (1984, 1985) MMSE-LSA speech enhancement algorithm, while robust and effective, is difficult to tune and adjust for the tradeoff between noise reduction and distortion. We suggest a means of generalizing this design, which allows for other estimators besides the MMSE-LSA to be used within the same supporting framework. When a modified version of Ephraim and Van Trees's (see IEEE Trans. Speech and Audio Proc., vol.3, p.251-66, 1995) spectral domain constrained signal subspace estimator is used in this manner, we obtain a system with greater flexibility and similar performance. We also explore the possibility of using different speech enhancement techniques as pre-processors for different parameter extraction modules of the IS-641 speech coder (a 7.4 kbit/s ACELP codec). We show that such a strategy can increase the quality of the coded speech and lead to a system that is more robust to differing noise types.

70 citations

Proceedings Article•DOI•

Concepts and solutions for link adaptation and inband signaling for the GSM AMR speech coding standard

S. Bruhn¹, Peter Blöcher², Karl Hellwig², J. Sjoberg²•Institutions (2)

Ericsson Radio Systems¹, Ericsson²

16 May 1999

TL;DR: Various approaches for link adaptation with respect to varying radio channel conditions are described and the method of inband signaling that is standardized is discussed and motivated.

Abstract: The European Telecommunications Standards Institute (ETSI) has just defined an adaptive multi rate (AMR) speech codec standard for the GSM system with a multitude of source and channel coding rates. The standard aims to provide robust high quality speech together with the flexibility to deliver radio network capacity enhancements by means of low bit-rate operation. The codec rates are dynamically selected with respect to the rapidly changing radio conditions and to local capacity requirements. This paper describes various approaches for link adaptation with respect to varying radio channel conditions and puts a focus on the solution in the AMR standard. Moreover the method of inband signaling that is standardized is discussed and motivated.
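One generic way to select codec rates from a channel quality measure is threshold switching with hysteresis, sketched below. This is an illustration only, not the solution standardized for AMR: the mode set, the quality measure in dB, and all threshold values are hypothetical.

```python
# Hypothetical set of source rates (kbit/s), most robust (lowest rate) first.
MODES_KBPS = [4.75, 5.9, 7.4, 12.2]

# Hypothetical thresholds in dB. UP_DB[i] must be reached to go from mode i to
# mode i + 1; quality must drop below DOWN_DB[i] to fall back from mode i + 1
# to mode i. DOWN_DB[i] < UP_DB[i] gives hysteresis and avoids rapid toggling.
UP_DB = [5.0, 8.0, 12.0]
DOWN_DB = [4.0, 7.0, 11.0]


def next_mode(index, quality_db):
    """Move at most one mode up or down per decision."""
    if index < len(MODES_KBPS) - 1 and quality_db >= UP_DB[index]:
        return index + 1
    if index > 0 and quality_db < DOWN_DB[index - 1]:
        return index - 1
    return index


def track(qualities_db, start=0):
    """Return the mode index chosen after each quality measurement."""
    trace, index = [], start
    for q in qualities_db:
        index = next_mode(index, q)
        trace.append(index)
    return trace


trace = track([6.0, 9.0, 7.5, 6.5, 13.0])
```

In this trace the measurement of 7.5 dB falls between the down and up thresholds, so the mode is held rather than switched.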

59 citations

Journal Article•

MPEG-4 Low Delay Audio Coding Based on the AAC Codec

Eric Allamanche¹, Ralf Geiger¹, Juergen Herre¹, Thomas Sporer¹•Institutions (1)

Fraunhofer Society¹

01 May 1999-Journal of The Audio Engineering Society

TL;DR: An MPEG-2 AAC-derived codec which was optimized for very low delay and accepted as the baseline of development for low-delay coding in MPEG-4 version 2 audio is described.

Abstract: Perceptual audio coding is known to deliver high sound quality even at low bit rates for a broad range of audio signals. However, the total delay of the encoder/decoder chain is usually considerably higher than acceptable for two-way communication applications, such as teleconferencing. This paper discusses the primary sources of algorithmic delay in a perceptual audio codec and describes an MPEG-2 AAC-derived codec which was optimized for very low delay and accepted as the baseline of development for low-delay coding in MPEG-4 version 2 audio.

55 citations

Proceedings Article•DOI•

A 16, 24, 32 kbit/s wideband speech codec based on ATCELP

P. Combescure¹, J. Schnitzler¹, K. Fischer², R. Kircherr, Claude Lamblin³, A. Le Guyader³, D. Massalaux, Catherine Quinquis³, Joachim Stegmann², Peter Vary¹•Institutions (3)

RWTH Aachen University¹, Deutsche Telekom², Orange S.A.³

15 Mar 1999

TL;DR: A combined adaptive transform codec (ATC) and code-excited linear prediction (CELP) algorithm for the compression of wideband (7 kHz) signals is described, together with a switching scheme between CELP and ATC modes and a frame erasure concealment technique.

Abstract: This paper describes a combined adaptive transform codec (ATC) and code-excited linear prediction (CELP) algorithm, called ATCELP, for the compression of wideband (7 kHz) signals. The CELP algorithm applies mainly to speech, whereas the ATC mode is selected for music and noise signals. We propose a switching scheme between CELP and ATC mode and describe a frame erasure concealment technique. Subjective listening tests have shown that the ATCELP codec at bit rates of 16, 24 and 32 kbit/s achieved performances close to those of the CCITT G.722 at 48, 56 and 64 kbit/s, respectively, at most operating conditions.

48 citations

Proceedings Article•DOI•

A multimode transform predictive coder (MTPC) for speech and audio

S.A. Ramprashad¹•Institutions (1)

Bell Labs¹

20 Jun 1999

TL;DR: This multimode transform predictive coder (MTPC) shows improved performance on both speech and audio inputs when compared to a single-mode transform predictive coder (TPC).

Abstract: Speech and audio coding are often considered to be two separate technologies, each almost independently developing different techniques for signal compression. At low bit rates the gap in performance between the two technologies begins to be noticeable; speech coders work better on speech and audio coders perform better on music. The challenge is to merge the two technologies into a single coding paradigm which will work as well as either of the two regardless of the input signal. Presented is a multimode speech and audio coder which can adapt almost continuously between a speech and audio coding mode. This multimode transform predictive coder (MTPC) shows improved performance on both speech and audio inputs when compared to a single-mode transform predictive coder (TPC).

44 citations

Proceedings Article•DOI•

Advances in parametric audio coding

Heiko Purnhagen

17 Oct 1999

TL;DR: A brief tutorial overview of parametric audio coding is given and the parametric coder currently developed in the MPEG-4 audio standardisation is described.

Abstract: Parametric modelling provides an efficient representation of general audio signals and is utilised in very low bit rate audio coding. It is based on the decomposition of an audio signal into components which are described by appropriate source models and represented by model parameters. Perception models are utilised in signal decomposition and model parameter coding. This paper gives a brief tutorial overview of parametric audio coding and describes the parametric coder currently developed in the MPEG-4 audio standardisation. Recent advances as well as novel approaches in this field are presented.

43 citations

Patent•

Apparatus and method for intelligent conference call codec selection

Shmuel Shaffer¹, William J. Beyda¹•Institutions (1)

Siemens¹

19 Aug 1999

TL;DR: A multipoint control unit (MCU) is provided which allows for dynamic codec selection; upon entry of new parties to a teleconference, the endpoints can renegotiate their codec selections if a most commonly available codec is not being used.

Abstract: A multipoint control unit (104) is provided which allows for dynamic codec selection. According to one embodiment, the MCU (104) causes endpoints (102, 106) to renegotiate their codec selections if a most-commonly available codec is not being used, upon entry of new parties to a teleconference. Alternatively, the codec renegotiation may be performed each time a user speaks, to optimize for maximum transmission quality or for minimizing transcoding.
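The renegotiation rule in this abstract can be sketched as a count over the codecs each endpoint supports. This is an illustrative reading, not the patented procedure: the data layout, the function names, the alphabetical tie-break, and the example codec lists are all assumptions.

```python
from collections import Counter


def most_common_codec(endpoint_codecs):
    """Codec supported by the largest number of endpoints.

    endpoint_codecs maps an endpoint name to the codecs it supports.
    Ties are broken alphabetically so the result is deterministic
    (an assumption; the abstract does not say how ties are handled).
    """
    counts = Counter(c for codecs in endpoint_codecs.values() for c in set(codecs))
    if not counts:
        return None
    best = max(counts.values())
    return min(c for c, n in counts.items() if n == best)


def needs_renegotiation(current_codec, endpoint_codecs):
    """True when the conference is not using the most commonly available codec."""
    return current_codec != most_common_codec(endpoint_codecs)


# Hypothetical conference: endpoint C has just joined.
conference = {
    "A": ["G.711", "G.729"],
    "B": ["G.711", "G.723.1"],
    "C": ["G.711"],
}
renegotiate = needs_renegotiation("G.729", conference)
```

Running the check on each join matches the first embodiment; the alternative in the abstract would run it each time a user speaks.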

31 citations

Journal Article•DOI•

Embedded coding using a mixed speech and audio coding paradigm

Sean A. Ramprashad¹•Institutions (1)

Alcatel-Lucent¹

01 May 1999-International Journal of Speech Technology

TL;DR: A two stage hybrid embedded speech/audio coding structure and algorithm is proposed which can be used to enhance the quality of an existing codec without modification of the original coding algorithm.

Abstract: A two stage hybrid embedded speech/audio coding structure and algorithm are proposed. The first stage of the structure consists of a core speech coder which provides a minimum output bit rate and acceptable performance on clean speech inputs. The second stage is a perceptual/transform based coder which provides a separate optional bitstream for the enhancement of the core stage output. The two stage structure can be used to enhance the quality of an existing codec without modification of the original coding algorithm. In this regard it can be considered a value added option that can be used with a standard (existing) system. The structure can also be used in systems in which many users/systems force the coding algorithm to work simultaneously under multiple constraints of bitrate, complexity, delay, and coding quality. Informal testing of the algorithm has been done using ITU-T standard G.723.1 at 5.3 kb/s as a core coder. The maximum combined bitrate from the core and enhancement stages for the tests is 16 kb/s. The tests show that the second stage significantly improves the quality of the core output in the cases of music and speech with background noise. Compared to the non-embedded fixed rate standard LD-CELP G.728 at 16 kb/s, the quality of the two stage structure is generally lower on these inputs; the embedded feature does affect quality. On clean speech the quality of the two stage structure at 16 kb/s is close to if not better than that of G.728 at 16 kb/s.

Proceedings Article•DOI•

Scalable audio coder based on quantizer units of MDCT coefficients

A. Jin, Takehiro Moriya, T. Norimatsu, M. Tsushima, T. Ishikawa

15 Mar 1999

TL;DR: Subjective quality evaluation tests showed that the sound quality of a scalable codec, constructed by using transform coding and 8-kbit/s basic modules for the scalable encoder and decoder, is better than that of an MPEG-2 layer 3 codec at 8, 16, and 24 kbit/s.

Abstract: A scalable codec has been constructed by using transform coding and the basic modules for scalable encoder and decoder. It allows users to choose a variety of scalable configurations in the frequency domain. The basic module is a quantizer that can quantize MDCT (modified DCT) coefficients transformed from a variety of frequency regions. This module mainly works at bit rates of more than 8 kbit/s. We can also change the target frequency regions of the basic module's input-output signals in each transform frame; i.e., we can change the scalable structure according to the nature of the input signals. In the scalable codec described here, the input-output signals are monaural and the sampling frequency is 24 kHz. The total bit rate of this scalable codec is more than 8 kbit/s. Subjective quality evaluation tests, mainly for musical sound sources, showed that its sound quality is better than that of an MPEG-2 layer 3 codec at 8, 16, and 24 kbit/s when our scalable codec is constructed of 8-kbit/s basic modules. In combination with AAC (advanced audio coding), our scalable codec will be chosen as an international standard in ISO/IEC-MPEG-4/Audio.

Proceedings Article•DOI•

An adaptive multi-rate speech coder for digital cellular telephony

Erdal Paksoy¹, J. Carlos de Martin¹, Alan V. McCree¹, C.G. Gerlach¹, A.K. Anandakumar, Wai-Ming Lai¹, Vishu R. Viswanathan¹•Institutions (1)

Texas Instruments¹

15 Mar 1999

TL;DR: An adaptive multi-rate (AMR) speech coder designed to operate under the GSM digital cellular full rate and half rate channels and to maintain high quality in the presence of highly varying background noise and channel conditions is developed.

Abstract: We have developed an adaptive multi-rate (AMR) speech coder designed to operate under the GSM digital cellular full rate (22.8 kb/s) and half rate (11.4 kb/s) channels and to maintain high quality in the presence of highly varying background noise and channel conditions. Within each total rate, several codec modes with different source/channel bit rate allocations are used. The speech coders in each codec mode are based on the CELP algorithm operating at rates ranging from 11.85 kb/s down to 5.15 kb/s, where the lowest rate coder is a source controlled multi-modal speech coder. The decoders monitor the channel quality at both ends of the wireless link using the soft values for the received bits and assist the base station in selecting the codec mode that is appropriate for a given channel condition. The coder was submitted to the GSM AMR standardization competition and met the qualification requirements in an independent formal MOS test.
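The source/channel allocation mentioned in this abstract is simple arithmetic: whatever part of the gross channel rate the speech coder does not use is left for channel coding. The sketch below uses the gross rates and the two extreme source rates quoted in the abstract; which source rate is paired with which channel is an assumption made only for the example.

```python
# Gross channel rates quoted in the abstract (kb/s).
GROSS_KBPS = {"full_rate": 22.8, "half_rate": 11.4}


def channel_coding_kbps(channel, source_kbps):
    """Rate left for channel coding once the source coder has taken its share."""
    gross = GROSS_KBPS[channel]
    if source_kbps > gross:
        raise ValueError("source rate exceeds the gross channel rate")
    return round(gross - source_kbps, 2)


# Hypothetical pairings of source rate and channel.
strong_channel = channel_coding_kbps("full_rate", 11.85)
weak_channel = channel_coding_kbps("half_rate", 5.15)
```

Lower source rates leave more of the gross rate for error protection, which is why the lower-rate codec modes suit poor channel conditions.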

Proceedings Article•DOI•

Avoiding distortions due to speech coding and transmission errors in GSM ASR tasks

Ascensión Gallardo-Antolín¹, Fernando Diaz-de-Maria², Francisco J. Valverde-Albacete²•Institutions (2)

Carlos III Health Institute¹, Charles III University of Madrid²

15 Mar 1999

TL;DR: This work extends previous research on a new approach to automatic speech recognition (ASR) in the GSM environment and concludes that the proposed approach is much more effective in coping with the coding distortion and transmission errors.

Abstract: We have extended our previous research on a new approach to automatic speech recognition (ASR) in the GSM environment. Instead of recognizing from the decoded speech signal, our system works from the digital speech representation used by the GSM encoder. We have compared the performance of a conventional system and the one we propose on a speaker independent, isolated-digit ASR task. For the half and full-rate GSM codecs, from our results, we conclude that the proposed approach is much more effective in coping with the coding distortion and transmission errors. Furthermore, in clean speech conditions, our approach does not impoverish the recognition performance, even recognizing from GSM digital speech, in comparison with a conventional system working on unencoded speech.

Proceedings Article•DOI•

Voice activity detection for GSM adaptive multi-rate codec

A. Vahatalo¹, I. Johansson•Institutions (1)

Nokia¹

20 Jun 1999

TL;DR: The VAD for controlling DTX of the GSM AMR (adaptive multi-rate) speech codec is described, which is based on spectral estimation and periodicity detection and incorporates novel methods to estimate background noise and to detect periodic components based on open-loop pitch gain.

Abstract: This paper describes the VAD (voice activity detection) for controlling DTX (discontinuous transmission) of the GSM AMR (adaptive multi-rate) speech codec. The algorithm is based on spectral estimation and periodicity detection. The VAD contains a 9-band IIR filter bank, which divides input signals into frequency bands. The signal level at each band is calculated. Background noise is estimated in each sub-band. The VAD decision is computed by comparing input signal level and background noise estimate. The algorithm incorporates novel methods to estimate background noise and to detect periodic components based on open-loop pitch gain. A new method is also derived to detect correlated complex signals like music.
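The core decision step, comparing per-band signal levels with per-band noise estimates, can be sketched as follows. This is a simplified illustration, not the algorithm of the paper: the summed squared level-to-noise ratio, the threshold, the smoothing constant, and the function names are assumptions, and the filter bank, pitch detection, and music detection are omitted.

```python
NUM_BANDS = 9  # number of sub-bands stated in the abstract


def vad_decision(levels, noise, threshold):
    """Flag speech when the summed squared level-to-noise ratio exceeds a threshold.

    levels and noise hold one value per sub-band. The combination rule and
    the threshold are assumptions made for this sketch.
    """
    snr_sum = sum((lv / max(nz, 1e-9)) ** 2 for lv, nz in zip(levels, noise))
    return snr_sum > threshold


def update_noise(levels, noise, alpha=0.9):
    """Smooth the noise estimate toward the current levels (use on non-speech frames)."""
    return [alpha * nz + (1.0 - alpha) * lv for lv, nz in zip(levels, noise)]


noise_est = [1.0] * NUM_BANDS
quiet_frame = [1.0] * NUM_BANDS
loud_frame = [3.0] * NUM_BANDS

is_speech_quiet = vad_decision(quiet_frame, noise_est, threshold=18.0)
is_speech_loud = vad_decision(loud_frame, noise_est, threshold=18.0)
noise_est = update_noise(quiet_frame, noise_est)
```

Updating the noise estimate only on frames judged to be non-speech keeps speech energy from leaking into the background estimate.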

Patent•

Method and apparatus for controlling the transition of an audio signal converter between two operative modes based on a certain characteristic of the audio input signal

Chung Cheung C. Chu¹, Rafi Rabipour¹, David G. Sloan²•Institutions (2)

Nortel¹, Apple Inc.²

18 Jun 1999

TL;DR: A method and apparatus are presented for controlling the transition of a bypass capable codec between operative modes, based on a certain characteristic of the audio data signal processed by the codec.

Abstract: The invention relates to a method and apparatus for controlling the transition of a bypass capable codec between operative modes, based on a certain characteristic of the audio data signal processed by the codec. The apparatus relies on a control signal to determine when the codec will switch from one mode to another. This control signal reflects a characteristic of the audio data signal received at the apparatus, such as the type of speech activity or the format of the audio data signal. When in the active (non-bypass) mode, the apparatus relies on an additional control signal to switch to the inactive (bypass) mode. This additional control signal is received from a control unit at a remote codec that indicates that the remote codec is also bypass capable, hence the decoder at the first codec and the encoder at the remote codec can switch to the inactive mode to pass between them the compressed data frames.

Patent•

Method and radio communication system for transmitting speech information using a broadband or a narrowband speech coding method depending on transmission possibilities

Oestreich Stefan¹•Institutions (1)

Siemens¹

05 Feb 1999

TL;DR: A speech coder/decoder can select between a broadband and a narrowband speech coding method; for a connection to a mobile station, transmission possibilities are monitored and, when they are limited, there is a changeover from the broadband to the narrowband speech coding method.

Abstract: A speech coder/decoder can select a broadband and a narrowband speech coding method. For a connection to a mobile station, a monitoring of transmission possibilities is performed, and, given limited transmission possibilities, there is a changeover from broadband to narrowband speech coding methods. The received narrowband speech information is expanded to a greater bandwidth at the receive side. The subjective speech impression is improved by the bridging of this changeover effect. This guarantees an improved speech quality to the listener, particularly with the introduction of adaptive multirate coding.

Patent•

Data CODEC system for computer

Kang Gyeong Ok, Jang Dae Yeong, Kwak Jin Seok, Hong Jin U, Kim Seong Han, Young Kwon Lim, Jin Woong Kim, Harald Popp, Stefan Geyersberger, Wolfgang Fiesel

08 Dec 1999

TL;DR: A data CODEC system for a computer is described, comprising system control software, a multichannel audio/speech and multimedia data signal processor, and a multichannel audio/speech and multimedia data input-output unit.

Abstract: The present invention relates to a data CODEC system for computer. The data CODEC system for computer comprises a system control software, a multichannel audio/speech and multimedia data signal processor, and a multichannel audio/speech and multimedia data input-output unit. The system control software communicates multichannel audio/speech and multimedia data with the multichannel audio/speech and multimedia data signal processor according to control of various application programs. The multichannel audio/speech and multimedia data signal processor processes multichannel audio/speech and multimedia data. The multichannel audio/speech and multimedia data input-output means inputs/outputs multichannel audio/speech and multimedia data from/to an external system.

Proceedings Article•DOI•

Multimode variable bit rate speech coding: an efficient paradigm for high-quality low-rate representation of speech signal

Amitava Das¹, Andrew P. Dejaco¹, Sharath Manjunath¹, Arasanipali K. Ananthapadmanabhan¹, Jing Huang¹, Eddi-Lun Tik Choy¹•Institutions (1)

Qualcomm¹

15 Mar 1999

TL;DR: This paper presents the essential framework and the unique advantages of a multimode VBR codec and suggests algorithms for the different modes.

Abstract: The speech signal consists of a time-varying ensemble of different types of segments with distinct characteristics, which require different degrees of coding resolution in order to retain an overall high voice quality. A fixed-rate coder can capture such time-varying characteristics only if it operates at a high enough bit rate. At a low bit rate, a fixed-rate coder will not be able to capture all of these various segments well and will fail to render high voice quality. A multimode variable bit rate (VBR) coder uses an arsenal of modes, operating at different bit rates. These modes are designed to represent these different speech segments optimally with the right amount of coding resolution. Thus, a multimode VBR codec adapts the coding mechanism to the input speech and delivers high quality at low (average) rates. This paper presents the essential framework and the unique advantages of a multimode VBR codec and suggests algorithms for the different modes.
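The "low (average) rate" of a multimode VBR coder is the usage-weighted mean of its mode rates, as the sketch below shows. The mode names, rates, and usage shares are hypothetical numbers chosen for illustration; the abstract does not give any.

```python
def average_rate_kbps(modes):
    """Usage-weighted mean bit rate.

    modes is a list of (rate_kbps, share) pairs, where share is the
    fraction of frames coded in that mode; shares must sum to 1.
    """
    total_share = sum(share for _, share in modes)
    if abs(total_share - 1.0) > 1e-9:
        raise ValueError("mode shares must sum to 1")
    return sum(rate * share for rate, share in modes)


# Hypothetical mode mix: (rate in kbit/s, fraction of frames).
example_modes = [
    (8.0, 0.4),  # e.g. voiced or transient segments
    (4.0, 0.2),  # e.g. unvoiced segments
    (1.0, 0.4),  # e.g. silence or background noise
]
avg = average_rate_kbps(example_modes)
```

In this made-up mix the peak rate is 8 kbit/s, but the average is well below it because the low-rate modes cover a large share of the frames.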

Generalized Audio Coding with MPEG-4 Structured Audio

Eric D. Scheirer¹, Youngmoo E. Kim•Institutions (1)

Massachusetts Institute of Technology¹

01 Sep 1999

TL;DR: It is proved that the MPEG-4 Structured Audio tool can be used to mimic the behavior of any other kind of decoder and that structured-audio coding is a universally minimal coding technique.

Abstract: The MPEG-4 Structured Audio standard was created to enable high-quality, very-low-bitrate transmission of synthetic sound. However, structured-audio techniques also are suitable for flexible natural audio coding. This paper introduces the concept of generalized audio coding, in which the Structured Audio decoder is used to emulate the behavior of other audio decoders. We prove that the MPEG-4 Structured Audio tool can be used to mimic the behavior of any other kind of decoder and that structured-audio coding is a universally minimal coding technique. We provide examples of simple natural audio coders that use the SA toolset, and characterize the overhead that arises in the transcoding process. Generalized audio coding removes marketplace barriers to the use of special-purpose or signal-adaptive coding formats, and thus promotes greater overall efficiency in the world of audio coding.

Proceedings Article•DOI•

Wideband speech coding using forward/backward adaptive prediction with mixed time/frequency domain excitation

J. Schnitzler, J. Eggers, Christoph Erdmann, Peter Vary

20 Jun 1999

TL;DR: A wideband (7 kHz) speech coding scheme using code-excited linear prediction (CELP) with mixed time and frequency domain excitation and an improved synthesis filter is described.

Abstract: This paper describes a wideband (7 kHz) speech coding scheme using code-excited linear prediction (CELP) with mixed time and frequency domain excitation. The proposed frequency domain innovation can be used alternatively or in parallel to a time domain codebook. In addition an improved synthesis filter is used consisting of a signal dependent combination of a forward adaptive and a backward adaptive (FA/BA) structure. An experimental codec operating at 15.5 or 20.0 kbit/s is demonstrated.

Filter Banks in Perceptual Audio Coding

Marina Bosi

01 Sep 1999

Proceedings Article•DOI•

Joint speech codec parameter and channel decoding of parameter individual block codes (PIBC)

Tim Fingscheidt¹, S. Heinen, Peter Vary•Institutions (1)

AT&T Labs¹

20 Jun 1999

TL;DR: This work proposes the usage of what it calls parameter individual block codes (PIBC) for the most important codec parameters, which allows joint speech codec parameter and PIBC decoding taking advantage of the error concealing properties of soft-bit speech decoding.

Abstract: In digital mobile speech transmission usually the most important (class Ia) bits provided by the speech coding scheme are protected by a CRC for error detection. As a consequence all parameters spanned by the class Ia bits have to be marked at the receiver either as reliable or as unreliable. In contrast to this somewhat coarse approach we propose the usage of what we call parameter individual block codes (PIBC) for the most important codec parameters. This allows joint speech codec parameter and PIBC decoding taking advantage of the error concealing properties of soft-bit speech decoding.

MPEG-4 Speech Coding

Masayuki Nishiguchi

01 Sep 1999

Journal Article•DOI•

A modified CS-ACELP algorithm for variable-rate speech coding robust in noisy environments

F. Beritelli

01 Feb 1999-IEEE Signal Processing Letters

TL;DR: Comparisons with the recent ITU-T G.729 8 kbit/s standard, used in the discontinuous transmission mode, demonstrate that the proposed coder provides an average bit rate reduction of about 20% maintaining the same algorithmic delay and perceptive quality.

Abstract: This letter deals with a variable bit-rate CS-ACELP speech coder based on new algorithms that are robust in the presence of the background noise typical of wireless communications. The coder presents eight operating modes ranging from 0-8 kbit/s with an average bit-rate of about 4 kbit/s. Subjective and objective comparisons with the recent ITU-T G.729 8 kbit/s standard, used in the discontinuous transmission mode, demonstrate that the proposed coder provides an average bit rate reduction of about 20% maintaining the same algorithmic delay and perceptive quality.

Proceedings Article•DOI•

A 6.1 to 13.3-kb/s variable rate CELP codec (VR-CELP) for AMR speech coding

S. Heinen¹, M. Adratm, O. Steil, Peter Vary, Wen Xu•Institutions (1)

RWTH Aachen University¹

15 Mar 1999

TL;DR: A new 6.1 to 13.3-kb/s speech codec is proposed called variable rate code-excited linear prediction (VR-CELP) for adaptive multi-rate (AMR) transmission over mobile radio channels such as GSM or UMTS to enhance the transmission quality under very poor channel conditions.

Abstract: We propose a new 6.1 to 13.3-kb/s speech codec called variable rate code-excited linear prediction (VR-CELP) for adaptive multi-rate (AMR) transmission over mobile radio channels such as GSM or UMTS. The AMR concept allows to operate with almost wireline speech quality for poor channel conditions and better quality for good channel conditions. This is achieved by dynamically splitting the gross bit rate of the transmission system between source and channel coding according to the current channel conditions. Thus the source coding scheme must be designed for seamless switching between rates without annoying artifacts. To enhance the transmission quality under very poor channel conditions, a new powerful error concealment strategy based on estimation theory is applied.

Proceedings Article•DOI•

A 4 kb/s toll quality harmonic excitation linear predictive speech coder

S. Yeldener

15 Mar 1999

TL;DR: The HE-LPC coder has the potential of producing high quality speech at 4.8 kb/s and below and employs a new pitch estimation and voicing technique, and new DCT based LPC and residual amplitude quantization techniques have been developed.

Abstract: The harmonic excitation linear predictive speech coder (HE-LPC) is a technique derived from MBE and MB-LPC type of speech coding algorithms. The HE-LPC coder has the potential of producing high quality speech at 4.8 kb/s and below. This coder employs a new pitch estimation and voicing technique. In addition, new DCT based LPC and residual amplitude quantization techniques have been developed. The 4 kb/s HE-LPC coder with a 14th order LPC filter was found to produce much better speech quality than the various low rate speech coding standards, including 3.6 kb/s INMARSAT Mini-M AMBE vocoder. During formal ITU ACR test, the 4 kb/s HE-LPC vocoder was found to produce equivalent performance to 32 kb/s ADPCM and G.729 for both flat and modified IRS filtered clean input speech conditions. The HE-LPC algorithm can also be extended to cover bit rates between 1.2 and 8 kb/s range depending on the application.

Journal Article•DOI•

Speech coding in MPEG-4

Bernd Edler¹•Institutions (1)

Leibniz University of Hanover¹

01 May 1999-International Journal of Speech Technology

TL;DR: This paper gives a brief overview of the complete audio part of the MPEG-4 standard and more detailed information on its parts related to speech coding.

Abstract: While previous MPEG Audio standards were mainly focused on the representation of audio signals close to or equal to CD quality, the new MPEG-4 Audio standard extends the range of applicability towards significantly lower bit rates. Furthermore, it offers extended functionalities for the representation of natural and even synthetic audio signals in an object-oriented fashion. This paper gives a brief overview of the complete audio part of the MPEG-4 standard and more detailed information on its parts related to speech coding.

Journal Article•DOI•

Real-time software video codec with a fast adaptive motion vector search

T. Moriyoshi¹, H. Shinohara, Miyazaki Takashi¹, I. Kuroda¹•Institutions (1)

NEC¹

20 Oct 1999

TL;DR: A PC-based real-time software MPEG-4 video codec with a fast adaptive motion vector search is presented; the technique suppresses load fluctuation in the motion estimation (ME) and contributes to the stable real-time work of the software codec.

Abstract: A PC-based real-time software MPEG-4 video codec with a fast adaptive motion vector search is presented. In a fast adaptive motion estimation (ME) technique, the search order is dynamically changed in accordance with the motion of objects. This technique suppresses load fluctuation in the ME and contributes to the stable real-time work of the codec. MMX instructions are used to increase the codec speed. On a portable PC, the software video codec supports satisfactory mobile visual communication at 64 kbps and 128 kbps, for example, at QCIF 15 fps. The codec on a 450 MHz Pentium II processor can encode and decode 30 CIF frames in real-time.

Proceedings Article•DOI•

A MPEG4 programmable codec DSP with an embedded pre/post-processing engine

S. Kurohmaru, M. Matsuo, H. Nakajima, Y. Kohashi, T. Yonezawa, T. Mori-iwa, M. Ohashi, M. Toujima, T. Nakamura, M. Hamada, T. Hashimoto, H. Fujimoto, Y. Iizuka, J. Michiyama, H. Komori

16 May 1999

TL;DR: This programmable DSP processes MPEG4, H.263, H.261 and wavelet based sub-band codec algorithms in real-time and has excellent flexibility, so that it can, for instance, perform video codec at 15 CIF frames/sec or video/speech (G.723.1) codec at 30 QCIF frames/sec; it needs only one 16 Mbit SDRAM as an external memory, making it possible to realize low-cost systems.

Abstract: We have developed a programmable DSP for MPEG4, H.263, H.261 and wavelet based sub-band codec algorithms. This DSP has the capability of processing these algorithms in real-time and has excellent flexibility, so that it can, for instance, perform video codec at 15 CIF frames/sec or video/speech (G.723.1) codec at 30 QCIF frames/sec. This chip includes a video pre/post-processing engine and needs only one 16 Mbit SDRAM as an external memory to perform the above algorithms, making it possible to realize low-cost systems. This chip is fabricated using 0.25 µm CMOS technology and contains 7.7 M transistors on 9.41 mm × 9.22 mm die.
