Showing papers on "Adaptive Multi-Rate audio codec" published in 1998

Patent

Source coding enhancement using spectral-band replication

Liljeryd Lars Gustaf, Ekstrand Per Rune Albin, Henn Lars Fredrik, Hans Magnus Kristofer Kjorling

09 Jun 1998

TL;DR: This patent proposes a method and apparatus for enhancing source coding systems that employs bandwidth reduction (101) prior to or in the encoder, followed by spectral-band replication (105) at the decoder.

Abstract: The present invention proposes a new method and apparatus for the enhancement of source coding systems. The invention employs bandwidth reduction (101) prior to or in the encoder (103), followed by spectral-band replication (105) at the decoder (107). This is accomplished by the use of new transposition methods, in combination with spectral envelope adjustments. Reduced bitrate at a given perceptual quality or an improved perceptual quality at a given bitrate is offered. The invention is preferably integrated in a hardware or software codec, but can also be implemented as a separate processor in combination with a codec. The invention offers substantial improvements practically independent of codec type and technological progress.

488 citations

Journal Article

Lossy source coding

Toby Berger¹, Jerry D. Gibson¹

Cornell University¹

01 Oct 1998-IEEE Transactions on Information Theory

TL;DR: This work chronicles the development of rate-distortion theory and provides an overview of its influence on the practice of lossy source coding.

Abstract: Lossy coding of speech, high-quality audio, still images, and video is commonplace today. However, in 1948, few lossy compression systems were in service. Shannon introduced and developed the theory of source coding with a fidelity criterion, also called rate-distortion theory. For the first 25 years of its existence, rate-distortion theory had relatively little impact on the methods and systems actually used to compress real sources. Today, however, rate-distortion theoretic concepts are an important component of many lossy compression techniques and standards. We chronicle the development of rate-distortion theory and provide an overview of its influence on the practice of lossy source coding.

213 citations

Patent

Audio coding systems and methods

Anthony John Robinson¹, Carl William Seymour¹, Roger Cecil Ferry Tucker¹

Hewlett-Packard¹

15 May 1998-Journal of the Acoustical Society of America

TL;DR: An audio signal is decomposed into lower and upper sub-bands and at least the noise component of the upper sub-band is encoded; the decoder synthesises the signal using a synthesised noise excitation signal and a filter to reproduce the noise component in the upper sub-band.

Abstract: An audio signal is decomposed into lower and upper sub-bands and at least the noise component of the upper sub-band is encoded. At the decoder the audio signal is synthesised by a decoding means which utilises a synthesised noise excitation signal and a filter to reproduce the noise component in the upper sub-band.

160 citations

Proceedings Article

A two stage hybrid embedded speech/audio coding structure

S.A. Ramprashad¹

Bell Labs¹

12 May 1998

TL;DR: A two stage hybrid embedded speech/audio coding structure is proposed in which a speech coder serves as the core, providing the minimal bitrate and acceptable performance on speech inputs, and a second-stage transform coder uses a modified discrete cosine transform and perceptual coding principles.

Abstract: A two stage hybrid embedded speech/audio coding structure is proposed. The structure uses a speech coder as a core to provide the minimal bitrate and an acceptable performance on speech inputs. The second stage is a transform coder using a modified discrete cosine transform (MDCT) and perceptual coding principles. This stage is itself embedded both in complexity and bitrate, and provides various levels of enhancement of the core output, particularly for general audio signals like music. Informal A-B comparison tests show that the performance of the structure at 16 kb/s is between that of the GSM enhanced full rate coder at 12.2 kb/s, and the G.728 LD-CELP coder at 16 kb/s.

69 citations

Patent

Soft-clipping postprocessor scaling decoded audio signal frame saturation regions to approximate original waveform shape and maintain continuity

Shuwu Wu, John Mantegna

13 Oct 1998-Journal of the Acoustical Society of America

TL;DR: An audio coder/decoder suited to real-time applications thanks to reduced computational complexity, together with a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization; the codec provides low bit-rate compression for music and speech while remaining applicable to higher bit-rate audio compression.

Abstract: An audio coder/decoder ("codec") that is suitable for real-time applications due to reduced computational complexity, and a novel adaptive sparse vector quantization (ASVQ) scheme and algorithms for general purpose data quantization. The codec provides low bit-rate compression for music and speech, while being applicable to higher bit-rate audio compression. The codec includes an in-path implementation of psychoacoustic spectral masking, and frequency domain quantization using the novel ASVQ scheme and algorithms specific to audio compression. More particularly, the inventive audio codec employs frequency domain quantization with critically sampled subband filter banks to maintain time domain continuity across frame boundaries. The input audio signal is transformed into the frequency domain in which in-path spectral masking can be directly applied. This in-path spectral masking usually results in sparse vectors. The ASVQ scheme is a vector quantization algorithm that is particularly effective for quantizing sparse signal vectors. In the preferred embodiment, ASVQ adaptively classifies signal vectors into six different types of sparse vector quantization, and performs quantization accordingly. The ASVQ technique applies to general purpose data quantization as well as to quantization in the context of audio compression. The invention also includes a "soft clipping" algorithm in the decoder as a post-processing stage. The soft clipping algorithm preserves the waveform shapes of the reconstructed time domain audio signal in a frame- or block-oriented stateless manner while maintaining continuity across frame or block boundaries. The invention includes related methods, apparatus, and computer programs.

68 citations

Patent

Audio codec reselection for increased port density

James L. Fenton¹

Cisco Systems, Inc.¹

25 Sep 1998

TL;DR: An Internet telephony gateway and a method for operating it are disclosed; the gateway is designed with a port to support a predefined maximum number of audio data channels and contains sufficient processing throughput to operate a first, high quality audio codec on a subset of the channels.

Abstract: An Internet telephony gateway and method for operating a gateway are disclosed. The gateway is designed with a port to support a predefined maximum number of audio data channels. The gateway contains sufficient processing throughput to operate a first, high quality audio codec on a subset of the channels. However, this throughput is sufficient to operate a second, lower quality audio codec on a greater number of the channels, preferably all of them. The first and second codecs are designed to produce compressed audio data streams that are interoperably decompressable. In operation, the gateway host processor assigns new calls to either the first or second codec, depending on the current traffic being handled by the gateway. If new calls would result in the gateway's processing throughput being exceeded, the host processor may reassign a channel from the first codec to the second codec in order to create processing headroom for the addition of a new channel. Because the codecs are interoperably decompressable, no renegotiation need occur with the far end of the communication channel when a codec is reassigned. This gateway offers the potential for high-quality communication over the maximum number of channels possible, with a natural degradation as the gateway reaches its full channel capacity, using modest processing resources.
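
The call-admission behaviour described in this abstract can be sketched roughly as follows. This is an illustrative model only, not the patented implementation: the class name `Gateway`, the integer cost units, and the downgrade order (oldest high quality channel first) are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Gateway:
    capacity: int        # total processing throughput (arbitrary units, assumed)
    hq_cost: int         # per-channel cost of the first, high quality codec
    lq_cost: int         # per-channel cost of the second, lower quality codec
    max_channels: int    # predefined maximum number of audio channels
    channels: dict = field(default_factory=dict)  # call id -> "hq" or "lq"

    def load(self):
        """Processing throughput currently consumed by all active channels."""
        return sum(self.hq_cost if codec == "hq" else self.lq_cost
                   for codec in self.channels.values())

    def add_call(self, call_id):
        """Admit a call; return the codec assigned, or None if rejected."""
        if len(self.channels) >= self.max_channels:
            return None
        if self.load() + self.hq_cost <= self.capacity:
            self.channels[call_id] = "hq"
            return "hq"
        # Not enough headroom: reassign existing channels from the first
        # codec to the second until the new call fits with the second codec.
        for other, codec in list(self.channels.items()):
            if self.load() + self.lq_cost <= self.capacity:
                break
            if codec == "hq":
                self.channels[other] = "lq"
        if self.load() + self.lq_cost <= self.capacity:
            self.channels[call_id] = "lq"
            return "lq"
        return None


gw = Gateway(capacity=4, hq_cost=2, lq_cost=1, max_channels=4)
print([gw.add_call(name) for name in "abcde"])
```

Because the abstract states that the two codecs are interoperably decompressable, the sketch changes an existing channel's codec without modelling any renegotiation with the far end.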

62 citations

Proceedings Article

Speech recognition from GSM codec parameters.

Juan M. Huerta, Richard M. Stern

01 Jan 1998

TL;DR: It is observed that by selectively combining the cepstral streams representing the LPC parameters and the residual signal it is possible to obtain recognition accuracy directly from the coded parameters that equals or exceeds the recognition accuracy obtained from the reconstructed waveforms.

Abstract: Speech coding affects speech recognition performance, with recognition accuracy deteriorating as the coded bit rate decreases. Virtually all systems that recognize coded speech reconstruct the speech waveform from the coded parameters, and then perform recognition (after possible noise and/or channel compensation) using conventional techniques. In this paper we compare the recognition accuracy of coded speech obtained by reconstructing the speech waveform with the speech recognition accuracy obtained when using cepstral features derived from the coding parameters. We focus our efforts on speech that has been coded using the 13-kbps full-rate GSM codec, a Regular Pulse Excited Long Term Prediction (RPE-LTP) codec. The GSM codec develops separate representations for the linear prediction (LPC) filter and the residual signal components of the coded speech. We measure the effects of quantization and coding on the accuracy with which these parameters are represented, and present two different methods for recombining them for speech recognition purposes. We observe that by selectively combining the cepstral streams representing the LPC parameters and the residual signal it is possible to obtain recognition accuracy directly from the coded parameters that equals or exceeds the recognition accuracy obtained from the reconstructed waveforms.
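
Deriving cepstral features from linear prediction parameters, as this abstract describes, is commonly done with the textbook LPC-to-cepstrum recursion. The sketch below shows that recursion only; it is not necessarily the authors' feature pipeline. The function name `lpc_to_cepstrum` and the sign convention A(z) = 1 - sum(a_k z^-k) are assumptions.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Cepstral coefficients c_1..c_n of the all-pole model 1/A(z).

    `a` holds the predictor coefficients a_1..a_p, with
    A(z) = 1 - sum(a_k * z**-k) (assumed sign convention).
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term exists only while n is within the predictor order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k/n) * c_k * a_(n-k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c


# For a first-order predictor a_1 = 0.5 the cepstrum is a_1**n / n.
print(lpc_to_cepstrum([0.5], 3))
```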

56 citations

Proceedings Article

Adaptive multi-rate. A speech service adapted to cellular radio network quality

A. Uvliden¹, S. Bruhn, R. Hagen

Ericsson¹

01 Nov 1998

TL;DR: This work reviews the general AMR system concept, discusses the capacity and quality benefits that can be achieved, and describes an example solution for GSM including speech coding, channel coding, inband signaling, and the adaptation scheme.

Abstract: Adaptive multi-rate (AMR) is an emerging speech service currently being standardized in the ETSI for the GSM system. The new AMR standard will be flexible by adapting the error protection level and the allocated radio resources. A trade-off between speech quality and system capacity can be achieved for a variety of radio channel and operating conditions. The adaptation of the protection level will be fast and speech service specific. Besides the basic source and channel codec for speech signal payload, the AMR system concept further includes channel state tracking and inband transmission of adaptation data. We review the general AMR system concept and discuss the capacity and quality benefits that can be achieved. An example solution for GSM is described including speech coding, channel coding, inband signaling, and the adaptation scheme.
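
The adaptation idea in this abstract, switching codec mode as the tracked channel state changes, can be sketched as a threshold rule with hysteresis. Everything numeric here is hypothetical: the source rates and C/I thresholds in `MODES`, and the 1 dB hysteresis, are illustrative values and are not taken from the paper or from the AMR standard.

```python
# (source rate in bit/s, minimum C/I in dB), ordered from most robust
# to least robust. All values are hypothetical, for illustration only.
MODES = [
    (5_000, float("-inf")),
    (8_000, 7.0),
    (12_000, 13.0),
]


def next_mode(current, ci_db, hysteresis_db=1.0):
    """Return the index of the mode to use for the next frame."""
    idx = current
    # Step down immediately while the channel is too poor for this mode.
    while idx > 0 and ci_db < MODES[idx][1]:
        idx -= 1
    # Step up only once the channel clears the next threshold by a margin,
    # so the mode does not flip back and forth around a threshold.
    while idx + 1 < len(MODES) and ci_db >= MODES[idx + 1][1] + hysteresis_db:
        idx += 1
    return idx


mode = 2
trace = []
for ci in [15.0, 12.0, 13.5, 14.5, 5.0]:
    mode = next_mode(mode, ci)
    trace.append(mode)
print(trace)
```

A lower source rate leaves more of the fixed channel rate for error protection, which is the trade-off the abstract refers to.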

33 citations

Proceedings Article

A 13.0 kbit/s wideband speech codec based on SB-ACELP

J. Schnitzler

12 May 1998

TL;DR: This paper describes a wideband (7 kHz) speech compression scheme operating at a bit rate of 13.0 kbit/s, i.e. 0.8 bit per sample, using a split-band technique, where the 0-6 kHz band is critically subsampled and coded by an ACELP approach.

Abstract: This paper describes a wideband (7 kHz) speech compression scheme operating at a bit rate of 13.0 kbit/s, i.e. 0.8 bit per sample. We apply a split-band (SB) technique, where the 0-6 kHz band is critically subsampled and coded by an ACELP approach. The high frequency signal components (6-7 kHz) are generated by an improved high-frequency-resynthesis (HFR) at the decoder such that no additional information has to be transmitted. In informal listening tests, the subjective speech quality was rated to be comparable to the CCITT G.722 wideband codec at 48 kbit/s.
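
The "0.8 bit per sample" figure in this abstract follows from dividing the bit rate by the sampling rate. The short check below assumes a 16 kHz sampling rate, which is usual for 7 kHz wideband speech but is not stated in the abstract; the helper name `bits_per_sample` is hypothetical.

```python
def bits_per_sample(bit_rate_bps, sample_rate_hz):
    """Average number of coded bits spent on each input sample."""
    return bit_rate_bps / sample_rate_hz


SAMPLE_RATE_HZ = 16_000  # assumption: typical rate for 7 kHz wideband speech

sb_acelp = bits_per_sample(13_000, SAMPLE_RATE_HZ)  # codec in the abstract
g722 = bits_per_sample(48_000, SAMPLE_RATE_HZ)      # G.722 at 48 kbit/s

print(sb_acelp, g722)
```

Under that assumption the proposed codec spends about 0.81 bit per sample, which the abstract rounds to 0.8, against 3 bits per sample for the G.722 reference.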

29 citations

Journal Article

Concepts for Hybrid Audio Coding Schemes Based on Parametric Techniques

Bernd Edler, Heiko Purnhagen

01 Sep 1998-Journal of The Audio Engineering Society

28 citations

Patent

System and method for providing full-duplex audio communication using a half-duplex audio circuit

Prakash Iyer¹, Gunner Danneels¹, Lance Carroll¹, Eric Davison¹

Intel¹

05 Jan 1998-Journal of the Acoustical Society of America

TL;DR: A method for providing full-duplex audio communication utilizing a half-duplex audio circuit in an audio communication system, which comprises configuring an idle state, a listen state, and a talk state and transitioning between them in response to events.

Abstract: The present invention discloses a method for providing full-duplex audio communication utilizing a half-duplex audio circuit in an audio communication system. The method comprises the steps of: (1) configuring an idle state, a listen state, and a talk state; (2) receiving an event triggered by one of an incoming speech, an outgoing speech, and a talk request from the half-duplex audio circuit; and (3) transitioning from one of the states to any one of the states in response to the event to provide full duplex communication.
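
The three states and three events named in this abstract map naturally onto a small state machine. The sketch below is a minimal illustration: the state and event names follow the abstract, but the contents of the `TRANSITIONS` table are hypothetical, since the abstract does not say which event leads to which state.

```python
from enum import Enum, auto


class State(Enum):
    IDLE = auto()
    LISTEN = auto()
    TALK = auto()


class Event(Enum):
    INCOMING_SPEECH = auto()
    OUTGOING_SPEECH = auto()
    TALK_REQUEST = auto()


# Hypothetical transition table; the patent abstract only says that any
# state may follow any state in response to an event.
TRANSITIONS = {
    (State.IDLE, Event.INCOMING_SPEECH): State.LISTEN,
    (State.IDLE, Event.OUTGOING_SPEECH): State.TALK,
    (State.IDLE, Event.TALK_REQUEST): State.TALK,
    (State.LISTEN, Event.INCOMING_SPEECH): State.LISTEN,
    (State.LISTEN, Event.OUTGOING_SPEECH): State.LISTEN,
    (State.LISTEN, Event.TALK_REQUEST): State.TALK,
    (State.TALK, Event.INCOMING_SPEECH): State.LISTEN,
    (State.TALK, Event.OUTGOING_SPEECH): State.TALK,
    (State.TALK, Event.TALK_REQUEST): State.TALK,
}


def run(events, state=State.IDLE):
    """Apply events in order and return the sequence of states visited."""
    history = [state]
    for event in events:
        state = TRANSITIONS[(state, event)]
        history.append(state)
    return history


for s in run([Event.INCOMING_SPEECH, Event.TALK_REQUEST]):
    print(s.name)
```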

Proceedings Article

An adaptive multi-rate speech codec based on MP-CELP coding algorithm for ETSI AMR standard

Hironori Ito¹, M. Serizawa, K. Ozawa, Toshiyuki Nomura

NEC¹

12 May 1998

TL;DR: T-tests show that the proposed speech codec based on the multi-pulse based CELP coding and convolutional coding algorithms for the ETSI adaptive multi-rate (AMR) standard meets about 80% of the seventeen requirements, which are selected from the AMR standard study report.

Abstract: This paper proposes a speech codec based on the multi-pulse based CELP (MP-CELP) coding and convolutional coding algorithms for the ETSI adaptive multi-rate (AMR) standard. The codec operates at several speech coding rates, maintaining a fixed gross rate including speech and channel coding for the full-rate (FR) and half-rate (HR) channel modes. MP-CELP has great features of easily changing the speech coding rate by controlling the parameters such as the number of pulses and other parameters. Subjective tests show that the proposed AMR codec in the FR channel mode achieves higher performance than that of the enhanced FR codec, and the proposed codec in the HR channel mode gives a comparable coding quality to that by the full-rate codec, by selecting an optimal coding rate for each channel condition. T-tests based on the test results also show that the proposed speech codec meets about 80% of the seventeen requirements, which are selected from the AMR standard study report. Therefore, the proposed codec is promising for the AMR standard.

Proceedings Article

GSM EFR based multi-rate codec family

Janne Vainio¹, H. Mikkola, Kari Jarvinen, Petri Haavisto

Nokia¹

12 May 1998

TL;DR: A multi-rate codec family developed as a potential candidate for the GSM adaptive multi-rate (AMR) codec standard, which consists of the GSM enhanced full rate (EFR) codec and lower bit-rate extensions thereof.

Abstract: This paper describes a multi-rate codec family developed as a potential candidate for the GSM adaptive multi-rate (AMR) codec standard. The codec family consists of the GSM enhanced full rate (EFR) codec and lower bit-rate extensions thereof. The codec family consists of several codecs, i.e., modes that have different bit-rate partitionings between source coding and error protection. All the source codecs use the same ACELP-method (algebraic code excited linear predictive coding) used also in the GSM EFR codec. The codec operates at gross bit-rates of 22.8 kbit/s in the GSM full rate (FR) channel and 11.4 kbit/s in the GSM half rate (HR) channel. In the full rate channel, the codec provides improved error robustness over the GSM enhanced full rate (EFR) codec. It extends wireline quality (equal to or better than G.726-32 ADPCM) to poor channel error conditions with low C/I-ratios of 7 dB or even below. When operated in the half rate channel, the codec provides improved channel capacity while still providing wireline quality at high C/I-ratios above 16-19 dB.
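
The "bit-rate partitioning between source coding and error protection" in this abstract is simple arithmetic on the fixed gross channel rate. In the sketch below the gross rates come from the abstract and the 12.2 kbit/s EFR source rate is the figure quoted elsewhere on this page; the lower-rate entries in `EXAMPLE_MODES` and the helper name `protection_rate` are hypothetical.

```python
# Gross channel rates in bit/s, as given in the abstract.
GROSS_RATE_BPS = {"FR": 22_800, "HR": 11_400}


def protection_rate(channel, source_rate_bps):
    """Bits per second left for error protection in the given channel."""
    gross = GROSS_RATE_BPS[channel]
    if not 0 < source_rate_bps <= gross:
        raise ValueError("source rate must fit inside the gross channel rate")
    return gross - source_rate_bps


# First entry: EFR source rate in the full rate channel.
# Remaining entries: hypothetical lower-rate modes, for illustration only.
EXAMPLE_MODES = [("FR", 12_200), ("FR", 8_000), ("HR", 6_000)]

for channel, source in EXAMPLE_MODES:
    print(channel, source, protection_rate(channel, source))
```

Lowering the source rate within the same channel frees bits for protection, which is how a mode can stay usable at the low C/I ratios the abstract mentions.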

Patent

System and method for providing an enhanced audio quality telecommunication session

P. Michael Henderson¹, James W. Johnston¹

Conexant¹

21 May 1998

TL;DR: An improved telecommunication system supports an enhanced audio transmission mode, governed by an audio coding protocol, and a conventional PCM waveform encoding mode, governed by a PCM protocol such as μ-law encoding.

Abstract: An improved telecommunication system is capable of supporting an enhanced audio transmission mode and a conventional PCM waveform encoding mode. The enhanced audio transmission mode is governed by an audio coding protocol, while the PCM mode is governed by a PCM protocol such as μ-law encoding. The telecommunication system performs an in-band signaling routine during a first communication session in accordance with the PCM protocol. The in-band signaling routine employs a form of robbed bit signaling to transmit information between the calling codec and the called codec. The signaling information is utilized to determine whether the called codec is compatible with the enhanced audio coding mode and, as necessary, to initiate the transition between the PCM mode and the audio coding mode. The audio coding mode transmits signals using a wider bandwidth than that used during the PCM mode. The use of a wider bandwidth results in a higher quality sound that better resembles person-to-person speech.

Proceedings Article

Capacity and speech quality aspects using adaptive multi-rate (AMR)

O. Corbun¹, M. Almgren, K. Svanbro

Ericsson Radio Systems¹

08 Sep 1998

TL;DR: The aim is to show the gain provided by an AMR system compared with an existing GSM system using second generation EFR and HR (half rate) coders; the results show that there is a trade-off between capacity increase and speech quality degradation.

Abstract: The AMR (adaptive multi-rate) is an emerging speech codec cellular standard in the ETSI. This standard should be ready during as a speech GSM evolution. It is a new concept for achieving a high speech quality maintaining an efficient spectrum usage. According to the channel quality and the traffic load, the radio resource algorithm allocates a half-rate or a full-rate channel in order to obtain the best balance between quality and capacity. Within this channel, the codec is quickly adapted to track changes in the radio link. An AMR system model has been developed to show the impact on speech quality by varying the capacity from only full-rate channels to only half-rate channels. The aim is also to show the gain provided by an AMR system compared with an existing GSM system using second generation EFR (enhanced full rate) and HR (half rate) coders. The results show that there is a trade-off between capacity increase and speech quality degradation. It is also very clear that there is a potential gain in quality by using AMR compared to existing speech codecs in GSM systems.

Proceedings Article

Coding of natural audio in MPEG-4

Schuyler Quackenbush¹

AT&T Labs¹

12 May 1998

TL;DR: This paper presents an overview of the MPEG-4 natural audio coding framework and each of its component coding techniques.

Abstract: MPEG-4 standardizes natural audio coding at bit rates ranging from 2 kbit/s, suitable for intelligible speech coding, to 64 kbit/s per channel, suitable for high-quality audio coding. Within this range, three categories of coding are defined: parametric coding, code excited linear predictive coding (CELP) and time/frequency (T/F) coding. The unique contribution of MPEG-4 audio is that not only does it scale across a wide range of bit rates, but it also scales across a broad set of other parameters, such as sampling rate, bandwidth, voice pitch and complexity. This paper presents an overview of the MPEG-4 natural audio coding framework and each of its component coding techniques.

Proceedings Article

A software platform for multiway audio distribution over the Internet

O. Hodson¹, S. Varakliotis¹, V. Hardman¹

University College London¹

18 Nov 1998

TL;DR: The Robust Audio Tool is presented, methods of real-time multimedia delivery are discussed, issues of particular importance for music transmission over the Internet are identified, and Internet performance is illustrated in terms of packet loss and variable transit delays.

Abstract: The Robust Audio Tool (RAT) allows users to achieve real-time multiway communication over the Internet. It was initially intended for use in multiway conferences, but is being used as an Internet audio broadcast application, by radio stations in the US and elsewhere. RAT can also be used in a point-to-point manner, and as a transcoder between networks of differing capabilities, e.g. for mobile access to the Internet. The emphasis of work in RAT has been on maximising the audio quality despite inherent problems of packet transport, processor scheduling and audio capabilities of the end system. The important features of RAT, in comparison to other Internet audio tools, are that it is able to support multirate processing, has no restrictions on audio frame duration, and supports multi-channel audio and both fixed and variable size audio frames. We discuss methods of real-time multimedia delivery, and identify issues of particular importance for music transmission over the Internet. For music coding researchers interested in using RAT to exploit their research, we present an overview of the architecture of the RAT and specifically focus on codec integration. Finally, we present some off-line performance measurements of a public domain MPEG1 music codec that has been integrated into the RAT, and illustrate the Internet performance in terms of packet loss, and variable transit delays.

Proceedings Article

H.263 mobile video codec based on a low power consumption digital signal processor

Y. Naito¹, I. Kuroda

NEC¹

12 May 1998

TL;DR: Fast algorithms, such as a fast motion estimation algorithm and a low complexity noise reduction filter, are proposed to implement the video codec on a single DSP chip while maintaining sufficient picture quality; on a 50 MIPS, 100 mW DSP, the codec encodes and decodes 7.5 QCIF frames per second.

Abstract: This paper describes an H.263 video codec implementation based on a low power consumption general purpose DSP. Fast algorithms, such as a fast motion estimation algorithm and a low complexity noise reduction filter, are proposed to implement the video codec on a single DSP chip maintaining sufficient picture quality. By using a 50 MIPS, 100 mW DSP, the developed codec encodes and decodes 7.5 QCIF frames per second, which is sufficient performance for low bit-rate video compression, typically below 64 kbps.

Patent

Echo cancellation in the network for data applications

Donald Lars Duttweiler¹, David Goodwin Shaw¹

Alcatel-Lucent¹

19 Oct 1998

TL;DR: A network-based CODEC (coder-decoder) includes an echo canceler; it recognizes a data call by detecting predefined signaling portions of a modem handshaking process and uses a stored channel model to perform echo cancellation during the data call.

Abstract: A network-based CODEC (coder-decoder) includes an echo canceler. This CODEC recognizes the presence of a data call by detecting predefined signaling portions of a modem handshaking process. For each detected data call, the CODEC uses a stored channel model for performing echo cancellation during the data call. The CODEC trains off-line during selected segments of the modem call and then stores the new channel model for use in a future data call.

Patent

System for compressing video data using bi-orthogonal wavelet coding having a DSP for adjusting compression ratios to maintain a constant data flow rate of the compressed data

Christian L. Houlberg¹, Philip J. McPartland¹

United States Department of the Navy¹

18 Dec 1998

TL;DR: An encoder for compressing video data for transmission over a narrow bandwidth is described; it comprises a multiformat video codec for real-time compression of digital data and a dynamic random access memory that temporarily stores compressed data while the codec is compressing data.

Abstract: An encoder for compressing video data to allow for its transmission over a narrow bandwidth. The encoder comprises a multiformat video codec for real-time compression of digital data and a dynamic random access memory which operates as a temporary storage device storing compressed data while the codec is compressing data. A digital signal processor adjusts the data compression ratio for the codec while the codec is compressing video data. An EPROM, which is connected to the digital signal processor, contains the software to run the digital signal processor. A programmable gate array operates as an interface between the codec and an external processor. The array includes a read write controller which provides a read signal to the codec to allow compressed video data to be read from the codec to a parallel to serial shift register within the array. The write control signals which allow data to be written into and shifted through the register are also generated by the read write controller. The array includes a FIFO flush data controller which is used to flush data from a FIFO within the codec whenever the codec supplies a service request signal to the programmable gate array. The service request signal is provided to the array whenever an overflow condition is about to occur within the FIFO of the codec.

Proceedings Article

Low bit-rate frequency extension coding

R.C.F. Tucker¹

Hewlett-Packard¹

18 Nov 1998

TL;DR: This work proposes encoding just the noise component of the upper frequency band of the original signal using about 500 bits/sec, which greatly enhances contemporary music and close-microphone speech, but has little effect on classical music.

Abstract: There are now a number of applications, most notably streamed Internet audio, which require audio and speech to be encoded at a low bit rate, typically 16 kbit/sec or below. To achieve an acceptable quality, the original signal is normally low-pass filtered to somewhere between 4 and 5.5 kHz before encoding. Rather than discard the upper frequency band completely, we propose encoding just the noise component of it using about 500 bits/sec. This greatly enhances contemporary music and close-microphone speech, but has little effect on classical music. The process can be used to enhance any audio or speech codec, knowing only its encoding/decoding delay.

Proceedings Article

Backward adaptive warped lattice for wideband stereo coding

Ahi Harma¹, Unto K. Laine¹, Matti Karjalainen¹

Helsinki University of Technology¹

01 Sep 1998

TL;DR: An extremely low delay perceptual audio codec is presented; it is based on warped linear prediction, which inherently utilizes the auditory frequency resolution and frequency masking characteristics of hearing, and achieves its low delay with backward adaptive lattice methods.

Abstract: In this paper an extremely low delay perceptual audio codec is presented. The codec is based on warped linear prediction which inherently utilizes auditory frequency resolution and frequency masking characteristics of hearing. In the current version of the codec the coding delay is the minimum. This is achieved using backward adaptive lattice methods where waveform modeling is completely based on already transmitted data. Coding technique is applied separately to the two channels but the quantization processes are unified to gain more bit rate reduction.

Journal Article

Fast time-frequency transform algorithms and their applications to real-time software implementation of AC-3 audio codec

Yu-Chi Chen¹, Chien-Wu Tsai, Ja-Ling Wu

National Taiwan University¹

01 May 1998-IEEE Transactions on Consumer Electronics

TL;DR: Two fast algorithms for the time-frequency transform, one for memory economization and the other for time domain subsampling, are presented, and the current performance status of the AC-3 decoder is stated.

Abstract: AC-3 audio coding technology is a kind of perceptual audio coder (PAC) developed by the Dolby Company. Up to 5 full-bandwidth channels and one subwoofer channel (cutoff at 120 Hz) are available in AC-3 to provide multi-channel, low bit rate, and high perceptual quality of audio. This explains why AC-3 has become the audio standard of many international standards. In this paper, we focus on the real-time software implementation issues of AC-3. Two fast algorithms of the time-frequency transform, one for memory economization and the other for time domain subsampling, are presented. Meanwhile, we state the current performance status of our AC-3 decoder.

Proceedings Article

AudioPaK-an integer arithmetic lossless audio codec

Mat Hans¹, Ronald W. Schafer

Georgia Institute of Technology¹

30 Mar 1998

TL;DR: A simple, lossless audio codec, called AudioPaK, which uses only a small number of integer arithmetic operations on both the coder and the decoder side, and performs as well as, or even better than, most lossless audio codecs.

Abstract: We designed a simple, lossless audio codec, called AudioPaK, which uses only a small number of integer arithmetic operations on both the coder and the decoder side. The main operations of this codec are polynomial prediction and Golomb-Rice coding, and are done on a frame basis. Our coder performs as well as, or even better than, most lossless audio codecs.
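
The two operations named in this abstract, polynomial prediction and Golomb-Rice coding, can be sketched with integer arithmetic only. This is a generic illustration rather than AudioPaK itself: the fixed predictors of order 0 to 3 (repeated first differences with zero history), the exhaustive search for the Rice parameter `k`, and all function names are assumptions.

```python
def residuals(frame, order):
    """Residual of the fixed polynomial predictor of the given order (0..3)."""
    e = list(frame)
    for _ in range(order):
        # Each pass is a first difference; samples before the frame are zero.
        e = [e[i] - (e[i - 1] if i > 0 else 0) for i in range(len(e))]
    return e


def reconstruct(res, order):
    """Invert residuals() with `order` running sums."""
    x = list(res)
    for _ in range(order):
        total, out = 0, []
        for v in x:
            total += v
            out.append(total)
        x = out
    return x


def rice_encode(values, k):
    """Golomb-Rice code: unary quotient, '0' terminator, k-bit remainder."""
    bits = []
    for v in values:
        u = 2 * v if v >= 0 else -2 * v - 1   # map signed to unsigned
        q, r = u >> k, u & ((1 << k) - 1)
        bits.append("1" * q + "0" + (format(r, f"0{k}b") if k else ""))
    return "".join(bits)


def rice_decode(bits, k, count):
    values, pos = [], 0
    for _ in range(count):
        q = 0
        while bits[pos] == "1":
            q += 1
            pos += 1
        pos += 1                               # skip the terminating '0'
        r = int(bits[pos:pos + k], 2) if k else 0
        pos += k
        u = (q << k) | r
        values.append(u >> 1 if u % 2 == 0 else -((u + 1) >> 1))
    return values


def encode_frame(frame):
    """Pick the predictor order and Rice parameter that suit this frame."""
    order = min(range(4), key=lambda p: sum(abs(v) for v in residuals(frame, p)))
    res = residuals(frame, order)
    k = min(range(16), key=lambda kk: len(rice_encode(res, kk)))
    return order, k, rice_encode(res, k)


def decode_frame(order, k, bits, count):
    return reconstruct(rice_decode(bits, k, count), order)


frame = [10, 12, 14, 16, 18, 21, 23, 25]
order, k, bits = encode_frame(frame)
print(order, k, len(bits), decode_frame(order, k, bits, len(frame)) == frame)
```

A real codec would also have to transmit `order` and `k` with each frame and would estimate `k` from the residual statistics rather than trying every value.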

Journal Article

FM Analysis/Synthesis-Based Audio Coding

Bondhan Winduratna

01 May 1998-Journal of The Audio Engineering Society

Proceedings Article•DOI•

Joint source-channel coding with a redundancy-matched binary mapping in GSM speech transmission

Wen Xu¹•Institutions (1)

Siemens¹

08 Nov 1998

TL;DR: The basic idea is to convert the residual redundancy of the source encoded parameters into the bit redundancy such that it can be more efficiently utilized in the channel decoding and the resulting parameters are less vulnerable to digital errors.

Abstract: The optimal binary mappings for converting the signal redundancy of the zeroth order (nonuniformity) and the first order (correlation) into individual bits are described. By employing a mapping matched to the residual redundancy inherent in the source-encoded parameters, further gains can be obtained in joint source-channel coding. The basic idea is to convert the residual redundancy of the source-encoded parameters into bit redundancy, so that it can be utilized more efficiently in channel decoding and the resulting parameters are less vulnerable to digital errors. The approach is successfully applied to the GSM full rate (FR) codec to achieve a more reliable transmission of speech signals.

Dissertation•

Optimization of digital audio for internet transmission

Mat Hans, Ronald W. Schafer

01 Jan 1998

TL;DR: The MPEG standard is improved and enhanced by introducing new algorithmic and architectural enhancements while staying compliant with the standard, and a new, real-time lossless audio codec is designed and implemented, which is optimized for Internet transmission because of its low instruction complexity and good compression performance.

Abstract: The focus of this thesis is the development of novel and practical algorithms to encode, transmit, and decode digital compact-disc-quality audio over the Internet in real time. More precisely, we improve and enhance the widely accepted international MPEG audio standard, which defines a lossy compression codec, and we innovate in the area of lossless audio coding, which we believe is likely to play an important part in audio transmission over the Internet in conjunction with the lossy technologies. We enhance the MPEG standard by introducing new algorithmic and architectural enhancements while staying compliant with the standard. Also, we design a decoding process that can adapt to varying computational characteristics. Finally, we transcode the compressed bit stream for streaming over packet networks to several users through paths with heterogeneous characteristics. In the area of lossless audio compression, we survey and classify state-of-the-art lossless audio codecs, and we design and implement a new, real-time lossless audio codec (AudioPaK), which is optimized for Internet transmission because of its low instruction complexity and good compression performance.

Proceedings Article•DOI•

Low-power implementation of H.324 audiovisual codec dedicated to mobile computing

Takao Onoye¹, Gen Fujita, Hiroyuki Okuhata, Morgan Hirosuke Miki, Isao Shirakawa¹•Institutions (1)

Osaka University¹

10 Feb 1998

TL;DR: A VLSI implementation of the H.324 audiovisual codec is described, using 0.35 μm CMOS 4LM technology; it contains a total of 420 K transistors and dissipates 224.32 mW from a single 3.3 V supply.

Abstract: A VLSI implementation of the H.324 audiovisual codec is described. A number of sophisticated low-power architectures have been devised specifically for mobile use. A set of specific functional units, each corresponding to a process of the H.263 video codec, is employed to relieve different performance bottlenecks. A compact DSP core composed of two MAC units is used for both the ACELP and MP-MLQ coding schemes of the G.723.1 speech codec. The proposed audiovisual codec core has been implemented using 0.35 μm CMOS 4LM technology; it contains a total of 420 K transistors and dissipates 224.32 mW from a single 3.3 V supply.

Journal Article•

A Low-Power DSP Core Architecture for Low Bitrate Speech Codec (Special Section on Digital Signal Processing)

Hiroyuki Okuhata, Morgan Hirosuke Miki, Takao Onoye, Isao Shirakawa

25 Aug 1998-IEICE Transactions on Fundamentals of Electronics, Communications and Computer Sciences

Patent•

Interpolation digital filter for audio CODEC

Ihm Jae-Yong

23 Nov 1998

TL;DR: An interpolation digital filter for an audio CODEC uses a bit-serial method with a 256 FS clock signal to convert a 32-bit data signal from a sampling frequency of 1 FS to 8 FS, reducing system size and cost.

Abstract: An interpolation digital filter for an audio CODEC uses a bit-serial method for an audio CODEC system with a clock signal of 256 FS. The interpolation digital filter converts a 32-bit data signal at a sampling frequency of 1 FS to a 32-bit data signal at a sampling frequency of 8 FS using a clock signal of 256 FS in a filter unit. The invention thereby reduces the size and the cost of the system.
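
The 1 FS to 8 FS rate conversion described in the abstract can be pictured with a generic software sketch: insert zeros between input samples, then smooth with a low-pass FIR filter. This does not reproduce the patent's bit-serial hardware method or its filter coefficients. The function name `interpolate` and the triangular (linear-interpolation) kernel are assumptions chosen only to keep the example short.

```python
def interpolate(samples, factor=8):
    """Raise the sampling rate by `factor`: zero-stuff, then FIR low-pass.
    The triangular kernel used here yields plain linear interpolation."""
    # Zero-stuffing: each input sample is followed by factor - 1 zeros.
    stuffed = []
    for s in samples:
        stuffed.append(s)
        stuffed.extend([0] * (factor - 1))

    # Symmetric triangular taps for offsets -(factor-1) .. +(factor-1).
    half = factor - 1
    taps = [1 - abs(m) / factor for m in range(-half, half + 1)]

    out = []
    for n in range(len(stuffed)):
        acc = 0.0
        for j, h in enumerate(taps):
            idx = n + j - half
            if 0 <= idx < len(stuffed):   # samples outside the buffer count as zero
                acc += h * stuffed[idx]
        out.append(acc)
    return out


ramp = interpolate([0, 8], factor=8)
```

With the input `[0, 8]`, the first nine outputs rise linearly from the first sample to the second. A real audio interpolator would use a longer, sharper low-pass filter to suppress the spectral images that zero-stuffing creates.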
