Showing papers on "Adaptive Multi-Rate audio codec published in 1993"


Journal ArticleDOI
TL;DR: The audio coding standard developed by the Moving Pictures Expert Group within the International Organization for Standardization (ISO/MPEG) is covered in some detail, since it will be used in many application areas, including digital storage, transmission, and broadcasting of audio-only signals and audiovisual applications such as videotelephony, videoconferencing, and TV broadcasting.
Abstract: Typical parameters of wideband speech and audio signals, including digitized versions of each, potential applications, and available transmission media, are described. Facts about human auditory perception that are exploited in audio coding and quality measures that play an important role in coder evaluations and designs are reviewed. Techniques for efficient coding of wideband speech and audio signals, with an emphasis on existing standards, are discussed. The audio coding standard developed by the Moving Pictures Expert Group within the International Organization for Standardization (ISO/MPEG) is covered in some detail, since it will be used in many application areas, including digital storage, transmission, and broadcasting of audio-only signals and audiovisual applications such as videotelephony, videoconferencing, and TV broadcasting. Ongoing research and standardization work is outlined.

131 citations


Patent
20 May 1993
TL;DR: In this paper, a codec subsystem is shared between several end users and can be located near the switch, which can be used for video conferencing, remote surveillance or desk-top services.
Abstract: A dial-up aural and visual communication system includes a telecommunication network with a switch connected thereto, a codec subsystem connected to the switch and video equipment connected to the switch via the codec subsystem with voice communication equipment connected directly to the switch. The codec subsystem is shared between several end users and can be located near the switch. Sharing the codec reduces the cost and amount of equipment at the end user's desk. The codec subsystem can also switch video, including composite video, between local lines and can include frame and image storage. The codec can transmit at 9.6 kbps, p×64 kbps, and via ISDN. The system can be used for video conferencing, remote surveillance or desk-top services, and can include an image grooming system. Images may be stored in switch facilities traditionally used for voice mail.

62 citations


BookDOI
01 Jan 1993
TL;DR: Speech Coding for Wireless Transmission, a Beginner's Guide to Speech Coding, and Topics in speech Coding.
Abstract: I: Introduction. II: Low Delay Speech Coding. III: Speech Quality. IV: Speech Coding for Wireless Transmission. V: Audio Coding. VI: Speech Coding for Noisy Transmission Channels. VII: Topics in Speech Coding. Author Index. Index.

56 citations


Journal ArticleDOI
TL;DR: The interaction of coding with networking in a multiuser environment, including algorithms for robust coding which anticipate imperfect network performance, and techniques of decoding a signal that has traversed an imperfect network are described.
Abstract: An overview of low bit rate coding and the interaction between source coding and channel coding is presented. The interaction of coding with networking in a multiuser environment, including algorithms for robust coding which anticipate imperfect network performance, and techniques of decoding a signal that has traversed an imperfect network are described. The performances of such algorithms are illustrated with examples from speech, audio, and video transmission in the presence of packet losses. The challenges in measuring the quality of service (QOS) in the context of new algorithms for coding and networking and the difficulty of measuring QOS in the networking of multimedia information are discussed.

35 citations


Patent
23 Feb 1993
TL;DR: In this paper, a video codec for a videophone terminal of an integrated services digital network is presented, which is organized to receive firstly image signals from a television camera so as to transmit them, after compressing and encoding them, to a remote video decoding unit, via a digital transmission line, and secondly similarly-processed image signals via the digital transmission lines, to decompress and decode them for a television screen receiver locally connected to the video codec.
Abstract: A video codec for a videophone terminal of an integrated services digital network. The codec is organized to receive firstly image signals from a television camera so as to transmit them, after compressing and encoding them, to a remote video decoding unit, via a digital transmission line, and secondly similarly-processed image signals via the digital transmission line, so as to decompress and decode them for a television screen receiver locally connected to the video codec. The video codec includes a processing unit, of the integrated circuit type, co-operating with a single external memory plane both to compress and encode data to be transmitted, and also to decompress and decode received data, with the assistance of internal operational components time shared so as to apply known methods to transmission and to reception.

24 citations


Proceedings ArticleDOI
27 Apr 1993
TL;DR: The authors propose an audio synthesis/coding method which employs an optimized wavelet-transform (WT)-based adaptive transform coding to exploit perceptual masking and a dynamic dictionary is used to extract source redundancies.
Abstract: The authors propose an audio synthesis/coding method which employs an optimized wavelet-transform (WT)-based adaptive transform coding to exploit perceptual masking. In addition, a dynamic dictionary is used to extract source redundancies. Experiments indicated that transparent coding of monophonic CD quality signals (sampled at 44.1 kHz) is possible at bit rates approaching 64 kbit/s when the WT-based method alone is used, and in the range of 48-64 kbit/s when the dynamic dictionary is also employed.

21 citations
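
To put the bit rates quoted in the wavelet-transform abstract above in perspective, the short sketch below compares them with uncompressed monophonic CD audio. It assumes 16-bit PCM samples, which is the CD format but is not stated in the abstract, and the function names are hypothetical helpers introduced only for illustration.

```python
# Assumption: CD-quality PCM uses 16 bits per sample (not stated in the abstract).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16


def pcm_bit_rate(sample_rate_hz, bits_per_sample, channels=1):
    """Bit rate of uncompressed PCM audio in bit/s."""
    return sample_rate_hz * bits_per_sample * channels


def compression_ratio(coded_bit_rate):
    """Ratio of the uncompressed mono CD bit rate to the coded bit rate."""
    return pcm_bit_rate(SAMPLE_RATE_HZ, BITS_PER_SAMPLE) / coded_bit_rate


def coded_bits_per_sample(coded_bit_rate, sample_rate_hz=SAMPLE_RATE_HZ):
    """Average number of coded bits spent on each input sample."""
    return coded_bit_rate / sample_rate_hz


if __name__ == "__main__":
    for rate in (64_000, 48_000):  # bit rates quoted in the abstract
        print(rate, round(compression_ratio(rate), 2),
              round(coded_bits_per_sample(rate), 2))
```

Under that assumption, the uncompressed rate is 705.6 kbit/s, so 64 kbit/s corresponds to a compression ratio of roughly 11 and 48 kbit/s to roughly 14.7.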


Journal ArticleDOI
TL;DR: The audio quality, robustness and implementational complexity of a novel mobile digital audio broadcast scheme are addressed and the audio codec proposed is based on an efficient combination of subband coding and multipulse excited linear prediction coding.
Abstract: The audio quality, robustness and implementational complexity of a novel mobile digital audio broadcast scheme are addressed. The audio codec proposed is based on an efficient combination of subband coding (SBC) and multipulse excited linear prediction coding (MPLPC). The bit allocation is dynamically adapted according to both the signal power in different subbands and a perceptual hearing model. Typically a segmental signal to noise ratio (SEGSNR) in excess of 30 dB associated with high fidelity subjective quality was achieved for 2.67-b/sample transmissions at a bit rate of 86 kb/s. Perceptually unimpaired audio quality was achieved for a bit error rate (BER) of about 10^-4, when injecting random errors, which was degraded for increased BERs. In order to provide robust error protection, the audio codec was also subjected to a rigorous bit sensitivity analysis. Four different forward error correction schemes were investigated in order to explore the complexity, bit rate, and robustness tradeoffs.

19 citations
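
The segmental SNR figure quoted in the broadcast-codec abstract above can be illustrated with a minimal sketch. It uses the common textbook definition (the mean of per-frame SNRs in dB); the frame length, the handling of silent frames, and the function names are assumptions, since the abstract does not say how the authors computed SEGSNR. The second helper only restates the arithmetic implied by "2.67 b/sample at 86 kb/s".

```python
import math


def segmental_snr_db(reference, decoded, frame_len=256, eps=1e-12):
    """Mean of per-frame SNRs in dB, computed over full frames only.

    frame_len and the skipping of silent frames are illustrative choices,
    not taken from the paper.
    """
    n = min(len(reference), len(decoded))
    frame_snrs = []
    for start in range(0, n - frame_len + 1, frame_len):
        ref = reference[start:start + frame_len]
        dec = decoded[start:start + frame_len]
        signal_energy = sum(x * x for x in ref)
        noise_energy = sum((x - y) ** 2 for x, y in zip(ref, dec))
        if signal_energy <= eps:
            continue  # skip silent frames, which would dominate the average
        frame_snrs.append(10.0 * math.log10(signal_energy / max(noise_energy, eps)))
    if not frame_snrs:
        raise ValueError("no non-silent full frames to evaluate")
    return sum(frame_snrs) / len(frame_snrs)


def implied_sample_rate_hz(bit_rate_bps, bits_per_sample):
    """Sampling rate implied by a bit rate and an average bits-per-sample figure."""
    return bit_rate_bps / bits_per_sample


if __name__ == "__main__":
    ref = [math.sin(2 * math.pi * 5 * i / 1024) for i in range(1024)]
    dec = [0.9 * x for x in ref]  # toy "codec": a 10% amplitude error
    print(segmental_snr_db(ref, dec))
    print(implied_sample_rate_hz(86_000, 2.67))
```

With the toy 10% amplitude error every frame has an SNR of 20 dB, so the segmental value is also 20 dB. The quoted figures imply a sampling rate of roughly 32.2 kHz.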



Proceedings ArticleDOI
A. Crossman1
13 Oct 1993
TL;DR: The audio codec described here was designed to provide 7 kHz bandwidth audio equivalent to CCITT G.722 for both music and speech with extremely low complexity, a variable bit rate with subjectively transparent transitions between changes in bit rate, and scalability (the ability to operate as a narrow band or wide band codec).
Abstract: Videoconferencing requires the transmission of both video and audio information over a communication channel of fixed bandwidth. Typically, a portion of the bandwidth is dedicated to audio and the remaining bandwidth to video. For efficient operation, it is necessary to use a variable bit rate audio codec and a variable bit rate video codec. During periods of silence less audio information is transmitted, allowing the video bandwidth and picture quality to increase by using the vacated audio bandwidth. During periods of high audio activity the audio bandwidth increases to maintain audio fidelity. The audio codec described here was designed for use in such a videoconferencing system. It was designed to provide 7 kHz bandwidth audio equivalent to CCITT G.722 (at 48 kbit/s) for both music and speech with extremely low complexity, a variable bit rate with subjectively transparent transitions between changes in bit rate, and scalability (the ability to operate as a narrow band or wide band codec). The nominal average bit rate during periods of high audio activity is 28 kbit/s, while during silence it is as little as 6 kbit/s.

14 citations
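
The bandwidth trade-off described in the videoconferencing abstract above can be sketched as simple arithmetic. The two audio rates come from the abstract; the 128 kbit/s channel in the usage example and the function name are hypothetical, chosen only to make the numbers concrete.

```python
# Audio rates quoted in the abstract (bit/s).
AUDIO_ACTIVE_BPS = 28_000   # nominal average during high audio activity
AUDIO_SILENCE_BPS = 6_000   # lower bound quoted for silence


def video_budget_bps(channel_bps, audio_active):
    """Bit rate left for video on a fixed-rate channel.

    Hypothetical helper: the real system's allocation logic is not
    described in the abstract.
    """
    audio_bps = AUDIO_ACTIVE_BPS if audio_active else AUDIO_SILENCE_BPS
    if audio_bps > channel_bps:
        raise ValueError("channel too narrow for the audio stream")
    return channel_bps - audio_bps


if __name__ == "__main__":
    channel = 128_000  # assumed channel rate, not from the abstract
    print(video_budget_bps(channel, audio_active=True))
    print(video_budget_bps(channel, audio_active=False))
```

On the assumed 128 kbit/s channel, video would get 100 kbit/s while someone is speaking and 122 kbit/s during silence, a gain of 22 kbit/s.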


Proceedings ArticleDOI
27 Apr 1993
TL;DR: A 2.4-kbit/s analysis-by-synthesis speech codec based on a CELP (code-excited linear prediction) structure is presented and achieves better subjective ratings than LPC-10e; however, it is still inferior to the Federal standard operating at double the bit rate, although for many speakers the difference was small.
Abstract: A 2.4-kbit/s analysis-by-synthesis speech codec based on a CELP (code-excited linear prediction) structure is presented. Several bit-rate reduction techniques, including the addition of a frame classifier and a frame-dependent codec structure, are used to improve the resulting speech quality. For voiced frames, the encoding is based entirely on the past excitation retrieved from an adaptive codebook using multitap gains with as many as seven taps. Unvoiced frames are encoded using only stochastic excitation. Finally, transition frames use both the adaptive and the stochastic codebooks. The results of subjective quality evaluation show that the closed-loop classifier performs slightly better than the open-loop classifier, especially for female speakers. The best 2.4-kbit/s system uses the closed-loop classifier and a five-tap adaptive codebook for encoding voiced segments. This system achieves better subjective ratings than LPC-10e; however, it is still inferior to the Federal standard operating at double the bit rate, although for many speakers the difference was small.

10 citations
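
The frame-dependent structure described in the 2.4-kbit/s CELP abstract above (voiced frames use only the adaptive codebook, unvoiced frames only the stochastic codebook, transition frames both) can be sketched as a small dispatch table. This is an illustrative sketch only: the class and function names are hypothetical, and the codebook search, gains, and classifier themselves are not modeled.

```python
from enum import Enum


class FrameClass(Enum):
    VOICED = "voiced"
    UNVOICED = "unvoiced"
    TRANSITION = "transition"


# (use adaptive codebook, use stochastic codebook) per frame class,
# following the description in the abstract.
CODEBOOK_USE = {
    FrameClass.VOICED: (True, False),
    FrameClass.UNVOICED: (False, True),
    FrameClass.TRANSITION: (True, True),
}


def build_excitation(frame_class, adaptive_vec, stochastic_vec):
    """Combine already-scaled codebook contributions for one frame.

    Both vectors are assumed to be gain-scaled and of equal length.
    """
    use_adaptive, use_stochastic = CODEBOOK_USE[frame_class]
    return [
        (a if use_adaptive else 0.0) + (s if use_stochastic else 0.0)
        for a, s in zip(adaptive_vec, stochastic_vec)
    ]


if __name__ == "__main__":
    adaptive = [1.0, 2.0]
    stochastic = [0.5, 0.5]
    for cls in FrameClass:
        print(cls.value, build_excitation(cls, adaptive, stochastic))
```

Dropping one of the two contributions for voiced and unvoiced frames is what frees bits at such a low rate, since no index or gain needs to be transmitted for the unused codebook.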


Patent
Yoshito Haba1
18 May 1993
TL;DR: In this paper, a coding mode for transmitting audio information is set, and the audio CODEC corresponding to the coding mode is selected and switched according to the instruction.
Abstract: A communication apparatus having a plurality of audio CODEC and communication method thereof. When a coding mode for transmitting audio information is set, the audio CODEC corresponding to the coding mode is selected and switched. When the coding mode of the audio information is instructed from the terminal on the transmission side, the audio CODEC on the reception-side terminal is switched according to the instruction. When these audio CODEC are switched, generation of audio noise can be suppressed by setting the mute on the audio information.

Book ChapterDOI
01 Jan 1993
TL;DR: The constraints imposed by the critical band model of aural perception are best met by nonuniform analysis-synthesis systems based on maximally decimated filter banks in which the bandwidths of the channels increase with increasing frequency.
Abstract: Over the last decade, analysis-synthesis systems based on maximally decimated filter banks have emerged as one of the important techniques for speech and audio coding. For speech and audio signals, the analysis-synthesis filter bank can be thought of as modeling the human auditory system, where the critical band model of aural perception is reflected in the design of the filter banks. The constraints imposed by the aural model are best met by nonuniform analysis-synthesis systems in which the bandwidths of the channels increase with increasing frequency.

Proceedings ArticleDOI
13 Oct 1993
TL;DR: In this paper, the authors proposed a Wideband-CELP-Coding scheme (bandwidth 7 kHz) at 24 kbit/s with a delay of just 10 ms.
Abstract: This paper proposes a Wideband-CELP-Coding scheme (bandwidth 7 kHz) at 24 kbit/s. The codec introduces a delay of just 10 ms. This fulfills the requirements of a possible codec candidate for wideband speech coding within DECT or video applications [1]. The analysis-by-synthesis structure of the proposed Wideband-CELP-Codec includes an alternative LPC analysis concept, where the autocorrelation function is calculated recursively [2]. This special LPC scheme provides an improved speech quality and a reduction of computational complexity in comparison to conventional algorithms for the LPC analysis. In addition, a stochastic sparse codebook with extremely low computational effort is presented, comparable to the one presented in [12], resulting in a negligible amount of storage. The CCITT G.722 standard was applied as reference codec, in order to compare the new coding scheme in terms of subjective quality. With the proposed Wideband-CELP, a speech quality is achieved which is equivalent to the reference codec operating at 56 kbit/s.

Proceedings Article
01 Jan 1993
TL;DR: A speech quality is achieved, which is equivalent to the reference codec operating at 56 kbit/s, and a stochastic sparse codebook with extremely low computational effort is presented, comparable to the one presented in [12].
Abstract: This paper proposes a Wideband-CELP-Coding scheme (bandwidth 7 kHz) at 24 kbit/s. The codec introduces a delay of just 10 ms. This fulfills the requirements of a possible codec candidate for wideband speech coding within DECT or video applications [1]. The analysis-by-synthesis structure of the proposed Wideband-CELP-Codec includes an alternative LPC analysis concept, where the autocorrelation function is calculated recursively [2]. This special LPC scheme provides an improved speech quality and a reduction of computational complexity in comparison to conventional algorithms for the LPC analysis. In addition, a stochastic sparse codebook with extremely low computational effort is presented, comparable to the one presented in [12], resulting in a negligible amount of storage. The CCITT G.722 standard was applied as reference codec, in order to compare the new coding scheme in terms of subjective quality. With the proposed Wideband-CELP, a speech quality is achieved which is equivalent to the reference codec operating at 56 kbit/s.

Proceedings ArticleDOI
L. Cellario1, Daniele Sereno
13 Oct 1993
TL;DR: A variable rate speech codec with seven operating rates ranging from 400 bit/s to 16 kbit/s and a 10 ms algorithmic delay is presented in this paper.
Abstract: A variable rate speech codec with seven operating rates ranging from 400 bit/s to 16 kbit/s and a 10 ms algorithmic delay is presented in this paper. The current rate is chosen according to an open-loop speech classification followed by a closed-loop quality evaluation. The rate selection can be source-controlled or network-controlled. The codec is based on a CELP algorithm with a variable number of excitations. The innovations have a deterministic structure allowing a very efficient search procedure and are orthogonalized to the long-term contribution. The algorithm will be implemented in the testbed of one RACE II project: CODIT (COde DIvision Testbed) [1].

PatentDOI
Makio Nakamura1, Akira Hioki1
TL;DR: In this article, one frame of speech signal data and subframes of speech data are encoded by frame and subframe encoding processes while another frame is decoded by frame-and sub-frame decoding processes.
Abstract: One frame of speech signal data and subframes of speech signal data divided from the frame of speech signal data are encoded by frame and subframe encoding processes while another frame is decoded by frame and subframe decoding processes. The frame and subframe encoding processes and the frame and subframe decoding processes are interleaved to reduce the DSP memory capacity needed for a speech codec.

Book ChapterDOI
01 Jan 1993
TL;DR: The quality improvements are in terms of increased intelligibility, naturalness and speaker recognition, and several future applications are foreseen for wideband speech coders, such as teleconferencing, commentary channels, and high-quality wideband telephony.
Abstract: In recent years, there has been a great advance in the development of speech coding algorithms at very low bit rates. High-quality speech coders are now available at bit rates below 8 kb/s. Researchers' efforts, however, have focussed on narrow-band speech signals where the transmission bandwidth is limited to 300-3400 Hz, as in analog telephone systems. This bandwidth limitation degrades the speech quality, especially when the speech is to be heard through loudspeakers. For many future applications, a wider bandwidth is needed in order to achieve face-to-face communication quality. A bandwidth of 50-7000 Hz provides significantly improved quality as compared to narrow-band speech. The quality improvements are in terms of increased intelligibility, naturalness and speaker recognition. Several future applications are foreseen for wideband speech coders, such as teleconferencing, commentary channels, and high-quality wideband telephony.

Proceedings ArticleDOI
17 Oct 1993
TL;DR: The adaptive predictive coding with transform domain quantization (APC-TQ) technique was proposed by Bhaskar (1991) for the compression of audio signals and the result is a near transparent quality compression of 5 kHz bandwidth audio at a rate of 17 kbit/s.
Abstract: The adaptive predictive coding with transform domain quantization (APC-TQ) technique was proposed by Bhaskar (1991) for the compression of audio signals. Since then, significant developments have taken place leading to a reduction in the coding rate while enhancing the audio quality. These developments include (i) the use of block size adaptation to exploit the variations in the stationarity of the signal, (ii) high resolution spectral modeling using LPC analysis orders up to 64, and (iii) an adaptive bit-allocation procedure to minimize coding noise power as well as minimize the perception of coding noise. The result is a near transparent quality compression of 5 kHz bandwidth audio at a rate of 17 kbit/s. This technology will find applications in the distribution and transmission of AM quality audio programming over low rate channels such as the INMARSAT Standard A, B and aeronautical systems.

Patent
21 Jun 1993
TL;DR: In this paper, a method and an apparatus for preprocessing digital voice data en route to or from a telephone's CODEC (108) is described, where an auxiliary processing device (130) connected to the internal bus side of the phone is provided with means to process the digital voice information before sending it uplink or to the CODEC.
Abstract: A method and apparatus for preprocessing digital voice data en route to or from a telephone's CODEC (108). The apparatus supports a CODEC (108) directly connected to an internal bus and using a separate CODEC clock and sync signal to control transfers between a telephony link/internal bus interface (102) and the CODEC (108). According to an embodiment of the present invention, an auxiliary processing device (130) (connected to the internal bus side of the phone) is provided with means to process the digital voice information before sending it uplink or to the CODEC (108). This is accomplished without changing the position of the incoming voice field.

Proceedings Article
01 Jan 1993

Proceedings ArticleDOI
03 May 1993
TL;DR: A new variable-rate codec for the compression of image sequences is presented, based on a two-dimensional finite-state vector quantization (2-D FSVQ), and a frame adaptive technique using codebook and address map replenishment.
Abstract: A new variable-rate codec for the compression of image sequences is presented. This codec is based on a two-dimensional finite-state vector quantization (2-D FSVQ), and a frame adaptive technique using codebook and address map replenishment. The results show that the codec achieves a good picture quality at low bit rate.


T.J. Moulsley1, I. Wells1
06 Dec 1993
TL;DR: The TETRA standard is described, the requirements for speech transmission in this system are detailed and the procedure adopted for selection of a codec algorithm is described.
Abstract: This paper briefly describes the TETRA standard, details the requirements for speech transmission in this system, describes the procedure adopted for selection of a codec algorithm and gives details of one of the codec candidates.

Proceedings ArticleDOI
19 Oct 1993
TL;DR: A hybrid CELPC and voice excited linear predictive coding (VELPC) scheme is presented in the paper for speech coding with lower complexity and the test experiments showed this new coder could produce synthesized speech with good quality at bit rates around 4.0 kbps.
Abstract: Code-excited linear predictive coding (CELPC) is now the main technique used to produce good quality speech at bit rates around 4.8 kbps. However, the original CELPC is impractical in most cases due to its heavy computational load. A hybrid CELPC and voice excited linear predictive coding (VELPC) scheme is presented in the paper for speech coding with lower complexity. In the algorithm, the speech signal is first divided in the frequency domain into two parts, the base-band and the high-band, and then the base-band and high-band signals are coded with the CELPC and VELPC techniques respectively. The test experiments showed this new coder could produce synthesized speech with good quality at bit rates around 4.0 kbps.

Proceedings ArticleDOI
19 Oct 1993
TL;DR: A novel approach to integrate the echo canceller into the LD-CELP codec, such as CCITT G.728 "Coding of speech at 16 kbps using low-delay code-excited linear prediction" codec.
Abstract: A new scheme to handle the echo canceller as an on-side job of a LD-CELP codec is studied in the paper. The authors give a novel approach to integrate the echo canceller (CCITT G.165) into the LD-CELP codec, such as the CCITT G.728 "Coding of speech at 16 kbps using low-delay code-excited linear prediction" codec.


Proceedings ArticleDOI
19 Oct 1993
TL;DR: An arithmetic coding data compression codec for transmitting newspapers is reported; by applying a limited accuracy algorithm to Chinese words and a rastered photograph newspaper page with a resolution of 24 Pels/mm, a compression ratio of about 10 has been obtained.
Abstract: An arithmetic coding data compression codec for transmitting newspapers is reported. By applying a limited accuracy algorithm to Chinese words and a rastered photograph newspaper page with a resolution of 24 Pels/mm, a compression ratio of about 10 has been obtained. This codec was put into use in May 1992 in the national transmission network of China.

01 Feb 1993
TL;DR: Unlike most hard decision CODEC's, the HARRIS CODEC doesn't upgrade BER performance significantly at high BER's but rather becomes transparent.
Abstract: HARRIS, under contract with NASA Lewis, has developed a hard decision BCH (Bose-Chaudhuri-Hocquenghem) triple error correcting block CODEC ASIC, that can be used in either a bursted or continuous mode. The ASIC contains both encoder and decoder functions, programmable lock thresholds, and PSK related functions. The CODEC provides up to 4 dB of coding gain for data rates up to 300 Mbps. The overhead is selectable from 7/8 to 15/16 resulting in minimal band spreading, for a given BER. Many of the internal calculations are brought out enabling the CODEC to be incorporated in more complex designs. The ASIC has been tested in BPSK, QPSK and 16-ary PSK link simulators and found to perform to within 0.1 dB of theory for BER's of 10^-2 to 10^-9. The ASIC itself, being a hard decision CODEC, is not limited to PSK modulation formats. Unlike most hard decision CODEC's, the HARRIS CODEC doesn't upgrade BER performance significantly at high BER's but rather becomes transparent.

DOI
01 Jan 1993
TL;DR: This thesis provides the design of two new psychoacoustic models used to effect the real time implementation of the new ISO/MPEG audio codec on an existing hardware platform at MPR Teltech, a subsidiary of the British Columbia Telephone Co.
Abstract: Implementation of an audio codec (coder/decoder) is sought which meets a new standard proposed by the ISO/MPEG (International Standards Organization / Moving Pictures Experts Group) committee. The standard aims to encode wideband audio signals, achieving data compression by employing psychoacoustic modelling. By exploiting the properties of the human auditory system, psychoacoustic modelling shapes quantization noise spectrally to render it inaudible. This thesis provides the design of two new psychoacoustic models used to effect the real time implementation of the new ISO/MPEG audio codec on an existing hardware platform at MPR Teltech, a subsidiary of the British Columbia Telephone Co. The two new models are named: spreading function model and attenuation model. Both models overcome real time implementation problems of existing psychoacoustic models found in the literature by introducing new methods of calculating the global masking threshold and signal to mask ratios. Listening tests and simulation analyses show that while both models attain very high audio coding quality, the attenuation model is superior to the spreading function model, and is considered for implementation in the new ISO/MPEG codec on the existing hardware platform.

Proceedings ArticleDOI
13 Oct 1993
TL;DR: The extension of the ISO/IEC high quality audio coding standard IS 11172-3 (MPEG-Audio) to lower sampling rates is described and the focus is on extensions of the Layer III of MPEG-Audio.
Abstract: There is a gray zone between classical speech coders, working on speech signals sampled at 8 kHz, and high quality audio coders, working on audio signals sampled at 32 kHz and higher. In the last years, several systems have been proposed which fill this gray zone from both ends. These systems include wide band speech coding systems as well as high quality audio coding systems working with reduced bandwidth. This paper describes the extension of the ISO/IEC high quality audio coding standard IS 11172-3 (MPEG-Audio) to lower sampling rates. The focus is on extensions of the Layer III of MPEG-Audio.