Author

R. Geiger

Bio: R. Geiger is an academic researcher. The author has contributed to research on the topics of transform coding and the second-generation wavelet transform. The author has an h-index of 2 and has co-authored 2 publications receiving 113 citations.

Papers
Proceedings ArticleDOI
19 Apr 2009
TL;DR: This paper presents a unified speech and audio codec that exhibits consistently high quality for speech, music and mixed audio content, and that forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
Abstract: Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.

108 citations

Proceedings ArticleDOI
17 May 2004
TL;DR: This paper presents a multidimensional lifting approach for reducing the frequency-domain approximation error of lifting-based integer transforms. Large parts of the transform are calculated without rounding operations; only the output is rounded and added.
Abstract: Recently, lifting-based integer transforms have received much attention, especially in the area of lossless audio and image coding. The usual approach is to apply the lifting scheme to each Givens rotation. Especially in the case of long transform sizes in audio coding applications, this leads to a considerable approximation error in the frequency domain. This paper presents a multidimensional lifting approach for reducing this approximation error. In this approach, large parts of the transform are calculated without rounding operations; only the output is rounded and added. The new approach is applied and evaluated for both the integer modified discrete cosine transform (IntMDCT) and the integer fast Fourier transform (IntFFT).

6 citations
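
The "usual approach" named in the abstract above, applying the lifting scheme to a single Givens rotation, can be sketched as follows. This is a minimal illustration of the standard three-step lifting factorization with rounding, not the multidimensional lifting method the paper proposes; the function names `lift_rotate` and `lift_rotate_inverse` are hypothetical, and the angle is assumed not to be a multiple of pi.

```python
import math


def lift_rotate(x, y, theta):
    """Integer approximation of a Givens rotation via three lifting steps.

    Uses the factorization
        [[c, -s], [s, c]] = [[1, a], [0, 1]] @ [[1, 0], [s, 1]] @ [[1, a], [0, 1]]
    with a = (c - 1) / s, rounding the update in every step.
    Assumes sin(theta) != 0.
    """
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s
    x += round(a * y)  # first shear, rounded to an integer update
    y += round(s * x)  # second shear
    x += round(a * y)  # third shear
    return x, y


def lift_rotate_inverse(x, y, theta):
    """Undo lift_rotate exactly by reversing the steps with opposite sign."""
    s = math.sin(theta)
    a = (math.cos(theta) - 1.0) / s
    x -= round(a * y)
    y -= round(s * x)
    x -= round(a * y)
    return x, y


if __name__ == "__main__":
    xr, yr = lift_rotate(1234, -567, 0.7)
    print(xr, yr)                               # integer outputs
    print(lift_rotate_inverse(xr, yr, 0.7))     # recovers the inputs
```

Each lifting step adds a rounded value that depends only on the other coordinate, so the inverse can subtract the same value and the mapping is exactly invertible on integers. The three rounding operations per rotation are the source of the approximation error the abstract refers to.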


Cited by
Proceedings ArticleDOI
19 Apr 2009
TL;DR: This paper exposes the origin of the roughness that commonly used bandwidth extension methods introduce and proposes a bandwidth extension method that does not introduce roughness into the reconstructed audio signal. A listening test demonstrates the advantage of the proposed method over a standard bandwidth extension.
Abstract: Today's efficient audio codecs for low bitrate application scenarios often rely on parametric coding of the upper frequency band portion of a signal while the lower frequency band portion of the same is conveyed by a waveform preserving coding method. At the decoder, the upper frequency signal is approximated from the lower frequency data using the upper frequency band parameters. However, commonly used methods of bandwidth extension almost inevitably suffer from a sensation of unpleasant roughness, which is especially present for tonal music items. In this paper we expose the origin of the roughness and propose a bandwidth extension method, which does not introduce roughness into the reconstructed audio signal. A listening test demonstrates the advantage of the proposed method compared to a standard bandwidth extension.

106 citations

Journal Article
TL;DR: All aspects of this standardization effort are outlined, starting with the history and motivation of the MPEG work item, describing all technical features of the final system, and further discussing listening test results and performance numbers which show the advantages of the new system over current state-of-the-art codecs.
Abstract: In early 2012 the ISO/IEC JTC1/SC29/WG11 (MPEG) finalized the new MPEG-D Unified Speech and Audio Coding standard. The new codec brings together the previously separated worlds of general audio coding and speech coding. It does so by integrating elements from audio coding and speech coding into a unified system. The present publication outlines all aspects of this standardization effort, starting with the history and motivation of the MPEG work item, describing all technical features of the final system, and further discussing listening test results and performance numbers which show the advantages of the new system over current state-of-the-art codecs.

88 citations

Journal Article
TL;DR: This paper describes this codec in detail and shows that the new reference model reaches the goal of consistent high quality for all signal types.
Abstract: Coding of speech signals at low bitrates, such as 16 kbps, has to rely on an efficient speech reproduction model to achieve reasonable speech quality. However, for audio signals that do not fit the model, this approach generally fails. On the other hand, generic audio codecs, designed to handle any kind of audio signal, tend to show unsatisfactory results for speech signals, especially at low bitrates. To overcome this, a process was initiated by ISO/MPEG, aiming to standardize a new codec with consistently high quality for speech, music and mixed content over a broad range of bitrates. After a formal listening test evaluating several proposals, MPEG selected the best-performing codec as the reference model for the standardization process. This paper describes this codec in detail and shows that the new reference model reaches the goal of consistently high quality for all signal types.

68 citations

Patent
06 Oct 2010
TL;DR: A multi-mode audio signal decoder for providing a decoded representation of an audio content on the basis of an encoded representation of the audio content comprises a spectral value determinator configured to obtain sets of decoded spectral coefficients for a plurality of portions of the audio content.
Abstract: A multi-mode audio signal decoder for providing a decoded representation of an audio content on the basis of an encoded representation of the audio content comprises a spectral value determinator configured to obtain sets of decoded spectral coefficients for a plurality of portions of the audio content. The audio signal decoder also comprises a spectrum processor configured to apply a spectral shaping to a set of spectral coefficients, or to a pre-processed version thereof, in dependence on a set of linear-prediction-domain parameters for a portion of the audio content encoded in a linear-prediction mode, and to apply a spectral shaping to a set of decoded spectral coefficients, or a pre-processed version thereof, in dependence on a set of scale factor parameters for a portion of the audio content encoded in a frequency-domain mode. The audio signal decoder comprises a frequency-domain-to-time-domain converter configured to obtain a time-domain representation of the audio content on the basis of a spectrally shaped set of decoded spectral coefficients for a portion of the audio content encoded in the linear-prediction mode, and to obtain a time-domain representation of the audio content on the basis of a spectrally shaped set of decoded spectral coefficients for a portion of the audio content encoded in the frequency-domain mode. An audio signal encoder is also described.

66 citations

Journal Article
TL;DR: A new set of cross-fade windows is presented, designed to provide an adequate trade-off between overlap duration and time/frequency resolution and to maintain the benefits of critical sampling through all coding modes.
Abstract: The reference model selected by MPEG for the forthcoming unified speech and audio codec (USAC) switches between a non-LPC based coding mode (based on AAC) operating in the transform domain and an LPC-based coding mode (derived from AMR-WB+) operating either in the time domain (ACELP) or in the frequency domain (wLPT). Seamlessly switching between these different coding modes required the design of a new set of cross-fade windows optimized to minimize the amount of overhead information sent during transitions between LPC-based and non-LPC based coding. This paper presents the new set of windows which was designed in order to provide an adequate trade-off between overlap duration and time/frequency resolution, and to maintain the benefits of critical sampling through all coding modes.

66 citations
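
The trade-off between overlap duration and critical sampling mentioned in the cross-fade window abstract above can be illustrated with a generic example. The sketch below builds a sine-slope window whose overlap region is shorter than the frame and checks the Princen-Bradley (power-complementarity) condition that MDCT-style critically sampled transforms rely on. It is an assumption-laden illustration only: these are not the USAC windows from the paper, and the names `low_overlap_sine_window` and `is_power_complementary` are hypothetical.

```python
import math


def low_overlap_sine_window(frame_len, overlap):
    """Symmetric window of length 2 * frame_len with a sine slope of `overlap` samples.

    Samples before the slope are 0, samples after it are 1, and the second
    half mirrors the first. Requires 0 < overlap <= frame_len and an even
    difference frame_len - overlap so the slope is centered.
    """
    if not 0 < overlap <= frame_len or (frame_len - overlap) % 2:
        raise ValueError("overlap must be in (0, frame_len] with even frame_len - overlap")
    start = (frame_len - overlap) // 2
    rising = []
    for n in range(frame_len):
        if n < start:
            rising.append(0.0)
        elif n < start + overlap:
            rising.append(math.sin(math.pi / (2 * overlap) * (n - start + 0.5)))
        else:
            rising.append(1.0)
    return rising + rising[::-1]


def is_power_complementary(window, tol=1e-12):
    """Check w[n]^2 + w[n + N]^2 == 1 for all n in the first half (N = len / 2)."""
    half = len(window) // 2
    return all(
        abs(window[n] ** 2 + window[n + half] ** 2 - 1.0) <= tol
        for n in range(half)
    )


if __name__ == "__main__":
    for overlap in (16, 8, 2):
        w = low_overlap_sine_window(16, overlap)
        print(overlap, is_power_complementary(w))
```

A shorter slope reduces the overlap between adjacent frames at the cost of a steeper transition, while the power-complementarity check continues to hold for every slope length built this way.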