Author

Philippe Gournay

Other affiliations: Nokia, Thales Communications, Philips
Bio: Philippe Gournay is an academic researcher from Université de Sherbrooke. The author has contributed to research in topics: Speech coding & Encoder. The author has an h-index of 22 and has co-authored 75 publications receiving 1561 citations. Previous affiliations of Philippe Gournay include Nokia & Thales Communications.


Papers
Patent
30 May 2003
TL;DR: In this article, a method and device are presented for improving the concealment of frame erasure caused by frames of an encoded sound signal being erased during transmission from an encoder (106) to a decoder (110), and for accelerating recovery of the decoder after non-erased frames of the encoded sound signal have been received.
Abstract: The present invention relates to a method and device for improving concealment of frame erasure caused by frames of an encoded sound signal erased during transmission from an encoder (106) to a decoder (110), and for accelerating recovery of the decoder after non-erased frames of the encoded sound signal have been received. For that purpose, concealment/recovery parameters are determined in the encoder or decoder. When determined in the encoder (106), the concealment/recovery parameters are transmitted to the decoder (110). In the decoder, frame erasure concealment and decoder recovery are conducted in response to the concealment/recovery parameters. The concealment/recovery parameters may be selected from the group consisting of: a signal classification parameter, an energy information parameter and a phase information parameter. The determination of the concealment/recovery parameters comprises classifying the successive frames of the encoded sound signal as unvoiced, unvoiced transition, voiced transition, voiced, or onset, and this classification is determined on the basis of at least a part of the following parameters: a normalized correlation parameter, a spectral tilt parameter, a signal-to-noise ratio parameter, a pitch stability parameter, a relative frame energy parameter, and a zero crossing parameter.
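The classification step described in this abstract can be illustrated with a short sketch. The Python below is a hypothetical rendering of such a frame classifier driven by the parameters listed above; the feature weights, thresholds and the merit-score combination are illustrative assumptions, not values taken from the patent.

```python
# Hypothetical sketch of frame classification for erasure concealment.
# Weights and thresholds are placeholders for illustration only.

from dataclasses import dataclass

@dataclass
class FrameFeatures:
    normalized_correlation: float  # pitch correlation, roughly in [0, 1]
    spectral_tilt: float           # first reflection coefficient, roughly [-1, 1]
    snr_db: float                  # segmental signal-to-noise ratio
    pitch_stability: float         # variation of the pitch lag between subframes
    relative_energy_db: float      # frame energy relative to a long-term average
    zero_crossing_rate: float      # zero crossings per sample

def classify_frame(f: FrameFeatures, previous_class: str) -> str:
    """Return one of the classes named in the abstract: 'unvoiced',
    'unvoiced_transition', 'voiced_transition', 'voiced', 'onset'."""
    # Merit score combining the features; the weights are assumptions.
    merit = (2.0 * f.normalized_correlation
             + 0.5 * f.spectral_tilt
             + 0.05 * f.snr_db
             - 1.0 * f.pitch_stability
             + 0.05 * f.relative_energy_db
             - 2.0 * f.zero_crossing_rate)

    if merit < 0.5:
        # Weak periodicity: unvoiced, or a transition out of a voiced region.
        return "unvoiced_transition" if previous_class in ("voiced", "onset") else "unvoiced"
    if merit < 1.5:
        return "voiced_transition"
    # Strong periodicity: an onset if the previous frame was not voiced.
    return "voiced" if previous_class in ("voiced", "onset", "voiced_transition") else "onset"
```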

160 citations

Patent
06 Dec 2012
TL;DR: In this paper, a signal analyzer for analyzing the audio signal is provided, which determines whether an audio portion is effective in the encoder output signal as a first encoded signal from the first encoding branch or as a second encoded signal from a second encoding branch.
Abstract: An audio encoder for encoding an audio signal has a first coding branch, the first coding branch comprising a first converter for converting a signal from a time domain into a frequency domain. Furthermore, the audio encoder has a second coding branch comprising a second time/frequency converter. Additionally, a signal analyzer for analyzing the audio signal is provided. The signal analyzer, on the one hand, determines whether an audio portion is effective in the encoder output signal as a first encoded signal from the first encoding branch or as a second encoded signal from a second encoding branch. On the other hand, the signal analyzer determines a time/frequency resolution to be applied by the converters when generating the encoded signals. An output interface includes, in addition to the first encoded signal and the second encoded signal, resolution information identifying the resolution used by the first time/frequency converter and used by the second time/frequency converter.
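As a rough illustration of the analyzer's two decisions (which coding branch to use and which time/frequency resolution to apply), the sketch below relies on a crude stationarity measure; the flatness cue, the thresholds and the transform lengths are assumptions, not the decision logic of the patent.

```python
# Illustrative sketch (not the patented analyzer): choose a coding branch and a
# time/frequency resolution for one audio portion from simple signal statistics.

import numpy as np

def analyze_portion(samples: np.ndarray):
    """Return (branch, transform_length) for one audio portion.

    The labels 'speech_like' / 'music_like' and the transform lengths are
    placeholders; a real analyzer uses a more elaborate decision.
    """
    # Crude stationarity cue: spectral-flatness-style measure over sub-block
    # energies (speech and transients are less stationary than tonal music).
    blocks = samples[: len(samples) // 8 * 8].reshape(8, -1)
    block_energy = np.sum(blocks ** 2, axis=1) + 1e-12
    flatness = np.exp(np.mean(np.log(block_energy))) / np.mean(block_energy)

    if flatness < 0.6:           # non-stationary portion
        branch = "speech_like"   # e.g. an LPC-based branch
        transform_length = 256   # finer time resolution
    else:
        branch = "music_like"    # e.g. an MDCT-based branch
        transform_length = 1024  # finer frequency resolution
    return branch, transform_length
```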

128 citations

Proceedings ArticleDOI
19 Apr 2009
TL;DR: This paper presents a unified speech and audio codec that exhibits consistently high quality for speech, music and mixed audio content, and that forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.
Abstract: Traditionally, speech coding and audio coding were separate worlds. Based on different technical approaches and different assumptions about the source signal, neither of the two coding schemes could efficiently represent both speech and music at low bitrates. This paper presents a unified speech and audio codec, which efficiently combines techniques from both worlds. This results in a codec that exhibits consistently high quality for speech, music and mixed audio content. The paper gives an overview of the codec architecture and presents results of formal listening tests comparing this new codec with HE-AAC(v2) and AMR-WB+. This new codec forms the basis of the reference model in the ongoing MPEG standardization activity for Unified Speech and Audio Coding.

108 citations

Journal Article
TL;DR: All aspects of this standardization effort are outlined, starting with the history and motivation of the MPEG work item, describing all technical features of the final system, and further discussing listening test results and performance numbers which show the advantages of the new system over current state-of-the-art codecs.
Abstract: In early 2012 the ISO/IEC JTC1/SC29/WG11 (MPEG) finalized the new MPEG-D Unified Speech and Audio Coding standard. The new codec brings together the previously separated worlds of general audio coding and speech coding. It does so by integrating elements from audio coding and speech coding into a unified system. The present publication outlines all aspects of this standardization effort, starting with the history and motivation of the MPEG work item, describing all technical features of the final system, and further discussing listening test results and performance numbers which show the advantages of the new system over current state-of-the-art codecs.

88 citations

Patent
06 Jul 2009
TL;DR: An apparatus for encoding comprises a first domain converter (510), a switchable bypass (50), a second domain converter (410), a first processor (420) and a second processor (520) to obtain an encoded audio signal having different signal portions represented by coded data in different domains, which have been coded by different coding algorithms as discussed by the authors.
Abstract: An apparatus for encoding comprises a first domain converter (510), a switchable bypass (50), a second domain converter (410), a first processor (420) and a second processor (520) to obtain an encoded audio signal having different signal portions represented by coded data in different domains, which have been coded by different coding algorithms. Corresponding decoding stages in the decoder together with a bypass for bypassing a domain converter allow the generation of a decoded audio signal with high quality and low bit rate.
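A minimal sketch of the switchable-bypass idea follows, assuming toy stand-ins for the converters and processors; every function here is a placeholder, not the patented implementation.

```python
# Minimal sketch of a "switchable bypass": per segment, either pass the signal
# through a second domain converter or bypass it, then hand the result to the
# matching processor. All stages below are toy placeholders.

import numpy as np

def encode_segment(segment: np.ndarray, use_bypass: bool):
    # First domain converter, e.g. time domain -> whitened residual (placeholder).
    residual = first_domain_converter(segment)

    if use_bypass:
        # Bypass the second converter: code the first converter's output directly.
        return {"domain": "first", "payload": first_processor(residual)}

    # Otherwise convert once more, e.g. residual -> frequency domain.
    spectrum = second_domain_converter(residual)
    return {"domain": "second", "payload": second_processor(spectrum)}

# Placeholder stages so the sketch runs end to end.
def first_domain_converter(x):  return x - np.mean(x)          # toy "whitening"
def second_domain_converter(x): return np.fft.rfft(x)          # toy transform
def first_processor(x):         return np.round(x, 2)          # toy quantizer
def second_processor(X):        return np.round(np.abs(X), 2)  # toy quantizer
```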

85 citations


Cited by
PatentDOI
TL;DR: In this paper, a method is presented for low-frequency emphasizing the spectrum of a sound signal transformed in a frequency domain and comprising transform coefficients grouped in a number of blocks, in which a maximum energy for one block is calculated and a position index of the block with maximum energy is determined, a factor is calculated for each block having a position index smaller than the position index of the block with maximum energy, and for each such block a gain is determined from the factor and is applied to the transform coefficients of the block.
Abstract: An aspect of the present invention relates to a method for low-frequency emphasizing the spectrum of a sound signal transformed in a frequency domain and comprising transform coefficients grouped in a number of blocks, in which a maximum energy for one block is calculated and a position index of the block with maximum energy is determined, a factor is calculated for each block having a position index smaller than the position index of the block with maximum energy, and for each block a gain is determined from the factor and is applied to the transform coefficients of the block.
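The block-wise gain computation described above can be sketched as follows; the particular factor and gain formulas here are placeholders, since the patent defines its own expressions.

```python
# Sketch of the low-frequency emphasis described above. The factor and gain
# formulas are illustrative assumptions, not the patented expressions.

import numpy as np

def low_frequency_emphasis(coeffs, block_size: int = 8, max_gain: float = 2.0) -> np.ndarray:
    """Boost transform-coefficient blocks positioned below the maximum-energy block."""
    coeffs = np.asarray(coeffs, dtype=float)
    n_blocks = len(coeffs) // block_size
    blocks = coeffs[: n_blocks * block_size].reshape(n_blocks, block_size).copy()

    energies = np.sum(blocks ** 2, axis=1)
    i_max = int(np.argmax(energies))   # position index of the max-energy block
    e_max = energies[i_max] + 1e-12

    for i in range(i_max):             # only blocks below the max-energy block
        factor = energies[i] / e_max                      # placeholder factor in [0, 1]
        gain = 1.0 + (max_gain - 1.0) * (1.0 - factor)    # weaker blocks boosted more
        blocks[i] *= gain

    result = coeffs.copy()
    result[: n_blocks * block_size] = blocks.reshape(-1)
    return result
```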

243 citations

Patent
23 Mar 2011
TL;DR: In this paper, the authors present a control interface that enables the user to manage synchronized output of companion content (e.g., text and corresponding audio content) by dragging a finger across the text displayed on the touch screen.
Abstract: A computing device may provide a control interface that enables the user to manage synchronized output of companion content (e.g., text and corresponding audio content). A visual cue may be displayed identifying a current location in text corresponding to a current output position of companion audio content. The cue may be advanced during audio presentation to maintain synchronization between the output position of the audio content and a corresponding position in the text. In some embodiments, synchronized output may be controlled by dragging a finger across the text displayed on the touch screen. A visual indication may be provided depicting the distance between the advancing position in the text and the current position of the user's finger. In further embodiments, the speed at which the audio content is presented and the visual cue is advanced may be adjusted based at least in part on a user's performance on a task (e.g., a speed on an exercise machine).
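One way to picture the synchronization mechanism is a lookup from the current audio playback position to a text offset. The sketch below assumes a simple word-level timing map; the actual patented interface and data model are more elaborate.

```python
# Hypothetical sketch: map the audio playback position to a character offset in
# the companion text using a timing map. The timing-map format is an assumption.

from bisect import bisect_right

def text_offset_for_audio_position(timing_map, position_s: float) -> int:
    """timing_map: list of (start_time_s, text_offset) pairs sorted by time.
    Returns the character offset corresponding to the current audio position."""
    times = [t for t, _ in timing_map]
    i = bisect_right(times, position_s) - 1
    return timing_map[max(i, 0)][1]

# Example: the visual cue advances as playback proceeds.
timing_map = [(0.0, 0), (0.8, 12), (1.5, 27), (2.4, 41)]
print(text_offset_for_audio_position(timing_map, 1.7))  # -> 27
```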

201 citations

Patent
30 Dec 2008
TL;DR: In this article, an audio coding system is presented that comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; and a quantization unit for quantizing the transform domain signal.
Abstract: The present invention teaches a new audio coding system that can code both general audio and speech signals well at low bit rates. A proposed audio coding system comprises a linear prediction unit for filtering an input signal based on an adaptive filter; a transformation unit for transforming a frame of the filtered input signal into a transform domain; and a quantization unit for quantizing the transform domain signal. The quantization unit decides, based on input signal characteristics, to encode the transform domain signal with a model-based quantizer or a non-model-based quantizer. Preferably, the decision is based on the frame size applied by the transformation unit.
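The encode path outlined in this abstract (LPC filtering, a frame transform, then a quantizer choice driven by the frame size) can be sketched as below; the frame-size threshold, the transform stand-in and the two quantizers are illustrative assumptions.

```python
# Sketch of the encode path described above: adaptive-filter (LPC) whitening,
# a frame transform, and a quantizer selected from the frame size. The
# threshold and quantizers are placeholders, not the claimed design.

import numpy as np

def encode_frame(frame: np.ndarray, lpc_coeffs: np.ndarray):
    # 1) LPC analysis filtering: A(z) = 1 - sum_k a_k z^-k applied as an FIR filter.
    analysis = np.concatenate(([1.0], -lpc_coeffs))
    residual = np.convolve(frame, analysis)[: len(frame)]

    # 2) Transform the filtered frame (real FFT as a stand-in for an MDCT).
    spectrum = np.fft.rfft(residual)

    # 3) Quantizer choice based on the frame size used by the transform.
    if len(frame) <= 256:
        # Short frames (speech-like): "model-based" quantizer placeholder
        # that normalizes by the mean spectral magnitude before rounding.
        quantized = np.round(spectrum / np.maximum(np.abs(spectrum).mean(), 1e-9))
        mode = "model_based"
    else:
        # Long frames (music-like): plain uniform quantizer placeholder.
        quantized = np.round(spectrum)
        mode = "non_model_based"
    return mode, quantized
```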

170 citations