Author

Jes Thyssen

Other affiliations: Mindspeed Technologies
Bio: Jes Thyssen is an academic researcher from Conexant. The author has contributed to research in topics: Speech coding & Voice activity detection. The author has an h-index of 15 and has co-authored 24 publications receiving 942 citations. Previous affiliations of Jes Thyssen include Mindspeed Technologies.

Papers
Patent
Yang Gao1, Adil Benyassine2, Jes Thyssen2, Eyal Shlomot2, Huan-Yu Su2 
15 Sep 2000
TL;DR: A speech compression system is disclosed that encodes a speech signal into a bitstream for subsequent decoding into synthesized speech, and that optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate against the perceptual quality of the reconstructed speech.
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

119 citations

Patent
24 Aug 1999
TL;DR: A method of encoding an input speech signal using a multi-rate encoder with a plurality of encoding rates is disclosed, in which a high-pass filter and then a perceptual weighting filter are applied to the input signal to generate a first target signal.
Abstract: A method of encoding an input speech signal using a multi-rate encoder having a plurality of encoding rates is disclosed. A high-pass filter and then a perceptual weighting filter are applied to such signal to generate a first target signal. An adaptive codebook vector is identified from an adaptive codebook using the first target signal by filtering the vector to generate a filtered adaptive codebook vector. An adaptive codebook gain for the adaptive codebook vector is calculated and an error signal minimized. The adaptive codebook gain is adaptively reduced based on one encoding rate from the plurality of encoding rates to generate a reduced adaptive codebook gain. A second target signal based at least on the first target signal and the reduced adaptive codebook gain is generated. The input speech signal is converted into an encoded speech based on the second target signal.

111 citations
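
The gain-reduction step in the abstract above can be illustrated with a small Python sketch. It assumes the conventional CELP formulation, in which the adaptive codebook gain is the least-squares gain between the first target and the filtered codebook vector, and the second target is the first target minus the scaled filtered vector; the abstract does not spell out these formulas. The function names and the per-rate scale factor `rate_scale` are hypothetical, chosen only for illustration.

```python
def adaptive_codebook_gain(target: list[float], filtered_vec: list[float]) -> float:
    """Least-squares gain g minimizing ||target - g * filtered_vec||^2.

    Assumption: standard CELP gain formula, not taken from the patent text.
    """
    num = sum(t * y for t, y in zip(target, filtered_vec))
    den = sum(y * y for y in filtered_vec)
    # Guard against an all-zero filtered vector.
    return num / den if den > 0.0 else 0.0


def second_target(
    target: list[float], filtered_vec: list[float], rate_scale: float
) -> tuple[float, list[float]]:
    """Reduce the gain by a rate-dependent factor and form the second target.

    `rate_scale` is a hypothetical factor in [0, 1] that would depend on the
    selected encoding rate; the abstract gives no concrete values.
    """
    reduced_gain = adaptive_codebook_gain(target, filtered_vec) * rate_scale
    residual = [t - reduced_gain * y for t, y in zip(target, filtered_vec)]
    return reduced_gain, residual
```

For example, `second_target([2.0, 4.0], [1.0, 2.0], 0.5)` computes an unreduced gain of 2.0, halves it, and subtracts the scaled vector from the first target. A smaller `rate_scale` leaves more of the target for the later encoding stages to represent.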

PatentDOI
Huan-Yu Su1, Eyal Shlomot1, Jes Thyssen1, Adil Benyassine1, Yang Gao1 
TL;DR: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards.
Abstract: There is provided a conference bridge or transcoder configured to intelligently handle multiple speech channels in the context of a packet network, wherein various speech channels may adhere to a variety of speech encoding standards. For example, the conference bridge establishes framing and alignment of multiple incoming speech channels associated with multiple participants, extracts parameters from the speech samples, mixes the parameters, and re-encodes the resulting speech samples for transmission to the participants. In one aspect, a speech processing method comprises decoding a first bitstream according to a first coding scheme to generate first speech samples and a first side information; generating second speech samples and a second side information using the first speech samples and the first side information, for use according to a second coding scheme; and creating a second bitstream, encoded based on the second coding scheme, using the second speech samples and the second side information.

81 citations

Patent
Yang Gao1, Adil Benyassine1, Huan-Yu Su1, Eyal Shlomot1, Jes Thyssen1 
15 Sep 2000
TL;DR: A speech compression system is disclosed that encodes a speech signal into a bitstream for subsequent decoding into synthesized speech, and that optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate against the perceptual quality of the reconstructed speech.
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

81 citations


Cited by
Patent
02 Nov 2000
TL;DR: A system and method for voice transmission over high-level network protocols is presented, in which variable compression based on silence detection takes advantage of the natural silences and pauses in human speech, reducing the transmission delays caused by using HTTP/TCP.
Abstract: A system and method for voice transmission over high level network protocols. On the Internet and the World Wide Web, such high level protocols are HTTP/TCP. The restrictions imposed by firewalls and proxy servers are avoided by using HTTP level connections to transmit voice data. In addition, packet delivery guarantees are obtained by using TCP instead of UDP. Variable compression based on silence detection takes advantage of the natural silences and pauses in human speech, thus reducing the delays in transmission caused by using HTTP/TCP. The silence detection includes the ability to bookend the voice data sent with small portions of silence to ensure that the voice sounds natural. Finally, the voice data is transmitted to each client computer independently from a common circular list of voice data, thus ensuring that all clients will stay current with the most recent voice data. The combination of these features enables simple, seamless, and interactive Internet conferencing.

1,129 citations
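
The silence "bookending" idea in the abstract above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: frames are classified as voiced by a simple energy threshold, and `select_frames`, `threshold`, and `pad` are hypothetical names. The abstract does not describe how silence is actually detected or how much silence is kept.

```python
def select_frames(energies: list[float], threshold: float, pad: int) -> list[int]:
    """Return indices of frames to transmit.

    A frame is treated as voiced when its energy reaches `threshold`
    (an assumed detector). Each voiced frame also pulls in up to `pad`
    neighbouring frames on either side, so voiced runs are bookended
    with a little silence instead of being cut off abruptly.
    """
    keep: set[int] = set()
    last = len(energies) - 1
    for i, energy in enumerate(energies):
        if energy >= threshold:
            lo = max(0, i - pad)
            hi = min(last, i + pad)
            keep.update(range(lo, hi + 1))
    return sorted(keep)


# Two voiced frames (indices 3 and 4) surrounded by silence.
frames = [0.0, 0.0, 0.0, 5.0, 6.0, 0.0, 0.0, 0.0, 0.0]
selected = select_frames(frames, threshold=1.0, pad=1)
```

Here `selected` holds the two voiced frames plus one silent frame on each side; the remaining silent frames would simply not be sent, which is where the bandwidth and delay savings would come from.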

Patent
02 Oct 2000
TL;DR: A packet-based conference bridge is described that receives speech indication signals from the terminals, selects the talkers for the voice conference, and outputs addressing control signals to the talkers, reducing the required processing power and the latency within the conference bridge.
Abstract: The performance of a voice conference using a packet-based conference bridge can be improved with a number of modifications. In one modification, the conference bridge receives speech indication signals from the individual packet-based terminals within the voice conference, these speech indication signals then being used by the conference bridge to select the talkers within the voice conference. This removes the need for speech detection techniques within the conference bridge, hence decreasing the required processing power and the latency within the conference bridge. In another modification, the conference bridge sends addressing control signals to the individual packet-based terminals selected as talkers, these addressing control signals directing the terminals selected as talkers to directly transmit their voice data packets to the other terminals within the voice conference. This direct transmission of voice data packets can reduce transcoding and latency within the network. These two modifications could further be combined, resulting in a conference bridge that receives speech indication signals, selects the talkers for the voice conference and outputs addressing control signals to the talkers. In this case, the advantages of the two modifications are gained as well as additional capacity advantages resulting from no voice signals actually traversing the conference bridge.

229 citations

Patent
25 Oct 2005
TL;DR: The invention concerns the measurement and control of the perceived sound loudness and/or the perceived spectral balance of an audio signal, and is useful in one or more of: loudness-compensating volume control, automatic gain control, dynamic range control (including, for example, limiters, compressors, expanders, etc.), dynamic equalization, and compensating for background noise interference in an audio playback environment.
Abstract: The invention relates to the measurement and control of the perceived sound loudness and/or the perceived spectral balance of an audio signal. An audio signal is modified in response to calculations performed at least in part in the perceptual (psychoacoustic) loudness domain. The invention is useful, for example, in one or more of: loudness-compensating volume control, automatic gain control, dynamic range control (including, for example, limiters, compressors, expanders, etc.), dynamic equalization, and compensating for background noise interference in an audio playback environment. The invention includes not only methods but also corresponding computer programs and apparatus.

215 citations

Patent
Burg Frederick Murray1
31 May 2000
TL;DR: A call initiator present in a text chat room session establishes a data connection to a Call Broker and, after qualifying for access (e.g., using credit card information) and providing a callback number, receives voice session information and participant access codes for each desired participant in a voice call.
Abstract: A network-based system and method for providing anonymous voice communications using the telephone network and data communications links under the direction of a Call Broker and associated network elements. A user (the call initiator) present in a text chat room session establishes a data connection to Call Broker and, after qualifying for access (e.g., using credit card information) and providing a callback number, receives voice session information and participant access codes for each desired participant in a voice call. The initiator causes session information and participant codes to be passed to one or more selected chat participants in the current text chat room. When a selected participant uses the received session information, and enters the received participant code and a callback number, the Call Broker in cooperation with a Network Adjunct Processor (NAP) completes voice links to the initiator and the selected participant(s).

177 citations

PatentDOI
Jebu Jacob Rajan1
TL;DR: A system for allowing a user to add word models to a speech recognition system is described. The system allows the user to input a number of renditions of a new word and generates from these a sequence of phonemes representative of the new word.
Abstract: A system is provided for allowing a user to add word models to a speech recognition system. In particular, the system allows a user to input a number of renditions of the new word and generates from these a sequence of phonemes representative of the new word. This representative sequence of phonemes is stored in a word to phoneme dictionary together with the typed version of the word for subsequent use by the speech recognition system.

166 citations