Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Proceedings Article
19 Apr 1994
TL;DR: This paper describes the application of transform coded excitation (TCX) coding to encoding wideband speech and audio signals in the bit rate range of 16 kbits/s to 32 kbits/s and proposes novel quantization procedures, including inter-frame prediction in the frequency domain.
Abstract: This paper describes the application of transform coded excitation (TCX) coding to encoding wideband speech and audio signals in the bit rate range of 16 kbits/s to 32 kbits/s. The approach uses a combination of time domain (linear prediction; pitch prediction) and frequency domain (transform coding; dynamic bit allocation) techniques, and utilizes a synthesis model similar to that of linear prediction coders such as CELP. However, at the encoder, the high complexity analysis-by-synthesis technique is bypassed by directly quantizing the so-called target signal in the frequency domain. The innovative excitation is derived at the decoder by inverse filtering the quantized target signal. The algorithm is intended for applications in which a large number of bits is available for the innovative excitation. The TCX algorithm is utilized to encode wideband speech and audio signals with a 50-7000 Hz bandwidth. Novel quantization procedures including inter-frame prediction in the frequency domain are proposed to encode the target signal. The proposed algorithm achieves very high quality for speech at 16 kbits/s, and for music at 24 kbits/s.

93 citations

Proceedings Article
09 May 1995
TL;DR: A 2.4 kb/s coder uses waveform interpolation principles to represent the speech signal as an evolving characteristic waveform (CW); coding the voiced and unvoiced components of the signal separately yields a significant increase in coding efficiency.
Abstract: For low-rate speech coding it is advantageous to represent the speech signal as an evolving characteristic waveform (CW). The CW evolves slowly when the speech signal is clearly voiced and rapidly when the speech signal is clearly unvoiced. The voiced (periodic) and unvoiced (nonperiodic) components of the speech signal can be separated by a simple nonadaptive filter in the CW domain. Because of perceptual effects, a significant increase in coding efficiency is obtained by coding these two components separately. A 2.4 kb/s coder using these principles was developed. In an independent evaluation, the performance of the 2.4 kb/s waveform interpolation (WI) coder was found to be at least equivalent to the 4.8 kb/s FS1016 standard for all of the many tests.

93 citations

Proceedings Article
01 Feb 2007
TL;DR: A novel approach for speech signal modeling using fractional calculus that has the merit of requiring a smaller number of model parameters, and is demonstrated to be superior to the LPC approach in capturing the details of the modeled signal.
Abstract: In this paper, we present a novel approach for speech signal modeling using fractional calculus. This approach is contrasted with the celebrated Linear Predictive Coding (LPC) approach which is based on integer order models. It is demonstrated via numerical simulations that by using a few integrals of fractional orders as basis functions, the speech signal can be modeled accurately. The new approach has the merit of requiring a smaller number of model parameters, and is demonstrated to be superior to the LPC approach in capturing the details of the modeled signal.

93 citations

Patent
Philippe Ferriere
11 Oct 1995
TL;DR: An audio data transmission system uses computing units designed to select an appropriate combination of block size and input sampling rate to maximize use of the available bandwidth of the receiving modem.
Abstract: An audio data transmission system encodes audio files into individual audio data blocks which contain a variable number of bits of digital audio data that were sampled at a selectable sample rate. The number of bits of digital data and the input sampling rate are scalable to produce an encoded bit stream bit rate that is less than or equal to an effective operational bit rate of a recipient's modem. The audio data transmission system uses computing units which are designed to select an appropriate combination of block size and input sampling rate to maximize use of the available bandwidth of the receiving modem. For example, if the modem connection speed for one modem is 14.4 kbps, a version of the audio data compressed at 13000 bits/s might be sent to the recipient; if the modem connection speed for another modem is 28.8 kbps, a version of the audio data compressed at 24255 bits/s might be sent to the recipient. The audio data blocks are then transmitted at the encoded bit stream bit rate to the intended recipient's modem. The audio data blocks are decoded at the recipient to reconstruct the audio file and immediately play the audio file as it is received. The audio data transmission system can be implemented in online service systems, ITV systems, computer data network systems, and communication systems.

93 citations
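
The rate-selection rule in the patent abstract above (send the encoded version whose bit rate is less than or equal to the recipient modem's effective rate) can be illustrated with a minimal sketch. This is not the patented implementation: the function name `select_encoded_rate` and the list `ENCODED_RATES_BPS` are hypothetical, and the only figures taken from the abstract are the two example rates (13000 and 24255 bits/s) and the two modem speeds (14.4 and 28.8 kbps).

```python
from typing import Optional, Sequence

# Hypothetical set of pre-encoded versions, in bits/s.
# The two values are the examples quoted in the abstract.
ENCODED_RATES_BPS = (13000, 24255)


def select_encoded_rate(
    modem_bps: float,
    available: Sequence[int] = ENCODED_RATES_BPS,
) -> Optional[int]:
    """Return the highest encoded rate that does not exceed the modem rate.

    Returns None when every available version is too fast for the modem.
    """
    candidates = [rate for rate in available if rate <= modem_bps]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    for modem in (14400, 28800):
        print(modem, select_encoded_rate(modem))
```

Picking the maximum rate under the limit is one plausible reading of "maximize use of the available bandwidth"; the abstract does not say how block size and sampling rate are combined to reach a given rate, so that step is left out here.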


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108