Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Journal ArticleDOI
TL;DR: This decomposition provides a method of parameter simplification which appears to be useful for detecting fundamental frequencies and characterizing formants.
Abstract: The authors use an algorithm based on the adapted-window Malvar transform to decompose digitized speech signals into a local time-frequency representation. They present some applications and experimental results for signal compression and automatic voiced-unvoiced segmentation. This decomposition provides a method of parameter simplification which appears to be useful for detecting fundamental frequencies and characterizing formants.

85 citations

PatentDOI
Willem Bastiaan Kleijn1
TL;DR: A plurality of sets of indexed parameters is generated from samples of the speech signal; each set corresponds to a waveform characterizing the speech signal at a discrete point in time.
Abstract: A method of coding a speech signal is described. In accordance with the method, a plurality of sets of indexed parameters are generated based on samples of the speech signal. Each set of indexed parameters corresponds to a waveform characterizing the speech signal at a discrete point in time. Parameters of the plurality of sets are grouped based on index value to form a first set of signals which represents the evolution of characterizing waveform shape; the signals of the first set are filtered to remove low frequency components and thereby produce a second set of signals which represents relatively high rates of evolution of characterizing waveform shape. The speech signal is then coded based on the second set of signals representing high rates of characterizing waveform shape evolution. Coding of the speech signal may further be based on a set of smoothed first signals.

85 citations
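
The grouping and filtering steps described in the patent abstract above can be illustrated with a minimal sketch. It assumes the parameters are held as a list of per-time-instant lists (`waveform_params[t][k]`) and uses a simple moving average as the low-pass filter; the abstract does not specify a filter, so `smooth_track`, `split_evolution` and the `taps` value are hypothetical choices made for illustration only.

```python
def smooth_track(track, taps=5):
    """Moving-average low-pass filter with edge replication (taps must be odd)."""
    half = taps // 2
    padded = [track[0]] * half + list(track) + [track[-1]] * half
    return [sum(padded[i:i + taps]) / taps for i in range(len(track))]


def split_evolution(waveform_params, taps=5):
    """Split each parameter's evolution over time into slow and fast parts.

    waveform_params[t][k] is the parameter with index k of the waveform
    that characterizes the signal at time instant t.
    Returns (slow, fast), both with the same shape as the input.
    """
    num_times = len(waveform_params)
    num_params = len(waveform_params[0])
    slow = [[0.0] * num_params for _ in range(num_times)]
    fast = [[0.0] * num_params for _ in range(num_times)]
    for k in range(num_params):
        # Group by index value: one track per parameter index k.
        track = [waveform_params[t][k] for t in range(num_times)]
        low = smooth_track(track, taps)
        for t in range(num_times):
            slow[t][k] = low[t]              # smoothed (slowly evolving) part
            fast[t][k] = track[t] - low[t]   # low frequencies removed
    return slow, fast
```

Here `fast` plays the role of the second set of signals (high rates of waveform-shape evolution) and `slow` that of the smoothed first signals; by construction the two sum back to the input.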

Proceedings ArticleDOI
07 Nov 2002
TL;DR: Four time-domain VAD algorithms are compared in terms of speech quality, compression level and computational complexity, together with their relative merits and demerits and the subjective quality of speech after silence periods are pruned.
Abstract: We discuss techniques for voice activity detection (VAD) for voice over Internet Protocol (VoIP). VAD aids in reducing the bandwidth requirement of a voice session, thereby using bandwidth efficiently. Such a scheme would be implemented in the application layer. Thus the VAD is independent of the lower layers in the network stack (see Flood, J.E., "Telecommunications Switching - Traffic and Networks", Prentice Hall India). We compare four time-domain VAD algorithms in terms of speech quality, compression level and computational complexity. A comparison of the relative merits and demerits along with the subjective quality of speech after the pruning of silence periods is presented for all the algorithms. A quantitative measurement of speech quality for different algorithms is also presented.

84 citations
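
As an illustration of what a time-domain VAD can look like, here is a minimal frame-energy sketch. The listing does not name the four algorithms the paper compares, so this energy-threshold detector is a generic example and not necessarily one of them; the function names, the 160-sample frame length, the threshold and the hangover count are all assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Mean energy of each non-overlapping frame (trailing partial frame dropped)."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def energy_vad(samples, frame_len=160, threshold=1e-3, hangover=2):
    """Return one boolean per frame: True = speech, False = silence.

    A frame is marked as speech when its energy exceeds the threshold.
    The hangover keeps that many frames after speech marked as active,
    so that weak word endings are not clipped.
    """
    decisions = []
    frames_since_speech = hangover  # start in the silence state
    for energy in frame_energies(samples, frame_len):
        if energy > threshold:
            frames_since_speech = 0
            decisions.append(True)
        elif frames_since_speech < hangover:
            frames_since_speech += 1
            decisions.append(True)
        else:
            decisions.append(False)
    return decisions


def demo_signal():
    """Synthetic test input: silence, a 200 Hz tone at 8 kHz sampling, silence."""
    silence = [0.0] * 480
    tone = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(480)]
    return silence + tone + silence
```

Frames marked `False` are the ones a VoIP sender could skip, so the fraction of `False` decisions gives a rough idea of the bandwidth saved.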

Patent
05 Oct 1999
TL;DR: A start of an input speech signal is detected during presentation of an output audio signal, and an input start time, relative to the output audio signal, is determined.
Abstract: A start of an input speech signal is detected during presentation of an output audio signal and an input start time, relative to the output audio signal, is determined. The input start time is then provided for use in responding to the input speech signal. In another embodiment, the output audio signal has a corresponding identification. When the input speech signal is detected during presentation of the output audio signal, the identification of the output audio signal is provided for use in responding to the input speech signal. Information signals comprising data and/or control signals are provided in response to at least the contextual information provided, i.e., the input start time and/or the identification of the output audio signal. In this manner, the present invention accurately establishes a context of an input speech signal relative to an output audio signal regardless of the delay characteristics of the underlying communication system.

84 citations

Journal ArticleDOI
TL;DR: The most efficient implementation of the forward and inverse MDCT computation for layer III in MPEG-1 and MPEG-2 international audio coding standards is proposed, based on a new fast algorithm for the forward and inverse MDCT computation in the oddly stacked system.
Abstract: The modified discrete cosine transform (MDCT) is employed in subband/transform coding schemes as the analysis/synthesis filter bank based on time domain aliasing cancellation (TDAC). The most efficient implementation of the forward and inverse MDCT computation for layer III in MPEG-1 and MPEG-2 international audio coding standards is proposed. It is based on a new fast algorithm for the forward and inverse MDCT computation in the oddly stacked system. The complete signal flow graphs for the implementation of MDCT and inverse MDCT in layer III are also provided.

84 citations
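
To make the TDAC idea in the abstract above concrete, the following sketch evaluates the MDCT and inverse MDCT directly from their textbook definitions and shows that windowed overlap-add cancels the time-domain aliasing. It is a plain O(N^2) reference, not the fast algorithm the paper proposes; the sine window, the 2/N inverse scaling and all function names are assumptions chosen for illustration.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block):
    """Direct MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    n0 = 0.5 + N / 2
    return [sum(block[n] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs):
    """Direct inverse MDCT: N coefficients -> 2N time-aliased samples."""
    N = len(coeffs)
    n0 = 0.5 + N / 2
    return [(2.0 / N) * sum(coeffs[k] * math.cos(math.pi / N * (n + n0) * (k + 0.5))
                            for k in range(N))
            for n in range(2 * N)]


def mdct_roundtrip(signal, N):
    """Analyze and resynthesize a signal with 50% overlapped blocks."""
    w = sine_window(N)
    tail = (-len(signal)) % N  # pad so the length is a multiple of N
    padded = [0.0] * N + list(signal) + [0.0] * (tail + N)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        block = [padded[start + n] * w[n] for n in range(2 * N)]
        y = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += y[n] * w[n]  # overlap-add cancels the aliasing
    return out[N:N + len(signal)]
```

A single block passed through `imdct(mdct(block))` does not return the original samples; only the sum of two overlapping, windowed blocks does, which is the aliasing cancellation the filter bank relies on.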


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108