Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published on this topic, receiving 271,964 citations.


Papers
Proceedings ArticleDOI
17 May 2004
TL;DR: This paper compares entropy and Euclidean distance measures for VFR in ASR experiments using the Aurora2 and TI46 databases and finds that the entropy-based VFR outperforms both the earlier VFR approach and the fixed-rate system.
Abstract: Most speech processing algorithms analyze speech signals frame by frame with a fixed frame rate. Fixed-rate analysis is inconsistent with human speech perception and effectively assigns the same importance or 'weight' to all equi-duration frames. In Zhu et al. (2000), we proposed a variable frame rate (VFR) analysis technique that is based on a Euclidean distance measure. In this paper, we propose another approach for VFR based on the entropy of the signal. We compare entropy and Euclidean distance measures for VFR in ASR experiments using the Aurora2 and TI46 databases. Better performance is observed for the entropy-based VFR over our earlier VFR approach and over the fixed-rate system.

59 citations
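
The variable frame rate idea in the abstract above can be illustrated with a minimal sketch. It assumes a simple accumulate-and-threshold rule over Euclidean distances between consecutive feature vectors; the function names and the `threshold` parameter are hypothetical, and this is not the authors' exact algorithm. The entropy-based variant the paper proposes would replace the distance with an entropy-derived measure and is not shown.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_frames(features, threshold):
    """Return indices of frames kept by an accumulate-and-threshold rule.

    features  : per-frame feature vectors computed at a fine, fixed rate
    threshold : hypothetical tuning parameter; larger values keep fewer frames
    """
    if not features:
        return []
    kept = [0]          # always keep the first frame
    accumulated = 0.0   # change accumulated since the last kept frame
    for i in range(1, len(features)):
        accumulated += euclidean(features[i], features[i - 1])
        if accumulated >= threshold:
            kept.append(i)
            accumulated = 0.0
    return kept

# Toy input: a steady region, a rapid transition, then another steady region.
frames = [[0.0, 0.0]] * 5 + [[1.0, 0.0], [2.0, 0.0], [3.0, 0.0]] + [[3.0, 0.0]] * 4
print(select_frames(frames, threshold=1.0))
```

Frames are retained densely where the features change quickly and sparsely in steady regions, which is the effect a variable frame rate is meant to achieve.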

Proceedings ArticleDOI
16 May 1999
TL;DR: This paper describes various approaches for link adaptation with respect to varying radio channel conditions, and discusses and motivates the standardized method of in-band signaling.
Abstract: The European Telecommunications Standards Institute (ETSI) has just defined an adaptive multi-rate (AMR) speech codec standard for the GSM system with a multitude of source and channel coding rates. The standard aims to provide robust, high-quality speech together with the flexibility to deliver radio network capacity enhancements by means of low bit-rate operation. The codec rates are dynamically selected with respect to the rapidly changing radio conditions and to local capacity requirements. This paper describes various approaches for link adaptation with respect to varying radio channel conditions and focuses on the solution in the AMR standard. It also discusses and motivates the standardized method of in-band signaling.

59 citations
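
The dynamic codec rate selection described in the abstract above can be sketched as a threshold rule with hysteresis. This is only an illustration: the active set below uses real AMR narrowband bit rates, but the threshold values, the hysteresis margin, and the function name `next_mode` are assumptions, not values from the standard or the paper.

```python
# A subset of AMR narrowband source rates in kbit/s, lowest to highest.
ACTIVE_SET = [4.75, 5.90, 7.95, 12.2]

# Hypothetical channel-quality thresholds in dB: THRESHOLDS[i] separates
# mode i from mode i + 1. Illustrative numbers only.
THRESHOLDS = [4.0, 7.0, 11.0]

def next_mode(current, quality_db, thresholds=THRESHOLDS, hysteresis_db=2.0):
    """Return the codec mode index to use next, moving at most one step.

    A better channel allows a higher source rate (less channel coding);
    a worse channel forces a lower source rate (more channel coding).
    The hysteresis margin avoids rapid toggling near a threshold.
    """
    if current < len(thresholds) and quality_db > thresholds[current] + hysteresis_db:
        return current + 1
    if current > 0 and quality_db < thresholds[current - 1]:
        return current - 1
    return current

def adapt(qualities, start=0):
    """Run the rule over a sequence of quality estimates."""
    modes, mode = [], start
    for q in qualities:
        mode = next_mode(mode, q)
        modes.append(mode)
    return modes

trace = adapt([12, 12, 15, 15, 5, 5, 3])
print([ACTIVE_SET[m] for m in trace])
```

In a real system the chosen mode would be communicated to the other end of the link, which is the role of the in-band signaling the paper discusses; that part is not modeled here.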

Journal ArticleDOI
TL;DR: The performance limits, as given by the signal-to-noise ratio (s/n), are described for different speech-encoding schemes, including adaptive quantization and (linear) adaptive prediction schemes.
Abstract: In this paper, the performance limits, as given by the signal-to-noise ratio (s/n), are described for different speech-encoding schemes, including adaptive quantization and (linear) adaptive prediction schemes. The comparison is made on the basis of computer simulations using 8-kHz-sampled speech signals of one speaker. Bit rates from two to five bits per sample have been used. A three-bit-per-sample PCM scheme with a nonadaptive μ-law quantizer (μ = 100) leads to an s/n value of approximately 9 dB. A maximum s/n value of approximately 25 dB has been reached using an encoding scheme that includes both adaptive quantization and adaptive prediction. Entropy coding of the quantizer output symbols leads to an additional gain in s/n of nearly 3 dB.

59 citations
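
As a rough illustration of the nonadaptive PCM baseline mentioned in the abstract above, the sketch below measures the s/n of a μ-law quantizer at a given number of bits per sample. It assumes μ = 100, a mid-rise uniform quantizer in the companded domain, and a synthetic decaying sinusoid in place of real speech, so the resulting numbers are not comparable to those reported in the paper. All function names are hypothetical.

```python
import math

def mu_law_compress(x, mu):
    """Map x in [-1, 1] to the companded domain [-1, 1]."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_expand(y, mu):
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

def quantize_uniform(y, bits):
    """Mid-rise uniform quantizer on [-1, 1] with 2**bits levels."""
    levels = 2 ** bits
    step = 2.0 / levels
    index = min(levels - 1, max(0, int(math.floor((y + 1.0) / step))))
    return -1.0 + (index + 0.5) * step

def snr_db(signal, reconstructed):
    """Signal-to-noise ratio in dB."""
    power = sum(s * s for s in signal)
    noise = sum((s - r) ** 2 for s, r in zip(signal, reconstructed))
    return 10.0 * math.log10(power / noise)

def codec_snr(signal, bits, mu=100.0):
    """Compress, quantize, expand, and report the resulting s/n."""
    decoded = [mu_law_expand(quantize_uniform(mu_law_compress(s, mu), bits), mu)
               for s in signal]
    return snr_db(signal, decoded)

# Synthetic test signal at an assumed 8 kHz sampling rate (not real speech).
signal = [0.8 * math.sin(2 * math.pi * 200 * n / 8000) * math.exp(-n / 400)
          for n in range(800)]

for bits in (2, 3, 4, 5):
    print(bits, "bits/sample:", round(codec_snr(signal, bits), 1), "dB")
```

Adaptive quantization and adaptive prediction, which the paper shows to be far more effective, are not modeled in this sketch.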

Proceedings ArticleDOI
17 May 2004
TL;DR: This work proposes a new method for selecting an appropriate modeling order, which was applied to both synthetic and musical signals and outperformed the classical information theoretic criteria.
Abstract: High resolution methods, such as the ESPRIT (estimation of signal parameters by rotational invariance techniques) algorithm, provide an accurate representation of a harmonic signal as a sum of exponentially damped sinusoids. However, in coding applications, the signal must be represented with a minimum number of parameters. Unfortunately, it is well known that applying the ESPRIT algorithm with an under-estimated model order generates biased frequency estimates. We propose a new method for selecting an appropriate modeling order, which minimizes this bias. This approach was applied to both synthetic and musical signals and outperformed the classical information theoretic criteria.

58 citations
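
For reference, the "sum of exponentially damped sinusoids" named in the abstract above is commonly written in the form below. The notation is an assumption chosen for illustration (the abstract gives no symbols): K is the model order to be selected, \alpha_k are complex amplitudes, \delta_k are damping factors, and f_k are normalized frequencies.

```latex
x(t) = \sum_{k=1}^{K} \alpha_k \, z_k^{\,t},
\qquad z_k = e^{\delta_k + i 2\pi f_k},
\qquad t = 0, 1, \dots, N-1
```

Under this form, each additional component adds one amplitude and one pole to the parameter set, which is why a coding application wants K as small as possible.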

Proceedings ArticleDOI
21 Apr 1997
TL;DR: Two immediate applications of the dynamic programming approach to LPC speech coding and to sinusoidal modeling of musical signals are presented.
Abstract: The idea of optimal joint time segmentation and resource allocation for signal modeling is explored with respect to arbitrary segmentations and arbitrary representation schemes. When the chosen signal modeling techniques can be quantified in terms of a cost function which is additive over distinct segments, a dynamic programming approach guarantees the global optimality of the scheme while keeping the computational requirements of the algorithm sufficiently low. Two immediate applications of the algorithm to LPC speech coding and to sinusoidal modeling of musical signals are presented.

58 citations
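
The dynamic programming argument in the abstract above relies on the cost being additive over segments. A minimal sketch of that recursion follows, assuming a toy cost: the squared error of approximating each segment by its mean, plus a fixed per-segment `penalty` standing in for the resource cost. Both the cost and the names are hypothetical and are not the paper's LPC or sinusoidal models.

```python
def segment_cost(x, i, j, penalty):
    """Cost of modeling x[i:j] as one segment (toy, additive cost)."""
    seg = x[i:j]
    mean = sum(seg) / len(seg)
    return sum((v - mean) ** 2 for v in seg) + penalty

def optimal_segmentation(x, penalty):
    """Globally optimal segmentation for an additive cost.

    best[j] holds the minimum cost of segmenting x[:j]; back[j] holds the
    start index of the last segment in that optimal solution.
    """
    n = len(x)
    best = [0.0] + [float("inf")] * n
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(x, i, j, penalty)
            if cost < best[j]:
                best[j] = cost
                back[j] = i
    # Walk the back-pointers to recover the segment boundaries.
    bounds = []
    j = n
    while j > 0:
        bounds.append((back[j], j))
        j = back[j]
    bounds.reverse()
    return best[n], bounds

total, segments = optimal_segmentation([1, 1, 1, 5, 5, 5], penalty=1.0)
print(total, segments)
```

The double loop evaluates on the order of n² candidate segments rather than every possible segmentation, which is what keeps the computation tractable while still guaranteeing the global optimum for additive costs.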


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108