Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Proceedings Article
29 Mar 1999
TL;DR: An exact solution to the problem of optimal channel code allocation is found and the properties of the solution which allow us to transmit the source progressively while retaining the optimality at intermediate and final transmission rates are investigated.
Abstract: We present a scheme for joint source-channel coding for transmission of sources compressed by embedded source coders over a memoryless noisy channel. We find an exact solution to the problem of optimal channel code allocation. Then we investigate the properties of the solution which allow us to transmit the source progressively while retaining the optimality at intermediate and final transmission rates, using rate-compatible codes.

65 citations

Journal Article
TL;DR: Two postprocessing approaches applying convolutional neural networks either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs are proposed.
Abstract: Enhancing coded speech suffering from far-end acoustic background noise, quantization noise, and potentially transmission errors is a challenging task. In this paper, we propose two postprocessing approaches applying convolutional neural networks either in the time domain or the cepstral domain to enhance the coded speech without any modification of the codecs. The time-domain approach follows an end-to-end fashion, whereas the cepstral domain approach uses analysis–synthesis with cepstral domain features. The proposed postprocessors in both domains are evaluated for various narrowband and wideband speech codecs in a wide range of conditions. The proposed postprocessor improves perceptual evaluation of speech quality (PESQ) by up to 0.25 mean opinion score listening quality objective (MOS-LQO) points for G.711, 0.30 points for G.726, 0.82 points for G.722, and 0.26 points for the adaptive multirate wideband codec. In a subjective comparison category rating listening test, the proposed postprocessor on G.711-coded speech exceeds the speech quality of an ITU-T-standardized postfilter by 0.36 CMOS points, and obtains a clear preference of 1.77 CMOS points compared to legacy G.711, even better than uncoded speech with statistical significance. The source code for the cepstral domain approach to enhance G.711-coded speech is made available at https://github.com/ifnspaml/Enhancement-Coded-Speech.

65 citations

Journal Article
TL;DR: A new embedded zerotree wavelet image coding algorithm that is based on the algorithms developed by Shapiro and Said and is substantially more robust with respect to varying channel error conditions, which provides much needed reliability in low-bandwidth wireless applications.
Abstract: We present a new embedded zerotree wavelet image coding algorithm that is based on the algorithms developed by Shapiro (see IEEE Trans. Signal Processing, Spec. Issue Wavelets Signal Processing, vol.41, no.12, p.3445-62, 1993) and Said et al. (see IEEE Trans. Circuits Syst. Video Technol., vol.6, no.6, p.243-50, 1996). Our algorithm features a relatively simple coding structure and provides a better framework for balancing between high compression performance and robustness to channel errors. The fundamental approach is to explicitly classify the encoder's output bit sequence into subsequences, which are then protected differently according to their importance and robustness. Experimental results indicate that, for noisy channels, the proposed algorithm is slightly more resilient to channel errors than more complex and sophisticated source-channel coding algorithms. More important is that our algorithm is substantially more robust with respect to varying channel error conditions. This provides much needed reliability in low-bandwidth wireless applications.

65 citations

Journal Article
TL;DR: It is shown that postfilters based on higher order LPC (linear predictive coding) models can provide very low distortion in terms of spectral tilt and can provide better speech enhancement than circuits based on the backward-adaptive pole-zero predictor in ADPCM (adaptive differential pulse code modulation).
Abstract: It is shown that postfiltering circuits based on higher order LPC (linear predictive coding) models can provide very low distortion in terms of spectral tilt. Thus, they can provide better speech enhancement than circuits based on the backward-adaptive pole-zero predictor in ADPCM (adaptive differential pulse code modulation). Quantitative criteria for designing postfiltering circuits based on higher-order LPC models are discussed. These postfilters are particularly attractive for systems where high-order LPC analysis is an integral part of the coding algorithm. In a subjective test that used a computer-simulated version of these circuits, enhanced ADPCM obtained a mean opinion score of 3.6 at 16 kb/s.

65 citations
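
The abstract above does not give the postfilter equations. As a rough sketch only, one widely used LPC-based postfilter form is H(z) = A(z/beta) / A(z/alpha) with 0 < beta < alpha < 1, where A(z) = 1 + sum_k a_k z^-k is the LPC inverse filter. The function names, the coefficient convention, and the default alpha and beta values below are assumptions for illustration and are not taken from the paper.

```python
def bandwidth_expand(lpc, gamma):
    """Return the coefficients of A(z/gamma): a_k is scaled by gamma**k.

    Assumes lpc[0] == 1.0 and A(z) = 1 + sum_k lpc[k] * z**-k.
    """
    return [a * gamma ** k for k, a in enumerate(lpc)]


def postfilter(x, lpc, alpha=0.8, beta=0.5):
    """Filter x with H(z) = A(z/beta) / A(z/alpha) (illustrative defaults)."""
    num = bandwidth_expand(lpc, beta)   # zeros: partly cancel the spectral tilt
    den = bandwidth_expand(lpc, alpha)  # poles: emphasize formant regions
    y = []
    for n in range(len(x)):
        # Feed-forward part (numerator) over current and past inputs.
        acc = sum(num[k] * x[n - k] for k in range(len(num)) if n - k >= 0)
        # Feedback part (denominator) over past outputs only.
        acc -= sum(den[k] * y[n - k] for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y
```

Setting alpha equal to beta makes the numerator and denominator cancel, so the filter passes the signal unchanged; widening the gap between them strengthens the enhancement at the cost of more coloration.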

Proceedings Article
19 Jun 2014
TL;DR: In this paper, an implementation of a speech recognition system in the MATLAB environment is explained, where two algorithms, Mel-Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW), are adopted for feature extraction and pattern matching, respectively.
Abstract: Speech recognition has a wide range of applications in security systems, healthcare, telephony, the military, and equipment designed for the handicapped. Speech is a continuously varying signal, so a proper digital processing algorithm has to be selected for an automatic speech recognition system. To obtain the required information from a speech sample, features have to be extracted from it. For recognition, the features are analyzed to make decisions. This paper explains the implementation of a speech recognition system in the MATLAB environment. Mel-Frequency Cepstral Coefficients (MFCC) and Dynamic Time Warping (DTW) are the two algorithms adopted for feature extraction and pattern matching, respectively. Results are obtained by one-time training and continuous testing phases.

65 citations
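
The pattern-matching step named in the abstract above can be illustrated with a minimal Dynamic Time Warping sketch. The paper's system is in MATLAB; this Python version is only an illustration, and the function name `dtw_distance`, the Euclidean local cost, and the representation of each frame as a list of floats (standing in for an MFCC vector) are assumptions.

```python
import math


def dtw_distance(a, b):
    """Return the DTW alignment cost between two sequences of feature frames.

    a, b: lists of equal-dimension frames (each frame a list of floats).
    Uses the classic O(len(a) * len(b)) dynamic program with steps
    (i-1, j), (i, j-1), (i-1, j-1) and Euclidean local cost.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # local frame distance
            cost[i][j] = d + min(cost[i - 1][j],      # repeat frame of b
                                 cost[i][j - 1],      # repeat frame of a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

In a template-matching recognizer, a test utterance would be compared against each stored template this way, and the template with the smallest cost would be chosen as the recognized word.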


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  38
2022  84
2021  70
2020  62
2019  77
2018  108