Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Journal Article
TL;DR: A new voice activity detection (VAD) algorithm is presented for improving speech detection robustness in noisy environments and the performance of speech recognition systems. It formulates the speech/non-speech decision rule by comparing the long-term spectral envelope to the average noise spectrum, yielding a highly discriminating decision rule and minimizing the average number of decision errors.

412 citations
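
As a rough illustration of the decision rule summarized in the entry above, the sketch below compares a long-term spectral envelope with an average noise spectrum and thresholds the result. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names (`long_term_envelope`, `divergence_db`, `detect_speech`), the window `order`, the 6 dB threshold, and the use of precomputed magnitude spectra as input are all hypothetical choices made for the example.

```python
import math

def long_term_envelope(frames, index, order):
    """Per-bin maximum of the magnitude spectra in a window of
    +/- `order` frames around `index` (window clipped at the edges)."""
    lo = max(0, index - order)
    hi = min(len(frames), index + order + 1)
    bins = len(frames[index])
    return [max(frames[j][k] for j in range(lo, hi)) for k in range(bins)]

def divergence_db(frames, index, noise, order):
    """Mean ratio (in dB) of the squared long-term envelope to the
    squared average noise spectrum. Assumes every noise bin is > 0."""
    envelope = long_term_envelope(frames, index, order)
    ratio = sum((e * e) / (n * n) for e, n in zip(envelope, noise)) / len(envelope)
    return 10.0 * math.log10(ratio)

def detect_speech(frames, noise, order=1, threshold_db=6.0):
    """Label each frame as speech (True) when the divergence exceeds
    the threshold. Both defaults are illustrative assumptions."""
    return [divergence_db(frames, i, noise, order) > threshold_db
            for i in range(len(frames))]

# Toy input: flat noise floor with a single loud frame in the middle.
noise_spectrum = [1.0, 1.0]
spectra = [[1.0, 1.0]] * 4 + [[10.0, 10.0]] + [[1.0, 1.0]] * 4
decisions = detect_speech(spectra, noise_spectrum, order=1)
```

Because the envelope is taken over neighbouring frames, the frames adjacent to the loud one are also labelled as speech, which is the kind of smoothing a long-term measure is meant to provide.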

Journal Article
TL;DR: It is argued that the Itakura-Saito and related distortions are well-suited computationally, mathematically, and intuitively for such applications.
Abstract: Several properties, interrelations, and interpretations are developed for various speech spectral distortion measures. The principal results are 1) the development of notions of relative strength and equivalence of the various distortion measures both in a mathematical sense corresponding to subjective equivalence and in a coding sense when used in minimum distortion or nearest neighbor speech processing systems; 2) the demonstration that the Itakura-Saito and related distortion measures possess a property similar to the triangle inequality when used in nearest neighbor systems such as quantization and cluster analysis; and 3) that the Itakura-Saito and normalized model distortion measures yield efficient computation algorithms for generalized centroids or minimum distortion points of groups or clusters of speech frames, an important computation in both classical cluster analysis techniques and in algorithms for optimal quantizer design. We also argue that the Itakura-Saito and related distortions are well-suited computationally, mathematically, and intuitively for such applications.

409 citations
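
To make the centroid property mentioned in the entry above concrete, here is a minimal sketch of the Itakura-Saito distortion in its common discrete form on sampled power spectra, d(P, Q) = mean over bins of P/Q - ln(P/Q) - 1. This is an assumption-laden illustration rather than the paper's formulation, which works with spectral models: the function names are hypothetical, and the spectra are assumed to be strictly positive lists of equal length. In this discrete form, the spectrum that minimizes the total distortion from a cluster of spectra is their per-bin arithmetic mean, which is why the centroid is cheap to compute.

```python
import math

def itakura_saito(p, q):
    """Mean Itakura-Saito distortion from power spectrum p to q.
    Assumes equal lengths and strictly positive entries."""
    if len(p) != len(q):
        raise ValueError("spectra must have the same number of bins")
    total = 0.0
    for pk, qk in zip(p, q):
        ratio = pk / qk
        total += ratio - math.log(ratio) - 1.0
    return total / len(p)

def centroid(spectra):
    """Per-bin arithmetic mean, which minimizes the summed distortion
    sum(itakura_saito(p, c) for p in spectra) over c."""
    count = len(spectra)
    return [sum(column) / count for column in zip(*spectra)]

def total_distortion(spectra, candidate):
    """Summed distortion from every spectrum in the cluster to candidate."""
    return sum(itakura_saito(p, candidate) for p in spectra)

# Toy cluster of two 2-bin power spectra.
cluster = [[1.0, 4.0], [3.0, 2.0]]
center = centroid(cluster)
```

Note that the measure is not symmetric: `itakura_saito([1.0], [2.0])` and `itakura_saito([2.0], [1.0])` give different values, so the order of the arguments matters in nearest neighbor searches.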

Book
01 Jan 1999
TL;DR: This Second Edition of Speech and Audio Signal Processing will update and revise the original book to augment it with new material describing both the enabling technologies of digital music distribution and a range of exciting new research areas in automatic music content processing that have emerged in the past five years, driven by the digital music revolution.
Abstract: When Speech and Audio Signal Processing was published in 1999, it stood out from its competition in its breadth of coverage and its accessible, intuition-based style. This book was aimed at individual students and engineers excited about the broad span of audio processing and curious to understand the available techniques. Since then, with the advent of the iPod in 2001, the field of digital audio and music has exploded, leading to a much greater interest in the technical aspects of audio processing. This Second Edition will update and revise the original book to augment it with new material describing both the enabling technologies of digital music distribution (most significantly the MP3) and a range of exciting new research areas in automatic music content processing (such as automatic transcription, music similarity, etc.) that have emerged in the past five years, driven by the digital music revolution. New chapter topics include:
Psychoacoustic Audio Coding, describing MP3 and related audio coding schemes based on psychoacoustic masking of quantization noise.
Music Transcription, including automatically deriving notes, beats, and chords from music signals.
Music Information Retrieval, primarily focusing on audio-based genre classification, artist/style identification, and similarity estimation.
Audio Source Separation, including multi-microphone beamforming, blind source separation, and the perception-inspired techniques usually referred to as Computational Auditory Scene Analysis (CASA).

395 citations

Proceedings Article
09 Jul 2006
TL;DR: In this paper, an investigation of H.264/MPEG4-AVC conforming coding with hierarchical B pictures is presented; simulation results show that, compared with the widely used IBBP structure, coding gains can be achieved at the expense of an increased coding delay.
Abstract: In this paper, an investigation of H.264/MPEG4-AVC conforming coding with hierarchical B pictures is presented. We analyze the coding delay and memory requirements, describe details of an improved encoder control, and compare the coding efficiency for different coding delays. Additionally, the coding efficiency of hierarchical B picture coding is compared to that of MCTF-based coding by using identical coding structures and a similar degree of encoder optimization. Our simulation results show that, in comparison to the widely used IBBP... structure, coding gains of more than 1 dB can be achieved at the expense of an increased coding delay. Further experiments have shown that the coding efficiency gains obtained by using the additional update steps in MCTF coding are generally smaller than the losses resulting from the required open-loop encoder control.

394 citations

Book
18 Jul 1996
TL;DR: International standards for image, video and audio coding, including ITU-T H.263 Very Low Bit-rate Coding, and MPEG-2 Generic Coding Algorithms are presented.
Abstract: 1. Introduction.
I. DIGITAL CODING TECHNIQUES. 2. Color Formats. 3. Quantization. 4. Predictive Coding. 5. Transform Coding. 6. Hybrid Coding and Motion Compensation. 7. Vector Quantization and Subband Coding.
II. INTERNATIONAL STANDARDS FOR IMAGE, VIDEO AND AUDIO CODING. 8. JPEG Still Picture Compression Algorithm. 9. ITU-T H.261 Video Coder. 10. MPEG-1 Audiovisual Coder for Digital Storage Media. 11. MPEG-2 Generic Coding Algorithms. 12. MPEG-4 and H.263 Very Low Bit-rate Coding. 13. High Definition Television Services. 14. CMTT Digital Broadcasting Standards.
Appendix A. Manufacturers and Vendors. Appendix B. Information on the Internet.

390 citations


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108