Topic

Speech coding

About: Speech coding is a research topic. Over the lifetime, 14245 publications have been published within this topic receiving 271964 citations.


Papers
Journal ArticleDOI
01 Jun 1995
TL;DR: This paper explains basic approaches to speech, wideband speech, and audio bit rate compression in audiovisual communications, and shows that knowledge of auditory perception helps minimize the perception of coding artifacts and leads to efficient low bit rate coding algorithms that achieve substantially more compression than was thought possible only a few years ago.
Abstract: Current and future visual communications for applications such as broadcasting, videotelephony, video- and audiographic-conferencing, and interactive multimedia services assume a substantial audio component. Even text, graphics, fax, still images, email documents, etc. will gain from voice annotation and audio clips. A wide range of speech, wideband speech, and wideband audio coders is available for such applications. In the context of audiovisual communications, the quality of telephone-bandwidth speech is acceptable for some videotelephony and videoconferencing services. Higher bandwidths (wideband speech) may be necessary to improve the intelligibility and naturalness of speech. High quality audio coding, including multichannel audio, will be necessary in advanced digital TV and multimedia services. This paper explains basic approaches to speech, wideband speech, and audio bit rate compression in audiovisual communications. These signal classes differ in bandwidth, dynamic range, and in listener expectation of offered quality. It will become obvious that the use of our knowledge of auditory perception helps minimize the perception of coding artifacts and leads to efficient low bit rate coding algorithms which can achieve substantially more compression than was thought possible only a few years ago. The paper concentrates on worldwide source coding standards beneficial for consumers, service providers, and manufacturers.

62 citations

Patent
27 Jan 1998
TL;DR: In this voice conversion system, each speech frame is represented by a weighted average of codebook entries; the weights represent a perceptual distance of the speech frame and may be refined by a gradient descent analysis.
Abstract: A voice conversion system and methodology employing a codebook mapping approach to transforming a source voice to sound like a target voice. Each speech frame is represented by a weighted average of codebook entries (304). The weights represent a perceptual distance of the speech frame and may be refined by a gradient descent analysis. The vocal tract characteristics, represented by a line spectral frequency vector (302), the excitation characteristics (308), represented by a linear predictive coding residual, the duration, and the amplitude of the speech frame are transformed in the same weighted-average framework.
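The weighted-average codebook mapping described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patent's method: the function names, the `gamma` sharpness parameter, and the use of squared Euclidean distance as a stand-in for the perceptual distance are all assumptions, and the gradient descent refinement of the weights is omitted.

```python
import math

def distance_weights(frame, codebook, gamma=1.0):
    """Weights that decay with distance to each codebook entry.

    Squared Euclidean distance is an assumed stand-in for the
    perceptual distance; gamma (hypothetical) controls sharpness.
    """
    dists = [sum((f - c) ** 2 for f, c in zip(frame, entry)) for entry in codebook]
    dmin = min(dists)  # subtract the minimum for numerical stability
    raw = [math.exp(-gamma * (d - dmin)) for d in dists]
    total = sum(raw)
    return [r / total for r in raw]  # normalized so the weights sum to 1

def weighted_average(weights, codebook):
    """Weighted average of codebook entries, computed per dimension."""
    dim = len(codebook[0])
    return [sum(w * entry[i] for w, entry in zip(weights, codebook)) for i in range(dim)]

def convert_frame(frame, source_codebook, target_codebook, gamma=1.0):
    """Estimate weights on the source codebook, then apply the same
    weights to the paired target codebook entries."""
    weights = distance_weights(frame, source_codebook, gamma)
    return weighted_average(weights, target_codebook)

# Toy two-entry codebooks (made-up values, for illustration only).
source_codebook = [[0.0, 0.0], [1.0, 1.0]]
target_codebook = [[10.0, 20.0], [30.0, 40.0]]
converted = convert_frame([0.5, 0.5], source_codebook, target_codebook)
```

A frame equidistant from both source entries receives equal weights, so its converted vector is the midpoint of the two target entries.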

62 citations

Patent
Ojala Pasi
05 Dec 1997
TL;DR: A speech encoding method in which the number of bits used to represent the prediction parameters is adjusted based on short-term LPC analysis and long-term LTP analysis, giving even quality at a low average bit rate and making it particularly suitable for low data transfer speeds.
Abstract: The invention relates to digital speech encoding. In a speech codec according to the invention, for modeling a speech signal (301) both prediction parameters (321, 322, 331) modeling a speech signal in the short term and prediction parameters (341, 342, 351) modeling a speech signal in the long term are used. Each prediction parameter (321, 322, 331, 341, 342, 351) is presented using a certain accuracy, in a digital system with a certain number of bits. In speech encoding according to the invention the number of bits used for presenting prediction parameters (321, 322, 331, 341, 342, 351) is adjusted based upon information parameters (321, 322, 331, 341, 342, 351) obtained from a short-term LPC-analysis (32) and from a long-term LTP-analysis (31, 34, 35). The invention is particularly suitable for use at low data transfer speeds, because it offers a speech encoding method of even quality and low average bit rate.
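The link between the number of bits and the accuracy of a parameter can be illustrated with a plain uniform quantizer. This is only a sketch of that trade-off: the uniform quantizer, the value range `[-1, 1]`, and the function names are assumptions, and the patent's rule for choosing the bit count from the LPC and LTP analyses is not reproduced here.

```python
def quantize(value, bits, lo=-1.0, hi=1.0):
    """Map value to an integer index using 2**bits uniform cells over [lo, hi]."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    index = int((value - lo) / step)
    return max(0, min(index, levels - 1))  # clamp so value == hi stays in range

def dequantize(index, bits, lo=-1.0, hi=1.0):
    """Reconstruct the value at the centre of the indexed cell."""
    step = (hi - lo) / 2 ** bits
    return lo + (index + 0.5) * step

# Reconstruction error of one hypothetical parameter at several bit counts.
parameter = 0.3
errors = {
    bits: abs(dequantize(quantize(parameter, bits), bits) - parameter)
    for bits in (3, 5, 8)
}
```

For values inside the range, the reconstruction error is at most half a cell width, so each extra bit halves the worst-case error. Spending fewer bits where a coarser representation is acceptable is what lowers the average bit rate.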

61 citations

Proceedings ArticleDOI
25 Mar 2012
TL;DR: This paper proposes an effective splicing detection method for audio that detects abnormal differences in the local noise levels of an audio signal, and demonstrates the efficacy and robustness of the proposed method using both synthetic and realistic audio splicing forgeries.
Abstract: One common form of tampering in digital audio signals is known as splicing, where sections from one audio recording are inserted into another. In this paper, we propose an effective splicing detection method for audio. Our method achieves this by detecting abnormal differences in the local noise levels in an audio signal. This estimation of local noise levels is based on an observed property of audio signals: they tend to have kurtosis close to a constant in the band-pass filtered domain. We demonstrate the efficacy and robustness of the proposed method using both synthetic and realistic audio splicing forgeries.
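The general idea of flagging abnormal local noise levels can be sketched as below. This is not the paper's kurtosis-based estimator: as a swapped-in simplification, the standard deviation of the first difference in each window serves as a crude noise proxy, and the window size, the `ratio` threshold, and all function names are assumptions. The `kurtosis` helper only shows the statistic the abstract refers to.

```python
import random
import statistics

def kurtosis(xs):
    """Sample kurtosis: the fourth central moment over the squared variance."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n
    m4 = sum((x - mean) ** 4 for x in xs) / n
    return m4 / (m2 * m2)

def local_noise_levels(signal, window):
    """Std dev of the first difference in each non-overlapping window.

    A crude noise proxy (assumption), not the estimator from the paper.
    """
    diff = [b - a for a, b in zip(signal, signal[1:])]
    return [
        statistics.pstdev(diff[i:i + window])
        for i in range(0, len(diff) - window + 1, window)
    ]

def flag_abnormal(levels, ratio=3.0):
    """Indices of windows whose level differs from the median by more than ratio."""
    med = statistics.median(levels)
    return [i for i, lv in enumerate(levels) if lv > ratio * med or lv * ratio < med]

# Synthetic forgery: a noisier segment spliced into a quiet recording.
rng = random.Random(0)
quiet = [rng.gauss(0.0, 0.01) for _ in range(4000)]
noisy = [rng.gauss(0.0, 0.1) for _ in range(1000)]
signal = quiet[:2000] + noisy + quiet[2000:]
levels = local_noise_levels(signal, window=500)
suspect_windows = flag_abnormal(levels)
```

In this synthetic case the spliced segment covers samples 2000 to 2999, so the windows that overlap it are the ones expected to stand out from the median level.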

61 citations

Book
03 Mar 2005
TL;DR: This book covers sounds and numbers, digital filters and resonators, frequency analysis and linear predictive coding, finite state machines, speech recognition techniques, probabilistic finite-state models, parsing, and the use of probabilistic grammars.
Abstract: 1. Introduction 2. Sounds and numbers 3. Digital filters and resonators 4. Frequency analysis and linear predictive coding 5. Finite state machines 6. Introduction to speech recognition techniques 7. Probabilistic finite-state models 8. Parsing 9. Using probabilistic grammars.

61 citations


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108