Topic

Speech coding

About: Speech coding is a research topic. Over the lifetime, 14245 publications have been published within this topic receiving 271964 citations.


Papers
Patent
Martin Holzapfel
10 Sep 2001
TL;DR: A first speech model is trained with a first time pattern and a second speech model with a second time pattern; the second model is initialized with the first.
Abstract: A method and also a configuration for determining a descriptive feature of a speech signal, in which a first speech model is trained with a first time pattern and a second speech model is trained with a second time pattern. The second speech model is initialized with the first speech model.

120 citations

Journal ArticleDOI
TL;DR: An examination of speech impairments likely to arise in dynamically managed voice (DMV) systems, which use speech activity detection to exploit speech idle time and variable bit rate coding to exploit nonstationary speech statistics. It identifies two impairments not commonly found in traditional communication systems: variable speech burst delay and speech clipping.
Abstract: The purpose of this paper is to examine speech impairments likely to arise in dynamically managed voice (DMV) systems. DMV systems utilize speech activity detection to exploit speech idle time and variable bit rate coding to exploit nonstationary speech statistics. The emphasis here is on systems using speech detection. This processing introduces two impairments not commonly found in traditional communication systems: variable speech burst delay and speech clipping. Simulations of these impairments were implemented, and formal subjective testing was performed to assess subjects' reactions to a range of impairment levels. Emphasis was on formal subjective listening tests and customer opinion of speech quality as defined by a rating scale. The test conditions are applicable to general telephony, where relatively high speech quality is required. Results on variable speech burst delay and front-end and midspeech burst clipping are presented. These results serve as input to the design process and to the establishment of performance guidelines for DMV systems.

119 citations
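The front-end clipping impairment named in the abstract above can be illustrated with a minimal sketch. It is not the paper's simulation: the function name `front_end_clip`, the frame-based representation, and the precomputed activity flags are all assumptions made for illustration. The sketch silences the first few frames of every speech burst, which is roughly what happens when a speech detector reacts late to the onset of a talkspurt.

```python
def front_end_clip(frames, active, clip_frames):
    """Silence the first `clip_frames` frames of every speech burst.

    frames      -- list of frames, each a list of float samples (assumed layout)
    active      -- list of booleans, True where a frame contains speech
                   (assumed to come from some speech activity detector)
    clip_frames -- number of leading frames lost at the start of each burst
    """
    out = []
    run = 0  # consecutive active frames seen so far in the current burst
    for frame, is_speech in zip(frames, active):
        if is_speech:
            run += 1
            if run <= clip_frames:
                out.append([0.0] * len(frame))  # clipped onset
            else:
                out.append(list(frame))         # rest of the burst passes through
        else:
            run = 0                             # an idle frame ends the burst
            out.append(list(frame))
    return out
```

Varying `clip_frames` gives a range of impairment levels of the kind a listening test could compare. Midspeech burst clipping and variable burst delay would need a different model and are not covered here.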

Patent
Yang Gao, Adil Benyassine, Jes Thyssen, Eyal Shlomot, Huan-Yu Su
15 Sep 2000
TL;DR: A speech compression system that encodes a speech signal into a bitstream for subsequent decoding to generate synthesized speech. It optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech.
Abstract: A speech compression system capable of encoding a speech signal into a bitstream for subsequent decoding to generate synthesized speech is disclosed. The speech compression system optimizes the bandwidth consumed by the bitstream by balancing the desired average bit rate with the perceptual quality of the reconstructed speech. The speech compression system comprises a full-rate codec, a half-rate codec, a quarter-rate codec and an eighth-rate codec. The codecs are selectively activated based on a rate selection. In addition, the full and half-rate codecs are selectively activated based on a type classification. Each codec is selectively activated to encode and decode the speech signals at different bit rates emphasizing different aspects of the speech signal to enhance overall quality of the synthesized speech.

119 citations

Proceedings ArticleDOI
06 Jul 2003
TL;DR: A unified framework that extracts highlights from three sports (baseball, golf and soccer) by detecting common audio events that are directly indicative of highlights, using MPEG-7 audio features and entropic prior hidden Markov models (HMM).
Abstract: We have developed a unified framework to extract highlights from three sports - baseball, golf and soccer - by detecting some of the common audio events that are directly indicative of highlights. We used MPEG-7 audio features and entropic prior hidden Markov models (HMM) for feature extraction and classification, respectively, to recognize these common audio events. Together with pre- and post-processing techniques using general sports knowledge, we have been able to generate promising results dealing with an audio track that is dominated by audio mixtures and noisy background.

119 citations

Journal ArticleDOI
Sharad Singhal, B. S. Atal
TL;DR: The algorithm provides a framework for computing multipulse excitation with varying degrees of optimization and computational complexity and finds that speech quality depends on the pulse rate and female speech requires a higher pulse rate than male speech.
Abstract: Although the multipulse model is conceptually simple, the problem of locating the pulses is computationally complex. The authors discuss the basic multipulse model and describe a procedure to compute the excitation with optimally adjusted amplitudes. The algorithm provides a framework for computing multipulse excitation with varying degrees of optimization and computational complexity. The authors find that speech quality depends on the pulse rate. They also find that for the same quality, female speech requires a higher pulse rate than male speech. The pitch dependence can be reduced and speech quality improved for high-pitched speakers by incorporating long delay prediction in the multipulse model.

119 citations
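To make the pulse-location problem in the abstract above concrete, here is a minimal sketch of the simplest kind of multipulse search: a greedy, one-pulse-at-a-time placement. It is not the authors' procedure. The function names `synthesize` and `multipulse_search` are hypothetical, the synthesis filter is assumed to be given as a finite impulse response `h`, and perceptual weighting, joint re-optimization of amplitudes, and long delay prediction are all left out.

```python
def synthesize(pulses, h, n):
    """Convolve a sparse pulse train {position: amplitude} with impulse response h."""
    out = [0.0] * n
    for pos, amp in pulses.items():
        for k, hk in enumerate(h):
            if pos + k < n:
                out[pos + k] += amp * hk
    return out


def multipulse_search(target, h, num_pulses):
    """Greedy search: each new pulse minimises the remaining squared error.

    Returns (pulses, residual), where pulses maps position -> amplitude and
    residual is the part of `target` the pulse train does not reproduce.
    """
    n = len(target)
    residual = list(target)
    pulses = {}
    for _ in range(num_pulses):
        best = None  # (error reduction, position, amplitude)
        for pos in range(n):
            span = min(len(h), n - pos)  # truncate h at the frame boundary
            corr = sum(residual[pos + k] * h[k] for k in range(span))
            energy = sum(h[k] * h[k] for k in range(span))
            if energy == 0.0:
                continue
            reduction = corr * corr / energy
            if best is None or reduction > best[0]:
                best = (reduction, pos, corr / energy)
        if best is None or best[0] == 0.0:
            break  # nothing left to gain from another pulse
        _, pos, amp = best
        pulses[pos] = pulses.get(pos, 0.0) + amp
        for k in range(min(len(h), n - pos)):
            residual[pos + k] -= amp * h[k]
    return pulses, residual
```

For a single pulse at a fixed position, the amplitude `corr / energy` is the least-squares optimum given the pulses already placed. Readjusting all amplitudes jointly after each placement would be a higher degree of optimization at higher cost, which is the kind of trade-off the abstract refers to.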


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  38
2022  84
2021  70
2020  62
2019  77
2018  108