Topic

Speech coding

About: Speech coding is a research topic. Over the lifetime, 14245 publications have been published within this topic receiving 271964 citations.


Papers
Proceedings Article
30 Oct 1999
TL;DR: This study modifies and combines audio features known to be effective in distinguishing speech from music, and examines their behavior on mixed audio in the CueVideo system.
Abstract: The role of audio in the context of multimedia applications involving video is becoming increasingly important. Many efforts in this area focus on audio data that contains some built-in semantic information structure such as in broadcast news, or focus on classification of audio that contains a single type of sound such as clear speech or clear music only. In the CueVideo system, we detect and classify audio that consists of mixed audio, i.e. combinations of speech and music together with other types of background sounds. Segmentation of mixed audio has applications in detection of story boundaries in video, spoken document retrieval systems, audio retrieval systems, etc. We modify and combine audio features known to be effective in distinguishing speech from music, and examine their behavior on mixed audio. Our preliminary experimental results show that we can achieve a classification accuracy of over 80% for such mixed audio. Our study also provides us with several helpful insights related to analyzing mixed audio in the context of real applications.

75 citations

Jürgen Herre
01 Sep 1999
TL;DR: The second part of this paper will focus on the large number of possible choices for the quantization and coding methods for perceptual audio coding along with examples of real-world systems using these approaches.
Abstract: Perceptual audio coding has become an important key technology for many types of multimedia services these days. This paper provides a brief tutorial introduction to a number of issues as they arise in today's low bitrate audio coders. After discussing the Temporal Noise Shaping technology in the first part of this paper, the second part will focus on the large number of possible choices for the quantization and coding methods for perceptual audio coding along with examples of real-world systems using these approaches.

75 citations

Journal Article
TL;DR: The data from both experiments combined indicate that, in contrast to normal hearing, timing cues available from natural head-width delays do not offer binaural advantages with present methods of electrical stimulation, even when fine-timing cues are explicitly coded.
Abstract: Four adult bilateral cochlear implant users, with good open-set sentence recognition, were tested with three different sound coding strategies for binaural speech unmasking and their ability to localize 100 and 500 Hz click trains in noise. Two of the strategies tested were envelope-based strategies that are clinically widely used. The third was a research strategy that additionally preserved fine-timing cues at low frequencies. Speech reception thresholds were determined in diotic noise for diotic and interaurally time-delayed speech using direct audio input to a bilateral research processor. Localization in noise was assessed in the free field. Overall results, for both speech and localization tests, were similar with all three strategies. None provided a binaural speech unmasking advantage due to the application of a 700 µs interaural time delay to the speech signal, and localization results showed similar response patterns across strategies that were well accounted for by the use of broadband interaural level cues. The data from both experiments combined indicate that, in contrast to normal hearing, timing cues available from natural head-width delays do not offer binaural advantages with present methods of electrical stimulation, even when fine-timing cues are explicitly coded.

75 citations
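
To give a sense of scale for the 700 microsecond interaural time delay mentioned in the abstract above, the following minimal Python sketch converts the delay to whole samples and delays one channel of a diotic (identical in both ears) signal. The function name `apply_itd`, the integer-sample rounding, and the choice of which ear is delayed are illustrative assumptions, not the processing used in the study.

```python
def apply_itd(mono, sample_rate_hz, itd_seconds=700e-6):
    """Return (left, right), with the right channel delayed by the ITD.

    Assumption: the delay is rounded to a whole number of samples; a
    fractional-delay filter would be needed for sub-sample accuracy.
    """
    delay = round(itd_seconds * sample_rate_hz)
    left = list(mono) + [0.0] * delay    # pad so both channels have equal length
    right = [0.0] * delay + list(mono)   # same signal, shifted later in time
    return left, right


if __name__ == "__main__":
    left, right = apply_itd([1.0, 2.0, 3.0], sample_rate_hz=10_000)
    print(left)
    print(right)
```

At a 44.1 kHz sample rate the same delay rounds to 31 samples, which shows how small head-width timing differences are compared with typical frame lengths.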

Proceedings Article
17 Oct 1999
TL;DR: The thresholds from this algorithm are compared to those produced from a clean speech estimate from a variety of common spectral subtraction algorithms, and the relationship between those from the corrupted speech and corrupting noise is examined.
Abstract: We propose a new method for the estimation of clean speech masking thresholds for speech enhancement. These thresholds are applied to a perceptually based spectral subtraction algorithm to enhance speech in a non-stationary noise environment. In contrast to other approaches we do not directly use an estimate of the clean speech to obtain the masking thresholds, but examine the relationship between those from the corrupted speech and corrupting noise. The thresholds from this algorithm are compared to those produced from a clean speech estimate from a variety of common spectral subtraction algorithms.

74 citations
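
For readers unfamiliar with the spectral subtraction family that the abstract above builds on, here is a minimal Python sketch of plain magnitude spectral subtraction for a single frame. It is not the masking-threshold method the paper proposes: the function name `spectral_subtract`, the over-subtraction factor `alpha`, and the spectral floor `floor` are assumptions chosen for illustration, and the magnitude spectra are taken as already computed.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=1.0, floor=0.01):
    """Basic magnitude spectral subtraction for one frame.

    noisy_mag: magnitude spectrum of the noisy speech frame
    noise_mag: estimated magnitude spectrum of the noise
    alpha:     over-subtraction factor (assumed parameter)
    floor:     fraction of the noisy magnitude kept as a lower bound,
               which avoids negative magnitudes (assumed parameter)
    """
    if len(noisy_mag) != len(noise_mag):
        raise ValueError("spectra must have the same number of bins")
    enhanced = []
    for y, n in zip(noisy_mag, noise_mag):
        estimate = y - alpha * n                  # subtract the noise estimate
        enhanced.append(max(estimate, floor * y))  # clamp to the spectral floor
    return enhanced


if __name__ == "__main__":
    print(spectral_subtract([1.0, 0.5, 0.2], [0.2, 0.5, 0.4]))
```

A perceptually based variant would, roughly speaking, vary the amount subtracted per bin according to a masking threshold rather than using one fixed `alpha`; how that threshold is estimated is the subject of the paper and is not reproduced here.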

Patent
02 Mar 2005
TL;DR: A method for reducing noise disturbance associated with an audio signal received through a microphone is provided; it begins by magnifying a noise disturbance relative to a remaining component of the audio signal.
Abstract: A method for reducing noise disturbance associated with an audio signal received through a microphone is provided. The method initiates with magnifying a noise disturbance of the audio signal relative to a remaining component of the audio signal. Then, a sampling rate of the audio signal is decreased. Next, an even order derivative is applied to the audio signal having the decreased sampling rate to define a detection signal. Then, the noise disturbance of the audio signal is adjusted according to a statistical average of the detection signal. A system capable of canceling disturbances associated with an audio signal, a video game controller, and an integrated circuit for reducing noise disturbances associated with an audio signal are included.

74 citations
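
The middle steps listed in the patent abstract above (decrease the sampling rate, apply an even-order derivative, take a statistical average of the result) can be sketched in a few lines of Python. This is only one possible reading under stated assumptions: a second-order finite difference stands in for the even-order derivative, the mean absolute value stands in for the statistical average, decimation is done by naive slicing, and the magnification and final adjustment steps are omitted because the abstract does not specify them. All function names are hypothetical.

```python
def decimate(signal, factor):
    """Keep every `factor`-th sample (assumption: no anti-aliasing filter)."""
    return signal[::factor]


def second_difference(signal):
    """Second-order finite difference, a discrete even-order derivative."""
    return [signal[i - 1] - 2 * signal[i] + signal[i + 1]
            for i in range(1, len(signal) - 1)]


def detection_average(signal, factor=2):
    """Mean absolute value of the detection signal (assumed statistic)."""
    detection = second_difference(decimate(signal, factor))
    if not detection:
        return 0.0
    return sum(abs(v) for v in detection) / len(detection)


if __name__ == "__main__":
    ramp = [float(i) for i in range(20)]   # smooth signal
    click = [0.0] * 9
    click[4] = 1.0                         # isolated impulse
    print(detection_average(ramp), detection_average(click))
```

A smooth ramp yields an average of zero, while an isolated click yields a clearly non-zero value, which illustrates why a second difference can act as a detector for abrupt disturbances.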


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108