Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Proceedings Article
01 Jan 1987
TL;DR: This paper presents a generalisation of error propagation nets to deal with time varying, or dynamic patterns, and three possible architectures are explored.
Abstract: Error propagation nets have been shown to be able to learn a variety of tasks in which a static input pattern is mapped onto a static output pattern. This paper presents a generalisation of these nets to deal with time varying, or dynamic patterns, and three possible architectures are explored. As an example, dynamic nets are applied to the problem of speech coding, in which a time sequence of speech data are coded by one net and decoded by another. The use of dynamic nets gives a better signal to noise ratio than that achieved using static nets.

102 citations

Patent
Robert E. Holm
30 Sep 1994
TL;DR: A transform based compression mechanism is proposed to establish a single compression technique which is applicable for all audio compression ranging from very low bit rate speech to CD/Audio quality music.
Abstract: A transform based compression mechanism to establish a single compression technique which is applicable for all audio compression ranging from very low bit rate speech to CD/Audio quality music. Additionally, a multiconferencing unit (multi-point bridge) is provided which takes advantage of the transform based compression algorithm by providing a simple, low cost way of combining multiple parties without the need for transcoding.

102 citations

Proceedings Article
01 Jan 1996
TL;DR: The results of a study examining the effects speech coders have on speech recognition are presented, along with the effects on recognition performance of tandeming each of the speech coders.
Abstract: Speech coders with bitrates as low as 2.4 kbits/s are now being developed for speech transmission in the telecommunications industry. For speech coders to work at this reduced bitrate, some speech information has to be removed and it is only natural to expect that the performance of speech recognition systems will deteriorate when coded speech is applied as input to a recognition system. The results of a study to examine the effects speech coders have on speech recognition are presented. Six different speech coders ranging from 4.8 kbits/s to 40 kbits/s are used with two different speech recognition systems: (1) isolated word recognition and (2) phoneme recognition from continuous speech. The effects on speech recognition performance by tandeming each of the speech coders are also presented.

102 citations

Journal Article
TL;DR: A new perceptually motivated approach is proposed for enhancement of speech corrupted by colored noise that takes into account the frequency masking properties of the human auditory system and reduces the perceptual effect of the residual noise.
Abstract: A new perceptually motivated approach is proposed for enhancement of speech corrupted by colored noise. The proposed approach takes into account the frequency masking properties of the human auditory system and reduces the perceptual effect of the residual noise. This new perceptual method is incorporated into a frequency-domain speech enhancement method and a subspace-based speech enhancement method. A better power spectrum/autocorrelation function estimator was also developed to improve the performance of the proposed algorithms. Objective measures and informal listening tests demonstrated significant improvements over other methods when tested with TIMIT sentences corrupted by various types of colored noise.

102 citations

Journal Article
TL;DR: A TF-based audio coding scheme with novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking will be presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.
Abstract: Audio signals are information rich nonstationary signals that play an important role in our day-to-day communication, perception of environment, and entertainment. Due to their nonstationary nature, time- or frequency-only approaches are inadequate in analyzing these signals. A joint time-frequency (TF) approach would be a better choice to efficiently process these signals. In this digital era, compression, intelligent indexing for content-based retrieval, classification, and protection of digital audio content are a few of the areas that encapsulate a majority of the audio signal processing applications. In this paper, we present a comprehensive array of TF methodologies that successfully address applications in all of the above mentioned areas. A TF-based audio coding scheme with novel psychoacoustics model, music classification, audio classification of environmental sounds, audio fingerprinting, and audio watermarking will be presented to demonstrate the advantages of using time-frequency approaches in analyzing and extracting information from audio signals.

101 citations


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108