Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have appeared within this topic, receiving 271,964 citations.


Papers
Patent
James R. Lewis1, Barbara Ballard1
08 Mar 1999
TL;DR: A method and system for responding to randomly occurring noise in a voice recognition application program, in which the system receives an audio signal representative of sound in an audio environment and processes it to identify certain non-speech sounds.
Abstract: A method and system for responding to randomly occurring noise in a voice recognition application program. The system receives an audio signal representative of sound in an audio environment and processes the audio signal to identify certain non-speech sounds. A pre-defined action is performed in response to the non-speech sound which has been identified. The pre-defined action is selected from the group consisting of disabling a microphone source of the audio signal, suspending further processing of the audio signal by the speech recognition system, executing a user-defined macro, and ignoring the sound. The system may perform additional steps including recording a sound which is to be identified as a non-speech sound and assigning one of the pre-defined actions to be performed in response when the non-speech sound has been identified.

132 citations

Journal ArticleDOI
TL;DR: The most effective estimation technique for packets containing 16 ms of speech in a pulse-code-modulation format is pitch waveform replication, which extends the acceptable ratio of missing packets to 10%.
Abstract: Missing packets are a major cause of impairment in packet voice networks. While it is easiest to allow these gaps in received speech to appear as silent intervals in reconstructed speech, speech quality is improved by filling the gaps with estimates of the transmitted waveform. Several estimation techniques have been investigated for packets containing 16 ms of speech in a pulse-code-modulation format. The simplest method, packet repetition, extends the acceptable ratio of missing packets from 2% to 5%. Here, acceptability is defined as a mean opinion score midway between fair and good on a five-point opinion scale. The most effective estimation technique (although not the most complex) is pitch waveform replication. It extends the acceptable ratio of missing packets to 10%.

131 citations
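
The two concealment strategies named in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the 8 kHz sampling rate, the 50 to 400 Hz pitch search range, the autocorrelation-based pitch estimate, and all function names are assumptions introduced here. A missing packet is represented as `None`.

```python
import math

SAMPLE_RATE = 8000                       # assumption: telephone-band sampling rate
PACKET_LEN = SAMPLE_RATE * 16 // 1000    # 16 ms packets (from the abstract) -> 128 samples
MIN_LAG, MAX_LAG = 20, 160               # assumption: pitch search range of 400 Hz down to 50 Hz


def conceal_by_repetition(packets):
    """Replace each missing packet (None) with a copy of the last received packet."""
    out, last = [], [0] * PACKET_LEN     # silence until a first packet arrives
    for p in packets:
        if p is not None:
            last = list(p)
        out.append(list(last))
    return out


def estimate_pitch_period(history):
    """Return the lag (in samples) with the highest normalized autocorrelation."""
    best_lag, best_score = MIN_LAG, float("-inf")
    for lag in range(MIN_LAG, min(MAX_LAG, len(history) // 2) + 1):
        a = history[-lag:]               # most recent `lag` samples
        b = history[-2 * lag:-lag]       # the `lag` samples before those
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b)) or 1.0
        if num / den > best_score:
            best_lag, best_score = lag, num / den
    return best_lag


def conceal_by_pitch_replication(packets):
    """Fill each missing packet by repeating the last estimated pitch cycle."""
    out, history = [], []
    for p in packets:
        if p is None:
            if len(history) >= 2 * MIN_LAG:
                period = estimate_pitch_period(history)
                cycle = history[-period:]
                p = [cycle[i % period] for i in range(PACKET_LEN)]
            else:
                p = [0] * PACKET_LEN     # not enough context: fall back to silence
        out.append(list(p))
        history = (history + list(p))[-2 * MAX_LAG:]   # keep only what the search needs
    return out
```

Repetition copies a whole packet regardless of where the pitch cycle falls, so it can introduce a phase jump at the packet boundary; repeating a single pitch cycle keeps the waveform periodic across the gap, which is one plausible reason for the higher tolerated loss rate reported above.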

Patent
Dipanjan Sen1
12 Jul 2013
TL;DR: Describes systems, methods, and apparatus for a unified approach to encoding different types of audio inputs.
Abstract: Systems, methods, and apparatus for a unified approach to encoding different types of audio inputs are described.

131 citations

Journal ArticleDOI
TL;DR: The audio coding standard developed by the Moving Picture Experts Group within the International Organization for Standardization (ISO/MPEG) is covered in some detail, since it will be used in many application areas, including digital storage, transmission, and broadcasting of audio-only signals and audiovisual applications such as videotelephony, videoconferencing, and TV broadcasting.
Abstract: Typical parameters of wideband speech and audio signals, including digitized versions of each, potential applications, and available transmission media, are described. Facts about human auditory perception that are exploited in audio coding and quality measures that play an important role in coder evaluations and designs are reviewed. Techniques for efficient coding of wideband speech and audio signals, with an emphasis on existing standards, are discussed. The audio coding standard developed by the Moving Picture Experts Group within the International Organization for Standardization (ISO/MPEG) is covered in some detail, since it will be used in many application areas, including digital storage, transmission, and broadcasting of audio-only signals and audiovisual applications such as videotelephony, videoconferencing, and TV broadcasting. Ongoing research and standardization work is outlined.

131 citations

Journal ArticleDOI
TL;DR: It is shown that this new method results in a substantial improvement in the intelligibility of speech in white noise over normal speech and over previously implemented methods.
Abstract: This paper presents the results of an examination of rapid amplitude compression following high-pass filtering as a method for processing speech, prior to reception by the listener, as a means of enhancing the intelligibility of speech in high noise levels. Arguments supporting this particular signal processing method are based on the results of previous perceptual studies of speech in noise. In these previous studies, it has been shown that high-pass filtered/clipped speech offers a significant gain in the intelligibility of speech in white noise over that for unprocessed speech at the same signal-to-noise ratios. Similar results have also been obtained for speech processed by high-pass filtering alone. The present paper explores these effects and proposes the use of high-pass filtering followed by rapid amplitude compression as a signal processing method for enhancing the intelligibility of speech in noise. It is shown that this new method results in a substantial improvement in the intelligibility of speech in white noise over normal speech and over previously implemented methods.

131 citations
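
The processing chain described in the abstract above (high-pass filtering followed by rapid amplitude compression) might look roughly like the sketch below. The abstract gives no filter order, cutoff, compression ratio, or time constant, so the first-order filter, the 1 kHz cutoff, the 4:1 ratio, the 5 ms release, and the function names are all assumptions chosen for illustration. Input samples are assumed to be normalized to the range -1 to 1.

```python
import math


def high_pass(x, cutoff_hz, sample_rate):
    """First-order RC high-pass filter: y[n] = alpha * (y[n-1] + x[n] - x[n-1])."""
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)
    alpha = rc / (rc + 1.0 / sample_rate)
    y, prev_x, prev_y = [], 0.0, 0.0
    for s in x:
        prev_y = alpha * (prev_y + s - prev_x)
        prev_x = s
        y.append(prev_y)
    return y


def compress(x, ratio=4.0, release_ms=5.0, sample_rate=8000, floor=1e-6):
    """Fast-acting compressor: instant attack, short release (assumed parameters)."""
    decay = math.exp(-1.0 / (release_ms * 1e-3 * sample_rate))
    env, y = 0.0, []
    for s in x:
        env = max(abs(s), env * decay)         # envelope follower
        level = max(env, floor)                # avoid raising zero to a negative power
        gain = level ** (1.0 / ratio - 1.0)    # maps envelope level to level ** (1 / ratio)
        y.append(s * gain)
    return y


def enhance(x, cutoff_hz=1000.0, sample_rate=8000):
    """High-pass filtering followed by rapid amplitude compression."""
    return compress(high_pass(x, cutoff_hz, sample_rate), sample_rate=sample_rate)
```

With these assumed settings, a full-scale tone passes through the compressor unchanged at its peak while a tone 20 dB quieter is raised to about 0.56 of full scale, which is the kind of level equalization between strong and weak speech segments that the abstract credits with improving intelligibility in noise.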


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  38
2022  84
2021  70
2020  62
2019  77
2018  108