scispace - formally typeset
Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.


Papers
Book
01 Jun 1993
TL;DR: This book covers the design of mobile satellite communication systems, including propagation, traffic capacity and access control, speech codec systems, error control coding, signalling systems, and non-geostationary satellites.
Abstract: Preface. 1. Intro to Satellite Communications. 2. Propagation. 3. Mobile Satellite System Design. 4. Traffic Capacity and Access Control. 5. Digital Model Design. 6. Speech Codec Systems. 7. Error Control Coding. 8. Signalling Systems. 9. Non-Geostationary Satellites. Index.

64 citations

Patent
27 Aug 2015
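TL;DR: This patent presents a multi-sourced noise suppression system in which distributed Internet of Things (IoT) audio devices supply audio streams that are weighted by quality, for example by signal-to-noise ratio, and processed to generate cleaned speech. The cleaned speech can be passed to a remote device for further processing such as automatic speech recognition.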
Abstract: Systems and methods for multi-sourced noise suppression are provided. An example system may receive streams of audio data including a voice signal and noise, the voice signal including a spoken word. The streams of audio data are provided by distributed audio devices. The system can assign weights to the audio streams based at least partially on quality of the audio streams. The weights of audio streams can be determined based on signal-to-noise ratios (SNRs). The system may further process, based on the weights, the audio stream to generate cleaned speech. Each audio device comprises microphone(s) and can be associated with the Internet of Things (IoT), such that the audio devices are Internet of Things devices. The processing can include noise suppression and reduction and echo cancellation. The cleaned speech can be provided to a remote device for further processing which may include Automatic Speech Recognition (ASR).

64 citations
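
The quality-based weighting described in the abstract above can be illustrated with a minimal sketch. It assumes the streams are already time-aligned and of equal length, and that weights are proportional to linear SNR; the patent abstract only says the weights are based at least partially on stream quality, so that proportionality and the function names `snr_db`, `snr_weights`, and `combine_streams` are hypothetical choices made for illustration.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in decibels from two power estimates."""
    return 10.0 * math.log10(signal_power / noise_power)


def snr_weights(snrs_db):
    """Map per-stream SNRs (dB) to weights that sum to 1.

    Assumption: weights are proportional to linear SNR. Other mappings
    from quality to weight would fit the abstract equally well.
    """
    linear = [10.0 ** (s / 10.0) for s in snrs_db]
    total = sum(linear)
    return [v / total for v in linear]


def combine_streams(streams, weights):
    """Weighted sample-by-sample sum of equal-length, time-aligned streams."""
    n = len(streams[0])
    return [sum(w * s[i] for w, s in zip(weights, streams)) for i in range(n)]


# Two hypothetical device streams: the first has the higher SNR,
# so it dominates the combined output.
weights = snr_weights([10.0, 0.0])
combined = combine_streams([[1.0, 2.0], [3.0, 4.0]], weights)
```

Normalizing the weights to sum to 1 keeps the combined signal at roughly the same level as the inputs; a real system would also need alignment, echo cancellation, and noise suppression stages, which this sketch omits.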

Patent
17 Sep 2014
TL;DR: This patent presents an automatic speech recognition (ASR) system that processes speech based on multiple channels of audio received from a beamformer, with each channel isolating audio from a particular direction.
Abstract: In an automatic speech recognition (ASR) processing system, ASR processing may be configured to process speech based on multiple channels of audio received from a beamformer. The ASR processing system may include a microphone array and the beamformer to output multiple channels of audio such that each channel isolates audio in a particular direction. The multichannel audio signals may include spoken utterances/speech from one or more speakers as well as undesired audio, such as noise from a household appliance. The ASR device may simultaneously perform speech recognition on the multi-channel audio to provide more accurate speech recognition results.

64 citations
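
As a rough illustration of a beamformer that outputs one channel per direction, the sketch below uses delay-and-sum with integer sample delays. Delay-and-sum is a swapped-in textbook technique; the abstract does not say which beamforming method the patent uses. The function names and the per-direction delay sets are hypothetical.

```python
def delay_and_sum(mic_signals, delays):
    """Average the microphone signals after advancing each by its delay.

    mic_signals: list of equal-length sample lists, one per microphone.
    delays: integer sample delay per microphone for one look direction.
    Samples shifted in from outside the signal are treated as zero.
    """
    n = len(mic_signals[0])
    out = []
    for t in range(n):
        acc = 0.0
        for sig, d in zip(mic_signals, delays):
            idx = t + d
            if 0 <= idx < n:
                acc += sig[idx]
        out.append(acc / len(mic_signals))
    return out


def beamform_channels(mic_signals, steering_delays):
    """Return one output channel per look direction."""
    return [delay_and_sum(mic_signals, d) for d in steering_delays]


# A pulse reaches microphone 1 one sample after microphone 0.
mics = [[0.0, 1.0, 0.0, 0.0],
        [0.0, 0.0, 1.0, 0.0]]
# First direction matches the source; second assumes no inter-mic delay.
channels = beamform_channels(mics, [[0, 1], [0, 0]])
```

The channel whose delays match the source direction adds the pulse coherently, while the mismatched channel smears it across two samples at half amplitude. A downstream recognizer could then run on each channel, as the abstract describes.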

Journal Article
Gautham J. Mysore
TL;DR: It is argued that the goal of enhancing speech content such as voice overs, podcasts, demo videos, lecture videos, and audio stories should not only be to make it sound cleaner, as would be done using traditional speech enhancement techniques, but to make it sound like it was recorded and produced in a professional recording studio.
Abstract: The goal of speech enhancement is typically to recover clean speech from noisy, reverberant, and often bandlimited speech in order to yield improved intelligibility, clarity, or automatic speech recognition performance. However, the acoustic goal for a great deal of speech content such as voice overs, podcasts, demo videos, lecture videos, and audio stories is often not merely clean speech, but speech that is aesthetically pleasing. This is achieved in professional recording studios by having a skilled sound engineer record clean speech in an acoustically treated room and then edit and process it with audio effects (which we refer to as production). A growing amount of speech content is being recorded on common consumer devices such as tablets, smartphones, and laptops. Moreover, it is typically recorded in common but non-acoustically treated environments such as homes and offices. We argue that the goal of enhancing such recordings should not only be to make it sound cleaner as would be done using traditional speech enhancement techniques, but to make it sound like it was recorded and produced in a professional recording studio. In this paper, we show why this can be beneficial, describe a new data set (a great deal of which was recorded in a professional recording studio) that we prepared to help in developing algorithms for this purpose, and discuss some insights and challenges associated with this problem.

64 citations

Proceedings Article
14 Dec 1997
TL;DR: It is shown that analysis of the modulation spectrum offers a means for the systematic evaluation of medium-term temporal dynamics of speech features, and the importance of syllabic-rate modulation spectral components for speech communication is demonstrated.
Abstract: The article questions the reliability of the short term spectral envelope as the dominant carrier of the phonetic identity of a given speech instant and suggests the temporal dynamics of components of the spectral envelopes as more reliable means for deriving the linguistic context of the speech message. It shows that analysis of the modulation spectrum offers a means for the systematic evaluation of medium term temporal dynamics of speech features. Such a medium term dynamic has been previously efficiently utilized in the computation of dynamic features and in RASTA processing. We aim for data driven analysis of the modulation spectrum and demonstrate the importance of syllabic rate modulation spectral components for speech communication.

64 citations
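
One simple way to picture a modulation spectrum is as the Fourier magnitude of a single feature's trajectory over time. The sketch below is only an illustration under stated assumptions, not the paper's data-driven procedure: the function name, the 100 frames-per-second rate, and the 4 Hz test modulation are all hypothetical values chosen for the example.

```python
import cmath
import math


def modulation_spectrum(trajectory, frame_rate_hz):
    """Return (frequencies_hz, magnitudes) for a feature trajectory.

    trajectory: one value per analysis frame, e.g. the log energy of a
    single spectral band. The mean is removed so the DC term does not
    dominate. Uses a direct DFT, which is fine for short sequences.
    """
    n = len(trajectory)
    mean = sum(trajectory) / n
    x = [v - mean for v in trajectory]
    freqs, mags = [], []
    for k in range(n // 2 + 1):
        coeff = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))
        freqs.append(k * frame_rate_hz / n)
        mags.append(abs(coeff) / n)
    return freqs, mags


# One second of a band-energy trajectory modulated at 4 Hz
# (assumed example value), sampled at an assumed 100 frames per second.
frame_rate = 100.0
traj = [math.sin(2.0 * math.pi * 4.0 * t / frame_rate) for t in range(100)]
freqs, mags = modulation_spectrum(traj, frame_rate)
peak_hz = freqs[mags.index(max(mags))]
```

Here `peak_hz` identifies the modulation frequency carrying the most energy in the trajectory. Repeating this per band over many utterances gives a picture of which modulation rates dominate a given feature.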


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Decoding methods: 65.7K papers, 900K citations, 84% related
Fading: 55.4K papers, 1M citations, 80% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Feature extraction: 111.8K papers, 2.1M citations, 80% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    38
2022    84
2021    70
2020    62
2019    77
2018    108