scispace - formally typeset
Search or ask a question
Topic

Speech coding

About: Speech coding is a research topic. Over the lifetime, 14245 publications have been published within this topic receiving 271964 citations.


Papers
More filters
PatentDOI
TL;DR: In this paper, each audio signal is digitized and then transformed into a predefined visual image, which is displayed in a 3D space, and selected audio characteristics, such as frequency, amplitude, time and spatial placement, are correlated to selected visual characteristics of the visual image.
Abstract: A method and apparatus for mixing audio signals. Each audio signal is digitized and then transformed into a predefined visual image, which is displayed in a three-dimensional space. Selected audio characteristics of the audio signal, such as frequency, amplitude, time and spatial placement, are correlated to selected visual characteristics of the visual image, such as size, location, texture, density and color. Dynamic changes or adjustment to any one of these parameters causes a corresponding change in the correlated parameter.

218 citations

Proceedings ArticleDOI
15 Mar 1999
TL;DR: The estimated q's are used to control both the gain and the update of the estimated noise spectrum during speech presence in a modified MMSE log-spectral amplitude estimator, which resulted in higher scores than for the IS-127 standard enhancement algorithm, when pre-processing noisy speech for a coding application.
Abstract: Speech enhancement algorithms which are based on estimating the short-time spectral amplitude of the clean speech have better performance when a soft-decision gain modification, depending on the a priori probability of speech absence, is used. In reported works a fixed probability, q, is assumed. Since speech is non-stationary and may not be present in every frequency bin when voiced, we propose a method for estimating distinct values of q for different bins which are tracked in time. The estimation is based on a decision-theoretic approach for setting a threshold in each bin followed by short-time averaging. The estimated q's are used to control both the gain and the update of the estimated noise spectrum during speech presence in a modified MMSE log-spectral amplitude estimator. Subjective tests resulted in higher scores than for the IS-127 standard enhancement algorithm, when pre-processing noisy speech for a coding application.

217 citations

Journal ArticleDOI
TL;DR: An information theory approach to the theory and practice of linear predictive coded speech compression systems is developed and it is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech.
Abstract: An information theory approach to the theory and practice of linear predictive coded (LPC) speech compression systems is developed. It is shown that a traditional LPC system can be viewed as a minimum distortion or nearest-neighbor system where the distortion measure is a minimum discrimination information between a speech process model and an observed frame of actual speech. This distortion measure is used in an algorithm for computer-aided design of block source codes subject to a fidelity criterion to obtain a 750-bits/s speech compression system that resembles an LPC system but has a much lower rate, a larger memory requirement, and requires no on-line LPC analysis. Quantitative and informal subjective comparisons are made among our system and LPC systems.

217 citations

Journal ArticleDOI
TL;DR: This work chronicles the development of rate-distortion theory and provides an overview of its influence on the practice of lossy source coding.
Abstract: Lossy coding of speech, high-quality audio, still images, and video is commonplace today. However, in 1948, few lossy compression systems were in service. Shannon introduced and developed the theory of source coding with a fidelity criterion, also called rate-distortion theory. For the first 25 years of its existence, rate-distortion theory had relatively little impact on the methods and systems actually used to compress real sources. Today, however, rate-distortion theoretic concepts are an important component of many lossy compression techniques and standards. We chronicle the development of rate-distortion theory and provide an overview of its influence on the practice of lossy source coding.

213 citations

Patent
19 Nov 2007
TL;DR: In this article, a speech recognition method was proposed to improve methods of performing speech recognition with barge-in by starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user input signal, selectively altering the synthesized speech, and recognizing the user choice.
Abstract: Embodiments of the present invention improve methods of performing speech recognition with barge-in. In one embodiment, the present invention includes a speech recognition method comprising starting a synthesis of recorded speech, receiving a user speech input signal providing information regarding a user choice, detecting an initial portion of the user speech input signal, selectively altering the synthesis of recorded speech, and recognizing the user choice.

213 citations


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Decoding methods
65.7K papers, 900K citations
84% related
Fading
55.4K papers, 1M citations
80% related
Feature vector
48.8K papers, 954.4K citations
80% related
Feature extraction
111.8K papers, 2.1M citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202338
202284
202170
202062
201977
2018108