Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published within this topic, receiving 271,964 citations.

Papers

Patent•

Methods and apparatus for generating, updating and distributing speech recognition models

Craig L. Reding, Suzi Levas¹•Institutions (1)

Google¹

30 Dec 2011

TL;DR: A shared speech processing facility supports speech recognition for a wide variety of devices with limited capabilities, including business computer systems and personal data assistants, which are coupled to the speech processing facility via a communications channel, e.g., the Internet.

Abstract: Techniques for generating, distributing, and using speech recognition models are described. A shared speech processing facility is used to support speech recognition for a wide variety of devices with limited capabilities including business computer systems, personal data assistants, etc., which are coupled to the speech processing facility via a communications channel, e.g., the Internet. Devices with audio capture capability record and transmit to the speech processing facility, via the Internet, digitized speech and receive speech processing services, e.g., speech recognition model generation and/or speech recognition services, in response. The Internet is used to return speech recognition models and/or information identifying recognized words or phrases. Thus, the speech processing facility can be used to provide speech recognition capabilities to devices without such capabilities and/or to augment a device's speech processing capability. Voice dialing, telephone control and/or other services are provided by the speech processing facility in response to speech recognition results.

58 citations

Patent•

Audio signal processing method and apparatus

Mcgrath David S¹, Adam Richard Mckeag¹, Glenn Norman Dickens¹, Richard J. Cartwright¹, Andrew Peter Reilly¹•Institutions (1)

Dolby Laboratories¹

06 Jan 1999 - Journal of the Acoustical Society of America

TL;DR: In this article, the authors propose a method for processing a series of input audio signals representing virtual audio sound sources to produce a reduced set of audio output signals for playback over speaker devices placed around a listener.

Abstract: A method of processing a series of input audio signals representing a series of virtual audio sound sources placed at predetermined positions around a listener to produce a reduced set of audio output signals for playback over speaker devices placed around a listener, the method comprising the steps of:
(a) for each of the input audio signals and for each of the audio output signals: (i) convolving the input audio signals with an initial head portion of a corresponding impulse response mapping substantially the initial sound and early reflections for an impulse response of a corresponding virtual audio source to a corresponding speaker device so as to form a series of initial responses;
(b) for each of the input audio signals and for each of the audio output signals: (i) forming a combined mix from the audio input signals; and (ii) forming a combined convolution tail from the tails of the corresponding impulse responses; (iii) convolving the combined mix with the combined convolution tail to form a combined tail response;
(c) for each of the audio output signals: (i) combining a corresponding series of initial responses and a corresponding combined tail response to form the audio output signal.
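The head/tail split described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the function names (`convolve`, `add_into`, `render`), the `head_len` parameter, and the choice to form the combined tail as the plain average of the per-source tails are all assumptions made for illustration, since the abstract does not say how the tails are combined.

```python
def convolve(x, h):
    """Full linear convolution of two sequences (length len(x) + len(h) - 1)."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def add_into(acc, sig, offset=0):
    """Add sig into the list acc starting at offset, growing acc if needed."""
    need = offset + len(sig)
    if need > len(acc):
        acc.extend([0.0] * (need - len(acc)))
    for k, v in enumerate(sig):
        acc[offset + k] += v
    return acc


def render(inputs, irs, head_len):
    """Hypothetical helper: mix N virtual sources down to M speaker feeds.

    inputs   -- list of N input signals (lists of floats)
    irs      -- irs[n][m] is the impulse response from source n to speaker m
    head_len -- number of leading taps treated as the "head" (assumption)
    """
    n_out = len(irs[0])

    # Step (b)(i): one combined mix of all input signals.
    mix = []
    for x in inputs:
        add_into(mix, x)

    outputs = []
    for m in range(n_out):
        out = []
        # Step (a): each input gets its own short head convolution.
        for n, x in enumerate(inputs):
            add_into(out, convolve(x, irs[n][m][:head_len]))

        # Step (b)(ii): combined tail, here the average of the per-source
        # tails. This averaging rule is an assumption, not from the source.
        combined_tail = []
        for n in range(len(inputs)):
            tail = irs[n][m][head_len:]
            add_into(combined_tail, [v / len(inputs) for v in tail])

        # Steps (b)(iii) and (c): a single long convolution per output,
        # delayed by head_len samples and added to the head responses.
        if combined_tail:
            add_into(out, convolve(mix, combined_tail), offset=head_len)
        outputs.append(out)
    return outputs
```

The saving comes from running one long tail convolution per output instead of one per source. When every source shares the same tail, this sketch reproduces the full per-source convolution exactly; otherwise the late part of the response is only approximated.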

58 citations

Signal subspace methods for speech enhancement

Peter Søren Kirk Hansen¹•Institutions (1)

Technical University of Denmark¹

01 Jun 1998

TL;DR: This thesis focuses on the theory, analysis, and algorithm aspects of signal subspace methods used for speech enhancement in digital speech processing and illustrates the power and robustness of the subspace approach.

Abstract: This thesis focuses on the theory, analysis, and algorithm aspects of signal subspace methods used for speech enhancement in digital speech processing. The problem is approached by initially performing an analysis of subspace principles applied to speech signals in order to characterize the usefulness of defining a signal subspace for this application. The theory is formulated by means of the singular value decomposition or the eigendecomposition, and subspace methods are linked to filtering in the frequency domain. Nonparametric speech enhancement using linear signal subspace based estimation of the clean signal from the noisy signal is reviewed, and connections between existing algorithms and literature are explored. An analysis of the practical behavior of the estimators is given, and aspects regarding their performance in the case with prewhitening are covered. The relation to the popular spectral subtraction approach is discussed, and the origin of the musical noise is pointed out. A possible way to reduce the latter is devised. In the noisy case, model based estimation is a nonlinear problem which is normally solved by iterative techniques. However, a new idea based on multi-microphone inverse filtering is presented, where the solution is obtained by subspace methods. The algorithm aspects of signal subspace methods are discussed in terms of the rank-revealing ULV/ULLV decompositions, which are numerically stable and can be cheaply updated when a new data sample is present. The potential of the decompositions when applied to speech problems is analyzed, and different estimation strategies are suggested. Again, the practical behavior of the estimators is analyzed. A recursive ULLV algorithm for a so-called sliding window estimation is presented, which is new in its complete treatment and implementation. Many aspects of the algorithm are discussed in detail, and important considerations are pointed out. Both the ULV/ULLV algorithms and the subspace based enhancement algorithms are implemented in a Matlab toolbox. Throughout the thesis, the speech enhancement application illustrates the power and robustness of the subspace approach, and a number of illustrative examples are given.

58 citations

Proceedings Article•

Speech enhancement with sparse coding in learned dictionaries

Christian Sigg¹, Tomas Dikk¹, Joachim M. Buhmann¹•Institutions (1)

ETH Zurich¹

14 Mar 2010

TL;DR: This work presents a monaural speech enhancement method based on sparse coding of noisy speech signals in a composite dictionary, consisting of the concatenation of a speech and interferer dictionary, both being possibly over-complete.

...read moreread less

Abstract: The enhancement of speech degraded by non-stationary interferers is a highly relevant and difficult task of many signal processing applications. We present a monaural speech enhancement method based on sparse coding of noisy speech signals in a composite dictionary, consisting of the concatenation of a speech and interferer dictionary, both being possibly over-complete. The speech dictionary is learned off-line on a training corpus, while an environment specific interferer dictionary is learned on-line during speech pauses. Our approach optimizes the trade-off between source distortion and source confusion, and thus achieves significant improvements on objective quality measures like cepstral distance, in the speaker dependent and independent case, in several real-world environments and at low signal-to-noise ratios. Our enhancement method outperforms state-of-the-art methods like multi-band spectral subtraction and approaches based on vector quantization.
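The composite-dictionary idea can be sketched as follows: code a noisy frame over the concatenation of a speech dictionary and an interferer dictionary, then rebuild the speech estimate from the speech atoms only. This sketch substitutes plain matching pursuit as the sparse coder, which is not necessarily the algorithm the authors use. The function names, the `n_iter` parameter, and the treatment of a frame as a generic feature vector are assumptions, and dictionary learning itself is not shown.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(atom):
    """Scale an atom to unit Euclidean norm (atom assumed non-zero)."""
    norm = math.sqrt(dot(atom, atom))
    return [v / norm for v in atom]


def matching_pursuit(signal, atoms, n_iter):
    """Greedy sparse coding: repeatedly pick the atom most correlated
    with the residual and subtract its contribution."""
    residual = list(signal)
    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        corrs = [dot(residual, a) for a in atoms]
        k = max(range(len(atoms)), key=lambda i: abs(corrs[i]))
        if abs(corrs[k]) < 1e-12:
            break  # nothing left to explain
        coeffs[k] += corrs[k]
        residual = [r - corrs[k] * v for r, v in zip(residual, atoms[k])]
    return coeffs


def enhance(noisy, speech_atoms, interferer_atoms, n_iter=10):
    """Code the noisy frame in the composite dictionary, then keep only
    the part explained by speech atoms as the enhanced frame."""
    atoms = [normalize(a) for a in speech_atoms + interferer_atoms]
    coeffs = matching_pursuit(noisy, atoms, n_iter)
    n_speech = len(speech_atoms)
    estimate = [0.0] * len(noisy)
    for c, atom in zip(coeffs[:n_speech], atoms[:n_speech]):
        estimate = [e + c * v for e, v in zip(estimate, atom)]
    return estimate
```

Energy that the coder assigns to interferer atoms is simply dropped. How well this works depends on how incoherent the two dictionaries are, which relates to the source distortion versus source confusion trade-off the abstract mentions.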

58 citations

Patent•

Method and apparatus for generating a binaural audio signal

Dirk Jeroen Breebaart¹, Lars Falck Villemoes¹•Institutions (1)

Philips¹

30 Sep 2008

TL;DR: An apparatus generates a binaural audio signal from an M-channel downmix of an N-channel audio signal and its spatial parameter data, using a demultiplexer (401) and decoder (403), parameter conversion, a matrix processor, and a stereo filter.

Abstract: An apparatus for generating a binaural audio signal comprises a demultiplexer (401) and decoder (403) which receive audio data comprising an M-channel audio signal, which is a downmix of an N-channel audio signal, and spatial parameter data for upmixing the M-channel audio signal to the N-channel audio signal. A conversion processor (411) converts spatial parameters of the spatial parameter data into first binaural parameters in response to at least one binaural perceptual transfer function. A matrix processor (409) converts the M-channel audio signal into a first stereo signal in response to the first binaural parameters. A stereo filter (415, 417) generates the binaural audio signal by filtering the first stereo signal. The filter coefficients for the stereo filter are determined in response to the at least one binaural perceptual transfer function by a coefficient processor (419). The combination of parameter conversion/processing and filtering allows a high quality binaural signal to be generated with low complexity.
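The two signal-path stages named in this abstract, a matrix operation followed by a stereo filter, might be sketched as below. The sketch takes the matrix gains and filter coefficients as given inputs, because the abstract does not specify how the conversion processor and coefficient processor derive them. All function names, and the use of one FIR filter per ear, are illustrative assumptions.

```python
def apply_matrix(downmix, matrix):
    """Map an M-channel downmix to a stereo pair.

    downmix -- list of M channels, each a list of samples of equal length
    matrix  -- 2 x M gains (stand-in for the "first binaural parameters")
    """
    n_samples = len(downmix[0])
    return [
        [sum(row[m] * downmix[m][t] for m in range(len(downmix)))
         for t in range(n_samples)]
        for row in matrix
    ]


def fir(x, h):
    """Causal FIR filter; the output has the same length as the input."""
    return [
        sum(h[k] * x[t - k] for k in range(len(h)) if t - k >= 0)
        for t in range(len(x))
    ]


def binaural(downmix, matrix, left_fir, right_fir):
    """Matrix stage followed by a per-ear filter stage (assumed structure)."""
    left, right = apply_matrix(downmix, matrix)
    return fir(left, left_fir), fir(right, right_fir)
```

Running a small matrix multiply plus two short filters on the downmix, rather than upmixing to N channels and filtering each one, is one way to read the low-complexity claim in the abstract.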

58 citations

Network Information

Performance

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
