Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have appeared within this topic, receiving 271,964 citations.

Papers

Proceedings Article

Weighting schemes for audio-visual fusion in speech recognition

Hervé Glotin, D. Vergyri, Chalapathy Neti, Gerasimos Potamianos, Juergen Luettin

07 May 2001

TL;DR: An improvement in the state-of-the-art large vocabulary continuous speech recognition (LVCSR) performance is demonstrated by the use of visual information, in addition to the traditional audio one, by taking a decision fusion approach for the audio-visual information.

Abstract: We demonstrate an improvement in the state-of-the-art large vocabulary continuous speech recognition (LVCSR) performance, under clean and noisy conditions, by the use of visual information, in addition to the traditional audio one. We take a decision fusion approach for the audio-visual information, where the single-modality (audio- and visual- only) HMM classifiers are combined to recognize audio-visual speech. More specifically, we tackle the problem of estimating the appropriate combination weights for each of the modalities. Two different techniques are described: the first uses an automatically extracted estimate of the audio stream reliability in order to modify the weights for each modality (both clean and noisy audio results are reported), while the second is a discriminative model combination approach where weights on pre-defined model classes are optimized to minimize WER (clean audio only results).
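As a rough illustration of the decision fusion described in this abstract, the Python sketch below combines per-class log-likelihoods from an audio-only and a visual-only classifier using a single stream weight. The function names, the convention that the two weights sum to one, and the toy scores are assumptions made for illustration; the paper's audio reliability estimator and its discriminative weight training are not reproduced here.

```python
def fuse_log_likelihoods(audio_ll, visual_ll, audio_weight):
    """Weighted sum of per-class log-likelihoods from two streams.

    Assumption: the visual weight is 1 - audio_weight.
    """
    if not 0.0 <= audio_weight <= 1.0:
        raise ValueError("audio_weight must lie in [0, 1]")
    visual_weight = 1.0 - audio_weight
    return {
        label: audio_weight * audio_ll[label] + visual_weight * visual_ll[label]
        for label in audio_ll
    }


def recognize(audio_ll, visual_ll, audio_weight):
    """Return the class with the highest fused score."""
    fused = fuse_log_likelihoods(audio_ll, visual_ll, audio_weight)
    return max(fused, key=fused.get)


# Hypothetical scores: the noisy audio stream slightly prefers the wrong
# class, while the visual stream prefers the right one.
noisy_audio_scores = {"yes": -4.0, "no": -3.5}
visual_scores = {"yes": -1.0, "no": -3.0}

# Trusting the audio stream keeps its mistake; shifting weight to the
# visual stream changes the decision.
trusting_audio = recognize(noisy_audio_scores, visual_scores, audio_weight=0.9)
trusting_video = recognize(noisy_audio_scores, visual_scores, audio_weight=0.3)
```

In a scheme of this kind, the audio weight would be lowered when the audio stream is judged unreliable; how that reliability is estimated is left out of the sketch.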

95 citations

Journal Article

Linear prediction based packet loss concealment algorithm for PCM coded speech

E. Gunduzhan¹, K. Momtahan¹

Nortel¹

01 Nov 2001 - IEEE Transactions on Speech and Audio Processing

TL;DR: A high performance packet loss concealment algorithm for pulse code modulation (PCM) coded speech that extracts the residual signal of the previously received speech by linear prediction analysis, uses periodic replication to generate an approximation for the excitation signal of missing speech, and generates synthesized speech using this excitation.

Abstract: One of the well-known problems in real-time packetized voice applications is the degradation in voice quality due to delayed or misrouted packets. When a voice packet does not arrive at the receiver on time, the receiver needs a packet loss concealment algorithm to generate a signal instead of the missing voice segment. In this paper we describe a high performance packet loss concealment algorithm for pulse code modulation (PCM) coded speech. The algorithm extracts the residual signal of the previously received speech by linear prediction analysis, uses periodic replication to generate an approximation for the excitation signal of missing speech, and generates synthesized speech using this excitation. It also performs overlap-and-adding and scaling operations to smooth out transitions at frame boundaries. The new algorithm is compared to other algorithms by subjective quality tests, and is found to be better than the existing algorithms in some cases.
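The concealment steps named in this abstract (linear prediction analysis, residual extraction, periodic replication of the excitation, synthesis) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' algorithm: the predictor order, the pitch search range, the autocorrelation-based pitch estimate, and the toy test signal are all illustrative choices, and the overlap-add and scaling steps are omitted.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of a finite sequence."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson(r, order):
    """Levinson-Durbin recursion; returns a with x[n] ~ sum(a[k] * x[n-1-k])."""
    a = [0.0] * (order + 1)          # a[0] unused, a[i] multiplies x[n-i]
    err = r[0]
    for i in range(1, order + 1):
        if err <= 1e-12:             # silent or perfectly predicted input
            break
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, a):
    """Prediction error e[n] = x[n] - sum(a[k] * x[n-1-k]), zero history."""
    return [x[n] - sum(a[k] * x[n - 1 - k]
                       for k in range(len(a)) if n - 1 - k >= 0)
            for n in range(len(x))]


def estimate_pitch(residual, min_lag, max_lag):
    """Lag with the largest residual autocorrelation (assumed estimator)."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(residual[n] * residual[n - lag]
                    for n in range(lag, len(residual)))
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def conceal_frame(history, frame_len, order=4, min_lag=20, max_lag=120):
    """Synthesize a replacement for one missing frame from past samples."""
    a = levinson(autocorr(history, order), order)
    residual = lp_residual(history, a)
    pitch = estimate_pitch(residual, min_lag, max_lag)
    last_period = residual[-pitch:]
    # Periodic replication of the most recent pitch period of the residual.
    excitation = [last_period[n % pitch] for n in range(frame_len)]
    memory = list(history[-order:])  # filter state, oldest sample first
    out = []
    for e in excitation:
        y = e + sum(a[k] * memory[-1 - k] for k in range(order))
        out.append(y)
        memory.append(y)
    return out


def toy_voiced(n, period=50):
    """Hypothetical test signal: impulse train through a two-pole resonator."""
    y = []
    for i in range(n):
        e = 1.0 if i % period == 0 else 0.0
        y1 = y[-1] if i >= 1 else 0.0
        y2 = y[-2] if i >= 2 else 0.0
        y.append(e + 1.3 * y1 - 0.8 * y2)
    return y


history = toy_voiced(400)            # previously received samples
concealed = conceal_frame(history, frame_len=80)
```

A practical implementation would also attenuate the output over consecutive lost frames and cross-fade into the next received frame, as the abstract indicates.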

95 citations

Patent

Audio encoder, audio decoder and audio processor having a dynamically variable warping characteristic

Juergen Herre, Bernhard Grill, Markus Multrus, Stefan Bayer, Ulrich Kraemer, Jens Hirschfeld, Stefan Wabnik, Gerald Schuller

30 Jun 2006

TL;DR: A filter with a variable warping characteristic is controlled by a time-varying control signal that depends on the audio signal; the filtered audio signal can be passed to an encoding processor with several encoding algorithms, one of which is adapted to a specific signal pattern.

Abstract: An audio encoder, an audio decoder or an audio processor includes a filter for generating a filtered audio signal, the filter having a variable warping characteristic, the characteristic being controllable in response to a time-varying control signal, the control signal indicating a small or no warping characteristic or a comparatively high warping characteristic. Furthermore, a controller is connected for providing the time-varying control signal, which depends on the audio signal. The filtered audio signal can be introduced to an encoding processor having different encoding algorithms, one of which is a coding algorithm adapted to a specific signal pattern. Alternatively, the filter is a post-filter receiving a decoded audio signal.

95 citations

Proceedings Article

Iterative source-channel decoder using extrinsic information from softbit-source decoding

Marc Adrat¹, Peter Vary, J. Spittka

RWTH Aachen University¹

07 May 2001

TL;DR: This contribution deals with an iterative source-channel decoding approach where a simple channel decoder and a softbit-source decoder are concatenated, and derives a new formula that shows how the residual redundancy transforms into extrinsic information utilizable for iterative decoding.

Abstract: In digital mobile communications, efficient compression algorithms are needed to encode speech or audio signals. As the determined source parameters are highly sensitive to transmission errors, robust source and channel decoding schemes are required. This contribution deals with an iterative source-channel decoding approach where a simple channel decoder and a softbit-source decoder are concatenated. We mainly focus on softbit-source decoding which can be considered as an error concealment technique. This technique utilizes residual redundancy remaining after source coding. We derive a new formula that shows how the residual redundancy transforms into extrinsic information utilizable for iterative decoding. The derived formula opens several starting points for optimizations, e.g. it helps to find a robust index assignment. Furthermore, it allows the conclusion that softbit-source decoding is the limiting factor if applied to iterative decoding processes. Therefore, no significant gain will be obtainable by more than two iterations. This will be demonstrated by simulation.

94 citations

Proceedings Article

Frame level noise classification in mobile environments

Khaled Helmi El-Maleh¹, A. Samouelian, Peter Kabal

McGill University¹

15 Mar 1999

TL;DR: The experimental results show that the line spectral frequencies (LSFs) are robust features in distinguishing the different classes of noises.

Abstract: Background environmental noises degrade the performance of speech-processing systems (e.g. speech coding, speech recognition). By modifying the processing according to the type of background noise, the performance can be enhanced. This requires noise classification. In this paper, four pattern-recognition frameworks have been used to design noise classification algorithms. Classification is done on a frame-by-frame basis (e.g. once every 20 ms). Five commonly encountered noises in mobile telephony (i.e. car, street, babble, factory, and bus) have been considered in our study. Our experimental results show that the line spectral frequencies (LSFs) are robust features in distinguishing the different classes of noises.
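To make the frame-by-frame idea in this abstract concrete, the Python sketch below cuts a signal into 20 ms frames and labels each frame with a nearest-centroid classifier. Several things here are assumptions and not taken from the paper: the classifier is a simple stand-in for the four unspecified pattern-recognition frameworks, the features are log energy and zero-crossing rate instead of line spectral frequencies, and the two classes ("hum" and "hiss") are toy tones, not the five mobile noise types studied.

```python
import math


def split_frames(samples, sample_rate, frame_ms=20):
    """Cut a signal into non-overlapping frames of frame_ms milliseconds."""
    n = sample_rate * frame_ms // 1000
    return [samples[i:i + n] for i in range(0, len(samples) - n + 1, n)]


def frame_features(frame):
    """Stand-in features (not LSFs): log energy and zero-crossing rate."""
    energy = sum(s * s for s in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return (math.log10(energy + 1e-12), crossings / (len(frame) - 1))


def train_centroids(labelled_frames):
    """Mean feature vector per class; labelled_frames maps label -> frames."""
    centroids = {}
    for label, frames in labelled_frames.items():
        feats = [frame_features(f) for f in frames]
        centroids[label] = tuple(sum(col) / len(feats) for col in zip(*feats))
    return centroids


def classify_frame(frame, centroids):
    """Label of the centroid closest to the frame's features."""
    feat = frame_features(frame)
    return min(centroids,
               key=lambda lab: sum((x - y) ** 2
                                   for x, y in zip(feat, centroids[lab])))


def tone(freq_hz, n, sample_rate):
    """Hypothetical test signal: a pure sine tone."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


RATE = 8000  # assumed telephony sampling rate
centroids = train_centroids({
    "hum": split_frames(tone(100, RATE, RATE), RATE),
    "hiss": split_frames(tone(3000, RATE, RATE), RATE),
})

# One decision per 20 ms frame of unseen signals.
hum_labels = [classify_frame(f, centroids)
              for f in split_frames(tone(150, 800, RATE), RATE)]
hiss_labels = [classify_frame(f, centroids)
               for f in split_frames(tone(2500, 800, RATE), RATE)]
```

Replacing `frame_features` with an LSF extractor would bring the sketch closer to the features the abstract reports as robust.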

94 citations

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
