Topic

Speech coding

About: Speech coding is a research topic. Over its lifetime, 14,245 publications have been published on this topic, receiving 271,964 citations.

Papers

Patent

Completed fixed codebook for speech encoder

Yang Gao¹•Institutions (1)

Conexant¹

24 Aug 1999

TL;DR: A multi-rate speech codec supports a number of encoding bit rate modes, adaptively selecting among them to match communication channel restrictions. To support the lower bit rate modes, a variety of techniques are applied, many of which involve classification of the input signal.

Abstract: A multi-rate speech codec supports a number of encoding bit rate modes by adaptively selecting encoding bit rate modes to match communication channel restrictions. In higher bit rate encoding modes, an accurate representation of speech, through CELP (code-excited linear prediction) and other associated modeling parameters, is generated for higher quality decoding and reproduction. To achieve high quality in lower bit rate encoding modes, the speech encoder departs from the strict waveform matching criteria of regular CELP coders and strives to identify significant perceptual features of the input signal. To support lower bit rate encoding modes, a variety of techniques are applied, many of which involve the classification of the input signal. For each of the bit rate modes selected, a number of fixed or innovation sub-codebooks are selected for use in generating innovation vectors.

77 citations

Proceedings Article

Review of algorithms for perceptual coding of digital audio signals

Ted Painter¹, Andreas Spanias•Institutions (1)

Arizona State University¹

01 Jan 1997

TL;DR: In this paper, a review of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio signals is presented, including algorithms which manipulate transform components and subband signal decompositions.

Abstract: Considerable research has been devoted to the development of algorithms for perceptually transparent coding of high-fidelity (CD-quality) digital audio. As a result, many algorithms have been proposed and several have now become international and/or commercial product standards. This paper reviews algorithms for perceptually transparent coding of CD-quality digital audio, including both research and standardization activities. First, psychoacoustic principles are described with the MPEG psychoacoustic signal analysis model 1 discussed in some detail. Then, we review methodologies which achieve perceptually transparent coding of FM- and CD-quality audio signals, including algorithms which manipulate transform components and subband signal decompositions. The discussion concentrates on architectures and applications of those techniques which utilize psychoacoustic models to exploit efficiently masking characteristics of the human receiver. Several algorithms which have become international and/or commercial standards are also presented, including the ISO/MPEG family and the Dolby AC-3 algorithms. The paper concludes with a brief discussion of future research directions.

77 citations

Proceedings Article

Bandwidth extension of narrowband speech for low bit-rate wideband coding

Jean-Marc Valin¹, Roch Lefebvre•Institutions (1)

Université de Sherbrooke¹

17 Sep 2000

TL;DR: An algorithm is presented that generates wideband speech from narrowband speech using as little as 500 bit/s of side information; the result has enhanced quality compared to narrowband speech.

Abstract: Wireless telephone speech is usually limited to the 300-3400 Hz band, which reduces its quality. There is thus a growing demand for wideband speech systems that transmit from 50 Hz to 8000 Hz. This paper presents an algorithm to generate wideband speech from narrowband speech using as low as 500 bit/s of side information. The 50-300 Hz band is predicted from the narrowband signal. A source-excitation model is used for the 3400-8000 Hz band, where the excitation is extrapolated at the receiver, and the spectral envelope is transmitted. Though some artifacts are present, the resulting wideband speech has enhanced quality compared to narrowband speech.

77 citations

Proceedings Article

High-quality 16 kb/s speech coding with a one-way delay less than 2 ms

Juin-Hwey Chen¹•Institutions (1)

Bell Labs¹

03 Apr 1990

TL;DR: A high-quality 16-kb/s speech coder with a one-way coding delay of less than 2 ms is presented; formal subjective tests indicate that it produces high-quality speech comparable to that of the CCITT G.721 32-kb/s ADPCM standard.

Abstract: A high-quality 16-kb/s speech coder which has a one-way coding delay of less than 2 ms is presented. The coder is basically a backward-adaptive version of the code-excited linear prediction (CELP) coder. The low coding delay is achieved by using a backward-adaptive predictor and gain and by using an excitation vector size as small as five samples. The pitch predictor in conventional CELP coders is eliminated, and the linear predictive coding (LPC) predictor order is increased from 10 to 50. The excitation gain is updated by a tenth-order adaptive logarithmic gain predictor. This log-gain predictor and the LPC predictor are updated by performing LPC analysis on previous log-gain and coded speech, respectively. The excitation codebook is closed-loop optimized and the codebook index is Gray-coded to improve the robustness against channel errors. Formal subjective tests indicate that this coder produces high-quality speech comparable to that of the CCITT G.721 32-kb/s ADPCM standard.
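
The bit budget implied by the five-sample excitation vector, and the Gray coding of the codebook index, can be illustrated with a short Python sketch. It assumes an 8 kHz sampling rate, which is typical for telephone-band speech but is not stated in the abstract, and it uses the standard binary-reflected Gray code; the abstract does not say which Gray mapping the coder uses. All function names are hypothetical.

```python
SAMPLE_RATE_HZ = 8000   # assumption: telephone-band sampling, not given in the abstract
BIT_RATE_BPS = 16000    # from the abstract: 16 kb/s
VECTOR_SIZE = 5         # from the abstract: excitation vectors as small as five samples


def bits_per_vector(bit_rate_bps, sample_rate_hz, vector_size):
    """Whole bits available to describe one excitation vector."""
    return bit_rate_bps * vector_size // sample_rate_hz


def vector_duration_ms(sample_rate_hz, vector_size):
    """Time span of one excitation vector in milliseconds."""
    return 1000.0 * vector_size / sample_rate_hz


def to_gray(index):
    """Binary-reflected Gray code of a non-negative integer."""
    return index ^ (index >> 1)


def from_gray(code):
    """Inverse of to_gray: XOR together all right shifts of the code."""
    index = 0
    while code:
        index ^= code
        code >>= 1
    return index


def hamming_distance(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    n_bits = bits_per_vector(BIT_RATE_BPS, SAMPLE_RATE_HZ, VECTOR_SIZE)
    print("bits per vector:", n_bits)
    print("vector duration (ms):", vector_duration_ms(SAMPLE_RATE_HZ, VECTOR_SIZE))
    for i in range(4):
        print(i, format(to_gray(i), "0{}b".format(n_bits)))
```

Under these assumptions each vector spans 0.625 ms and is described by 10 bits. Gray coding only guarantees that consecutive indices differ in a single bit; whether a single channel bit error then selects a perceptually similar codevector depends on how the codebook is ordered, which the abstract does not describe.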

77 citations

Journal Article

Impairment Factor Framework for Wide-Band Speech Codecs

Sebastian Möller¹, Alexander Raake¹, Nobuhiko Kitawaki, A. Takahashi, Marcel Wältermann•Institutions (1)

Technical University of Berlin¹

01 Nov 2006-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: A new method is described for quantifying the quality degradation introduced by wide-band speech codecs via a one-dimensional impairment factor, based on auditory listening-only tests, which may be used for predicting speech quality in an instrumental way.

Abstract: A new method is described for quantifying the quality degradation introduced by wide-band speech codecs via a one-dimensional impairment factor. The method is based on auditory listening-only tests, but the resulting impairment factors may be used for predicting speech quality in an instrumental way, e.g., for network planning purposes. Following the method, auditory test results are first transformed to an overall quality rating scale, and then adjusted to rule out test-specific effects. The derived impairment factors fit into the common framework which is defined by the E-model for narrow-band telephone networks, and which is hereby extended towards wide-band speech transmission. This paper presents the necessary auditory test data, describes the derivation and adjustment methodology, and provides numerical values for a range of wide-band speech codecs. The values are tested for their robustness in case of codec tandems and adjusted to represent the effects of packet loss.
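
The first step described above, transforming listening-test ratings to a rating scale and reading off an impairment factor, can be sketched in Python. This is a minimal illustration, not the paper's procedure: it uses the narrow-band MOS-to-R relation from ITU-T G.107 (defined for 0 < R < 100), inverts it numerically, and takes the impairment factor as the difference in R between a reference condition and the codec condition. The wide-band extension and the adjustment for test-specific effects are not reproduced, and the function names and example MOS values are hypothetical.

```python
def mos_from_r(r):
    """Narrow-band E-model mapping from rating R to estimated MOS (ITU-T G.107)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7.0e-6


def r_from_mos(mos, lo=6.5, hi=100.0, iterations=60):
    """Invert mos_from_r by bisection on [6.5, 100], where the mapping is increasing."""
    if not 1.0 <= mos <= 4.5:
        raise ValueError("MOS must lie in [1.0, 4.5]")
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mos_from_r(mid) < mos:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def impairment_factor(mos_reference, mos_codec):
    """Assumed simplification: impairment = R(reference) - R(codec)."""
    return r_from_mos(mos_reference) - r_from_mos(mos_codec)


if __name__ == "__main__":
    # Hypothetical listening-test results, for illustration only.
    mos_reference = 4.2
    mos_codec = 3.6
    print("R reference:", round(r_from_mos(mos_reference), 1))
    print("R codec:", round(r_from_mos(mos_codec), 1))
    print("impairment factor:", round(impairment_factor(mos_reference, mos_codec), 1))
```

Working on the R scale rather than directly on MOS is what lets impairment factors be combined additively within the E-model, where the rating is computed as a base value minus individual impairments.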

77 citations

Performance

Metrics

Papers: 14,368

Citations: 279,843

No. of papers in the topic in previous years
Year	Papers
2023	38
2022	84
2021	70
2020	62
2019	77
2018	108
