scispace - formally typeset
Topic

Linear predictive coding

About: Linear predictive coding (LPC) is a research topic. Over its lifetime, 6565 publications have been published within this topic, receiving 142991 citations.


Papers
Proceedings ArticleDOI
14 Apr 1991
TL;DR: This method uses a split vector quantizer to code spectral parameters at 18 to 22 b/frame without excessive memory requirements and exploits interframe redundancy by block coding the pitch, voicing, and energy parameters, and by selectively interpolating the spectral parameters.
Abstract: Pitch-excited linear predictive coding (LPC) is widely used to code speech at 2400 b/s. The authors describe a method of coding LPC parameters which reduces the required channel bandwidth from 2400 b/s to 600-800 b/s. This method uses a split vector quantizer to code spectral parameters at 18 to 22 b/frame without excessive memory requirements. It also exploits interframe redundancy by block coding the pitch, voicing, and energy parameters, and by selectively interpolating the spectral parameters. When operating at less than 700 b/s, the system achieved a score of 90 for a three-male-speaker DRT (diagnostic rhyme test).
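The split vector quantizer described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the codebooks here are random placeholders rather than trained ones, and the 3+3+4 split and 64-entry codebooks are assumptions chosen only so that 3 sub-vectors at 6 bits each land in the 18 b/frame range mentioned in the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

def split_vq(x, codebooks):
    """Quantize each sub-vector of x against its own codebook (nearest neighbour).

    Splitting a 10-dimensional spectral parameter vector into small
    sub-vectors keeps the codebooks tiny: three 64-entry codebooks
    (3 * 6 = 18 bits/frame) instead of one 2^18-entry codebook.
    """
    quantized, indices = [], []
    start = 0
    for cb in codebooks:
        dim = cb.shape[1]
        sub = x[start:start + dim]
        dists = np.sum((cb - sub) ** 2, axis=1)   # squared error to each codeword
        best = int(np.argmin(dists))
        indices.append(best)
        quantized.append(cb[best])
        start += dim
    return np.concatenate(quantized), indices

# Placeholder codebooks: 64 codewords each for sub-vectors of dims 3, 3, 4.
codebooks = [rng.normal(size=(64, d)) for d in (3, 3, 4)]
x = rng.normal(size=10)
xq, indices = split_vq(x, codebooks)
```

The frame is then transmitted as three 6-bit indices rather than the parameters themselves.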

34 citations

Proceedings ArticleDOI
08 Jun 1994
TL;DR: An adaptive coding system is described that adjusts the rate allocation according to actual channel conditions; results show that the objective and subjective speech quality of the adaptive coders is superior to that of their non-adaptive counterparts.
Abstract: Although mobile communication channels are time-varying, most systems allocate the combined rate between the speech coder and error correction coder according to a nominal channel condition. This generally leads to a pessimistic design and consequently an inefficient utilization of the available resources, such as bandwidth and power. This paper describes an adaptive coding system that adjusts the rate allocation according to actual channel conditions. Two types of variable rate speech coders are considered: embedded coders and multimode coders, both based on code excited linear prediction (CELP). The variable rate channel coders are based on rate compatible punctured convolutional codes (RCPC). A channel estimator is used at the receiver to track both the short term and the long term fading condition in the channel. The estimated channel state information is then used to vary the rate allocation between the speech and the channel coder on a frame by frame basis. This is achieved by sending an appropriate rate adjustment command through a feedback channel. Experimental results show that the objective and the subjective speech quality of the adaptive coders is superior to that of their non-adaptive counterparts. Improvements of up to 1.35 dB in SEGSNR of the speech signal and up to 0.9 in informal MOS for a combined rate of 12.8 kbit/s have been found. In addition, we found that the multimode coders perform better than their embedded counterparts.
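The frame-by-frame allocation idea can be sketched as below. This is a hypothetical illustration only: the SNR thresholds and the speech/channel rate splits are invented for the example and are not taken from the paper; only the 12.8 kbit/s combined rate comes from the abstract.

```python
# Sketch of adaptive rate allocation between a variable-rate speech coder
# and an RCPC-style channel coder, driven by estimated channel state.
# Thresholds and rates below are illustrative assumptions.

TOTAL_RATE = 12800  # combined rate in bit/s (12.8 kbit/s, as in the abstract)

def allocate(channel_snr_db):
    """Return (speech_rate, channel_rate) in bit/s for one frame.

    A worse channel gets a lower speech rate, freeing bits for
    stronger channel-code protection; the combined rate is fixed.
    """
    if channel_snr_db >= 15:      # good channel: light protection
        speech_rate = 9600
    elif channel_snr_db >= 8:     # moderate fading
        speech_rate = 8000
    else:                         # deep fade: strongest protection
        speech_rate = 6400
    return speech_rate, TOTAL_RATE - speech_rate
```

In the system described above, the receiver's channel estimator would drive this decision and feed the chosen mode back to the transmitter each frame.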

34 citations

Proceedings ArticleDOI
11 Apr 1988
TL;DR: If the number of VQ stages is increased sufficiently, MSVXC can be expressed as a form of transform coding, in which the computationally intensive excitation codebook search is completely eliminated.
Abstract: An approach to vector-excitation-coding (VXC) speech compression utilizing multiple-stage vector quantization (VQ) is considered. Called multiple-stage VXC (MSVXC), this technique facilitates the use of high-dimensional excitation vectors at medium-band rates without substantially increasing computation. The basic approach consists of successively approximating the input speech vector in several cascaded VQ stages, where the input vector for each stage is the quantization error vector from the preceding stage. It is shown that if the number of VQ stages is increased sufficiently, MSVXC can be expressed as a form of transform coding, in which the computationally intensive excitation codebook search is completely eliminated.
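The cascaded-stage idea above can be sketched as a greedy multistage quantizer: each stage codes the residual left by the previous one, and the reconstruction is the sum of one codeword per stage. The codebooks below are random placeholders (not trained ones); keeping a zero codeword in each stage, so a stage can "pass", is an assumption of this toy version.

```python
import numpy as np

rng = np.random.default_rng(1)

def msvq(x, stage_codebooks):
    """Greedy multiple-stage VQ: each stage quantizes the preceding
    stage's quantization error vector; the output is the running sum
    of the selected codewords."""
    residual = x.copy()
    recon = np.zeros_like(x)
    for cb in stage_codebooks:
        dists = np.sum((cb - residual) ** 2, axis=1)
        best = cb[int(np.argmin(dists))]
        recon = recon + best
        residual = residual - best    # the next stage codes what is left
    return recon

# Toy stage codebooks: each keeps a zero codeword, so the residual error
# can never grow from one stage to the next; later stages use smaller
# codewords to refine the approximation.
stages = [np.vstack([np.zeros(8), rng.normal(scale=1.0 / (s + 1), size=(31, 8))])
          for s in range(3)]
x = rng.normal(size=8)
xhat = msvq(x, stages)
```

Each extra stage adds only one small codebook search, which is why the technique scales to high-dimensional excitation vectors without the cost of a single huge codebook.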

34 citations

Journal ArticleDOI
TL;DR: The experiments presented here show that the analysis-synthesis technique based on GSS can produce speech comparable to that of a high-quality vocoder based on the spectral envelope representation, and it permits control over voice quality, namely transforming a modal voice into breathy or tense voices by modifying the glottal parameters.
Abstract: This paper proposes an analysis method to separate the glottal source and vocal tract components of speech, called Glottal Spectral Separation (GSS). This method can produce high-quality synthetic speech using an acoustic glottal source model. In the source-filter models commonly used in speech technology applications it is assumed that the source is a spectrally flat excitation signal and that the vocal tract filter can be represented by the spectral envelope of speech. Although this model can produce high-quality speech, it has limitations for voice transformation because it does not allow control over glottal parameters, which are correlated with voice quality. The main problem with using a speech model that better represents the glottal source and the vocal tract filter is that current analysis methods for separating these components are not robust enough to produce the same speech quality as a model based on the spectral envelope of speech. The proposed GSS method is an attempt to overcome this problem, and consists of the following three steps. Initially, the glottal source signal is estimated from the speech signal. Then, the speech spectrum is divided by the spectral envelope of the glottal source signal in order to remove the glottal source effects from the speech signal. Finally, the vocal tract transfer function is obtained by computing the spectral envelope of the resulting signal. In this work, the glottal source signal is represented using the Liljencrants-Fant model (LF-model). The experiments we present here show that the analysis-synthesis technique based on GSS can produce speech comparable to that of a high-quality vocoder based on the spectral envelope representation. However, it also permits control over voice quality, namely transforming a modal voice into breathy or tense voices by modifying the glottal parameters.
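The division step at the heart of GSS can be illustrated numerically. This is a toy sketch, not the paper's method: the "glottal source" here is random noise rather than an LF-model fit, and a short FIR filter stands in for the vocal tract, so that dividing the speech magnitude spectrum by the source magnitude spectrum recovers the filter response exactly under the source-filter model.

```python
import numpy as np

n = 256
rng = np.random.default_rng(2)

# Stand-ins: a random "glottal source" and a toy 3-tap vocal tract filter.
glottal = rng.normal(size=n)
H_tract = np.fft.rfft([1.0, -0.6, 0.3], n)      # vocal tract transfer function

# Source-filter model: speech = glottal source passed through the tract
# (implemented as circular convolution via the FFT so the model is exact).
G = np.fft.rfft(glottal)
speech = np.fft.irfft(G * H_tract, n)

# GSS division step: dividing the speech magnitude spectrum by the
# glottal source's magnitude spectrum removes the glottal contribution,
# leaving the vocal tract response.
S_mag = np.abs(np.fft.rfft(speech))
V_est = S_mag / np.maximum(np.abs(G), 1e-12)    # guard against tiny bins
```

In the actual method the divisor is the spectral envelope of an estimated LF-model source, and the vocal tract transfer function is taken as the spectral envelope of the quotient.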

34 citations

01 Apr 1998
TL;DR: In this article, a hierarchical structure of measuring normalizing blocks was proposed to compare perceptually transformed speech signals, and the resulting estimates of perceived speech quality were correlated with the results of nine subjective listening tests.
Abstract: Perceived speech quality is most directly measured by subjective listening tests. These tests are often slow and expensive, and numerous attempts have been made to supplement them with objective estimators of perceived speech quality. These attempts have found limited success, primarily in analog and higher-rate, error-free digital environments where speech waveforms are preserved or nearly preserved. How to objectively measure the perceived quality of highly compressed digital speech, possibly with bit errors or frame erasures, has remained an open question. We describe a new approach to this problem, using a simple but effective perceptual transformation, and a hierarchy of measuring normalizing blocks to compare perceptually transformed speech signals. The resulting estimates of perceived speech quality were correlated with the results of nine subjective listening tests. Together, these tests include 219 4-kHz bandwidth speech encoders/decoders, transmission systems, and reference conditions, with bit rates ranging from 2.4-64 kb/s. When compared with six other estimators, significant improvements were seen in many cases, particularly at lower bit rates, and when bit errors or frame erasures were present. These hierarchical structures of measuring normalizing blocks, or other structures of measuring normalizing blocks, may also address open issues in perceived audio quality estimation, layered speech or audio coding, automatic speech or speaker recognition, audio signal enhancement, and other areas.
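A single measuring normalizing block can be sketched as below. This is a rough toy, not the paper's design: the real system applies a perceptual transformation first and arranges such blocks in a hierarchy over multiple time and frequency scales, whereas this version uses one full-band energy block only.

```python
import numpy as np

def measuring_normalizing_block(reference, test, eps=1e-12):
    """Normalize the test signal's energy to the reference's and keep
    the gain that was needed as the block's measurement.

    The idea: after normalization the two signals can be compared at
    a finer scale by the next block, while the recorded gain captures
    how much they differed at this scale.
    """
    ref_energy = float(np.sum(reference ** 2))
    test_energy = float(np.sum(test ** 2))
    gain = np.sqrt(ref_energy / max(test_energy, eps))
    return gain * test, gain

rng = np.random.default_rng(3)
ref = rng.normal(size=512)        # stand-in for a perceptually transformed reference
deg = 0.5 * ref                   # "coded" signal: attenuated copy of the reference
normalized, gain = measuring_normalizing_block(ref, deg)
```

A hierarchy of such measurements, pooled across scales, would then be mapped to a quality estimate.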

34 citations


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Noise
110.4K papers, 1.3M citations
81% related
Feature extraction
111.8K papers, 2.1M citations
81% related
Feature vector
48.8K papers, 954.4K citations
80% related
Filter (signal processing)
81.4K papers, 1M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  9
2022  25
2021  26
2020  42
2019  25
2018  37