Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations. The topic is also known as LPC.

Papers

Journal Article•DOI•

Deep Learning for Acoustic Modeling in Parametric Speech Generation: A systematic review of existing techniques and future trends

Zhen-Hua Ling¹, Shiyin Kang², Heiga Zen³, Andrew W. Senior³, Mike Schuster³, Xiaojun Qian², Helen Meng², Li Deng⁴•Institutions (4)

University of Science and Technology of China¹, The Chinese University of Hong Kong², Google³, Microsoft⁴

02 Apr 2015-IEEE Signal Processing Magazine

TL;DR: This article reviews deep learning techniques for acoustic modeling in speech generation, where hidden Markov models (HMMs) and Gaussian mixture models (GMMs) have been the most common models for generating low-level speech waveforms from high-level symbolic inputs via intermediate acoustic feature sequences.

Abstract: Hidden Markov models (HMMs) and Gaussian mixture models (GMMs) are the two most common types of acoustic models used in statistical parametric approaches for generating low-level speech waveforms from high-level symbolic inputs via intermediate acoustic feature sequences. However, these models have their limitations in representing complex, nonlinear relationships between the speech generation inputs and the acoustic features. Inspired by the intrinsically hierarchical process of human speech production and by the successful application of deep neural networks (DNNs) to automatic speech recognition (ASR), deep learning techniques have also been applied successfully to speech generation, as reported in recent literature.

203 citations

Journal Article•DOI•

Regular-pulse excitation--A novel approach to effective and efficient multipulse coding of speech

P. Kroon¹, E.D. Deprettere², Robert Johannes Sluyter³•Institutions (3)

Bell Labs¹, Delft University of Technology², Philips³

01 Oct 1986-IEEE Transactions on Acoustics, Speech, and Signal Processing

TL;DR: Using the generalized baseband coder formulation, it is demonstrated that under reasonable assumptions concerning the weighting filter, an attractive low-complexity/high-quality coder can be obtained.

Abstract: This paper describes an effective and efficient time domain speech encoding technique that has an appealing low complexity, and produces toll quality speech at rates below 16 kbits/s. The proposed coder uses linear predictive techniques to remove the short-time correlation in the speech signal. The remaining (residual) information is then modeled by a low bit rate reduced excitation sequence that, when applied to the time-varying model filter, produces a signal that is "close" to the reference speech signal. The procedure for finding the optimal constrained excitation signal incorporates the solution of a few strongly coupled sets of linear equations and is of moderate complexity compared to competing coding systems such as adaptive transform coding and multipulse excitation coding. The paper describes the novel coding idea and the procedure for finding the excitation sequence. We then show that the coding procedure can be considered as an "optimized" baseband coder with spectral folding as high-frequency regeneration technique. The effect of various analysis parameters on the quality of the reconstructed speech is investigated using both objective and subjective tests. Further, modifications of the basic algorithm, and their impact on both the quality of the reconstructed speech signal and the complexity of the encoding algorithm, are discussed. Using the generalized baseband coder formulation, we demonstrate that under reasonable assumptions concerning the weighting filter, an attractive low-complexity/high-quality coder can be obtained.
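
The abstract's first step, using linear prediction to remove the short-time correlation and leave a residual, can be illustrated with a minimal sketch. It assumes the textbook autocorrelation method solved by the Levinson-Durbin recursion; the paper's own analysis settings and its regular-pulse excitation search are not reproduced here, and all function names are hypothetical.

```python
def autocorrelation(x, max_lag):
    """Return r[0..max_lag], where r[k] = sum over n of x[n] * x[n-k]."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations by the Levinson-Durbin recursion.

    The predictor is x_hat[n] = sum_{k=1..order} a[k] * x[n-k].
    Returns (a, err): a[0] is unused (0.0), err is the final
    prediction-error energy.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # signal is perfectly predictable (or all zero)
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err  # reflection coefficient for this order
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= 1.0 - k_i * k_i
    return a, err


def prediction_residual(x, a):
    """Residual e[n] = x[n] - x_hat[n]; samples before n=0 are taken as zero."""
    order = len(a) - 1
    return [x[n] - sum(a[k] * x[n - k] for k in range(1, min(order, n) + 1))
            for n in range(len(x))]


# Usage: a strongly correlated toy signal (assumed for illustration only).
signal = [0.9 ** n for n in range(200)]
coeffs, err = levinson_durbin(autocorrelation(signal, 2), 2)
res = prediction_residual(signal, coeffs)
```

For a signal this correlated, the residual carries far less energy than the input, which is what makes a low-rate excitation model of the residual plausible.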

202 citations

Journal Article•DOI•

Efficient search and design procedures for robust multi-stage VQ of LPC parameters for 4 kb/s speech coding

W.P. LeBlanc¹, B. Bhattacharya, S.A. Mahmoud, V. Cuperman•Institutions (1)

Carleton University¹

01 Oct 1993-IEEE Transactions on Speech and Audio Processing

TL;DR: It is shown experimentally that as the number of stages is increased above the optimal performance/complexity tradeoff, the quantizer robustness and outlier performance can be improved at the expense of a slight increase in rate.

Abstract: A tree-searched multistage vector quantization (VQ) scheme for linear prediction coding (LPC) parameters which achieves spectral distortion lower than 1 dB with low complexity and good robustness using rates as low as 22 b/frame is presented. The M-L search is used, and it is shown that it achieves performance close to that of the optimal search for a relatively small M. A joint codebook design strategy for multistage VQ which improves convergence speed and the VQ performance measures is presented. The best performance/complexity tradeoffs are obtained with relatively small size codebooks cascaded in a 3-6 stage configuration. It is shown experimentally that as the number of stages is increased above the optimal performance/complexity tradeoff, the quantizer robustness and outlier performance can be improved at the expense of a slight increase in rate. Results for log area ratio (LAR) and line spectral pairs (LSPs) parameters are presented. A training technique that reduces outliers at the expense of a slight average performance degradation is introduced. The method significantly outperforms the split codebook approach.
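
The M-L search over a multistage quantizer mentioned in the abstract can be sketched as follows. This is an illustrative sketch, not the paper's implementation: it assumes an additive multistage structure (the reconstruction is the sum of one codevector per stage), plain squared error as the distortion, and toy one-dimensional codebooks. The function names are hypothetical.

```python
def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def msvq_search(target, codebooks, m):
    """M-L tree search: keep the m best partial paths after every stage.

    codebooks is a list of stages; each stage is a list of codevectors.
    Returns (indices, reconstruction) for the best surviving path.
    """
    # Each survivor is (indices chosen so far, reconstruction so far).
    survivors = [((), [0.0] * len(target))]
    for codebook in codebooks:
        candidates = []
        for indices, recon in survivors:
            for j, code in enumerate(codebook):
                new_recon = [r + c for r, c in zip(recon, code)]
                candidates.append(
                    (sq_dist(target, new_recon), indices + (j,), new_recon))
        candidates.sort(key=lambda t: t[0])  # lowest distortion first
        survivors = [(idx, rec) for _, idx, rec in candidates[:m]]
    return survivors[0]


# Toy codebooks (assumed values) where a greedy search is suboptimal.
stages = [[[0.6], [1.5]], [[0.1], [-0.5]]]
greedy_indices, greedy_recon = msvq_search([1.0], stages, 1)
ml_indices, ml_recon = msvq_search([1.0], stages, 2)
```

With m = 1 the search is purely sequential and commits to the first-stage codevector that looks best in isolation; with m = 2 it keeps the runner-up alive and finds the combination that reconstructs the target exactly. This shows on a small scale why a modest M can approach the performance of the full search.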

201 citations

Journal Article•DOI•

Codebook driven short-term predictor parameter estimation for speech enhancement

Sriram Srinivasan, J. Samuelsson, W.B. Kleijn

01 Dec 2006-IEEE Transactions on Audio, Speech, and Language Processing

TL;DR: Experimental results show that the use of a priori information and the calculation of the instantaneous speech and noise excitation variances on a frame-by-frame basis result in good performance in both stationary and nonstationary noise conditions.

Abstract: In this paper, we present a new technique for the estimation of short-term linear predictive parameters of speech and noise from noisy data and their subsequent use in waveform enhancement schemes. The method exploits a priori information about speech and noise spectral shapes stored in trained codebooks, parameterized as linear predictive coefficients. The method also uses information about noise statistics estimated from the noisy observation. Maximum-likelihood estimates of the speech and noise short-term predictor parameters are obtained by searching for the combination of codebook entries that optimizes the likelihood. The estimation involves the computation of the excitation variances of the speech and noise auto-regressive models on a frame-by-frame basis, using the a priori information and the noisy observation. The high computational complexity resulting from a full search of the joint speech and noise codebooks is avoided through an iterative optimization procedure. We introduce a classified noise codebook scheme that uses different noise codebooks for different noise types. Experimental results show that the use of a priori information and the calculation of the instantaneous speech and noise excitation variances on a frame-by-frame basis result in good performance in both stationary and nonstationary noise conditions.

200 citations

Journal Article•DOI•

Vector quantization: A pattern-matching technique for speech coding

Allen Gersho¹, Vladimir Cuperman•Institutions (1)

University of California, Berkeley¹

01 Dec 1983-IEEE Communications Magazine

TL;DR: Recent results in waveform coding of speech with vector quantization are reviewed; vector quantization appears to be a suitable technique for the dual requirement of coding both speech and voiceband data at 16 kb/s.

Abstract: Vector quantization (VQ), a new direction in source coding, has recently emerged as a powerful and widely applicable coding technique. It was first applied to analysis/synthesis of speech, and has allowed Linear Predictive Coding (LPC) rates to be dramatically reduced to 800 b/s with very slight reduction in quality, and further compressed to rates as low as 150 b/s while retaining intelligibility [1,2]. More recently, the technique has found its way to waveform coding [3-5], where its applicability and effectiveness is less obvious and not widely known. There is currently a great need for a low-complexity speech coder at the rate of 16 kb/s which attains essentially “toll” quality, roughly equivalent to that of standard 64-kb/s log PCM codecs. Adaptive DPCM schemes can attain this quality with low complexity for the proposed 32 kb/s CCITT standard, but at 16 kb/s the quality of ADPCM or adaptive delta modulation schemes is inadequate. More powerful methods, such as subband coding or transform coding, are capable of producing acceptable speech quality at 16 kb/s but have a much higher implementation complexity. The difficulty is further compounded by the need for a scheme that can handle both speech and voiceband data at the 16 kb/s rate. These two types of waveforms occupy the same bandwidth in the subscriber loop part of the telephone network, yet they have a widely different statistical character. Effective speech coding at this rate must be geared to the specific character of speech and must exploit our knowledge of human hearing. On the other hand, a waveform that carries data must be coded and later reconstructed so that a modem can still extract the data with an acceptably low error rate. This is purely a signal processing operation not involving human perception. Vector quantization appears to be a suitable coding technique which caters to this dual requirement. VQ may become the key to 16 kb/s coding; it may also lead to improved quality waveform coding at 8 or 9.6 kb/s. In this paper, we review recent results obtained in waveform coding of speech with vector quantization and

198 citations

Network Information

Performance

Metrics

Papers: 6,598

Citations: 148,119

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	25
2021	26
2020	42
2019	25
2018	37