Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations. The topic is also known as LPC.


Papers
Proceedings Article
27 Nov 1989
TL;DR: The LD-VXC coder provides very good speech quality at 16 kb/s, moderate complexity, a delay of under 2 ms, and a gentle degradation of quality with transmission errors, and was submitted to the CCITT as a candidate for a future 16-kb/s speech coding standard.
Abstract: To attain a very-low-delay speech coder at 16 kb/s while maintaining a quality acceptable for the public switched telephone network, low delay vector excitation coding (LD-VXC) is introduced. Backward adaptation is used to track the spectral characteristics of the signal without requiring any buffering of the input speech, thereby allowing a very low delay to be achieved in an analysis-by-synthesis structure. The algorithm differs markedly from conventional VXC or CELP (code-excited linear prediction) coders due to the use of backward adaptive linear prediction for modeling the time-varying short- and long-term correlation of speech. The LD-VXC coder provides very good speech quality at 16 kb/s, moderate complexity, a delay of under 2 ms, and a gentle degradation of quality with transmission errors. The algorithm was submitted to the CCITT as a candidate for a future 16-kb/s speech coding standard.

31 citations
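
The backward adaptation described in the abstract above can be illustrated with a small sketch. This is not the LD-VXC algorithm: it swaps in a scalar residual quantizer and a normalised LMS predictor (both assumptions chosen for brevity) to show the one property the abstract relies on, namely that the predictor is updated only from previously reconstructed samples, so the decoder can repeat the same update without side information or input buffering. The function names, `order`, `step`, and `mu` are hypothetical.

```python
def _predict(w, hist):
    """Linear prediction from past reconstructed samples."""
    return sum(wi * hi for wi, hi in zip(w, hist))


def _adapt(w, hist, err, mu):
    """Normalised LMS update that uses only decoder-visible quantities."""
    norm = 1e-6 + sum(h * h for h in hist)
    return [wi + mu * err * hi / norm for wi, hi in zip(w, hist)]


def encode(x, order=4, step=0.05, mu=0.5):
    """Sample-by-sample encoder: emits one integer code per input sample."""
    w, hist, codes = [0.0] * order, [0.0] * order, []
    for s in x:
        pred = _predict(w, hist)
        q = round((s - pred) / step)   # quantised residual index (transmitted)
        codes.append(q)
        e = q * step                   # residual as the decoder will see it
        rec = pred + e                 # local reconstruction
        w = _adapt(w, hist, e, mu)     # backward adaptation: no lookahead
        hist = [rec] + hist[:-1]
    return codes


def decode(codes, order=4, step=0.05, mu=0.5):
    """Decoder mirrors the encoder's predictor state from the codes alone."""
    w, hist, out = [0.0] * order, [0.0] * order, []
    for q in codes:
        pred = _predict(w, hist)
        e = q * step
        rec = pred + e
        out.append(rec)
        w = _adapt(w, hist, e, mu)
        hist = [rec] + hist[:-1]
    return out
```

Because both sides run the same update on the same reconstructed history, no predictor coefficients are transmitted, and the only algorithmic delay in this toy version is a single sample.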

Journal Article
TL;DR: This work proposes a meta-heuristic feature selection (FS) method using a hybrid of Golden Ratio Optimization (GRO) and Equilibrium Optimization (EO) algorithms, named the Golden Ratio based Equilibrium Optimization (GREO) algorithm.
Abstract: Speech is the most important medium for expressing emotions for human beings. Thus, understanding a person's emotion from his/her speech by using the intelligence of computing devices has often been an area of interest. Traditional machine learning techniques are very popular for accomplishing such tasks. To provide a less expensive computational model for emotion classification through speech analysis, we propose a meta-heuristic feature selection (FS) method using a hybrid of Golden Ratio Optimization (GRO) and Equilibrium Optimization (EO) algorithms, which we have named the Golden Ratio based Equilibrium Optimization (GREO) algorithm. The features optimally selected by the model are fed to the XGBoost classifier. Linear Predictive Coding (LPC) and Linear Prediction Cepstral Coefficients (LPCC) based features are considered as the input here, and these are optimized by using the proposed GREO algorithm. We have achieved impressive recognition accuracies of 97.31% and 98.46% on two standard datasets, namely SAVEE and EmoDB, respectively. The proposed FS model is also found to perform better than its constituent algorithms as well as many well-known optimization algorithms used for FS in the past. Source code of the present work is made available at: https://github.com/arijitdey1/Hybrid-GREO .

31 citations
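
The abstract above names LPC and LPCC features without saying how they are computed. A common textbook route, sketched here under that assumption and not taken from the paper or its repository, is the autocorrelation method with the Levinson-Durbin recursion, followed by the standard LPC-to-cepstrum recursion for an all-pole model. The function names and the predictor sign convention (x[n] is approximated by the sum of a[k] * x[n-k]) are choices made for this sketch.

```python
def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of one (already windowed) frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns (a[1..order], residual energy)."""
    a = [0.0] * (order + 1)            # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                  # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)
    return a[1:], err


def lpc_to_lpcc(a, n_ceps):
    """Cepstral coefficients c[1..n_ceps] of the all-pole model 1 / (1 - sum a_k z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)           # c[0] (gain term) is not computed here
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]
```

A typical use would be `a, _ = levinson_durbin(autocorr(frame, 12), 12)` followed by `lpc_to_lpcc(a, 12)` per frame; the order 12 is only an illustrative value.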

Patent
28 Jun 2005
TL;DR: In this article, a method and system for enhancing the frequency response of speech signals is presented, where an average speech spectral shape estimate is calculated over time based on the input speech signal.
Abstract: A method and system for enhancing the frequency response of speech signals are provided. An average speech spectral shape estimate is calculated over time based on the input speech signal. The average speech spectral shape estimate may be calculated in the frequency domain using a first order IIR filtering or “leaky integrators.” Thus, the average speech spectral shape estimate adapts over time to changes in the acoustic characteristics of the voice path or any changes in the electrical audio path that may affect the frequency response of the system. A spectral correction factor may be determined by comparing the average speech spectral shape estimate to a desired target spectral shape. The spectral correction factor may be added (in units of dB) to the spectrum of the input speech signal in order to enhance or adjust the spectrum of the input speech signal toward the desired spectral shape, and an enhanced speech signal re-synthesized from the corrected spectrum.

31 citations
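
The steps in the patent abstract above (leaky-integrator averaging of the spectrum, comparison with a target shape, correction added in dB) can be sketched per frequency bin as follows. This is a minimal illustration, not the patented method: the smoothing constant `alpha`, the gain clamp `max_gain_db`, and all function names are assumptions, and the FFT analysis and re-synthesis stages are left out.

```python
def update_average(avg_db, frame_db, alpha=0.05):
    """First-order IIR ("leaky integrator") update of the average spectrum in dB."""
    return [(1.0 - alpha) * a + alpha * x for a, x in zip(avg_db, frame_db)]


def correction_db(avg_db, target_db, max_gain_db=12.0):
    """Per-bin correction = target - average, clamped (the clamp is an assumption)."""
    return [max(-max_gain_db, min(max_gain_db, t - a))
            for a, t in zip(avg_db, target_db)]


def apply_correction(frame_db, corr_db):
    """Add the correction (in dB) to the current frame's spectrum."""
    return [x + c for x, c in zip(frame_db, corr_db)]
```

A smaller `alpha` makes the average track slow changes in the voice or audio path while ignoring frame-to-frame variation in the speech itself.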

Proceedings Article
27 Nov 2001
TL;DR: An LDA-based method for extracting optimal feature sets from codec bitstreams is proposed and it is demonstrated that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs.
Abstract: Communication devices which perform distributed speech recognition (DSR) tasks currently transmit standardized coded parameters of speech signals. Recognition features are extracted from signals reconstructed using these on a remote server. Since reconstruction losses degrade recognition performance, proposals are being considered to standardize DSR-codecs which derive recognition features, to be transmitted and used directly for recognition. However, such a codec must be embedded on the transmitting device, along with its current standard codec. Performing recognition using codec bitstreams avoids these complications: no additional feature-extraction mechanism is required on the device, and there are no reconstruction losses on the server. We propose an LDA-based method for extracting optimal feature sets from codec bitstreams and demonstrate that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs. For GSM and CELP, we show that the performance is comparable to that with uncoded speech and standard DSR-codec features.

31 citations

Proceedings Article
01 Nov 2008
TL;DR: This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec.
Abstract: In the following paper, a method for the real-time conversion of whispers to normal phonated speech through a code excited linear prediction analysis-by-synthesis codec is discussed. This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec. Furthermore, since restoring pitch to whispered speech requires some considerations of quality and accuracy, spectral enhancements are required in terms of formant shifting (LSP modification) and pitch injection based on a voiced/unvoiced decision. Spectral shifting is accomplished through line-spectral pair adjustment. Implementing such methods by using the popular CELP codec allows integration of the technique with modern speech applications and devices. Subjective testing results are presented to determine the effectiveness of the technique.

31 citations


Network Information
Related Topics (5)
Signal processing: 73.4K papers, 983.5K citations, 86% related
Noise: 110.4K papers, 1.3M citations, 81% related
Feature extraction: 111.8K papers, 2.1M citations, 81% related
Feature vector: 48.8K papers, 954.4K citations, 80% related
Filter (signal processing): 81.4K papers, 1M citations, 79% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    25
2021    26
2020    42
2019    25
2018    37