Topic

Linear predictive coding

About: Linear predictive coding (LPC) is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations.

Papers

Proceedings Article•DOI•

Backward adaptation for low delay vector excitation coding of speech at 16 kbit/s

Vladimir Cuperman¹, Allen Gersho, Robert Pettigrew, J.J. Shynk, J.-H. Yao•Institutions (1)

Simon Fraser University¹

27 Nov 1989

TL;DR: The LD-VXC coder provides very good speech quality at 16 kb/s, moderate complexity, a delay of under 2 ms, and a gentle degradation of quality with transmission errors, and was submitted to the CCITT as a candidate for a future 16-kb/s speech coding standard.

Abstract: To attain a very-low-delay speech coder at 16 kb/s while maintaining a quality acceptable for the public switched telephone network, low delay vector excitation coding (LD-VXC) is introduced. Backward adaptation is used to track the spectral characteristics of the signal without requiring any buffering of the input speech, thereby allowing a very low delay to be achieved in an analysis-by-synthesis structure. The algorithm differs markedly from conventional VXC or CELP (code-excited linear prediction) coders due to the use of backward adaptive linear prediction for modeling the time-varying short- and long-term correlation of speech. The LD-VXC coder provides very good speech quality at 16 kb/s, moderate complexity, a delay of under 2 ms, and a gentle degradation of quality with transmission errors. The algorithm was submitted to the CCITT as a candidate for a future 16-kb/s speech coding standard.

31 citations

Journal Article•DOI•

A Hybrid Meta-Heuristic Feature Selection Method Using Golden Ratio and Equilibrium Optimization Algorithms for Speech Emotion Recognition

Arijit Dey¹, Soham Chattopadhyay², Pawan Kumar Singh², Ali Ahmadian³, Massimiliano Ferrara⁴, Ram Sarkar²•Institutions (4)

Islamic Azad University¹, Jadavpur University², National University of Malaysia³, Mediterranea University of Reggio Calabria⁴

03 Nov 2020-IEEE Access

TL;DR: This work proposes a meta-heuristic feature selection (FS) method using a hybrid of Golden Ratio Optimization (GRO) and Equilibrium Optimization (EO) algorithms, named the Golden Ratio based Equilibrium Optimization (GREO) algorithm.

Abstract: Speech is the most important medium for expressing emotions for human beings. Thus, it has often been an area of interest to understand the emotion of a person from his/her speech by using the intelligence of computing devices. Traditional machine learning techniques are very popular for accomplishing such tasks. To provide a less expensive computational model for emotion classification through speech analysis, we propose a meta-heuristic feature selection (FS) method using a hybrid of Golden Ratio Optimization (GRO) and Equilibrium Optimization (EO) algorithms, which we have named the Golden Ratio based Equilibrium Optimization (GREO) algorithm. The features optimally selected by the model are fed to the XGBoost classifier. Linear Predictive Coding (LPC) and Linear Prediction Cepstral Coefficients (LPCC) based features are considered as the input here, and these are optimized by using the proposed GREO algorithm. We have achieved impressive recognition accuracies of 97.31% and 98.46% on two standard datasets, namely SAVEE and EmoDB, respectively. The proposed FS model is also found to perform better than its constituent algorithms as well as many well-known optimization algorithms used for FS in the past. Source code of the present work is made available at: https://github.com/arijitdey1/Hybrid-GREO

31 citations
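
The entry above takes LPC and LPCC features as classifier input. As a rough illustration of what such features are, the following sketch computes LPC coefficients with the textbook autocorrelation method (Levinson-Durbin recursion) and converts them to cepstral coefficients with the standard recursion. This is not the authors' feature-extraction pipeline: the function names, the sign convention (predictor x[n] ≈ Σ a[k]·x[n-k]), and the omission of windowing, pre-emphasis, and the gain term c₀ are all assumptions made for brevity.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one speech frame (no windowing)."""
    n = len(frame)
    return [sum(frame[i] * frame[i - lag] for i in range(lag, n))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r[0..order].

    Returns (a, err) where a[k-1] is the predictor coefficient for x[n-k]
    and err is the final prediction-error energy. Assumes r[0] > 0.
    """
    a = [0.0] * (order + 1)          # a[0] is unused padding
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[k] * r[i - k] for k in range(1, i))
        k_i = acc / err              # reflection coefficient for stage i
        new_a = a[:]
        new_a[i] = k_i
        for k in range(1, i):
            new_a[k] = a[k] - k_i * a[i - k]
        a = new_a
        err *= (1.0 - k_i * k_i)
    return a[1:], err


def lpc_to_cepstrum(a, n_ceps):
    """LPC cepstral coefficients c[1..n_ceps] for H(z) = 1 / (1 - sum a[k] z^-k)."""
    p = len(a)
    c = [0.0] * (n_ceps + 1)         # c[0] (gain term) is not computed here
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k] * a[n - k - 1]
        c[n] = acc
    return c[1:]


if __name__ == "__main__":
    # Hypothetical toy frame; real use would pass a windowed speech frame.
    frame = [0.0, 0.9, 0.4, -0.3, -0.7, -0.2, 0.5, 0.6]
    r = autocorrelation(frame, 4)
    lpc, err = levinson_durbin(r, 4)
    print(lpc, err, lpc_to_cepstrum(lpc, 8))
```

In a feature-selection setting like the one described, the per-frame coefficient vectors would typically be summarized or concatenated before being handed to the selector; how that is done is outside what the abstract states.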

Patent•

System and method for adaptive enhancement of speech signals

David Giesbrecht, Phillip Hetherington

28 Jun 2005

TL;DR: A method and system for enhancing the frequency response of speech signals is presented, in which an average speech spectral shape estimate is calculated over time from the input speech signal.

Abstract: A method and system for enhancing the frequency response of speech signals are provided. An average speech spectral shape estimate is calculated over time based on the input speech signal. The average speech spectral shape estimate may be calculated in the frequency domain using a first order IIR filtering or “leaky integrators.” Thus, the average speech spectral shape estimate adapts over time to changes in the acoustic characteristics of the voice path or any changes in the electrical audio path that may affect the frequency response of the system. A spectral correction factor may be determined by comparing the average speech spectral shape estimate to a desired target spectral shape. The spectral correction factor may be added (in units of dB) to the spectrum of the input speech signal in order to enhance or adjust the spectrum of the input speech signal toward the desired spectral shape, and an enhanced speech signal re-synthesized from the corrected spectrum.

31 citations
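
The patent abstract above describes three steps: a running average of the speech spectrum kept by a first-order IIR filter ("leaky integrator"), a correction factor obtained by comparing that average with a target shape, and addition of the correction in dB. A minimal sketch of those steps follows. The function names, the smoothing constant `alpha`, and the clamp `max_gain_db` are hypothetical choices for illustration, not values taken from the patent, and the spectra are assumed to be per-bin magnitudes already expressed in dB.

```python
def leaky_average(prev_avg_db, frame_db, alpha=0.1):
    """One leaky-integrator update per frequency bin.

    avg[n] = (1 - alpha) * avg[n-1] + alpha * x[n], a first-order IIR filter.
    alpha is an assumed smoothing constant in (0, 1].
    """
    return [(1.0 - alpha) * p + alpha * x for p, x in zip(prev_avg_db, frame_db)]


def correction_db(avg_db, target_db, max_gain_db=12.0):
    """Per-bin correction: target shape minus running average.

    The clamp to +/- max_gain_db is an added safety assumption.
    """
    return [max(-max_gain_db, min(max_gain_db, t - a))
            for a, t in zip(avg_db, target_db)]


def apply_correction(frame_db, corr_db):
    """Add the correction (in dB) to the current frame's spectrum."""
    return [x + c for x, c in zip(frame_db, corr_db)]


if __name__ == "__main__":
    target = [0.0, 0.0, 0.0]              # hypothetical flat target shape
    avg = [0.0, 0.0, 0.0]                 # running average, initialised flat
    frames = [[6.0, 0.0, -6.0]] * 50      # toy input with a tilted spectrum
    for frame in frames:
        avg = leaky_average(avg, frame)
        enhanced = apply_correction(frame, correction_db(avg, target))
    print(avg, enhanced)
```

Because the average adapts slowly, the correction tracks long-term spectral tilt rather than frame-to-frame variation; a real system would also need to update the average only during speech and resynthesize a time-domain signal from the corrected spectrum.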

Proceedings Article•DOI•

Distributed speech recognition with codec parameters

Bhiksha Raj¹, J. Migdal², R. Singh²•Institutions (2)

Mitsubishi¹, Mitsubishi Electric Research Laboratories²

27 Nov 2001

TL;DR: An LDA-based method for extracting optimal feature sets from codec bitstreams is proposed and it is demonstrated that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs.

Abstract: Communication devices which perform distributed speech recognition (DSR) tasks currently transmit standardized coded parameters of speech signals. Recognition features are extracted from signals reconstructed using these on a remote server. Since reconstruction losses degrade recognition performance, proposals are being considered to standardize DSR-codecs which derive recognition features, to be transmitted and used directly for recognition. However, such a codec must be embedded on the transmitting device, along with its current standard codec. Performing recognition using codec bitstreams avoids these complications: no additional feature-extraction mechanism is required on the device, and there are no reconstruction losses on the server. We propose an LDA-based method for extracting optimal feature sets from codec bitstreams and demonstrate that features so derived result in improved recognition performance for the LPC, GSM and CELP codecs. For GSM and CELP, we show that the performance is comparable to that with uncoded speech and standard DSR-codec features.

31 citations

Proceedings Article•DOI•

Analysis-by-synthesis method for whisper-speech reconstruction

Farzaneh Ahmadi¹, Ian McLoughlin¹, Hamid Sharifzadeh¹•Institutions (1)

Nanyang Technological University¹

01 Nov 2008

TL;DR: This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec.

Abstract: In the following paper, a method for the real-time conversion of whispers to normal phonated speech through a code excited linear prediction analysis-by-synthesis codec is discussed. This approach uses a template of a speaker's normal phonated speech for extraction of excitation parameters such as pitch and gain, and then injects these estimated excitations into the whispered signal to synthesize normal-sounding speech through the CELP codec. Furthermore, since restoring pitch to whispered speech requires some considerations of quality and accuracy, spectral enhancements are required in terms of formant shifting (LSP modification) and pitch injection based on a voiced/unvoiced decision. Spectral shifting is accomplished through line-spectral pair adjustment. Implementing such methods by using the popular CELP codec allows integration of the technique with any modern speech applications and devices. Subjective testing results are presented to determine the effectiveness of the technique.

31 citations

Network Information

Performance

Metrics

Papers: 6,598

Citations: 148,119

No. of papers in the topic in previous years
Year	Papers
2023	9
2022	25
2021	26
2020	42
2019	25
2018	37