Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations. The topic is also known as LPC.
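LPC models each speech sample as a linear combination of its immediately preceding samples. As a concrete illustration (a generic sketch, not drawn from any of the papers below), the classic autocorrelation method with the Levinson-Durbin recursion can be written as:

```python
import numpy as np

def lpc_coefficients(frame, order):
    """Estimate LPC coefficients for one speech frame using the
    autocorrelation method and the Levinson-Durbin recursion.
    Returns [1, a_1, ..., a_p] for the predictor
    x[t] ~= -(a_1*x[t-1] + ... + a_p*x[t-p])."""
    n = len(frame)
    # Biased autocorrelation estimates r[0..order]
    r = np.array([frame[:n - k] @ frame[k:] for k in range(order + 1)])
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]                      # prediction-error energy
    for i in range(1, order + 1):
        # Reflection (PARCOR) coefficient for stage i
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k          # error energy shrinks each stage
    return a
```

For a frame dominated by a first-order decay x[t] = 0.9**t, an order-1 fit recovers a_1 close to -0.9, i.e. the predictor x[t] ~ 0.9*x[t-1].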


Papers
PatentDOI
TL;DR: A method for encoding a speech signal into digital bits including the steps of dividing the speech signal into speech frames representing time intervals of the speech signal, determining voicing information for frequency bands of the speech frames, and determining spectral magnitudes representative of the magnitudes of the spectrum at determined frequencies across the frequency bands.
Abstract: A method for encoding a speech signal into digital bits including the steps of dividing the speech signal into speech frames representing time intervals of the speech signal, determining voicing information for frequency bands of the speech frames, and determining spectral magnitudes representative of the magnitudes of the spectrum at determined frequencies across the frequency bands. The method further includes quantizing and encoding the spectral magnitudes and the voicing information. The steps of determining, quantizing, and encoding the spectral magnitudes are performed in such a manner that spectral magnitudes independent of the voicing information are available for later synthesis.

57 citations

Journal ArticleDOI
TL;DR: A series of algorithms for silent and voiced/unvoiced/mixed excitation interval classification, pitch detection, formant estimation and formant tracking was developed, which can surpass the performance of single-channel (acoustic-signal-based) algorithms.
Abstract: The authors describe analysis and synthesis methods for improving the quality of speech produced by D.H. Klatt's (J. Acoust. Soc. Am., vol. 67, pp. 971-995, 1980) software formant synthesizer. Synthetic speech generated using an excitation waveform resembling the glottal volume-velocity was found to be perceptually preferred over speech synthesized using other types of excitation. In addition, listeners ranked speech tokens synthesized with an excitation waveform that simulated the effects of source-tract interaction higher in naturalness than tokens synthesized without such interaction. A series of algorithms for silent and voiced/unvoiced/mixed excitation interval classification, pitch detection, formant estimation, and formant tracking was developed. The algorithms can utilize two channels of input data, i.e., speech and electroglottographic signals, and can therefore surpass the performance of single-channel (acoustic-signal-based) algorithms. The formant synthesizer was used to study some aspects of the acoustic correlates of voice quality, e.g., male/female voice conversion and the simulation of breathiness, roughness, and vocal fry.

57 citations

Proceedings Article
01 Jan 1999
TL;DR: A formalism for data imputation based on the probability distributions of individual hidden Markov model states is presented; a potential advantage is that the imputed spectra can be passed to conventional back ends, such as cepstral features or artificial neural networks, for speech recognition.
Abstract: Within the context of continuous-density HMM speech recognition in noise, we report on imputation of missing time-frequency regions using emission state probability distributions. Criteria based on spectral subtraction and local signal-to-noise estimation are used to separate the present components from the missing ones. We consider two approaches to the problem of classification with missing data: marginalization and data imputation. A formalism for data imputation based on the probability distributions of individual hidden Markov model states is presented. We report on recognition experiments comparing state-based data imputation to marginalization in the context of connected digit recognition of speech mixed with factory noise at various global signal-to-noise ratios, and wideband restoration of speech. A potential advantage of the approach is that the imputed spectra can be passed to conventional back ends, such as cepstral features or artificial neural networks, for speech recognition.

57 citations

Journal ArticleDOI
TL;DR: New feature extraction methods, which utilize wavelet decomposition and reduced-order linear predictive coding (LPC) coefficients, have been proposed for speech recognition, and the experimental results show the superiority of the proposed techniques over conventional methods like linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, spectral subtraction, and cepstral mean normalization in the presence of additive white Gaussian noise.
Abstract: In this article, new feature extraction methods, which utilize wavelet decomposition and reduced-order linear predictive coding (LPC) coefficients, have been proposed for speech recognition. The coefficients have been derived from speech frames decomposed using the discrete wavelet transform. LPC coefficients derived from subband decomposition (abbreviated as WLPC) of a speech frame provide better representation than modeling the frame directly. The WLPC coefficients have been further normalized in the cepstrum domain to get a new set of features denoted as wavelet subband cepstral mean normalized features. The proposed approaches provide effective (better recognition rate), efficient (reduced feature vector dimension), and noise-robust features. The performance of these techniques has been evaluated on the TI-46 isolated word database and a self-created Marathi digits database in a white noise environment using the continuous-density hidden Markov model. The experimental results also show the superiority of the proposed techniques over conventional methods like linear predictive cepstral coefficients, Mel-frequency cepstral coefficients, spectral subtraction, and cepstral mean normalization in the presence of additive white Gaussian noise.

57 citations
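The WLPC idea in the abstract above, wavelet subband decomposition followed by reduced-order LPC, can be sketched roughly as follows. This is a hypothetical illustration, not the authors' implementation: it uses a single-level orthonormal Haar decomposition and solves the LPC normal equations directly, and the names `haar_dwt` and `subband_lpc` are invented here.

```python
import numpy as np

def haar_dwt(x):
    """One-level Haar wavelet decomposition into approximation
    and detail subbands (orthonormal Haar filters)."""
    x = x[:len(x) // 2 * 2]                    # trim to even length
    approx = (x[0::2] + x[1::2]) / np.sqrt(2)
    detail = (x[0::2] - x[1::2]) / np.sqrt(2)
    return approx, detail

def subband_lpc(frame, order):
    """WLPC-style features: a low-order LPC fit on each wavelet
    subband, solved from the autocorrelation normal equations."""
    feats = []
    for band in haar_dwt(frame):
        n = len(band)
        r = np.array([band[:n - k] @ band[k:] for k in range(order + 1)])
        # Toeplitz normal equations: sum_j a_j * r[|i-j|] = -r[i]
        R = np.array([[r[abs(i - j)] for j in range(order)]
                      for i in range(order)])
        a = np.linalg.solve(R, -r[1:])
        feats.extend(a)
    return np.array(feats)
```

The feature vector concatenates the predictor coefficients of both subbands, so its dimension is 2 * order, which reflects the paper's point about reduced feature-vector dimension relative to full-band modeling.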

Proceedings ArticleDOI
17 May 2004
TL;DR: An iterative tracking algorithm is described and evaluated that embeds both the prediction-residual training and the piecewise linearization design in an adaptive Kalman filtering framework and provides meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.
Abstract: A novel approach is developed for efficient and accurate tracking of vocal tract resonances, which are natural frequencies of the resonator from larynx to lips, in fluent speech. The tracking algorithm is based on a version of the structured speech model consisting of continuous-valued hidden dynamics and a piecewise-linearized prediction function from resonance frequencies and bandwidths to LPC cepstra. We present details of the piecewise linearization design process and an adaptive training technique for the parameters that characterize the prediction residuals. An iterative tracking algorithm is described and evaluated that embeds both the prediction-residual training and the piecewise linearization design in an adaptive Kalman filtering framework. Experiments on tracking vocal tract resonances in Switchboard speech data demonstrate high accuracy in the results, as well as the effectiveness of residual training embedded in the algorithm. Our approach differs from traditional formant trackers in that it provides meaningful results even during consonantal closures when the supra-laryngeal source may cause no spectral prominences in speech acoustics.

57 citations
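The adaptive Kalman filtering framework described above is considerably richer than can be shown here (continuous-valued hidden dynamics plus a piecewise-linearized observation model into LPC cepstra), but the predict/update cycle it builds on can be illustrated with a minimal scalar filter. The random-walk dynamics and the noise variances `q` and `r` below are assumptions for illustration only.

```python
import numpy as np

def kalman_track(observations, q=1.0, r=4.0):
    """Minimal scalar Kalman filter with a random-walk state model,
    smoothing a sequence of noisy frequency observations.
    q: process-noise variance, r: observation-noise variance."""
    x, p = observations[0], 1.0      # initial state estimate and variance
    track = [x]
    for z in observations[1:]:
        # Predict: random-walk dynamics x_t = x_{t-1} + w, w ~ N(0, q)
        p = p + q
        # Update with observation z = x + v, v ~ N(0, r)
        k = p / (p + r)              # Kalman gain
        x = x + k * (z - x)
        p = (1.0 - k) * p
        track.append(x)
    return np.array(track)
```

Each estimate is a convex combination of the prediction and the new observation, so the track stays within the range of the data while damping frame-to-frame jitter; the full tracker additionally re-linearizes the resonance-to-cepstrum mapping and retrains the residual statistics at each iteration.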


Network Information

Related Topics (5)
- Signal processing: 73.4K papers, 983.5K citations (86% related)
- Noise: 110.4K papers, 1.3M citations (81% related)
- Feature extraction: 111.8K papers, 2.1M citations (81% related)
- Feature vector: 48.8K papers, 954.4K citations (80% related)
- Filter (signal processing): 81.4K papers, 1M citations (79% related)
Performance Metrics

No. of papers in the topic in previous years:

Year  Papers
2023  9
2022  25
2021  26
2020  42
2019  25
2018  37