Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6565 publications have been published within this topic, receiving 142991 citations. The topic is also known as LPC.


Papers
Patent
05 Jun 2000
TL;DR: A method of estimating a confidence measure for a speech recognition system compares an input speech signal with a number of predetermined models of possible speech signals. The best scores, which indicate the degree of similarity between the input speech signal and each of the predetermined models, are used to determine a normalized variance, which serves as the Confidence Measure for deciding whether the input signal has been correctly recognized.
Abstract: A method of estimating a confidence measure for a speech recognition system involves comparing an input speech signal with a number of predetermined models of possible speech signals. Best scores indicating the degree of similarity between the input speech signal and each of the predetermined models are then used to determine a normalized variance, which is used as the Confidence Measure. In order to determine whether the input speech signal has been correctly recognized, the Confidence Measure is compared to a threshold value. The threshold value is weighted according to the Signal to Noise Ratio of the input speech signal and according to the number of predetermined models used.

79 citations
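
The confidence test described in this abstract can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how the variance is normalized or how the threshold is weighted, so the normalization (variance divided by the squared mean), the direction of the comparison, and the names `normalized_variance`, `is_accepted`, `snr_weight` and `model_weight` are all assumptions.

```python
from statistics import fmean, pvariance


def normalized_variance(best_scores):
    """Variance of the best scores divided by their squared mean.

    ASSUMPTION: the abstract only says "normalized variance"; this
    particular normalization is one plausible reading.
    """
    mean = fmean(best_scores)
    if mean == 0:
        raise ValueError("mean score is zero, cannot normalize")
    return pvariance(best_scores) / (mean * mean)


def is_accepted(best_scores, base_threshold, snr_weight=1.0, model_weight=1.0):
    """Compare the confidence measure with a weighted threshold.

    snr_weight and model_weight are hypothetical multipliers standing in
    for the weighting by signal-to-noise ratio and by number of models;
    the caller has to supply them. Treating a larger normalized variance
    as "more confident" is also an assumption.
    """
    threshold = base_threshold * snr_weight * model_weight
    return normalized_variance(best_scores) >= threshold
```

For example, `normalized_variance([1.0, 3.0])` evaluates to 0.25, so the result is accepted with `base_threshold=0.2` and rejected once a hypothetical `snr_weight=2.0` raises the effective threshold to 0.4.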

Journal ArticleDOI
TL;DR: Experimental results demonstrate the potential of compressed sensing in speech coding techniques, offering high perceptual quality with a very sparse approximated prediction residual.
Abstract: Encouraged by the promising application of compressed sensing in signal compression, we investigate its formulation and application in the context of speech coding based on sparse linear prediction. In particular, a compressed sensing method can be devised to compute a sparse approximation of speech in the residual domain when sparse linear prediction is involved. We compare the method of computing a sparse prediction residual with the optimal technique based on an exhaustive search of the possible nonzero locations and the well known Multi-Pulse Excitation, the first encoding technique to introduce the sparsity concept in speech coding. Experimental results demonstrate the potential of compressed sensing in speech coding techniques, offering high perceptual quality with a very sparse approximated prediction residual.

79 citations

Patent
01 Sep 2005
TL;DR: A method and apparatus for obtaining complete speech signals for speech recognition applications: an audio stream is continuously recorded as a sequence of frames to a circular buffer, and at least one speech endpoint is located using a Hidden Markov Model (HMM).
Abstract: The present invention relates to a method and apparatus for obtaining complete speech signals for speech recognition applications. In one embodiment, the method continuously records an audio stream comprising a sequence of frames to a circular buffer. When a user command to commence or terminate speech recognition is received, the method obtains a number of frames of the audio stream occurring before or after the user command in order to identify an augmented audio signal for speech recognition processing. In further embodiments, the method analyzes the augmented audio signal in order to locate starting and ending speech endpoints that bound at least a portion of speech to be processed for recognition. At least one of the speech endpoints is located using a Hidden Markov Model.

78 citations
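
The circular-buffer recording described in this abstract can be sketched as below. It is a minimal illustration under stated assumptions: the class name `PreRollRecorder` and its methods are hypothetical, frames are treated as opaque objects, and only the frames preceding the start command are recovered (frames after the stop command and the HMM-based endpoint search are left out).

```python
from collections import deque


class PreRollRecorder:
    """Keep the most recent frames so speech that begins before the
    user's start command is not lost (hypothetical helper)."""

    def __init__(self, pre_roll_frames):
        # A deque with maxlen discards the oldest frame automatically,
        # which gives circular-buffer behaviour.
        self._ring = deque(maxlen=pre_roll_frames)
        self._captured = None  # None while no recognition is in progress

    def push(self, frame):
        """Called for every incoming audio frame, whether or not capturing."""
        self._ring.append(frame)
        if self._captured is not None:
            self._captured.append(frame)

    def start(self):
        """User command to commence recognition: seed with buffered frames."""
        self._captured = list(self._ring)

    def stop(self):
        """User command to terminate: return the augmented frame sequence."""
        frames, self._captured = self._captured or [], None
        return frames
```

With `pre_roll_frames=2`, pushing frames 0 to 4, calling `start()`, pushing 5 and 6, and calling `stop()` returns frames 3, 4, 5 and 6: the two frames recorded before the command plus everything recorded after it.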

Journal ArticleDOI
TL;DR: The proposed method for time-delay estimation is found to perform better than the generalized cross-correlation (GCC) approach. A method for enhancement of speech is also proposed, using knowledge of the time-delay and information about the excitation source.
Abstract: In this paper, we present a method of extracting the time-delay between speech signals collected at two microphone locations. Time-delay estimation from microphone outputs is the first step for many sound localization algorithms, and also for enhancement of speech. For time-delay estimation, speech signals are normally processed using short-time spectral information (either magnitude or phase or both). The spectral features are affected by degradations in speech caused by noise and reverberation. Features corresponding to the excitation source of the speech production mechanism are robust to such degradations. We show that these source features can be extracted reliably from the speech signal. The time-delay estimate can be obtained using the features extracted even from short segments (50-100 ms) of speech from a pair of microphones. The proposed method for time-delay estimation is found to perform better than the generalized cross-correlation (GCC) approach. A method for enhancement of speech is also proposed using the knowledge of the time-delay and the information of the excitation source.

78 citations
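
One way to picture delay estimation from excitation-source information is to cross-correlate linear prediction residuals rather than the raw signals. The sketch below is a simplification and not the authors' algorithm: the abstract does not specify which source features are used, so the plain LP residual, the autocorrelation method with the Levinson-Durbin recursion, and all function names are assumptions.

```python
def autocorrelation(x, max_lag):
    """Unnormalized autocorrelation r[0..max_lag] of one frame."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations.

    Returns a[1..order] such that x[n] is predicted by sum(a[k] * x[n - k]).
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame: leave remaining coefficients at zero
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        prev = a[:]
        a[i] = k
        for j in range(1, i):
            a[j] = prev[j] - k * prev[i - j]
        err *= 1.0 - k * k
    return a[1:]


def lp_residual(x, coeffs):
    """Prediction error e[n] = x[n] - sum(a[k] * x[n - k]), zero history."""
    return [
        x[n] - sum(c * x[n - k] for k, c in enumerate(coeffs, start=1) if n - k >= 0)
        for n in range(len(x))
    ]


def best_lag(e1, e2, max_lag):
    """Lag (in samples) that maximizes the cross-correlation of e1 and e2.

    A positive result means the second signal lags the first.
    """
    def corr(lag):
        return sum(e1[n] * e2[n + lag] for n in range(len(e1)) if 0 <= n + lag < len(e2))

    return max(range(-max_lag, max_lag + 1), key=corr)


def estimate_delay(x1, x2, order=10, max_lag=32):
    """Delay between two microphone frames via their LP residuals."""
    e1 = lp_residual(x1, levinson_durbin(autocorrelation(x1, order), order))
    e2 = lp_residual(x2, levinson_durbin(autocorrelation(x2, order), order))
    return best_lag(e1, e2, max_lag)
```

The residual approximates the excitation, so its cross-correlation is less shaped by the vocal-tract spectrum than that of the raw signals. The default `order` and `max_lag` are illustrative values only; they would have to be chosen for the sampling rate and microphone spacing at hand.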

PatentDOI
Walter Kellermann
TL;DR: A speech processing arrangement has at least two microphones for supplying microphone signals formed by speech components and noise components to microphone signal branches that are coupled to an adder device used for forming a sum signal.
Abstract: A speech processing arrangement has at least two microphones for supplying microphone signals formed by speech components and noise components to microphone signal branches that are coupled to an adder device used for forming a sum signal. The microphone signals are delayed and weighted by weight factors in the microphone signal branches. The arrangement includes an evaluation circuit that a) receives the microphone signals, b) estimates the noise components, c) estimates the speech components by forming the difference between one of the microphone signals and the estimated noise component for this microphone signal, d) selects one of the microphone signals as a reference signal which contains a reference noise component and a reference speech component, e) forms speech signal ratios by dividing the estimated speech components by the estimated reference speech component, f) forms noise signal ratios by dividing the powers of the estimated noise components by the power of the estimated reference noise component, and g) determines the weight factors by dividing each speech signal ratio by the associated noise signal ratio. The signal-to-noise ratio corresponds to the ratio of the power of the speech component to the power of the noise component of the sum signal. Because the speech signals are correlated and noise signals are uncorrelated, the sum signal available on the output of the adder device has a reduced noise component yielding improved speech audibility. Real-time computation of the weight factors eliminates any annoying delay during a conversation held using the speech processing arrangement.

78 citations
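
Steps d) to g) of this abstract reduce to a short calculation once per-microphone estimates are available. The sketch below assumes one scalar speech-level estimate and one noise-power estimate per microphone for the current frame, and assumes the branch delays have already been applied; the function names and that data layout are illustrative, not taken from the patent.

```python
def weight_factors(speech_estimates, noise_powers, ref=0):
    """Weight per microphone = speech signal ratio / noise signal ratio.

    speech_estimates[i]: estimated speech component level of microphone i
    noise_powers[i]:     estimated noise power of microphone i
    ref:                 index of the microphone chosen as reference
    """
    s_ref = speech_estimates[ref]
    if s_ref == 0:
        raise ValueError("reference speech estimate must be nonzero")
    if any(p <= 0 for p in noise_powers):
        raise ValueError("noise powers must be positive")
    p_ref = noise_powers[ref]

    weights = []
    for s, p in zip(speech_estimates, noise_powers):
        speech_ratio = s / s_ref   # step e)
        noise_ratio = p / p_ref    # step f)
        weights.append(speech_ratio / noise_ratio)  # step g)
    return weights


def weighted_sum(frames, weights):
    """Sum signal of the adder: sample-wise sum of the weighted branches.

    ASSUMPTION: frames are already time-aligned (delays compensated).
    """
    return [sum(w * frame[n] for w, frame in zip(weights, frames))
            for n in range(len(frames[0]))]
```

With speech estimates `[1.0, 0.5]`, noise powers `[1.0, 2.0]` and microphone 0 as reference, the weights are `[1.0, 0.25]`: the second microphone carries half the speech level and twice the noise power, so it contributes a quarter as much to the sum.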


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Noise
110.4K papers, 1.3M citations
81% related
Feature extraction
111.8K papers, 2.1M citations
81% related
Feature vector
48.8K papers, 954.4K citations
80% related
Filter (signal processing)
81.4K papers, 1M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    9
2022    25
2021    26
2020    42
2019    25
2018    37