Topic

Linear predictive coding

About: Linear predictive coding is a research topic. Over its lifetime, 6,565 publications have been published within this topic, receiving 142,991 citations. The topic is also known as LPC.


Papers
Proceedings Article, DOI
24 Aug 2009
TL;DR: A measure based on the similarity between the time-varying spectral envelopes of target speech and system output, as measured by correlation, can provide a more meaningful evaluation measure for nonlinear speech enhancement systems, as well as providing a transparent objective function for the optimization of such systems.
Abstract: Applying a binary mask to a pure noise signal can result in speech that is highly intelligible, despite the absence of any of the target speech signal. Therefore, to estimate the intelligibility benefit of highly nonlinear speech enhancement techniques, we contend that SNR is not useful; instead we propose a measure based on the similarity between the time-varying spectral envelopes of target speech and system output, as measured by correlation. As with previous correlation-based intelligibility measures, our system can broadly match subjective intelligibility for a range of enhanced signals. Our system, however, is notably simpler and we explain the practical motivation behind each stage. This measure, freely available as a small Matlab implementation, can provide a more meaningful evaluation measure for nonlinear speech enhancement systems, as well as providing a transparent objective function for the optimization of such systems.

43 citations
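
The envelope-correlation idea in the abstract above can be illustrated with a minimal sketch. This is not the authors' Matlab implementation: it assumes the time-varying spectral envelopes have already been computed as one list of frame values per frequency band, and the function names `pearson` and `envelope_similarity` are hypothetical. It only shows why a correlation score, unlike SNR, is unaffected by a change in output gain.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def envelope_similarity(target_env, output_env):
    """Mean per-band correlation between two envelope sets.

    Each argument is a list of bands; each band is a list of envelope
    values over time frames (assumed precomputed, e.g. from a filterbank).
    """
    scores = [pearson(t, o) for t, o in zip(target_env, output_env)]
    return sum(scores) / len(scores)


# Toy envelopes: two bands, four frames each.
target = [[1.0, 2.0, 3.0, 4.0], [2.0, 1.0, 2.0, 1.0]]
louder = [[2.0 * v for v in band] for band in target]
score = envelope_similarity(target, louder)  # gain change leaves the score near 1
```

Averaging the per-band correlations is one simple pooling choice made here for illustration; the paper's own stages may differ.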

DOI
02 Nov 2017
TL;DR: In this article, the authors propose the combined use of Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients, which express the basic speech features, to improve the reliability of a speech recognition system.
Abstract: The paper states the automatic speech recognition problem, the task of speech recognition, and its application fields. For Azerbaijani speech, the principles for building a speech recognition system and the problems arising in such a system are investigated. The algorithms for computing speech features, the main part of a speech recognition system, are analyzed. To this end, algorithms for determining the Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) coefficients that express the basic speech features are developed. Combined use of the MFCC and LPC cepstra is suggested to improve the reliability of the speech recognition system. The recognition system is divided into MFCC-based and LPC-based recognition subsystems. Training and recognition are carried out in each subsystem separately, and the recognition system accepts a decision only when both subsystems produce the same result. This decreases the error rate during recognition. Training and recognition are realized by artificial neural networks, which are trained by the conjugate gradient method. The paper investigates the problems related to the number of speech features when training the neural networks of the MFCC-based and LPC-based subsystems. The variation in the results of neural networks trained from different initial points is analyzed. A methodology for the combined use of neural networks trained from different initial points is suggested to improve the reliability and recognition quality of the system, and the practical results obtained are shown.

43 citations
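
The abstract above does not spell out how its LPC coefficients are computed. As a hedged illustration, the sketch below uses the textbook autocorrelation method with the Levinson-Durbin recursion, which is an assumption and not necessarily the authors' algorithm. The function names are hypothetical, and the convention is `a[0] == 1` with the prediction `x_hat[n] = -sum(a[k] * x[n - k])`.

```python
def autocorrelation(frame, max_lag):
    """Biased autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [
        sum(frame[i] * frame[i + lag] for i in range(n - lag))
        for lag in range(max_lag + 1)
    ]


def levinson_durbin(r, order):
    """Solve the LPC normal equations from autocorrelation values r.

    Returns (a, err): a[0] is 1.0, a[1..order] are the predictor
    coefficients, and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        if err <= 0.0:
            break  # silent or perfectly predictable frame; stop early
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient for this order
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= 1.0 - k * k
    return a, err


# Toy frame: a decaying exponential, which a first-order predictor fits exactly.
frame = [0.9 ** n for n in range(200)]
coeffs, residual_energy = levinson_durbin(autocorrelation(frame, 2), 2)
```

For this toy frame the first coefficient comes out close to -0.9 and the second close to zero, matching the signal's single-pole structure. A real front end would also window and pre-emphasize each frame, which is omitted here.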

Proceedings Article
01 Jan 2002
TL;DR: A voiced-unvoiced measure was combined with the standard Mel Frequency Cepstral Coefficients using linear discriminant analysis (LDA) to choose the most relevant features for continuous speech recognition.
Abstract: In this paper, a voiced-unvoiced measure is used as an acoustic feature for continuous speech recognition. The voiced-unvoiced measure was combined with the standard Mel Frequency Cepstral Coefficients (MFCC) using linear discriminant analysis (LDA) to choose the most relevant features. Experiments were performed on the SieTill corpus (German digit strings recorded over telephone lines) and on the SPINE corpus (English spontaneous speech under different simulated noisy environments). The additional voiced-unvoiced measure results in improvements in word error rate (WER) of up to 11% relative to using MFCC alone with the same overall number of parameters in the system.

43 citations

Journal Article, DOI
TL;DR: An algorithm which determines the optimal segmentation with respect to a cost function relating prediction error to modeling cost is presented, whereby the segmentation is implicitly computed while minimizing the modelization distortion for a given modelization cost.
Abstract: A common technique to extend linear prediction to nonstationary signals is time segmentation: the signal is split into small portions and the modelization is carried out locally. The accuracy of the analysis is, however, dependent on the window size and on the signal characteristics, so that the problem of finding a good segmentation is crucial to the entire modeling scheme. In this paper, we present an algorithm which determines the optimal segmentation with respect to a cost function relating prediction error to modeling cost. The proposed approach casts the problem in a rate/distortion (R/D) framework, whereby the segmentation is implicitly computed while minimizing the modelization distortion for a given modelization cost. The algorithm is implemented by means of dynamic programming and takes the form of a trellis-based Lagrangian minimization. The optimal linear predictor, when applied to speech coding, dramatically reduces the number of bits per second devoted to the modeling parameters in comparison to fixed-window schemes.

43 citations
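
The segmentation scheme in the abstract above minimizes distortion plus a Lagrangian-weighted modeling cost by dynamic programming. The sketch below shows that structure only. As loudly stated assumptions, each segment is modeled by its mean value rather than by a linear predictor, every segment is charged the same fixed cost `lam`, and the names `segment_cost` and `optimal_segmentation` are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Squared error of x[i:j] around its own mean, from prefix sums."""
    n = j - i
    s = prefix[j] - prefix[i]
    sq = prefix_sq[j] - prefix_sq[i]
    return sq - s * s / n


def optimal_segmentation(x, lam):
    """Return (segments, total_cost) minimizing distortion + lam * n_segments.

    segments is a list of (start, end) index pairs with end exclusive.
    """
    n = len(x)
    prefix, prefix_sq = [0.0], [0.0]
    for v in x:
        prefix.append(prefix[-1] + v)
        prefix_sq.append(prefix_sq[-1] + v * v)

    best = [0.0] + [float("inf")] * n  # best[j]: minimal cost of x[:j]
    back = [0] * (n + 1)               # back[j]: start of the last segment
    for j in range(1, n + 1):
        for i in range(j):
            cost = best[i] + segment_cost(prefix, prefix_sq, i, j) + lam
            if cost < best[j]:
                best[j] = cost
                back[j] = i

    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j]
    segments.reverse()
    return segments, best[n]


# Toy signal with one abrupt change in the middle.
signal = [0.0] * 5 + [10.0] * 5
fine, _ = optimal_segmentation(signal, lam=1.0)       # cheap segments: split at the jump
coarse, _ = optimal_segmentation(signal, lam=1000.0)  # costly segments: one segment
```

Raising `lam` makes each additional segment more expensive, so the same signal is covered with fewer, longer segments. This is the trade-off between modeling cost and prediction error that the abstract describes.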

Journal Article, DOI
TL;DR: A content-dependent watermarking scheme suitable for codebook-excited linear prediction (CELP)-based speech codecs that ensures the integrity of compressed speech data.
Abstract: As speech compression technologies have advanced, digital recording devices have become increasingly popular. However, data formats used in popular speech codecs are known a priori, such that compressed data can be modified easily via insertion, deletion, and replacement. This work proposes a content-dependent watermarking scheme suitable for codebook-excited linear prediction (CELP)-based speech codecs that ensures the integrity of compressed speech data. Speech data are initially partitioned into many groups, each of which includes multiple speech frames. The watermark embedded in each frame is then generated according to the line spectrum frequency (LSF) feature in the current frame, the pitch extracted from the succeeding frame, the watermark embedded in the preceding frame, and the group index, which is determined by the location of the current frame. Finally, some of the least significant bits (LSBs) of the indices indicating the excitation pulse positions or excitation vectors are replaced with the watermark. Conventional watermarking schemes can only detect whether compressed speech data are intact. They cannot determine where compressed speech data are altered by insertion, deletion, or replacement, whereas the proposed scheme can. Experiments established that the proposed scheme, used in the G.723.1 6.3 kb/s speech codec, embeds 12 bits in each 189-bit compressed speech frame and only decreases the perceptual evaluation of speech quality (PESQ) score by 0.11. Additionally, its accuracy in detecting the locations of attacked frames is very high, with only two normal frames mistaken as attacked frames. Therefore, the proposed watermarking scheme effectively ensures the integrity of compressed speech data.

43 citations
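
The final embedding step in the abstract above, replacing least significant bits of excitation indices with watermark bits, can be sketched as follows. This covers only the LSB substitution, not the paper's watermark generation from LSF, pitch, and group index. The function names, the plain-integer index representation, and the `lsb_count` parameter are all assumptions made for illustration.

```python
def embed_bits(indices, bits, lsb_count=1):
    """Replace the lowest lsb_count bits of each index with watermark bits.

    Watermark bits are consumed in order; missing bits are padded with 0.
    """
    mask = (1 << lsb_count) - 1
    out = []
    pos = 0
    for idx in indices:
        chunk = 0
        for _ in range(lsb_count):
            bit = bits[pos] if pos < len(bits) else 0
            chunk = (chunk << 1) | bit
            pos += 1
        out.append((idx & ~mask) | chunk)
    return out


def extract_bits(indices, n_bits, lsb_count=1):
    """Read back the first n_bits watermark bits from the index LSBs."""
    bits = []
    for idx in indices:
        for shift in range(lsb_count - 1, -1, -1):
            bits.append((idx >> shift) & 1)
    return bits[:n_bits]


# Toy frame: four hypothetical excitation indices carrying four watermark bits.
marked = embed_bits([12, 7, 30, 5], [1, 0, 1, 1])
recovered = extract_bits(marked, 4)
```

With one LSB per index, each index changes by at most 1, which is why the quality loss stays small. In the paper's G.723.1 setting the payload is 12 of the 189 bits in each frame, roughly 6.3% of the frame.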


Network Information
Related Topics (5)
Signal processing
73.4K papers, 983.5K citations
86% related
Noise
110.4K papers, 1.3M citations
81% related
Feature extraction
111.8K papers, 2.1M citations
81% related
Feature vector
48.8K papers, 954.4K citations
80% related
Filter (signal processing)
81.4K papers, 1M citations
79% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  9
2022  25
2021  26
2020  42
2019  25
2018  37