Book

Digital Speech: Coding for Low Bit Rate Communication Systems

TL;DR: A detailed account of the most recently developed digital speech coders designed specifically for use in the evolving communications systems, including an in-depth examination of the important topic of code excited linear prediction (CELP).
Abstract: From the Publisher: A detailed account of the most recently developed digital speech coders designed specifically for use in the evolving communications systems. Discusses the variety of speech coders utilized with such new systems as MBE INMARSAT-M. Includes an in-depth examination of the important topic of code excited linear prediction (CELP).
Citations
Journal ArticleDOI
01 Oct 1980

1,565 citations

Journal ArticleDOI
TL;DR: A methodology is developed to derive algorithms for optimal basis selection by minimizing diversity measures proposed by Wickerhauser (1994) and Donoho (1994), which include the p-norm-like ($\ell_{(p \le 1)}$) diversity measures and the Gaussian and Shannon entropies.
Abstract: A methodology is developed to derive algorithms for optimal basis selection by minimizing diversity measures proposed by Wickerhauser (1994) and Donoho (1994). These measures include the p-norm-like ($\ell_{(p \le 1)}$) diversity measures and the Gaussian and Shannon entropies. The algorithm development methodology uses a factored representation for the gradient and involves successive relaxation of the Lagrangian necessary condition. This yields algorithms that are intimately related to the affine scaling transformation (AST) based methods commonly employed by the interior point approach to nonlinear optimization. The algorithms minimizing the ($\ell_{(p \le 1)}$) diversity measures are equivalent to a previously developed class of algorithms called focal underdetermined system solver (FOCUSS). The general nature of the methodology provides a systematic approach for deriving this class of algorithms and a natural mechanism for extending them. It also facilitates a better understanding of the convergence behavior and a strengthening of the convergence results. The Gaussian entropy minimization algorithm is shown to be equivalent to a well-behaved p=0 norm-like optimization algorithm. Computer experiments demonstrate that the p-norm-like and the Gaussian entropy algorithms perform well, converging to sparse solutions. The Shannon entropy algorithm produces solutions that are concentrated but are shown to not converge to a fully sparse solution.

554 citations

Journal ArticleDOI
J.D. Gibson
01 Apr 1987

385 citations

Journal ArticleDOI
TL;DR: This work chronicles the development of rate-distortion theory and provides an overview of its influence on the practice of lossy source coding.
Abstract: Lossy coding of speech, high-quality audio, still images, and video is commonplace today. However, in 1948, few lossy compression systems were in service. Shannon introduced and developed the theory of source coding with a fidelity criterion, also called rate-distortion theory. For the first 25 years of its existence, rate-distortion theory had relatively little impact on the methods and systems actually used to compress real sources. Today, however, rate-distortion theoretic concepts are an important component of many lossy compression techniques and standards. We chronicle the development of rate-distortion theory and provide an overview of its influence on the practice of lossy source coding.

213 citations

Journal ArticleDOI
TL;DR: The linear prediction error energy method can be considered an efficient way to detect epileptic seizures in EEG records, because the prediction error energy is much higher during seizures and can be used to locate the seizure interval.
Abstract: In this study, a method is proposed to detect epileptic seizures in EEG signals. For this purpose, a linear prediction filter is used to observe the presence of spikes and sharp waves in seizure EEG recordings. Linear prediction analysis calculates, for each window, the coefficient set that best models the applied time series. Modeling success is observed on the prediction error signal. The presence of spikes and other seizure-specific sharp waves reduces the modeling success and increases the prediction error of the filter. The energy of the prediction error signal during seizures is clearly much higher than in seizure-free intervals, so this energy value can be used to locate the seizure interval. The method is applied to 250 distinct EEG records, each 23.6 s long. The results of the proposed algorithm are evaluated with ROC analysis, which indicates 93.6% success in detecting the presence of seizures. In conclusion, the linear prediction error energy method can be considered an efficient way to detect epileptic seizures in EEG records.
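The windowed prediction-error-energy idea described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the predictor order (4), the window length (128 samples), the autocorrelation/Levinson-Durbin solver, and the synthetic sine and spike signals are all assumptions chosen for the example, and the function names are hypothetical.

```python
import math

def autocorrelation(frame, order):
    """Biased autocorrelation R(0..order) of one analysis window."""
    n = len(frame)
    return [sum(frame[i] * frame[i - k] for i in range(k, n))
            for k in range(order + 1)]

def levinson_durbin(r, order):
    """Solve the autocorrelation normal equations recursively.

    Returns (coeffs, err) where the prediction is
    s_hat[n] = sum(coeffs[k] * s[n - 1 - k]) and err is the minimum error.
    """
    a = [0.0] * (order + 1)
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate (all-zero) window
        k = (r[m] - sum(a[j] * r[m - j] for j in range(1, m))) / err
        prev = a[:]
        a[m] = k
        for j in range(1, m):
            a[j] = prev[j] - k * prev[m - j]
        err *= 1.0 - k * k
    return a[1:], err

def prediction_error_energy(frame, order=4):
    """Mean squared prediction residual inside one window."""
    coeffs, _ = levinson_durbin(autocorrelation(frame, order), order)
    total = 0.0
    for n in range(order, len(frame)):
        pred = sum(coeffs[k] * frame[n - 1 - k] for k in range(order))
        total += (frame[n] - pred) ** 2
    return total / (len(frame) - order)

# Synthetic stand-ins for EEG windows (assumption): a smooth sine,
# and the same sine with one sharp spike added.
N = 128
smooth = [math.sin(2 * math.pi * n / 32) for n in range(N)]
spiky = smooth[:]
spiky[64] += 5.0

e_smooth = prediction_error_energy(smooth)
e_spiky = prediction_error_energy(spiky)
print(e_smooth, e_spiky)
```

A detector along these lines would compare the per-window energy against a threshold; how that threshold is chosen is not stated in the abstract and is left open here.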

177 citations

References
Journal ArticleDOI
John Makhoul
01 Apr 1975
TL;DR: This paper gives an exposition of linear prediction in the analysis of discrete signals, in which the signal is modeled as a linear combination of its past values and present and past values of a hypothetical input to a system whose output is the given signal.
Abstract: This paper gives an exposition of linear prediction in the analysis of discrete signals. The signal is modeled as a linear combination of its past values and present and past values of a hypothetical input to a system whose output is the given signal. In the frequency domain, this is equivalent to modeling the signal spectrum by a pole-zero spectrum. The major part of the paper is devoted to all-pole models. The model parameters are obtained by a least squares analysis in the time domain. Two methods result, depending on whether the signal is assumed to be stationary or nonstationary. The same results are then derived in the frequency domain. The resulting spectral matching formulation allows for the modeling of selected portions of a spectrum, for arbitrary spectral shaping in the frequency domain, and for the modeling of continuous as well as discrete spectra. This also leads to a discussion of the advantages and disadvantages of the least squares error criterion. A spectral interpretation is given to the normalized minimum prediction error. Applications of the normalized error are given, including the determination of an "optimal" number of poles. The use of linear prediction in data compression is reviewed. For purposes of transmission, particular attention is given to the quantization and encoding of the reflection (or partial correlation) coefficients. Finally, a brief introduction to pole-zero modeling is given.
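The model described in this abstract can be written out as a short worked sketch. The symbols ($s_n$, $u_n$, $G$, $a_k$, $b_l$, $R(i)$, $V_p$) and the sign convention are assumptions made for this illustration, and only the stationary (autocorrelation) form of the least squares solution is shown.

```latex
\begin{align}
  % Pole-zero model: past outputs plus present and past inputs (b_0 = 1)
  s_n &= -\sum_{k=1}^{p} a_k\, s_{n-k} + G \sum_{l=0}^{q} b_l\, u_{n-l} \\
  % All-pole special case (q = 0) in the frequency domain
  H(z) &= \frac{G}{1 + \sum_{k=1}^{p} a_k z^{-k}} \\
  % Prediction error and the total squared error that is minimized
  e_n &= s_n + \sum_{k=1}^{p} a_k\, s_{n-k}, \qquad E = \sum_{n} e_n^{2} \\
  % Normal equations obtained by setting dE/da_i = 0 (autocorrelation form)
  \sum_{k=1}^{p} a_k\, R(i-k) &= -R(i), \qquad 1 \le i \le p \\
  % Minimum error and its normalized form
  E_p &= R(0) + \sum_{k=1}^{p} a_k\, R(k), \qquad V_p = \frac{E_p}{R(0)}
\end{align}
```

Under these definitions $0 \le V_p \le 1$, and $V_p$ is non-increasing in $p$, which is why it can serve as a guide when choosing the number of poles.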

4,206 citations


"Digital Speech: Coding for Low Bit ..." refers background or methods in this paper

  • ...Network or channel dependent mode decision [14] allows a coder to adapt to the network load or the channel error performance, by varying the modes and the bit rate, and changing the relative bit allocation of the source and channel coding [15]....

    [...]

  • ...The above algorithm converges to a local optimum [14, 15]....

    [...]

  • ...Binary Search Codebook Binary search [17], known in the pattern recognition literature as hierarchical clustering [14], is a method for partitioning space in such a way that the search for the minimum distortion code-vector is proportional to log2 L rather than L....

    [...]

  • ...These include the Autocorrelation [12], average magnitude difference function (AMDF) [13], Cepstrum [14] and Maximum Likelihood [15]....

    [...]

  • ...When estimating speech model parameters at about 50 Hz over a 20–30 ms analysis window, speech is assumed to be locally stationary [14] within this analysis window....

    [...]

Journal ArticleDOI
01 Oct 1980

1,565 citations


"Digital Speech: Coding for Low Bit ..." refers methods in this paper

  • ...One of the most powerful speech analysis methods is the method of linear prediction analysis [2, 3], or LPC analysis as it is commonly called....

    [...]

  • ...A popular lattice implementation of LPC analysis is that developed by Burg [3]....

    [...]

  • ...Panter and Dite [3] used analysis based on the assumption that the quantization is sufficiently fine and that the amplitude probability density function of the input samples is constant within the quantization intervals....

    [...]

  • ...The forward and backward transformation are given below [3]....

    [...]

  • ...Burg’s algorithm operates as follows [3]: 1....

    [...]
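The excerpt on Burg's algorithm above is truncated before its numbered steps, so the following is only a sketch of the commonly taught lattice form of the method, not the book's listing. The function name `burg`, the error-filter sign convention A(z) = 1 + a_1 z^-1 + ..., and the geometric test signal are assumptions.

```python
def burg(x, order):
    """Estimate LPC parameters with Burg's lattice method.

    Returns (a, ks): a is the prediction-error filter [1, a_1, ..., a_p]
    and ks holds the reflection coefficients k_1..k_p.
    """
    n = len(x)
    f = list(x)  # forward prediction error
    b = list(x)  # backward prediction error
    a = [1.0]
    ks = []
    for m in range(1, order + 1):
        # Reflection coefficient minimizing forward + backward error energy
        num = sum(f[i] * b[i - 1] for i in range(m, n))
        den = sum(f[i] ** 2 + b[i - 1] ** 2 for i in range(m, n))
        k = -2.0 * num / den if den > 0.0 else 0.0
        ks.append(k)
        # Order update of the prediction-error filter
        a_ext = a + [0.0]
        a = [a_ext[i] + k * a_ext[m - i] for i in range(m + 1)]
        # Lattice update; run downward so b[i - 1] still holds the old order
        for i in range(n - 1, m - 1, -1):
            fi, bi = f[i], b[i - 1]
            f[i] = fi + k * bi
            b[i] = bi + k * fi
    return a, ks

# Hypothetical test signal: a decaying geometric sequence.
signal = [0.5 ** i for i in range(16)]
a, ks = burg(signal, 2)
print(a, ks)
```

Because each reflection coefficient is computed from the sum of forward and backward error energies, its magnitude cannot exceed one, which is the property that makes the lattice form attractive for stable synthesis filters.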

Journal ArticleDOI
Kuldip K. Paliwal, B. Atal
TL;DR: It is shown that the split vector quantizer can quantize LPC information in 24 bits/frame with an average spectral distortion of 1 dB and less than 2% of the frames having spectral distortion greater than 2 dB.
Abstract: For low bit rate speech coding applications, it is important to quantize the LPC parameters accurately using as few bits as possible. Though vector quantizers are more efficient than scalar quantizers, their use for accurate quantization of linear predictive coding (LPC) information (using 24-26 bits/frame) is impeded by their prohibitively high complexity. A split vector quantization approach is used here to overcome the complexity problem. An LPC vector consisting of 10 line spectral frequencies (LSFs) is divided into two parts, and each part is quantized separately using vector quantization. Using the localized spectral sensitivity property of the LSF parameters, a weighted LSF distance measure is proposed. With this distance measure, it is shown that the split vector quantizer can quantize LPC information in 24 bits/frame with an average spectral distortion of 1 dB and less than 2% of the frames having spectral distortion greater than 2 dB. The effect of channel errors on the performance of this quantizer is also investigated and results are reported.
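The split vector quantization idea in this abstract can be sketched as follows. The abstract does not give the split sizes, codebooks, or weights, so the 4 + 6 split, the two-entry toy codebooks (a 24-bit design would need far larger ones), the uniform weights, and the function names are all assumptions for illustration.

```python
def weighted_dist(x, y, w):
    """Weighted squared distance between two LSF sub-vectors."""
    return sum(wi * (a - b) ** 2 for wi, a, b in zip(w, x, y))

def split_vq_encode(lsf, codebooks, splits, weights):
    """Quantize each part of the LSF vector with its own codebook.

    Returns one codebook index per part.
    """
    indices, start = [], 0
    for size, cb in zip(splits, codebooks):
        part = lsf[start:start + size]
        w = weights[start:start + size]
        best = min(range(len(cb)), key=lambda i: weighted_dist(part, cb[i], w))
        indices.append(best)
        start += size
    return indices

def split_vq_decode(indices, codebooks):
    """Concatenate the selected code-vectors back into one LSF vector."""
    out = []
    for i, cb in zip(indices, codebooks):
        out.extend(cb[i])
    return out

# Toy codebooks (hypothetical values, LSFs in radians).
CB_LOW = [[0.2, 0.4, 0.6, 0.8], [0.3, 0.5, 0.7, 0.9]]
CB_HIGH = [[1.0, 1.3, 1.6, 1.9, 2.2, 2.5], [1.1, 1.4, 1.7, 2.0, 2.3, 2.6]]
SPLITS = (4, 6)
WEIGHTS = [1.0] * 10  # placeholder; the paper's weighting is not reproduced

lsf = [0.29, 0.5, 0.71, 0.9, 1.01, 1.3, 1.6, 1.9, 2.2, 2.52]
indices = split_vq_encode(lsf, [CB_LOW, CB_HIGH], SPLITS, WEIGHTS)
decoded = split_vq_decode(indices, [CB_LOW, CB_HIGH])
print(indices, decoded)
```

The complexity saving comes from searching two small codebooks instead of one joint codebook: with b1 and b2 bits per part, the search cost scales with 2^b1 + 2^b2 rather than 2^(b1+b2).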

665 citations


"Digital Speech: Coding for Low Bit ..." refers background or methods in this paper

  • ...Examples include Adaptive Differential Pulse Code Modulation (ADPCM) [6], Code Excited Linear Prediction (CELP) [7, 8], and Improved Multi Band Excitation (IMBE) [9, 10]....

    [...]

  • ...728 standard [9], where a 50-order LPC filter is used!)....

    [...]

  • ...2 [9], the middle values are fairly constant....

    [...]

  • ...2 Step size multiplier values for 2, 3, and 4 bit quantizers [9] Adaptation multiplier values Previous o/p levels 2 bit 3 bit 4 bit L1 0....

    [...]

  • ...Tampa, FL [9] DVSI (1991) INMARSAT-M Voice Codec, Version 1....

    [...]

Book
01 Jan 1987
TL;DR: The toe or heel holder of a safety binding is pivotally mounted on a stub shaft, and held in its angular operating position by a spring-loaded, spherical detent guided in a bore of the holder radially relative to the shaft axis toward one of four equiangularly offset notches in the shaft which differ in their depth.
Abstract: The toe or heel holder of a safety binding is pivotally mounted on a stub shaft, and held in its angular operating position by a spring-loaded, spherical detent guided in a bore of the holder radially relative to the shaft axis toward one of four equiangularly offset notches in the shaft which differ in their depth. The shaft is attached to the top surface of the ski by a square mounting plate and four screws at the corners of the plate so that the detent engages different notches in the shaft, and therefore resists deflection of the holder from its operating position with different force depending on the orientation of the mounting plate on the ski surface.

627 citations


"Digital Speech: Coding for Low Bit ..." refers methods in this paper

  • ...The Mean Opinion Score (MOS) [49] scale shown in Table 2....

    [...]

Book
01 Nov 1995
TL;DR: An edited volume on speech coding and synthesis whose chapters include an introduction to speech coding, evaluation of speech coders, a robust algorithm for pitch tracking (RAPT), and waveform interpolation for coding and synthesis.
Abstract: An introduction to speech coding, W.B. Kleijn and K.K. Paliwal; speech coding standards, R.V. Cox; linear-prediction based analysis-by-synthesis coding, P. Kroon and W.B. Kleijn; sinusoidal coding, R.J. McAulay and T.F. Quatieri; waveform interpolation for coding and synthesis, W.B. Kleijn and J. Haagen; low-delay coding of speech, J.-H. Chen; multimode and variable-rate coding of speech, A. Das et al.; wideband speech coding, J.-P. Adoul and R. Lefebvre; vector quantization for speech transmission, P. Hedelin et al.; theory for transmission of vector quantization data, P. Hedelin et al.; waveform coding and auditory masking, R. Veldhuis and A. Kohlrausch; quantization of LPC parameters, K.K. Paliwal and W.B. Kleijn; evaluation of speech coders, P. Kroon; a robust algorithm for pitch tracking (RAPT), D. Talkin; time-domain and frequency-domain techniques for prosodic modification of speech, E. Moulines and W. Verhelst; nonlinear processing of speech, G. Kubin; an approach to text-to-speech synthesis, R. Sproat and J. Olive; the generation of prosodic structure and intonation in speech synthesis, J. Terken and R. Collier; computation of timing in text-to-speech synthesis, J.P.H. van Santen; objective optimization in algorithms for text-to-speech synthesis, Y. Sagisaka and N. Iwahashi; quality evaluation of synthesized speech, V.J. van Heuven and R. van Bezooijen.

621 citations