
Showing papers on "Codebook published in 1984"


Journal ArticleDOI
TL;DR: Several algorithms are presented for the design of shape-gain vector quantizers based on a training sequence of data or a probabilistic model, and their performance is compared to that of previously reported vector quantization systems.
Abstract: Memory and computation requirements imply fundamental limitations on the quality that can be achieved in vector quantization systems used for speech waveform coding and linear predictive voice coding (LPC). One approach to reducing storage and computation requirements is to organize the set of reproduction vectors as the Cartesian product of a vector codebook describing the shape of each reproduction vector and a scalar codebook describing the gain or energy. Such shape-gain vector quantizers can be applied both to waveform coding using a quadratic-error distortion measure and to voice coding using an Itakura-Saito distortion measure. In each case, the minimum distortion reproduction vector can be found by first selecting a shape codeword, and then, based on that choice, selecting a gain codeword. Several algorithms are presented for the design of shape-gain vector quantizers based on a training sequence of data or a probabilistic model. The algorithms are used to design shape-gain vector quantizers for both the waveform coding and voice coding applications. The quantizers are simulated, and their performance is compared to that of previously reported vector quantization systems.
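The sequential shape-then-gain search described above can be sketched as follows. This is a minimal illustration under the squared-error criterion only; the function name and codebooks are hypothetical, and the paper additionally covers the Itakura-Saito case.

```python
def encode_shape_gain(x, shapes, gains):
    """Two-stage shape-gain VQ search under a squared-error criterion.

    shapes: unit-norm shape codewords; gains: scalar gain codewords.
    Stage 1 picks the shape; stage 2 picks the gain given that shape.
    """
    # Stage 1: for unit-norm shapes, the minimum-distortion shape is the
    # one with maximum correlation <x, s>.
    corrs = [sum(xi * si for xi, si in zip(x, s)) for s in shapes]
    j = max(range(len(shapes)), key=lambda k: corrs[k])
    # Stage 2: given the winning shape, the best gain is the scalar
    # codeword closest to that correlation.
    i = min(range(len(gains)), key=lambda k: abs(gains[k] - corrs[j]))
    reproduction = [gains[i] * s for s in shapes[j]]
    return j, i, reproduction
```

Searching shape and gain sequentially costs one pass over each codebook instead of one pass over their full Cartesian product, which is the storage/computation saving the abstract refers to.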

305 citations


Patent
Harald Höge1
12 Sep 1984
TL;DR: In this article, a method of determining speech spectra for automatic speech recognition and speech coding is presented, in which a codebook tree (an arrangement of codebook spectra with binary branching that can be addressed in binary-coded fashion) is searched: the spectral parameter set of the input speech is determined and binary-coded, the combined code addresses a hash ROM whose contents point to codebook spectra at the K-th level of the tree, and a tree search from that level selects the final codebook spectrum at the L-th level.
Abstract: 1. A method of determining speech spectra for automatic speech recognition and speech coding, wherein a codebook tree is used, namely an arrangement of codebook spectra which is orientated with binary branching and thus can be addressed in binary-coded fashion, with levels which contain 2**L spectra, characterized in that in an analysis stage (A) which is supplied with a speech signal which is to be analysed, firstly, preferably using a signal processor (S), the respective spectral parameter set is determined, that each obtained parameter set comprising parameter values Pi is binary-coded in a coding stage (C) whereby each parameter value Pi is assigned a code Ci , that all the codes ({Ci }) which have been formed and which are to be evaluated are combined to form an overall code (Cg ), that the overall code (Cg ) is used as addressing signal for a read-only store ROM, namely a Hash-ROM (H), where the store content of the Hash-ROM (H) comprises the numbers, assigned to the required spectrum, of the codebook spectra in a K-th level of the codebook tree, that the numbers of the codebook spectra are used as an indicator of the codebook store section which contains the codebook tree from the K-th level, and that when entering into the K-th level of the codebook tree has been facilitated in this way, a final determination of the codebook spectrum in the L-th level of the codebook tree is carried out by means of a comparator unit (V) in the form of a tree search.
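The tree search in the final claim step can be sketched as follows. The node numbering (root 1, children 2n and 2n+1) and the squared-error comparison are illustrative assumptions, and the hash-ROM stage is represented only by the start node it supplies.

```python
def dist(a, b):
    # Squared-error comparison between input and codebook spectra
    # (an illustrative stand-in for the comparator unit V).
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

def binary_tree_search(x, codewords, start):
    """Descend a binary codebook tree from `start` (the node the hash-ROM
    lookup supplies at the K-th level) down to a leaf, taking at each
    level the child codeword nearer to the input spectrum x."""
    node = start
    while 2 * node in codewords:              # descend until the leaf level
        left, right = 2 * node, 2 * node + 1
        node = left if dist(x, codewords[left]) <= dist(x, codewords[right]) else right
    return node
```

Entering at level K rather than at the root is the point of the hash ROM: only L - K comparisons remain instead of L.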

168 citations


Proceedings ArticleDOI
19 Mar 1984
TL;DR: A Hierarchical Vector Quantization scheme that can operate on "supervectors" of dimensionality in the hundreds of samples is introduced; gain normalization and dynamic codebook allocation are used in coding both feature vectors and the final data subvectors.
Abstract: This paper introduces a Hierarchical Vector Quantization (HVQ) scheme that can operate on "supervectors" of dimensionality in the hundreds of samples. HVQ is based on a tree-structured decomposition of the original supervector into a large number of low dimensional vectors. The supervector is partitioned into subvectors, the subvectors into minivectors and so on. The "glue" that links subvectors at one level to the parent vector at the next higher level is a feature vector that characterizes the correlation pattern of the parent vector and controls the quantization of lower level feature vectors and ultimately of the final descendant data vectors. Each component of a feature vector is a scalar parameter that partially describes a corresponding subvector. The paper presents a three level HVQ for which the feature vectors are based on subvector energies. Gain normalization and dynamic codebook allocation are used in coding both feature vectors and the final data subvectors. Simulation results demonstrate the effectiveness of HVQ for speech waveform coding at 9.6 and 16 kb/s.
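The decomposition and energy-based feature vectors described above can be sketched as follows; the function names and the RMS form of the gain normalization are illustrative assumptions, not the paper's code.

```python
def partition(vector, size):
    """Split a (super)vector into consecutive fixed-length subvectors."""
    return [vector[i:i + size] for i in range(0, len(vector), size)]

def energy_features(subvectors):
    """Feature vector for a parent vector: one energy per subvector."""
    return [sum(s * s for s in sub) for sub in subvectors]

def gain_normalize(sub, energy):
    """Gain-normalize a subvector by its RMS value before shape coding."""
    g = (energy / len(sub)) ** 0.5 or 1.0     # guard the all-zero case
    return [s / g for s in sub]
```

Applying `partition` recursively (supervector to subvectors to minivectors) and quantizing the energy feature vectors at each level gives the tree-structured decomposition the abstract describes.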

45 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: A major extension of the classification approach is introduced that includes edge orientation and location, thereby exploiting an important feature of the human visual mechanism and allowing large codebooks designed from a large database of training images to be used.
Abstract: Vector quantization (VQ) has made it possible to utilize perceptually meaningful techniques for direct space-domain image coding. A simple 2- or 3-way classified codebook approach [2,3] allocates the perceptually important edges with more resolution than the easily encoded monotone regions of an image. In this paper, we introduce a major extension of the classification approach to include edge orientation and location, thereby exploiting an important feature of the human visual mechanism. In particular, each 4 × 4 block of pixels is classified into one of 31 classes for the case of 16 dimensional VQ. The encoding and codebook design complexity is significantly reduced, allowing us to use large codebooks designed from a large database of training images. We present images encoded at 0.7 and 0.8 bits per pixel using this scheme with 16-dimensional vectors. Only a small fraction of one bit per pixel is needed to code the monotone regions of an image; the rest of the bitrate is used to achieve a high level of edge integrity.
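A toy version of such a block classifier can be sketched as follows, using horizontal and vertical gradient sums to separate monotone blocks from edge blocks. The threshold and the coarse three-way split are illustrative only; the paper's scheme distinguishes 31 classes, including edge location.

```python
def classify_block(block, edge_thresh=10.0):
    """Classify a square pixel block as 'monotone', 'vertical', or
    'horizontal'. A vertical edge produces large horizontal pixel
    differences, hence the gh > gv comparison below."""
    n = len(block)
    gh = sum(abs(block[r][c + 1] - block[r][c])      # horizontal differences
             for r in range(n) for c in range(n - 1))
    gv = sum(abs(block[r + 1][c] - block[r][c])      # vertical differences
             for r in range(n - 1) for c in range(n))
    if gh + gv < edge_thresh:
        return "monotone"
    return "vertical" if gh > gv else "horizontal"
```

Each class then gets its own codebook, so edge classes can be given more resolution than monotone ones.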

34 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Simulation results demonstrate that vector quantization offers a distinct perceptual improvement compared with scalar quantization of the same subband signals and side information for the same total bit rate.
Abstract: Vector quantization (VQ) is examined as a technique to enhance performance in subband coding of speech at 9.6 kb/s. The set of short-term subband power levels is vector quantized, providing low-rate side information to control the coding of the subband signals. Each subband signal is then vector quantized with variable size codebooks that are dynamically assigned by the quantized side information. Two versions are described, a 7-band coder and a 14-band coder. Simulation results demonstrate that vector quantization offers a distinct perceptual improvement compared with scalar quantization of the same subband signals and side information for the same total bit rate.
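The dynamic assignment of codebook sizes from the quantized side information can be sketched greedily as follows. The rule of dividing a band's distortion estimate by 4 per granted bit (about 6 dB) is a standard assumption, not a detail stated in the paper.

```python
def allocate_codebook_bits(powers, total_bits):
    """Greedy dynamic allocation: each bit goes to the subband whose
    current distortion estimate is largest; granting a bit divides that
    estimate by 4 (one bit buys roughly 6 dB of SNR)."""
    est = list(powers)                 # quantized subband powers (side info)
    bits = [0] * len(powers)
    for _ in range(total_bits):
        k = max(range(len(est)), key=lambda i: est[i])
        bits[k] += 1
        est[k] /= 4.0
    return bits                        # band k gets a codebook of 2**bits[k]
```

Because both encoder and decoder derive the allocation from the same quantized power levels, no extra bits are needed to transmit it.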

31 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: This paper presents a method of incorporating LPC spectral shape and energy into the codebook entries of the vector quantizer, and finds improvements in recognition accuracy by using the VQ with both LPC shape and energy over that obtained using a VQ with LPC shape alone.
Abstract: The theory of vector quantization (VQ) of linear predictive coding (LPC) coefficients has established a wide variety of techniques for quantizing LPC spectral shape to minimize overall spectral distortion. Such vector quantizers have been widely used in the areas of speech coding and speech recognition. The conventional vector quantizer utilizes only spectral shape information and essentially disregards the energy or gain term associated with the optimal LPC fit to the signal being modelled. In this paper we present a method of incorporating LPC spectral shape and energy into the codebook entries of the vector quantizer. To do this we postulate a distortion measure for comparing two LPC vectors which uses a weighted sum of an LPC shape distortion and a log energy distortion. Based on this combined distortion measure we have designed and studied vector quantizers of several sizes for use in isolated word speech recognition experiments. We have found that a fairly significant correlation exists between LPC shape and signal energy; hence a combined LPC shape plus energy vector quantizer with a given distortion requires far fewer codebook entries than one in which LPC shape and energy are quantized separately. Based on isolated word recognition tests on both a 10-digit and a 129-word airlines vocabulary, we have found improvements in recognition accuracy by using the VQ with both LPC shape and energy over that obtained using a VQ with LPC shape alone.
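The combined distortion measure can be sketched as follows. The log-squared form of the energy term and the weight alpha are illustrative choices, since the paper states only that a weighted sum of an LPC shape distortion and a log energy distortion is used.

```python
import math

def combined_distortion(shape_dist, energy_x, energy_y, alpha=1.0):
    """Weighted sum of an LPC shape distortion and a log-energy distortion:
    d = shape_dist + alpha * (log E_x - log E_y) ** 2.
    shape_dist is assumed to be precomputed by any LPC shape measure."""
    return shape_dist + alpha * (math.log(energy_x) - math.log(energy_y)) ** 2
```

Training a codebook jointly under such a measure is what lets the quantizer exploit the shape/energy correlation the abstract reports, instead of quantizing the two separately.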

15 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Two types of isolated digit recognition systems based on vector quantization were tested in a speaker-independent task; in the second type, recognition involved generating a minimum-distortion segmentation of the unknown by dynamic programming.
Abstract: Two types of isolated digit recognition systems based on vector quantization were tested in a speaker-independent task. In both types of systems, a digit was modelled as a sequence of codebooks generated from segments of training data. In systems of the first type, the training and unknown utterances were simply partitioned into 1, 2 or 3 equal-length segments. Recognition involved computing the distortion when the input spectra were vector quantized using the codebook sequences. These systems are closely related to recognizers proposed by Burton et al.[1]. In systems of the second type, training segments corresponded to acoustic-phonetic units and were obtained from hand-marked data. Recognition involved generating a minimum-distortion segmentation of the unknown by dynamic programming. Accuracies approaching 96-97% were achieved by both types of systems.
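The minimum-distortion segmentation by dynamic programming can be sketched as follows; the recurrence and names are an illustrative reconstruction, assuming each segment codebook must absorb at least one frame of the unknown.

```python
def frame_cost(frame, codebook, dist):
    """VQ distortion of one input frame against one segment codebook."""
    return min(dist(frame, c) for c in codebook)

def min_distortion_segmentation(frames, codebooks, dist):
    """DP: D[s][t] = least distortion of encoding frames[:t] with the
    first s segment codebooks, each segment taking at least one frame."""
    T, S = len(frames), len(codebooks)
    INF = float("inf")
    D = [[INF] * (T + 1) for _ in range(S + 1)]
    D[0][0] = 0.0
    for s in range(1, S + 1):
        for t in range(s, T + 1):
            seg = 0.0
            # grow segment s backwards from frame t-1 down to frame u
            for u in range(t - 1, s - 2, -1):
                seg += frame_cost(frames[u], codebooks[s - 1], dist)
                if D[s - 1][u] + seg < D[s][t]:
                    D[s][t] = D[s - 1][u] + seg
    return D[S][T]
```

Running this once per digit model and picking the digit with the smallest total distortion gives a recognizer of the second type described above.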

10 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: A matrix quantizer design algorithm for image encoding is discussed, aimed at producing a codebook of matrices which are, at least, locally optimum with respect to a distortion measure; the preliminary results show promise for further work in this direction.
Abstract: This paper discusses a matrix quantizer design algorithm for image encoding problems. The design algorithm is aimed at producing a codebook of matrices which are, at least, locally optimum with respect to a distortion measure. We have considered the squared error distortion measure in this work and generated codebooks based on a training sequence consisting of a number of pictures of different bit rates. The preliminary results show promise for further work in this direction.

9 citations


Proceedings ArticleDOI
M. Noah1
19 Mar 1984
TL;DR: This paper introduces an approach to scalar quantization of LPC reflection coefficients which outperforms typical scalar quantization schemes using a squared-error distortion measure and suggests the Lloyd-Max approach is economically attractive for implementation in real-time hardware.
Abstract: This paper introduces an approach to scalar quantization of LPC reflection coefficients which outperforms typical scalar quantization schemes using a squared-error distortion measure. As in vector quantization, an iterative algorithm is used to generate the source codebook. Scalar quantization produces a greater distortion, for a given number of bits, than vector quantization, but reduced computation time and storage requirements make the Lloyd-Max approach economically attractive for implementation in real-time hardware. The algorithm is developed and a comparison with another scalar quantization scheme is made.
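The iterative codebook generation can be sketched with a plain Lloyd-Max pass over training samples; the initialization and fixed iteration count are illustrative assumptions.

```python
def lloyd_max(samples, levels, iters=20):
    """Iterative Lloyd-Max design of a scalar codebook: assign each
    training sample to its nearest level, then move each level to the
    mean of its cell; repeat until (approximately) converged."""
    levels = list(levels)
    for _ in range(iters):
        cells = [[] for _ in levels]
        for x in samples:
            k = min(range(len(levels)), key=lambda i: abs(x - levels[i]))
            cells[k].append(x)
        levels = [sum(c) / len(c) if c else levels[k]   # keep empty cells
                  for k, c in enumerate(cells)]
    return levels
```

This is the scalar special case of the iterative design used in vector quantization, which is why it trades some distortion for much lower search and storage cost.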

6 citations


Proceedings ArticleDOI
01 Mar 1984
TL;DR: Vector quantization techniques were applied to a continuous speech recognition system as a means of reducing both memory usage and computation time; the covering technique for codebook generation provided a significant computational advantage over the clustering technique.
Abstract: Vector quantization techniques were applied to a continuous speech recognition system as a means of reducing both memory usage and computation time. The speech recognition system computes time-aligned distances between unknown speech segments and template frames. Vector quantization allowed the replacement of speech frames (vectors) with single index numbers which referenced an ordered set, or codebook, of representative frames. Two techniques for generating this codebook, clustering and covering, were examined. The covering technique provided a significant computational advantage over the clustering technique although both techniques generated codebooks which performed well in this task. Results are presented for a ten-speaker, 100-word vocabulary experiment. Using speaker dependent codebooks, system performance levels were maintained while the number of distance calculations was reduced by a factor of 2 and the template storage required was reduced by a factor of 4.6. With an increase in error rate of about one third, these factors were 6.8 and 7.8 respectively.
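The frame-replacement step can be sketched as follows; the squared-error distance is an illustrative choice, and the names are hypothetical.

```python
def sq_dist(a, b):
    # Squared-error distance between two spectral frames.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def quantize_frames(frames, codebook):
    """Replace each speech frame (a spectral vector) by the index of its
    nearest codeword, so stored templates become short integer strings
    and frame-to-frame distances can be precomputed per index pair."""
    return [min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
            for f in frames]
```

Once both templates and input are index strings, time alignment needs only table lookups into a precomputed codeword-to-codeword distance table, which is the source of the computation and storage reductions reported above.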

4 citations