
Showing papers on "Codebook" published in 1997


Patent
Navin Chaddha1
30 Jun 1997
TL;DR: In this article, a hierarchical vector quantization table that outputs embedded code is proposed for image compression; its design divides into codebook design and fill-in procedures for each stage, with preliminary-stage codebooks designed by a splitting generalized Lloyd algorithm (LBG/GLA) under a perceptually weighted distortion measure.
Abstract: An image compression system includes a vectorizer and a hierarchical vector quantization table that outputs embedded code. The vectorizer converts an image into image vectors representing respective blocks of image pixels. The table provides computation-free transformation and compression of the image vectors. Table design can be divided into codebook design and fill-in procedures for each stage. Codebook design for the preliminary stages uses a splitting generalized Lloyd algorithm (LBG/GLA) with a perceptually weighted distortion measure. Codebook design for the final stage uses a greedily-grown and then entropy-pruned tree-structure variation of GLA with an entropy-constrained distortion measure. Table fill-in for all stages uses an unweighted proximity measure for assigning inputs to codebook vectors. Transformations and compression are fast because they are computation free. The hierarchical, multi-stage character of the table allows it to operate with low memory requirements. The embedded output allows convenient scalability suitable for collaborative video applications over heterogeneous networks.

183 citations


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new algorithm for both vector quantizer design and clustering analysis as an alternative to the conventional K-means algorithm; it converges to a better locally optimal codebook with an accelerated convergence speed.
Abstract: The K-means algorithm is widely used in vector quantizer (VQ) design and clustering analysis. In the VQ context, this algorithm iteratively updates an initial codebook and converges to a locally optimal codebook under certain conditions. It iteratively satisfies each of the two necessary conditions for an optimal quantizer: the nearest neighbor condition for the partition and the centroid condition for the codevectors. In this letter, we propose a new algorithm for both vector quantizer design and clustering analysis as an alternative to the conventional K-means algorithm. The algorithm is almost the same as the K-means algorithm except for a modification at the codebook updating step. It does not satisfy the centroid condition iteratively, but asymptotically satisfies it as the number of iterations increases. Experimental results show that the algorithm converges to a better locally optimal codebook with an accelerated convergence speed.

98 citations
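The baseline K-means (generalized Lloyd) iteration that the letter modifies can be sketched as follows; `kmeans_vq` and the toy data are illustrative, not the authors' code:

```python
import numpy as np

def kmeans_vq(train, k, iters=20, seed=0):
    """K-means (generalized Lloyd) codebook design: alternate the
    nearest-neighbor condition (partition) and the centroid condition
    (codevector update) described in the letter."""
    rng = np.random.default_rng(seed)
    codebook = train[rng.choice(len(train), size=k, replace=False)].astype(float)
    for _ in range(iters):
        # Nearest-neighbor condition: assign each training vector to its
        # closest codeword under squared Euclidean distance.
        d = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Centroid condition: move each codeword to the mean of its cell.
        for j in range(k):
            cell = train[labels == j]
            if len(cell):
                codebook[j] = cell.mean(axis=0)
    return codebook, labels

# Toy training set: two tight clusters, so the optimal 2-word codebook
# is just the two cluster centers.
rng = np.random.default_rng(1)
train = np.vstack([rng.normal(0, 0.1, (50, 2)), rng.normal(5, 0.1, (50, 2))])
cb, labels = kmeans_vq(train, 2)
```

The proposed algorithm changes only the centroid-update step, replacing the exact cell mean with an update that satisfies the centroid condition asymptotically as iterations accumulate.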


Journal ArticleDOI
TL;DR: It can be proved that LBG-U terminates in a finite number of steps, and experiments with artificial data demonstrate significant improvements in terms of RMSE over LBG combined with only modestly higher computational costs.
Abstract: A new vector quantization method (LBG-U) closely related to a particular class of neural network models (growing self-organizing networks) is presented. LBG-U consists mainly of repeated runs of the well-known LBG algorithm. Each time LBG converges, however, a novel measure of utility is assigned to each codebook vector. Thereafter, the vector with minimum utility is moved to a new location, LBG is run on the resulting modified codebook until convergence, another vector is moved, and so on. Since a strictly monotonic improvement of the LBG-generated codebooks is enforced, it can be proved that LBG-U terminates in a finite number of steps. Experiments with artificial data demonstrate significant improvements in terms of RMSE over LBG combined with only modestly higher computational costs.

97 citations
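The utility measure can be sketched as follows. The abstract does not define it precisely, so the definition below (utility of a codeword = the distortion increase incurred if it were deleted and its cell re-coded by the second-nearest codeword) is an assumption, and `utility_and_error` is a hypothetical helper, not the published implementation:

```python
import numpy as np

def utility_and_error(train, codebook):
    """For each codeword: its utility (how much total distortion would grow
    if it were removed) and its accumulated quantization error. LBG-U moves
    the minimum-utility codeword to a new location after each LBG run."""
    d = ((train[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = d.argmin(axis=1)
    best = d.min(axis=1)
    d2 = d.copy()
    d2[np.arange(len(train)), nearest] = np.inf
    second = d2.min(axis=1)          # distance to the second-nearest codeword
    k = len(codebook)
    utility = np.array([(second - best)[nearest == j].sum() for j in range(k)])
    error = np.array([best[nearest == j].sum() for j in range(k)])
    return utility, error
```

A codeword that quantizes no training vectors has zero utility and is the natural candidate to relocate.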


Patent
23 Sep 1997
TL;DR: In this article, the degree of similarity between an input vector and all code vectors stored in the codebook is found by approximation for pre-selecting a smaller plural number of code vectors.
Abstract: The processing volume for codebook search for vector quantization is to be diminished. In sending data representing an envelope of spectral components of the harmonics from a spectrum evaluation unit 148 of a sinusoidal analytic encoder 114 to a vector quantizer 116 for vector quantization, the degree of similarity between an input vector and all code vectors stored in the codebook is found by approximation for pre-selecting a smaller plural number of code vectors. From these plural pre-selected code vectors, such a code vector minimizing an error with respect to the input vector is ultimately selected. In this manner, a smaller number of candidate code vectors are pre-selected by pre-selection involving simplified processing and subsequently subjected to ultimate selection with high precision.

62 citations
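The two-stage search described above can be sketched as follows; the inner-product similarity is an illustrative stand-in for the patent's approximate measure, not its exact formula:

```python
import numpy as np

def two_stage_search(x, codebook, m=8):
    """Two-stage codebook search in the spirit of the patent: a cheap
    approximate similarity pre-selects m candidate code vectors, and only
    those survivors are compared by the exact squared error."""
    scores = codebook @ x                      # cheap similarity proxy
    cand = np.argsort(scores)[-m:]             # keep the m best candidates
    errs = ((codebook[cand] - x) ** 2).sum(axis=1)
    return cand[errs.argmin()]
```

With m much smaller than the codebook size, only m exact error computations are needed per input vector, which is the source of the claimed reduction in processing volume.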


Journal ArticleDOI
TL;DR: An improved codebook search algorithm called DTPC is presented; it inherits several benefits from previous techniques, such as the double test (DT) and principal component analysis (PCA), and is much more efficient than the other algorithms.

53 citations


Proceedings ArticleDOI
13 Apr 1997
TL;DR: This paper describes a genetic algorithm for the problem of codebook design, which uses fitness inheritance to assign fitness values to most new chromosomes, rather than evaluating them.
Abstract: Data compression techniques recode data into more compact forms. One such technique is vector quantization, which maps groups of input symbols, called vectors, onto a small set of vectors, called the codebook. Each vector in the codebook is a codeword. The indexes of the codewords represent the original vectors, and writing the codewords that the indexes indicate restores a facsimile of the original data. The similarity of the restored data to the original under vector quantization depends on the codebook, and several algorithms have been proposed for designing it from a training set of typical vectors. This paper describes a genetic algorithm for the problem of codebook design. The genetic algorithm's chromosomes represent partitions of the training set; each vector maps to the codeword that is the centroid of its set in the partition. To speed up its operation, the genetic algorithm uses fitness inheritance to assign fitness values to most new chromosomes, rather than evaluating them. Tests using five standard digitized images compare the genetic algorithm to a popular non-genetic algorithm for codebook design. The genetic algorithm is found to be effective, but slow.

47 citations
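The chromosome representation and fitness-inheritance idea can be sketched as follows; the operators are illustrative, not the paper's exact ones:

```python
import numpy as np

def distortion(train, labels, k):
    """Fitness of a partition chromosome: total squared error when every
    training vector is coded by the centroid of its partition cell."""
    cost = 0.0
    for j in range(k):
        cell = train[labels == j]
        if len(cell):
            cost += ((cell - cell.mean(axis=0)) ** 2).sum()
    return cost

def crossover_inherit(p1, f1, p2, f2, rng):
    """One-point crossover on label strings; the child inherits the mean of
    its parents' fitness instead of being evaluated, which is the paper's
    fitness-inheritance speedup for most new chromosomes."""
    cut = rng.integers(1, len(p1))
    child = np.concatenate([p1[:cut], p2[cut:]])
    return child, 0.5 * (f1 + f2)
```

Only a fraction of offspring need a true fitness evaluation; the rest carry inherited estimates, which is where the reported speedup comes from.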


Proceedings Article
01 Jan 1997
TL;DR: Through the evaluation by the spectral distance measure, it was found that the proposed method achieved a lower spectral distortion than the other methods.
Abstract: This paper proposes a recovery method of broadband speech from narrowband speech based on piecewise linear mapping. In this method, the narrowband spectrum envelope of input speech is transformed to a broadband spectrum envelope using linear transformation matrices associated with several spectrum spaces. These matrices were estimated from speech training data so as to minimize the mean square error between the transformed and the original spectra. This algorithm is compared with the following other methods: (1) codebook mapping, (2) a neural network. Through evaluation by the spectral distance measure, it was found that the proposed method achieved a lower spectral distortion than the other methods. Perceptual experiments indicate good performance for the reconstructed broadband speech.

47 citations


Journal ArticleDOI
TL;DR: A new iterative splitting method is proposed, which is applicable to codebook generation without the GLA and outperforms both theGLA and the other existing splitting-based algorithms.
Abstract: The well-known LBG algorithm uses binary splitting for generating an initial codebook, which is then iteratively improved by the generalized Lloyd algorithm (GLA). We study different variants of the splitting method and its application to codebook generation with and without the GLA. A new iterative splitting method is proposed, which is applicable to codebook generation without the GLA. Experiments show that the improved splitting method outperforms both the GLA and the other existing splitting-based algorithms. The best combination uses hyperplane partitioning of the clusters along the principal axis as proposed by Wu and Zhang, integrated with a local repartitioning phase at each step of the algorithm. © 1997 Society of Photo-Optical Instrumentation Engineers. (S0091-3286(97)02311-8)

44 citations
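The basic splitting initialization these variants build on can be sketched as follows; the paper's best variant instead splits along the principal axis with local repartitioning, so the perturbation split below is the simpler classic form, for illustration only:

```python
import numpy as np

def split_init(train, k):
    """Splitting initialization: start from the global centroid and
    repeatedly split the highest-distortion cell by perturbing its
    codeword in opposite directions, until k codewords exist."""
    codebook = [train.mean(axis=0)]
    while len(codebook) < k:
        cb = np.array(codebook)
        d = ((train[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Split the cell that currently contributes the most distortion.
        costs = [d[labels == j, j].sum() for j in range(len(cb))]
        j = int(np.argmax(costs))
        eps = 1e-3 * (train.std(axis=0) + 1e-12)
        c = codebook[j]
        codebook[j] = c - eps
        codebook.append(c + eps)
    return np.array(codebook)
```

In the LBG pipeline each split is followed by GLA refinement; the paper's contribution is an iterative splitting method good enough to be used without the GLA step.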


Patent
15 Oct 1997
TL;DR: In vector quantization, the space is partitioned such that the integrals of the probability density functions over all the subspaces are approximately equal and thereafter the center of gravity of each subspace is taken to represent the subspace population.
Abstract: A data compression system using vector quantisation utilises a codebook or index tables constructed by finding a small set of points in an n-dimensional space which are representative of a much larger population. The space is partitioned such that the integrals of the probability density functions over all the subspaces are approximately equal and thereafter the centre of gravity of each subspace is taken to represent the subspace population. The partitioning is effected sequentially to map n-dimensional space into one dimensional code space.

43 citations
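The equal-probability-mass partitioning rule is easiest to see in one dimension; the sketch below is a scalar illustration, not the patent's n-dimensional sequential mapping:

```python
import numpy as np

def equal_mass_quantizer(samples, k):
    """Scalar illustration of the patent's rule: cell edges are chosen so
    each cell carries (approximately) equal probability mass, and each
    cell is then represented by its centre of gravity (the cell mean)."""
    edges = np.quantile(samples, np.linspace(0, 1, k + 1))
    reps = np.array([samples[(samples >= lo) & (samples <= hi)].mean()
                     for lo, hi in zip(edges[:-1], edges[1:])])
    return edges, reps
```

For a uniform source on [0, 1] and k = 4, the edges land near the quartiles and the representatives near 0.125, 0.375, 0.625, and 0.875.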


Patent
14 Mar 1997
TL;DR: In this article, a method for encoding video data that includes a first frame and a subsequent frame is presented, in which the first frame is segmentable into at least one first block, and the subsequent frame is segmentable into at least one subsequent block.
Abstract: The present invention provides, in one aspect, a computer-implemented method for encoding video data that includes a first frame and a subsequent frame. The first frame is segmentable into at least one first block, and the subsequent frame is segmentable into at least one subsequent block. The method involves obtaining the first frame, and obtaining the subsequent frame in luminance and chrominance space format. A motion analysis is then performed between the subsequent frame and the first frame, and the subsequent block is encoded. Encoding the subsequent block involves using an encoding table generated from an encoding codebook which is designed using a codebook design procedure for structured vector quantization.

43 citations


Journal ArticleDOI
TL;DR: A new search algorithm is proposed which is used to speed up both the codebook generation and the encoding of VQ, and it contains two major techniques: diagonal axes analysis (DAA) and orthogonal checking (OC).
Abstract: Vector quantization (VQ) is a fundamental technique for image compression. But it takes time to search for a similar codeword in a codebook. Thus, the codebook search is one of the major bottlenecks in VQ. We propose a new search algorithm which is used to speed up both the codebook generation and the encoding. We call it the diagonal axes method (DAM). This new algorithm contains two major techniques: diagonal axes analysis (DAA) and orthogonal checking (OC). Since most of these procedures simply involve additions and subtractions, DAM is more efficient than some other related algorithms. Simulation results confirm this effectiveness.
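An addition-only cheap-rejection search in the same spirit can be sketched as follows. The abstract does not detail the diagonal axes analysis or orthogonal checking, so this sketch deliberately substitutes the classic sum-based bound instead; it is not DAM itself:

```python
import numpy as np

def fast_search(x, codebook):
    """Cheap-rejection codebook search: |S(c) - S(x)|^2 / n lower-bounds the
    squared error between x and codeword c (by Cauchy-Schwarz), so a codeword
    can be rejected with additions only, before any full distance is computed."""
    n = len(x)
    sums = codebook.sum(axis=1)
    sx = x.sum()
    best, best_d = -1, np.inf
    # Visit codewords whose sum is closest to the input's sum first, so a
    # good early match lets the bound reject most of the rest.
    for j in np.argsort(np.abs(sums - sx)):
        if (sums[j] - sx) ** 2 / n >= best_d:
            continue                 # bound already exceeds the best match
        d = ((codebook[j] - x) ** 2).sum()
        if d < best_d:
            best, best_d = int(j), d
    return best
```

Because the bound never overestimates the true distance, the result always matches an exhaustive search.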

Patent
30 Jan 1997
TL;DR: In this paper, a speech encoding method and apparatus including analyzing, using a codebook expressing speech parameters within a predetermined search range, an input speech signal in an audibility weighting filter corresponding to a pitch period longer than the search range of the codebook; searching, from the codebook, on the basis of the analysis result, a combination of speech parameters by which the distortion of the input speech signal is minimized; and encoding the combination.
Abstract: A speech encoding method and apparatus including analyzing, using a codebook expressing speech parameters within a predetermined search range, an input speech signal in an audibility weighting filter corresponding to a pitch period longer than the search range of the codebook, and searching, from the codebook, on the basis of the analysis result, a combination of speech parameters by which the distortion of the input speech signal is minimized, and encoding the combination. The apparatus uses an adaptive codebook of pitch and a noise codebook. The codebooks search a group formed by extracting vectors of predetermined length from one original code vector, while sequentially shifting position so that the vectors overlap each other. The search group is further restricted and another preselection is made before the final search. Search is based on inversely convoluted, orthogonally transformed vectors.

Patent
05 Sep 1997
TL;DR: In this paper, it is shown that if the same parameter is repeatedly used in an unvoiced frame inherently devoid of pitch, a pitch at the frame-length period is produced, causing an unnatural sound; this can be prevented by avoiding repeated use of excitation vectors having the same waveform shape.
Abstract: If the same parameter is repeatedly used in an unvoiced frame inherently devoid of pitch, a pitch at the frame-length period is produced, causing an unnatural sound. This can be prevented by avoiding repeated use of excitation vectors having the same waveform shape. To this end, when decoding an encoded speech signal obtained by waveform-encoding an encoding-unit-based time-axis speech signal, itself obtained by splitting an input speech signal in terms of a pre-set encoding unit on the time axis, input data is checked by CRC in a CRC and bad frame masking circuit 281, which processes a frame corrupted with an error by bad frame masking, repeatedly using parameters of the directly previous frame. If the error-corrupted frame is unvoiced, an unvoiced speech synthesis unit 220 adds noise to an excitation vector from a noise codebook or randomly selects the excitation vector of the noise codebook.

Patent
27 Jun 1997
TL;DR: In this paper, a fuzzy Viterbi algorithm is used by a processor to compute maximum likelihood probabilities PR(O|λj) for each vocabulary word, and the fuzzy distance measures and maximum likelihood probabilities are mixed in a variety of ways to preferably optimize speech recognition accuracy and speech recognition speed performance.
Abstract: One embodiment of a speech recognition system is organized with speech input signal preprocessing and feature extraction followed by a fuzzy matrix quantizer (FMQ) designed with respective codebook sets at multiple signal to noise ratios. The FMQ quantizes various training words from a set of vocabulary words and produces observation sequence O output data to train hidden Markov model (HMM) processes λj, and produces fuzzy distance measure output data for each vocabulary word codebook. A fuzzy Viterbi algorithm is used by a processor to compute maximum likelihood probabilities PR(O|λj) for each vocabulary word. The fuzzy distance measures and maximum likelihood probabilities are mixed in a variety of ways to preferably optimize speech recognition accuracy and speech recognition speed performance.

Journal ArticleDOI
01 Apr 1997-Fractals
TL;DR: A fast encoding scheme for fractal image compression is presented that uses a clustering algorithm based on Kohonen's self-organizing maps, yielding a classification with a notion of distance which is not given in traditional classification schemes.
Abstract: A fast encoding scheme for fractal image compression is presented. The method uses a clustering algorithm based on Kohonen's self-organizing maps. Domain blocks are clustered, yielding a classification with a notion of distance which is not given in traditional classification schemes.

Proceedings ArticleDOI
11 Jun 1997-Sequence
TL;DR: This work considers aspects of estimating conditional and unconditional densities in conjunction with Bayes-risk weighted vector quantization for joint compression and classification.
Abstract: The connection between compression and the estimation of probability distributions has long been known for the case of discrete alphabet sources and lossless coding. A universal lossless code which does a good job of compressing must implicitly also do a good job of modeling. In particular, with a collection of codebooks, one for each possible class or model, if codewords are chosen from among the ensemble of codebooks so as to minimize bit rate, then the codebook selected provides an implicit estimate of the underlying class. Less is known about the corresponding connections between lossy compression and continuous sources. We consider aspects of estimating conditional and unconditional densities in conjunction with Bayes-risk weighted vector quantization for joint compression and classification.

Patent
25 Jun 1997
TL;DR: In this paper, a changeover control switch selects the output of an adaptive codebook or a fixed codebook, and the selected output is sent to a linear prediction synthesis filter; waveform distortion caused by selecting the fixed codebook when input speech frequency components change significantly is diminished.
Abstract: In encoding in which an adaptive codebook such as PSI-CELP or a fixed codebook is used on switching selection, waveform distortion caused by selection of the fixed codebook in case input speech frequency components are changed significantly is diminished. An output of an adaptive codebook 21 or an output of a fixed codebook 22 is selected by a changeover selection switch 26 and summed to an output of noise codebooks 23, 24 so as to be sent to a linear prediction synthesis filter 16. A switching control circuit 19 for controlling the switching of a changeover control switch 26 operates in response to a prediction gain which is a ratio of the linear prediction residual energy to the initial signal energy from a linear prediction analysis circuit 14 so that, if the prediction gain is smaller than a pre-set threshold value, the switching control circuit 19 judges the input signal to be voiced and controls the changeover control switch 26 for compulsorily selecting the output of the adaptive codebook 21.

Patent
27 May 1997
TL;DR: In this article, an adaptive codebook is searched using an open-loop pitch extracted on the basis of the residual of the speech, and a renewal excited codebook produced from an adaptive codebook excited signal is searched.
Abstract: In a voice coding and decoding method and apparatus using an RCELP technique, a CELP-series decoder can be obtained at a low transmission rate. A voice spectrum is extracted by performing a short-term linear prediction on the voice signal. An error range in a formant region is widened during adaptive and renewal codebook search by passing the preprocessed voice through a formant weighting filter, and an error range in a pitch on-set region is widened by passing the same through a voice synthesis filter and a harmonic noise shaping filter. An adaptive codebook is searched using an open-loop pitch extracted on the basis of the residual of the speech. A renewal excited codebook produced from an adaptive codebook excited signal is searched. Finally, a predetermined bit is allocated to various parameters to form a bit stream.

Patent
25 Nov 1997
TL;DR: In this paper, a codebook renewal circuit determines a correlative value between a noise code selected by the noise codebook and the input speech vector, subsequently calculates a multiplication value for each of noise codes to generate a renewal code by using the multiplication value with respect to the code selected most frequently by the coding processing at the time of voice.
Abstract: A noise codebook selects a code most suitable to the characteristics of an input speech vector from an inside quantification table. Furthermore, a codebook renewal circuit determines a correlative value between a noise code selected by the noise codebook and the input speech vector, subsequently calculating a multiplication value for each of the noise codes to generate a renewal code by using the multiplication value with respect to the code selected most frequently by the coding processing at the time of voice. Renewal processing is performed by replacing a desired code of the codebook with the renewal code. Furthermore, the renewal code is sent to a multiplexing circuit together with a renewal flag value to be sent to a decoding device by using the superfluous bit portion of an unvoiced frame.

Journal ArticleDOI
TL;DR: In this article, the authors first estimate the rate-distortion bound, which is the theoretical limit in the compression of ECG data, and then present ECG data-compression schemes based on a codebook quantizer and finite-state VQ (FSVQ), which are suitable for coding a correlated signal.
Abstract: The authors first estimate the ECG rate-distortion bound, which is the theoretical limit in the compression of the ECG data. They then present ECG data-compression schemes based on a codebook quantizer and finite-state VQ (FSVQ), which are suitable for coding a correlated signal. Then, the authors' modified FSVQ-based scheme is presented.

Patent
23 Dec 1997
TL;DR: In this paper, the authors proposed a method for automatically generating a codebook for use in the transmitter and receiver components of an amplitude-adaptive normalized differential vector quantization system based upon a specified signal-to-noise ratio goal.
Abstract: The present invention relates to the field of vector quantization of transmitted imagery. In particular, the invention is a method for automatically generating a codebook for use in the transmitter and receiver components of an amplitude-adaptive normalized differential vector quantization system based upon a specified signal-to-noise ratio goal. In accordance with this method, the amplitude thresholds and other data required to determine the trade-off between image quality and data compression for the transmitted imagery are derived automatically from the specified signal-to-noise ratio goal.

Proceedings ArticleDOI
26 Oct 1997
TL;DR: In order to optimize the codebook used by the vector quantization compression scheme, a process based on the max-min algorithm is developed that optimizes color space partitioning from vector blocks selected iteratively within the training set according to three algorithms.
Abstract: In order to optimize the codebook used by the vector quantization compression scheme, we have developed a process based on the max-min algorithm. This process optimizes color space partitioning from vector blocks selected iteratively within the training set according to three algorithms. The partitioning algorithm is based on the nearest neighbor query. The selection algorithm searches for the furthest color from the nearest vector block of the training set already computed. A centroid process generates the codebook, refining the vector block selection. To counterbalance cases in which the centroid process modifies the vector block selection, we have introduced three tests. These tests restrict the training set from which representative colors can be selected.

PatentDOI
TL;DR: In one embodiment, a fuzzy Viterbi algorithm is used with the hidden Markov models to describe the speech input signal probabilistically.
Abstract: In one embodiment, a speech recognition system is organized with a fuzzy matrix quantizer with a single codebook representing u codewords. The single codebook is designed with entries from u codebooks which are designed with respective words at multiple signal to noise ratio levels. Such entries are, in one embodiment, centroids of clustered training data. The training data is, in one embodiment, derived from line spectral frequency pairs representing respective speech input signals at various signal to noise ratios. The single codebook trained in this manner provides a codebook for a robust front end speech processor, such as the fuzzy matrix quantizer, for training a speech classifier such as a u hidden Markov models and a speech post classifier such as a neural network. In one embodiment, a fuzzy Viterbi algorithm is used with the hidden Markov models to describe the speech input signal probabilistically.

Journal ArticleDOI
TL;DR: The experiments with LVQ and SOMs show reductions both in the average phoneme recognition error rate and in the computational load compared to the maximum likelihood training and the Generalized Probabilistic Descent.

Proceedings ArticleDOI
21 Apr 1997
TL;DR: A heuristic training procedure is proposed to retrain the codebooks so that they give a lower classification error rate for randomly selected vector-groups in VQ-based speaker recognition.
Abstract: VQ-based speaker recognition has proven to be a successful method. Usually, a codebook is trained to minimize the quantization error for the data from an individual speaker. Codebooks trained based on this criterion have weak discriminative power when used as a classifier. The LVQ algorithm can be used to globally train the VQ-based classifier. However, the correlation between the feature vectors is not taken into consideration; in consequence, a high classification rate for feature vectors does not lead to a high classification rate for the test sentences. A heuristic training procedure is proposed to retrain the codebooks so that they give a lower classification error rate for randomly selected vector-groups. Evaluation experiments demonstrated that the codebooks trained with this method provide much higher recognition rates than those trained with the LBG algorithm alone, and often they can outperform the more powerful Gaussian mixture speaker models.
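The baseline VQ classifier that such retraining improves can be sketched as follows; `classify` is an illustrative minimal version, not the paper's system:

```python
import numpy as np

def classify(frames, codebooks):
    """Baseline VQ speaker classifier: score each speaker by the average
    quantization distortion of the test frames against that speaker's
    codebook, and pick the speaker with the minimum score."""
    scores = []
    for cb in codebooks:
        d = ((frames[:, None, :] - cb[None, :, :]) ** 2).sum(axis=2)
        scores.append(d.min(axis=1).mean())
    return int(np.argmin(scores))
```

The paper's point is that codebooks trained purely to minimize this per-speaker distortion are not trained to separate speakers, which is what the discriminative retraining on vector-groups addresses.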


Journal ArticleDOI
TL;DR: An automatic target recognition (ATR) classifier is constructed that uses a set of dedicated vector quantizers (VQs) that are iteratively adapted to represent a particular subband of a given target class at a specific range of aspects.
Abstract: An automatic target recognition (ATR) classifier is constructed that uses a set of dedicated vector quantizers (VQs). The background pixels in each input image are properly clipped out by a set of aspect windows. The extracted target area for each aspect window is then enlarged to a fixed size, after which a wavelet decomposition splits the enlarged extraction into several subbands. A dedicated VQ codebook is generated for each subband of a particular target class at a specific range of aspects. Thus, each codebook consists of a set of feature templates that are iteratively adapted to represent a particular subband of a given target class at a specific range of aspects. These templates are then further trained by a modified learning vector quantization (LVQ) algorithm that enhances their discriminatory characteristics.

Patent
29 Dec 1997
TL;DR: In this article, a scaled adaptive codebook excitation signal and a scaled fixed excitation signal are combined to generate an excitation signal having a first word length; scaling that excitation signal by an overall gain yields a scaled excitation signal that may have a second word length greater than the first.
Abstract: A synthesizer may synthesize speech by receiving an adaptive codebook excitation signal and an adaptive codebook gain. The adaptive codebook excitation signal may be scaled using the adaptive codebook gain to generate a scaled adaptive codebook excitation signal. A fixed excitation signal and a fixed excitation gain may also be received. The fixed excitation signal may be scaled using the fixed excitation gain to generate a scaled fixed excitation signal. The scaled adaptive codebook excitation signal and the scaled fixed excitation signal may be combined to generate the excitation signal having a first word length. An overall gain signal of the excitation signal may also be received. A scaled excitation signal may then be generated by scaling the excitation signal using the overall gain signal. The scaled excitation signal may have a second word length greater than the first word length.

Journal ArticleDOI
TL;DR: An adaptive vector quantization scheme with codebook transmission is derived for the variable-rate source coding of image data using an entropy-constrained Lagrangian framework; the algorithm guarantees that the operational codebook C(0) will have rate-distortion performance better than or equal to that of any initial codebook C(I).
Abstract: An adaptive vector quantization (VQ) scheme with codebook transmission is derived for the variable-rate source coding of image data using an entropy-constrained Lagrangian framework. Starting from an arbitrary initial codebook C/sub I/ available to both the encoder and decoder, the proposed algorithm iteratively generates an improved operational codebook C/sub 0/ that is well adapted to the statistics of a particular image or subimage. Unlike other approaches, the rate-distortion trade-offs associated with the transmission of updated code vectors to the decoder are explicitly considered in the design. In all cases, the algorithm guarantees that the operational codebook C/sub 0/ will have rate-distortion performance (including all side-information) better than or equal to that of any initial codebook C/sub I/. When coding the Barbara image, improvement at all rates is demonstrated with observed gains of up to 3 dB in peak signal-to-noise ratio (PSNR). Whereas in general the algorithm is multipass in nature, encoding complexity can be mitigated without an exorbitant rate-distortion penalty by restricting the total number of iterations. Experiments are provided that demonstrate substantial rate-distortion improvement can be achieved with just a single pass of the algorithm.
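The entropy-constrained assignment rule underlying such Lagrangian designs can be sketched as follows; `ec_assign` and its toy rates are illustrative, not the paper's algorithm:

```python
import numpy as np

def ec_assign(x, codebook, rates, lam):
    """Entropy-constrained assignment: each input minimizes
    distortion + lambda * rate rather than distortion alone, so an
    expensive (long-codeword) entry must earn its bits in reduced error."""
    d = ((codebook - x) ** 2).sum(axis=1)
    return int(np.argmin(d + lam * rates))
```

Sweeping lambda traces out the rate-distortion trade-off; in the paper's setting, the bits spent transmitting updated code vectors enter the same Lagrangian cost, which is why the improvement guarantee holds including all side-information.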

Patent
Masami Akamine1, Tadashi Amada1
15 Aug 1997
TL;DR: In this paper, a perceptual weighting filter is used to generate a weighted error vector, the codebook is searched for a code vector that minimizes the weighted error vector, and an index corresponding to the code vector found is output as an encoding parameter.
Abstract: A speech encoding method including generating a reconstruction speech vector by using a code vector extracted from a codebook storing a plurality of code vectors for encoding a speech signal. In addition, an input speech signal to be encoded is used as a target vector to generate an error vector representing the error of the reconstruction speech vector with respect to the target vector, and the error vector is passed through a perceptual weighting filter having a transfer function including the inverse characteristics of the transfer function of a filter for emphasizing the spectrum of a reconstructed speech signal. Thus a weighted error vector is generated, the codebook is searched for a code vector that minimizes the weighted error vector, and an index corresponding to the code vector found is output as an encoding parameter.