
Showing papers on "Lossless compression published in 1989"


Journal ArticleDOI
E. R. Fiala1, Daniel H. Greene1
TL;DR: The article describes modifications of McCreight's suffix tree data structure that support cyclic maintenance of a window on the most recent source characters and explores the tradeoffs between compression time, expansion time, data structure size, and amount of compression achieved.
Abstract: Several methods are presented for adaptive, invertible data compression in the style of Lempel's and Ziv's first textual substitution proposal. For the first two methods, the article describes modifications of McCreight's suffix tree data structure that support cyclic maintenance of a window on the most recent source characters. A percolating update is used to keep node positions within the window, and the updating process is shown to have constant amortized cost. Other methods explore the tradeoffs between compression time, expansion time, data structure size, and amount of compression achieved. The article includes a graph-theoretic analysis of the compression penalty incurred by our codeword selection policy in comparison with an optimal policy, and it includes empirical studies of the performance of various adaptive compressors from the literature.
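As a concrete illustration of the textual-substitution idea discussed above, the following sketch implements a greedy LZ77-style compressor over a bounded window. It uses a brute-force match search rather than the McCreight suffix tree and percolating update described in the article, so it shows only the codeword structure, not the paper's data structures or codeword selection policy; the window size and match limits are arbitrary choices.

```python
# Minimal sketch of LZ77-style textual substitution over a sliding window.
# The paper maintains the window with a suffix tree and a percolating
# update; this illustration uses a brute-force match search, which is far
# slower but produces the same kind of (offset, length) / literal codewords.

WINDOW = 4096
MIN_MATCH = 3
MAX_MATCH = 255

def compress(data: bytes):
    i, out = 0, []
    while i < len(data):
        start = max(0, i - WINDOW)
        best_len, best_off = 0, 0
        for j in range(start, i):
            length = 0
            while (length < MAX_MATCH and i + length < len(data)
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= MIN_MATCH:
            out.append(('copy', best_off, best_len))
            i += best_len
        else:
            out.append(('lit', data[i]))
            i += 1
    return out

def decompress(codewords):
    buf = bytearray()
    for cw in codewords:
        if cw[0] == 'lit':
            buf.append(cw[1])
        else:                          # copy (offset, length), may overlap
            _, off, length = cw
            for _ in range(length):
                buf.append(buf[-off])
    return bytes(buf)

text = b"abracadabra abracadabra abracadabra"
assert decompress(compress(text)) == text
```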

240 citations


Patent
29 Mar 1989
TL;DR: In this article, a computer-based video compression system for processing natural information signals, such as video or audio signals, is provided by a unique subsystem in a host computer that uses a high-speed signal processor to compress and expand video images, forming a compact data file of compressed signals which can be expanded to reproduce the original information signals.
Abstract: A computer-based video compression system for processing natural information signals, such as video signals or audio signals, forms a compact data file of compressed signals which can be expanded to reproduce the original information signals. The system is provided by a unique subsystem in a host computer which uses a high-speed signal processor to compress and expand video images. The unique compression process uses segmentation and predictive techniques in conjunction with a discrete sine transform, quantization and Huffman coding to provide optimal compression. Further, the segmentation and predictive techniques in conjunction with the discrete sine transform diminish the magnitude of correlated errors generated by the compression process. The compression system of this invention is further enhanced by a special circuit which transfers data between the host computer and the compression subsystem at rates greater than were previously possible.

178 citations


Journal ArticleDOI
TL;DR: The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression, and here the implementation of models based on recognizing and recording words is considered.
Abstract: The development of efficient algorithms to support arithmetic coding has meant that powerful models of text can now be used for data compression. Here the implementation of models based on recognizing and recording words is considered. Move-to-the-front and several variable-order Markov models have been tested with a number of different data structures; first the decisions that went into the implementations are discussed, and then experimental results are given that show English text being represented in under 2.2 bits per character. Moreover, the programs run at speeds comparable to other compression techniques, and are suited for practical use.
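To make the word-based modelling idea concrete, the sketch below tokenizes text into alternating word and non-word runs and estimates the cost of coding the token stream with an adaptive zero-order frequency model, which is the cost an arithmetic coder would approach. The tokenizer, the escape cost for novel words, and the sample text are assumptions for illustration; this is not the paper's coder or its Markov models.

```python
# Sketch of word-based modelling: split text into alternating "word" and
# "non-word" tokens and estimate the coding cost under an adaptive
# zero-order frequency model.  Novel tokens are charged an escape plus
# roughly 8 bits per spelled-out character -- an illustrative assumption.
import math
import re
from collections import Counter

def token_stream(text):
    # alternating runs of alphanumerics ("words") and everything else
    return re.findall(r"[A-Za-z0-9]+|[^A-Za-z0-9]+", text)

def estimated_bits(text):
    counts, total, bits = Counter(), 0, 0.0
    for tok in token_stream(text):
        if counts[tok]:
            bits += -math.log2(counts[tok] / (total + 1))
        else:
            # escape symbol plus the new token spelled out
            bits += -math.log2(1 / (total + 1)) + 8 * len(tok)
        counts[tok] += 1
        total += 1
    return bits

sample = ("the quick brown fox jumps over the lazy dog " * 50 +
          "data compression of words and non-words " * 50)
print(f"{estimated_bits(sample) / len(sample):.2f} bits per character")
```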

176 citations


Patent
10 Oct 1989
TL;DR: In this paper, a method for the compression and decompression of binary test images is proposed, which distinguishes between large low-frequency areas and small high frequency areas in the original frame.
Abstract: The invention relates to a method for the compression and decompression of binary test images. The method distinguishes between large low-frequency areas and small high-frequency areas in the original frame. For the low-frequency areas, a scheme for lossy compression is used, whereas for the high-frequency areas, a scheme permitting lossless compression is applied. The compression/decompression process involves five stages: prefiltering to remove all black patches (e.g. by removing all black pixels, except where they belong to a large black segment); fast evaluation of compressibility by partitioning the images into mutually exclusive segments and applying different compression modes to each segment; connectivity-oriented subsampling to reduce the resolution in the horizontal and vertical directions, which segments the image into blocks and determines a 1-pixel representation for each block; lossless compression and decompression, where the reduced file is compressed by conventional techniques; and reconstruction by sequence reversal, so that lossless decompression retrieves the subsampled file, the subsampled file is expanded by replacing each pixel with a block of equal value, and postfiltering is applied.

171 citations


Journal ArticleDOI
TL;DR: A self-contained discussion of discrete-time lossless systems and their properties and relevance in digital signal processing is presented and the most general form of a rational lossless transfer matrix is presented along with synthesis procedures for the FIR (finite impulse response) case.
Abstract: A self-contained discussion of discrete-time lossless systems and their properties and relevance in digital signal processing is presented. The basic concept of losslessness is introduced, and several algebraic properties of lossless systems are studied. An understanding of these properties is crucial in order to exploit the rich usefulness of lossless systems in digital signal processing. Since lossless systems typically have many input and output terminals, a brief review of multi-input multi-output systems is included. The most general form of a rational lossless transfer matrix is presented along with synthesis procedures for the FIR (finite impulse response) case. Some applications of lossless systems in signal processing are presented.
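The defining property of a discrete-time lossless (paraunitary) system E(z) is that E†(e^{jω})E(e^{jω}) equals the identity at every frequency. The small check below verifies this numerically for a simple 2x2 FIR example of our own choosing; it is not a system taken from the paper.

```python
# Numeric check of losslessness (paraunitarity) for a simple 2x2 FIR
# transfer matrix: E^H(e^{jw}) E(e^{jw}) should be the identity at every
# frequency.  The example matrix is ours, not the paper's:
#     E(z) = 1/sqrt(2) * [[1,     1   ],
#                         [z^-1, -z^-1]]
import numpy as np

def E(z):
    return (1 / np.sqrt(2)) * np.array([[1.0, 1.0],
                                        [1.0 / z, -1.0 / z]])

for w in np.linspace(0, 2 * np.pi, 64, endpoint=False):
    z = np.exp(1j * w)
    Ez = E(z)
    assert np.allclose(Ez.conj().T @ Ez, np.eye(2)), "not paraunitary"
print("E(z) is lossless (paraunitary) on the unit circle")
```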

105 citations


Proceedings ArticleDOI
01 Jan 1989
TL;DR: The applied network model is a feedforward-type, three-layered network with the backpropagation learning algorithm, and the implementation of this model on a hypercube parallel computer and its computation performance are described.
Abstract: Data compression and generalization capabilities are important for neural network models as learning machines. From this point of view, the image data compression characteristics of a neural network model are examined. The applied network model is a feedforward-type, three-layered network with the backpropagation learning algorithm. The implementation of this model on a hypercube parallel computer and its computation performance are described. Image data compression, generalization, and quantization characteristics are examined experimentally. Effects of learning using the discrete cosine transformation coefficients as initial connection weights are shown experimentally.

78 citations


Journal ArticleDOI
L. Wang1, M. Goldberg1
TL;DR: The experiments demonstrate that it is possible to achieve simultaneously lossless and progressive transmission with compression, and at the intermediate level, the use of vector quantization results in a coding gain over that obtained using only a Huffman coder.
Abstract: A progressive image transmission scheme in which vector quantization is applied to images represented by pyramids is proposed. A mean pyramid representation of an image is first built up by forming a sequence of reduced-size images by averaging over blocks of 2×2 pixels. A difference pyramid is then built up by taking the differences between successive levels in the mean pyramid. Progressive transmission is achieved by sending all the nodes in the difference pyramid starting from the top level and ending at the bottom level. The kth approximate image can be formed by adding the information of level k to the previously reproduced (k-1)st approximation. To gain efficiency, vector quantization is applied to the difference pyramid of the image on a level-by-level basis. If the errors due to quantization at level k are properly delivered and included in the next level, k+1, then it is demonstrated that the original image can be reconstructed. An entropy coder is used to encode the final residual error image losslessly, thus ensuring perfect reproduction of the original image. The experiments demonstrate that it is possible to achieve simultaneously lossless and progressive transmission with compression. At the intermediate level, the use of vector quantization results in a coding gain over that obtained using only a Huffman coder. Excellent reproduction is achieved at a bit rate of only 0.06 bits/pixel.
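The pyramid bookkeeping described above can be sketched in a few lines: build the mean pyramid by 2×2 averaging, form the difference pyramid against the expanded coarser levels, and reconstruct exactly by adding the differences back level by level. The sketch below assumes a square image with a power-of-two side and omits the vector quantization and entropy coding stages.

```python
# Mean-pyramid / difference-pyramid construction and the exact (lossless)
# reconstruction it permits.  Assumes a square image with power-of-two side;
# vector quantization and entropy coding of the levels are omitted.
import numpy as np

def mean_pyramid(img):
    levels = [img.astype(np.int32)]
    while levels[-1].shape[0] > 1:
        a = levels[-1]
        levels.append((a[0::2, 0::2] + a[0::2, 1::2] +
                       a[1::2, 0::2] + a[1::2, 1::2]) // 4)
    return levels                      # levels[0] = full resolution

def expand(a):
    return np.kron(a, np.ones((2, 2), dtype=a.dtype))

def difference_pyramid(levels):
    # keep the top level; every other level stores its difference from the
    # expanded coarser level
    return [levels[-1]] + [levels[k] - expand(levels[k + 1])
                           for k in range(len(levels) - 2, -1, -1)]

def reconstruct(diffs):
    img = diffs[0]
    for d in diffs[1:]:
        img = expand(img) + d          # progressively refine
    return img

rng = np.random.default_rng(0)
original = rng.integers(0, 256, size=(64, 64))
assert np.array_equal(reconstruct(difference_pyramid(mean_pyramid(original))),
                      original)
```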

60 citations


Proceedings ArticleDOI
15 Aug 1989
TL;DR: This work describes a model of the CSF that includes changes as a function of image noise level by using the concepts of internal visual noise, and tests this model in the context of image compression with an observer study.
Abstract: The visual contrast sensitivity function (CSF) has found increasing use in image compression as new algorithms optimize the display-observer interface in order to reduce the bit rate and increase the perceived image quality. In most compression algorithms, increasing the quantization intervals reduces the bit rate at the expense of introducing more quantization error, a potential image quality degradation. The CSF can be used to distribute this error as a function of spatial frequency such that it is undetectable by the human observer. Thus, instead of being mathematically lossless, the compression algorithm can be designed to be visually lossless, with the advantage of a significantly reduced bit rate. However, the CSF is strongly affected by image noise, changing in both shape and peak sensitivity. This work describes a model of the CSF that includes these changes as a function of image noise level by using the concepts of internal visual noise, and tests this model in the context of image compression with an observer study.

60 citations


Journal ArticleDOI
TL;DR: This paper introduces a reduced-difference pyramid data structure in which the number of nodes, corresponding to a set of decorrelated difference values, is exactly equal to thenumber of pixels.
Abstract: Pyramid data structures have found an important role in progressive image transmission. In these data structures, the image is hierarchically represented, with each level corresponding to a reduced-resolution approximation. To achieve progressive image transmission, the pyramid is transmitted starting from the top level. However, in the usual pyramid data structures, extra significant bits may be required to accurately record the node values, the number of data to be transmitted may be expanded, and the node values may be highly correlated. In this paper, we introduce a reduced-difference pyramid data structure in which the number of nodes, corresponding to a set of decorrelated difference values, is exactly equal to the number of pixels. Experimental results demonstrate that the reduced-difference pyramid results in lossless progressive image transmission with some degree of compression. By use of an appropriate interpolation method, reasonable quality approximations are achieved at a bit rate less than 0.1 bit/pixel and excellent quality approximations at a bit rate of about 1.3 bits/pixel.

55 citations


Journal ArticleDOI
TL;DR: A novel universal data compression algorithm encodes L source symbols at a time and an upper limit for the number of bits per source symbol is given for the class of binary stationary sources.
Abstract: A novel universal data compression algorithm is described. This algorithm encodes L source symbols at a time. An upper limit for the number of bits per source symbol is given for the class of binary stationary sources. In the author's analysis, a property of repetition times turns out to be of crucial importance.

52 citations


Patent
31 Aug 1989
TL;DR: In this article, a lossless guided-wave switches with more than 30 dB of crosstalk-isolation are comprised of branched channel waveguides with laser-like cross-sections.
Abstract: Lossless guided-wave switches with more than 30 dB of crosstalk-isolation are comprised of branched channel waveguides with laser-like cross-sections. Optical gain, sufficient to overcome power-splitting losses, is provided by carrier-injection currents. Due to its low-noise properties, the single-quantum-well structure is found to be optimum for cascading switches into a multi-stage network. A lossless 1×N network with 1024 switched outputs should be feasible.

Proceedings ArticleDOI
01 Nov 1989
TL;DR: Experimental results from applying morphological pyramids for image compression and progressive transmission show that high quality reconstructions of original images from the corresponding error pyramids can be achieved with significant reduction in total entropies.
Abstract: In this paper, the concept of morphological pyramids for image compression and progressive transmission is discussed. Experimental results from applying these pyramids to three real images, a satellite cloud image, a tank image from aerial photography and an NMR skull image, are presented. For lossless compression, no reduction in the total (first order) entropies of error pyramids derived from the original images is observed in any of the three cases. However, high quality reconstructions of original images from the corresponding error pyramids can be achieved with significant reduction in total entropies.

ReportDOI
01 Sep 1989
TL;DR: A short review of the theory of iterated function systems (IFS), a thorough explanation of their implementation, and an example using computer code useful in developing encoded images are presented.
Abstract: A short review of the theory of iterated function systems (IFS), a thorough explanation of their implementation, and an example using computer code useful in developing encoded images are presented. An example of an encoded map, with a brief discussion of data compression and error analysis, is presented. Details of an extension of IFS codes which allows for mixing of images, thereby resulting in a system with substantially increased power, are given. A simple scheme for automatic generation of IFS codes is given, followed by a discussion of improvements which may lead to a more generally useful data compression system.
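A minimal example of decoding an IFS code is the random-iteration ("chaos game") algorithm shown below. The affine maps are the standard Barnsley fern coefficients, a textbook IFS code rather than one taken from the report, and the ASCII rendering parameters are arbitrary.

```python
# Decoding an IFS code with the random-iteration ("chaos game") algorithm.
# The four affine maps (a, b, c, d, e, f, probability) are the standard
# Barnsley fern coefficients -- a textbook IFS code, not one from the report.
import random

FERN = [
    ( 0.00,  0.00,  0.00, 0.16, 0.0, 0.00, 0.01),
    ( 0.85,  0.04, -0.04, 0.85, 0.0, 1.60, 0.85),
    ( 0.20, -0.26,  0.23, 0.22, 0.0, 1.60, 0.07),
    (-0.15,  0.28,  0.26, 0.24, 0.0, 0.44, 0.07),
]

def render(ifs, n_points=50_000, width=60, height=30):
    x = y = 0.0
    grid = [[' '] * width for _ in range(height)]
    for _ in range(n_points):
        a, b, c, d, e, f, _p = random.choices(ifs, weights=[m[6] for m in ifs])[0]
        x, y = a * x + b * y + e, c * x + d * y + f
        col = int((x + 3) / 6 * (width - 1))     # fern fits in [-3,3] x [0,10]
        row = int((1 - y / 10) * (height - 1))
        if 0 <= col < width and 0 <= row < height:
            grid[row][col] = '*'
    return '\n'.join(''.join(r) for r in grid)

print(render(FERN))
```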

Proceedings ArticleDOI
15 Aug 1989
TL;DR: A psychophysical method is used to estimate, for a number of compression techniques, a threshold bit-rate yielding a criterion level of performance in discriminating original and compressed images.
Abstract: Image compression schemes abound, yet little work compares their bit-rate performance on the basis of subjective fidelity measures. Statistical measures of image fidelity, such as squared-error measures, do not necessarily correspond to subjective measures of image fidelity. Most previous comparisons of compression techniques have been based on these statistical measures. A psychophysical method has been used to estimate, for a number of compression techniques, a threshold bit-rate yielding a criterion level of performance in discriminating original and compressed images. The compression techniques studied include block truncation, Laplacian pyramid, block discrete cosine transform (with and without human visual system scaling), and cortex transform coders.

Patent
10 Oct 1989
TL;DR: In this article, a method and apparatus featuring the compression and decompression of the video image data on a single application specific integrated circuit (ASIC) which utilizes typical Huffman coding techniques, but which defines a run count generally as a plurality of zeros between each pair of ones and further uses a prediction window having substantially no directional biasing errors associated therewith.
Abstract: A method and apparatus featuring the compression and decompression of the video image data on a single application specific integrated circuit (ASIC) which utilizes typical Huffman coding techniques, but which defines a run count generally as a plurality of zeros between each pair of ones and which further uses a prediction window having substantially no directional biasing errors associated therewith. Additionally, the compression/decompression apparatus of this invention utilizes a reordering technique whereby all of the "high confidence" predictions are placed in one end of the buffer and all the "low confidence" predictions are placed in another end of the same buffer. This reordering, together with the unbiased window configuration and associated run length count definition, allows for high efficiency lossless compression and decompression of associated image data.
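The run-count definition used in the patent, the number of zeros preceding each one, can be sketched as follows; the Huffman coding of the counts, the prediction window, and the confidence-based buffer reordering are omitted, so this only illustrates the run-length representation itself.

```python
# Sketch of the run-count idea: a binary (prediction-error) bit stream is
# represented by the number of zeros preceding each one.  The Huffman stage
# and the prediction window are omitted.

def to_run_counts(bits):
    runs, zeros = [], 0
    for b in bits:
        if b == 0:
            zeros += 1
        else:
            runs.append(zeros)
            zeros = 0
    return runs, zeros            # trailing zeros kept separately

def from_run_counts(runs, trailing_zeros):
    bits = []
    for r in runs:
        bits.extend([0] * r + [1])
    bits.extend([0] * trailing_zeros)
    return bits

bits = [0, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 0, 1, 0, 0]
runs, tail = to_run_counts(bits)
print(runs, tail)                 # [3, 1, 0, 5] 2
assert from_run_counts(runs, tail) == bits
```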

Proceedings ArticleDOI
05 Apr 1989
TL;DR: Many image transmission/storage applications requiring some form of data compression additionally require that the decoded image be an exact replica of the original, and lossless image coding algorithms meet this requirement by generating a decoded image that is numerically identical to the original.
Abstract: Many image transmission/storage applications requiring some form of data compression additionally require that the decoded image be an exact replica of the original. Lossless image coding algorithms meet this requirement by generating a decoded image that is numerically identical to the original. Several lossless coding techniques are modifications of well-known lossy schemes, whereas others are new. Traditional Markov-based models and newer arithmetic coding techniques are applied to predictive coding, bit plane processing, and lossy plus residual coding. Generally speaking, the compression ratios offered by these techniques are in the range of 1.6:1 to 3:1 for 8-bit pictorial images. Compression ratios for 12-bit radiological images approach 3:1, as these images have less detailed structure, and hence, their higher pel correlation leads to a greater removal of image redundancy.
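A minimal sketch of the predictive-coding approach mentioned above: predict each pixel from its left neighbour and measure the zero-order entropy of the residuals, which bounds what a Huffman or arithmetic coder could achieve. The predictor, the synthetic test image, and the entropy estimate are illustrative assumptions, not a scheme evaluated in the paper.

```python
# Lossless predictive coding in miniature: left-neighbour prediction plus an
# entropy estimate of the residuals (the bound a Huffman or arithmetic coder
# would approach).  The simplest possible predictor, for illustration only.
import numpy as np

def residuals(img):
    pred = np.zeros_like(img)
    pred[:, 1:] = img[:, :-1]                 # left-neighbour predictor
    return (img.astype(np.int16) - pred.astype(np.int16)) & 0xFF  # mod 256

def entropy_bits_per_pixel(values):
    _, counts = np.unique(values, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

rng = np.random.default_rng(1)
# a smooth synthetic "image": a ramp plus mild noise, 8 bits/pixel raw
img = (np.add.outer(np.arange(128), np.arange(128))
       + rng.integers(0, 4, (128, 128))) % 256
bpp = entropy_bits_per_pixel(residuals(img))
print(f"residual entropy: {bpp:.2f} bits/pixel, ratio ~ {8 / bpp:.1f}:1")
```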

Proceedings ArticleDOI
23 May 1989
TL;DR: The proposed coder is one of a class of pyramidal coders employing what can be regarded as noncausal predictive methods and is shown to be effective at rates below 1 bit/pixel and to be capable of producing decoded images with no gross distortions at compression ratios of up to 100.
Abstract: The authors describe an image data compression algorithm based on multiresolution interpolation controlled by local feature orientation. The proposed coder is one of a class of pyramidal coders employing what can be regarded as noncausal predictive methods. It is shown to be effective at rates below 1 bit/pixel and to be capable of producing decoded images with no gross distortions at compression ratios of up to 100. The coder is described in the context of pyramid models, and its performance on a number of test images is presented and compared with that of other methods. Parametric forms giving varying degrees of global and local nonuniformity are presented and applied successfully to the problem of image data compression. The implications of this work for image modeling and data compression theory are discussed.

Proceedings ArticleDOI
01 Nov 1989
TL;DR: This study revisits simple methods, subsampling and modulo differentiation, to combine them in a system suitable for operation in ISDN as well as non-ISDN environments with good overall performance.
Abstract: This study revisits simple methods, subsampling and modulo differentiation, to combine them in a system suitable for operation in ISDN as well as non-ISDN environments with good overall performance. The modulo difference of selected pairs of pixels in a digital image enables the design of a hierarchal code which represents the original digital image without any information loss. The pixels are selected by pairing together consecutive non-overlapping pixels in the original image and then in successive subsampled versions of it. The subsampling is 2:1 and, like the pairing, alternates between the horizontal and vertical directions. This algorithm uniquely combines hierarchal and parallel processing. Further processing of the hierarchal code through an entropy coder achieves good compression. If a Huffman code is used, default tables can be designed with minimum penalty. Parallel processing allows the coder to operate at various speeds, up to real time, and the hierarchal data structure enables operation in a progressive transmission mode. The algorithm's inherent simplicity and the use of the modulo difference operation make this coding scheme computationally simple and robust to noise and errors in coding or transmission. The system can be implemented economically in software as well as in hardware. Coder and decoder are symmetrical. Compression results can be slightly improved by using 2-D prediction, but at the cost of an increase in system complexity and sensitivity to noise.
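The hierarchical modulo-difference code lends itself to a short sketch: pair pixels along alternating directions, keep the modulo-256 difference of each pair, and recurse on the surviving half until a single pixel remains. The sketch below assumes a square image with power-of-two dimensions and omits the entropy coder applied to the difference planes; reconstruction is exact.

```python
# Hierarchical modulo-difference code: pixels are paired along alternating
# directions, the modulo-256 difference of each pair is stored, and the
# surviving half is subsampled 2:1 and processed again.  The entropy coder
# for the difference planes is omitted; only the lossless hierarchy is shown.
import numpy as np

def encode(img):
    planes, axis = [], 1
    cur = img.astype(np.uint8)
    while cur.size > 1:
        a = cur[:, 0::2] if axis == 1 else cur[0::2, :]
        b = cur[:, 1::2] if axis == 1 else cur[1::2, :]
        planes.append((b - a, axis))            # uint8 subtraction wraps mod 256
        cur, axis = a, 1 - axis                 # keep one pixel per pair
    return cur, planes                          # 1x1 top + difference planes

def decode(top, planes):
    cur = top
    for diff, axis in reversed(planes):
        b = cur + diff                          # uint8 addition wraps mod 256
        if axis == 1:
            out = np.empty((cur.shape[0], cur.shape[1] * 2), dtype=np.uint8)
            out[:, 0::2], out[:, 1::2] = cur, b
        else:
            out = np.empty((cur.shape[0] * 2, cur.shape[1]), dtype=np.uint8)
            out[0::2, :], out[1::2, :] = cur, b
        cur = out
    return cur

rng = np.random.default_rng(2)
img = rng.integers(0, 256, size=(64, 64), dtype=np.uint8)
top, planes = encode(img)
assert np.array_equal(decode(top, planes), img)
```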

Proceedings ArticleDOI
27 Nov 1989
TL;DR: The ISO/CCITT Joint Photographic Experts Group is in the process of developing an international standard for general-purpose, continuous-tone still-image compression, which consists of a baseline system, a simple coding method sufficient for many applications, a set of extended system capabilities, and an independent lossless method for applications needing that type of compression only.
Abstract: The ISO/CCITT Joint Photographic Experts Group is in the process of developing an international standard for general-purpose, continuous-tone still-image compression. A brief history is presented as background to a summary of the past year's progress, which was highlighted by definition of the overall structure of the proposed standard. This structure consists of a baseline system (a simple coding method sufficient for many applications), a set of extended system capabilities (which extend the baseline system to satisfy a broader range of applications), and an independent lossless method for applications needing only that type of compression.

Proceedings ArticleDOI
05 Apr 1989
TL;DR: This paper presents a practical error-tolerant compression algorithm for recording color pictures digitally on a tape and provides a real-time architecture such that the processing of picture compression is implemented in a single VLSI chip.
Abstract: Real-time color image compression is always needed. This paper presents a practical error-tolerant compression algorithm for recording color pictures digitally on a tape and provides a real-time architecture such that the picture compression processing is implemented in a single VLSI chip. The algorithm is based on the principle of block truncation coding (BTC). The picture is represented in Y-I-Q color space and each plane is divided into small blocks such that a reconstructed picture still retains quality appropriate for 3R prints while keeping compression efficiency. Any single-channel error is restricted to a very small block of the picture, and this feature of error tolerance is important for the application of picture recording. The real-time architectures for the three signal channels work in parallel, and each channel has a pipelined architecture. The architecture also needs two 4-line input buffers with 24 bits in depth and two 96-bit output buffers. The whole architecture for both compression and decompression can be implemented with a single VLSI chip and executed in real time. This approach provides two unique properties, error tolerance and real-time execution, with which most other image compression algorithms have problems.
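The BTC principle the chip builds on can be shown on a single block: keep a bitmap of which pixels lie above the block mean plus two reconstruction levels chosen to preserve the mean and variance. The sketch below works on one greyscale block only; the Y-I-Q colour handling, the error-tolerance packaging, and the VLSI pipeline are not modelled.

```python
# Classic block truncation coding (BTC) of one image block: a 4x4 block is
# reduced to a 16-bit bitmap plus two reconstruction levels that preserve
# the block mean and variance.
import numpy as np

def btc_block(block):
    mu, sigma = block.mean(), block.std()
    bitmap = block > mu
    q, m = int(bitmap.sum()), block.size
    if q in (0, m):                       # flat block: one level suffices
        return bitmap, mu, mu
    low = mu - sigma * np.sqrt(q / (m - q))
    high = mu + sigma * np.sqrt((m - q) / q)
    return bitmap, low, high

def btc_decode_block(bitmap, low, high):
    return np.where(bitmap, high, low)

rng = np.random.default_rng(3)
block = rng.integers(0, 256, size=(4, 4)).astype(float)
bitmap, low, high = btc_block(block)
rec = btc_decode_block(bitmap, low, high)
print("original mean/var:", block.mean(), block.var())
print("decoded  mean/var:", rec.mean(), rec.var())   # preserved by construction
```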

Proceedings ArticleDOI
14 Aug 1989
TL;DR: A simple, easy to implement real-time lossless image coding algorithm which takes into account the pixel-to-pixel correlation in an image and seems to react robustly to mismatch between the assumed and actual statistics.
Abstract: A simple, easy to implement real-time lossless image coding algorithm which takes into account the pixel-to-pixel correlation in an image is presented. The algorithm has built-in limits to the variance in the size of the codewords, and seems to react robustly to mismatch between the assumed and actual statistics. Because of the limited dynamic range in codeword size it is not expected to have significant buffer overflow and underflow problems. Test results using this algorithm are presented.
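The abstract does not name the code used to bound codeword-size variance, so the sketch below shows one standard way to get that behaviour: a Golomb-Rice code for small prediction residuals with a fixed-width escape for large ones, giving a hard upper limit on codeword length. The parameters K, ESC and RAW_BITS are arbitrary illustrative choices, not values from the paper.

```python
# Golomb-Rice coding of non-negative residuals with a fixed-width escape,
# so no codeword exceeds ESC + 1 + RAW_BITS bits.  Signed residuals would
# first be mapped to non-negative integers (e.g. zig-zag); omitted here.
K = 2            # Rice parameter: number of low bits sent verbatim
ESC = 8          # quotients >= ESC use the escape codeword
RAW_BITS = 16    # width of the escaped raw value

def encode_value(n, bits):
    q = n >> K
    if q < ESC:
        bits.extend([1] * q + [0])                      # unary quotient
        bits.extend((n >> i) & 1 for i in reversed(range(K)))
    else:
        bits.extend([1] * ESC + [0])                    # escape marker
        bits.extend((n >> i) & 1 for i in reversed(range(RAW_BITS)))

def decode_value(bits, pos):
    q = 0
    while bits[pos] == 1:
        q += 1
        pos += 1
    pos += 1                                            # skip the 0
    width = RAW_BITS if q == ESC else K
    n = 0
    for _ in range(width):
        n = (n << 1) | bits[pos]
        pos += 1
    return (n if q == ESC else (q << K) | n), pos

values = [0, 3, 7, 1, 42, 2, 1000, 5]
bits = []
for v in values:
    encode_value(v, bits)
decoded, pos = [], 0
while pos < len(bits):
    v, pos = decode_value(bits, pos)
    decoded.append(v)
assert decoded == values
```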

Proceedings ArticleDOI
08 May 1989
TL;DR: The effects of breaking data into fragments were tested using a simulation tool dubbed PAW (Performance Analysis Workstation), and the results show that processing the data in fragments is desirable.
Abstract: A compression server in a PACS environment has to deal with images of different types and sizes. The images will flow into compression at different rates, ranging from 1-2 Mbytes/sec to 9.6 Kbytes/sec. Additionally, the pattern of the flow can vary. Some images will enter compression in one block. Other images, especially large images (i.e., digitized film images), will enter compression as a series of blocks. Interleaving of blocks can also occur. For example, the first two blocks from a digitized X-ray film may enter compression, followed by a block from a CT image, followed by more blocks from the X-ray film. In order to process incoming images rapidly, the compression service must compress blocks as they arrive. The compression of a CT image should not have to wait for the final arrival of a slowly transmitted large digital X-ray image. On the other hand, temporarily buffering large images or adding extra compression hardware may make the compression service too expensive. These PACS network considerations argue for compression that operates on images locally. That is, the compression algorithm should not have to know the statistics of the entire image to be effective. Any transforms should operate on local blocks within the image independent of the results on antecedent or subsequent blocks. In addition, since the network may present relatively small blocks to compression (as small as 64 Kbytes), the compression technique should not add a large amount of overhead to the compressed data in the form of tables, descriptors, and so forth. In this paper, the effects of breaking data into fragments were tested using a simulation tool dubbed PAW (Performance Analysis Workstation). Two cases were considered. In the first case, large images were compressed in their entirety. In the second case, large images were broken into fragments and the fragments were compressed separately. The results show that processing the data in fragments is desirable.

Proceedings ArticleDOI
23 Jun 1989
TL;DR: The most important parameter, the compression ratio, was found to be similar to or better than that of other Ziv-Lempel compression techniques, whereas coding speed and memory requirements depended on the programming techniques used.
Abstract: The Lempel-Ziv-Welch (LZW) data compression algorithm is evaluated for use in the removal of the redundancy in computer files. The Ziv-Lempel algorithm and related algorithms are compared with respect to encoding and decoding speed, memory requirements, and compression ratio. Although the LZW algorithm is optimized for hardware implementation, the possibility of implementing it in software is considered. The most important parameter, the compression ratio, was found to be similar to or better than that of other Ziv-Lempel compression techniques, whereas coding speed and memory requirements depended on the programming techniques used.
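For reference, a minimal LZW encoder and decoder over bytes is sketched below. Codes are left as Python integers rather than packed into a bit stream, and the string table grows without bound, whereas practical implementations cap the code width (e.g. at 12 bits) and reset the table; the test string is the usual textbook example.

```python
# Minimal LZW over bytes, to make the dictionary mechanics concrete.

def lzw_encode(data: bytes):
    table = {bytes([i]): i for i in range(256)}
    w, out = b"", []
    for byte in data:
        wc = w + bytes([byte])
        if wc in table:
            w = wc
        else:
            out.append(table[w])
            table[wc] = len(table)      # new dictionary entry
            w = bytes([byte])
    if w:
        out.append(table[w])
    return out

def lzw_decode(codes):
    table = {i: bytes([i]) for i in range(256)}
    w = table[codes[0]]
    out = bytearray(w)
    for code in codes[1:]:
        if code in table:
            entry = table[code]
        else:                           # the "cScSc" special case
            entry = w + w[:1]
        out += entry
        table[len(table)] = w + entry[:1]
        w = entry
    return bytes(out)

data = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_encode(data)
assert lzw_decode(codes) == data
print(len(data), "bytes ->", len(codes), "codes")
```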

01 Feb 1989
TL;DR: In this article, an image segmentation based compression technique is applied to LANDSAT Thematic Mapper (TM) and Nimbus-7 Coastal Zone Color Scanner (CZCS) data.
Abstract: A case study is presented where an image segmentation based compression technique is applied to LANDSAT Thematic Mapper (TM) and Nimbus-7 Coastal Zone Color Scanner (CZCS) data. The compression technique, called Spatially Constrained Clustering (SCC), can be regarded as an adaptive vector quantization approach. The SCC can be applied to either single or multiple spectral bands of image data. The segmented image resulting from SCC is encoded in small rectangular blocks, with the codebook varying from block to block. The lossless compression potential (LCP) of sample TM and CZCS images is evaluated. For the TM test image, the LCP is 2.79. For the CZCS test image the LCP is 1.89, although when only a cloud-free section of the image is considered the LCP increases to 3.48. Examples of compressed images are shown at several compression ratios ranging from 4 to 15. In the case of TM data, the compressed data are classified using the Bayes' classifier. The results show an improvement in the similarity between the classification results and ground truth when compressed data are used, thus showing that compression is, in fact, a useful first step in the analysis.

Proceedings ArticleDOI
01 Nov 1989
TL;DR: It is observed that more than 50% compression of motion data is achieved, which corresponds to approximately 0.05 bits/pel for the motion information in the tested scheme, which is about 10-15% of total bits used in low bit rate video applications.
Abstract: It is well known that motion compensation is a powerful approach to video coding. The redundancy within adjacent video frames is exploited by motion compensated interframe prediction. The motion information and the prediction error are transmitted to the receiver to reconstruct the video frames. The motion information must be compressed in a lossless way. Several lossless data compression techniques are tested and compared here for the motion information. It is observed that more than 50% compression of motion data is achieved. This corresponds to approximately 0.05 bits/pel for the motion information in the tested scheme, which is about 10-15% of the total bits used in low bit rate video applications.

Proceedings ArticleDOI
08 May 1989
TL;DR: Three different source models for word-based data compression are proposed: move to front, frequency to front, and alpha-numeric to front; principles and methods for encoding their gathered data context are presented.
Abstract: Documents, papers, and reports contain large amounts of redundancy. This redundancy can be minimized by data-compression techniques to save storage space or to increase transmission efficiency. Several data-compression algorithms that are character based have been proposed in the literature. In English text files, however, the natural units of repetition are words or phrases, rather than characters. Three different source models for word-based data compression are proposed: move to front, frequency to front, and alpha-numeric to front. Their principles and methods for encoding their gathered data context are presented. Results of compression ratios obtained are included and compared. Comparisons with the performances of the Lempel-Ziv algorithm and fourth-order arithmetic encoding are also made. Some ideas for further improving the performance already obtained are proposed.
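The move-to-front variant of these word models can be sketched directly: each word is replaced by its position in a recency list, so repeated words map to small indices that an entropy coder can represent cheaply. The escape mechanism for previously unseen words below (emitting the word itself) is a simplification for illustration, not the paper's encoding.

```python
# Move-to-front word model: each word becomes its index in a recency list,
# which is then updated so recently used words get small indices.  New words
# are escaped by emitting (-1, word) -- a simplification for illustration.

def mtf_encode(words):
    recency, out = [], []
    for w in words:
        if w in recency:
            i = recency.index(w)
            out.append(i)
            recency.pop(i)
        else:
            out.append((-1, w))          # escape: previously unseen word
        recency.insert(0, w)             # move (or insert) to front
    return out

def mtf_decode(symbols):
    recency, words = [], []
    for s in symbols:
        if isinstance(s, tuple):
            w = s[1]
        else:
            w = recency.pop(s)
        words.append(w)
        recency.insert(0, w)
    return words

text = "to be or not to be that is the question to be".split()
encoded = mtf_encode(text)
assert mtf_decode(encoded) == text
print(encoded)
```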

Proceedings ArticleDOI
13 Sep 1989
TL;DR: The use of charge-coupled device (CCD) circuits in Z-plane architectures for focal-plane image processing is discussed, and the use of CCDs for buffering multiple image frames enables spatial-temporal image transformation for lossless compression.
Abstract: The use of charge-coupled device (CCD) circuits in Z-plane architectures for focal-plane image processing is discussed. The low-power, compact layout nature of CCDs makes them attractive for Z-plane application. Three application areas are addressed; non-uniformity compensation using CCD MDAC circuits, neighborhood image processing functions implemented with CCD circuits, and the use of CCDs for buffering multiple image frames. Such buffering enables spatial-temporal image transformation for lossless compression.

Patent
Robert Klein1, Debora Y. Grosse1, Karen A. Wilds, Robert D'Aoust, Stephen R. Krebs 
10 Oct 1989
TL;DR: In this paper, an application specific integrated circuit (ASIC) allows for the transposition, compression, and decompression of acquired image video data using Huffman encoding techniques in conjunction with run length encoding.
Abstract: An application specific integrated circuit (ASIC) allows for the transposition, compression, and decompression of acquired image video data. The transposition method associated with this ASIC provides for the reordering of received vertically scanned columns of acquired pixel image data into horizontal rows and provides for the deleting of overscan and underscan pixel data associated with each of the vertically scanned columns of acquired data. This transposed data is then compressed employing Huffman encoding techniques in conjunction with run length encoding. The compression/decompression apparatus and method of this invention utilizes a reordering technique whereby all of the 'high confidence' predictions are placed in one end of the buffer and all the 'low confidence' predictions are placed in another end of the same buffer. This reordering, together with the biased window configuration and associated run length count definition, allows for high efficiency lossless compression and decompression of associated image data.

Proceedings ArticleDOI
20 Sep 1989
TL;DR: The enhanced coder compresses approximately 2000 bytes of text every second before optimization, making it fast enough for regular use, and reduces the length of a typical compressed text file by about 25%.
Abstract: A description is given of several modifications to the Ziv-Lempel data compression scheme that improve its compression ratio at a moderate cost in run time (J. Ziv, A. Lempel, 1976, 1977, 1978). The best algorithm reduces the length of a typical compressed text file by about 25%. The enhanced coder compresses approximately 2000 bytes of text every second before optimization, making it fast enough for regular use.

Anselm Blumer1
01 Feb 1989
TL;DR: Experimental results indicate that compression is often far better than that obtained using independent-letter models, and sometimes also significantly better than other non-independent techniques.
Abstract: Adaptive data compression techniques can be viewed as consisting of a model specified by a database common to the encoder and decoder, an encoding rule and a rule for updating the model to ensure that the encoder and decoder always agree on the interpretation of the next transmission. The techniques which fit this framework range from run-length coding, to adaptive Huffman and arithmetic coding, to the string-matching techniques of Lempel and Ziv. The compression obtained by arithmetic coding is dependent on the generality of the source model. For many sources, an independent-letter model is clearly insufficient. Unfortunately, a straightforward implementation of a Markov model requires an amount of space exponential in the number of letters remembered. The Directed Acyclic Word Graph (DAWG) can be constructed in time and space proportional to the text encoded, and can be used to estimate the probabilities required for arithmetic coding based on an amount of memory which varies naturally depending on the encoded text. The tail of that portion of the text which was encoded is the longest suffix that has occurred previously. The frequencies of letters following these previous occurrences can be used to estimate the probability distribution of the next letter. Experimental results indicate that compression is often far better than that obtained using independent-letter models, and sometimes also significantly better than other non-independent techniques.
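The prediction rule described above can be illustrated without the DAWG: find the longest suffix of the text seen so far that occurred earlier, and estimate the next-letter distribution from the characters that followed those earlier occurrences. The brute-force search below is quadratic and serves only to make the rule concrete; the DAWG is what makes this practical.

```python
# Brute-force illustration of the prediction rule: locate the longest suffix
# of the already-seen text that occurred earlier, and use the letters that
# followed its earlier occurrences to estimate the next-letter distribution.
# The paper obtains this efficiently from a Directed Acyclic Word Graph.
from collections import Counter

def predict_next(history: str):
    for length in range(len(history), 0, -1):       # longest suffix first
        suffix = history[-length:]
        followers = Counter()
        start = 0
        while True:
            # restrict the search so each match has a following character
            # and the final suffix occurrence itself is excluded
            i = history.find(suffix, start, len(history) - 1)
            if i < 0:
                break
            followers[history[i + length]] += 1
            start = i + 1
        if followers:
            total = sum(followers.values())
            return {c: n / total for c, n in followers.items()}
    return {}                                       # no previous occurrence

history = "abracadabra abracad"
print(predict_next(history))                        # 'a' is most likely next
```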