Approximate string matching system and process for lossless data compression

Patent

TLDR

This patent proposes an approximate string matching scheme for lossless data compression. Source data is encoded as pointers and blocks of residual data, where each residual block holds the value-by-value difference between an earlier occurring block of source data, whose location and length are identified by a pointer, and the equal-sized block of source data associated with that pointer. The encoded data is then compressed with an entropy-based compression technique.

Abstract:

A system and process for lossless data compression employing a unique approximate string matching scheme. The encoder of the system characterizes source data as a set of pointers and associated blocks of residual data. Each pointer identifies a location earlier in the source data, as well as the number of source data values associated with the identified location. The residual data represents the difference between each value of an earlier occurring block of source data, whose location and length are identified by a pointer, and the corresponding value of an equal-sized block of source data associated with the pointer. The choice of a block of earlier occurring source data for use in forming a residual data block is based on a cost analysis designed to minimize, to a desired degree, the entropy of the differences between the previous block and the new block of source data. The encoded data, which exhibits a significantly lower entropy than the source data, can be compressed effectively using an entropy-based compression technique.

The decoder portion of the system operates by first decompressing the encoded data. Next, the first data value is decoded by adding the first residual to a predetermined constant. Once the first data value has been decoded, subsequent data values are decoded by finding the block in the previously decoded data indicated by a pointer, and then adding each data value in that block to its corresponding data element in the residual data block associated with the pointer. The process is repeated until all the data is decoded.
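The pointer-plus-residual idea described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation; several details are assumptions made only for illustration: 8-bit source values with modulo-256 arithmetic (`MODULUS`), a predetermined constant of 0 (`FIRST_VALUE_CONSTANT`), a hypothetical `max_len` cap on block length, an exhaustive search over non-overlapping earlier blocks, and a stand-in cost (`block_cost`) that ranks candidates by the empirical entropy of the residual block, with total residual magnitude as a tie-breaker. The final entropy-based compression stage is omitted.

```python
import math
from collections import Counter

FIRST_VALUE_CONSTANT = 0  # assumed; the abstract only says "predetermined constant"
MODULUS = 256             # assumed 8-bit source values


def block_cost(residuals):
    """Stand-in cost: (empirical entropy in bits/symbol, total magnitude)."""
    n = len(residuals)
    entropy = -sum(c / n * math.log2(c / n) for c in Counter(residuals).values())
    magnitude = sum(min(r, MODULUS - r) for r in residuals)
    return (entropy, magnitude)


def encode(data, max_len=4):
    """Return (first_residual, records); each record is (start, length, residuals)."""
    if not data:
        return None, []
    first_residual = (data[0] - FIRST_VALUE_CONSTANT) % MODULUS
    records = []
    pos = 1
    while pos < len(data):
        # The earlier block must lie entirely inside already-encoded data.
        length = min(max_len, pos, len(data) - pos)
        target = data[pos:pos + length]
        best = None
        for start in range(pos - length + 1):
            earlier = data[start:start + length]
            residuals = [(t - e) % MODULUS for t, e in zip(target, earlier)]
            cost = block_cost(residuals)
            if best is None or cost < best[0]:
                best = (cost, start, residuals)
        records.append((best[1], length, best[2]))
        pos += length
    return first_residual, records


def decode(first_residual, records):
    """Rebuild the source by adding each residual to the pointed-to earlier value."""
    if first_residual is None:
        return []
    out = [(FIRST_VALUE_CONSTANT + first_residual) % MODULUS]
    for start, length, residuals in records:
        block = out[start:start + length]
        out.extend((b + r) % MODULUS for b, r in zip(block, residuals))
    return out


source = list(b"abcdabcdabcdabcd")
first, records = encode(source)
assert decode(first, records) == source  # lossless round trip
```

On repetitive input such as the one above, later residual blocks consist mostly of zeros, which is the kind of low-entropy stream an entropy coder handles well. A practical encoder would presumably bound the search window and vary block lengths; those choices are outside what the abstract specifies.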
