Open Access, Journal Article

Common phrases and minimum-space text storage

Robert A. Wagner
01 Mar 1973, Vol. 16, Iss. 3, pp. 148-152
TLDR
A dynamic programming algorithm is presented that determines how each message should use a set of common phrases to minimize its storage requirement. The problem is nontrivial when phrases overlap, yet the algorithm runs in time that grows linearly with the number of characters in the text.
Abstract
A method for saving storage space for text strings, such as compiler diagnostic messages, is described. The method relies on hand selection of a set of text strings which are common to one or more messages. These phrases are then stored only once. The storage technique gives rise to a mathematical optimization problem: determine how each message should use the available phrases to minimize its storage requirement. This problem is nontrivial when phrases which overlap exist. However, a dynamic programming algorithm is presented which solves the problem in time which grows linearly with the number of characters in the text. Algorithm 444 applies to this paper.
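The optimization problem in the abstract can be illustrated with a short sketch. This is not Algorithm 444 or the paper's own formulation; it is a minimal dynamic program under an assumed cost model in which each literal character costs `char_cost` units and each reference to a stored phrase costs `ref_cost` units. The function name `min_space_encoding`, both cost parameters, and the sample phrases are hypothetical. For a fixed phrase set, the work per character position is bounded, so the running time grows linearly with the length of the message.

```python
def min_space_encoding(message, phrases, ref_cost=2, char_cost=1):
    """Return (cost, pieces) for a cheapest encoding of `message`.

    Assumed cost model (not taken from the paper): a literal character
    costs `char_cost`, a reference to a stored phrase costs `ref_cost`.
    """
    n = len(message)
    best = [0] * (n + 1)    # best[i] = minimum cost to encode message[i:]
    choice = [None] * n     # choice[i] = piece chosen at position i

    # Work right to left so best[i + k] is known when position i needs it.
    for i in range(n - 1, -1, -1):
        best[i] = char_cost + best[i + 1]
        choice[i] = ("lit", message[i])
        for p in phrases:
            if p and message.startswith(p, i):
                cost = ref_cost + best[i + len(p)]
                if cost < best[i]:
                    best[i] = cost
                    choice[i] = ("ref", p)

    # Follow the recorded choices from the left to recover the parse.
    pieces = []
    i = 0
    while i < n:
        kind, text = choice[i]
        pieces.append((kind, text))
        i += len(text)
    return best[0], pieces


# Two hypothetical phrases that overlap on "AL " inside the message.
message = "ILLEGAL EXPRESSION"
phrases = ["ILLEGAL ", "AL EXPRESSION"]
cost, pieces = min_space_encoding(message, phrases)
print(cost, pieces)
```

In this example a left-to-right greedy choice would take "ILLEGAL " first and then spell out "EXPRESSION" character by character. The dynamic program instead spells out "ILLEG" and references "AL EXPRESSION", which is cheaper under the assumed costs and shows why overlapping phrases make the problem nontrivial.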


Citations
Journal Article

Data compression via textual substitution

TL;DR: A general model for data compression which includes most data compression systems in the literature as special cases is presented, and trade-offs between different varieties of macro schemes, exact lower bounds on the amount of compression obtainable, and the complexity of encoding and decoding are discussed.
Journal Article

Data compression

TL;DR: A variety of data compression methods are surveyed, from the work of Shannon, Fano, and Huffman in the late 1940s to a technique developed in 1986, which has important application in the areas of file storage and distributed systems.
Patent

Data compression apparatus and method

TL;DR: An apparatus and method are described for converting an input data character stream into a variable-length encoded data stream in a data compression system. The method requires the input data characters to be stored in a history array.
Journal Article

Modeling for text compression

TL;DR: This paper surveys successful strategies for adaptive modeling that are suitable for use in practical text compression systems. The strategies fall into three main classes, the first being finite-context modeling, in which the last few characters are used to condition the probability distribution for the next one.
Journal Article

List Partitions

Tomas Feder
TL;DR: Tools are presented which allow us to classify the complexity of many list partition problems and, in particular, yield the complete classification for small matrices M. It is shown that the dichotomy (NP-complete versus polynomial time solvable), conjectured for certain graph homomorphism problems, would, if true, imply a slightly weaker dichotomy for these problems.