scispace - formally typeset
Search or ask a question

Showing papers on "Compressed pattern matching published in 1992"


Journal ArticleDOI
TL;DR: T h e string-matching problem is a very c o m m o n problem; there are many extensions to t h i s problem; for example, it may be looking for a set of patterns, a pattern w i t h "wi ld cards," or a regular expression.
Abstract: T h e string-matching problem is a very c o m m o n problem. We are searching for a string P = PtP2. . "Pro i n s i d e a la rge t ex t f i le T = t l t2. . . t . , b o t h sequences of characters from a f i n i t e character set Z. T h e characters may be English characters in a text file, DNA base pairs, lines of source code, angles between edges in polygons, machines or machine parts in a production schedule, music notes and tempo in a musical score, and so fo r th . We w a n t to f i n d a l l occurrences of P i n T; n a m e l y , we are searching for the set of starting posit ions F = {i[1 --i--n m + 1 s u c h t h a t titi+ l " " t i + m 1 = P } " T h e two most famous algorithms for this problem are t h e B o y e r M o o r e algorithm [3] and t h e K n u t h Morris Pratt algorithm [10]. There are many extensions to t h i s problem; for example, we may be looking for a set of patterns, a pattern w i t h "wi ld cards," or a regular expression. String-matching tools are included in every reasonable text editor, word processor, and many other applications.

806 citations


Proceedings ArticleDOI
24 Mar 1992
TL;DR: The authors propose that compression be used as a time saving tool, in addition to its traditional role of space saving, by introducing a new pattern matching paradigm, compressed matching.
Abstract: Digitized images are known to be extremely space consuming. However, regularities in the images can often be exploited to reduce the necessary storage area. Thus, many systems store images in a compressed form. The authors propose that compression be used as a time saving tool, in addition to its traditional role of space saving. They introduce a new pattern matching paradigm, compressed matching. A text array T and pattern array P are given in compressed forms c(T) and c(P). They seek all appearances of P in T, without decompressing T. This achieves a search time that is sublinear in the size of the uncompressed text mod T mod . They show that for the two-dimensional run-length compression there is a O( mod c(T) mod log mod P mod + mod P mod ), or almost optimal algorithm. The algorithm uses a novel multidimensional pattern matching technique, two-dimensional periodicity analysis. >

182 citations


Proceedings Article
01 Sep 1992
TL;DR: This paper presents a new algorithmic technique for two-dimensional matching, that of periodicity analysis, and introduces a new pattern matching paradigm - Compressed Matching
Abstract: String matching is rich with a variety of algorithmic tools. In contrast, multidimensional matching has a rather sparse set of techniques. This paper presents a new algorithmic technique for two-dimensional matching, that of periodicity analysis.Periodicity in strings has been used to solve string matching problems. The success of these algorithms suggests that periodicity can be as important a tool in multidimensional matching. However, multidimensional periodicity is not as simple as it is in strings and was not formally studied or used in pattern matching.This paper's main contribution is defining and analysing two-dimensional periodicity in rectangular arrays. In addition, we introduce a new pattern matching paradigm - Compressed Matching. A text array T and a pattern array P are given in compressed forms c(T) and c(P). We seek all appearances of P in T, without decompressing T. By using periodicity analysis, we show that for the two-dimensional run-length compression there is a O(|c(T)|log|P|+|P|), or almost optimal algorithm that can achieve a search time that is sublinear in the size of the text |T|.

87 citations