Proceedings Article

Efficient approximate and dynamic matching of patterns using a labeling paradigm

TLDR
The authors show that this general method based on assigning labels to some of the substrings of a given string is also useful for several central problems in the area of string processing: approximate string matching, dynamic dictionary matching, and dynamic text indexing.
Abstract
A key approach in string processing algorithmics has been the labeling paradigm, which is based on assigning labels to some of the substrings of a given string. If these labels are chosen consistently, they can enable fast comparisons of substrings. Until the first optimal parallel algorithm for suffix tree construction was given by the authors in 1994, the labeling paradigm was considered not to be competitive with other approaches. They show that this general method is also useful for several central problems in the area of string processing: approximate string matching, dynamic dictionary matching, and dynamic text indexing. The approximate string matching problem deals with finding all substrings of a text which match a pattern "approximately", i.e., with at most m differences. The differences can be in the form of inserted, deleted, or replaced characters. The text indexing problem deals with finding all occurrences of a pattern in a text, after the text is preprocessed. In the dynamic text indexing problem, updates to the text in the form of insertions and deletions of substrings are permitted. The dictionary matching problem deals with finding all occurrences of each pattern in a set of patterns in a text, after the pattern set is preprocessed. In the dynamic dictionary matching problem, insertions of patterns into the pattern set and deletions of patterns from it are permitted.
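To make the idea of consistent labels concrete, the following is a minimal Python sketch of one well-known labeling scheme (doubling over power-of-two lengths). It is an illustration only and is not claimed to be the authors' construction; the function names `build_labels` and `equal_substrings` are hypothetical. Substrings of length 2^k receive integer labels such that two substrings of the same length share a label exactly when they are equal, so any two equal-length substrings can then be compared with two label lookups.

```python
def build_labels(s):
    """Return levels, where levels[k][i] is the label of s[i:i + 2**k].

    Two substrings of the same power-of-two length get the same label
    exactly when they are equal (the labels are "consistent").
    """
    n = len(s)
    # Level 0: label each character by its rank among the distinct characters.
    rank = {c: r for r, c in enumerate(sorted(set(s)))}
    levels = [[rank[c] for c in s]]
    k = 0
    while 2 ** (k + 1) <= n:
        half = 2 ** k
        prev = levels[k]
        # A block of length 2**(k+1) is identified by the labels of its halves.
        pairs = [(prev[i], prev[i + half]) for i in range(n - 2 * half + 1)]
        rank = {p: r for r, p in enumerate(sorted(set(pairs)))}
        levels.append([rank[p] for p in pairs])
        k += 1
    return levels


def equal_substrings(levels, i, j, length):
    """Check s[i:i+length] == s[j:j+length] using two label lookups.

    Assumes both substrings lie inside the labeled string.
    """
    if length == 0:
        return True
    k = length.bit_length() - 1   # largest k with 2**k <= length
    shift = length - 2 ** k       # start of the second, overlapping block
    lab = levels[k]
    return lab[i] == lab[j] and lab[i + shift] == lab[j + shift]


levels = build_labels("abracadabra")
print(equal_substrings(levels, 0, 7, 4))  # compares "abra" with "abra"
print(equal_substrings(levels, 0, 3, 3))  # compares "abr" with "aca"
```

This sketch labels every position at every level and sorts at each level, so it favors clarity over efficiency; the abstract speaks of labeling only some of the substrings.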


Citations
Journal Article

Data streams: algorithms and applications

TL;DR: Data Streams: Algorithms and Applications surveys the emerging area of algorithms for processing data streams and associated applications, which rely on metric embeddings, pseudo-random computations, sparse approximation theory and communication complexity.
Book

Data Streams: Algorithms and Applications

TL;DR: The authors survey the basic mathematical foundations of data streaming systems, including basic mathematical ideas and basic algorithms for data stream processing.
Proceedings Article

Approximate String Joins in a Database (Almost) for Free

TL;DR: In this article, the authors propose a technique for building approximate string join capabilities on top of commercial databases by exploiting facilities already available in them. The technique relies on matching short substrings of length q, called q-grams, and taking into account both the positions of individual matches and the total number of such matches.
Book

Flexible Pattern Matching in Strings: Practical On-Line Search Algorithms for Texts and Biological Sequences

TL;DR: This book presents a practical approach to string matching problems, focusing on the algorithms and implementations that perform best in practice, and includes all of the most significant new developments in complex pattern searching.
Proceedings Article

Dictionary matching and indexing with errors and don't cares

TL;DR: This paper considers various flavors of the following online problem: preprocess a text or collection of strings, so that given a query string p, all matches of p with the text can be reported quickly.
References
Journal Article

Efficient string matching: an aid to bibliographic search

TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
Journal Article

Fast Pattern Matching in Strings

TL;DR: An algorithm is presented which finds all occurrences of one given string within another, in running time proportional to the sum of the lengths of the strings, showing that the set of concatenations of even palindromes, i.e., the language $\{\alpha \alpha ^R\}^*$, can be recognized in linear time.
Journal Article

A fast string searching algorithm

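TL;DR: An algorithm is presented that searches for the location, i, of the first occurrence of a pattern in another string; it has the unusual property that, in most cases, not all of the first i characters of that string are inspected.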
Proceedings Article

Linear pattern matching algorithms

Peter Weiner
TL;DR: A linear time algorithm for obtaining a compacted version of a bi-tree associated with a given string is presented, and the paper indicates how to solve several pattern matching problems, including some from [4], in linear time.
Journal Article

Suffix arrays: a new method for on-line string searches

TL;DR: A new and conceptually simple data structure, called a suffix array, for on-line string searches is introduced in this paper, and it is believed that suffix arrays will prove to be better in practice than suffix trees for many applications.