Showing papers on "Approximate string matching" published in 1988

Journal Article•DOI•

Fast string matching with k -differences

Gad M. Landau¹, Uzi Vishkin²•Institutions (2)

Tel Aviv University¹, New York University²

01 Aug 1988

TL;DR: This work presents an algorithm for finding all occurrences of the pattern in the text, each with at most k differences, given a text of length n, a pattern of length m, and an integer k.

Abstract: Consider the string matching problem where differences between characters of the pattern and characters of the text are allowed. Each difference is due to either a mismatch between a character of the text and a character of the pattern, a superfluous character in the text, or a superfluous character in the pattern. Given a text of length n, a pattern of length m, and an integer k, we present an algorithm for finding all occurrences of the pattern in the text, each with at most k differences. It runs in O(m + nk²) time for an alphabet whose size is fixed. For general input the algorithm requires O(m log m + nk²) time. In both cases the space requirement is O(m).

203 citations
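
The k-differences problem stated in the abstract above can be illustrated with the classical column-by-column dynamic program. This is only a baseline sketch that takes O(mn) time and O(m) space; it is not the O(m + nk²) algorithm of the paper. The function name `k_difference_end_positions` and its return convention (0-based end positions in the text) are assumptions made for this example.

```python
def k_difference_end_positions(pattern: str, text: str, k: int) -> list[int]:
    """Return each 0-based text index at which some substring ends that
    matches the pattern with at most k differences (baseline O(mn) method)."""
    m = len(pattern)
    # column[i] = fewest differences between pattern[:i] and any substring
    # of the text that ends at the current text position.
    column = list(range(m + 1))
    ends = []
    for j, text_char in enumerate(text):
        previous_diagonal = column[0]  # entry (i-1, j-1) of the table
        column[0] = 0                  # an occurrence may start anywhere
        for i in range(1, m + 1):
            old = column[i]            # entry (i, j-1) of the table
            column[i] = min(
                previous_diagonal + (pattern[i - 1] != text_char),  # match or mismatch
                old + 1,               # superfluous character in the text
                column[i - 1] + 1,     # superfluous character in the pattern
            )
            previous_diagonal = old
        if column[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(k_difference_end_positions("abc", "xabcx", 1))
```

Only one column of the table is kept at a time, which gives the same O(m) space bound as the paper while making no attempt to match its time bound.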

Journal Article•DOI•

Data structures and algorithms for approximate string matching

Zvi Galil¹, Raffaele Giancarlo¹•Institutions (1)

Columbia University¹

01 Mar 1988-Journal of Complexity

TL;DR: This paper surveys techniques for designing efficient sequential and parallel approximate string matching algorithms. Special attention is given to methods for constructing data structures that efficiently support the primitive operations needed in approximate string matching.

153 citations

Journal Article•DOI•

Parallel construction of a suffix tree with applications

Alberto Apostolico¹, Costas S. Iliopoulos¹, Gad M. Landau², Baruch Schieber², Uzi Vishkin³•Institutions (3)

Purdue University¹, Tel Aviv University², Courant Institute of Mathematical Sciences³

01 Nov 1988-Algorithmica

TL;DR: This paper presents a CRCW parallel RAM algorithm that constructs the suffix tree associated with a string of n symbols in O(log n) time with n processors; the algorithm requires Θ(n²) space.

Abstract: Many string manipulations can be performed efficiently on suffix trees. In this paper a CRCW parallel RAM algorithm is presented that constructs the suffix tree associated with a string of n symbols in O(log n) time with n processors. The algorithm requires Θ(n²) space. However, the space needed can be reduced to O(n^(1+ε)) for any 0 < ε ≤ 1, with a corresponding slow-down proportional to 1/ε. Efficient parallel procedures are also given for some string problems that can be solved with suffix trees.

152 citations
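
To make concrete the kind of string query that suffix structures support, here is a minimal sketch of a naive, uncompressed suffix trie with a substring test. It is sequential, uses nested dictionaries, and may need quadratic space in the worst case; it has nothing to do with the parallel construction in the paper. The names `build_suffix_trie` and `contains` are assumptions made for this example.

```python
def build_suffix_trie(text: str) -> dict:
    """Insert every suffix of text into a trie of nested dictionaries."""
    root: dict = {}
    for start in range(len(text)):
        node = root
        for ch in text[start:]:
            # Follow the edge labelled ch, creating it if it is missing.
            node = node.setdefault(ch, {})
    return root


def contains(trie: dict, pattern: str) -> bool:
    """Return True if pattern occurs somewhere in the indexed text."""
    node = trie
    for ch in pattern:
        if ch not in node:
            return False
        node = node[ch]
    return True


if __name__ == "__main__":
    trie = build_suffix_trie("banana")
    print(contains(trie, "nan"), contains(trie, "nab"))
```

Once the trie is built, each query takes time proportional to the pattern length, independent of the text length. A real suffix tree gets the same query behaviour in linear space by compressing unary paths.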

Journal Article•DOI•

Fast approximate string matching

Olumide Owolabi¹, D. R. McGregor¹•Institutions (1)

University of Strathclyde¹

01 Apr 1988-Software - Practice and Experience

TL;DR: A new similarity measure based on the Levenshtein metric is defined for this comparison and the resulting method is both computationally fast and storage‐efficient.

Abstract: Approximate string matching is an important operation in information systems because an input string is often an inexact match to the strings already stored. Commonly known accurate methods are computationally expensive as they compare the input string to every entry in the stored dictionary. This paper describes a two-stage process. The first uses a very compact n-gram table to preselect sets of roughly similar strings. The second stage compares these with the input string using an accurate method to give an accurately matched set of strings. A new similarity measure based on the Levenshtein metric is defined for this comparison. The resulting method is both computationally fast and storage-efficient.

70 citations
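
The two-stage process described in the abstract above (n-gram preselection followed by an accurate comparison) can be sketched as follows. Several things here are assumptions made for illustration: the first stage uses a plain in-memory inverted index of padded bigrams rather than the paper's compact n-gram table, the second stage uses the standard Levenshtein distance rather than the paper's new similarity measure, and the names `build_index` and `lookup` and the thresholds `min_shared` and `max_distance` are hypothetical.

```python
from collections import defaultdict


def ngrams(word: str, n: int = 2) -> set[str]:
    """Return the set of n-grams of word, padded with '#' at both ends."""
    padded = "#" * (n - 1) + word + "#" * (n - 1)
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def levenshtein(a: str, b: str) -> int:
    """Standard Levenshtein distance, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # deletion
                               current[j - 1] + 1,             # insertion
                               previous[j - 1] + (ca != cb)))  # substitution
        previous = current
    return previous[-1]


def build_index(dictionary: list[str], n: int = 2) -> dict[str, set[str]]:
    """Map each n-gram to the set of dictionary words that contain it."""
    index: dict[str, set[str]] = defaultdict(set)
    for word in dictionary:
        for gram in ngrams(word, n):
            index[gram].add(word)
    return index


def lookup(query: str, index: dict[str, set[str]], max_distance: int = 1,
           min_shared: int = 2, n: int = 2) -> list[str]:
    """Stage 1: preselect words sharing enough n-grams with the query.
    Stage 2: keep only those within max_distance edits of the query."""
    shared: dict[str, int] = defaultdict(int)
    for gram in ngrams(query, n):
        for word in index.get(gram, ()):
            shared[word] += 1
    candidates = [w for w, count in shared.items() if count >= min_shared]
    return sorted(w for w in candidates if levenshtein(query, w) <= max_distance)


if __name__ == "__main__":
    idx = build_index(["string", "strong", "spring", "matching"])
    print(lookup("strng", idx))
```

The expensive comparison runs only on the preselected candidates, which is the point of the two-stage design: the input string is no longer compared with every entry in the stored dictionary.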

Journal Article•

On improving the average case of the Boyer-Moore string matching algorithm

Zhu Rui Feng¹, Tadao Takaoka¹•Institutions (1)

Ibaraki University¹

01 Jul 1988-Journal of Information Processing

54 citations

Book Chapter•DOI•

String Matching with Constraints

Maxime Crochemore¹•Institutions (1)

University of Paris¹

29 Aug 1988

TL;DR: Two string-matching algorithms belonging to the second family are presented, which respectively obey time and space constraints.

Abstract: Pattern recognition is a constantly growing field of research. Identification of patterns in images, for instance, is a first step towards their interpretation. More generally, all formal systems handling strings of symbols involve parsing phases to recognize certain patterns. Regular expressions are one of the techniques used to specify simple patterns [26]. They lead to practicable algorithms available under most operating systems or editing tools, especially with Unix. String-matching is a particular case of pattern recognition. It consists in locating a word inside another word, called the text. Solutions to this problem can be divided into two families. In the first one the text is considered as fixed while the word is variable. This situation occurs when the text is a dictionary, for example. The basic solution of that sort is due to Weiner, who introduced the notion of position trees [29]. It is a kind of index which has been improved in different ways (see [21], [5], [10]). For the second family of solutions to string-matching, it is the word that is fixed. The two most famous and efficient string-matching algorithms of this family were designed by Knuth, Morris & Pratt [18] and Boyer & Moore [7]. They have been the subject of several studies, improvements or extensions (see [1], [11], [13-16], [22], [23], [25], [28]). A variation on the initial problem arises when approximate patterns are considered (see [20], [27]). String-matching is close to the detection of repetitions in strings (see [3], [10], [17], [25]). In fact, the study of regularities in strings is a part of the analysis of string-matching algorithms. In this paper, two string-matching algorithms belonging to the second family are presented. They respectively obey time and space constraints. Both algorithms start with a first phase during which the word alone is processed. Then, the search is done during a second phase which essentially supports the constraints.

32 citations
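
The two-phase structure mentioned in the abstract above (first process the word alone, then scan the text) is shared by the Knuth, Morris & Pratt algorithm that the abstract cites for the second family. The sketch below shows that classical algorithm only to illustrate the structure; it is not either of the two algorithms presented in the chapter, and the names `failure_table` and `find_all` are assumptions made for this example.

```python
def failure_table(word: str) -> list[int]:
    """Phase 1: for each prefix of word, the length of its longest proper
    prefix that is also a suffix. Only the word is examined here."""
    table = [0] * len(word)
    k = 0
    for i in range(1, len(word)):
        while k > 0 and word[i] != word[k]:
            k = table[k - 1]
        if word[i] == word[k]:
            k += 1
        table[i] = k
    return table


def find_all(word: str, text: str) -> list[int]:
    """Phase 2: scan the text once and report the 0-based start of every
    occurrence of word, including overlapping ones."""
    if not word:
        return []
    table = failure_table(word)
    positions = []
    k = 0  # number of word characters currently matched
    for i, ch in enumerate(text):
        while k > 0 and ch != word[k]:
            k = table[k - 1]  # fall back without re-reading the text
        if ch == word[k]:
            k += 1
        if k == len(word):
            positions.append(i - k + 1)
            k = table[k - 1]
    return positions


if __name__ == "__main__":
    print(find_all("aba", "ababa"))
```

The search phase never moves backwards in the text, so it runs in time linear in the text length. The price is the table built in the first phase, which takes extra space proportional to the word length.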

A fast parallel algorithm to determine edit distance

Thomas R. Mathies¹•Institutions (1)

Carnegie Mellon University¹

01 Jan 1988

TL;DR: An algorithm is presented that runs in O(log m log n) time and uses mn processors on a CRCW PRAM, where m and n are the lengths of the strings. The problem of finding the largest common submatrix of two matrices is also considered and shown to be NP-hard.

Abstract: We consider the problem of determining in parallel the cost of converting a source string to a destination string by a sequence of insert, delete and transform operations. Each operation has an integer cost in some fixed range. We present an algorithm that runs in O(log m log n) time and uses mn processors on a CRCW PRAM, where m and n are the lengths of the strings. The best known sequential algorithm [MP83] runs in time O(n²/log n) for strings of length n, indicating that our parallel algorithm (with time-processor product equal to O(mn log m log n)) is nearly optimal. An instance of the edit distance problem is represented as a graph. The algorithm finds the shortest path in the graph using a path doubling method with efficient pruning due to the structure of the problem. Extensions of the algorithm solve approximate string matching and local best fit problems. The problem of finding the largest common submatrix of two matrices is considered and shown to be NP-hard. Finally we present an algorithm for exact two-dimensional pattern matching that runs in O(log n) time using n processors for an n × n search matrix.

20 citations
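
The quantity computed in the abstract above, the cost of converting a source string to a destination string by insert, delete and transform operations, can be illustrated with the textbook sequential dynamic program. This sketch is not the parallel path-doubling algorithm of the paper, and it assumes one fixed integer cost per operation type; the function name `edit_cost` and its parameters are hypothetical.

```python
def edit_cost(source: str, target: str, insert_cost: int = 1,
              delete_cost: int = 1, transform_cost: int = 1) -> int:
    """Minimum total cost of turning source into target using insert,
    delete and transform operations (sequential O(mn) baseline)."""
    # previous[j] = cost of converting the processed prefix of source
    # into target[:j]; initially the prefix is empty, so insert j characters.
    previous = [j * insert_cost for j in range(len(target) + 1)]
    for i, s in enumerate(source, start=1):
        current = [i * delete_cost]  # delete all i source characters
        for j, t in enumerate(target, start=1):
            current.append(min(
                previous[j] + delete_cost,      # delete s
                current[j - 1] + insert_cost,   # insert t
                previous[j - 1] + (0 if s == t else transform_cost),  # keep or transform
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_cost("kitten", "sitting"))
```

Each table entry is the length of a shortest path in a grid graph whose edges are the three operations, which is the graph view the abstract refers to. The sequential program simply relaxes those edges row by row.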

Book Chapter•DOI•

Constant-Space String-Matching

Maxime Crochemore¹•Institutions (1)

University of Paris¹

21 Dec 1988

TL;DR: A string-matching algorithm with the following properties: it is linear in time with a small multiplicative constant during all its phases; it preprocesses the string and scans the searched text with constant memory space in addition to the strings.

Abstract: We present a string-matching algorithm with the following properties: it is linear in time with a small multiplicative constant during all its phases; it preprocesses the string and scans the searched text with constant memory space in addition to the strings.

6 citations

Journal Article•DOI•

On string pattern matching: a quantitative analysis and a proposal

Ken-Chih Liu¹•Institutions (1)

Iowa State University¹

03 Jan 1988-Computer Languages

TL;DR: A quantitative analysis of the widely recognized inefficiency of the SNOBOL4 pattern matching algorithm is presented. The possibility of increasing the efficiency of pattern matching by special-case processing is discussed, and a new approach to the design of string processing languages along this line is proposed.

1 citation

Book Chapter•DOI•

Approximate String Matching: Investigations with a Hardware String Comparator

Olumide Owolabi¹, John D. Ferguson¹•Institutions (1)

University of Strathclyde¹

28 Mar 1988

TL;DR: An algorithm is developed for determining relative string similarity and an architecture for comparing strings using this algorithm is also developed.

Abstract: Approximate string matching attempts to determine how similar two strings are. An algorithm is developed for determining relative string similarity. An architecture for comparing strings using this algorithm is also developed.

Journal Article•DOI•

A hardware string comparator

Olumide Owolabi¹, John D. Ferguson¹•Institutions (1)

University of Strathclyde¹

01 Jan 1988-Journal of Microcomputer Applications

TL;DR: The paper compares the performance of the dynamic programming algorithm and the Proximity processor, and highlights the speed and recall advantages of the latter.

Journal Article•

Approximate String Matching: Investigations with a Hardware String Comparator.

Olumide Owolabi¹, John D. Ferguson¹•Institutions (1)

University of Strathclyde¹

01 Jan 1988-Pattern Recognition

TL;DR: An algorithm is developed for determining how similar two strings are, and an architecture for comparing strings using this algorithm is also developed.
