
Showing papers on "Approximate string matching" published in 1980


Journal Article
TL;DR: An algorithm is described for computing the edit distance between two strings of length n and m, n ≥ m, which requires O(n · max(1, m/log n)) steps whenever the costs of edit operations are integral multiples of a single positive real number and the alphabet for the strings is finite.

739 citations
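
For orientation, the edit distance named in the entry above can be computed with the classic quadratic dynamic program. The sketch below is that baseline, not the faster algorithm the entry describes; the function name `edit_distance` and the cost parameters `ins_cost`, `del_cost`, and `sub_cost` are illustrative assumptions, chosen as integers to mirror the "integral multiples" condition.

```python
def edit_distance(a: str, b: str, ins_cost: int = 1, del_cost: int = 1,
                  sub_cost: int = 1) -> int:
    """Baseline O(n*m) dynamic program for the cost of turning a into b.

    This is the textbook quadratic method, shown only as a reference
    point; it is NOT the sub-quadratic algorithm of the entry above.
    """
    # prev[j] = cost of turning the empty prefix of a into b[:j]
    prev = [j * ins_cost for j in range(len(b) + 1)]
    for i, ca in enumerate(a, start=1):
        # Turning a[:i] into the empty string takes i deletions.
        curr = [i * del_cost]
        for j, cb in enumerate(b, start=1):
            curr.append(min(
                prev[j] + del_cost,                           # delete ca
                curr[j - 1] + ins_cost,                       # insert cb
                prev[j - 1] + (0 if ca == cb else sub_cost),  # match or substitute
            ))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
    print(edit_distance("kitten", "sitting", sub_cost=2))
```

Only two rows of the table are kept at a time, so memory is proportional to the length of `b` while the running time stays proportional to the product of the two lengths.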


Journal Article
TL;DR: Approximate matching of strings is reviewed with the aim of surveying techniques suitable for finding an item in a database when there may be a spelling mistake or other error in the keyword.
Abstract: Approximate matching of strings is reviewed with the aim of surveying techniques suitable for finding an item in a database when there may be a spelling mistake or other error in the keyword. The methods found are classified as either equivalence or similarity problems. Equivalence problems are seen to be readily solved using canonical forms. For similarity problems, difference measures are surveyed, with a full description of the well-established dynamic programming method, relating this to the approach using probabilities and likelihoods. Searches for approximate matches in large sets using a difference function are seen to be an open problem still, though several promising ideas have been suggested. Approximate matching (error correction) during parsing is briefly reviewed.

672 citations
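
The abstract above notes that equivalence problems are readily solved using canonical forms. A minimal sketch of that idea follows, assuming a hypothetical canonical form (case folding plus whitespace collapsing) that is not taken from the paper; the names `canonical`, `equivalent`, and `build_index` are likewise illustrative.

```python
def canonical(key: str) -> str:
    """Map a key to an assumed canonical form: case-folded, single-spaced."""
    return " ".join(key.casefold().split())


def equivalent(a: str, b: str) -> bool:
    """Two keys are equivalent when their canonical forms are identical."""
    return canonical(a) == canonical(b)


def build_index(items):
    """Index items by canonical form so equivalent keys hit the same entry."""
    index = {}
    for item in items:
        index.setdefault(canonical(item), []).append(item)
    return index


if __name__ == "__main__":
    index = build_index(["Edit Distance", "String  Matching"])
    # Lookup uses an exact dictionary hit on the canonical form.
    print(index.get(canonical("  edit   DISTANCE ")))
```

Because equivalent keys collapse to one canonical string, lookup reduces to an exact match. A misspelling such as "edit distanse" is a similarity problem instead and would not be found this way.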


Journal Article
TL;DR: The combinatorial structure of periodic strings is studied and a new proof of the linearity of the Boyer-Moore algorithm in the worst case is derived, reducing the previously best known bound of $7n$ to $4n$, where n is the length of the text.
Abstract: The Boyer-Moore algorithm searches for all occurrences of a specified string, the pattern, in another string, the text. We study the combinatorial structure of periodic strings and use these results to derive a new proof of the linearity of the Boyer-Moore algorithm in the worst case. Our proof reduces the previously best known bound of $7n$ to $4n$, where n is the length of the text.

47 citations
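
To make the right-to-left scanning style of the Boyer-Moore family concrete, here is a sketch of the simplified Horspool variant, which uses only a bad-character shift table. It is a swapped-in simplification, not the full Boyer-Moore algorithm analysed in the entry above, and it does not share that algorithm's linear worst-case bound; the name `horspool_search` is an assumption.

```python
def horspool_search(text: str, pattern: str) -> list[int]:
    """Return all 0-based start positions of pattern in text.

    Horspool simplification: compare right to left, then shift by a
    table keyed on the text character under the pattern's last position.
    An empty pattern is treated as having no occurrences.
    """
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return []
    # Distance from the rightmost occurrence of each character
    # (excluding the final pattern position) to the pattern's end.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    hits = []
    pos = 0
    while pos <= n - m:
        j = m - 1
        while j >= 0 and text[pos + j] == pattern[j]:
            j -= 1
        if j < 0:
            hits.append(pos)
        # Characters absent from the table allow a full-length shift.
        pos += shift.get(text[pos + m - 1], m)
    return hits


if __name__ == "__main__":
    print(horspool_search("here is a simple example", "example"))
```

The shift table is what lets the search skip text characters entirely on favourable inputs. The periodicity analysis summarised above concerns how the full algorithm behaves on the unfavourable ones.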



Journal Article
TL;DR: An attempt to develop a string searching algorithm that begins the search for a match in the middle of the strings being compared, using information gained from mismatches and the location of the search area in the large string to make decisions and direct the search.

11 citations



Journal Article
TL;DR: The string search problem is defined as follows: Given two sequences of characters, a text of n characters T = […] and a key of m characters K = (k1, k2, […], find all indices i such that […] 'aa' is matched at 3 positions in 'aaabaa'.

3 citations
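
The problem statement in the entry above (report every index at which the key occurs in the text, overlaps included) can be sketched with a naive scan. The function name `find_all`, the 0-based indexing, and the reading of the example text as 'aaabaa' are assumptions made for illustration.

```python
def find_all(text: str, key: str) -> list[int]:
    """Return every 0-based index i where key occurs in text at position i.

    Overlapping occurrences are reported, so each candidate position is
    tested independently (O(n*m) comparisons in the worst case).
    """
    m = len(key)
    return [i for i in range(len(text) - m + 1) if text[i:i + m] == key]


if __name__ == "__main__":
    positions = find_all("aaabaa", "aa")
    print(positions, len(positions))
```

Under these assumptions the key 'aa' is found at three positions, two of which overlap.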