scispace - formally typeset
Search or ask a question
Topic

Edit distance

About: Edit distance is a research topic. Over the lifetime, 2887 publications have been published within this topic receiving 71491 citations.


Papers
More filters
Book ChapterDOI
03 Jul 2002
TL;DR: This paper shows that the faster algorithm of Myers can be adapted to support all the required operations for approximate string matching, and involves extending it to compute edit distance, to search for any pattern suffix, and to detect in advance the impossibility of a later match.
Abstract: We present a new bit-parallel technique for approximate string matching. We build on two previous techniques. The first one [Myers, J. of the ACM, 1999], searches for a pattern of length m in a text of length n permitting k differences in O(mn/w) time, where w is the width of the computer word. The second one [Navarro and Raffinot, ACM JEA, 2000], extends a sublinear-time exact algorithm to approximate searching. The latter technique makes use of an O(kmn/w) time algorithm [Wu and Manber, Comm. ACM, 1992] for its internal workings. This algorithm is slow but flexible enough to support all the required operations. In this paper we show that the faster algorithm of Myers can be adapted to support all those operations. This involves extending it to compute edit distance, to search for any pattern suffix, and to detect in advance the impossibility of a later match. The result is an algorithm that performs better than the original version of Navarro and Raffinot and that is the fastest for several combinations of m, k and alphabet sizes that are useful, for example, in natural language searching and computational biology.

45 citations

Book ChapterDOI
31 May 2000
TL;DR: This work presents an approach to automatically create wrappers by means of an incremental grammar induction algorithm that uses an adaptation of the string edit distance to create such wrappers.
Abstract: To facilitate effective search on the World Wide Web, meta search engines have been developed which do not search the Web themselves, but use available search engines to find the required information. By means of wrappers, meta search engines retrieve information from the pages returned by search engines. We present an approach to automatically create such wrappers by means of an incremental grammar induction algorithm. The algorithm uses an adaptation of the string edit distance. Our method performs well; it is quick, can be used for several types of result pages and requires a minimal amount of user interaction.

45 citations

Journal ArticleDOI
TL;DR: A polynomial time greedy algorithm for non-recursive moves which on a subclass of instances of a problem of size n achieves an approximation factor to optimal of at most O(logn).

45 citations

Book ChapterDOI
TL;DR: This paper shows how the eigenstructure of the adjacency matrix can be used for the purposes of robust graph-matching, by finding the sequence of string edit operations which minimise edit distance.
Abstract: This paper shows how the eigenstructure of the adjacency matrix can be used for the purposes of robust graph-matching. We commence from the observation that the leading eigenvector of a transition probability matrix is the steady state of the associated Markov chain. When the transition matrix is the normalised adjacency matrix of a graph, then the leading eigenvector gives the sequence of nodes of the steady state random walk on the graph. We use this property to convert the nodes in a graph into a string where the node-order is given by the sequence of nodes visited in the random walk. We match graphs represented in this way, by finding the sequence of string edit operations which minimise edit distance.

45 citations

Book ChapterDOI
16 May 2017
TL;DR: An international workshop on Graph-Based Representations in Pattern Recognition and its applications in machine learning and natural language understanding.
Abstract: International Workshop on Graph-Based Representations in Pattern Recognition. GbRPR 2017: Graph-Based Representations in Pattern Recognition pp. 242-252.

45 citations


Network Information
Related Topics (5)
Graph (abstract data type)
69.9K papers, 1.2M citations
86% related
Unsupervised learning
22.7K papers, 1M citations
81% related
Feature vector
48.8K papers, 954.4K citations
81% related
Cluster analysis
146.5K papers, 2.9M citations
81% related
Scalability
50.9K papers, 931.6K citations
80% related
Performance
Metrics
No. of papers in the topic in previous years
YearPapers
202339
202296
2021111
2020149
2019145
2018139