Topic

Edit distance

About: Edit distance is a research topic. Over its lifetime, 2,887 publications have been published on this topic, receiving 71,491 citations.


Papers
Journal Article
TL;DR: A dynamic programming algorithm is presented that solves the problem based on a distance measure originating from Tanaka and Tanaka. When the allowed distance between the common substructures is a constant independent of the input trees, the algorithm is as fast as the best-known algorithm for comparing two trees using Tanaka's distance measure.

58 citations

Proceedings Article
23 Sep 2007
TL;DR: This paper develops the formulas for selectivity estimation and provides the algorithm BasicEQ, and shows a comprehensive set of experiments using three benchmarks comparing Opt EQ with the state-of-the-art method SEPIA.
Abstract: There are many emerging database applications that require accurate selectivity estimation of approximate string matching queries. Edit distance is one of the most commonly used string similarity measures. In this paper, we study the problem of estimating selectivity of string matching with low edit distance. Our framework is based on extending q-grams with wildcards. Based on the concepts of replacement semi-lattice, string hierarchy and a combinatorial analysis, we develop the formulas for selectivity estimation and provide the algorithm BasicEQ. We next develop the algorithm Opt EQ by enhancing BasicEQ with two novel improvements. Finally we show a comprehensive set of experiments using three benchmarks comparing Opt EQ with the state-of-the-art method SEPIA. Our experimental results show that Opt EQ delivers more accurate selectivity estimations.
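As a point of reference for the quantity being estimated, the sketch below computes the unit-cost (Levenshtein) edit distance by dynamic programming and then counts, by brute force, how many strings in a collection fall within a threshold of a query. This is only a naive exact baseline, not BasicEQ, Opt EQ, or SEPIA; the function names `edit_distance` and `exact_selectivity` are hypothetical and chosen for illustration.

```python
from typing import Iterable


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance with unit-cost insert, delete and substitute."""
    if len(a) < len(b):
        a, b = b, a  # iterate rows over the longer string, keep rows short
    previous = list(range(len(b) + 1))  # distance from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def exact_selectivity(query: str, strings: Iterable[str], k: int) -> int:
    """Count strings within edit distance k of query (brute force, assumed baseline)."""
    return sum(1 for s in strings if edit_distance(query, s) <= k)
```

An estimator is useful precisely because this exact count requires one distance computation per stored string.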

58 citations

Book Chapter
11 Sep 2002
TL;DR: This paper investigates the performance of metric trees, namely the M-tree, when they are extended using a cheap approximate distance function as a filter to quickly discard irrelevant strings, and shows an improvement in performance up to 90% with respect to the basic case.
Abstract: Searching a large data set for the strings most similar, according to the edit distance, to a given one is a time-consuming process. In this paper we investigate the performance of metric trees, namely the M-tree, when they are extended using a cheap approximate distance function as a filter to quickly discard irrelevant strings. Using the bag distance as an approximation of the edit distance, we show an improvement in performance of up to 90% with respect to the basic case. This, along with the fact that our solution is independent of both the distance used in the pre-test and the underlying metric index, demonstrates that metric indices are a powerful solution, not only for many modern application areas such as multimedia, data mining and pattern recognition, but also for the string matching problem.
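The filtering idea can be sketched as follows, assuming the usual definition of the bag distance: compare the two strings as multisets of characters and take the larger of the two one-sided differences. Because this value never exceeds the edit distance, a string whose bag distance to the query is already above the search radius can be discarded without running the dynamic program. The sketch uses a plain linear scan rather than an M-tree, and the names `bag_distance` and `range_search` are hypothetical.

```python
from collections import Counter
from typing import List


def bag_distance(x: str, y: str) -> int:
    """Cheap lower bound on edit distance from character multisets."""
    bx, by = Counter(x), Counter(y)
    # Counter subtraction keeps only positive counts (multiset difference).
    return max(sum((bx - by).values()), sum((by - bx).values()))


def edit_distance(a: str, b: str) -> int:
    """Unit-cost Levenshtein distance (row-by-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + (ca != cb)))
        previous = current
    return previous[-1]


def range_search(query: str, strings: List[str], radius: int) -> List[str]:
    """Return strings within `radius` edits, pre-filtering with bag distance."""
    hits = []
    for s in strings:
        if bag_distance(query, s) > radius:
            continue  # lower bound already too large; skip the costly check
        if edit_distance(query, s) <= radius:
            hits.append(s)
    return hits
```

The filter costs time linear in the string lengths, whereas the dynamic program is quadratic, which is where the saving comes from in this simplified setting.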

57 citations

Book Chapter
Horst Bunke
TL;DR: Some new optimal algorithms for error-tolerant graph matching are discussed, and under specific conditions, the new algorithms may be significantly more efficient than traditional methods.
Abstract: This paper first reviews some theoretical results in error-tolerant graph matching that were obtained recently. The results include a new metric for error-tolerant graph matching based on maximum common subgraph, a relation between maximum common subgraph and graph edit distance, and the existence of classes of cost functions for error-tolerant graph matching. Then some new optimal algorithms for error-tolerant graph matching are discussed. Under specific conditions, the new algorithms may be significantly more efficient than traditional methods.
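For orientation, the first two results mentioned above are commonly stated in the maximum-common-subgraph literature in the form below. The abstract itself gives no formulas, so the notation is an assumption of this sketch: $|G|$ denotes the number of nodes of $G$, and $\mathrm{mcs}(G_1, G_2)$ a maximum common subgraph. The second identity is usually stated only for a particular class of edit cost functions, not in general.

```latex
% Metric based on maximum common subgraph (assumed notation)
d_{\mathrm{mcs}}(G_1, G_2) = 1 - \frac{|\mathrm{mcs}(G_1, G_2)|}{\max(|G_1|, |G_2|)}

% Graph edit distance under a specific cost function (assumed notation)
d_{\mathrm{ged}}(G_1, G_2) = |G_1| + |G_2| - 2\,|\mathrm{mcs}(G_1, G_2)|
```

Under these definitions, two isomorphic graphs have $\mathrm{mcs}$ equal to either graph, so both expressions evaluate to zero.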

57 citations

Book Chapter
02 Sep 2008
TL;DR: A graph-based approach to calculate the minimal edit distance between a given defective service and synthesized correct services to automatically fix found errors while keeping the rest of the service untouched is introduced.
Abstract: Much work has been conducted to analyze service choreographies to assert manifold correctness criteria. While errors can be detected automatically, the correction of defective services is usually done manually. This paper introduces a graph-based approach to calculate the minimal edit distance between a given defective service and synthesized correct services. This edit distance helps to automatically fix found errors while keeping the rest of the service untouched. A prototype implementation shows that the approach is applicable to real-life services.

56 citations


Network Information
Related Topics (5)
Graph (abstract data type): 69.9K papers, 1.2M citations, 86% related
Unsupervised learning: 22.7K papers, 1M citations, 81% related
Feature vector: 48.8K papers, 954.4K citations, 81% related
Cluster analysis: 146.5K papers, 2.9M citations, 81% related
Scalability: 50.9K papers, 931.6K citations, 80% related
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    39
2022    96
2021    111
2020    149
2019    145
2018    139