Edit distance

About: Edit distance is a research topic. Over its lifetime, 2,887 publications have been published within this topic, receiving 71,491 citations.

[Chart: papers published on a yearly basis, 1974 to 2023]

Papers

Book Chapter

Learning Metrics Between Tree Structured Data: Application to Image Recognition

Laurent Boyer, Amaury Habrard¹, Marc Sebban•Institutions (1)

University of Provence¹

17 Sep 2007

TL;DR: This article proposes an original experimental approach aiming at representing images by a tree-structured representation and then at using the learned metric in an image recognition task.

Abstract: The problem of learning metrics between structured data (strings, trees or graphs) has been the subject of various recent papers. With regard to the specific case of trees, some approaches focused on the learning of edit probabilities required to compute a so-called stochastic tree edit distance. However, to reduce the algorithmic and learning constraints, the deletion and insertion operations are achieved on entire subtrees rather than on single nodes. We aim in this article at filling the gap with the learning of a more general stochastic tree edit distance where node deletions and insertions are allowed. Our approach is based on an adaptation of the EM optimization algorithm to learn parameters of a tree model. We propose an original experimental approach aiming at representing images by a tree-structured representation and then at using our learned metric in an image recognition task. Comparisons with a non learned tree edit distance confirm the effectiveness of our approach.

19 citations

Journal Article

Large edit distance with multiple block operations

Dana Shapira¹, James A. Storer¹•Institutions (1)

Brandeis University¹

01 Jan 2003-Lecture Notes in Computer Science

TL;DR: This paper shows that edit distance with block deletions can be solved optimally, but edit distance with block moves and block deletions remains NP-complete and can be reduced to the problem of block moves only, keeping the same log n factor approximation.

Abstract: We consider the addition of some or all of the operations block move, block delete, block copy, block reversals, and block copy reversals, to the traditional edit distance problem (finding the minimum number of insert-character and delete-character operations to convert one string to another). When all of the above operations are allowed, the problem, called the nearest neighbors problem, is NP hard, and the best known approximation is O(log n log* n), which was achieved by Muthukrishnan and Sahinalp [2000,2002a]. In this paper we show that this problem can be approximated by a constant factor of 3.5 using a simple sliding window method. When eliminating reversals, the same method reduces the best known approximation of 12, achieved by Ergun, Muthukrishnan and Sahinalp [2003], down to a factor of 4. Both constant factors are proved to be tight. Allowing only subsets of these operations does not necessarily make the problem easier. Shapira and Storer [2002] present a log n factor approximation algorithm for edit distance with block moves (which is also an NP-complete problem). Here, we show that edit distance with block deletions can be solved optimally, but edit distance with block moves and block deletions remains NP-complete and can be reduced to the problem of block moves only, keeping the same log n factor approximation.

18 citations
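
The abstract above defines the traditional edit distance as the minimum number of insert-character and delete-character operations needed to convert one string to another. A minimal sketch of that baseline, assuming unit cost per operation, is shown below. The function name `indel_distance` is hypothetical, and the sketch covers none of the block operations the paper studies.

```python
def indel_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions and deletions turning a into b."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))  # empty prefix of a: insert j characters
    for i, ca in enumerate(a, start=1):
        cur = [i]  # b[:0] is empty: delete all i characters of a[:i]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                cur.append(prev[j - 1])  # characters match, no operation needed
            else:
                # Either delete ca from a, or insert cb into a.
                cur.append(1 + min(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    print(indel_distance("kitten", "sitting"))
```

Because substitutions are not allowed here, the result equals len(a) + len(b) minus twice the length of the longest common subsequence.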

Journal Article

Approximate all-pairs suffix/prefix overlaps

Niko Välimäki¹, Susana Ladra², Veli Mäkinen¹•Institutions (2)

Helsinki Institute for Information Technology¹, University of A Coruña²

01 Apr 2012-Information & Computation

TL;DR: This work proposes a new solution for approximate overlaps based on backward backtracking (Lam, et al., 2008) and suffix filters (Kärkkäinen and Na, 2008), and uses nH_k + o(n log σ) + r log r bits of space, where H_k is the k-th order entropy and σ the alphabet size.

Abstract: Finding approximate overlaps is the first phase of many sequence assembly methods. Given a set of strings of total length n and an error-rate ε, the goal is to find, for all-pairs of strings, their suffix/prefix matches (overlaps) that are within edit distance k = ⌈εℓ⌉, where ℓ is the length of the overlap. We propose a new solution for this problem based on backward backtracking (Lam, et al., 2008) and suffix filters (Kärkkäinen and Na, 2008). Our technique uses nH_k + o(n log σ) + r log r bits of space, where H_k is the k-th order entropy and σ the alphabet size. In practice, it is more scalable in terms of space, and comparable in terms of time, than q-gram filters (Rasmussen, et al., 2006). Our method is also easy to parallelize and scales up to millions of DNA reads.

18 citations
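
To make the problem statement above concrete, here is a naive quadratic-time sketch for a single pair of strings. It is not the backward backtracking or suffix filter method of the paper. It assumes that the overlap length ℓ is measured on the prefix side, and the names `approximate_overlaps` and `min_len` are hypothetical.

```python
import math


def approximate_overlaps(a: str, b: str, error_rate: float, min_len: int = 1):
    """Return (prefix_len, distance) pairs such that b[:prefix_len] is within
    ceil(error_rate * prefix_len) edits of some suffix of a."""
    n = len(a)
    # row[i] = minimum edit distance between the current prefix of b and
    # any substring of a that ends at position i (free start inside a).
    row = [0] * (n + 1)  # the empty prefix matches an empty substring anywhere
    hits = []
    for j, cb in enumerate(b, start=1):
        new = [j]  # prefix of length j against the empty substring at position 0
        for i in range(1, n + 1):
            new.append(min(
                row[i] + 1,                      # cb has no partner in a
                new[i - 1] + 1,                  # a[i-1] has no partner in b
                row[i - 1] + (a[i - 1] != cb),   # match or substitution
            ))
        row = new
        # row[n] is the distance to the best substring ending at the end of a,
        # that is, to the best suffix of a.
        if j >= min_len and row[n] <= math.ceil(error_rate * j):
            hits.append((j, row[n]))
    return hits


if __name__ == "__main__":
    print(approximate_overlaps("GGGACGT", "ACCTAAA", error_rate=0.25, min_len=3))
```

The `min_len` parameter matters in practice: with any positive error rate the ceiling allows one edit even for a one-character overlap, so very short prefixes would otherwise always qualify.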

An Adaptive String Comparator for Record Linkage

William E. Yancey

01 Jan 2004

TL;DR: A string comparator based on edit distance that uses variable edit-step costs derived from training data is developed and compared with one without variable edit-step costs and with the Jaro-Winkler string comparator used in the Census Bureau’s record linkage software.

Abstract: We develop a string comparator based on edit distance that uses variable edit-step costs derived from training data. Using first and last name data from Census files, we compare the performance of this string comparator with one without variable edit-step costs and with the Jaro-Winkler string comparator, which is standardly used in the Census Bureau’s record linkage software.

18 citations
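
The idea of variable edit-step costs can be sketched as a standard dynamic program in which each insertion, deletion, and substitution is priced by a caller-supplied function. The cost functions below are hypothetical placeholders chosen for illustration. They are not the costs the paper derives from training data, and `weighted_edit_distance` is an assumed name.

```python
from typing import Callable


def weighted_edit_distance(
    a: str,
    b: str,
    sub_cost: Callable[[str, str], float],
    ins_cost: Callable[[str], float],
    del_cost: Callable[[str], float],
) -> float:
    """Minimum total cost of edits turning a into b under per-step costs."""
    # prev[j] = cost of turning the processed prefix of a into b[:j].
    prev = [0.0]
    for cb in b:
        prev.append(prev[-1] + ins_cost(cb))
    for ca in a:
        cur = [prev[0] + del_cost(ca)]
        for j, cb in enumerate(b, start=1):
            cur.append(min(
                prev[j] + del_cost(ca),                               # delete ca
                cur[j - 1] + ins_cost(cb),                            # insert cb
                prev[j - 1] + (0.0 if ca == cb else sub_cost(ca, cb)),  # keep or substitute
            ))
        prev = cur
    return prev[-1]


VOWELS = set("aeiou")


def vowel_sub_cost(x: str, y: str) -> float:
    # Hypothetical rule: swapping one vowel for another is cheaper than other swaps.
    return 0.5 if x in VOWELS and y in VOWELS else 1.0


if __name__ == "__main__":
    unit = lambda c: 1.0
    print(weighted_edit_distance("jon", "jan", vowel_sub_cost, unit, unit))
```

With all three cost functions returning 1.0, the same routine reduces to the ordinary unit-cost edit distance, which gives a direct baseline for comparison.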

Journal Article

The edit-distance between a regular language and a context-free language

Yo-Sub Han¹, Sang-Ki Ko¹, Kai Salomaa²•Institutions (2)

Yonsei University¹, Queen's University²

01 Nov 2013-International Journal of Foundations of Computer Science

TL;DR: The edit-distance between two strings is the smallest number of operations required to transform one string into the other.

Abstract: The edit-distance between two strings is the smallest number of operations required to transform one string into the other. The distance between languages L1 and L2 is the smallest edit-distance be...

18 citations

Network Information

Performance

Metrics

Papers: 3,030
Citations: 78,281

No. of papers in the topic in previous years
Year	Papers
2023	39
2022	96
2021	111
2020	149
2019	145
2018	139
