Edit distance

About: Edit distance is a research topic. Over the lifetime of the topic, 2,887 publications have appeared, receiving 71,491 citations.

[Chart: papers published on a yearly basis, 1974 to 2023]

Papers

Journal Article

Recognition of Noisy Subsequences Using Constrained Edit Distances

B. John Oommen¹•Institutions (1)

Carleton University¹

01 May 1987-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: An algorithm is presented for computing the constrained edit distance subject to an arbitrary edit constraint on the number and type of edit operations; recognition based on this distance demonstrates remarkable accuracy.

Abstract: Let X* be any unknown word from a finite dictionary H. Let U be any arbitrary subsequence of X*. We consider the problem of estimating X* by processing Y, which is a noisy version of U. We do this by defining the constrained edit distance between X ∈ H and Y subject to any arbitrary edit constraint involving the number and type of edit operations to be performed. An algorithm to compute this constrained edit distance has been presented. Although in general the algorithm has a cubic time complexity, within the framework of our solution the algorithm possesses a quadratic time complexity. Recognition using the constrained edit distance as a criterion demonstrates remarkable accuracy. Experimental results which involve strings of lengths between 40 and 80 and which contain an average of 26.547 errors per string demonstrate that the scheme has about 99.5 percent accuracy.

55 citations
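
To make the idea of an edit constraint concrete, here is a minimal sketch of a dynamic program that computes a unit-cost edit distance while allowing at most a given number of substitutions. It is an illustration only, not the algorithm from the paper above: the function name `constrained_edit_distance`, the parameter `max_subs`, and the choice of constraining only substitutions are assumptions made for this example. The extra loop over the substitution budget is what pushes the running time from quadratic toward cubic.

```python
def constrained_edit_distance(x: str, y: str, max_subs: int) -> int:
    """Unit-cost edit distance from x to y using at most max_subs substitutions.

    Insertions and deletions are unlimited, so a solution always exists.
    Hypothetical helper for illustration; not taken from the cited paper.
    """
    if max_subs < 0:
        raise ValueError("max_subs must be non-negative")
    n, m = len(x), len(y)
    prev = None  # table for budget s - 1 (None while s == 0)
    for _ in range(max_subs + 1):
        # cur[i][j]: cheapest way to turn x[:i] into y[:j] within the current budget
        cur = [[0] * (m + 1) for _ in range(n + 1)]
        for i in range(n + 1):
            cur[i][0] = i  # delete all i characters
        for j in range(m + 1):
            cur[0][j] = j  # insert all j characters
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                best = min(cur[i - 1][j], cur[i][j - 1]) + 1  # delete or insert
                if x[i - 1] == y[j - 1]:
                    best = min(best, cur[i - 1][j - 1])       # free match
                elif prev is not None:
                    best = min(best, prev[i - 1][j - 1] + 1)  # spend one substitution
                cur[i][j] = best
        prev = cur
    return prev[n][m]


if __name__ == "__main__":
    for budget in range(3):
        print(budget, constrained_edit_distance("kitten", "sitting", budget))
```

With a budget of zero the result is the insert/delete-only distance; once the budget is large enough, the result coincides with the ordinary Levenshtein distance.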

Book Chapter

A Metric Index for Approximate String Matching

Edgar Chávez¹, Gonzalo Navarro²•Institutions (2)

Universidad Michoacana de San Nicolás de Hidalgo¹, University of Chile²

03 Apr 2002

TL;DR: A radically new indexing approach for approximate string matching where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space.

Abstract: We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us to find the R occurrences of a pattern of length m in a text of length n in average time O(m log^2 n + m^2 + R), using O(n log n) space and O(n log^2 n) index construction time. This complexity improves by far over all other previous methods. We also show a simpler scheme needing O(n) space.

55 citations
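
As a rough illustration of how the metric properties of edit distance can drive a proximity query, the sketch below builds a BK-tree over a small word list. This is a different and much simpler structure than the suffix-tree-based index described in the abstract above; it is shown only because it uses the same triangle-inequality pruning idea. The names `levenshtein` and `BKTree` and the sample words are assumptions made for this example.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic unit-cost edit distance, computed one row at a time."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # match or substitute
        prev = cur
    return prev[-1]


class BKTree:
    """Metric tree: each child edge is labelled with its distance to the parent."""

    def __init__(self, words):
        it = iter(words)
        self.root = [next(it), {}]  # node = [word, {distance: child node}]
        for w in it:
            self.add(w)

    def add(self, word):
        node = self.root
        while True:
            d = levenshtein(word, node[0])
            if d == 0:
                return  # already stored
            child = node[1].get(d)
            if child is None:
                node[1][d] = [word, {}]
                return
            node = child

    def search(self, query, radius):
        """Return sorted (distance, word) pairs with distance <= radius."""
        hits, stack = [], [self.root]
        while stack:
            word, children = stack.pop()
            d = levenshtein(query, word)
            if d <= radius:
                hits.append((d, word))
            # Triangle inequality: a match can only sit under an edge whose
            # label k satisfies d - radius <= k <= d + radius.
            for k, child in children.items():
                if d - radius <= k <= d + radius:
                    stack.append(child)
        return sorted(hits)


if __name__ == "__main__":
    tree = BKTree(["book", "books", "cake", "boo", "cape", "cart", "boon", "cook"])
    print(tree.search("bok", 1))
```

Because edit distance satisfies the triangle inequality, whole subtrees can be skipped without computing any distances inside them, which is the general property a metric index relies on.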

Proceedings Article

Evaluating Text Segmentation using Boundary Edit Distance

Chris Fournier¹•Institutions (1)

University of Ottawa¹

01 Aug 2013

TL;DR: This work proposes a new segmentation evaluation metric, named boundary similarity (B), an inter-coder agreement coefficient adaptation, and a confusion-matrix for segmentation that are all based upon an adaptation of the boundary edit distance in Fournier and Inkpen (2012).

Abstract: This work proposes a new segmentation evaluation metric, named boundary similarity (B), an inter-coder agreement coefficient adaptation, and a confusion-matrix for segmentation that are all based upon an adaptation of the boundary edit distance in Fournier and Inkpen (2012). Existing segmentation metrics such as Pk, WindowDiff, and Segmentation Similarity (S) are all able to award partial credit for near misses between boundaries, but are biased towards segmentations containing few or tightly clustered boundaries. Despite S’s improvements, its normalization also produces cosmetically high values that overestimate agreement & performance, leading this work to propose a solution.

55 citations

Proceedings Article

Approximating edit distance in near-linear time

Alexandr Andoni¹, Krzysztof Onak¹•Institutions (1)

Massachusetts Institute of Technology¹

31 May 2009

TL;DR: This is the first sub-polynomial approximation algorithm for edit distance that runs in near-linear time, improving on the state-of-the-art n^(1/3+o(1)) approximation.

Abstract: We show how to compute the edit distance between two strings of length n up to a factor of 2^(Õ(√(log n))) in n^(1+o(1)) time. This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time, improving on the state-of-the-art n^(1/3+o(1)) approximation. Previously, approximation of 2^(Õ(√(log n))) was known only for embedding edit distance into ℓ_1, and it is not known if that embedding can be computed in less than quadratic time.

55 citations

Posted Content

Approximating Edit Distance in Near-Linear Time

Alexandr Andoni, Krzysztof Onak

26 Sep 2011-arXiv: Data Structures and Algorithms

TL;DR: This paper presents the first sub-polynomial approximation algorithm for edit distance that runs in near-linear time: it approximates the edit distance between two strings of length n up to a factor of 2^(Õ(√(log n))) in n^(1+o(1)) time.

Abstract: We show how to compute the edit distance between two strings of length n up to a factor of 2^(Õ(√(log n))) in n^(1+o(1)) time. This is the first sub-polynomial approximation algorithm for this problem that runs in near-linear time, improving on the state-of-the-art n^(1/3+o(1)) approximation. Previously, approximation of 2^(Õ(√(log n))) was known only for embedding edit distance into ℓ_1, and it is not known if that embedding can be computed in less than quadratic time.

54 citations

Metrics

Papers: 3,030
Citations: 78,281

No. of papers in the topic in previous years
Year	Papers
2023	39
2022	96
2021	111
2020	149
2019	145
2018	139
