Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1903 publications have been published within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.

Papers published on a yearly basis (chart covering 1973 to 2023)

Papers

Journal Article•DOI•

String Noninclusion Optimization Problems

Anatoly R. Rubinov, Vadim G. Timkovsky

01 Aug 1998-SIAM Journal on Discrete Mathematics

TL;DR: This paper considers a class of opposite problems connected with string noninclusion relations: find a shortest string included in no string of a given finite language, and find a longest string including no string of a given finite language.

Abstract: For every string inclusion relation there are two optimization problems: find a longest string included in every string of a given finite language, and find a shortest string including every string of a given finite language. As an example, the two well-known pairs of problems, the longest common substring (or subsequence) problem and the shortest common superstring (or supersequence) problem, are interpretations of these two problems. In this paper we consider a class of opposite problems connected with string noninclusion relations: find a shortest string included in no string of a given finite language and find a longest string including no string of a given finite language. The predicate "string $\alpha$ is not included in string $\beta$" is interpreted as either "$\alpha$ is not a substring of $\beta$" or "$\alpha$ is not a subsequence of $\beta$". The main purpose is to determine the complexity status of the string noninclusion optimization problems. Using graph approaches we present polynomial-time algorithms for the first interpretation and NP-hardness proofs for the second. We also discuss restricted versions of the problems, correlations between the string inclusion and noninclusion problems, and generalized problems which are the string inclusion problems for one language and the string noninclusion problems for another. In applications the string inclusion problems are used to find a similarity between any structures which can be represented by strings. Respectively, the noninclusion problems can be used to find a nonsimilarity. Such problems occur in computational molecular biology, data compression, pattern recognition, and flexible manufacturing. The above generalized problems arise naturally in all of these applied areas. Apart from this practical reason, we hope that studying the string noninclusion problems will yield deeper understanding of the string inclusion problems.
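To make the substring interpretation concrete, here is a minimal brute-force sketch in Python of "find a shortest string included in no string of a given finite language". It is an illustration only and is not the graph-based polynomial-time algorithm the paper refers to; the function name `shortest_non_substring` and the explicit `alphabet` parameter are assumptions made for this example.

```python
from itertools import product


def shortest_non_substring(language, alphabet):
    """Return a shortest string over `alphabet` that is a substring of no string in `language`.

    Brute-force illustration (hypothetical helper, not the paper's method):
    candidates are tried in order of increasing length.
    """
    if not alphabet:
        raise ValueError("alphabet must be non-empty")

    # Collect every substring (including the empty one) of every string in the language.
    seen = set()
    for s in language:
        for i in range(len(s) + 1):
            for j in range(i, len(s) + 1):
                seen.add(s[i:j])

    # The language is finite, so some candidate longer than its longest string must succeed.
    length = 0
    while True:
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            if candidate not in seen:
                return candidate
        length += 1
```

For the language {"ab", "ba"} over the alphabet {a, b}, every string of length at most one occurs as a substring, so the search returns a string of length two.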

12 citations

Proceedings Article•

Forward-Fast-Search: Another Fast Variant of the Boyer-Moore String Matching Algorithm.

Domenico Cantone, Simone Faro

01 Jan 2003

12 citations

Book Chapter•DOI•

Hardness of String Similarity Search and Other Indexing Problems

S. Cenk Sahinalp¹, Andrey Utis²•Institutions (2)

Simon Fraser University¹, University of Maryland, College Park²

12 Jul 2004

TL;DR: This paper considers similarity search queries on a point Q, posed either as nearest-neighbor queries (find the string that has the smallest edit distance to a query string) or as furthest-neighbor queries (find the string that has the longest common subsequence with a query string).

Abstract: Similarity search is a fundamental problem in computer science. Given a set of points $A=\{A_1,\ldots,A_p\}$ from a universe $U$ and a distance measure $D$, it is possible to pose similarity search queries on a point $Q$ in the form of nearest neighbors (find the string that has the smallest edit distance to a query string) or in the form of furthest neighbors (find the string that has the longest common subsequence with a query string).
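The nearest-neighbor query described above can be sketched as a linear scan that computes the edit distance from the query to every string in the set. This is only a baseline illustration under assumed names (`edit_distance`, `nearest_neighbor`); it uses no index, and it is not a construction from the paper.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum number of insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # turning a[:i] into the empty string takes i deletions
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def nearest_neighbor(query, strings):
    """Return the string with the smallest edit distance to `query` (first one on ties)."""
    return min(strings, key=lambda s: edit_distance(query, s))
```

Each query costs one full distance computation per stored string, which is the cost an index for this problem would try to avoid.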

12 citations

Journal Article•DOI•

Bit-parallel approximate string matching algorithms with transposition

Heikki Hyyrö¹•Institutions (1)

University of Tampere¹

01 Jun 2005-Journal of Discrete Algorithms

TL;DR: Discusses a uniform way of modifying bit-parallel approximate string matching algorithms to also permit a fourth type of edit operation, transposing two adjacent characters in the pattern; the resulting distance is also known as the Damerau edit distance.
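As an illustration of the fourth edit operation, the following Python sketch computes an edit distance that allows adjacent transpositions using ordinary dynamic programming. It is not the bit-parallel technique the paper discusses, and it implements the restricted variant (often called optimal string alignment), in which a transposed pair is not edited again; the function name `osa_distance` is an assumption made for this example.

```python
def osa_distance(a, b):
    """Edit distance with insertion, deletion, substitution and adjacent transposition.

    Restricted (optimal string alignment) variant, computed with a full DP table.
    """
    rows, cols = len(a) + 1, len(b) + 1
    d = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        d[i][0] = i  # delete all of a[:i]
    for j in range(cols):
        d[0][j] = j  # insert all of b[:j]

    for i in range(1, rows):
        for j in range(1, cols):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,         # deletion
                          d[i][j - 1] + 1,         # insertion
                          d[i - 1][j - 1] + cost)  # match or substitution
            # Transposition: the last two characters of each prefix are swapped.
            if i > 1 and j > 1 and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]:
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[-1][-1]
```

With transpositions allowed, "ab" and "ba" are at distance 1 rather than the distance 2 they would have under plain edit distance.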

12 citations

Journal Article•DOI•

Quick-Skip Search Hybrid Algorithm for the Exact String Matching Problem

Mustafa Abdul Sahib Naser, Nur'Aini Abdul Rashid, Mohammed Faiz Aboalmaaly

01 Jan 2012-International Journal of Computer Theory and Engineering

TL;DR: This research proposes a hybrid exact string matching algorithm by combining the good properties of the Quick Search and the Skip Search algorithms to demonstrate and devise a better method to solve the string matching problem with higher speed and lower cost.

Abstract: The string matching problem is a cornerstone of many computer science fields because of the fundamental role it plays in various computer applications. Thus, several string matching algorithms have been produced and applied in most operating systems, information retrieval, editors, internet search engines, firewall interception and searching nucleotide or amino acid sequence patterns in genome and protein sequence databases. Several important factors are considered during the matching process, such as the number of character comparisons, the number of attempts and the time consumed. This research proposes a hybrid exact string matching algorithm by combining the good properties of the Quick Search and the Skip Search algorithms to demonstrate and devise a better method to solve the string matching problem with higher speed and lower cost. The hybrid algorithm was tested using different types of standard data. The hybrid algorithm provides efficient results and reliability compared with the original algorithms in terms of the number of character comparisons and the number of attempts when applied with different pattern lengths. Additionally, the hybrid algorithm performed better, with lower time complexity in the worst and best cases than other hybrid algorithms.
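For orientation, here is a minimal Python sketch of the classical Quick Search algorithm, one of the two components the abstract names. It shows only the shift rule based on the text character immediately after the current window; it is not the hybrid algorithm proposed in the paper, whose details are not given here. The function name `quick_search` is an assumption made for this example.

```python
def quick_search(pattern, text):
    """Return the start positions of all exact occurrences of `pattern` in `text`.

    Classical Quick Search: after each attempt, the window is shifted according
    to the text character just past the window.
    """
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []

    # Shift for a character c is m minus the index of its rightmost occurrence
    # in the pattern; characters not in the pattern shift by m + 1.
    shift = {c: m - i for i, c in enumerate(pattern)}

    matches = []
    pos = 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:
            break  # no character follows the window
        pos += shift.get(text[pos + m], m + 1)
    return matches
```

For example, `quick_search("aba", "ababa")` reports the overlapping occurrences at positions 0 and 2.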

12 citations

Performance

Metrics

Papers: 1,944

Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	33
2020	30
2019	48
2018	39
