Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1903 publications have been published within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.

Papers published on a yearly basis

[Chart: number of papers published per year, 1973 to 2023]

Papers

Journal Article

Stacked ensemble combined with fuzzy matching for biomedical named entity recognition of diseases.

Balu Bhasuran¹, Gurusamy Murugesan¹, Sabenabanu Abdulkadhar¹, Jeyakumar Natarajan¹

Bharathiar University¹

01 Dec 2016-Journal of Biomedical Informatics

TL;DR: Implements a stacked ensemble approach for biomedical named entity recognition of disease names, combined with fuzzy string matching to tag rare disease names from the authors' in-house disease dictionary.

46 citations

Book Chapter

Approximate String Matching over Ziv-Lempel Compressed Text

Juha Kärkkäinen¹, Gonzalo Navarro², Esko Ukkonen¹

University of Helsinki¹, University of Chile²

21 Jun 2000

TL;DR: Presents an algorithm for approximate pattern matching on LZ78/LZW-compressed text that can be adapted to run in O(k^2 n + min(mkn, m^2 (mσ)^k) + R) average time, where σ is the alphabet size; experimental results show a speedup over the basic approach for moderate m and small k.

Abstract: We present a solution to the problem of performing approximate pattern matching on compressed text. The format we choose is the Ziv-Lempel family, specifically the LZ78 and LZW variants. Given a text of length u compressed into length n, and a pattern of length m, we report all the R occurrences of the pattern in the text allowing up to k insertions, deletions and substitutions, in O(mkn + R) time. The existence problem needs O(mkn) time. We also show that the algorithm can be adapted to run in O(k^2 n + min(mkn, m^2 (mσ)^k) + R) average time, where σ is the alphabet size. The experimental results show a speedup over the basic approach for moderate m and small k.

46 citations
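
The problem stated in the abstract above (report every occurrence of a pattern of length m in a text, allowing up to k insertions, deletions and substitutions) can be illustrated on plain, uncompressed text. The sketch below is not the compressed-text algorithm of the chapter; it is the classical column-by-column dynamic program, which takes O(m) time per text character. The function name `approx_find` and the convention of reporting exclusive end offsets are assumptions made for illustration.

```python
def approx_find(pattern: str, text: str, k: int) -> list[int]:
    """Return every end offset j such that some substring text[i:j]
    is within edit distance k of pattern (uncompressed baseline)."""
    m = len(pattern)
    # col[i] = least edit distance between pattern[:i] and any
    # substring of the text ending at the current position.
    col = list(range(m + 1))
    ends = []
    for j, c in enumerate(text):
        diag = col[0]  # previous column's value at row i - 1
        # col[0] stays 0 because a match may start anywhere in the text.
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(
                diag + (pattern[i - 1] != c),  # match or substitution
                old + 1,                       # extra character in the text
                col[i - 1] + 1,                # pattern character missing
            )
            diag = old
        if col[m] <= k:
            ends.append(j + 1)
    return ends


print(approx_find("abc", "xxabcxx", 0))  # [5]
print(approx_find("abc", "xxabcxx", 1))  # [4, 5, 6]
```

With k = 1 the neighbouring end offsets are also reported, because "ab" and "abcx" are each one edit away from "abc".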

Journal Article

Approximately matching context-free languages

Gene Myers¹

University of Arizona¹

28 Apr 1995-Information Processing Letters

TL;DR: An O(PN^2(N + log P)) algorithm is given for approximately matching a string of length N against a context-free language specified by a grammar of size P; it generalizes the Cocke-Younger-Kasami algorithm for determining membership in a context-free language.

45 citations

Patent

Systems and methods for validating an address

Robert F. Snapp¹, James Daniel Self¹

United States Postal Service¹

29 Jun 2007

TL;DR: A field of an input address, such as the street name, is checked against a list of valid choices; when no exact match exists, the closest fuzzy match is accepted if a second field also validates, and the fuzzy-matched field may replace the input field, thereby correcting the input information.

Abstract: Systems, methods, and software determine whether a field of an input digital representation of information, such as the street name field in an address, is correct by quickly comparing the field to a list of valid choices for that field. The list of valid choices is generated based on information from the input digital representation, such as a character string. If an exact match is not found, a fuzzy match comparison determines the most closely matching valid choice. If a suitable fuzzy match is not found, then the input information is invalid. Otherwise, another field of the input information, such as the building number field of an address, is tested for validity. If the second field passes the validity check, then the fuzzy match (or exact match) for the field is valid. A fuzzy matching field may replace the input field, thereby correcting the input information.

45 citations
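
The validation flow described in the abstract above (exact match first, then closest fuzzy match, then a check of a second field) can be sketched roughly as follows. This is only an illustration, not the patented method: the abstract does not say which fuzzy measure is used, so plain Levenshtein distance is assumed, and the names `validate_street`, `valid_streets`, and the `max_distance` threshold of 2 are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance (assumed here as the fuzzy measure)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # delete ca
                           cur[j - 1] + 1,             # insert cb
                           prev[j - 1] + (ca != cb)))  # substitute
        prev = cur
    return prev[-1]


def validate_street(street, number, valid_streets, max_distance=2):
    """Return the (possibly corrected) street name, or None if invalid.

    valid_streets is a hypothetical dict mapping each valid street
    name to the set of building numbers that exist on it.
    """
    if not valid_streets:
        return None
    if street in valid_streets:
        candidate = street  # exact match
    else:
        # Fuzzy match: pick the closest valid choice.
        distance, candidate = min(
            (edit_distance(street, s), s) for s in valid_streets
        )
        if distance > max_distance:
            return None  # no suitable fuzzy match: input is invalid
    # Second field: the building number must be valid for the candidate.
    if number not in valid_streets[candidate]:
        return None
    return candidate


valid = {"MAIN ST": {10, 12}, "MAPLE AVE": {5}}
print(validate_street("MIAN ST", 12, valid))  # MAIN ST
```

Here the misspelled "MIAN ST" is two substitutions away from "MAIN ST", so it is corrected; the same input with building number 99 would be rejected by the second-field check.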

Patent

Adaptively weighted, partitioned context edit distance string matching

Alvin Richardson, Charles Davis, Daniel P. Miranker

26 Jul 2001

TL;DR: A pattern is partitioned into context and value components, and candidate matches for each component are identified by calculating an edit distance between that component and each potentially matching set (sub-string) of symbols within the string.

Abstract: A system and method for examining a string of symbols and identifying portions of the string which match a predetermined pattern using adaptively weighted, partitioned context edit distances. A pattern is partitioned into context and value components, and candidate matches for each of the components are identified by calculating an edit distance between that component and each potentially matching set (sub-string) of symbols within the string. One or more candidate matches having the lowest edit distances are selected as matches for the pattern. The weighting of each of the component matches may be adapted to optimize the pattern matching and, in one embodiment, the context components may be heavily weighted to obtain matches of a value for which the corresponding pattern is not well defined. In one embodiment, an edit distance matrix is evaluated for each of a prefix component, a value component and a suffix component of a pattern. The evaluation of the prefix matrix provides a basis for identifying indicators of the beginning of a value window, while the evaluation of the suffix matrix provides a basis for identifying the alignment of the end of the value window. The value within the value window can then be evaluated via the value matrix to determine a corresponding value match score.

45 citations
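
The prefix and suffix evaluation described in the abstract above can be illustrated with a much simplified sketch. It covers only the case where the value pattern is not well defined, so the value window is located purely from its context. It omits the adaptive weighting and the value matrix, and it is not the patented method. The helper names `end_distances` and `extract_value` are hypothetical.

```python
def end_distances(component: str, text: str) -> list[int]:
    """d[j] = least edit distance between component and any
    substring of text that ends at offset j (0 <= j <= len(text))."""
    m = len(component)
    col = list(range(m + 1))
    out = [col[m]]
    for c in text:
        diag = col[0]
        for i in range(1, m + 1):
            old = col[i]
            col[i] = min(diag + (component[i - 1] != c),
                         old + 1,
                         col[i - 1] + 1)
            diag = old
        out.append(col[m])
    return out


def extract_value(text: str, prefix: str, suffix: str):
    """Return (value, prefix_distance, suffix_distance)."""
    # Prefix scores: the best end offset marks the start of the window.
    ends = end_distances(prefix, text)
    p = min(range(len(ends)), key=ends.__getitem__)
    tail = text[p:]
    # Suffix scores: matching the reversed suffix against the reversed
    # tail turns "best end offset" into "best start offset".
    rev = end_distances(suffix[::-1], tail[::-1])
    starts = [rev[len(tail) - s] for s in range(len(tail) + 1)]
    s = min(range(len(starts)), key=starts.__getitem__)
    return tail[:s], ends[p], starts[s]


print(extract_value("Invoice Totl: 42.50 USD due", "Total: ", " USD"))
# ('42.50', 1, 0)
```

The misspelled context "Totl: " still anchors the start of the window at edit distance 1, and the exact suffix " USD" closes it. A fuller version would combine the component distances with weights and score the value itself.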

Performance

Metrics

Papers: 1,942

Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	32
2020	30
2019	48
2018	39
