Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1903 publications have been published within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.


Papers published on a yearly basis (chart, 1973 to 2023)

Papers

Book Chapter

Approximate String Matching with Address Bit Errors

Amihood Amir¹, Yonatan Aumann², Oren Kapah², Avivit Levy³, Ely Porat²

Johns Hopkins University¹, Bar-Ilan University², University of Haifa³

18 Jun 2008

TL;DR: This paper considers the case where bits of i may be erroneously flipped, either in a consistent or transient manner; the corresponding approximate pattern matching problems are formally defined and efficient algorithms for their resolution are provided.


Abstract: A string $S \in \Sigma^m$ can be viewed as a set of pairs $S = \{(\sigma_i, i) : i \in \{0, \ldots, m-1\}\}$. We consider approximate pattern matching problems arising from the setting where errors are introduced to the location component ($i$), rather than the more traditional setting, where errors are introduced to the content itself ($\sigma_i$). In this paper, we consider the case where bits of $i$ may be erroneously flipped, either in a consistent or transient manner. We formally define the corresponding approximate pattern matching problems, and provide efficient algorithms for their resolution, while introducing some novel techniques.


22 citations
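
The location-error setting in the abstract above can be illustrated with a small sketch. It assumes the simplest reading of a "consistent" error, namely that the same set of address bits is flipped for every position, so address i is read as i XOR mask. The function names and the brute-force search are illustrative assumptions, not the algorithms from the paper.

```python
def flip_address_bits(s: str, mask: int) -> str:
    """Return the string seen when every address i is read as i XOR mask."""
    m = len(s)
    # Assumption: m is a power of two, so i ^ mask always stays in 0..m-1.
    assert m > 0 and m & (m - 1) == 0 and 0 <= mask < m
    return "".join(s[i ^ mask] for i in range(m))


def best_consistent_mask(pattern: str, observed: str) -> tuple[int, int]:
    """Try every mask and return (mask, mismatches) with the fewest mismatches.

    This is a quadratic brute-force search used only for illustration.
    """
    assert len(pattern) == len(observed)
    m = len(pattern)
    best = (0, m + 1)
    for mask in range(m):
        candidate = flip_address_bits(pattern, mask)
        mismatches = sum(a != b for a, b in zip(candidate, observed))
        if mismatches < best[1]:
            best = (mask, mismatches)
    return best


if __name__ == "__main__":
    scrambled = flip_address_bits("abcdefgh", 0b101)
    print(scrambled)
    print(best_consistent_mask("abcdefgh", scrambled))
```

The content of the string is untouched here; only the positions are permuted, which is the distinction the abstract draws between location errors and content errors.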

Journal Article

A mean string algorithm to compute the average among a set of 2D shapes

Gemma Sánchez¹, Josep Lladós¹, Karl Tombre

Autonomous University of Barcelona¹

01 Jan 2002-Pattern Recognition Letters

TL;DR: An algorithm to compute the mean shape, when the shape is represented by a string, is presented as a modification of the well-known string edit algorithm; it converts sets of mapped symbols into piecewise linear functions and computes their mean.


22 citations

Journal Article

An accurate toponym-matching measure based on approximate string matching

Deniz Kilinç¹

Celal Bayar University¹

01 Apr 2016-Journal of Information Science

TL;DR: The authors set out to create a single string-matching measure that can perform toponym matching regardless of the language; for this purpose they created an ASM measure called DAS, which comprises name similarity, word similarity and sentence similarity phases.


Abstract: Approximate string matching (ASM) is a challenging problem, which aims to match different string expressions representing the same object. In this paper, detailed experimental studies were conducted on the subject of toponym matching, which is a new domain where ASM can be performed, and the creation of a single string-matching measure that can perform toponym matching regardless of the language was attempted. For this purpose, an ASM measure called DAS, which comprises name similarity, word similarity and sentence similarity phases, was created. Considering the experimental results, the retrieval performance and system accuracy of DAS were much better than those of five other well-known measures that were compared on toponym test datasets. In addition, DAS had the best mean average precision values in six languages, and precision/recall graphs confirm this result.


22 citations

Journal Article

Information retrieval from historical newspaper collections in highly inflectional languages: A query expansion approach

Anni Järvelin¹, Heikki Keskustalo¹, Eero Sormunen¹, Miamaria Saastamoinen¹, Kimmo Kettunen

University UCINF¹

01 Dec 2016

TL;DR: Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.


Abstract: The aim of the study was to test whether query expansion by approximate string matching methods is beneficial in retrieval from historical newspaper collections in a language rich with compounds and inflectional forms (Finnish). First, approximate string matching methods were used to generate lists of index words most similar to contemporary query terms in a digitized newspaper collection from the 1800s. Top index word variants were categorized to estimate the appropriate query expansion ranges in the retrieval test. Second, the effectiveness of approximate string matching methods, automatically generated inflectional forms, and their combinations was measured in a Cranfield-style test. Finally, a detailed topic-level analysis of test results was conducted. In the index of a historical newspaper collection, the occurrences of a word are typically spread across many linguistic and historical variants, along with optical character recognition (OCR) errors. All query expansion methods improved the baseline results. Extensive expansion of around 30 variants for each query word was required to achieve the highest performance improvement. Query expansion based on approximate string matching was superior to using the inflectional forms of the query words, showing that coverage of the different types of variation is more important than precision in handling one type of variation.


22 citations
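
The first step described in the abstract above, generating a list of index words most similar to a query term, can be sketched roughly as follows. The abstract does not say which similarity measure was used, so plain Levenshtein edit distance is an assumption here, as are the function names and the sample Finnish word forms (hypothetical OCR variants). The default of 30 variants echoes the expansion size reported in the abstract.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]


def expand_query(term: str, index_words: list[str], k: int = 30) -> list[str]:
    """Return the k index words closest to term (ties broken alphabetically)."""
    ranked = sorted(set(index_words), key=lambda w: (edit_distance(term, w), w))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical index containing OCR-damaged and inflected variants.
    index = ["sanomalehti", "sanomalehtl", "sanoma1ehti", "sanomalehden", "kirja"]
    print(expand_query("sanomalehti", index, k=3))
```

Scanning the whole index for every query term is slow on a large collection; a real system would likely use an n-gram index or similar filter before computing distances.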

Patent

Voice fuzzy retrieval method and apparatus

Ji Wu, Ping Lu, Chen Zhigang, Wang Zhiguo, Guoping Hu, Xiaoru Wu, Sheng Qian, Yu Hu, Renhua Wang, Qingfeng Liu

11 Aug 2010

TL;DR: A method and a device for speech fuzzy retrieval are disclosed. Speech recognition is performed on the obtained speech signals using a preset acoustic model and a language model to obtain recognition results; retrieval is then performed in a preset text entry database using a preset index table according to the recognition results.


Abstract: The invention discloses a method and a device for speech fuzzy retrieval. The method comprises the following steps. Speech recognition is performed on the obtained speech signals using a preset acoustic model and a language model, and recognition results are obtained. Retrieval is performed in a preset text entry database using a preset index table according to the recognition results, and preliminary candidate entries are obtained. Fuzzy string matching is performed between the preliminary candidate entries and the recognition results; entries whose matching degree falls within a preset threshold range are selected as refined entries, and the matching position of each entry is recorded. Posterior probabilities between the text of the matched part, the refined entries and the voice signals are calculated. Finally, a plurality of entries are selected as the retrieval results for the voice signals using the posterior probability and the matching proportion obtained from the matching positions. With the invention, text entries matching the voice signals can be retrieved quickly and accurately from a large text entry database.


21 citations
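
The fuzzy-matching step of the patent abstract above (match candidate entries against the recognition result, keep those above a threshold, and record where they matched) might look roughly like the sketch below. The patent text does not specify the matching algorithm, so the Sellers-style approximate substring search, the definition of matching degree as 1 minus distance over entry length, and all names are assumptions. The posterior-probability step is omitted because it depends on the acoustic model.

```python
def best_substring_match(entry: str, text: str) -> tuple[int, int]:
    """Minimum edit distance between entry and any substring of text.

    Returns (distance, end), where end is the index in text just past the
    best-matching substring. Row 0 is all zeros so a match may start anywhere.
    """
    prev = [0] * (len(text) + 1)
    for i, ce in enumerate(entry, start=1):
        curr = [i]  # matching i entry characters against nothing costs i
        for j, ct in enumerate(text, start=1):
            curr.append(min(prev[j] + 1,
                            curr[j - 1] + 1,
                            prev[j - 1] + (ce != ct)))
        prev = curr
    end = min(range(len(text) + 1), key=prev.__getitem__)
    return prev[end], end


def select_entries(candidates: list[str], recognized: str,
                   min_degree: float = 0.75) -> list[tuple[str, float, int]]:
    """Keep candidates whose matching degree reaches min_degree (assumed scale 0..1)."""
    selected = []
    for entry in candidates:
        if not entry:
            continue  # an empty entry has no meaningful matching degree
        distance, end = best_substring_match(entry, recognized)
        degree = 1.0 - distance / len(entry)
        if degree >= min_degree:
            selected.append((entry, degree, end))
    return selected


if __name__ == "__main__":
    print(select_entries(["hello", "goodbye"], "say hallo world"))
```

Recording the end position of each match is what would later allow a matching proportion to be computed, as the abstract describes for the final ranking.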

Metrics

Papers: 1,942
Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	32
2020	30
2019	48
2018	39
