Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1903 publications have been published within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.

Papers published on a yearly basis (chart; years 1973 to 2023)

Papers

Journal Article

The string merging problem

Stephen Y. Itoga¹

University of Hawaii¹

01 Mar 1981 - BIT Numerical Mathematics

TL;DR: A special case where deletion is the only allowed edit operation is shown to have the longest common subsequence of the strings as its solution.

Abstract: The string merging problem is to determine a merged string from a given set of strings. The distinguishing property of a solution is that the total cost of editing all of the given strings into this solution is minimal. Necessary and sufficient conditions are presented for the case where this solution matches the solution to the string-to-string correction problem. A special case where deletion is the only allowed edit operation is shown to have the longest common subsequence of the strings as its solution.
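To make the deletion-only case concrete, here is a minimal sketch for two strings, assuming unit cost per deleted character. The function names are illustrative and are not taken from the paper, which treats a general set of strings. With only deletions allowed, any merged string must be a common subsequence, so the longest one minimizes the total number of deletions.

```python
def longest_common_subsequence(a: str, b: str) -> str:
    """Return one longest common subsequence of a and b (classic DP)."""
    m, n = len(a), len(b)
    # table[i][j] holds the LCS length of a[:i] and b[:j].
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])

    # Walk back from the bottom-right corner to recover the subsequence.
    out = []
    i, j = m, n
    while i > 0 and j > 0:
        if a[i - 1] == b[j - 1]:
            out.append(a[i - 1])
            i -= 1
            j -= 1
        elif table[i - 1][j] >= table[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return "".join(reversed(out))


def total_deletion_cost(strings, merged: str) -> int:
    """Total deletions needed to turn every string into `merged`.

    Assumes `merged` is a subsequence of each string and unit deletion cost.
    """
    return sum(len(s) - len(merged) for s in strings)


if __name__ == "__main__":
    pair = ["AGGTAB", "GXTXAYB"]
    merged = longest_common_subsequence(*pair)
    print(merged, total_deletion_cost(pair, merged))
```

Extending this to more than two strings is not a matter of chaining pairwise calls; the sketch only illustrates the two-string case.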

36 citations

Proceedings Article

A Discriminative Candidate Generator for String Transformations

Naoaki Okazaki¹, Yoshimasa Tsuruoka², Sophia Ananiadou², Jun'ichi Tsujii²

University of Tokyo¹, University of Manchester²

25 Oct 2008

TL;DR: A discriminative approach for generating candidate strings uses substring substitution rules as features and scores them with an L1-regularized logistic regression model; the method is reported to perform well in normalizing inflected words and spelling variations.

Abstract: String transformation, which maps a source string s into its desirable form t*, is related to various applications including stemming, lemmatization, and spelling correction. The essential and important step for string transformation is to generate candidates to which the given string s is likely to be transformed. This paper presents a discriminative approach for generating candidate strings. We use substring substitution rules as features and score them using an L1-regularized logistic regression model. We also propose a procedure to generate negative instances that affect the decision boundary of the model. The advantage of this approach is that candidate strings can be enumerated by an efficient algorithm because the processes of string transformation are tractable in the model. We demonstrate the remarkable performance of the proposed method in normalizing inflected words and spelling variations.
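The idea of scoring substitution rules with a logistic model can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the rule set, the weights, the bias, and the restriction to a single rule application per candidate are hypothetical and are not the authors' model or learned parameters.

```python
import math

# Hypothetical substitution rules (source substring -> target substring)
# with made-up weights standing in for learned logistic regression weights.
RULES = {
    ("ies", "y"): 1.8,
    ("s", ""): 0.9,
    ("ing", ""): 1.2,
    ("ed", ""): 1.0,
}
BIAS = -1.0  # hypothetical intercept


def generate_candidates(source, rules=RULES, bias=BIAS):
    """Apply each rule once at every position where it matches and
    score the resulting candidate with a logistic function."""
    scored = {}
    for (lhs, rhs), weight in rules.items():
        start = source.find(lhs)
        while start != -1:
            candidate = source[:start] + rhs + source[start + len(lhs):]
            score = 1.0 / (1.0 + math.exp(-(bias + weight)))
            # Keep the best score if several rules yield the same string.
            scored[candidate] = max(score, scored.get(candidate, 0.0))
            start = source.find(lhs, start + 1)
    # Highest-scoring candidates first.
    return sorted(scored.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    for candidate, score in generate_candidates("studies"):
        print(f"{candidate}\t{score:.3f}")
```

In a trained model the weights would come from fitting on positive and negative instances, and L1 regularization would drive most rule weights to zero; the sketch skips training entirely.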

35 citations

Proceedings Article

A mathematics retrieval system for formulae in layout presentations

Xiaoyan Lin¹, Liangcai Gao¹, Xuan Hu¹, Zhi Tang¹, Yingnan Xiao², Xiaozhong Liu³

Peking University¹, Beijing University of Posts and Telecommunications², Indiana University³

03 Jul 2014

TL;DR: Experiments show that the new system and its novel algorithms, compared with two representative mathematics retrieval systems, provide more efficient mathematical formula indexing and retrieval, while simplifying user query input for PDF documents.

Abstract: The semantics of mathematical formulae depend on their spatial structure, and they usually exist in layout presentations such as PDF, LaTeX, and Presentation MathML, which challenges previous text index and retrieval methods. This paper proposes a mathematics retrieval system, along with novel algorithms, that enables efficient formula indexing and retrieval from both webpages and PDF documents. Unlike prior studies, which require users to manually input formula markup language as a query, the new system enables users to "copy" formula queries directly from PDF documents. Furthermore, by using a novel indexing and matching model, the system aims to search for similar mathematical formulae based on both textual and spatial similarities. A hierarchical generalization technique is proposed to generate sub-trees from the semi-operator tree of formulae and to support substructure match and fuzzy match. Experiments based on massive Wikipedia and CiteSeer repositories show that the new system and its novel algorithms, compared with two representative mathematics retrieval systems, provide more efficient mathematical formula indexing and retrieval, while simplifying user query input for PDF documents.

35 citations

Book Chapter

Approximate Pattern Matching with Samples

Tadao Takaoka¹

Ibaraki University¹

25 Aug 1994

TL;DR: The Chang and Lawler algorithm for approximate string matching is simplified using sampling; a more general expected-time analysis is given for the one-dimensional case under a non-uniform probability distribution, and the method is shown to generalize easily to the two-dimensional approximate pattern matching problem with sublinear expected time.

Abstract: We simplify in this paper the algorithm by Chang and Lawler for the approximate string matching problem, by adopting the concept of sampling. We have a more general analysis of expected time with the simplified algorithm for the one-dimensional case under a non-uniform probability distribution, and we show that our method can easily be generalized to the two-dimensional approximate pattern matching problem with sublinear expected time.
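For readers unfamiliar with the underlying problem, the following sketch shows the textbook dynamic-programming baseline for approximate string matching with at most k edits (insertions, deletions, substitutions). It is not the Chang and Lawler algorithm and not the sampling-based simplification described in the abstract; it runs in time proportional to pattern length times text length and only illustrates what the faster methods compute. The function name is illustrative.

```python
def approximate_matches(pattern: str, text: str, k: int) -> list[int]:
    """Return the end positions (0-based indices into text) where some
    substring of text ending there is within edit distance k of pattern."""
    m = len(pattern)
    # column[i] = smallest edit distance between pattern[:i] and any
    # substring of the text ending at the current text position.
    # column[0] stays 0 because a match may start anywhere in the text.
    column = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text):
        prev_diag = column[0]
        for i in range(1, m + 1):
            prev_diag, column[i] = column[i], min(
                prev_diag + (pattern[i - 1] != ch),  # match or substitute
                column[i] + 1,                       # extra text character
                column[i - 1] + 1,                   # missing pattern character
            )
        if column[m] <= k:
            ends.append(j)
    return ends


if __name__ == "__main__":
    print(approximate_matches("abc", "xxabdxx", 1))
```

Filtering approaches such as the one in the abstract aim to discard most text regions cheaply and run a verification step like this only where a match is possible.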

35 citations

Algorithms for computing approximate repetitions in musical sequences

Emilios Cambouropoulos¹, Maxime Crochemore, Costas S. Iliopoulos, Laurent Mouchard, Yoan José Pinzón Ardila

Austrian Research Institute for Artificial Intelligence¹

01 Jan 1999

TL;DR: In this paper, the authors introduce two new notions of approximate matching with application in computer assisted music analysis, and present algorithms for each notion of approximation: for approximate string matching and for computing approximate squares.

Abstract: Here we introduce two new notions of approximate matching with application in computer assisted music analysis. We present algorithms for each notion of approximation: for approximate string matching and for computing approximate squares.

35 citations

Metrics

Papers: 1,942

Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	32
2020	30
2019	48
2018	39
