Approximate string matching

About: Approximate string matching is a research topic. To date, 1903 publications have appeared within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm or fuzzy string-matching algorithm.

Papers

Journal Article•DOI•

INSPIRE: A Framework for Incremental Spatial Prefix Query Relaxation

Yuxin Zheng¹, Zhifeng Bao², Lidan Shou³, Anthony K. H. Tung¹•Institutions (3)

National University of Singapore¹, RMIT University², Zhejiang University³

01 Jul 2015-IEEE Transactions on Knowledge and Data Engineering

TL;DR: This paper proposes INSPIRE, a general framework that adopts a unifying strategy for processing different variants of spatial keyword queries and follows the autocompletion paradigm, generating the initial query as a prefix matching query.

Abstract: Geo-textual data are generated in abundance. Recent studies focused on the processing of spatial keyword queries which retrieve objects that match certain keywords within a spatial region. To ensure effective retrieval, various extensions were done including the allowance of errors in keyword matching and autocompletion using prefix matching. In this paper, we propose INSPIRE, a general framework, which adopts a unifying strategy for processing different variants of spatial keyword queries. We adopt the autocompletion paradigm that generates an initial query as a prefix matching query. If there are few matching results, other variants are performed as a form of relaxation that reuses the processing done in the earlier phase. The types of relaxation allowed include spatial region expansion and exact/approximate prefix/substring matching. Moreover, since the autocompletion paradigm allows appending characters after the initial query, we look at how query processing done for the initial query and relaxation can be reused in such instances. Compared to existing works which process variants of spatial keyword query as new queries over different indexes, our approach offers a more compelling way to efficient and effective spatial keyword search. Extensive experiments substantiate our claims.
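The relaxation idea in this abstract (try an exact prefix match first, and fall back to approximate prefix matching when too few results come back) can be illustrated with a minimal Python sketch. This is not the INSPIRE framework itself: it ignores the spatial component, the indexes, and the reuse of earlier processing. The names `prefix_edit_distance`, `search`, `min_results`, and `max_errors` are hypothetical and chosen only for illustration.

```python
def prefix_edit_distance(query: str, word: str) -> int:
    """Smallest edit distance between `query` and any prefix of `word`."""
    # prev[j] holds the edit distance between query[:i] and word[:j].
    prev = list(range(len(word) + 1))
    for i, qc in enumerate(query, start=1):
        curr = [i]
        for j, wc in enumerate(word, start=1):
            cost = 0 if qc == wc else 1
            curr.append(min(prev[j] + 1,          # delete qc
                            curr[j - 1] + 1,      # insert wc
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    # The query is fully consumed; the word may stop at any prefix.
    return min(prev)


def search(query, keywords, min_results=2, max_errors=1):
    """Exact prefix matching first, approximate prefix matching as relaxation."""
    exact = [w for w in keywords if w.startswith(query)]
    if len(exact) >= min_results:
        return exact
    # Relaxation step (assumed policy): admit keywords whose best prefix
    # is within `max_errors` edits of the query.
    relaxed = [w for w in keywords
               if w not in exact and prefix_edit_distance(query, w) <= max_errors]
    return exact + relaxed


keywords = ["coffee", "college", "cofee shop", "museum"]
print(search("co", keywords))    # enough exact prefix matches, no relaxation
print(search("coff", keywords))  # one exact match, so the relaxed matches are added
```

In this toy version the relaxed pass rescans every keyword; the abstract's point is precisely that a real system would reuse the work done for the initial query instead.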

12 citations

Proceedings Article•DOI•

A hardware-efficient multi-character string matching architecture using brute-force algorithm

Seongyong Ahn¹, Hyejong Hong¹, HyunJin Kim¹, Jin-Ho Ahn², Dongmyong Baek³, Sungho Kang¹•Institutions (3)

Yonsei University¹, Hoseo University², Electronics and Telecommunications Research Institute³

01 Nov 2009

TL;DR: A hardware-efficient string matching architecture using the brute-force algorithm is proposed, and the process element that makes up the architecture is optimized by reducing the number of comparators.

Abstract: Due to the growth of network environment complexity, the necessity of packet payload inspection at application layer is increased. String matching, which is critical to network intrusions detection systems, inspects packet payloads and detects malicious network attacks using a set of rules. Because string matching is a computationally intensive task, hardware based string matching is required. In this paper, we propose a hardware-efficient string matching architecture using the brute-force algorithm. A process element that organizes the proposed architecture is optimized by reducing the number of the comparators. The performance of the proposed architecture is nearly equal to a previous work. The experimental results show that the proposed architecture with any process width reduces the comparator requirements in comparison with the previous work.
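For readers unfamiliar with the brute-force algorithm named in this abstract, a software-only Python sketch follows. It compares the pattern against the payload at every offset; it says nothing about the paper's hardware design, process elements, or comparator layout. The function name `brute_force_find` and the sample payload are hypothetical.

```python
def brute_force_find(payload: bytes, pattern: bytes) -> list[int]:
    """Return every offset at which `pattern` occurs in `payload`."""
    if not pattern:
        return []  # assumed convention: an empty pattern matches nowhere
    hits = []
    # Try each possible alignment of the pattern against the payload.
    for start in range(len(payload) - len(pattern) + 1):
        if payload[start:start + len(pattern)] == pattern:
            hits.append(start)
    return hits


print(brute_force_find(b"GET /etc/passwd /etc", b"/etc"))  # [4, 16]
```

Each alignment is independent of the others, which is the property that makes the brute-force approach straightforward to parallelize.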

12 citations

Dissertation•DOI•

Filter algorithms for approximate string matching

Stefan Burkhardt

01 Jan 2002

TL;DR: The results prove that gapped q-grams are superior to existing filter approaches with respect to speed, filtration efficiency and their potential for use in lossy filters.

Abstract: In this work we present new results and methods for approximate string matching with filter algorithms. We begin with the presentation of QUASAR, our efficient implementation of an improved version of the filter based on the q-gram lemma. The q-gram lemma provides a method based on matching substrings to quickly detect potential matches to a query in a subject or target. We improved and implemented an algorithm originally introduced in 1991. This resulted in a very efficient program for approximate string matching using an index. It was successfully applied to EST-clustering, a problem from computational biology. The second part of our work introduces a new class of filters based on gapped q-grams. We analyze the potential of this somewhat more complicated approach for use in filters for approximate string matching with an index. We consider two important distance measures in approximate string matching, the Hamming and the edit distance. For both problems we provide all the tools required to solve them using gapped q-grams. This includes threshold computation and the selection of good gapped q-grams using predictions of their speed and filtration efficiency. Furthermore, we consider the potential of gapped q-grams for use in lossy filters. We support our findings with extensive experiments. Our results prove that gapped q-grams are superior to existing filter approaches with respect to speed, filtration efficiency and their potential for use in lossy filters.

German abstract (translated): In this work we describe new results and methods in the field of filter algorithms for similarity search in text databases. In the first part we present QUASAR, the implementation of an improved filter based on the so-called q-gram lemma. This lemma is based on the comparison of short substrings and enables the efficient detection of those parts of a text database that resemble a given query. The second part of the work presents a new class of filters that use q-grams with gaps, so-called "gapped q-grams". We investigate the potential of these more complex q-grams for use in filter algorithms for index-based similarity search in text databases.
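The q-gram lemma mentioned in this abstract can be illustrated with a short Python sketch. In its usual form, the lemma states that two strings P and S within edit distance k share at least max(|P|, |S|) - q + 1 - k*q q-grams, so any pair sharing fewer can be discarded without computing the edit distance. The code below covers only contiguous q-grams and a pairwise check; it is not QUASAR, uses no index, and does not cover gapped q-grams. The names `qgrams` and `passes_qgram_filter` are hypothetical.

```python
from collections import Counter


def qgrams(s: str, q: int) -> Counter:
    """Multiset of all contiguous substrings of length q in s."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def passes_qgram_filter(p: str, s: str, q: int, k: int) -> bool:
    """Necessary condition for edit distance <= k, per the q-gram lemma."""
    threshold = max(len(p), len(s)) - q + 1 - k * q
    # Counter intersection keeps the minimum count of each shared q-gram.
    shared = sum((qgrams(p, q) & qgrams(s, q)).values())
    return shared >= threshold


print(passes_qgram_filter("ACGTACGT", "ACGTTCGT", q=2, k=1))  # True
print(passes_qgram_filter("ACGTACGT", "TTTTTTTT", q=2, k=1))  # False
```

The filter only rules candidates out: a pair that passes still has to be verified with an exact distance computation, and when the threshold is zero or negative the filter rejects nothing.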

12 citations

Proceedings Article•DOI•

Modeling and prototyping dynamically reconfigurable systems for efficient computation of dynamic programming methods by rewriting-logic

Mauricio Ayala-Rincón, Ricardo P. Jacobi¹, Luis G. A. Carvalho¹, Carlos H. Llanos¹, Reiner W. Hartenstein²•Institutions (2)

University of Brasília¹, Kaiserslautern University of Technology²

04 Sep 2004

TL;DR: This work shows how to use rewriting-logic to model and evaluate reconfigurable systolic architectures which are applied to the efficient treatment of several dynamic programming methods for resolving well-known problems such as global and local sequence alignment, approximate string matching and computation of the longest common subsequence.

Abstract: Systolic arrays provide a large amount of parallelism. However, their applicability is restricted to a small set of computational problems due to their lack of flexibility. This limitation can be circumvented by using reconfigurable systolic arrays, where the node interconnections and operations can be redefined even at run time. In this context, several alternative systolic architectures can be explored and powerful tools are needed to model and evaluate them. We show how well-known rewriting-logic environments could be used to quickly model and simulate complex application specific digital systems speeding-up its subsequent prototyping. We show how to use rewriting-logic to model and evaluate reconfigurable systolic architectures which are applied to the efficient treatment of several dynamic programming methods for resolving well-known problems such as global and local sequence alignment (Smith-Waterman algorithm), approximate string matching and computation of the longest common subsequence. A VHDL description of the conceived architecture was implemented from the rewriting-logic based abstract models and synthesized over an FPGA of the APEX family.
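As a point of reference for the dynamic programming methods listed in this abstract, here is a minimal sequential Python sketch of one of them, the length of the longest common subsequence. It is a plain software version and makes no claim about the paper's rewriting-logic models or its systolic architecture; the function name `lcs_length` is hypothetical.

```python
def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b."""
    # prev[j] is the LCS length of the processed prefix of a and b[:j].
    prev = [0] * (len(b) + 1)
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            if ca == cb:
                curr.append(prev[j - 1] + 1)            # extend a common subsequence
            else:
                curr.append(max(prev[j], curr[j - 1]))  # drop one character
        prev = curr
    return prev[-1]


print(lcs_length("AGGTAB", "GXTXAYB"))  # 4
```

Each cell depends only on its upper, left, and upper-left neighbours, so all cells on the same anti-diagonal can be computed independently, which is the kind of regular data flow that array-style hardware can exploit.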

11 citations

Book Chapter•DOI•

Approximate Matching of Network Expressions with Spacers

Gene Myers¹•Institutions (1)

University of Arizona¹

06 Apr 1992

TL;DR: A threshold-sensitive algorithm for approximately matching both network and regular expressions and a backtracking procedure whose order of evaluation is optimal in the sense that its expected time is minimal over all such procedures are presented.

Abstract: We present two algorithmic results pertinent to the matching of patterns of interest in macromolecular sequences. The first result is an output sensitive algorithm for approximately matching network expressions, i.e., regular expressions without Kleene closure. This result generalizes the O(kn) expected-time algorithm of Ukkonen for approximately matching keywords [Ukk85]. The second result concerns the problem of matching a pattern that is a network expression whose elements are approximate matches to network expressions interspersed with specifiable distance ranges. For this class of patterns, it is shown how to determine a backtracking procedure whose order of evaluation is optimal in the sense that its expected time is minimal over all such procedures.
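The keyword case that this abstract generalizes (finding where a keyword occurs in a text with at most k differences) can be sketched with the classical column-wise dynamic program usually attributed to Sellers. This Python sketch is the O(mn) baseline, not the O(kn) expected-time algorithm of Ukkonen cited in the abstract, and it does not handle network expressions or spacers. The function name `approximate_occurrences` is hypothetical.

```python
def approximate_occurrences(pattern: str, text: str, k: int) -> list[int]:
    """1-based end positions in text where pattern matches with <= k differences."""
    m = len(pattern)
    # col[i] = fewest edits turning pattern[:i] into some substring of the
    # text ending at the current position; before any text, that is i.
    col = list(range(m + 1))
    ends = []
    for pos, tc in enumerate(text, start=1):
        new = [0]  # an occurrence may start at any text position
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == tc else 1
            new.append(min(col[i - 1] + cost,  # match or substitute
                           col[i] + 1,         # extra text character
                           new[i - 1] + 1))    # missing pattern character
        col = new
        if col[m] <= k:
            ends.append(pos)
    return ends


print(approximate_occurrences("abc", "xxabcxx", k=0))  # [5]
print(approximate_occurrences("abc", "xxabcxx", k=1))  # [4, 5, 6]
```

Speedups of the kind cited in the abstract generally come from computing only the part of each column whose values can still be at most k, rather than the full column as done here.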

11 citations

Metrics

Papers: 1,942

Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	32
2020	30
2019	48
2018	39