Topic

Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1,903 publications have been published on this topic, receiving 62,352 citations. The topic is also known as: fuzzy string-searching algorithm & fuzzy string-matching algorithm.


Papers
Book Chapter
TL;DR: A practical parallel algorithm of comparable simplicity that requires only time, where w is the word size of the machine and p the number of processors, and the algorithm’s performance is independent of k and the alphabet size |Σ|.
Abstract: This paper deals with the approximate string-matching problem with Hamming distance. The approximate string-matching with k-mismatches problem is to find all locations at which a query of length m matches a factor of a text of length n with k or fewer mismatches. Approximate string-matching algorithms have both pleasing theoretical features and direct applications, especially in computational biology. We consider a generalisation of this problem, the fixed-length approximate string-matching with k-mismatches problem: given a text t, a pattern x and an integer l, search for all the occurrences in t of all factors of x of length l with k or fewer mismatches with a factor of t. We present a practical parallel algorithm of comparable simplicity that requires only time, where w is the word size of the machine (e.g. 32 or 64 in practice) and p the number of processors. Thus the algorithm's performance is independent of k and the alphabet size |Σ|. The proposed parallel algorithm makes use of the message-passing parallelism model and of word-level parallelism for efficient approximate string-matching.

8 citations
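
To make the two problem statements in the abstract above concrete, here is a minimal sequential sketch in Python. It is a naive O(nm) baseline, not the paper's parallel, word-level algorithm, and the function names `k_mismatch_positions` and `fixed_length_k_mismatch` are illustrative assumptions rather than names from the paper.

```python
def k_mismatch_positions(text: str, pattern: str, k: int) -> list[int]:
    """Return every start index where pattern matches text with at most k mismatches."""
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    break  # this alignment already exceeds the Hamming budget
        if mismatches <= k:
            hits.append(i)
    return hits


def fixed_length_k_mismatch(text: str, pattern: str, l: int, k: int) -> list[tuple[int, int]]:
    """Return (pattern_offset, text_offset) pairs whose length-l factors
    differ in at most k positions (naive baseline, assumed formulation)."""
    pairs = []
    for a in range(len(pattern) - l + 1):
        factor = pattern[a:a + l]
        for b in k_mismatch_positions(text, factor, k):
            pairs.append((a, b))
    return pairs
```

For example, `k_mismatch_positions("abcabd", "abd", 1)` reports both alignments of `abd`, the exact one and the one that differs in a single character.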

Journal Article
TL;DR: A simple recursive, memoized version of the Knuth-Morris-Pratt string matching algorithm is given, along with a proof of correctness and worst-case analysis.

8 citations
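
As a rough illustration of what a recursive, memoized Knuth-Morris-Pratt matcher can look like, the sketch below computes border lengths (the failure function) with a memoized recursive helper. This is one possible formulation under our own assumptions, not necessarily the one given in the article; `kmp_search` and `border` are hypothetical names.

```python
from functools import lru_cache


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return all start indices of exact occurrences of pattern in text."""
    m = len(pattern)
    if m == 0:
        return []

    @lru_cache(maxsize=None)
    def border(i: int) -> int:
        """Length of the longest proper border of pattern[:i]."""
        if i <= 1:
            return 0
        k = border(i - 1)
        while k > 0 and pattern[k] != pattern[i - 1]:
            k = border(k)
        return k + 1 if pattern[k] == pattern[i - 1] else k

    # Warm the cache in increasing order so no call recurses deeply.
    for i in range(m + 1):
        border(i)

    hits = []
    q = 0  # number of pattern characters currently matched
    for pos, ch in enumerate(text):
        while q > 0 and pattern[q] != ch:
            q = border(q)  # fall back to the next shorter border
        if pattern[q] == ch:
            q += 1
        if q == m:
            hits.append(pos - m + 1)
            q = border(q)  # allow overlapping occurrences
    return hits
```

Memoization ensures each border length is computed once, so the preprocessing stays linear in the pattern length.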

Proceedings Article
01 Mar 2010
TL;DR: This paper shows that the optimal execution of intersecting posting lists of q-grams for substring matching queries should be decided judiciously, and presents the optimal and approximate algorithms based on cost estimation for substring matching queries.
Abstract: With the widespread use of the internet, text-based data sources have become ubiquitous and the demand for effective support for string matching queries is ever increasing. The relational query language SQL also supports the LIKE clause over string data to handle substring matching queries. Due to the popularity of such substring matching queries, there has been a lot of study on designing efficient indexes to support the LIKE clause in SQL. Among them, q-gram based indexes have been studied extensively. However, how to process substring matching queries efficiently with such indexes has received very little attention until recently. In this paper, we show that the optimal execution of intersecting posting lists of q-grams for substring matching queries should be decided judiciously. Then we present the optimal and approximate algorithms based on cost estimation for substring matching queries. A performance study confirms that our techniques improve query execution time with q-gram indexes significantly compared to the traditional algorithms.

8 citations
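
The q-gram approach described in the abstract above can be sketched as an inverted index whose posting lists are intersected to produce candidates, which are then verified. The code below is a minimal illustration only: intersecting the shortest posting list first is a simple heuristic standing in for the paper's cost-based planning, and `build_qgram_index` and `substring_query` are hypothetical names.

```python
from collections import defaultdict


def build_qgram_index(strings: list[str], q: int = 2) -> dict[str, set[int]]:
    """Map each q-gram to the set of string ids that contain it."""
    index = defaultdict(set)
    for sid, s in enumerate(strings):
        for i in range(len(s) - q + 1):
            index[s[i:i + q]].add(sid)
    return index


def substring_query(strings, index, query: str, q: int = 2) -> list[int]:
    """Return ids of strings containing query, like SQL's LIKE '%query%'."""
    if len(query) < q:
        # No q-gram can be extracted, so fall back to a full scan.
        return [sid for sid, s in enumerate(strings) if query in s]
    grams = {query[i:i + q] for i in range(len(query) - q + 1)}
    # Heuristic (assumption): intersect the shortest posting lists first.
    postings = sorted((index.get(g, set()) for g in grams), key=len)
    candidates = set(postings[0])
    for plist in postings[1:]:
        if not candidates:
            break
        candidates &= plist
    # Sharing all q-grams is necessary but not sufficient, so verify.
    return sorted(sid for sid in candidates if query in strings[sid])
```

The final verification step matters because a string can contain every q-gram of the query without containing the query itself.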

Book Chapter
18 Dec 2006
TL;DR: An O(n^2) time algorithm for approximating the unit cost edit distance for ordered and rooted trees of bounded degree within a factor of O(n^{3/4}), where n is the maximum size of two input trees, and the algorithm is based on transformation of an ordered and rooted tree into a string.
Abstract: This paper presents an O(n^2) time algorithm for approximating the unit cost edit distance for ordered and rooted trees of bounded degree within a factor of O(n^{3/4}), where n is the maximum size of two input trees, and the algorithm is based on transformation of an ordered and rooted tree into a string.

8 citations
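
The general idea of comparing trees through strings can be sketched as follows. The encoding used here (a depth-first tour that emits a child's label on the way down and a marked copy on the way up) is an assumption for illustration, and this sketch makes no claim to reproduce the paper's transformation or its approximation guarantee. The quadratic dynamic program is the standard unit cost string edit distance.

```python
def euler_string(tree):
    """Serialize an ordered rooted tree given as (label, [children]).

    Each child contributes ("down", label) on entry and ("up", label) on exit.
    """
    _, children = tree
    out = []
    for child in children:
        out.append(("down", child[0]))
        out.extend(euler_string(child))
        out.append(("up", child[0]))
    return out


def edit_distance(a, b) -> int:
    """Unit cost edit distance between two sequences, O(len(a) * len(b)) time."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1,              # delete x
                           cur[j - 1] + 1,           # insert y
                           prev[j - 1] + (x != y)))  # substitute or match
        prev = cur
    return prev[-1]


def tree_string_distance(t1, t2) -> int:
    """String edit distance between the serialized forms of two trees."""
    return edit_distance(euler_string(t1), euler_string(t2))
```

Under this encoding, deleting a leaf removes two symbols from the string, so the string distance and the tree distance can differ by a constant factor even in simple cases.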

Proceedings Article
10 Apr 2013
TL;DR: An interactive tabletop based on hand gesture recognition is presented, with a histogram-based method for hand detection and extraction and a scale- and rotation-invariant method that merges the Chain Code algorithm with a modified approximate string matching method.
Abstract: In this paper, we present an interactive tabletop based on hand gesture recognition. We propose a histogram-based method for hand detection and extraction. For gesture recognition, we propose a scale- and rotation-invariant method merging the Chain Code algorithm and a modified approximate string matching method. With the developed system, the user can perform different gestures such as zoom, move, draw, and write on a virtual keyboard. The implemented system provides more flexible, natural and intuitive interaction possibilities, and also offers an economical and practical way of interaction.

8 citations
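
A common way to combine chain codes with approximate string matching is to encode a contour as a Freeman 8-direction chain code, take its circular first difference so that rotations by multiples of 45 degrees do not change the code, and compare codes by edit distance. The sketch below assumes that pipeline; it is not the paper's modified method, it does not address scale or starting-point invariance, and all names are illustrative.

```python
# Freeman 8-direction codes, counter-clockwise with y pointing up (assumed convention).
DIRECTIONS = {(1, 0): 0, (1, 1): 1, (0, 1): 2, (-1, 1): 3,
              (-1, 0): 4, (-1, -1): 5, (0, -1): 6, (1, -1): 7}


def chain_code(points):
    """Chain code of a closed contour given as consecutive 8-connected points."""
    closed = points + points[:1]
    return [DIRECTIONS[(x1 - x0, y1 - y0)]
            for (x0, y0), (x1, y1) in zip(closed, closed[1:])]


def first_difference(code):
    """Circular first difference: direction changes instead of absolute directions."""
    return [(code[i] - code[i - 1]) % 8 for i in range(len(code))]


def edit_distance(a, b) -> int:
    """Unit cost edit distance between two code sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        cur = [i]
        for j, y in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (x != y)))
        prev = cur
    return prev[-1]


def gesture_distance(contour_a, contour_b) -> int:
    """Compare two contours through their differenced chain codes."""
    return edit_distance(first_difference(chain_code(contour_a)),
                         first_difference(chain_code(contour_b)))
```

A unit square and the same square rotated by 90 degrees produce different chain codes but identical first differences, so their distance is zero.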


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 81% related
Cluster analysis: 146.5K papers, 2.9M citations, 80% related
Scheduling (computing): 78.6K papers, 1.3M citations, 79% related
Network packet: 159.7K papers, 2.2M citations, 78% related
Optimization problem: 96.4K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  8
2022  30
2021  32
2020  30
2019  48
2018  39