Topic

Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1,903 publications have been published within this topic, receiving 62,352 citations. The topic is also known as: fuzzy string-searching algorithm and fuzzy string-matching algorithm.
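To make the topic concrete, the following is a minimal sketch of one classic formulation of approximate string matching: report every position in a text where some substring ending there is within k edits (insertions, deletions, substitutions) of the pattern. It uses the standard dynamic-programming recurrence often attributed to Sellers; the function name `approximate_find` is an assumption made for illustration and is not taken from any paper listed on this page.

```python
def approximate_find(pattern: str, text: str, k: int) -> list[int]:
    """Return end positions (exclusive) in `text` where some substring
    ending there is within `k` edits of `pattern`."""
    m = len(pattern)
    # prev[i] = fewest edits to turn pattern[:i] into some substring of
    # the text ending at the previous position.
    prev = list(range(m + 1))
    ends = []
    for j, ch in enumerate(text, start=1):
        # cur[0] stays 0 because a match may start at any text position.
        cur = [0] * (m + 1)
        for i in range(1, m + 1):
            cost = 0 if pattern[i - 1] == ch else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitution
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character missing from the text
            )
        if cur[m] <= k:
            ends.append(j)
        prev = cur
    return ends


if __name__ == "__main__":
    print(approximate_find("abc", "xxabcxx", 0))
    print(approximate_find("abc", "abd", 1))
```

The sketch runs in O(mn) time and O(m) space; with k = 0 it reduces to exact matching.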


Papers
Journal Article
TL;DR: A generic in-place framework is designed that solves both the exact and the approximate k-mismatch SUS finding problems, using a minimum of 2n memory words, each of ⌈log₂ n⌉ bits, plus n bytes of space, where n is the input string size.

9 citations

Proceedings Article
27 Mar 2015
TL;DR: In this paper, a new string matching algorithm is proposed that matches the pattern from neither the left nor the right end but from a special position, which makes it more flexible in picking the position at which comparisons start.
Abstract: String matching is of great importance in pattern recognition. We put forth a new string matching algorithm that matches the pattern from neither the left nor the right end, but from a special position. Compared with the Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm, the new algorithm is more flexible in picking the position at which comparisons start, and this option yields a saving in cost. The method requires a statistical probability table for the alphabet, which can be set up using evolution strategies for dynamic conditions. If the chosen lowlight character in a given pattern has probability λ, the length of the text is n, and the length of the pattern is m, then we conjecture that the complexity of the new algorithm is Θ(n/λm).
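The idea of starting comparisons at a low-probability character can be sketched as follows. This is not the authors' algorithm, only a simplified illustration under stated assumptions: the probability table is a plain dictionary supplied by the caller (the paper builds its table with evolution strategies), and the names `find_rare_first` and `freq` are hypothetical.

```python
def find_rare_first(pattern: str, text: str, freq: dict[str, float]) -> list[int]:
    """Exact matching that anchors each alignment on the pattern character
    with the lowest assumed probability (illustrative sketch only)."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Pick the pattern position whose character is assumed to be rarest.
    # Characters missing from the table are treated as probability 0.
    pivot = min(range(m), key=lambda i: freq.get(pattern[i], 0.0))
    rare = pattern[pivot]
    matches = []
    # Only text positions holding the rare character can align with pivot.
    pos = text.find(rare, pivot)
    while pos != -1 and pos - pivot + m <= n:
        start = pos - pivot
        if text[start:start + m] == pattern:
            matches.append(start)
        pos = text.find(rare, pos + 1)
    return matches


if __name__ == "__main__":
    table = {"e": 0.12, "n": 0.07, "d": 0.04, "l": 0.04}  # assumed values
    print(find_rare_first("needle", "haystack needle haystack needle", table))
```

Because full comparisons happen only where the rare character occurs, a lower probability for the chosen character means fewer candidate alignments to verify.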

9 citations

Journal Article
TL;DR: A classifier based on Approximate String Matching (ASM) is proposed, which encodes the trajectories of the hand joints as character sequences using the K-means algorithm and then analyzes these sequences with ASM.
Abstract: New interaction paradigms combined with emerging technologies have led to the creation of diverse Natural User Interface (NUI) devices in the market. These devices enable the recognition of body gestures, allowing users to interact with applications in a more direct, expressive, and intuitive way. In particular, the Leap Motion Controller (LMC) device has been receiving plenty of attention from NUI application developers because it allows them to address limitations on gestures made with hands. Although this device is able to recognize the position of several parts of the hands, developers are still left with the difficult task of recognizing gestures. For this reason, several authors have approached this problem using machine learning techniques. We propose a classifier based on Approximate String Matching (ASM). In short, we encode the trajectories of the hand joints as character sequences using the K-means algorithm and then we analyze these sequences with ASM. It should be noted that, when using the K-means algorithm, we select the number of clusters for each part of the hands by considering the Silhouette Coefficient. Furthermore, we define other important factors to take into account for improving the recognition accuracy. For the experiments, we generated a balanced dataset including different types of gestures and afterwards we performed a cross-validation scheme. Experimental results showed the robustness of the approach in terms of recognizing different types of gestures, time spent, and allocated memory. Moreover, our approach achieved higher performance rates than well-known algorithms proposed in the current state of the art for gesture recognition.
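The encode-then-match pipeline described in this abstract can be sketched roughly as below. This is a simplified illustration, not the authors' classifier: the centroids are assumed to be given (the paper obtains them with K-means and chooses the cluster count with the Silhouette Coefficient), plain Levenshtein distance stands in for the ASM step, and the names `encode`, `levenshtein`, `classify`, and the sample templates are all hypothetical.

```python
from math import dist

LETTERS = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def encode(trajectory, centroids):
    """Map each trajectory point to the letter of its nearest centroid."""
    return "".join(
        LETTERS[min(range(len(centroids)), key=lambda c: dist(p, centroids[c]))]
        for p in trajectory
    )


def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # match or substitute
        prev = cur
    return prev[-1]


def classify(sequence: str, templates: dict[str, str]) -> str:
    """Return the label of the template string closest to `sequence`."""
    return min(templates, key=lambda label: levenshtein(sequence, templates[label]))


if __name__ == "__main__":
    centroids = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0)]    # assumed cluster centres
    seq = encode([(0.1, 0.0), (0.9, 0.1), (1.0, 0.9)], centroids)
    templates = {"swipe": "ABCC", "circle": "CBAB"}     # made-up gesture strings
    print(seq, classify(seq, templates))
```

Encoding trajectories as strings lets gestures of different durations be compared, since edit distance tolerates inserted or dropped symbols.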

9 citations

Patent
Dharmendra S. Modha
30 Oct 2002
TL;DR: A system and method for lossy compression of finite alphabet source sequences subject to an average-per-letter distortion constraint is presented, in which the source sequence is sequentially parsed into phrases and each source phrase is mapped to a distorted phrase such that the average per-letter distortion between the two phrases does not exceed the desired distortion.
Abstract: A system and method are provided for lossy compression of finite alphabet source sequences subject to an average-per-letter distortion constraint. The source sequence is sequentially parsed into phrases and each source phrase is mapped to a distorted phrase such that average per-letter distortion between the two phrases does not exceed the desired distortion. The present system adaptively maintains a codebook as the collection of all one-letter extensions of previously emitted distorted phrases. The present system uses approximate string matching and carries out a sequential procedure by iterating the following steps: (i) given the current codebook find the longest source phrase that can be transmitted at a given distortion, (ii) from all codewords that match the source phrase carefully choose that which is most likely to be useful in the future. For every new source phrase, the present system judiciously selects one of the many approximately matching codewords to balance between the code rate for the current phrase versus the code rate from resulting codebooks for the future source phrases. The present system outputs a distorted sequence that can be naturally losslessly compressed using the Lempel-Ziv algorithm or any variation thereof. Such judicious codeword selection is intended to iteratively improve the codebook quality. The entire present sequence can be implemented in quadratic-time in the length of the source sequence. The present system is sequential and adaptive.

9 citations

Journal Article
TL;DR: The DMA, a general approach algorithm to perform direct access matching for the exact pattern or its similarities within a text depending on the location of a character in alphabetical order, is proposed.
Abstract: Approximate string matching algorithms are techniques used to find a pattern 'P' in a text 'T' partially or exactly. These techniques are very important in terms of the performance and accuracy of search results. In this paper, we propose a general-purpose algorithm, called the Direct Matching Algorithm (DMA). The function of this algorithm is to perform direct access matching for the exact pattern or its similarities within a text depending on the location of a character in alphabetical order. We simulated the DMA in order to show its competence. The simulation result showed significant improvement in exact string matching and similarity matching, and therefore strong competence in real applications.

9 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 81% related
Cluster analysis: 146.5K papers, 2.9M citations, 80% related
Scheduling (computing): 78.6K papers, 1.3M citations, 79% related
Network packet: 159.7K papers, 2.2M citations, 78% related
Optimization problem: 96.4K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    8
2022    30
2021    32
2020    30
2019    48
2018    39