Topic

Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1,903 publications have been published within this topic, receiving 62,352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.

Papers

Journal Article

In-place algorithms for exact and approximate shortest unique substring problems

Wing-Kai Hon¹, Sharma V. Thankachan², Bojian Xu³

National Tsing Hua University¹, University of Central Florida², Eastern Washington University³

22 Aug 2017-Theoretical Computer Science

TL;DR: A generic in-place framework is designed that solves both the exact and the approximate k-mismatch shortest unique substring (SUS) finding problems, using the minimum 2n memory words, each of ⌈log₂(n)⌉ bits, plus n bytes of space, where n is the input string size.

9 citations

Proceedings Article

A fast string matching algorithm based on lowlight characters in the pattern

Zhengjun Cao¹, Zhenzhen Yan¹, Lihua Liu²

Shanghai University¹, Shanghai Maritime University²

27 Mar 2015

TL;DR: A new string matching algorithm is proposed that starts matching the pattern from neither the left nor the right end but from a specially chosen position, which gives more flexibility in picking where comparisons begin.

Abstract: String matching is of great importance in pattern recognition. We put forth a new string matching algorithm which matches the pattern from neither the left nor the right end, but from a special position. Compared with the Knuth-Morris-Pratt algorithm and the Boyer-Moore algorithm, the new algorithm is more flexible in picking the position for starting comparisons. This option brings a saving in cost. The method requires a statistical probability table for the alphabet, which can be set up using evolution strategies for dynamic conditions. If the chosen lowlight character in a given pattern has probability λ, the length of the text is n, and the length of the pattern is m, then we conjecture that the complexity of the new algorithm is Θ(n/λm).
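
The idea of starting comparisons at a rarely occurring pattern character can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes that "lowlight" means the pattern character with the lowest probability in the table, and the function name `rarest_char_search` and the `freq` argument are hypothetical names introduced only for this example.

```python
from collections import Counter


def rarest_char_search(text, pattern, freq):
    """Return every start index where pattern occurs exactly in text.

    freq maps characters to estimated probabilities (the "probability
    table" of the abstract); characters missing from it count as 0.0.
    """
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    # Assumption: the "lowlight" position is the one holding the least
    # probable character of the pattern.
    pivot = min(range(m), key=lambda j: freq.get(pattern[j], 0.0))
    rare = pattern[pivot]
    hits = []
    # Jump between occurrences of the rare character, then verify the
    # whole window around each occurrence.
    pos = text.find(rare, pivot)
    while pos != -1 and pos - pivot + m <= len(text):
        start = pos - pivot
        if text[start:start + m] == pattern:
            hits.append(start)
        pos = text.find(rare, pos + 1)
    return hits


sample = "the quick brown fox jumps over the lazy dog"
counts = Counter(sample)
table = {ch: c / len(sample) for ch, c in counts.items()}
print(rarest_char_search(sample, "the", table))
```

The rarer the pivot character, the fewer windows need a full comparison, which is the intuition behind the conjectured dependence on λ. Here the table is estimated from the text itself purely for convenience.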

9 citations

Journal Article

Hand gesture recognition in real world scenarios using approximate string matching

Diego Álvarez Alonso¹, Alfredo Raúl Teyseyre¹, Alvaro Soria¹, Luis Berdún¹

National Scientific and Technical Research Council¹

24 Apr 2020-Multimedia Tools and Applications

TL;DR: A classifier based on Approximate String Matching (ASM) is proposed, which encodes the trajectories of the hand joints as character sequences using the K-means algorithm and then analyzes these sequences with ASM.

Abstract: New interaction paradigms combined with emerging technologies have led to the creation of diverse Natural User Interface (NUI) devices in the market. These devices enable the recognition of body gestures, allowing users to interact with applications in a more direct, expressive, and intuitive way. In particular, the Leap Motion Controller (LMC) device has been receiving plenty of attention from NUI application developers because it allows them to address limitations on gestures made with hands. Although this device is able to recognize the position of several parts of the hands, developers are still left with the difficult task of recognizing gestures. For this reason, several authors have approached this problem using machine learning techniques. We propose a classifier based on Approximate String Matching (ASM). In short, we encode the trajectories of the hand joints as character sequences using the K-means algorithm and then we analyze these sequences with ASM. It should be noted that, when using the K-means algorithm, we select the number of clusters for each part of the hands by considering the Silhouette Coefficient. Furthermore, we define other important factors to take into account for improving the recognition accuracy. For the experiments, we generated a balanced dataset including different types of gestures and afterwards we performed a cross-validation scheme. Experimental results showed the robustness of the approach in terms of recognizing different types of gestures, time spent, and allocated memory. Besides, our approach achieved higher performance rates than well-known algorithms proposed in the current state of the art for gesture recognition.
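
The encode-then-match pipeline described above can be sketched roughly as follows. This is an illustration under simplifying assumptions, not the authors' classifier: the cluster centroids are taken as already computed (K-means fitting and the Silhouette-based choice of cluster count are omitted), a single 2D trajectory stands in for the per-joint trajectories, Levenshtein distance is used as the ASM measure, and the names `encode`, `levenshtein`, and `classify` are hypothetical.

```python
import math


def encode(trajectory, centroids):
    """Map each point to the letter of its nearest centroid.

    This is the assignment step of K-means; centroid i becomes the
    letter chr(ord('a') + i), so at most 26 clusters are supported.
    """
    letters = []
    for point in trajectory:
        nearest = min(range(len(centroids)),
                      key=lambda i: math.dist(point, centroids[i]))
        letters.append(chr(ord("a") + nearest))
    return "".join(letters)


def levenshtein(a, b):
    """Edit distance with unit-cost insert, delete, and substitute."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]


def classify(sequence, templates):
    """Return the label of the template string closest to sequence."""
    return min(templates, key=lambda t: levenshtein(sequence, t[1]))[0]


# Hypothetical centroids and gesture templates, for illustration only.
centroids = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
templates = [("swipe_right", "ab"), ("swipe_up", "ad")]
observed = encode([(0.1, 0.0), (0.4, 0.1), (0.9, 0.0)], centroids)
print(observed, classify(observed, templates))
```

Because the comparison tolerates insertions and deletions, gestures performed at different speeds (and therefore encoded with repeated letters) can still land on the right template.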

9 citations

Patent

Polynomial-time, sequential, adaptive system and method for lossy data compression

Dharmendra S. Modha¹

IBM¹

30 Oct 2002

TL;DR: A system and method for lossy compression of finite alphabet source sequences subject to an average-per-letter distortion constraint is presented, where the source sequence is sequentially parsed into phrases and each source phrase is mapped to a distorted phrase such that the average per-letter distortion between the two phrases does not exceed the desired distortion.

Abstract: A system and method are provided for lossy compression of finite alphabet source sequences subject to an average-per-letter distortion constraint. The source sequence is sequentially parsed into phrases and each source phrase is mapped to a distorted phrase such that the average per-letter distortion between the two phrases does not exceed the desired distortion. The present system adaptively maintains a codebook as the collection of all one-letter extensions of previously emitted distorted phrases. The present system uses approximate string matching and carries out a sequential procedure by iterating the following steps: (i) given the current codebook, find the longest source phrase that can be transmitted at a given distortion; (ii) from all codewords that match the source phrase, carefully choose the one that is most likely to be useful in the future. For every new source phrase, the present system judiciously selects one of the many approximately matching codewords to balance the code rate for the current phrase against the code rate from resulting codebooks for future source phrases. The present system outputs a distorted sequence that can be naturally losslessly compressed using the Lempel-Ziv algorithm or any variation thereof. Such judicious codeword selection is intended to iteratively improve the codebook quality. The entire present sequence can be implemented in time quadratic in the length of the source sequence. The present system is sequential and adaptive.
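
Steps (i) and (ii) and the codebook update can be sketched in a few lines. This is a simplified illustration, not the patented method: it assumes Hamming distortion, replaces the "most likely to be useful in the future" selection with a plain least-distortion rule (ties broken alphabetically), scans the whole codebook naively rather than meeting the stated time bound, and omits the final Lempel-Ziv coding of the output. The names `hamming_rate` and `lossy_parse` are hypothetical.

```python
def hamming_rate(a, b):
    """Average per-letter Hamming distortion of two equal-length strings."""
    return sum(x != y for x, y in zip(a, b)) / len(a)


def lossy_parse(source, alphabet, max_distortion):
    """Parse source into distorted phrases, each within max_distortion."""
    if any(ch not in alphabet for ch in source):
        raise ValueError("source contains a symbol outside the alphabet")
    # Initial codebook: the one-letter extensions of the empty phrase.
    codebook = set(alphabet)
    phrases = []
    i = 0
    while i < len(source):
        # Step (i): codewords that approximately match the source at i.
        candidates = [
            w for w in codebook
            if i + len(w) <= len(source)
            and hamming_rate(source[i:i + len(w)], w) <= max_distortion
        ]
        # Step (ii), simplified assumption: prefer the longest codeword,
        # then the least distorted one, then alphabetical order.
        chosen = min(
            candidates,
            key=lambda w: (-len(w), hamming_rate(source[i:i + len(w)], w), w),
        )
        phrases.append(chosen)
        # Grow the codebook with one-letter extensions of the emitted phrase.
        codebook.update(chosen + ch for ch in alphabet)
        i += len(chosen)
    return phrases


print(lossy_parse("abababab", "ab", 0.0))
print(lossy_parse("abababab", "ab", 0.5))
```

With a distortion budget of 0 the parse reproduces the source exactly; a larger budget lets longer codewords match, so fewer phrases are emitted at the cost of some altered letters.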

9 citations

Journal Article

An algorithm to improve the performance of string matching

Abdallah A. Hlayel¹, Adnan A. Hnaif¹

Al-Zaytoonah University of Jordan¹

01 Jun 2014-Journal of Information Science

TL;DR: The DMA, a general approach algorithm to perform direct access matching for the exact pattern or its similarities within a text depending on the location of a character in alphabetical order, is proposed.

Abstract: Approximate string matching algorithms are techniques used to find a pattern 'P' in a text 'T' partially or exactly. These techniques have become very important in terms of performance and the accuracy of search results. In this paper, we propose a general approach algorithm, called the Direct Matching Algorithm (DMA). The function of this algorithm is to perform direct access matching for the exact pattern or its similarities within a text depending on the location of a character in alphabetical order. We simulated the DMA in order to show its competence. The simulation result showed significant improvement in exact string matching and similarity matching, and therefore extreme competence in real applications.
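
One loose reading of "direct access matching depending on the location of a character in alphabetical order" is a table of text positions indexed by each letter's alphabetical rank. The sketch below follows that reading and should not be taken as the DMA itself: it assumes lowercase a-z input, treats "similarities" as windows with at most a given number of mismatched positions, and uses the hypothetical names `build_position_table` and `find_with_mismatches`.

```python
def build_position_table(text):
    """Bucket text positions by alphabetical index (a=0, ..., z=25)."""
    table = [[] for _ in range(26)]
    for pos, ch in enumerate(text):
        idx = ord(ch) - ord("a")
        if 0 <= idx < 26:          # other characters are not indexed
            table[idx].append(pos)
    return table


def find_with_mismatches(text, pattern, max_mismatches):
    """Start indices where pattern matches text with <= max_mismatches."""
    m = len(pattern)
    if m == 0 or m > len(text):
        return []
    table = build_position_table(text)
    # votes[start] counts pattern positions that agree with the text
    # when the pattern is aligned at start.
    votes = {}
    for j, ch in enumerate(pattern):
        idx = ord(ch) - ord("a")
        if not 0 <= idx < 26:
            continue               # non-letters never match in this sketch
        for pos in table[idx]:     # direct access by alphabetical index
            start = pos - j
            if 0 <= start <= len(text) - m:
                votes[start] = votes.get(start, 0) + 1
    return [s for s in range(len(text) - m + 1)
            if m - votes.get(s, 0) <= max_mismatches]


print(find_with_mismatches("bananabandana", "banana", 0))  # [0]
print(find_with_mismatches("bananabandana", "banana", 2))  # [0, 2]
```

Setting `max_mismatches` to 0 gives exact matching, and raising it admits progressively less similar windows, so one routine covers both cases mentioned in the abstract.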

9 citations

Performance

Metrics

Papers: 1,942

Citations: 64,998

No. of papers in the topic in previous years
Year	Papers
2023	8
2022	30
2021	32
2020	30
2019	48
2018	39