Topic

Approximate string matching

About: Approximate string matching is a research topic. Over its lifetime, 1,903 publications have been published on this topic, receiving 62,352 citations. The topic is also known as fuzzy string searching or fuzzy string matching.


Papers
Patent
Zhigang Liu
04 Feb 2004
TL;DR: A method and device for compressing a text string into a compressed string, taking into account the case sensitivity of the text string; compression can be performed in lossy or lossless mode.
Abstract: A method and device for compressing a text string into a compressed string, taking into account case sensitivity of the text string. Compression can be performed in lossy mode or lossless mode. In lossy mode, the text string is parsed to determine its case sensitivity so that a search for finding a match for the text string in a reference source is based on the case sensitivity. Alternatively, the case configuration of the characters in the text string is transformed into a target case, and a case-sensitive search is performed to find a match for the case-transformed text string. In lossless mode, a case-insensitive search is performed for finding a match for the text string regardless of its case sensitivity, and a case-info-element is attached to the compressed string so that the compressed string can be reconstructed based on the case-info-element.
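The lossless mode described above can be illustrated with a minimal Python sketch. It is not the patent's encoding: the reference list `REFERENCE`, the function names, and the use of a per-character boolean list as the "case-info-element" are all assumptions made for illustration, and only ASCII input is considered.

```python
# Hypothetical reference source; the patent does not specify its contents.
REFERENCE = ["approximate", "string", "matching"]


def compress_word(word, reference=REFERENCE):
    """Return (index, case_info) for a word, matching case-insensitively.

    case_info is an assumed stand-in for the case-info-element: one boolean
    per character, True where the original character was upper case.
    """
    case_info = [ch.isupper() for ch in word]
    index = reference.index(word.lower())  # raises ValueError if no match
    return index, case_info


def decompress_word(index, case_info, reference=REFERENCE):
    """Rebuild the original word from the reference entry and the case info."""
    entry = reference[index]
    return "".join(ch.upper() if up else ch for ch, up in zip(entry, case_info))


if __name__ == "__main__":
    packed = compress_word("String")
    print(packed, decompress_word(*packed))
```

Because the case information travels with the compressed form, the round trip restores the original capitalization; dropping `case_info` would correspond to a lossy variant.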

10 citations

Book Chapter
02 Sep 2014
TL;DR: This paper proposes a novel service matching approach taking into account a service’s signatures and privacy policies and applies fuzzy matching techniques that are able to deal with incomplete service specifications.
Abstract: Service matching approaches determine to what extent a provided service matches a requester’s requirements. This process is based on service specifications describing functional (e.g., signatures) as well as non-functional properties (e.g., privacy policies). However, we cannot expect service specifications to be complete as providers do not want to share all details of their services’ implementation. Moreover, creating complete specifications requires much effort. In this paper, we propose a novel service matching approach taking into account a service’s signatures and privacy policies. In particular, our approach applies fuzzy matching techniques that are able to deal with incomplete service specifications. As a benefit, decision-making based on matching results is improved and service matching becomes better applicable in practice.

10 citations

Proceedings Article
01 Jan 2014
TL;DR: New bit-parallel algorithms for exact and approximate string matching are introduced; TSO and TSA are shown to be linear in the worst case and sublinear in the average case, and practical experiments show that the new algorithms are competitive with earlier algorithms.
Abstract: New bit-parallel algorithms for exact and approximate string matching are introduced. TSO is a two-way Shift-Or algorithm, TSA is a two-way Shift-And algorithm, and TSAdd is a two-way Shift-Add algorithm. Tuned Shift-Add is a minimalist improvement to the original Shift-Add algorithm. TSO and TSA are for exact string matching, while TSAdd and tuned Shift-Add are for approximate string matching with k mismatches. TSO and TSA are shown to be linear in the worst case and sublinear in the average case. Practical experiments show that the new algorithms are competitive with earlier algorithms.
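For readers unfamiliar with the k-mismatches problem, the following is a minimal Python sketch of the classical Shift-Add idea, not the TSAdd or tuned Shift-Add variants of this paper. Each pattern prefix gets a small counter packed into one integer; the function name and the choice of counter width are assumptions, and Python's arbitrary-precision integers stand in for machine words.

```python
def shift_add_search(pattern, text, k):
    """Return start positions where pattern matches text with <= k mismatches."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must be non-empty")
    bits = m.bit_length()            # counter width: large enough to hold m
    every_field = sum(1 << (i * bits) for i in range(m))
    # table[c] has a 1 in field i when pattern[i] differs from c.
    table = {
        c: sum(1 << (i * bits) for i, pc in enumerate(pattern) if pc != c)
        for c in set(pattern)
    }
    mask = (1 << (m * bits)) - 1     # keep only the m counters
    top = (m - 1) * bits             # offset of the counter for the full pattern
    state = 0
    hits = []
    for j, c in enumerate(text):
        # Shift every counter up by one prefix length, then add new mismatches.
        state = ((state << bits) + table.get(c, every_field)) & mask
        if j >= m - 1 and (state >> top) <= k:
            hits.append(j - m + 1)
    return hits


if __name__ == "__main__":
    print(shift_add_search("abc", "xabcxabd", 1))
```

The counter for prefix length i never exceeds i, so the fields cannot carry into each other with this width; word-sized implementations handle overflow differently.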

10 citations

01 Jan 2002
TL;DR: This work investigates the properties of the D-index for approximate searching and matching in text databases, with the edit distance as the metric.
Abstract: Text collections need search support not only for identical objects; approximate matching is even more important. A suitable metric for such a task is the edit distance. However, the quadratic complexity of the edit distance rules out storage organizations such as sequential search. We have investigated the properties of the D-index for approximate searching and matching in text databases.
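The quadratic cost mentioned in this abstract comes from the standard dynamic program for the edit distance, sketched below in Python. This is the textbook Levenshtein computation with unit costs, not the D-index itself; the function name is an assumption.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, using O(len(a) * len(b)) time."""
    previous = list(range(len(b) + 1))   # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]                    # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep if equal
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    print(edit_distance("kitten", "sitting"))
```

Comparing a query against every stored string this way costs one quadratic computation per object, which is the motivation for metric index structures that prune most comparisons.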

10 citations

01 Jan 2010
TL;DR: This document explains the process of searching for optimal alignment of two finite-length strings in which comparable patterns may not be obvious; long strings subject to natural variations or random noise may share subtle, characteristic, underlying patterns of symbols.
Abstract: Approximate string matching is used when a query string is similar to but not identical with desired matches. Many patterns can be symbolically encoded as strings. Approximate string matching is the process of searching for optimal alignment of two finite-length strings in which comparable patterns may not be obvious; long strings subject to natural variations or random noise, for example, may share subtle, characteristic, underlying patterns of symbols. Use of the term approximate merely emphasizes the fact that a perfect match may not be achievable and that imperfections such as missing and extraneous symbols have to be considered. In many applications, one of the two strings is a prototype string that represents a pattern class and the other is a test string that we wish to analyze and/or classify.
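As a rough illustration of aligning a prototype string with a test string, the Python sketch below computes one optimal global alignment under unit costs and marks missing or extraneous symbols with a gap character. The function name, the unit cost model, and the `-` gap symbol are assumptions, not details taken from the document.

```python
def align(prototype, test, gap="-"):
    """Return (cost, aligned_prototype, aligned_test) for one optimal alignment."""
    n, m = len(prototype), len(test)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        cost[i][0] = i
    for j in range(1, m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost[i][j] = min(
                cost[i - 1][j - 1] + (prototype[i - 1] != test[j - 1]),
                cost[i - 1][j] + 1,   # symbol missing from the test string
                cost[i][j - 1] + 1,   # extraneous symbol in the test string
            )
    # Walk back from the bottom-right corner to recover one alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        diagonal = i > 0 and j > 0 and (
            cost[i][j] == cost[i - 1][j - 1] + (prototype[i - 1] != test[j - 1])
        )
        if diagonal:
            top.append(prototype[i - 1]); bottom.append(test[j - 1])
            i -= 1; j -= 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            top.append(prototype[i - 1]); bottom.append(gap)
            i -= 1
        else:
            top.append(gap); bottom.append(test[j - 1])
            j -= 1
    return cost[n][m], "".join(reversed(top)), "".join(reversed(bottom))


if __name__ == "__main__":
    print(align("abc", "ac"))
```

A test string could then be assigned to the pattern class whose prototype yields the lowest alignment cost, although real applications often use costs tuned to the noise model.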

10 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 81% related
Cluster analysis: 146.5K papers, 2.9M citations, 80% related
Scheduling (computing): 78.6K papers, 1.3M citations, 79% related
Network packet: 159.7K papers, 2.2M citations, 78% related
Optimization problem: 96.4K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    8
2022    30
2021    32
2020    30
2019    48
2018    39