Topic
Approximate string matching
About: Approximate string matching is a research topic. Over its lifetime, 1903 publications have been published on this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.
Papers
04 Feb 2004
TL;DR: A method and device are presented for compressing a text string into a compressed string, taking into account the case sensitivity of the text string; compression can be performed in lossy mode or lossless mode.
Abstract: A method and device for compressing a text string into a compressed string, taking into account case sensitivity of the text string. Compression can be performed in lossy mode or lossless mode. In lossy mode, the text string is parsed to determine its case sensitivity so that a search for finding a match for the text string in a reference source is based on the case sensitivity. Alternatively, the case configuration of the characters in the text string is transformed into a target case, and a case-sensitive search is performed to find a match for the case-transformed text string. In lossless mode, a case-insensitive search is performed for finding a match for the text string regardless of its case sensitivity, and a case-info-element is attached to the compressed string so that the compressed string can be reconstructed based on the case-info-element.
10 citations
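One way to picture the lossless mode described in the abstract above is the following minimal sketch. It is not the method claimed in the publication: the dictionary layout, the function names `compress` and `decompress`, and the use of a bitmask as the "case-info-element" are all assumptions made for illustration, and the input is assumed to be ASCII so that lowercasing never changes a string's length.

```python
def compress(word, dictionary):
    """Return (index, case_mask) for word, or None if there is no match.

    dictionary maps lowercase reference words to integer indices
    (a hypothetical stand-in for the "reference source").
    """
    index = dictionary.get(word.lower())  # case-insensitive lookup
    if index is None:
        return None
    # Bit i of the mask is set when character i of the original is uppercase.
    case_mask = sum(1 << i for i, ch in enumerate(word) if ch.isupper())
    return index, case_mask


def decompress(code, words):
    """Rebuild the original string from (index, case_mask)."""
    index, case_mask = code
    base = words[index]
    return "".join(
        ch.upper() if (case_mask >> i) & 1 else ch
        for i, ch in enumerate(base)
    )


words = ["hello", "world"]
dictionary = {w: i for i, w in enumerate(words)}
code = compress("HeLLo", dictionary)
print(code)                     # (0, 13)
print(decompress(code, words))  # HeLLo
```

Dropping the mask and storing only the index would correspond to a lossy variant in this sketch, since the original capitalisation could no longer be recovered.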
02 Sep 2014
TL;DR: This paper proposes a novel service matching approach taking into account a service’s signatures and privacy policies and applies fuzzy matching techniques that are able to deal with incomplete service specifications.
Abstract: Service matching approaches determine to what extent a provided service matches a requester’s requirements. This process is based on service specifications describing functional (e.g., signatures) as well as non-functional properties (e.g., privacy policies). However, we cannot expect service specifications to be complete as providers do not want to share all details of their services’ implementation. Moreover, creating complete specifications requires much effort. In this paper, we propose a novel service matching approach taking into account a service’s signatures and privacy policies. In particular, our approach applies fuzzy matching techniques that are able to deal with incomplete service specifications. As a benefit, decision-making based on matching results is improved and service matching becomes better applicable in practice.
10 citations
01 Jan 2014
TL;DR: New bit-parallel algorithms for exact and approximate string matching are introduced. The exact-matching algorithms (TSO and TSA) are shown to be linear in the worst case and sublinear in the average case, and practical experiments show that the new algorithms are competitive with earlier algorithms.
Abstract: New bit-parallel algorithms for exact and approximate string matching are introduced. TSO is a two-way Shift-Or algorithm, TSA is a two-way Shift-And algorithm, and TSAdd is a two-way Shift-Add algorithm. Tuned Shift-Add is a minimalist improvement to the original Shift-Add algorithm. TSO and TSA are for exact string matching, while TSAdd and tuned Shift-Add are for approximate string matching with k mismatches. TSO and TSA are shown to be linear in the worst case and sublinear in the average case. Practical experiments show that the new algorithms are competitive with earlier algorithms.
10 citations
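To make "approximate string matching with k mismatches" concrete, here is a minimal sketch in the spirit of the classic Shift-Add idea: one small counter per pattern position is packed into a single integer and all counters are updated with one shift and one addition per text character. It is not TSAdd or tuned Shift-Add from the paper above. The function name `shift_add_search` is hypothetical, and Python's arbitrary-length integers stand in for machine words, so each counter is simply made wide enough that it can never overflow.

```python
def shift_add_search(pattern, text, k):
    """Yield end positions in text where pattern matches with <= k mismatches."""
    m = len(pattern)
    if m == 0:
        return
    b = m.bit_length()  # bits per counter: enough to hold values up to m
    all_ones = sum(1 << (j * b) for j in range(m))  # a 1 in every counter
    mask = (1 << (m * b)) - 1                       # keep exactly m counters
    # table[c] holds a 1 in counter j when pattern[j] != c.
    table = {
        c: sum(1 << (j * b) for j in range(m) if pattern[j] != c)
        for c in set(pattern)
    }
    state = 0
    for i, c in enumerate(text):
        # Counter j becomes old counter j-1 plus the mismatch of pattern[j] vs c.
        state = ((state << b) + table.get(c, all_ones)) & mask
        # The top counter is the mismatch count for the window ending at i.
        if i >= m - 1 and (state >> ((m - 1) * b)) <= k:
            yield i


print(list(shift_add_search("abc", "xabcxabdx", 0)))  # [3]
print(list(shift_add_search("abc", "xabcxabdx", 1)))  # [3, 7]
```

With k = 0 the search degenerates to exact matching; the second call also reports the occurrence "abd", which differs from the pattern in one position.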
01 Jan 2002
TL;DR: This work investigates the properties of the D-index for approximate searching and matching in text databases, using the edit distance as the metric.
Abstract: Text collections need search support not only for identical objects; approximate matching is even more important. A suitable metric for such a task is the edit distance measure. However, the quadratic complexity of computing the edit distance makes storage organizations such as sequential search impractical. We have investigated the properties of the D-index for approximate searching and matching in text databases.
10 citations
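The quadratic cost mentioned in the abstract above comes from the standard dynamic program for the edit distance, which fills one cell per pair of positions in the two strings. The sketch below is the textbook Levenshtein computation with unit costs, shown only to illustrate that cost; it is not the D-index, and the function name `edit_distance` is chosen here for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b with unit costs.

    Runs in O(len(a) * len(b)) time and keeps only two rows in memory.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # deleting the first i characters of a
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # match or substitute
            ))
        prev = curr
    return prev[-1]


print(edit_distance("kitten", "sitting"))  # 3
```

A sequential scan of a collection repeats this computation once per stored string, which is why index structures that avoid most of these comparisons are attractive.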
01 Jan 2010
TL;DR: This document explains the process of searching for optimal alignment of two finite-length strings in which comparable patterns may not be obvious; long strings subject to natural variations or random noise may share subtle, characteristic, underlying patterns of symbols.
Abstract: Approximate string matching is used when a query string is similar to, but not identical with, the desired matches. Many patterns can be symbolically encoded as strings. Approximate string matching is the process of searching for optimal alignment of two finite-length strings in which comparable patterns may not be obvious; long strings subject to natural variations or random noise, for example, may share subtle, characteristic, underlying patterns of symbols. Use of the term approximate merely emphasizes the fact that a perfect match may not be achievable and that imperfections such as missing and extraneous symbols have to be considered. In many applications, one of the two strings is a prototype string that represents a pattern class and the other is a test string that we wish to analyze and/or classify.
10 citations
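The idea of an optimal alignment with missing and extraneous symbols, as described in the abstract above, can be sketched as follows. This is a textbook unit-cost alignment with traceback, not necessarily the procedure used in that document. The function name `align` and the gap character "-" are assumptions, and the gap character is assumed not to occur in either string.

```python
def align(prototype, test, gap="-"):
    """Return (aligned_prototype, aligned_test, cost) for a minimum-edit alignment."""
    n, m = len(prototype), len(test)
    # dist[i][j] = minimum number of edits turning prototype[:i] into test[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dist[i][0] = i
    for j in range(1, m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if prototype[i - 1] == test[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,          # symbol missing in test
                             dist[i][j - 1] + 1,          # extraneous symbol in test
                             dist[i - 1][j - 1] + cost)   # match or substitution
    # Walk back from the bottom-right corner to recover one optimal alignment.
    top, bottom = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and
                dist[i][j] == dist[i - 1][j - 1] + (prototype[i - 1] != test[j - 1])):
            top.append(prototype[i - 1])
            bottom.append(test[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            top.append(prototype[i - 1])
            bottom.append(gap)
            i -= 1
        else:
            top.append(gap)
            bottom.append(test[j - 1])
            j -= 1
    return "".join(reversed(top)), "".join(reversed(bottom)), dist[n][m]


aligned_prototype, aligned_test, cost = align("kitten", "sitting")
print(aligned_prototype)
print(aligned_test)
print(cost)  # 3
```

Columns where the two aligned strings differ mark the imperfections: a gap in the second line is a missing symbol, and a gap in the first line is an extraneous one. To classify a test string, one could align it against each prototype and pick the class with the lowest cost.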