Approximate string matching

About: Approximate string matching is a research topic. To date, 1903 publications have appeared within this topic, receiving 62352 citations. The topic is also known as fuzzy string-searching algorithm and fuzzy string-matching algorithm.


Papers
Proceedings Article
14 Jun 2005
TL;DR: This paper proposes a generic programmable array processor architecture that exploits the strength of VLSI in intensive, pipelined computing while circumventing its limitation on communication.
Abstract: This paper proposes a generic programmable array processor architecture for a wide variety of approximate string matching algorithms. We describe the architecture of the array and of the cell in detail, so that both the preprocessing and searching phases of most string matching algorithms can be implemented efficiently. The architecture also performs approximate string matching for complex patterns that contain don't-care, complement and class symbols. It exploits the strength of VLSI in intensive, pipelined computing while circumventing its limitation on communication, and it may be adopted as a basic structure for a universal, flexible string matcher engine.

8 citations
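
As a software reference point for the kind of computation the entry above maps onto hardware, the following is a minimal sketch of approximate matching with don't-care symbols, using the classic column-by-column dynamic program for edit distance. It is not the paper's array architecture. The function name `approx_find` and the choice of `?` as the don't-care symbol are assumptions made for illustration.

```python
def approx_find(pattern, text, k, wildcard="?"):
    """Return end positions (exclusive) in text where pattern matches
    some substring with at most k edits (insert, delete, substitute).
    `wildcard` in the pattern matches any single text character."""
    m = len(pattern)
    # prev[i] = fewest edits to align pattern[:i] with a substring
    # ending at the previous text position (empty text at the start).
    prev = list(range(m + 1))
    hits = []
    for j, c in enumerate(text, start=1):
        cur = [0] * (m + 1)  # cur[0] = 0: a match may start anywhere
        for i in range(1, m + 1):
            p = pattern[i - 1]
            cost = 0 if (p == wildcard or p == c) else 1
            cur[i] = min(
                prev[i - 1] + cost,  # match or substitute
                prev[i] + 1,         # extra character in the text
                cur[i - 1] + 1,      # pattern character skipped
            )
        if cur[m] <= k:
            hits.append(j)
        prev = cur
    return hits


print(approx_find("a?c", "xxabcxx", 0))  # [5]
print(approx_find("abc", "abd", 1))      # [2, 3]
```

Each cell depends only on its left, upper and upper-left neighbours, which is the kind of local data dependence that makes this recurrence a natural candidate for pipelined cell arrays.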

Book Chapter
Kensuke Baba1
23 Mar 2010
TL;DR: An FFT-based algorithm is proposed for the problem of string matching with mismatches. It computes an accurate estimate using FFT computations on binary vectors, which can be computed faster than FFTs on vectors of complex numbers.
Abstract: String matching with mismatches is a basic form of approximate matching in information retrieval. This paper proposes an FFT-based algorithm for the problem of string matching with mismatches, which computes an estimate with accuracy. The algorithm consists of FFT computations for binary vectors, which can be computed faster than the computation for vectors of complex numbers. The faster FFT reduces the computation time, which in turn leads to an improvement in the variance of the estimates. This paper analyzes the variance of the estimates in the algorithm and compares it with the variances in existing algorithms.

8 citations
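
For context on the entry above, the sketch below shows the standard exact way to count mismatches at every alignment with FFTs: one 0/1 indicator vector per pattern symbol, correlated against the text. This is a baseline, not the estimator proposed in the paper, and it uses ordinary complex-valued FFTs. The function names `fft`, `correlate` and `mismatch_counts` are illustrative assumptions.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT (unnormalized); len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def correlate(x, y):
    """Return c[i] = sum_j x[i + j] * y[j] for every full alignment i."""
    size = 1
    while size < len(x) + len(y):
        size *= 2
    fx = fft(list(x) + [0] * (size - len(x)))
    # Correlation is convolution with the second sequence reversed.
    fy = fft(list(reversed(y)) + [0] * (size - len(y)))
    prod = fft([p * q for p, q in zip(fx, fy)], invert=True)
    m = len(y)
    # Divide by size to normalize the inverse transform, then round to int.
    return [round(prod[i + m - 1].real / size) for i in range(len(x) - m + 1)]


def mismatch_counts(text, pattern):
    """Number of mismatching positions for each alignment of pattern in text."""
    n, m = len(text), len(pattern)
    matches = [0] * max(n - m + 1, 0)
    for ch in set(pattern):
        t = [1 if c == ch else 0 for c in text]     # binary indicator of ch
        p = [1 if c == ch else 0 for c in pattern]
        for i, v in enumerate(correlate(t, p)):
            matches[i] += v
    return [m - v for v in matches]


print(mismatch_counts("abcabd", "abd"))  # [1, 3, 3, 0]
```

The cost of this exact method grows with the number of distinct pattern symbols, since each one needs its own correlation. That cost is one common motivation for estimate-based approaches.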

Book Chapter
22 Jul 2004
TL;DR: An algorithm for error repair over finite-state architectures that relies on a regional least-cost repair strategy, dynamically gathering all relevant information in the context of the error location, and that guarantees asymptotic equivalence with global repair strategies.
Abstract: We describe an algorithm to deal with error repair over finite-state architectures. Such a technique is of interest in spelling correction as well as approximate string matching in a variety of applications related to natural language processing, such as information extraction/recovery or answer searching, where error-tolerant recognition allows misspelled input words to be integrated in the computational process. Our proposal relies on a regional least-cost repair strategy, dynamically gathering all relevant information in the context of the error location. The system guarantees asymptotic equivalence with global repair strategies.

7 citations

Patent
28 Jun 2002
TL;DR: Any string in any character set with an arbitrary-leveled weight-based comparison system is transformed into a bit string in such a way that two transformed strings can be compared byte-by-byte.
Abstract: Any string in any character set with an arbitrary-leveled weight-based comparison system is transformed into a bit string in such a way that two transformed strings can be compared byte-by-byte. The resulting bit string has the minimum possible maximum length. The transformed bit strings can be inverted, meaning the original string can be recovered from the transformed string.

7 citations
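
To illustrate the general idea in the entry above of turning a multi-level weighted comparison into a plain byte-by-byte comparison, here is a minimal sketch of a two-level sort key. It is not the patented transformation: it makes no attempt at minimal length or invertibility. The weight table `WEIGHTS` and the function `sort_key` are hypothetical.

```python
# Hypothetical two-level weights: (primary, secondary).
# Primary orders base letters; secondary breaks ties by case.
# All weights are >= 1 so that byte 0 can separate the levels.
WEIGHTS = {
    "a": (1, 1), "A": (1, 2),
    "b": (2, 1), "B": (2, 2),
    "c": (3, 1), "C": (3, 2),
}


def sort_key(s):
    """Build a byte string whose bytewise order follows the weighted order:
    all primary weights first, then a 0 separator, then secondary weights."""
    primary = bytes(WEIGHTS[ch][0] for ch in s)
    secondary = bytes(WEIGHTS[ch][1] for ch in s)
    return primary + b"\x00" + secondary


words = ["b", "Ab", "ab", "a", "B"]
print(sorted(words, key=sort_key))  # ['a', 'ab', 'Ab', 'b', 'B']
```

Because the separator byte is smaller than every weight, a string that is a prefix of another at the primary level sorts first. Secondary differences only matter when the primary weights are identical.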

Journal Article
TL;DR: An extension of Pascal with string pattern matching is shown to be a useful tool for string processing applications. It is compared with a SNOBOL4 implementation, and its intersection, difference and complement operations increase its expressive power beyond context-free languages.
Abstract: This paper presents an extension of Pascal with string pattern matching. Pattern definitions are built using six basic operations: alternation, concatenation, immediate value assignment, intersection, difference and complement. The last three have not been previously implemented and they increase the expressive power beyond context-free languages. The pattern matching actions are augmented with three options: trace, prefix and suffix. Comparisons with a SNOBOL4 implementation are also presented. This experiment demonstrates that Pascal with pattern matching is a useful tool for string processing applications.

7 citations


Network Information
Related Topics (5)
Server: 79.5K papers, 1.4M citations, 81% related
Cluster analysis: 146.5K papers, 2.9M citations, 80% related
Scheduling (computing): 78.6K papers, 1.3M citations, 79% related
Network packet: 159.7K papers, 2.2M citations, 78% related
Optimization problem: 96.4K papers, 2.1M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  8
2022  30
2021  32
2020  30
2019  48
2018  39