A lower-variance randomized algorithm for approximate string matching

doi:10.1016/J.IPL.2013.06.005

Journal Article

Mikhail J. Atallah, +2 more

- 01 Sep 2013 -

Information Processing Letters

- Vol. 113, Iss: 18, pp 690-692

TLDR

This paper provides an algorithm that matches the deterministic O(kN log M) running time of previous work but achieves a lower variance of min(M/k, M − c)(M − c)/k, essentially a factor of k smaller.
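
For intuition, the kind of unbiased Monte Carlo score estimator that this line of work builds on can be sketched as follows. This is not the paper's lower-variance construction: it is a generic illustration that assumes each symbol is mapped to an independent random root of unity, so that a matching pair contributes 1 and a mismatching pair contributes 0 in expectation. The function names, the choice of σ-th roots of unity (σ = alphabet size), and the direct O(kNM) correlation are assumptions made for readability; an FFT-based correlation would be needed to reach an O(kN log M) running time.

```python
import cmath
import random


def exact_matches(text, pattern):
    """Exact score vector: c[i] = number of j with text[i + j] == pattern[j]."""
    n, m = len(text), len(pattern)
    return [sum(text[i + j] == pattern[j] for j in range(m))
            for i in range(n - m + 1)]


def estimate_matches(text, pattern, k, rng):
    """Hypothetical sketch: average k unbiased estimates of the score vector.

    Each trial draws a fresh random map from symbols to sigma-th roots of
    unity. For equal symbols the product phi(a) * conj(phi(a)) is 1; for
    distinct symbols the product has expectation 0.
    """
    alphabet = sorted(set(text) | set(pattern))
    sigma = len(alphabet)
    n, m = len(text), len(pattern)
    totals = [0.0] * (n - m + 1)
    for _ in range(k):
        phi = {s: cmath.exp(2j * cmath.pi * rng.randrange(sigma) / sigma)
               for s in alphabet}
        t = [phi[s] for s in text]
        p = [phi[s].conjugate() for s in pattern]
        # Direct correlation for clarity (assumption: an FFT would replace this).
        for i in range(n - m + 1):
            totals[i] += sum(t[i + j] * p[j] for j in range(m)).real
    return [x / k for x in totals]


if __name__ == "__main__":
    text, pattern = "abracadabra", "abra"
    print(exact_matches(text, pattern))
    print([round(x, 2) for x in estimate_matches(text, pattern, 500, random.Random(0))])
```

Averaging over k independent maps shrinks the variance of each entry in proportion to 1/k; the paper's stated contribution is a sharper variance bound at the same running time.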

About:

This article was published in Information Processing Letters on 2013-09-01. It has received 15 citations to date. It focuses on the topics: Bias of an estimator & Approximate string matching.

Citations

Journal Article

Fast Detection of Transformed Data Leaks

Xiaokui Shu, +3 more

- 01 Mar 2016 -

IEEE Transactions on Information Forensi...

TL;DR: This paper uses sequence alignment techniques to detect complex data-leak patterns, achieves good detection accuracy in recognizing transformed leaks, and implements a parallelized version of the algorithms on a graphics processing unit that achieves high analysis throughput.

Book Chapter

Entropic Trace Estimates for Log Determinants

Jack K. Fitzsimons, +5 more

TL;DR: This work estimates log determinants under the framework of maximum entropy, given information in the form of moment constraints from stochastic trace estimation, demonstrating a significant improvement on state-of-the-art alternative methods.

Book Chapter

Optimal Query Complexity for Estimating the Trace of a Matrix

Karl Wimmer, +2 more

TL;DR: In this paper, the authors study the query complexity of randomized algorithms for estimating the trace of an implicit n×n matrix and show that any estimator requires Ω(1/ε) queries to guarantee a variance of at most ε.

Journal Article

On approximate pattern matching with thresholds

Peng Zhang, +1 more

- 01 Jul 2017 -

Information Processing Letters

TL;DR: The main result of this paper is to show that this threshold version of the problem can be solved by recursively solving 3 + 2 log θ instances of the traditional (i.e., zero-threshold) version of this problem, which is much-studied in the literature and for which there are many efficient (typically randomized) solutions of time complexity close to O.

Proceedings Article

Breaking the Variance: Approximating the Hamming Distance in 1/ε Time Per Alignment

Tsvi Kopelowitz, +1 more

TL;DR: The main idea behind the algorithm is to reduce the variance of a specific randomized experiment by (approximately) separating heavy hitters from non-heavy hitters, and it is shown that the belief that the approximation version cannot be solved much faster as a function of 1/ε is false.

References

Journal Article

Efficient string matching: an aid to bibliographic search

Alfred V. Aho, +1 more

- 01 Jun 1975 -

Communications of The ACM

TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.

Journal Article

A new approach to text searching

Ricardo Baeza-Yates, +1 more

- 01 Oct 1992 -

Communications of The ACM

TL;DR: A family of simple and fast algorithms is introduced for solving the classical string matching problem, string matching with don't care symbols and complement symbols, and multiple patterns.

String-matching and other products

Michael J. Fischer, +1 more

TL;DR: By exploiting the formal similarity of string-matching with integer multiplication, a new algorithm has been obtained with a running time which is only slightly worse than linear.

Journal Article

Generalized string matching

Karl Abrahamson

- 01 Dec 1987 -

SIAM Journal on Computing

TL;DR: A generalization of string matching, in which the pattern is a sequence of pattern elements, each compatible with a set of symbols, is investigated. It is shown that generalized string matching requires a time-space product of $\Omega(n^2 / \log n)$ on a powerful model of computation, when the alphabet is restricted to n symbols.

Book

Efficient String Matching With K Mismatches

Gad M. Landau, +1 more

TL;DR: An algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches, runs in O(k(m log m + n)) time.

Related Papers (5)

A Randomized Algorithm for Approximate String Matching

Generalized string matching

Approximate string-matching with q-grams and maximal matches

Faster algorithms for string matching with k mismatches

Approximate Boyer-Moore string matching