Journal Article
A lower-variance randomized algorithm for approximate string matching
TL;DR: This paper provides an algorithm that also runs in deterministic time O(kN log M) but achieves a lower variance of min(M/k, M-c)(M-c)/k, which is essentially a factor of k smaller than in previous work.
About: This article was published in Information Processing Letters on 2013-09-01. It has received 15 citations to date. The article focuses on the topics: Bias of an estimator and Approximate string matching.
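To make the quantities in the summary concrete, the following is a minimal sketch of a simpler randomized match-count estimator, not the algorithm of this paper. It assumes that N is the text length, M the pattern length, c the true number of matching positions at an alignment, and k the number of independent trials (the page itself does not define these symbols). Each trial assigns every alphabet symbol an independent random code of +1 or -1, so the product of two codes has expectation 1 when the symbols are equal and 0 otherwise, which makes the averaged correlation an unbiased estimate of c. The function names are hypothetical, and the correlation is computed directly rather than with the FFT that a fast implementation would use.

```python
import random


def estimate_matches(text, pattern, k, seed=None):
    """Estimate, for every alignment, how many pattern symbols match the text.

    Illustrative sketch only: random +/-1 symbol codes averaged over k trials.
    """
    rng = random.Random(seed)
    n, m = len(text), len(pattern)
    alphabet = sorted(set(text) | set(pattern))
    totals = [0] * (n - m + 1)
    for _ in range(k):
        # Draw a fresh, independent +/-1 code for each symbol in this trial.
        code = {s: rng.choice((-1, 1)) for s in alphabet}
        t = [code[s] for s in text]
        p = [code[s] for s in pattern]
        for i in range(n - m + 1):
            # Direct correlation at alignment i (an FFT would handle all
            # alignments at once; omitted here to keep the sketch short).
            totals[i] += sum(p[j] * t[i + j] for j in range(m))
    return [total / k for total in totals]


def exact_matches(text, pattern):
    """Exact per-alignment match counts, for comparison with the estimate."""
    m = len(pattern)
    return [
        sum(a == b for a, b in zip(pattern, text[i:i + m]))
        for i in range(len(text) - m + 1)
    ]
```

At an alignment where the pattern matches exactly, every product is 1, so the estimate equals M with no error. At other alignments the error comes only from the mismatched positions, which is why the variance expressions in the summary depend on M - c.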
Citations
Journal Article
Fast Detection of Transformed Data Leaks
TL;DR: This paper uses sequence alignment techniques to detect complex data-leak patterns, achieves good detection accuracy in recognizing transformed leaks, and implements a parallelized version of the algorithms on a graphics processing unit that achieves high analysis throughput.
Book Chapter
Entropic Trace Estimates for Log Determinants
Jack K. Fitzsimons, Diego Granziol, Kurt Cutajar, Michael A. Osborne, Maurizio Filippone, Stephen J. Roberts
TL;DR: This work estimates log determinants under the framework of maximum entropy, given information in the form of moment constraints from stochastic trace estimation, demonstrating a significant improvement on state-of-the-art alternative methods.
Book Chapter
Optimal Query Complexity for Estimating the Trace of a Matrix
Karl Wimmer, Yi Wu, Peng Zhang
TL;DR: In this paper, the authors studied the query complexity of randomized algorithms for estimating the trace of an implicit n×n matrix and showed that any estimator requires Ω(1/e) queries to have a guarantee of variance at most.
Journal Article
On approximate pattern matching with thresholds
Peng Zhang, Mikhail J. Atallah
TL;DR: The main result of this paper is to show that this threshold version of the problem can be solved by recursively solving 3 + 2 log θ instances of the traditional (i.e., zero-threshold) version of this problem, which is much-studied in the literature and for which there are many efficient (typically randomized) solutions of time complexity close to O.
Proceedings Article
Breaking the Variance: Approximating the Hamming Distance in 1/ε Time Per Alignment
Tsvi Kopelowitz, Ely Porat
TL;DR: The main idea behind the algorithm is to reduce the variance of a specific randomized experiment by (approximately) separating heavy hitters from non-heavy hitters, and it is shown that the belief that the approximation version cannot be solved much faster as a function of 1/ε is false.
References
Journal Article
Efficient string matching: an aid to bibliographic search
TL;DR: A simple, efficient algorithm to locate all occurrences of any of a finite number of keywords in a string of text that has been used to improve the speed of a library bibliographic search program by a factor of 5 to 10.
Journal Article
A new approach to text searching
TL;DR: A family of simple and fast algorithms is introduced for solving the classical string matching problem, string matching with don't care symbols and complement symbols, and multiple patterns.
String-matching and other products
Michael J. Fischer, Mike Paterson
TL;DR: By exploiting the formal similarity of string-matching with integer multiplication, a new algorithm has been obtained with a running time which is only slightly worse than linear.
Journal Article
Generalized string matching
TL;DR: A generalization of string matching, in which the pattern is a sequence of pattern elements, each compatible with a set of symbols, is investigated. It is shown that generalized string matching requires a time-space product of $\Omega(n^2 / \log n)$ on a powerful model of computation, when the alphabet is restricted to n symbols.
Book
Efficient String Matching With K Mismatches
Gad M. Landau, Uzi Vishkin
TL;DR: An algorithm for finding all occurrences of the pattern in the text, each with at most k mismatches, runs in O( k ( m log m + n )) time.
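For readers unfamiliar with the k-mismatches problem named in this entry, the following is a minimal brute-force sketch of the problem definition only. It is not the Landau and Vishkin algorithm and does not achieve the stated running time; it takes time proportional to the product of the text and pattern lengths in the worst case. The function name is hypothetical.

```python
def k_mismatch_occurrences(text, pattern, k):
    """Return every start index where pattern occurs with at most k mismatches.

    Brute-force baseline that illustrates the problem statement only.
    """
    n, m = len(text), len(pattern)
    hits = []
    for i in range(n - m + 1):
        mismatches = 0
        for j in range(m):
            if text[i + j] != pattern[j]:
                mismatches += 1
                if mismatches > k:
                    # Too many mismatches at this alignment; stop early.
                    break
        else:
            # The inner loop finished without exceeding k mismatches.
            hits.append(i)
    return hits
```

For example, with text "abracadabra" and pattern "abra", k = 0 reports only the two exact occurrences, while larger k admits progressively more alignments.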