Showing papers by "Richard Cole" published in 2004


Proceedings Article
13 Jun 2004
TL;DR: This paper considers various flavors of the following online problem: preprocess a text or collection of strings, so that given a query string p, all matches of p with the text can be reported quickly.
Abstract: This paper considers various flavors of the following online problem: preprocess a text or collection of strings, so that given a query string p, all matches of p with the text can be reported quickly. In this paper we consider matches in which a bounded number of mismatches are allowed, or in which a bounded number of "don't care" characters are allowed. The specific problems we look at are: indexing, in which there is a single text t, and we seek locations where p matches a substring of t; dictionary queries, in which a collection of strings is given upfront, and we seek those strings which match p in their entirety; and dictionary matching, in which a collection of strings is given upfront, and we seek those substrings of a (long) p which match an original string in its entirety. These are all instances of an all-to-all matching problem, for which we provide a single solution. The performance bounds all have a similar character. For example, for the indexing problem with n = |t| and m = |p|, the query time for k substitutions is \(O(m + (c_1 \log n)^k / k! + \#\text{matches})\), with a data structure of size \(O(n (c_2 \log n)^k / k!)\) and a preprocessing time of \(O(n (c_2 \log n)^k / k!)\), where \(c_1, c_2 > 1\) are constants. The deterministic preprocessing assumes a weakly nonuniform RAM model; this assumption is not needed if randomization is used in the preprocessing.
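
To make the indexing problem in this abstract concrete, here is a minimal brute-force sketch of the query it describes: report every position where p matches a substring of t with at most k substitutions. It is not the data structure from the paper. It does no preprocessing and scans the text in O(nm) time, and the function name `mismatch_occurrences` is hypothetical.

```python
def mismatch_occurrences(text, pattern, k):
    """Return start positions where pattern matches a substring of text
    with at most k substituted characters (brute force, no index)."""
    n, m = len(text), len(pattern)
    hits = []
    for start in range(n - m + 1):
        mismatches = 0
        for offset in range(m):
            if text[start + offset] != pattern[offset]:
                mismatches += 1
                if mismatches > k:
                    break  # too many substitutions at this alignment
        else:
            # Loop finished without exceeding k mismatches.
            hits.append(start)
    return hits


print(mismatch_occurrences("abcabd", "abd", 0))  # [3]
print(mismatch_occurrences("abcabd", "abd", 1))  # [0, 3]
```

The same comparison loop, applied to whole dictionary strings instead of text windows, would give a naive version of the dictionary query problem.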

301 citations


Journal Article
TL;DR: This work adds a new back-propagation component to McCreight's algorithm and gives a high probability hashing scheme for large degrees, which gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.
Abstract: We consider suffix tree construction for situations with missing suffix links. Two examples of such situations are suffix trees for parameterized strings and suffix trees for two-dimensional arrays. These trees also have the property that the node degrees may be large. We add a new back-propagation component to McCreight's algorithm and also give a high probability hashing scheme for large degrees. We show that these two features enable construction of suffix trees for general situations with missing suffix links in O(n) time, with high probability. This gives the first randomized linear time algorithm for constructing suffix trees for parameterized strings.
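
As background for the parameterized strings mentioned in this abstract, the sketch below shows the "prev" encoding commonly used for parameterized matching: each parameter symbol is replaced by the distance to its previous occurrence, or 0 if it has none. It is offered only as intuition and is not the construction from the paper. The function name `prev_encode` and the example alphabets are assumptions made for illustration.

```python
def prev_encode(s, params):
    """Encode s so that two strings get equal encodings exactly when one can be
    turned into the other by a one-to-one renaming of symbols in `params`."""
    last_seen = {}
    encoded = []
    for i, ch in enumerate(s):
        if ch in params:
            # Distance back to the previous occurrence, or 0 for a first occurrence.
            encoded.append(i - last_seen[ch] if ch in last_seen else 0)
            last_seen[ch] = i
        else:
            encoded.append(ch)  # non-parameter symbols are kept as-is
    return encoded


print(prev_encode("xyx", {"x", "y"}))      # [0, 0, 2]
print(prev_encode("aba", {"a", "b"}))      # [0, 0, 2]
print(prev_encode("xyx", {"x", "y"})[1:])  # [0, 2]
print(prev_encode("yx", {"x", "y"}))       # [0, 0]
```

The last two lines show that the encoding of a suffix is not simply a suffix of the encoding. That breaks the usual relationship between a string and its suffixes that suffix-link-based constructions rely on, which gives some feel for why such trees need extra machinery.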

27 citations


Book Chapter
14 Sep 2004
TL;DR: This article introduces a new family of in-place sorting algorithms, the partition sorts, which are appealing both for their relative simplicity and their efficient performance: they perform \(O(n \log n)\) operations on average and \(O(n \log^2 n)\) operations in the worst case.
Abstract: This paper introduces a new family of in-place sorting algorithms, the partition sorts. They are appealing both for their relative simplicity and their efficient performance. They perform Θ(n log n) operations on the average, and \(\Theta(n \log^2\!n)\) operations in the worst case.

4 citations


Journal Article
TL;DR: An optimal parallel CRCW-PRAM algorithm to compute witnesses for all non-period vectors of an m1 × m2 pattern is given and yields a work optimal algorithm for 2D pattern matching.
Abstract: An optimal parallel CRCW-PRAM algorithm to compute witnesses for all non-period vectors of an m1 × m2 pattern is given. The algorithm takes O(log log m) time and does O(m1 × m2) work, where m = max{m1, m2}. This yields a work optimal algorithm for 2D pattern matching which takes O(log log m) preprocessing time and O(1) text processing time.
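
To illustrate what a witness table for a 2D pattern contains, here is a sequential brute-force sketch. It is not the parallel algorithm from the abstract and takes time quadratic in the pattern size. For a shift vector (dr, dc), a witness is a position where the pattern disagrees with its shifted copy; vectors with no such position are periods. The function names and the choice to enumerate only vectors with dr ≥ 0 are assumptions made for this example.

```python
def find_witness(pattern, dr, dc):
    """Return a position (i, j) with pattern[i][j] != pattern[i+dr][j+dc],
    or None if (dr, dc) is a period of the pattern."""
    rows, cols = len(pattern), len(pattern[0])
    for i in range(rows - dr):
        # Keep both j and j + dc inside the pattern.
        for j in range(max(0, -dc), min(cols, cols - dc)):
            if pattern[i][j] != pattern[i + dr][j + dc]:
                return (i, j)
    return None


def witness_table(pattern):
    """Map every non-period vector (dr, dc) with dr >= 0 to one witness."""
    rows, cols = len(pattern), len(pattern[0])
    table = {}
    for dr in range(rows):
        for dc in range(-(cols - 1), cols):
            if dr == 0 and dc <= 0:
                continue  # skip the zero vector and mirrored horizontal shifts
            witness = find_witness(pattern, dr, dc)
            if witness is not None:
                table[(dr, dc)] = witness
    return table


print(witness_table(["ab", "ab"]))
# {(0, 1): (0, 0), (1, -1): (0, 1), (1, 1): (0, 0)}
```

In this example the vector (1, 0) is absent from the table because shifting the pattern down by one row leaves every overlapping cell equal, so it is a period.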