Open Access · Book Chapter

An In-place Framework for Exact and Approximate Shortest Unique Substring Queries

TLDR
This chapter proposes a generic in-place framework for both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words plus n bytes of space, where n is the input string size.
Abstract
We revisit the exact shortest unique substring (SUS) finding problem and, motivated by its applications in fields such as computational biology, propose an approximate version in which mismatches are allowed. We design a generic in-place framework that solves both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words plus n bytes of space, where n is the input string size. With this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of \(O(n)\) and \(O(n^2)\) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and is therefore practical and easy to implement.
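To make the problem statement concrete, here is a minimal brute-force sketch in Python. It is not the in-place framework from the abstract and comes nowhere near its O(n) or \(O(n^2)\) bounds; it only illustrates what is being computed. It assumes the usual definitions: a substring is k-mismatch unique if no other substring of the same length, starting at a different position, lies within Hamming distance k of it, and the SUS covering position p is a shortest such substring that contains p (k = 0 gives the exact case). The function names `hamming`, `is_unique`, and `sus_covering` are hypothetical, and ties are broken by taking the leftmost candidate.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def is_unique(s, i, length, k=0):
    """True if s[i:i+length] has no other same-length substring
    (starting at a different position) within Hamming distance k."""
    target = s[i:i + length]
    for j in range(len(s) - length + 1):
        if j != i and hamming(target, s[j:j + length]) <= k:
            return False
    return True


def sus_covering(s, p, k=0):
    """Return inclusive (start, end) of the leftmost shortest
    k-mismatch-unique substring of s that covers position p.

    Brute force for illustration only (assumed definitions, see lead-in).
    """
    n = len(s)
    for length in range(1, n + 1):
        # Start positions i with i <= p <= i + length - 1 that fit in s.
        first = max(0, p - length + 1)
        last = min(p, n - length)
        for i in range(first, last + 1):
            if is_unique(s, i, length, k):
                return (i, i + length - 1)
    return None  # only reachable for an empty string


if __name__ == "__main__":
    text = "aab"
    for p in range(len(text)):
        print(p, sus_covering(text, p), sus_covering(text, p, k=1))
```

For the string `aab`, position 2 is covered by the exact SUS `b`, but once one mismatch is allowed both `b` and `ab` have near-duplicates, so only the whole string qualifies. The whole string is always unique, so a result exists for every position of a non-empty input.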


Citations
Proceedings Article

Shortest Unique Substring Queries on Run-Length Encoded Strings

TL;DR: Using the data structure by Beame and Fich (JCSS 2002), this results in a data structure of O(m) space that is constructed in O(m log m) time and answers queries in O(sqrt(log m / log log m) + k) time.
Journal Article

Space–time trade-offs for finding shortest unique substrings and maximal unique matches

TL;DR: The results imply the first sub-linear space (in addition to the input string) solution to these problems, and obtain similar space-and-time tradeoffs for a related problem of finding Maximal Unique Matches of two strings.
Journal Article

In-place algorithms for exact and approximate shortest unique substring problems

TL;DR: A generic in-place framework is designed that solves both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words, each of \(\lceil \log_2 n \rceil\) bits, plus n bytes of space, where n is the input string size.
Journal Article

Fast Algorithms for the Shortest Unique Palindromic Substring Problem on Run-Length Encoded Strings

TL;DR: In this article, the problem of shortest unique palindromic substring (SUPS) queries on run-length encoded strings was studied and two algorithms were proposed to answer SUPS queries in O(m) space.
Book Chapter

Compact Data Structures for Shortest Unique Substring Queries

TL;DR: In this article, a data structure for answering an interval SUS query in output-sensitive O(occ) time was proposed, where occ is the number of returned SUSs.
References
Journal Article

Fast computation and applications of genome mappability.

TL;DR: A fast mapping-based algorithm is presented to compute the mappability of each region of a reference genome up to a specified number of mismatches, highlighting mappability as an important concept that deserves to be taken into full account when massively parallel sequencing technologies are employed.
Journal Article

Genome comparison without alignment using shortest unique substrings.

TL;DR: A method to rapidly search for shortest unique substrings in DNA sequences is combined with a derivation of their null distribution, and it is shown that unique regions in an arbitrary sample of genomes can be efficiently detected with this method.
Journal Article

Practical linear-time O(1)-workspace suffix sorting for constant alphabets

TL;DR: In this experiment, SACA-K outperforms SA-IS, which was previously the most time- and space-efficient linear-time SA construction algorithm (SACA): it is around 33% faster and uses a smaller deterministic workspace of K words, where the workspace is the space needed beyond the input string and the output SA.
Proceedings Article

On shortest unique substring queries

TL;DR: This paper presents an algorithm to answer a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S, and shows that it can compute a shortest unique substring for every position in a given string.
Book Chapter

Shortest Unique Substrings Queries in Optimal Time

TL;DR: An optimal, linear-time algorithm for the shortest unique substring problem is presented, improving on the algorithm by Pei et al. (ICDE 2013), and is shown to be much more efficient in practice.