An In-place Framework for Exact and Approximate Shortest Unique Substring Queries

doi:10.1007/978-3-662-48971-0_63

Open Access Book Chapter


Wing-Kai Hon, +2 more

- pp 755-767


TLDR

A generic in-place framework is proposed that solves both exact and approximate k-mismatch SUS finding using only 2n memory words plus n bytes of space, where n is the input string size.

Abstract:

We revisit the exact shortest unique substring (SUS) finding problem and propose an approximate version in which mismatches are allowed, motivated by applications in subfields such as computational biology. We design a generic in-place framework that solves both exact and approximate k-mismatch SUS finding using only 2n memory words plus n bytes of space, where n is the input string size. With this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of \(O(n)\) and \(O(n^2)\) time, respectively, regardless of the value of k. Our framework does not involve any compressed or succinct data structures and is thus practical and easy to implement.
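To make the problem statement concrete, here is a minimal brute-force sketch of an SUS query in Python. It is not the in-place framework from the paper and comes nowhere near its time or space bounds. It assumes the usual definition: a substring is k-mismatch unique if no substring of the same length starting at a different position lies within Hamming distance k of it, and k = 0 gives the exact case. The function names `hamming_within`, `is_k_unique`, and `sus_brute_force`, the 0-based indexing, and the leftmost tie-breaking rule are all illustrative choices, not taken from the source.

```python
def hamming_within(a: str, b: str, k: int) -> bool:
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def is_k_unique(s: str, i: int, j: int, k: int = 0) -> bool:
    """Return True if s[i..j] has no k-mismatch match at any other start position."""
    length = j - i + 1
    target = s[i:j + 1]
    for start in range(len(s) - length + 1):
        if start != i and hamming_within(target, s[start:start + length], k):
            return False
    return True


def sus_brute_force(s: str, p: int, k: int = 0):
    """Return (i, j) for a shortest k-mismatch unique substring s[i..j] covering p.

    Assumption for illustration: ties are broken by the leftmost start.
    Lengths are tried in increasing order, so the first hit is a shortest one.
    """
    n = len(s)
    for length in range(1, n + 1):
        # Only windows of this length that contain position p are candidates.
        for i in range(max(0, p - length + 1), min(p, n - length) + 1):
            if is_k_unique(s, i, i + length - 1, k):
                return (i, i + length - 1)
    return None  # reached only for an empty string or an out-of-range p


if __name__ == "__main__":
    text = "abcab"
    print(sus_brute_force(text, 2))       # exact SUS covering position 2
    print(sus_brute_force(text, 0))       # exact SUS covering position 0
    print(sus_brute_force(text, 2, k=1))  # 1-mismatch SUS covering position 2
```

For `"abcab"`, the exact SUS covering position 2 is the single character `c`, while allowing one mismatch forces a longer answer because every single character is within one mismatch of every other.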

Citations


Proceedings Article

Shortest Unique Substring Queries on Run-Length Encoded Strings

Takuya Mieno, +3 more

TL;DR: Using the data structure by Beame and Fich (JCSS 2002), the authors obtain a data structure of O(m) space that is constructed in O(m log m) time and answers queries in O(sqrt(log m / log log m) + k) time.


Journal Article

Space–time trade-offs for finding shortest unique substrings and maximal unique matches

Arnab Ganguly, +3 more

- 14 Nov 2017 -

Theoretical Computer Science

TL;DR: The results imply the first sub-linear space (in addition to the input string) solution to these problems, and obtain similar space-and-time tradeoffs for a related problem of finding Maximal Unique Matches of two strings.


Journal Article

In-place algorithms for exact and approximate shortest unique substring problems

Wing-Kai Hon, +2 more

- 22 Aug 2017 -

Theoretical Computer Science

TL;DR: A generic in-place framework is designed that solves both exact and approximate k-mismatch SUS finding, using the minimum 2n memory words, each of \(\lceil \log_2 n \rceil\) bits, plus n bytes of space, where n is the input string size.


Journal Article

Fast Algorithms for the Shortest Unique Palindromic Substring Problem on Run-Length Encoded Strings

Kiichi Watanabe, +5 more

- 01 Oct 2020 -

Theory of Computing Systems / Mathemati...

TL;DR: In this article, the problem of shortest unique palindromic substring (SUPS) queries on run-length encoded strings was studied and two algorithms were proposed to answer SUPS queries in O(m) space.


Book Chapter

Compact Data Structures for Shortest Unique Substring Queries

Takuya Mieno, +5 more

TL;DR: In this article, a data structure for answering an interval SUS query in output-sensitive O(occ) time was proposed, where occ is the number of returned SUSs.


References


Journal Article

Fast computation and applications of genome mappability.

Thomas Derrien, +6 more

- 19 Jan 2012 -

PLOS ONE

TL;DR: A fast mapping-based algorithm is presented to compute the mappability of each region of a reference genome up to a specified number of mismatches, highlighting mappability as an important concept that deserves to be taken into full account when massively parallel sequencing technologies are employed.


Journal Article

Genome comparison without alignment using shortest unique substrings.

Bernhard Haubold, +3 more

- 23 May 2005 -

BMC Bioinformatics

TL;DR: A method to rapidly search for shortest unique substrings in DNA sequences is combined with a derivation of their null distribution, and it is shown that unique regions in an arbitrary sample of genomes can be efficiently detected with this method.


Journal Article

Practical linear-time O(1)-workspace suffix sorting for constant alphabets

Ge Nong

- 05 Aug 2013 -

ACM Transactions on Information Systems

TL;DR: In the experiments, SACA-K outperforms SA-IS, which was previously the most time- and space-efficient linear-time suffix array construction algorithm (SACA); SACA-K is around 33% faster and uses a smaller deterministic workspace of K words, where the workspace is the space needed beyond the input string and the output SA.


Proceedings Article

On shortest unique substring queries

Jian Pei, +2 more

TL;DR: This paper presents an algorithm that answers a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S, and shows that it can compute a shortest unique substring for every position in a given string.


Book Chapter

Shortest Unique Substrings Queries in Optimal Time

Kazuya Tsuruta, +3 more

TL;DR: An optimal, linear-time algorithm for the shortest unique substring problem is presented; it improves on the algorithm by Pei et al. (ICDE 2013) and is shown to be much more efficient in practice.


