An In-place Framework for Exact and Approximate Shortest Unique Substring Queries
Wing-Kai Hon, Sharma V. Thankachan, Bojian Xu
- pp 755-767
TL;DR: This article proposes a generic in-place framework that solves both the exact and the approximate k-mismatch shortest unique substring (SUS) finding problems, using the minimum 2n memory words plus n bytes of space, where n is the input string size.
Abstract:
We revisit the exact shortest unique substring (SUS) finding problem and, motivated by its applications in subfields such as computational biology, propose an approximate version in which mismatches are allowed. We design a generic in-place framework that solves both the exact and the approximate k-mismatch SUS finding problems, using the minimum 2n memory words plus n bytes of space, where n is the input string size. With this framework, we can find the exact and approximate k-mismatch SUS for every string position in a total of \(O(n)\) and \(O(n^2)\) time, respectively, regardless of the value of k. The framework does not involve any compressed or succinct data structures and is therefore practical and easy to implement.
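To make the problem statement concrete, the following is a minimal brute-force sketch of what an exact or k-mismatch SUS query asks for. It is not the in-place framework described in the abstract and does not come close to its time or space bounds. The function names (`naive_sus`, `is_k_unique`, `hamming_within`) are hypothetical. The definition of k-mismatch uniqueness used here (no other same-length substring within Hamming distance k) and the tie-breaking rule (leftmost among the shortest) are assumptions made for illustration.

```python
def hamming_within(a, b, k):
    """Return True if equal-length strings a and b differ in at most k positions."""
    mismatches = 0
    for x, y in zip(a, b):
        if x != y:
            mismatches += 1
            if mismatches > k:
                return False
    return True


def is_k_unique(s, i, j, k):
    """Return True if s[i:j] has no other occurrence in s within Hamming distance k."""
    m = j - i
    target = s[i:j]
    for start in range(len(s) - m + 1):
        if start != i and hamming_within(target, s[start:start + m], k):
            return False
    return True


def naive_sus(s, p, k=0):
    """Return (i, j) such that s[i:j] is a shortest k-mismatch unique substring
    covering position p (0-based). With k=0 this is the exact SUS.
    Ties are broken by taking the leftmost candidate (an assumption)."""
    n = len(s)
    for length in range(1, n + 1):
        lo = max(0, p - length + 1)   # leftmost start that still covers p
        hi = min(p, n - length)       # rightmost start that fits in s
        for i in range(lo, hi + 1):
            if is_k_unique(s, i, i + length, k):
                return (i, i + length)
    return None  # only reached if p is outside the string


# Exact SUS covering position 0 of "aab" is "aa"; with k=1, position 2 needs the whole string.
print(naive_sus("aab", 0))       # (0, 2)
print(naive_sus("aab", 2, k=1))  # (0, 3)
```

The sketch tries every candidate length and start position, so it runs in polynomial but far-from-linear time. It is useful mainly as a reference for checking a faster implementation on small inputs.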
Citations
Proceedings ArticleDOI
Shortest Unique Substring Queries on Run-Length Encoded Strings
TL;DR: Using the data structure by Beame and Fich (JCSS 2002), this results in a data structure of O(m) space that is constructed in O(m log m) time and answers queries in O(sqrt(log m / log log m) + k) time.
Journal ArticleDOI
Space–time trade-offs for finding shortest unique substrings and maximal unique matches
TL;DR: The results imply the first sub-linear space (in addition to the input string) solution to these problems, and similar space-and-time trade-offs are obtained for the related problem of finding Maximal Unique Matches of two strings.
Journal ArticleDOI
In-place algorithms for exact and approximate shortest unique substring problems
TL;DR: A generic in-place framework is designed that solves both the exact and the approximate k-mismatch SUS finding problems, using the minimum 2n memory words, each of ⌈log₂ n⌉ bits, plus n bytes of space, where n is the input string size.
Journal ArticleDOI
Fast Algorithms for the Shortest Unique Palindromic Substring Problem on Run-Length Encoded Strings
Kiichi Watanabe, Yuto Nakashima, Shunsuke Inenaga, Hideo Bannai, Masayuki Takeda
TL;DR: In this article, the problem of shortest unique palindromic substring (SUPS) queries on run-length encoded strings was studied and two algorithms were proposed to answer SUPS queries in O(m) space.
Book ChapterDOI
Compact Data Structures for Shortest Unique Substring Queries
TL;DR: In this article, a data structure for answering an interval SUS query in output-sensitive O(occ) time was proposed, where occ is the number of returned SUSs.
References
Journal ArticleDOI
Fast computation and applications of genome mappability.
Thomas Derrien, Jordi Estellé, Santiago Marco Sola, David G. Knowles, Emanuele Raineri, Roderic Guigó, Paolo Ribeca
TL;DR: A fast mapping-based algorithm is presented to compute the mappability of each region of a reference genome up to a specified number of mismatches, highlighting mappability as an important concept that deserves to be taken into full account when massively parallel sequencing technologies are employed.
Journal ArticleDOI
Genome comparison without alignment using shortest unique substrings.
TL;DR: A method to rapidly search for shortest unique substrings in DNA sequences is combined with a derivation of their null distribution, and it is shown that unique regions in an arbitrary sample of genomes can be efficiently detected with this method.
Journal ArticleDOI
Practical linear-time O(1)-workspace suffix sorting for constant alphabets
TL;DR: In this experiment, SACA-K outperforms SA-IS that was previously the most time- and space-efficient linear-time SA construction algorithm (SACA), and is around 33% faster and uses a smaller deterministic workspace of K words, where the workspace is the space needed beyond the input string and the output SA.
Proceedings ArticleDOI
On shortest unique substring queries
Jian Pei, W. C-H Wu, Mi-Yen Yeh
TL;DR: This paper presents an algorithm to answer a shortest unique substring query in O(n) time using a suffix tree index, where n is the length of string S, and shows that it can compute a shortest unique substring for every position in a given string.
Book ChapterDOI
Shortest Unique Substrings Queries in Optimal Time
TL;DR: An optimal, linear-time algorithm for the shortest unique substring problem is presented, improving on the algorithm by Pei et al. (ICDE 2013), and is shown to be much more efficient in practice.