A Linear-Time Burrows-Wheeler Transform Using Induced Sorting

doi:10.1007/978-3-642-03784-9_9

Book Chapter

Daisuke Okanohara, +1 more

- Vol. 5721, pp 90-101

TLDR

It is shown that the working space for computing the Burrows-Wheeler Transform directly in linear time is O(n log σ log log_σ n) for any σ, where σ is the alphabet size, which is the smallest among the known linear-time algorithms.

Abstract:

To compute the Burrows-Wheeler Transform (BWT), one usually builds a suffix array (SA) first and then derives the BWT from it, which requires much redundant working space. Previous approaches that compute the BWT directly [5,12] construct it incrementally, which requires O(n log n) time, where n is the length of the input text. We present an algorithm for computing the BWT directly in linear time by modifying the suffix array construction algorithm based on induced sorting [15]. We show that the working space is O(n log σ log log_σ n) for any σ, where σ is the alphabet size, which is the smallest among the known linear-time algorithms.
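
For orientation, the conventional two-step route that the abstract contrasts against (build a suffix array, then read the BWT off it) can be sketched as below. This is a minimal illustration, not the induced-sorting algorithm of the chapter: the function names, the `$` sentinel, and the naive comparison sort are assumptions made for the example, and the sort is far from linear time. The list `sa` holds n integers, which is the redundant working space the abstract refers to.

```python
def suffix_array(text: str) -> list[int]:
    """Naive suffix array: start positions of all suffixes in lexicographic order.

    Illustrative only; sorting whole suffixes is much slower than linear time.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def bwt_via_suffix_array(text: str, sentinel: str = "$") -> str:
    """Compute the BWT of `text` by the indirect route through a suffix array.

    Assumption for this sketch: `sentinel` is a single character that is
    strictly smaller than every character of `text`.
    """
    if any(c <= sentinel for c in text):
        raise ValueError("sentinel must be smaller than every text character")
    s = text + sentinel
    sa = suffix_array(s)
    # The i-th BWT character is the one preceding the i-th smallest suffix;
    # for the suffix starting at 0, s[-1] wraps around to the sentinel.
    return "".join(s[i - 1] for i in sa)


if __name__ == "__main__":
    print(bwt_via_suffix_array("banana"))
```

A direct algorithm, as described in the abstract, aims to produce the same output string without ever materializing the full array `sa`.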

Citations

Journal Article

Fully Functional Static and Dynamic Succinct Trees

Gonzalo Navarro, +1 more

- 01 May 2014 -

ACM Transactions on Algorithms

TL;DR: The range min-max tree, as discussed by the authors, is a data structure with which ordinal trees can be represented in 2n + O(n/polylog(n)) bits of space.

Journal Article

Word-based self-indexes for natural language text

Antonio Fariña, +5 more

- 06 Mar 2012 -

ACM Transactions on Information Systems

TL;DR: This article introduces a different kind of index that replaces the text using essentially the same space required by the compressed text alone (compression ratio around 35%).

Journal Article

Fully compressed suffix trees

Luís M. S. Russo, +2 more

- 28 Sep 2011 -

ACM Transactions on Algorithms

TL;DR: This article introduces the first compressed suffix tree representation that requires only sublinear space on top of the compressed text size, and supports a wide set of navigational operations in almost logarithmic time.

Journal Article

Lightweight Data Indexing and Compression in External Memory

Paolo Ferragina, +2 more

- 01 Jul 2012 -

Algorithmica

TL;DR: Algorithms for computing the Burrows-Wheeler Transform and for building (compressed) indexes in external memory are described that are lightweight in the sense that, for an input of size n, they use only n bits of working space on disk, while all previous approaches use Θ(n log n) bits.

Posted Content

Lightweight Data Indexing and Compression in External Memory

Paolo Ferragina, +2 more

- 24 Sep 2009 -

arXiv: Data Structures and Algorithms

TL;DR: In this article, the authors describe algorithms for computing the BWT and for building (compressed) indexes in external memory using only a small amount of disk working space, and prove lower bounds on the complexity of computing and inverting the BWT via sequential scans in terms of the classic product.

References

A Block-sorting Lossless Data Compression Algorithm

Michael Burrows, +1 more

TL;DR: A block-sorting, lossless data compression algorithm and its implementation are described, and the performance of the implementation is compared with that of widely available data compressors running on the same hardware.

Journal Article

Compressed full-text indexes

Gonzalo Navarro, +1 more

- 12 Apr 2007 -

ACM Computing Surveys

TL;DR: The relationship between text entropy and the regularities that show up in index structures and permit compressing them is explained, and the most relevant self-indexes are covered, focusing on how they exploit text compressibility to achieve compact structures that can efficiently solve various search problems.

Journal Article

Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching

Roberto Grossi, +1 more

- 01 Aug 2005 -

SIAM Journal on Computing

TL;DR: The result presents for the first time an efficient index whose size is provably linear in the size of the text in the worst case, and for many scenarios, the space is actually sublinear in practice.

Proceedings Article

Succinct indexable dictionaries with applications to encoding k-ary trees and multisets

Rajeev Raman, +2 more

TL;DR: A structure that supports both operations in O(1) time on the RAM model is presented, along with an information-theoretically optimal representation for cardinal trees and multisets where (appropriate generalisations of) the select and rank operations can be supported in O(1) time.

Proceedings Article

Optimal suffix tree construction with large alphabets

Martin Farach

TL;DR: This work builds suffix trees in linear time for integer alphabets; in the comparison model, Weiner's algorithm matches a trivial Ω(n log n)-time lower bound based on sorting.

Related Papers (5)

A Block-sorting Lossless Data Compression Algorithm

Suffix arrays: a new method for on-line string searches

Indexing compressed text

Compressed full-text indexes

Opportunistic data structures with applications