Book Chapter

The Burrows-Wheeler Transform between Data Compression and Combinatorics on Words

TLDR
This paper surveys the theoretical research issues that, taking their cue from Data Compression, have been developed in the context of Combinatorics on Words, and focuses on the combinatorial results useful for exploring the applicative potential of the Burrows-Wheeler Transform.
Abstract
The Burrows-Wheeler Transform (BWT) is a tool of fundamental importance in Data Compression and has recently found many applications well beyond its original purpose. The main goal of this paper is to highlight the mathematical and combinatorial properties on which the outstanding versatility of the BWT is based, i.e., its reversibility and the clustering effect on the output. These properties have attracted strong interest in the scientific community, both for their theoretical aspects and for their practical effects. In particular, this paper aims both to survey the theoretical research issues that, taking their cue from Data Compression, have been developed in the context of Combinatorics on Words, and to focus on those combinatorial results useful for exploring the applicative potential of the Burrows-Wheeler Transform.
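
The two properties named in the abstract, reversibility and the clustering effect, can be illustrated with a minimal Python sketch. It assumes the rotation-based definition common in Combinatorics on Words (sort all cyclic rotations of a non-empty word, keep the last column together with the row index of the original word), with no end-of-string marker. The function names `bwt`, `inverse_bwt`, and `count_runs` are hypothetical and chosen only for this illustration; the quadratic-space construction is for clarity, not efficiency.

```python
from itertools import groupby


def bwt(word):
    """Return (last_column, index) for a non-empty word.

    `index` is the row at which `word` itself appears among its
    lexicographically sorted cyclic rotations.
    """
    n = len(word)
    rotations = sorted(word[i:] + word[:i] for i in range(n))
    last_column = "".join(rotation[-1] for rotation in rotations)
    return last_column, rotations.index(word)


def inverse_bwt(last_column, index):
    """Recover the original word from the pair produced by `bwt`."""
    n = len(last_column)
    # A stable sort of positions by character maps each row of the
    # (implicit) first column to the row of the last column that holds
    # the same character occurrence.
    first_to_last = sorted(range(n), key=lambda i: last_column[i])
    row = index
    letters = []
    for _ in range(n):
        row = first_to_last[row]
        letters.append(last_column[row])
    return "".join(letters)


def count_runs(text):
    """Number of maximal runs of equal consecutive letters."""
    return sum(1 for _ in groupby(text))


transformed, idx = bwt("banana")
print(transformed, idx)                  # nnbaaa 3
print(inverse_bwt(transformed, idx))     # banana
print(count_runs("banana"), count_runs(transformed))  # 6 3
```

On this small input the output has three runs of equal letters where the input has six, which is the clustering effect in miniature; the pair `(last_column, index)` suffices to rebuild the input, which is the reversibility property.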


Citations
Book

Algorithms in Bioinformatics: 5th International Workshop, WABI 2005, Mallorca, Spain, October 3-6, 2005, Proceedings (Lecture Notes in Computer Science / Lecture Notes in Bioinformatics)

Rita Casadio, +1 more
TL;DR: In this article, the authors present an efficient reduction from constrained to unconstrained maximum agreement subtree for the maximum quartet consistency problem, which can be solved by using semi-definite programming.
Posted Content

Wheeler Languages

TL;DR: Morphisms, Factors, Suffixes, and Inverses.
Journal Article

An External-Memory Algorithm for String Graph Construction

TL;DR: In this paper, an external-memory algorithm is proposed to compute the self-indexes of a set of strings, mainly via computing the Burrows-Wheeler transform of the input strings.
Book Chapter

Block Sorting-Based Transformations on Words: Beyond the Magic BWT

TL;DR: The notion of rank-invertibility, a property related to the implementation of an efficient inversion procedure, is introduced, and it is shown that the BWT and the Alternating BWT are the only rank-invertible transformations in the class of block sorting-based transformations.
Journal Article

Variable-order reference-free variant discovery with the Burrows-Wheeler Transform

TL;DR: A new algorithm and the corresponding tool ebwt2InDel are introduced that extend the framework of [Prezza et al., AMB 2019] to also detect INDELs, and implement recent algorithmic findings that allow the whole analysis to be performed using just the BWT, thus reducing the working space by one order of magnitude and allowing the analysis of full genomes.