Open Access Journal Article

MeDuSa: a multi-draft based scaffolder

TLDR
MeDuSa formalizes the scaffolding problem as a combinatorial optimization problem on graphs and solves it with an efficient constant-factor approximation algorithm. It requires neither prior knowledge of the microorganisms in the dataset under analysis nor paired-end read libraries.
Abstract
Completing the genome sequence of an organism is an important task in comparative, functional and structural genomics. However, this remains a challenging issue from both a computational and an experimental viewpoint. Genome scaffolding (i.e. the process of ordering and orientating contigs) of de novo assemblies usually represents the first step in most genome finishing pipelines. In this paper, we present MeDuSa (Multi-Draft based Scaffolder), an algorithm for genome scaffolding. MeDuSa exploits information obtained from a set of (draft or closed) genomes from related organisms to determine the correct order and orientation of the contigs. MeDuSa formalises the scaffolding problem by means of a combinatorial optimisation formulation on graphs and implements an efficient constant-factor approximation algorithm to solve it. In contrast to currently used scaffolders, it requires neither prior knowledge of the microorganisms in the dataset under analysis (e.g. their phylogenetic relationships) nor the availability of paired-end read libraries. This makes usability and running time two additional important features of our method. Moreover, benchmarks and tests on real bacterial datasets showed that MeDuSa is highly accurate and, in most cases, outperforms traditional scaffolders. The possibility of using MeDuSa on eukaryotic datasets has also been evaluated, leading to interesting results.

Availability: MeDuSa web server: http://combo.dbe.unifi.it/medusa. A stand-alone version of the software can be downloaded from https://github.com/combogenomics/medusa/releases. All results presented in this work have been obtained with MeDuSa v. 1.3.

Contact: marco.fondi@unifi.it
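The graph formulation mentioned in the abstract can be illustrated with a small sketch. The abstract does not spell out the algorithm, so the code below is not MeDuSa's implementation: it substitutes a simple greedy heaviest-edge heuristic for the paper's approximation algorithm, ignores contig orientation, and assumes the contigs have already been mapped to each related genome. All function names and the input format (one ordered list of contig names per related genome) are hypothetical.

```python
from collections import defaultdict


def build_scaffold_graph(contig_orders):
    """Count, for every pair of contigs, how many related genomes
    place them next to each other (assumed input: one ordered list
    of contig names per related genome)."""
    weights = defaultdict(int)
    for order in contig_orders:
        for a, b in zip(order, order[1:]):
            if a != b:
                weights[frozenset((a, b))] += 1
    return weights


def greedy_path_cover(weights):
    """Pick edges from heaviest to lightest, keeping every contig at
    degree <= 2 and refusing edges that would close a cycle, so the
    chosen edges form a set of disjoint paths (candidate scaffolds)."""
    parent = {}

    def find(x):
        # Union-find lookup with path halving.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    degree = defaultdict(int)
    chosen = []
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], sorted(kv[0])))
    for edge, w in ranked:
        a, b = sorted(edge)
        if degree[a] >= 2 or degree[b] >= 2:
            continue  # contig already has two neighbours
        root_a, root_b = find(a), find(b)
        if root_a == root_b:
            continue  # edge would create a cycle
        parent[root_a] = root_b
        degree[a] += 1
        degree[b] += 1
        chosen.append((a, b, w))
    return chosen


def paths_from_edges(edges, contigs):
    """Walk the chosen edges from each path endpoint to list scaffolds."""
    adj = defaultdict(list)
    for a, b, _ in edges:
        adj[a].append(b)
        adj[b].append(a)
    seen = set()
    paths = []
    for start in sorted(contigs):
        if start in seen or len(adj[start]) == 2:
            continue  # begin only at endpoints or isolated contigs
        path = [start]
        seen.add(start)
        prev, cur = None, start
        while True:
            nxt = [n for n in adj[cur] if n != prev]
            if not nxt:
                break
            prev, cur = cur, nxt[0]
            path.append(cur)
            seen.add(cur)
        paths.append(path)
    return paths


if __name__ == "__main__":
    # Toy data: contig orders observed on three hypothetical related genomes.
    orders = [
        ["c1", "c2", "c3", "c4"],
        ["c1", "c2", "c4", "c3"],
        ["c2", "c3", "c5"],
    ]
    graph = build_scaffold_graph(orders)
    edges = greedy_path_cover(graph)
    print(paths_from_edges(edges, {"c1", "c2", "c3", "c4", "c5"}))
    # prints [['c1', 'c2', 'c3', 'c4'], ['c5']]
```

In this toy input, the adjacencies supported by two genomes win over those supported by one, so c5 stays unplaced. A real scaffolder would also have to resolve contig orientation and derive the adjacencies from sequence alignments rather than from given orders.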

