MeDuSa: a multi-draft based scaffolder

doi:10.1093/BIOINFORMATICS/BTV171

Emanuele Bosi, +8 more

- 25 Mar 2015 -

Bioinformatics

- Vol. 31, Iss: 15, pp 2443-2451

TLDR

MeDuSa formalizes the scaffolding problem as a combinatorial optimization problem on graphs and solves it with an efficient constant-factor approximation algorithm. The method requires neither prior knowledge of the microorganisms in the dataset under analysis nor the availability of paired-end read libraries.

Abstract:

Completing the genome sequence of an organism is an important task in comparative, functional and structural genomics. However, this remains a challenging issue from both a computational and an experimental viewpoint. Genome scaffolding (i.e. the process of ordering and orientating contigs) of de novo assemblies usually represents the first step in most genome finishing pipelines. In this paper, we present MeDuSa (Multi-Draft based Scaffolder), an algorithm for genome scaffolding. MeDuSa exploits information obtained from a set of (draft or closed) genomes from related organisms to determine the correct order and orientation of the contigs. MeDuSa formalises the scaffolding problem by means of a combinatorial optimisation formulation on graphs and implements an efficient constant factor approximation algorithm to solve it. In contrast to currently used scaffolders, it requires neither prior knowledge of the microorganisms in the dataset under analysis (e.g. their phylogenetic relationships) nor the availability of paired-end read libraries. This makes usability and running time two additional important features of our method. Moreover, benchmarks and tests on real bacterial datasets showed that MeDuSa is highly accurate and, in most cases, outperforms traditional scaffolders. The possibility to use MeDuSa on eukaryotic datasets has also been evaluated, leading to interesting results.

Availability and implementation: MeDuSa web server: http://combo.dbe.unifi.it/medusa. A stand-alone version of the software can be downloaded from https://github.com/combogenomics/medusa/releases. All results presented in this work have been obtained with MeDuSa v. 1.3.

Contact: marco.fondi@unifi.it
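
The graph formulation mentioned in the abstract can be illustrated with a small sketch. This is not MeDuSa's actual implementation, and it makes several assumptions that the abstract does not state: each contig is a node, each edge weight is a hypothetical count of related genomes in which two contigs appear adjacent, and a scaffold is a simple path in the graph. The function name `greedy_path_cover` is invented for illustration. The sketch uses a plain greedy heuristic (heaviest edge first), ignores contig orientation, and claims no approximation guarantee.

```python
from collections import defaultdict


def greedy_path_cover(contigs, weighted_edges):
    """Group contigs into ordered scaffolds (simple paths).

    contigs: list of contig names.
    weighted_edges: iterable of (u, v, weight); weight is an assumed
    measure of evidence that contigs u and v are adjacent.
    Illustrative greedy heuristic only, not the published algorithm.
    """
    parent = {c: c for c in contigs}  # union-find forest for cycle detection

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    degree = {c: 0 for c in contigs}
    adjacency = defaultdict(list)

    # Consider the heaviest edges first; break ties by name for determinism.
    for u, v, _w in sorted(weighted_edges, key=lambda e: (-e[2], e[0], e[1])):
        if u == v or degree[u] >= 2 or degree[v] >= 2:
            continue  # a node on a path has at most two neighbours
        root_u, root_v = find(u), find(v)
        if root_u == root_v:
            continue  # adding this edge would close a cycle
        parent[root_u] = root_v
        degree[u] += 1
        degree[v] += 1
        adjacency[u].append(v)
        adjacency[v].append(u)

    # Walk every path starting from one of its endpoints (degree 0 or 1).
    scaffolds, seen = [], set()
    for c in contigs:
        if c in seen or degree[c] == 2:
            continue
        path, prev, cur = [c], None, c
        seen.add(c)
        while True:
            nxt = [n for n in adjacency[cur] if n != prev]
            if not nxt:
                break
            prev, cur = cur, nxt[0]
            path.append(cur)
            seen.add(cur)
        scaffolds.append(path)
    return scaffolds


if __name__ == "__main__":
    # Toy data: weights are made-up adjacency support counts.
    toy_contigs = ["c1", "c2", "c3", "c4", "c5"]
    toy_edges = [
        ("c1", "c2", 3),
        ("c2", "c3", 2),
        ("c1", "c3", 2),
        ("c3", "c4", 1),
        ("c2", "c4", 1),
    ]
    print(greedy_path_cover(toy_contigs, toy_edges))
```

Contigs with no supported adjacency, such as `c5` in the toy data, are returned as single-contig scaffolds, which mirrors the fact that a scaffolder can only join contigs for which it has evidence.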

Citations

Comparative genomics sheds light on niche differentiation and the evolutionary history of comammox Nitrospira

Alejandro Palomo, +5 more

- 07 Mar 2018 -

The ISME Journal

TL;DR: The authors' analyses indicate that several genes belonging to the ammonia oxidation pathway could have been laterally transferred from β-AOB to comammox Nitrospira, and postulate that the absence of comammox genes in other sublineage II Nitrospira genomes is the result of subsequent loss.

Cultivation and functional characterization of 79 planctomycetes uncovers their unique biology

Sandra Wiegand, +37 more

- 01 Jan 2020 -

Nature microbiology

TL;DR: The authors report diversity-driven cultivation, characterization and genome sequencing of 79 bacterial strains from all major taxonomic clades of the conspicuous bacterial phylum Planctomycetes, identify previously unknown modes of bacterial cell division and illustrate how ‘microbial dark matter’ can be accessed by cultivation techniques, expanding the organismic background for small-molecule research and drug-target detection.

The Kalanchoë genome provides insights into convergent evolution and building blocks of crassulacean acid metabolism

Xiaohan Yang, +56 more

- 01 Dec 2017 -

Nature Communications

TL;DR: Evidence is provided for convergent evolution of protein sequence and temporal gene expression underpinning the multiple independent emergences of CAM through genomic analysis of Kalanchoë fedtschenkoi.

Insights into the Evolution of Multicellularity from the Sea Lettuce Genome

Olivier De Clerck, +35 more

- 24 Sep 2018 -

Current Biology

TL;DR: The sequenced genome of Ulva mutabilis, a ubiquitous and iconic representative of the Ulvophyceae or green seaweeds, offers new opportunities to understand coastal and marine ecosystems and the fundamental evolution of the green lineage.

From genome mining to phenotypic microarrays: Planctomycetes as source for novel bioactive molecules.

Olga Jeske, +4 more

- 27 Aug 2013 -

Antonie Van Leeuwenhoek International Jo...

TL;DR: The results point towards a previously postulated relationship of Planctomycetes with algae or plants, which secrete compounds that might serve as a trigger to stimulate secondary metabolite production in Planctomycetes, and provide the necessary starting point to explore planctomycetal small molecules for drug development.
