Journal Article

The greedy path-merging algorithm for contig scaffolding

TLDR
The paper describes the greedy path-merging algorithm, an efficient heuristic for solving the Contig Scaffolding Problem, originally developed as a key component of the compartmentalized assembly strategy at Celera Genomics.
Abstract
Given a collection of contigs and mate-pairs, the Contig Scaffolding Problem is to order and orient the contigs in a manner that is consistent with as many mate-pairs as possible. This paper describes an efficient heuristic, the greedy path-merging algorithm, for solving this problem. The method was originally developed as a key component of the compartmentalized assembly strategy at Celera Genomics. This interim approach was used at an early stage of the sequencing of the human genome to produce a preliminary assembly from preliminary whole-genome shotgun data produced at Celera and preliminary human contigs produced by the Human Genome Project.
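
To make the objective concrete, the following minimal sketch counts how many mate-pairs a given ordering and orientation of contigs satisfies. It is not the greedy path-merging algorithm from the paper; it only illustrates the quantity the Contig Scaffolding Problem asks to maximize. The names `Read`, `MatePair`, `place`, `is_satisfied` and `count_satisfied` are hypothetical, and the model is deliberately simplified: reads are treated as points, contigs are abutted with no gaps, mates are assumed to face each other, and a distance within `k` standard deviations of the library mean counts as consistent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    contig: str    # contig the read was placed in
    offset: int    # position of the read in the contig's own coordinates
    forward: bool  # True if the read lies on the contig's forward strand


@dataclass(frozen=True)
class MatePair:
    a: Read
    b: Read
    mean: int  # expected distance between the two mates
    sd: int    # standard deviation of that distance


def place(read, layout, lengths):
    """Map a read to (global position, global orientation) under a layout.

    layout is a list of (contig_id, flipped) pairs ordered left to right.
    Contigs are abutted with no gaps (a simplifying assumption).
    Returns None if the read's contig is not part of the layout.
    """
    start = 0
    for contig_id, flipped in layout:
        if contig_id == read.contig:
            if flipped:
                # Reverse-complementing a contig mirrors positions and strands.
                return start + lengths[contig_id] - read.offset, not read.forward
            return start + read.offset, read.forward
        start += lengths[contig_id]
    return None


def is_satisfied(pair, layout, lengths, k=3):
    """True if the mates face each other at roughly the expected distance."""
    placed_a = place(pair.a, layout, lengths)
    placed_b = place(pair.b, layout, lengths)
    if placed_a is None or placed_b is None:
        return False
    (pos_a, fwd_a), (pos_b, fwd_b) = placed_a, placed_b
    if fwd_a == fwd_b:
        return False  # mates must lie on opposite strands
    # The forward-strand mate has to sit to the left of the reverse-strand mate.
    left, right = (pos_a, pos_b) if fwd_a else (pos_b, pos_a)
    distance = right - left
    return distance > 0 and abs(distance - pair.mean) <= k * pair.sd


def count_satisfied(pairs, layout, lengths):
    """Number of mate-pairs consistent with the given layout."""
    return sum(is_satisfied(p, layout, lengths) for p in pairs)


if __name__ == "__main__":
    lengths = {"A": 1000, "B": 1000}
    pairs = [MatePair(Read("A", 800, True), Read("B", 700, False), mean=1000, sd=50)]
    for layout in ([("A", False), ("B", False)],
                   [("A", False), ("B", True)],
                   [("B", False), ("A", False)]):
        print(layout, count_satisfied(pairs, layout, lengths))
```

In this toy input, only the layout that places A before B with neither contig flipped satisfies the mate-pair; flipping B puts both mates on the same strand, and swapping the order puts the forward mate to the right of its partner. A scaffolding heuristic searches over such layouts for one that scores well on this count.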
