Topic

De Bruijn sequence

About: De Bruijn sequence is a research topic. Over its lifetime, 1,408 publications have been published within this topic, receiving 28,620 citations.

Papers

Journal Article

Velvet: Algorithms for de novo short read assembly using de Bruijn graphs

Daniel R. Zerbino¹, Ewan Birney¹•Institutions (1)

European Bioinformatics Institute¹

01 May 2008-Genome Research

TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies. On real Solexa data sets without read pairs, its contigs are in close agreement with simulated results.

Abstract: We have developed a new set of algorithms, collectively called "Velvet," to manipulate de Bruijn graphs for genomic sequence assembly. A de Bruijn graph is a compact representation based on short words (k-mers) that is ideal for high coverage, very short read (25-50 bp) data sets. Applying Velvet to very short reads and paired-ends information only, one can produce contigs of significant length, up to 50-kb N50 length in simulations of prokaryotic data and 3-kb N50 on simulated mammalian BACs. When applied to real Solexa data sets without read pairs, Velvet generated contigs of approximately 8 kb in a prokaryote and 2 kb in a mammalian BAC, in close agreement with our simulated results without read-pair information. Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies.

9,389 citations
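
The Velvet abstract above describes a de Bruijn graph as a compact representation built from short words (k-mers). As a rough illustration only, the sketch below builds a simplified textbook-style graph in which nodes are (k-1)-mers and each k-mer contributes one edge. The function name `de_bruijn_graph` and the toy reads are assumptions made for this example; this is not Velvet's actual data structure, which among other things also handles reverse complements and coverage.

```python
from collections import defaultdict


def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer to the sorted (k-1)-mers that follow it in some read."""
    graph = defaultdict(set)
    for read in reads:
        # Slide a window of width k over the read; each k-mer is one edge.
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].add(kmer[1:])
    return {node: sorted(successors) for node, successors in graph.items()}


# Toy reads invented for illustration, not real sequencing data.
reads = ["ACGTA", "CGTAC"]
graph = de_bruijn_graph(reads, k=3)
for node, successors in sorted(graph.items()):
    print(node, "->", successors)
```

Because overlapping reads share k-mers, the two toy reads collapse into the same nodes and edges rather than being stored twice, which is the compactness the abstract refers to.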

Journal Article

Efficient de novo assembly of large genomes using compressed data structures

Jared T. Simpson¹, Richard Durbin¹•Institutions (1)

Wellcome Trust Sanger Institute¹

01 Mar 2012-Genome Research

TL;DR: SGA (String Graph Assembler) is a new assembler based on the overlap-based string graph model of assembly. It is simply parallelizable and, to the authors' knowledge, provides the first practical assembler for a mammalian-sized genome on a low-end computing cluster.

Abstract: De novo genome sequence assembly is important both to generate new sequence assemblies for previously uncharacterized genomes and to identify the genome sequence of individuals in a reference-unbiased way. We present memory efficient data structures and algorithms for assembly using the FM-index derived from the compressed Burrows-Wheeler transform, and a new assembler based on these called SGA (String Graph Assembler). We describe algorithms to error-correct, assemble, and scaffold large sets of sequence data. SGA uses the overlap-based string graph model of assembly, unlike most de novo assemblers that rely on de Bruijn graphs, and is simply parallelizable. We demonstrate the error correction and assembly performance of SGA on 1.2 billion sequence reads from a human genome, which we are able to assemble using 54 GB of memory. The resulting contigs are highly accurate and contiguous, while covering 95% of the reference genome (excluding contigs <200 bp in length). Because of the low memory requirements and parallelization without requiring inter-process communication, SGA provides the first practical assembler to our knowledge for a mammalian-sized genome on a low-end computing cluster.

811 citations
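
The SGA abstract above mentions an FM-index derived from the Burrows-Wheeler transform. The following is a minimal sketch of that general idea, assuming a naive rotation-sort construction and a linear-time rank computation; the names `bwt` and `count_occurrences` are hypothetical and this is not SGA's implementation, which relies on compressed, sampled structures to keep memory low.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations; '$' marks the end."""
    text += "$"
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rotation[-1] for rotation in rotations)


def count_occurrences(bwt_text, pattern):
    """Count occurrences of pattern in the original text by backward search."""
    # first_row[c]: number of characters in the text strictly smaller than c.
    counts = {}
    for ch in bwt_text:
        counts[ch] = counts.get(ch, 0) + 1
    first_row, total = {}, 0
    for ch in sorted(counts):
        first_row[ch] = total
        total += counts[ch]

    lo, hi = 0, len(bwt_text)  # half-open range of matching sorted suffixes
    for ch in reversed(pattern):
        if ch not in first_row:
            return 0
        # Naive rank: count ch in a prefix of the BWT (a real index samples this).
        lo = first_row[ch] + bwt_text[:lo].count(ch)
        hi = first_row[ch] + bwt_text[:hi].count(ch)
        if lo >= hi:
            return 0
    return hi - lo


transformed = bwt("banana")
print(transformed)
print(count_occurrences(transformed, "ana"))
```

The search only touches the transformed string and a small table of character counts, which hints at why such an index can be held in far less memory than the raw overlaps between reads.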

Journal Article

De novo assembly and genotyping of variants using colored de Bruijn graphs

Zamin Iqbal¹, Mario Caccamo², Isaac Turner¹, Paul Flicek³, Gil McVean¹,⁴•Institutions (4)

Wellcome Trust Centre for Human Genetics¹, Norwich Research Park², European Bioinformatics Institute³, University of Oxford⁴

01 Feb 2012-Nature Genetics

TL;DR: The authors provide Cortex, an efficient software implementation and the first de novo assembler capable of assembling multiple eukaryotic genomes simultaneously, and show how population information from ten chimpanzees enables accurate variant calls without a reference sequence.

Abstract: Gil McVean and colleagues report algorithms for de novo assembly and genotyping of variants using colored de Bruijn graphs and provide these in a software implementation called Cortex. Their methods can detect and genotype both simple and complex genetic variants in either an individual or a population.

695 citations

Journal Article

How to apply de Bruijn graphs to genome assembly

Phillip E. C. Compeau¹, Pavel A. Pevzner¹, Glenn Tesler¹•Institutions (1)

University of California, San Diego¹

01 Nov 2011-Nature Biotechnology

TL;DR: A mathematical concept known as a de Bruijn graph turns the formidable challenge of assembling a contiguous genome from billions of short sequencing reads into a tractable computational problem.

623 citations

Journal Article

MetaVelvet: an extension of Velvet assembler to de novo metagenome assembly from short sequence reads

Toshiaki Namiki¹, Tsuyoshi Hachiya¹, H. Tanaka¹, Yasubumi Sakakibara¹•Institutions (1)

Keio University¹

01 Nov 2012-Nucleic Acids Research

TL;DR: An important step in ‘metagenomics’ analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community, and a single-genome assembler for short reads was extended to metagenome assembly.

Abstract: An important step in 'metagenomics' analysis is the assembly of multiple genomes from mixed sequence reads of multiple species in a microbial community. Most conventional pipelines use a single-genome assembler with carefully optimized parameters. A limitation of a single-genome assembler for de novo metagenome assembly is that sequences of highly abundant species are likely misidentified as repeats in a single genome, resulting in a number of small fragmented scaffolds. We extended a single-genome assembler for short reads, known as 'Velvet', to metagenome assembly, which we called 'MetaVelvet', for mixed short reads of multiple species. Our fundamental concept was to first decompose a de Bruijn graph constructed from mixed short reads into individual sub-graphs, and second, to build scaffolds based on each decomposed de Bruijn sub-graph as an isolate species genome. We made use of two features, the coverage (abundance) difference and graph connectivity, for the decomposition of the de Bruijn graph. For simulated datasets, MetaVelvet succeeded in generating significantly higher N50 scores than any single-genome assemblers. MetaVelvet also reconstructed relatively low-coverage genome sequences as scaffolds. On real datasets of human gut microbial read data, MetaVelvet produced longer scaffolds and increased the number of predicted genes.

591 citations
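
Several abstracts above report results as N50 lengths or N50 scores. As a hedged illustration of the commonly used definition (the largest length L such that contigs of length at least L together cover at least half of the total assembled length), a small sketch follows. The function name `n50` and the example lengths are assumptions made for this example and are not taken from any of the listed papers.

```python
def n50(lengths):
    """Return the N50 of a list of contig (or scaffold) lengths."""
    total = sum(lengths)
    running = 0
    # Walk from the longest contig down until half of the total is covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


# Invented contig lengths, for illustration only.
print(n50([80, 70, 50, 40, 30, 20]))
```

A higher N50 means that more of the assembly sits in long contigs, which is why the abstracts use it to compare assemblers.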

Performance

Metrics

Papers: 1,519

Citations: 31,440

No. of papers in the topic in previous years
Year	Papers
2023	36
2022	75
2021	50
2020	73
2019	80
2018	65
