Open Access, Journal Article

Optimization of de novo transcriptome assembly from next-generation sequencing data

Yann Surget-Groba, +1 more
01 Oct 2010, Vol. 20, Iss. 10, pp. 1432-1440
TLDR
Two new methods that substantially improve de novo transcriptome assembly successfully assembled the transcripts of the core set of genes regulating tooth development in vertebrates, whereas classic de novo assembly failed.
Abstract
Transcriptome analysis has important applications in many biological fields. However, assembling a transcriptome without a known reference remains a challenging task requiring algorithmic improvements. We present two methods for substantially improving transcriptome de novo assembly. The first method relies on the observation that the use of a single k-mer length by current de novo assemblers is suboptimal to assemble transcriptomes where the sequence coverage of transcripts is highly heterogeneous. We present the Multiple-k method in which various k-mer lengths are used for de novo transcriptome assembly. We demonstrate its good performance by assembling de novo a published next-generation transcriptome sequence data set of Aedes aegypti, using the existing genome to check the accuracy of our method. The second method relies on the use of a reference proteome to improve the de novo assembly. We developed the Scaffolding using Translation Mapping (STM) method that uses mapping against the closest available reference proteome for scaffolding contigs that map onto the same protein. In a controlled experiment using simulated data, we show that the STM method considerably improves the assembly, with few errors. We applied these two methods to assemble the transcriptome of the non-model catfish Loricaria gr. cataphracta. Using the Multiple-k and STM methods, the assembly increases in contiguity and in gene identification, showing that our methods clearly improve quality and can be widely used. The new methods were used to assemble successfully the transcripts of the core set of genes regulating tooth development in vertebrates, while classic de novo assembly failed.
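The core idea of STM as described in the abstract, scaffolding contigs that map onto the same reference protein, can be illustrated with a minimal sketch. This is not the authors' implementation: the `Hit` record, its field names, and the `scaffold_by_protein` function are hypothetical, and the sketch assumes the contig-to-protein alignments (for example from a translated search) have already been computed and reduced to one hit per contig.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One contig-to-protein alignment (hypothetical record layout)."""
    contig: str
    protein: str
    start: int  # first aligned residue on the protein
    end: int    # last aligned residue on the protein


def scaffold_by_protein(hits):
    """Group contigs by the reference protein they map to and order
    them by their position along that protein.

    Only proteins hit by at least two contigs yield a scaffold; a
    single contig has nothing to be joined with.
    """
    by_protein = defaultdict(list)
    for hit in hits:
        by_protein[hit.protein].append(hit)

    scaffolds = {}
    for protein, group in by_protein.items():
        if len(group) < 2:
            continue
        # Order contigs along the protein, from N-terminus to C-terminus.
        group.sort(key=lambda h: (h.start, h.end))
        scaffolds[protein] = [h.contig for h in group]
    return scaffolds


if __name__ == "__main__":
    example_hits = [
        Hit("contig3", "P1", 200, 300),
        Hit("contig1", "P1", 1, 90),
        Hit("contig2", "P2", 5, 50),
        Hit("contig4", "P1", 95, 180),
    ]
    print(scaffold_by_protein(example_hits))
    # {'P1': ['contig1', 'contig4', 'contig3']}
```

In this toy input, three contigs map to protein P1 and are returned in protein order, while the lone contig on P2 is left unscaffolded. A real pipeline would also need to handle strand, overlapping hits, and contigs that match several proteins, none of which the sketch addresses.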


Citations
Journal Article

Oases: Robust de novo RNA-seq assembly across the dynamic range of expression levels

TL;DR: A software package named Oases, designed to heuristically assemble RNA-seq reads in the absence of a reference genome, across a broad spectrum of expression values and in the presence of alternative isoforms, is presented.
Journal Article

Next-generation transcriptome assembly

TL;DR: This Review summarizes the recent developments in transcriptome assembly approaches (reference-based, de novo and combined strategies), along with some perspectives on transcriptome assembly in the near future.
Journal Article

Computational methods for transcriptome annotation and quantification using RNA-seq

TL;DR: The major conceptual and practical challenges of high-throughput RNA sequencing, the general classes of solutions for each category, and the interdependence between these categories are highlighted and discussed.
Posted Content

SOAPdenovo-Trans: De novo transcriptome assembly with short RNA-Seq reads

TL;DR: SOAPdenovo-Trans is a de novo transcriptome assembler designed specifically for RNA-Seq that provides higher contiguity, lower redundancy, and faster execution.
References
Journal Article

Gapped BLAST and PSI-BLAST: a new generation of protein database search programs.

TL;DR: A new criterion for triggering the extension of word hits, combined with a new heuristic for generating gapped alignments, yields a gapped BLAST program that runs at approximately three times the speed of the original.
Journal Article

Mapping and quantifying mammalian transcriptomes by RNA-Seq.

TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Journal Article

RNA-Seq: a revolutionary tool for transcriptomics

TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.
Journal Article

Velvet: Algorithms for de novo short read assembly using de Bruijn graphs

TL;DR: Velvet represents a new approach to assembly that can leverage very short reads in combination with read pairs to produce useful assemblies and is in close agreement with simulated results without read-pair information.