Journal ArticleDOI

RNA-Seq: a revolutionary tool for transcriptomics

01 Jan 2009-Nature Reviews Genetics (Nature Publishing Group)-Vol. 10, Iss: 1, pp 57-63
TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.


Citations
Journal ArticleDOI
TL;DR: An overview of the wheat genome and NGS technologies is provided, some of the problems in applying NGS technology to wheat are detailed, and how NGS technologies are starting to impact wheat crop improvement is described.
Abstract: Bread wheat (Triticum aestivum; Poaceae) is a crop plant of great importance. It provides nearly 20% of the world’s daily food supply measured by calorie intake, similar to that provided by rice. The yield of wheat has doubled over the last 40 years due to a combination of advanced agronomic practice and improved germplasm through selective breeding. More recently, yield growth has been less dramatic, and a significant improvement in wheat production will be required if demand from the growing human population is to be met. Next-generation sequencing (NGS) technologies are revolutionizing biology and can be applied to address critical issues in plant biology. Technologies can produce draft sequences of genomes with a significant reduction to the cost and timeframe of traditional technologies. In addition, NGS technologies can be used to assess gene structure and expression, and importantly, to identify heritable genome variation underlying important agronomic traits. This review provides an overview of the wheat genome and NGS technologies, details some of the problems in applying NGS technology to wheat, and describes how NGS technologies are starting to impact wheat crop improvement.

91 citations

Journal ArticleDOI
07 Nov 2013-PLOS ONE
TL;DR: For the first time, de novo transcriptome sequencing is performed and mapped to the moso bamboo genomic resources (reference genome and genes) to produce a comprehensive dataset for the fast growing shoots of moso bamboo.
Abstract: Background: The moso bamboo, a large woody bamboo with the highest ecological, economic, and cultural value of all bamboos, has one of the highest growth speeds in the world. Genetic research into moso bamboo has been scarce, partly because of the lack of previous genomic resources. In the present study, for the first time, we performed de novo transcriptome sequencing and mapped to the moso bamboo genomic resources (reference genome and genes) to produce a comprehensive dataset for the fast growing shoots of moso bamboo. Results: The fast growing shoots mixed with six different heights and culms after leaf expansion of moso bamboo transcriptome were sequenced using the Illumina HiSeq™ 2000 sequencing platform, respectively. More than 80 million reads including 65,045,670 and 68,431,884 clean reads were produced in the two libraries. More than 81% of the reads were matched to the reference genome, and nearly 50% of the reads were matched to the reference genes. The genes with log2 ratio > 2 or < −2 (P < 0.001) were characterized as the most differentially expressed genes. 6,076 up-regulated and 4,613 down-regulated genes were classified into functional categories. Candidate genes which mainly involved transcript factors, plant hormones, cell cycle regulation, cell wall metabolism and cell morphogenesis genes were further analyzed and they may form a network that regulates the fast growth of moso bamboo shoots. Conclusion: Firstly, our data provides the most comprehensive transcriptomic resource for moso bamboo to date. Candidate genes have been identified and they are potentially involved in the growth and development of moso bamboo. The results give a better insight into the mechanisms of moso bamboo shoots rapid growth and provide gene resources for improving plant growth.
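The differential-expression cutoff in this abstract can be illustrated with a minimal Python sketch, reading the threshold as a log2 ratio above 2 or below −2 with P < 0.001. The function names, the pseudocount of 1 (added only to avoid division by zero) and the example values are assumptions for illustration; the abstract does not say how the ratio or the P value were computed.

```python
import math


def log2_ratio(treated: float, control: float, pseudocount: float = 1.0) -> float:
    """Return log2 of the expression ratio between two samples.

    The pseudocount is an illustrative assumption, not taken from the study.
    """
    return math.log2((treated + pseudocount) / (control + pseudocount))


def classify(treated: float, control: float, p_value: float,
             ratio_cutoff: float = 2.0, p_cutoff: float = 0.001) -> str:
    """Label a gene using a log2-ratio cutoff combined with a P-value cutoff."""
    if p_value >= p_cutoff:
        return "unchanged"
    ratio = log2_ratio(treated, control)
    if ratio > ratio_cutoff:
        return "up-regulated"
    if ratio < -ratio_cutoff:
        return "down-regulated"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical expression values for a single gene in two libraries.
    print(classify(treated=31, control=1, p_value=0.0001))
```

A log2 ratio of 2 corresponds to a four-fold change, so this cutoff keeps only genes that differ by more than four-fold in either direction.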

91 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...Transcriptome analysis is essential in interpreting the functional elements of the genome and reveals the molecular constituents of cells and tissues [27,28]....


Journal ArticleDOI
TL;DR: Transcriptional mutagenesis, in which an RNA polymerase bypasses DNA damage in the transcribed strand of an active gene and miscodes at the damaged site, could lead to the production of mutant proteins and might therefore be important in tumour development.
Abstract: The majority of human cells do not multiply continuously but are quiescent or slow-replicating and devote a large part of their energy to transcription. When DNA damage in the transcribed strand of an active gene is bypassed by an RNA polymerase, the polymerase can miscode at the damaged site and produce mutant transcripts. This process is known as transcriptional mutagenesis and, as discussed in this Perspective, could lead to the production of mutant proteins and might therefore be important in tumour development.

91 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...For example, massively parallel sequencing of cellular mRNA population...


Journal ArticleDOI
TL;DR: This study identified a total of 187 genes whose expression differs in response to hard and soft diets, including immediate early genes, extracellular matrix genes and inflammatory factors, which opens up new avenues of research at new levels of biological organization into the roles of phenotypic plasticity during speciation and radiation of cichlid fishes.
Abstract: Adaptive phenotypic plasticity, the ability of an organism to change its phenotype to match local environments, is increasingly recognized for its contribution to evolution. However, few empirical studies have explored the molecular basis of plastic traits. The East African cichlid fish Astatoreochromis alluaudi displays adaptive phenotypic plasticity in its pharyngeal jaw apparatus, a structure that is widely seen as an evolutionary key innovation that has contributed to the remarkable diversity of cichlid fishes. It has previously been shown that in response to different diets, the pharyngeal jaws change their size, shape and dentition: hard diets induce an adaptive robust molariform tooth phenotype with short jaws and strong internal bone structures, while soft diets induce a gracile papilliform tooth phenotype with elongated jaws and slender internal bone structures. To gain insight into the molecular underpinnings of these adaptations and enable future investigations of the role that phenotypic plasticity plays during the formation of adaptive radiations, the transcriptomes of the two divergent jaw phenotypes were examined. Our study identified a total of 187 genes whose expression differs in response to hard and soft diets, including immediate early genes, extracellular matrix genes and inflammatory factors. Transcriptome results are interpreted in light of expression of candidate genes—markers for tooth size and shape, bone cells and mechanically sensitive pathways. This study opens up new avenues of research at new levels of biological organization into the roles of phenotypic plasticity during speciation and radiation of cichlid fishes.

91 citations


Cites methods from "RNA-Seq: a revolutionary tool for t..."

  • ...Orthology was confirmed through construction of maximum-likelihood trees using Jalview (Waterhouse et al. 2009) and PhyML (Guindon et al. 2010) (Fig....


Journal ArticleDOI
TL;DR: The transcriptome of the human myometrial samples taken from patients prior to and after the onset of spontaneous labour is described for the first time, documenting a significant number of novel transcripts of both protein‐coding mRNA and microRNA.
Abstract: The transition of the human uterus from a quiescent to a contractile state takes place over a number of weeks. On such biological time scales, cellular phenotype is modified by changes in the transcriptome, which in turn is under the control of the underlying endocrine, paracrine, and biophysical processes resulting from the ongoing pregnancy. In this study, we characterize the transition of the human myometrial transcriptome at term from not in labour (NIL) to in labour (LAB) using high throughput RNA sequencing (RNA-seq). RNA was isolated from the myometrium of uterine biopsies from patients at term who were not in labour (n = 5) and at term in spontaneous labour (n = 5) without augmentation. A total of 143.6 million separate reads were sequenced, achieving, on average, ∼13 times coverage of the expressed human transcriptome per sample. Principal component analysis indicated that the NIL and LAB transcriptomes could be distinguished as two distinct clusters. A comparison of the NIL and LAB groups, using three different statistical approaches (baySeq, edgeR, and DESeq), demonstrated an overlap of 764 differentially expressed genes. A comparison with currently available microarray data revealed only a partial overlap in differentially expressed genes. We conclude that the described RNA-seq data sets represent the first fully annotated catalogue of expressed mRNAs in human myometrium. When considered together, the full expression repertoire and the differentially expressed gene sets should provide an excellent resource for formulating new hypotheses of physiological function, as well as the discovery of novel therapeutic targets.

91 citations


Cites background or methods from "RNA-Seq: a revolutionary tool for t..."

  • ...We employed this strategy because although RNA-seq data offer unprecedented resolution and reproducibility (Marioni et al. 2008; Wang et al. 2009), analysis of complete transcripts rather than individual probes on subregions of transcripts is challenging (Wagner et al. 2012)....


  • ...Although not completely replacing other technologies, such as microarray (Kogenaru et al. 2012), arrays in general are less sensitive than RNA-seq and have limited ability to detect alternative spliced mRNA isoforms (Wang et al. 2009; McGettigan, 2013)....


References
Journal ArticleDOI
TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10^5 distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices. The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist. These include hybridization and cross-hybridization artifacts [1–3], dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations

PatentDOI
04 Oct 2000-Science
TL;DR: Serial analysis of gene expression (SAGE) should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states.
Abstract: PROBLEM TO BE SOLVED: To provide a method for preparing a short nucleotide sequence (tag) which is useful to identify a cDNA oligonucleotide and is derived from a restricted position in an mRNA or a cDNA. SOLUTION: This is the method of preparing a tag for identifying the cDNA oligonucleotide. The above method comprises preparing the cDNA oligonucleotide bearing 5' and 3' terminals, collecting cDNA fragments by cutting the cDNA oligonucleotide with a restriction enzyme at the first restriction endonuclease site, separating a cDNA oligonucleotide bearing 5' or 3' terminal and connecting an oligonucleotide linker to the isolated cDNA fragment bearing the cDNA oligonucleotide 5' or 3' terminal. Here, the oligonucleotide linker contains the recognition site of the second restriction endonuclease enzyme and the isolated cDNA fragment is cut with the second restriction endonuclease enzyme which cuts the cDNA fragment in a section separated from the recognition site to obtain the tag for identifying the cDNA oligonucleotide.

4,437 citations

Journal ArticleDOI
TL;DR: This work describes MAQ, software that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Abstract: New sequencing technologies promise a new era in the use of DNA sequence. However, some of these technologies produce very short reads, typically of a few tens of base pairs, and to use these reads effectively requires new algorithms and software. In particular, there is a major issue in efficiently aligning short reads to a reference genome and handling ambiguity or lack of accuracy in this alignment. Here we introduce the concept of mapping quality, a measure of the confidence that a read actually comes from the position it is aligned to by the mapping algorithm. We describe the software MAQ that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample. MAQ makes full use of mate-pair information and estimates the error probability of each read alignment. Error probabilities are also derived for the final genotype calls, using a Bayesian statistical model that incorporates the mapping qualities, error probabilities from the raw sequence quality scores, sampling of the two haplotypes, and an empirical model for correlated errors at a site. Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net.
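The idea of a mapping quality can be sketched in Python under one stated assumption: that the confidence is reported on the Phred scale, Q = −10 · log10(P), where P is the probability that the read is aligned to the wrong position. The abstract itself only says that an error probability is estimated for each alignment; the function names and the cap of 99 below are hypothetical and are not taken from the MAQ software.

```python
import math


def mapping_quality(p_wrong: float, cap: int = 99) -> int:
    """Convert the probability of a wrong alignment into a Phred-scaled score.

    The cap is an illustrative assumption that keeps P = 0 finite.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        return cap
    return min(cap, round(-10 * math.log10(p_wrong)))


def error_probability(quality: float) -> float:
    """Invert the Phred scale: the probability that the alignment is wrong."""
    return 10 ** (-quality / 10)


if __name__ == "__main__":
    # A read with a 1-in-1000 chance of being misplaced.
    print(mapping_quality(0.001))
    print(error_probability(20))
```

On this scale each 10-point increase means a ten-fold drop in the chance that the read is misplaced, which makes it easy to filter out ambiguous alignments with a single cutoff.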

2,927 citations

Journal ArticleDOI
TL;DR: It is found that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane).
Abstract: Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.

2,834 citations

Journal ArticleDOI
TL;DR: The program SOAP is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology, which supports multi-threaded parallel computing and has a batch module for multiple query sets.
Abstract: Summary: We have developed a program SOAP for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The program is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology. SOAP is compatible with numerous applications, including single-read or pair-end resequencing, small RNA discovery and mRNA tag sequence mapping. SOAP is a command-driven program, which supports multi-threaded parallel computing, and has a batch module for multiple query sets. Availability: http://soap.genomics.org.cn Contact: soap@genomics.org.cn

2,729 citations


"RNA-Seq: a revolutionary tool for t..." refers methods in this paper

  • ...There are several programs for mapping reads to the genome, including ELAND, SOA...
