Journal ArticleDOI

RNA-Seq: a revolutionary tool for transcriptomics

01 Jan 2009-Nature Reviews Genetics (Nature Publishing Group)-Vol. 10, Iss: 1, pp 57-63
TL;DR: RNA-Seq, an approach to transcriptome profiling that uses deep-sequencing technologies, provides a far more precise measurement of the levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.


Citations
Journal ArticleDOI
TL;DR: In this article, the authors describe the techniques and sources of omics data, outline network theory, and highlight exemplars of novel approaches that combine gene regulatory and co-expression networks, proteomics, metabolomics, lipidomics and phenomics with informatics techniques to provide new insights into cardiovascular disease.
Abstract: Omics techniques generate large, multidimensional data that are amenable to analysis by new informatics approaches alongside conventional statistical methods. Systems theories, including network analysis and machine learning, are well placed for analysing these data but must be applied with an understanding of the relevant biological and computational theories. Through applying these techniques to omics data, systems biology addresses the problems posed by the complex organization of biological processes. In this Review, we describe the techniques and sources of omics data, outline network theory, and highlight exemplars of novel approaches that combine gene regulatory and co-expression networks, proteomics, metabolomics, lipidomics and phenomics with informatics techniques to provide new insights into cardiovascular disease. The use of systems approaches will become necessary to integrate data from more than one omic technique. Although understanding the interactions between different omics data requires increasingly complex concepts and methods, we argue that hypothesis-driven investigations and independent validation must still accompany these novel systems biology approaches to realize their full potential.

102 citations

Journal ArticleDOI
Hong Xu1, Yi Gao1, Jianbo Wang1
08 Feb 2012-PLOS ONE
TL;DR: The transcriptome profiling analysis of rice developing embryos is reported using RNA-Seq as an attempt to gain insight into the molecular and cellular events associated with rice embryogenesis, and it is found that many transcription factor families may play important roles at different developmental stages.
Abstract: Rice (Oryza sativa) is an excellent model monocot with a known genome sequence for studying embryogenesis. Here we report the transcriptome profiling analysis of rice developing embryos using RNA-Seq as an attempt to gain insight into the molecular and cellular events associated with rice embryogenesis. RNA-Seq analysis generated 17,755,890 sequence reads aligned with 27,190 genes, which provided abundant data for the analysis of rice embryogenesis. A total of 23,971, 23,732, and 23,592 genes were identified from embryos at three developmental stages (3–5, 7, and 14 DAP), while an analysis between stages allowed the identification of a subset of stage-specific genes. The number of genes expressed stage-specifically was 1,131, 1,443, and 1,223, respectively. In addition, we investigated transcriptomic changes during rice embryogenesis based on our RNA-Seq data. A total of 1,011 differentially expressed genes (DEGs) (log2Ratio ≥1, FDR ≤0.001) were identified; thus, the transcriptome of the developing rice embryos changed considerably. A total of 672 genes with significant changes in expression were detected between 3–5 and 7 DAP; 504 DEGs were identified between 7 and 14 DAP. A large number of genes related to metabolism, transcriptional regulation, nucleic acid replication/processing, and signal transduction were expressed predominantly in the early and middle stages of embryogenesis. Protein biosynthesis-related genes accumulated predominantly in embryos at the middle stage. Genes for starch/sucrose metabolism and protein modification were highly expressed in the middle and late stages of embryogenesis. In addition, we found that many transcription factor families may play important roles at different developmental stages, not only in embryo initiation but also in other developmental processes. 
These results will expand our understanding of the complex molecular and cellular events in rice embryogenesis and provide a foundation for future studies on embryo development in rice and other cereal crops.
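The DEG thresholds quoted in the abstract above (log2Ratio ≥1, FDR ≤0.001) can be illustrated with a minimal sketch. The gene names, the values, and the `is_deg` helper are hypothetical, and taking the absolute value of the log2 ratio (so that both up- and down-regulated genes qualify) is an assumption rather than something the abstract states.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str          # hypothetical gene identifier
    log2_ratio: float  # log2 of the expression ratio between two stages
    fdr: float         # false discovery rate for the comparison


def is_deg(stat: GeneStat, min_abs_log2: float = 1.0, max_fdr: float = 0.001) -> bool:
    """Return True when a gene passes both thresholds.

    Assumption: the log2 ratio is compared by absolute value, so a
    two-fold decrease counts the same as a two-fold increase.
    """
    return abs(stat.log2_ratio) >= min_abs_log2 and stat.fdr <= max_fdr


# Made-up example values, for illustration only.
stats = [
    GeneStat("geneA", 2.3, 0.0002),    # passes both thresholds
    GeneStat("geneB", -1.5, 0.0009),   # down-regulated, passes both
    GeneStat("geneC", 0.4, 0.00001),   # significant, but the change is too small
    GeneStat("geneD", 3.1, 0.02),      # large change, but FDR is too high
]

degs = [s.gene for s in stats if is_deg(s)]
print(degs)
```

Applying both thresholds together keeps only genes whose change is both large and statistically well supported, which is why `geneC` and `geneD` are excluded here.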

102 citations


Cites background or methods from "RNA-Seq: a revolutionary tool for t..."

  • ...RNA-Seq is more suitable and affordable for comparative gene expression studies because it verifies direct transcript profiling without compromise and potential bias, thus allowing for more sensitive and accurate profiling of the transcriptome that more closely resembles the biology of the cell [15,17]....

    [...]

  • ...More recently developed RNA deep-sequencing technologies, such as digital gene expression (DGE) [14] and Solexa/Illumina RNA-Seq [15], will dramatically change the methods used to identify embryogenesis-related genes in plants because these technologies facilitate investigations of the functional complexity of transcriptomes....

    [...]

Journal ArticleDOI
TL;DR: The main sources of AMR attributable to veterinary medicine are described, drawing attention to the inseparable cross-talk between diverse ecosystems and sectors and their cumulative contribution to this alarming phenomenon.
Abstract: Antimicrobial resistance (AMR) represents one of the most important human- and animal health-threatening issues worldwide. Bacterial capability to face antimicrobial compounds is an ancient feature, enabling bacterial survival over time and the dynamic surrounding. Moreover, bacteria make use of their evolutionary machinery to adapt to the selective pressure exerted by antibiotic treatments, resulting in reduced efficacy of the therapeutic intervention against human and animal infections. The mechanisms responsible for both innate and acquired AMR are thoroughly investigated. Commonly, AMR traits are included in mobilizable genetic elements enabling the homogeneous diffusion of the AMR traits pool between the ecosystems of diverse sectors, such as human medicine, veterinary medicine, and the environment. Thus, a coordinated multisectoral approach, such as One-Health, provides a detailed comprehensive picture of the AMR onset and diffusion. Following a general revision of the molecular mechanisms responsible for both innate and acquired AMR, the present manuscript focuses on reviewing the contribution of veterinary medicine to the overall issue of AMR. The main sources of AMR amenable to veterinary medicine are described, driving the attention towards the indissoluble cross-talk existing between the diverse ecosystems and sectors and their cumulative cooperation to this warning phenomenon.

102 citations

Journal ArticleDOI
TL;DR: The recent advances in systems biology are reviewed, and the biological and computational challenges posed in this area are highlighted.
Abstract: The goal of systems biology is to access and integrate information about the parts (e.g., genes, proteins, cells) of a biological system with a view to computing and predicting the behavior of the system. The past decade has witnessed technological revolutions in the capacity to make high throughput measurements about the behavior of genes, proteins, and cells. Such technologies are widely used in biological research and in medicine, such as toward prognosis and therapy response prediction in cancer patients. More recently, systems biology is being applied to vaccinology, with the goal of: (1) understanding the mechanisms by which vaccines stimulate protective immunity, and (2) predicting the immunogenicity or efficacy of vaccines. Here, we review the recent advances in this area, and highlight the biological and computational challenges posed.

102 citations

Journal ArticleDOI
TL;DR: The ribosomal RNA depletion protocol from Illumina works very well at amounts far below recommendation and over a good range of intact and degraded material, and the exome-capture protocol (RNA Access, Illumina) performs better than other methods on highly degraded and low amount samples.
Abstract: RNA-sequencing (RNA-seq) has emerged as one of the most sensitive tools for gene expression analysis. Among the library preparation methods available, the standard poly(A)+ enrichment provides a comprehensive, detailed, and accurate view of polyadenylated RNAs. However, on samples of suboptimal quality, ribosomal RNA depletion and exon capture methods have recently been reported as better alternatives. We compared for the first time three commercial Illumina library preparation kits (TruSeq Stranded mRNA, TruSeq Ribo-Zero rRNA Removal, and TruSeq RNA Access) as representatives of these three different approaches using well-established human reference RNA samples from the MAQC/SEQC consortium on a wide range of input amounts (from 100 ng down to 1 ng) and degradation levels (intact, degraded, and highly degraded). We assessed the accuracy of the generated expression values by comparison to gold standard TaqMan qPCR measurements and gained unprecedented insight into the limits of applicability in terms of input quantity and sample quality of each protocol. We found that each protocol generates highly reproducible results (R² > 0.92) on intact RNA samples down to input amounts of 10 ng. For degraded RNA samples, Ribo-Zero showed clear performance advantages over the other two protocols as it generated more accurate and more reproducible gene expression results even at very low input amounts such as 1 ng and 2 ng. For highly degraded RNA samples, RNA Access performed best, generating reliable data down to 5 ng input. We found that the ribosomal RNA depletion protocol from Illumina works very well at amounts far below recommendation and over a good range of intact and degraded material. We also infer that the exome-capture protocol (RNA Access, Illumina) performs better than other methods on highly degraded and low amount samples.

102 citations

References
Journal ArticleDOI
TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10⁵ distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices. The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist. These include hybridization and cross-hybridization artifacts [1–3], dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations

PatentDOI
04 Oct 2000-Science
TL;DR: Serial analysis of gene expression (SAGE) should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states.
Abstract: PROBLEM TO BE SOLVED: To provide a method for preparing a short nucleotide sequence (tag) that is useful for identifying a cDNA oligonucleotide and is derived from a restricted position in an mRNA or a cDNA. SOLUTION: A method of preparing a tag for identifying the cDNA oligonucleotide. The method comprises preparing the cDNA oligonucleotide bearing 5' and 3' terminals, collecting cDNA fragments by cutting the cDNA oligonucleotide with a restriction enzyme at the first restriction endonuclease site, isolating a cDNA fragment bearing the 5' or 3' terminal of the cDNA oligonucleotide, and connecting an oligonucleotide linker to the isolated cDNA fragment. Here, the oligonucleotide linker contains the recognition site of the second restriction endonuclease, and the isolated cDNA fragment is cut with the second restriction endonuclease, which cuts the cDNA fragment at a position separated from the recognition site, to obtain the tag for identifying the cDNA oligonucleotide.

4,437 citations

Journal ArticleDOI
TL;DR: This work describes MAQ, software that builds assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Abstract: New sequencing technologies promise a new era in the use of DNA sequence. However, some of these technologies produce very short reads, typically of a few tens of base pairs, and to use these reads effectively requires new algorithms and software. In particular, there is a major issue in efficiently aligning short reads to a reference genome and handling ambiguity or lack of accuracy in this alignment. Here we introduce the concept of mapping quality, a measure of the confidence that a read actually comes from the position it is aligned to by the mapping algorithm. We describe the software MAQ that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample. MAQ makes full use of mate-pair information and estimates the error probability of each read alignment. Error probabilities are also derived for the final genotype calls, using a Bayesian statistical model that incorporates the mapping qualities, error probabilities from the raw sequence quality scores, sampling of the two haplotypes, and an empirical model for correlated errors at a site. Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net.
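The mapping-quality idea in the abstract above can be made concrete with a short sketch. It assumes the conventional Phred scaling, Q = -10 log10(p), where p is the probability that a read has been placed at the wrong position. The function names and the cap value are illustrative assumptions and are not taken from MAQ's source code.

```python
import math


def mapping_quality(p_wrong: float, cap: int = 99) -> int:
    """Convert the probability of a wrong alignment to a Phred-scaled score.

    Assumptions: Phred scaling (Q = -10 * log10(p)) and an arbitrary
    upper cap so that p = 0 does not produce an infinite score.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        return cap
    return min(cap, round(-10.0 * math.log10(p_wrong)))


def error_probability(quality: float) -> float:
    """Invert the Phred scaling: probability that the alignment is wrong."""
    return 10.0 ** (-quality / 10.0)


# A read with a 1-in-1000 chance of being misplaced scores 30;
# a read that could equally well map to two places (p = 0.5) scores 3.
for p in (0.001, 0.5, 1.0):
    print(p, mapping_quality(p))
```

On this scale a higher score means greater confidence in the alignment, so a downstream step such as genotype calling can down-weight or discard reads with low scores.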

2,927 citations

Journal ArticleDOI
TL;DR: It is found that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane).
Abstract: Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.

2,834 citations

Journal ArticleDOI
TL;DR: The program SOAP is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology; it supports multi-threaded parallel computing and has a batch module for multiple query sets.
Abstract: Summary: We have developed a program SOAP for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The program is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology. SOAP is compatible with numerous applications, including single-read or pair-end resequencing, small RNA discovery and mRNA tag sequence mapping. SOAP is a command-driven program, which supports multi-threaded parallel computing, and has a batch module for multiple query sets. Availability: http://soap.genomics.org.cn Contact: soap@genomics.org.cn

2,729 citations


"RNA-Seq: a revolutionary tool for t..." refers to methods in this paper

  • ...There are several programs for mapping reads to the genome, including ELAND, SOA...

    [...]