Journal Article

RNA-Seq: a revolutionary tool for transcriptomics

01 Jan 2009 · Nature Reviews Genetics (Nature Publishing Group) · Vol. 10, Iss. 1, pp. 57–63
TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.


Citations
Journal Article
TL;DR: An overrepresentation of upregulated genes in transport, secondary metabolism, and cell wall and surface functions is found, and many candidate genes relevant to biofilm biology are identified for downstream functional experiments.
Abstract: Aspergillus fumigatus causes the most common and deadly pulmonary fungal infection worldwide. In the lung, the fungus usually forms a dense colony of filaments embedded in a polymeric extracellular matrix. To identify candidate genes involved in this biofilm (BF) growth, we used RNA-Seq to compare the transcriptomes of BF and liquid plankton (PL) growth. Sequencing and mapping of tens of millions of sequence reads against the A. fumigatus transcriptome identified 3,728 differentially regulated genes in the two conditions. Although many of these genes, including the ones coding for transcription factors, stress response, the ribosome, and the translation machinery, likely reflect the different growth demands in the two conditions, our experiment also identified hundreds of candidate genes for the observed differences in morphology and pathobiology between BF and PL. We found an overrepresentation of upregulated genes in transport, secondary metabolism, and cell wall and surface functions. Furthermore, upregulated genes showed significant spatial structure across the A. fumigatus genome; they were more likely to occur in subtelomeric regions and colocalized in 27 genomic neighborhoods, many of which overlapped with known or candidate secondary metabolism gene clusters. We also identified 1,164 genes that were downregulated. This gene set was not spatially structured across the genome and was overrepresented in genes participating in primary metabolic functions, including carbon and amino acid metabolism. These results add valuable insight into the genetics of biofilm formation in A. fumigatus and other filamentous fungi and identify many candidate genes, relevant to biofilm biology, for downstream functional experiments.

103 citations


Cites methods from "RNA-Seq: a revolutionary tool for t..."

  • ...fumigatus by using microarray and two-dimensional (2D) gel electrophoresis technologies identified only 700 genes and 40 proteins that were differentially abundant during development (8), RNA-Seq appears to be the most powerful tool for genome-wide functional comparisons of fungal growth to date (7, 49, 77, 79)....

    [...]

Journal Article
TL;DR: Recent advances in the emerging field of comparative physiological genomics are considered, including examples of plants, bees and fish, and opportunities for further development are outlined particularly in the context of climate change research.
Abstract: Organisms that live in variable environments must adjust their physiology to compensate for environmental change. Modern functional genomics technologies offer global top-down discovery-based tools for identifying and exploring the mechanistic basis by which organisms respond physiologically to a detected change in the environment. Given that populations and species from different niches may exhibit different acclimation abilities, comparative genomic approaches may offer more nuanced understanding of acclimation responses, and provide insight into the mechanistic and genomic basis of variable acclimation. The physiological genomics literature is large and growing, as is the comparative evolutionary genomics literature. Yet, expansion of physiological genomics experiments to exploit taxonomic variation remains relatively undeveloped. Here, recent advances in the emerging field of comparative physiological genomics are considered, including examples of plants, bees and fish, and opportunities for further development are outlined particularly in the context of climate change research. Elements of robust experimental design are discussed with emphasis on the phylogenetic comparative approach. Understanding how acclimation ability is partitioned among populations and species in nature, and knowledge of the relevant genes and mechanisms, will be important for characterizing and predicting the ecological and evolutionary consequences of human-accelerated environmental change.

103 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...Although microarrays are the more mature technology in terms of development, deployment and data analysis, massively parallel RNA sequencing is rapidly replacing microarrays for many applications (Wang et al., 2009)....

    [...]

Journal Article
TL;DR: It is demonstrated that a nonprotein-coding antisense transcript originating from a DOG1 region that is conserved at the DNA level, but not at the protein level, is a negative regulator of DOG1 expression and seed dormancy establishment, and it is proposed that this cis-constrained noncoding RNA-mediated mechanism limiting the duration of seed dormancy functions across the Brassicaceae.
Abstract: Seed dormancy is one of the most crucial process transitions in a plant's life cycle. Its timing is tightly controlled by the expression level of the Delay of Germination 1 gene (DOG1). DOG1 is the major quantitative trait locus for seed dormancy in Arabidopsis and has been shown to control dormancy in many other plant species. This is reflected by the evolutionary conservation of the functional short alternatively polyadenylated form of the DOG1 mRNA. Notably, the 3' region of DOG1, including the last exon that is not included in this transcript isoform, shows a high level of conservation at the DNA level, but the encoded polypeptide is poorly conserved. Here, we demonstrate that this region of DOG1 contains a promoter for the transcription of a noncoding antisense RNA, asDOG1, that is 5' capped, polyadenylated, and relatively stable. This promoter is autonomous and asDOG1 has an expression profile that is different from known DOG1 transcripts. Using several approaches we show that asDOG1 strongly suppresses DOG1 expression during seed maturation in cis, but is unable to do so in trans. Therefore, the negative regulation of seed dormancy by asDOG1 in cis results in allele-specific suppression of DOG1 expression and promotes germination. Given the evolutionary conservation of the asDOG1 promoter, we propose that this cis-constrained noncoding RNA-mediated mechanism limiting the duration of seed dormancy functions across the Brassicaceae.

103 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...Proc Natl Acad Sci USA 103(45): 17042–17047....

    [...]

  • ...This antisense transcript has a tissue-specific expression pattern, indicating that it is not generated by spurious transcriptional noise (45) (Fig....

    [...]

Journal Article
TL;DR: A label-free ribobase identification method is described, which uses ionic current measurement to resolve ribonucleoside monophosphates or diphosphates in α-hemolysin protein nanopores containing amino-cyclodextrin adapters.
Abstract: We describe a label-free ribobase identification method, which uses ionic current measurement to resolve ribonucleoside monophosphates or diphosphates in α-hemolysin protein nanopores containing amino-cyclodextrin adapters. The accuracy of base identification is further investigated through the use of a guanidino-modified adapter. On the basis of these findings, an exosequencing approach is envisioned in which a processive exoribonuclease (polynucleotide phosphorylase) presents sequentially cleaved ribonucleoside diphosphates to a nanopore.

103 citations

Journal Article
TL;DR: The latest studies on various types of RNA biomarkers, especially extracellular RNAs, in cancer diagnosis and prognosis are summarized, and several well-known RNA biomarkers of clinical utility are illustrated.
Abstract: As an essential part of the central dogma, RNA delivers genetic and regulatory information and reflects cellular states. Based on high-throughput sequencing technologies, accumulating data show that various RNA molecules are able to serve as biomarkers for the diagnosis and prognosis of various diseases, for instance, cancer. In particular, detectable in various bio-fluids, such as serum, saliva and urine, extracellular RNAs (exRNAs) are emerging as non-invasive biomarkers for earlier cancer diagnosis, tumor progression monitoring, and prediction of therapy response. In this review, we summarize the latest studies on various types of RNA biomarkers, especially extracellular RNAs, in cancer diagnosis and prognosis, and illustrate several well-known RNA biomarkers of clinical utility. In addition, we describe and discuss general procedures and issues in investigating exRNA biomarkers, and perspectives on the utility of exRNAs in precision medicine.

103 citations


Cites methods from "RNA-Seq: a revolutionary tool for t..."

  • ...In preparation of RNA-seq libraries, RNA transcripts are fragmentized and reverse transcribed into cDNAs [96]....

    [...]

References
Journal Article
TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10^5 distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices. The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist.
These include hybridization and cross-hybridization artifacts (1–3), dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations
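
The abstract above describes expression as a "digital measure" based on how often each gene appears among the sequenced reads, but it gives no formula. A minimal sketch of one widely used normalization of this kind, reads per kilobase of transcript per million mapped reads (RPKM), is shown below. The function name and the example numbers are hypothetical and chosen only for illustration.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads.

    Dividing by gene length (in kb) and sequencing depth (in millions of
    mapped reads) makes counts comparable across genes and across samples.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical example: 500 reads on a 2,000 bp transcript in a sample
# with 40 million mapped reads.
example_value = rpkm(500, 2_000, 40_000_000)
print(example_value)
```

With these illustrative inputs, doubling the sequencing depth while the read count stays fixed halves the value, which is the depth correction the normalization is meant to provide.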

Patent
04 Oct 2000 · Science
TL;DR: Serial analysis of gene expression (SAGE) should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states.
Abstract: PROBLEM TO BE SOLVED: To provide a method for preparing a short nucleotide sequence (tag) which is useful to identify a cDNA oligonucleotide and is derived from a restricted position in a mRNA or a cDNA. SOLUTION: This is the method of preparing a tag for identifying the cDNA oligonucleotide. The above method comprises preparing the cDNA oligonucleotide bearing 5' and 3' terminals, collecting cDNA fragments by cutting the cDNA oligonucleotide with a restriction enzyme at the first restriction endonuclease site, separating a cDNA oligonucleotide bearing 5' or 3' terminal and connecting an oligonucleotide linker to the isolated cDNA fragment bearing the cDNA oligonucleotide 5' or 3' terminal. Here, the oligonucleotide linker contains the recognition site of the second restriction endonuclease enzyme and the isolated cDNA fragment is cut with the second restriction endonuclease enzyme which cuts the cDNA fragment in a section separated from the recognition site to obtain the tag for identifying the cDNA oligonucleotide.

4,437 citations

Journal Article
TL;DR: This work describes MAQ, software that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Abstract: New sequencing technologies promise a new era in the use of DNA sequence. However, some of these technologies produce very short reads, typically of a few tens of base pairs, and to use these reads effectively requires new algorithms and software. In particular, there is a major issue in efficiently aligning short reads to a reference genome and handling ambiguity or lack of accuracy in this alignment. Here we introduce the concept of mapping quality, a measure of the confidence that a read actually comes from the position it is aligned to by the mapping algorithm. We describe the software MAQ that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample. MAQ makes full use of mate-pair information and estimates the error probability of each read alignment. Error probabilities are also derived for the final genotype calls, using a Bayesian statistical model that incorporates the mapping qualities, error probabilities from the raw sequence quality scores, sampling of the two haplotypes, and an empirical model for correlated errors at a site. Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net.
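
The abstract defines mapping quality only in words, as the confidence that a read comes from the position it was aligned to. A minimal sketch of the usual Phred-style scaling of such a score follows, assuming Q = -10 log10(P), where P is the probability that the alignment is wrong. The function names and the cap of 255 are hypothetical choices for illustration and are not taken from MAQ's source code.

```python
import math


def mapping_quality(p_wrong: float, cap: int = 255) -> int:
    """Phred-scaled score for the probability that an alignment is wrong."""
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        # log10(0) is undefined; report the (assumed) maximum score instead.
        return cap
    return min(cap, round(-10 * math.log10(p_wrong)))


def error_probability(quality: float) -> float:
    """Inverse transform: probability that the alignment is wrong."""
    return 10 ** (-quality / 10)


# A read with a 1-in-1000 chance of being misplaced scores 30;
# a read that could equally well map anywhere (P = 1) scores 0.
print(mapping_quality(0.001), mapping_quality(1.0))
print(error_probability(20))
```

On this scale each 10-point increase corresponds to a tenfold drop in the chance of a wrong placement, so filtering on a threshold such as 20 keeps alignments with at most a 1% estimated error.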

2,927 citations

Journal Article
TL;DR: It is found that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane).
Abstract: Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.

2,834 citations

Journal Article
TL;DR: The program SOAP is designed to handle the huge amounts of short reads generated by parallel sequencing using the new-generation Illumina-Solexa sequencing technology; it supports multi-threaded parallel computing and has a batch module for multiple query sets.
Abstract: Summary: We have developed a program SOAP for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The program is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology. SOAP is compatible with numerous applications, including single-read or pair-end resequencing, small RNA discovery and mRNA tag sequence mapping. SOAP is a command-driven program, which supports multi-threaded parallel computing, and has a batch module for multiple query sets. Availability: http://soap.genomics.org.cn Contact: soap@genomics.org.cn

2,729 citations


"RNA-Seq: a revolutionary tool for t..." refers methods in this paper

  • ...There are several programs for mapping reads to the genome, including ELAND, SOA...

    [...]