Journal ArticleDOI

RNA-Seq: a revolutionary tool for transcriptomics

01 Jan 2009-Nature Reviews Genetics (Nature Publishing Group)-Vol. 10, Iss: 1, pp 57-63
TL;DR: RNA-Seq, an approach to transcriptome profiling that uses deep-sequencing technologies, provides a far more precise measurement of the levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.


Citations
Journal ArticleDOI
TL;DR: An alternative approach to transcriptome sequencing for the Illumina Genome Analyzer is reported, in which the reverse transcription reaction takes place on the flowcell, and the template is poly(A)+ RNA rather than cDNA, so the resulting sequences are necessarily strand-specific.
Abstract: We report an alternative approach to transcriptome sequencing for the Illumina Genome Analyzer, in which the reverse transcription reaction takes place on the flowcell. No amplification is performed during the library preparation, so PCR biases and duplicates are avoided, and because the template is poly(A)(+) RNA rather than cDNA, the resulting sequences are necessarily strand-specific. The method is compatible with paired- or single-end sequencing.

157 citations

Journal ArticleDOI
TL;DR: The systems genetics approach reveals that the program for inflorescence branching is initiated surprisingly early during meristem maturation and that evolutionary diversity in inflorescence architecture is modulated by heterochronic shifts in the acquisition of floral fate.
Abstract: Flower production and crop yields are highly influenced by the architectures of inflorescences. In the compound inflorescences of tomato and related nightshades (Solanaceae), new lateral inflorescence branches develop on the flanks of older branches that have terminated in flowers through a program of plant growth known as "sympodial." Variability in the number and organization of sympodial branches produces a remarkable array of inflorescence architectures, but little is known about the mechanisms underlying sympodial growth and branching diversity. One hypothesis is that the rate of termination modulates branching. By performing deep sequencing of transcriptomes, we have captured gene expression dynamics from individual shoot meristems in tomato as they gradually transition from a vegetative state to a terminal flower. Surprisingly, we find thousands of age-dependent expression changes, even when there is little change in meristem morphology. From these data, we reveal that meristem maturation is an extremely gradual process defined molecularly by a "meristem maturation clock." Using hundreds of stage-enriched marker genes that compose this clock, we show that extreme branching, conditioned by loss of expression of the COMPOUND INFLORESCENCE gene, is driven by delaying the maturation of both apical and lateral meristems. In contrast, we find that wild tomato species display a delayed maturation only in apical meristems, which leads to modest branching. Our systems genetics approach reveals that the program for inflorescence branching is initiated surprisingly early during meristem maturation and that evolutionary diversity in inflorescence architecture is modulated by heterochronic shifts in the acquisition of floral fate.

157 citations


Cites methods from "RNA-Seq: a revolutionary tool for t..."

  • ...We profiled the transcriptome of each stage by isolating mRNA and subjecting cDNA libraries to Illumina sequencing (SI Appendix) (15)....


Journal ArticleDOI
TL;DR: Traits required in future pasture legumes include greater resilience to declining rainfall and more variable seasons, higher tolerance of soil acidity, higher phosphorus utilisation efficiency, lower potential to produce methane emissions in grazing ruminants, better integration into weed management strategies on mixed farms, and resistance to new pest and disease threats.
Abstract: Australian farmers and scientists have embraced the use of new pasture legume species more than those in any other country, with 36 annual and 11 perennial legumes having cultivars registered for use. Lucerne (Medicago sativa), white clover (Trifolium repens), and red clover (T. pratense) were introduced by the early European settlers and are still important species in Australia, but several other species, notably annual legumes, have been developed specifically for Australian environments, leading to the evolution of unique farming systems. Subterranean clover (T. subterraneum) and annual medics (Medicago spp.) have been the most successful species, while a suite of new annual legumes, including serradellas (Ornithopus compressus and O. sativus), biserrula (Biserrula pelecinus) and other Trifolium and Medicago species, has expanded the range of legume options. Strawberry clover (T. fragiferum) was the first non-traditional, perennial legume commercialised in Australia. Other new perennial legumes have recently been developed to overcome the soil acidity and waterlogging productivity constraints of lucerne and white clover and to reduce groundwater recharge and the spread of dryland salinity. These include birdsfoot trefoil (Lotus corniculatus), Talish clover (T. tumens), and hairy canary clover (Dorycnium hirsutum). Stoloniferous red clover cultivars and sulla (Hedysarum coronarium) cultivars adapted to southern Australia have also been released, along with a new cultivar of Caucasian clover (T. ambiguum) aimed at overcoming seed production issues of cultivars released in the 1970s. New species under development include the annual legume messina (Melilotus siculus) and the perennial legume narrowleaf lotus (L. tenuis) for saline, waterlogged soils, and the drought-tolerant perennial legume tedera (Bituminaria bituminosa var. albomarginata). 
Traits required in future pasture legumes include greater resilience to declining rainfall and more variable seasons, higher tolerance of soil acidity, higher phosphorus utilisation efficiency, lower potential to produce methane emissions in grazing ruminants, better integration into weed management strategies on mixed farms, and resistance to new pest and disease threats. Future opportunities include supplying new fodder markets and potential pharmaceutical and health uses for humans and livestock. New species could be considered in the future to overcome constraints of existing species, but their commercial success will depend upon perceived need, size of the seed market, ease of establishment, and management and safety of grazing animals and the environment. Molecular biology has a range of potential applications in pasture legume breeding, including marker-assisted and genomics-assisted selection and the identification of quantitative trait loci and candidate genes for important traits. Genetically modified pasture plants are unlikely to be commercialised until public concerns are allayed. Private seed companies are likely to play an increasingly important role in pasture legume development, particularly of mainstream species, but the higher risk and more innovative breakthroughs are likely to come from the public sector, provided the skills base for plant breeding and associated disciplines is maintained.

157 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...This, coupled with DNA micro-array and RNA sequencing techniques, has led to the development of other fields of molecular biology with the potential to aid future pasture legume improvement, including functional genomics, which aims to understand the gene function and protein products of particular DNA sequences (Pevsner 2009); transcriptomics, which investigates differences in gene expression between single nucleotide polymorphisms (SNPs), leading to discovery of the active genes (Wang et al. 2009); and proteomics, which examines the structure, function, and synthesis of proteins (James 1997)....


Journal ArticleDOI
TL;DR: The key studies and technological advances that have shaped the understanding of the dimensions, dynamics, and biological relevance of the mammalian noncoding transcriptome are described.

157 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...Trends in Genetics, July 2017, Vol. 33, No. 7 465 Glossary Adaptive radiation: an evolutionary process in which organisms diversify rapidly from an ancestral species into a multitude of new forms....


  • ...The breadth and complexity of mammalian transcription was not obvious before scalable cDNA hybridization [3] and sequencing [4], and the subsequent incorporation of next-generation sequencing to create modern RNA sequencing (RNA-Seq, see Glossary) [5]....


Journal ArticleDOI
TL;DR: Differences in gene expression suggest that evolutionary divergence in the regulatory pathway(s) involved in acute temperature stress may offer at least a partial explanation of population differences in thermal tolerance observed in Tigriopus.
Abstract: Geographic variation in the thermal environment impacts a broad range of biochemical and physiological processes and can be a major selective force leading to local population adaptation. In the intertidal copepod Tigriopus californicus, populations along the coast of California show differences in thermal tolerance that are consistent with adaptation, i.e., southern populations withstand thermal stresses that are lethal to northern populations. To understand the genetic basis of these physiological differences, we use an RNA-seq approach to compare genome-wide patterns of gene expression in two populations known to differ in thermal tolerance. Observed differences in gene expression between the southern (San Diego) and the northern (Santa Cruz) populations included both the number of affected loci as well as the identity of these loci. However, the most pronounced differences concerned the amplitude of up-regulation of genes producing heat shock proteins (Hsps) and genes involved in ubiquitination and proteolysis. Among the hsp genes, orthologous pairs show markedly different thermal responses as the amplitude of hsp response was greatly elevated in the San Diego population, most notably in members of the hsp70 gene family. There was no evidence of accelerated evolution at the sequence level for hsp genes. Among other sets of genes, cuticle genes were up-regulated in SD but down-regulated in SC, and mitochondrial genes were down-regulated in both populations. Marked changes in gene expression were observed in response to acute sub-lethal thermal stress in the copepod T. californicus. Although some qualitative differences were observed between populations, the most pronounced differences involved the magnitude of induction of numerous hsp and ubiquitin genes. 
These differences in gene expression suggest that evolutionary divergence in the regulatory pathway(s) involved in acute temperature stress may offer at least a partial explanation of population differences in thermal tolerance observed in Tigriopus.

157 citations


Cites background from "RNA-Seq: a revolutionary tool for t..."

  • ...Next-generation RNA sequencing technology (RNA-seq) [30] now allows us to simultaneously assemble transcriptomes and quantify gene expression across tens of thousands of genes without any a priori genomic information....


References
Journal ArticleDOI
TL;DR: Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors.
Abstract: We have mapped and quantified mouse transcriptomes by deeply sequencing them and recording how frequently each gene is represented in the sequence sample (RNA-Seq). This provides a digital measure of the presence and prevalence of transcripts from known and previously unknown genes. We report reference measurements composed of 41–52 million mapped 25-base-pair reads for poly(A)-selected RNA from adult mouse brain, liver and skeletal muscle tissues. We used RNA standards to quantify transcript prevalence and to test the linear range of transcript detection, which spanned five orders of magnitude. Although >90% of uniquely mapped reads fell within known exons, the remaining data suggest new and revised gene models, including changed or additional promoters, exons and 3′ untranscribed regions, as well as new candidate microRNA precursors. RNA splice events, which are not readily measured by standard gene expression microarray or serial analysis of gene expression methods, were detected directly by mapping splice-crossing sequence reads. We observed 1.45 × 10⁵ distinct splices, and alternative splices were prominent, with 3,500 different genes expressing one or more alternate internal splices. The mRNA population specifies a cell’s identity and helps to govern its present and future activities. This has made transcriptome analysis a general phenotyping method, with expression microarrays of many kinds in routine use. Here we explore the possibility that transcriptome analysis, transcript discovery and transcript refinement can be done effectively in large and complex mammalian genomes by ultra-high-throughput sequencing. Expression microarrays are currently the most widely used methodology for transcriptome analysis, although some limitations persist. 
These include hybridization and cross-hybridization artifacts [1–3], dye-based detection issues and design constraints that preclude or seriously limit the detection of RNA splice patterns and previously unmapped genes. These issues have made it difficult for standard array designs to provide full sequence comprehensiveness (coverage of all possible genes, including unknown ones, in large genomes) or transcriptome comprehensiveness (reliable detection of all RNAs of all prevalence classes, including the least abundant ones that are physiologically relevant). Other

12,293 citations
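
The "digital measure" of transcript prevalence described in the abstract above rests on counting how many reads map to each gene. As a minimal sketch, the Python below normalizes such counts by gene length and sequencing depth (reads per kilobase per million mapped reads). The abstract itself does not give a formula, so the normalization, the function name `rpkm`, and the gene names and counts are all illustrative assumptions rather than the authors' stated procedure.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads.

    All three arguments are hypothetical inputs for illustration:
    read_count is the number of reads mapped to one gene,
    gene_length_bp is that gene's exonic length in base pairs,
    total_mapped_reads is the library-wide number of mapped reads.
    """
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # 1e9 = 1e3 (bases per kilobase) * 1e6 (reads per million)
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Made-up example: two genes with equal raw counts but different lengths.
counts = {"geneA": 500, "geneB": 500}
lengths_bp = {"geneA": 1000, "geneB": 2000}
total_reads = 10_000_000

normalized = {
    gene: rpkm(counts[gene], lengths_bp[gene], total_reads) for gene in counts
}
```

With equal raw counts, the longer gene receives the lower normalized value, which is the point of dividing by length: a longer transcript yields more reads at the same molar abundance.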

PatentDOI
04 Oct 2000-Science
TL;DR: Serial analysis of gene expression (SAGE) should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states.
Abstract: PROBLEM TO BE SOLVED: To provide a method for preparing a short nucleotide sequence (tag) which is useful to identify a cDNA oligonucleotide and is derived from a restricted position in a mRNA or a cDNA. SOLUTION: This is the method of preparing a tag for identifying the cDNA oligonucleotide. The above method comprises preparing the cDNA oligonucleotide bearing 5' and 3' terminals, collecting cDNA fragments by cutting the cDNA oligonucleotide with a restriction enzyme at the first restriction endonuclease site, separating a cDNA oligonucleotide bearing 5' or 3' terminal and connecting an oligonucleotide linker to the isolated cDNA fragment bearing the cDNA oligonucleotide 5' or 3' terminal. Here, the oligonucleotide linker contains the recognition site of the second restriction endonuclease enzyme and the isolated cDNA fragment is cut with the second restriction endonuclease enzyme which cuts the cDNA fragment in a section separated from the recognition site to obtain the tag for identifying the cDNA oligonucleotide.

4,437 citations

Journal ArticleDOI
TL;DR: This work describes MAQ, software that builds assemblies by mapping shotgun short reads to a reference genome and uses quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample.
Abstract: New sequencing technologies promise a new era in the use of DNA sequence. However, some of these technologies produce very short reads, typically of a few tens of base pairs, and to use these reads effectively requires new algorithms and software. In particular, there is a major issue in efficiently aligning short reads to a reference genome and handling ambiguity or lack of accuracy in this alignment. Here we introduce the concept of mapping quality, a measure of the confidence that a read actually comes from the position it is aligned to by the mapping algorithm. We describe the software MAQ that can build assemblies by mapping shotgun short reads to a reference genome, using quality scores to derive genotype calls of the consensus sequence of a diploid genome, e.g., from a human sample. MAQ makes full use of mate-pair information and estimates the error probability of each read alignment. Error probabilities are also derived for the final genotype calls, using a Bayesian statistical model that incorporates the mapping qualities, error probabilities from the raw sequence quality scores, sampling of the two haplotypes, and an empirical model for correlated errors at a site. Both read mapping and genotype calling are evaluated on simulated data and real data. MAQ is accurate, efficient, versatile, and user-friendly. It is freely available at http://maq.sourceforge.net.

2,927 citations
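
The mapping quality introduced in the MAQ abstract above is a confidence measure for a read's placement. A minimal sketch of how such a score relates to an error probability follows, assuming the usual Phred-style scaling Q = -10·log10(P(wrong alignment)). The function names and the `cap` parameter are hypothetical and are not taken from MAQ's code; how the error probability itself is estimated is outside this sketch.

```python
import math


def mapping_quality(p_wrong, cap=99):
    """Convert the probability that an alignment is wrong to a Phred-scaled score.

    `cap` is an illustrative upper bound (an assumption), needed because
    p_wrong == 0 would otherwise give an infinite score.
    """
    if not 0.0 <= p_wrong <= 1.0:
        raise ValueError("p_wrong must be a probability in [0, 1]")
    if p_wrong == 0.0:
        return cap
    return min(cap, round(-10 * math.log10(p_wrong)))


def error_probability(quality):
    """Inverse transform: Phred-scaled score back to an error probability."""
    return 10 ** (-quality / 10)


# A read with a 1-in-1000 chance of being misplaced scores about 30;
# a read that could equally well map anywhere else scores 0.
confident = mapping_quality(0.001)
ambiguous = mapping_quality(1.0)
```

On this scale each 10-point increase corresponds to a tenfold drop in the chance that the read is misplaced, which makes scores easy to threshold when filtering alignments.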

Journal ArticleDOI
TL;DR: It is found that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane).
Abstract: Ultra-high-throughput sequencing is emerging as an attractive alternative to microarrays for genotyping, analysis of methylation patterns, and identification of transcription factor binding sites. Here, we describe an application of the Illumina sequencing (formerly Solexa sequencing) platform to study mRNA expression levels. Our goals were to estimate technical variance associated with Illumina sequencing in this context and to compare its ability to identify differentially expressed genes with existing array technologies. To do so, we estimated gene expression differences between liver and kidney RNA samples using multiple sequencing replicates, and compared the sequencing data to results obtained from Affymetrix arrays using the same RNA samples. We find that the Illumina sequencing data are highly replicable, with relatively little technical variation, and thus, for many purposes, it may suffice to sequence each mRNA sample only once (i.e., using one lane). The information in a single lane of Illumina sequencing data appears comparable to that in a single array in enabling identification of differentially expressed genes, while allowing for additional analyses such as detection of low-expressed genes, alternative splice variants, and novel transcripts. Based on our observations, we propose an empirical protocol and a statistical framework for the analysis of gene expression using ultra-high-throughput sequencing technology.

2,834 citations

Journal ArticleDOI
TL;DR: The program SOAP is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology; it supports multi-threaded parallel computing and has a batch module for multiple query sets.
Abstract: Summary: We have developed a program SOAP for efficient gapped and ungapped alignment of short oligonucleotides onto reference sequences. The program is designed to handle the huge amounts of short reads generated by parallel sequencing using the new generation Illumina-Solexa sequencing technology. SOAP is compatible with numerous applications, including single-read or pair-end resequencing, small RNA discovery and mRNA tag sequence mapping. SOAP is a command-driven program, which supports multi-threaded parallel computing, and has a batch module for multiple query sets. Availability: http://soap.genomics.org.cn Contact: soap@genomics.org.cn

2,729 citations


"RNA-Seq: a revolutionary tool for t..." refers methods in this paper

  • ...There are several programs for mapping reads to the genome, including ELAND, SOA...
