Author

Alistair R. R. Forrest

Bio: Alistair R. R. Forrest is an academic researcher at the Harry Perkins Institute of Medical Research. The author has contributed to research on the topics Regulation of gene expression and Cap analysis gene expression. The author has an h-index of 59 and has co-authored 175 publications receiving 23,544 citations. Previous affiliations of Alistair R. R. Forrest include Griffith University and Centre for Life.


Papers
Journal Article
Piero Carninci, Takeya Kasukawa, Shintaro Katayama, Julian Gough, and 194 more authors (36 institutions)
02 Sep 2005-Science
TL;DR: Detailed polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.
Abstract: This study describes comprehensive polling of transcription start and termination sites and analysis of previously unidentified full-length complementary DNAs derived from the mouse genome. We identify the 5' and 3' boundaries of 181,047 transcripts with extensive variation in transcripts arising from alternative promoter usage, splicing, and polyadenylation. There are 16,247 new mouse protein-coding transcripts, including 5154 encoding previously unidentified proteins. Genomic mapping of the transcriptome reveals transcriptional forests, with overlapping transcription on both strands, separated by deserts in which few transcripts are observed. The data provide a comprehensive platform for the comparative analysis of mammalian transcriptional regulation in differentiation and development.

3,412 citations

Journal Article
27 Mar 2014-Nature
TL;DR: It is shown that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity.
Abstract: Enhancers control the correct temporal and cell-type-specific activation of gene expression in multicellular eukaryotes. Knowing their properties, regulatory activity and targets is crucial to understand the regulation of differentiation and homeostasis. Here we use the FANTOM5 panel of samples, covering the majority of human tissues and cell types, to produce an atlas of active, in vivo-transcribed enhancers. We show that enhancers share properties with CpG-poor messenger RNA promoters but produce bidirectional, exosome-sensitive, relatively short unspliced RNAs, the generation of which is strongly related to enhancer activity. The atlas is used to compare regulatory programs between different cells at unprecedented depth, to identify disease-associated regulatory single nucleotide polymorphisms, and to classify cell-type-specific and ubiquitous enhancers. We further explore the utility of enhancer redundancy, which explains gene expression strength rather than expression patterns. The online FANTOM5 enhancer atlas represents a unique resource for studies on cell-type-specific enhancers and gene regulation.

2,260 citations

Journal Article
27 Mar 2014-Nature
TL;DR: The authors mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body.
Abstract: Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly 'housekeeping', whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research.

1,715 citations

Journal Article
Yasushi Okazaki, Masaaki Furuno, Takeya Kasukawa, Jun Adachi, Hidemasa Bono, S. Kondo, Itoshi Nikaido, Naoki Osato, Rintaro Saito, Harukazu Suzuki, Itaru Yamanaka, H. Kiyosawa, Ken Yagi, Yasuhiro Tomaru, Yuki Hasegawa, A. Nogami, Christian Schönbach, Takashi Gojobori, Richard M. Baldarelli, David P. Hill, Carol J. Bult, David A. Hume, John Quackenbush, Lynn M. Schriml, Alexander Kanapin, Hideo Matsuda, Serge Batalov, Kirk W. Beisel, Judith A. Blake, Dirck W. Bradt, Vladimir Brusic, Cyrus Chothia, Lori E. Corbani, S. Cousins, Emiliano Dalla, Tommaso A. Dragani, Colin F. Fletcher, Alistair R. R. Forrest, K. S. Frazer, Terry Gaasterland, Manuela Gariboldi, Carmela Gissi, Adam Godzik, Julian Gough, Sean M. Grimmond, Stefano Gustincich, Nobutaka Hirokawa, Ian J. Jackson, Erich D. Jarvis, Akio Kanai, Hideya Kawaji, Yuka Imamura Kawasawa, Rafal M. Kedzierski, Benjamin L. King, Akihiko Konagaya, Igor V. Kurochkin, Yong-Hwan Lee, Boris Lenhard, Paul A. Lyons, Donna Maglott, Lois J. Maltais, Luigi Marchionni, Louise M. McKenzie, Harukata Miki, Takeshi Nagashima, Koji Numata, Toshihisa Okido, William J. Pavan, Geo Pertea, Graziano Pesole, Nikolai Petrovsky, Ramesh S. Pillai, Joan Pontius, D. Qi, Sridhar Ramachandran, Timothy Ravasi, Jonathan C. Reed, Deborah J. Reed, Jeffrey G. Reid, Brian Z. Ring, M. Ringwald, Albin Sandelin, Claudio Schneider, Colin A. Semple, Mitsutoshi Setou, K. Shimada, Razvan Sultana, Yoichi Takenaka, Martin S. Taylor, Rohan D. Teasdale, Masaru Tomita, Roberto Verardo, Lukas Wagner, Claes Wahlestedt, Y. Wang, Yoshiki Watanabe, Christine A. Wells, Laurens G. Wilming, Anthony Wynshaw-Boris, Masashi Yanagisawa, Ivana V. Yang, L. Yang, Zheng Yuan, Mihaela Zavolan, Yunhui Zhu, Anne M. Zimmer, Piero Carninci, N. Hayatsu, Tomoko Hirozane-Kishikawa, Hideaki Konno, M. Nakamura, Naoko Sakazume, K. Sato, Toshiyuki Shiraki, Kazunori Waki, Jun Kawai, Katsunori Aizawa, Takahiro Arakawa, S. Fukuda, A. Hara, W. Hashizume, K. Imotani, Y. Ishii, Masayoshi Itoh, Ikuko Kagawa, A. Miyazaki, K. Sakai, D. Sasaki, K. Shibata, Akira Shinagawa, Ayako Yasunishi, Masayasu Yoshino, Robert H. Waterston, Eric S. Lander, Jane Rogers, Ewan Birney, Yoshihide Hayashizaki
05 Dec 2002-Nature
TL;DR: The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.
Abstract: Only a small proportion of the mouse genome is transcribed into mature messenger RNA transcripts. There is an international collaborative effort to identify all full-length mRNA transcripts from the mouse, and to ensure that each is represented in a physical collection of clones. Here we report the manual annotation of 60,770 full-length mouse complementary DNA sequences. These are clustered into 33,409 'transcriptional units', contributing 90.1% of a newly established mouse transcriptome database. Of these transcriptional units, 4,258 are new protein-coding and 11,665 are new non-coding messages, indicating that non-coding RNA is a major component of the transcriptome. 41% of all transcriptional units showed evidence of alternative splicing. In protein-coding transcripts, 79% of splice variations altered the protein product. Whole-transcriptome analyses resulted in the identification of 2,431 sense-antisense pairs. The present work, completely supported by physical clones, provides the most comprehensive survey of a mammalian transcriptome so far, and is a valuable resource for functional genomics.

1,663 citations

Journal Article
TL;DR: These tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini.
Abstract: Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.

1,324 citations


Cited by
28 Jul 2005
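TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade the host immune response. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and initiating transcription of different var gene variants provides the molecular basis for antigenic variation.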

18,940 citations

Journal Article
TL;DR: It is shown that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads, and estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene.
Abstract: RNA-Seq is revolutionizing the way transcript abundances are measured. A key challenge in transcript quantification from RNA-Seq data is the handling of reads that map to multiple genes or isoforms. This issue is particularly important for quantification with de novo transcriptome assemblies in the absence of sequenced genomes, as it is difficult to determine which transcripts are isoforms of the same gene. A second significant issue is the design of RNA-Seq experiments, in terms of the number of reads, read length, and whether reads come from one or both ends of cDNA fragments. We present RSEM, a user-friendly software package for quantifying gene and isoform abundances from single-end or paired-end RNA-Seq data. RSEM outputs abundance estimates, 95% credibility intervals, and visualization files and can also simulate RNA-Seq data. In contrast to other existing tools, the software does not require a reference genome. Thus, in combination with a de novo transcriptome assembler, RSEM enables accurate transcript quantification for species without sequenced genomes. On simulated and real data sets, RSEM has superior or comparable performance to quantification methods that rely on a reference genome. Taking advantage of RSEM's ability to effectively use ambiguously-mapping reads, we show that accurate gene-level abundance estimates are best obtained with large numbers of short single-end reads. On the other hand, estimates of the relative frequencies of isoforms within single genes may be improved through the use of paired-end reads, depending on the number of possible splice forms for each gene. RSEM is an accurate and user-friendly software tool for quantifying transcript abundances from RNA-Seq data. As it does not rely on the existence of a reference genome, it is particularly useful for quantification with de novo transcriptome assemblies. In addition, RSEM has enabled valuable guidance for cost-efficient design of quantification experiments with RNA-Seq, which is currently relatively expensive.

14,524 citations

Journal Article
TL;DR: The results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.
Abstract: High-throughput mRNA sequencing (RNA-Seq) promises simultaneous transcript discovery and abundance estimation. However, this would require algorithms that are not restricted by prior gene annotations and that account for alternative transcription and splicing. Here we introduce such algorithms in an open-source software program called Cufflinks. To test Cufflinks, we sequenced and analyzed >430 million paired 75-bp RNA-Seq reads from a mouse myoblast cell line over a differentiation time series. We detected 13,692 known transcripts and 3,724 previously unannotated ones, 62% of which are supported by independent expression data or by homologous genes in other species. Over the time series, 330 genes showed complete switches in the dominant transcription start site (TSS) or splice isoform, and we observed more subtle shifts in 1,304 other genes. These results suggest that Cufflinks can illuminate the substantial regulatory flexibility and complexity in even this well-studied model of muscle development and that it can improve transcriptome-based genome annotation.

13,337 citations

Journal Article
TL;DR: The RNA-Seq approach to transcriptome profiling that uses deep-sequencing technologies provides a far more precise measurement of levels of transcripts and their isoforms than other methods.
Abstract: RNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Studies using this method have already altered our view of the extent and complexity of eukaryotic transcriptomes. RNA-Seq also provides a far more precise measurement of levels of transcripts and their isoforms than other methods. This article describes the RNA-Seq approach, the challenges associated with its application, and the advances made so far in characterizing several eukaryote transcriptomes.

11,528 citations

Journal Article
TL;DR: The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer.
Abstract: Motivation: A new protocol for sequencing the messenger RNA in a cell, known as RNA-Seq, generates millions of short sequence fragments in a single run. These fragments, or 'reads', can be used to measure levels of gene expression and to identify novel splice variants of genes. However, current software for aligning RNA-Seq data to a genome relies on known splice junctions and cannot identify novel ones. TopHat is an efficient read-mapping algorithm designed to align reads from an RNA-Seq experiment to a reference genome without relying on known splice sites. Results: We mapped the RNA-Seq reads from a recent mammalian RNA-Seq experiment and recovered more than 72% of the splice junctions reported by the annotation-based software from that study, along with nearly 20 000 previously unreported junctions. The TopHat pipeline is much faster than previous systems, mapping nearly 2.2 million reads per CPU hour, which is sufficient to process an entire RNA-Seq experiment in less than a day on a standard desktop computer. We describe several challenges unique to ab initio splice site discovery from RNA-Seq reads that will require further algorithm development. Availability: TopHat is free, open-source software available from http://tophat.cbcb.umd.edu. Contact: cole@cs.umd.edu. Supplementary information: Supplementary data are available at Bioinformatics online.

11,473 citations