Showing papers by "Alistair R. R. Forrest" published in 2006

Open Access

Journal Article

Genome-wide analysis of mammalian promoter architecture and evolution

Piero Carninci, Albin Sandelin¹, Boris Lenhard¹,², Shintaro Katayama, Kazuro Shimokawa, Jasmina Ponjavic³, Colin A. Semple⁴, Martin S. Taylor³, Pär G. Engström¹, Martin C. Frith⁵, Alistair R. R. Forrest⁵, Wynand Alkema¹, Sin Lam Tan⁶, Charles Plessy, Rimantas Kodzius, Timothy Ravasi⁵,⁷, Takeya Kasukawa, Shiro Fukuda, Mutsumi Kanamori-Katayama, Yayoi Kitazume, Hideya Kawaji, Chikatoshi Kai, Mari M. Nakamura, Hideaki Konno, Kenji Nakano, Salim Mottagui-Tabar¹,⁸,⁹, Peter Arner¹, Alessandra Chesi¹⁰, Stefano Gustincich¹⁰, Francesca Persichetti¹⁰, Harukazu Suzuki, Sean M. Grimmond⁵, Christine A. Wells¹¹, Valerio Orlando, Claes Wahlestedt¹,¹², Edison T. Liu⁶, Matthias Harbers, Jun Kawai, Vladimir B. Bajic⁶,¹³, David A. Hume⁵, Yoshihide Hayashizaki¹⁴,¹⁵

Karolinska Institutet¹, University of Bergen², University of Oxford³, Western General Hospital⁴, University of Queensland⁵, Agency for Science, Technology and Research⁶, University of California, San Diego⁷, National Institute for Health and Welfare⁸, University of Helsinki⁹, International School for Advanced Studies¹⁰, Griffith University¹¹, Scripps Research Institute¹², University of the Western Cape¹³, University of Tsukuba¹⁴, Yokohama City University¹⁵

28 Apr 2006-Nature Genetics

TL;DR: These tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini.

Abstract: Mammalian promoters can be separated into two classes, conserved TATA box-enriched promoters, which initiate at a well-defined site, and more plastic, broad and evolvable CpG-rich promoters. We have sequenced tags corresponding to several hundred thousand transcription start sites (TSSs) in the mouse and human genomes, allowing precise analysis of the sequence architecture and evolution of distinct promoter classes. Different tissues and families of genes differentially use distinct types of promoters. Our tagging methods allow quantitative analysis of promoter usage in different tissues and show that differentially regulated alternative TSSs are a common feature in protein-coding genes and commonly generate alternative N termini. Among the TSSs, we identified new start sites associated with the majority of exons and with 3' UTRs. These data permit genome-scale identification of tissue-specific promoters and analysis of the cis-acting elements associated with them.

1,324 citations

Journal Article

The abundance of short proteins in the mammalian proteome

Martin C. Frith, Alistair R. R. Forrest¹, Ehsan Nourbakhsh¹, Ken C Pang¹,², Chikatoshi Kai, Jun Kawai, Piero Carninci, Yoshihide Hayashizaki, Timothy L. Bailey¹, Sean M. Grimmond¹

University of Queensland¹, Ludwig Institute for Cancer Research²

28 Apr 2006-PLOS Genetics

TL;DR: This work identifies many novel short proteins, including a “dark matter” subset containing ones that lack detectable homology to other known proteins, and confirms that some of these novel proteins can be translated and localised to the secretory pathway.

Abstract: Short proteins play key roles in cell signalling and other processes, but their abundance in the mammalian proteome is unknown. Current catalogues of mammalian proteins exhibit an artefactual discontinuity at a length of 100 aa, so that protein abundance peaks just above this length and falls off sharply below it. To clarify the abundance of short proteins, we identify proteins in the FANTOM collection of mouse cDNAs by analysing synonymous and non-synonymous substitutions with the computer program CRITICA. This analysis confirms that there is no real discontinuity at length 100. Roughly 10% of mouse proteins are shorter than 100 aa, although the majority of these are variants of proteins longer than 100 aa. We identify many novel short proteins, including a “dark matter” subset containing ones that lack detectable homology to other known proteins. Translation assays confirm that some of these novel proteins can be translated and localised to the secretory pathway.

217 citations

Journal Article

Transcript annotation in FANTOM3: mouse gene catalog based on physical cDNAs.

Norihiro Maeda, Takeya Kasukawa¹, Rieko Oyama, Julian Gough, Martin C. Frith², Pär G. Engström³,⁴, Boris Lenhard³,⁴, Rajith N. Aturaliya², Serge Batalov⁵, Kirk W. Beisel, Carol J. Bult, Colin F. Fletcher⁵, Alistair R. R. Forrest², Masaaki Furuno, David E. Hill, Masayoshi Itoh, Mutsumi Kanamori-Katayama, Shintaro Katayama, Masaru Katoh⁶, Tsugumi Kawashima, John Quackenbush⁷,⁸, Timothy Ravasi², Brian Z. Ring, Kazuhiro Shibata, Koji Sugiura, Yoichi Takenaka⁹, Rohan D. Teasdale², Christine A. Wells¹⁰, Yunxia Zhu, Chikatoshi Kai, Jun Kawai, David A. Hume¹¹, Piero Carninci, Yoshihide Hayashizaki

Nippon Telegraph and Telephone¹, University of Queensland², Karolinska Institutet³, University of Bergen⁴, Novartis⁵, National Cancer Research Institute⁶, Harvard University⁷, J. Craig Venter Institute⁸, Osaka University⁹, Griffith University¹⁰, University of Edinburgh¹¹

28 Apr 2006-PLOS Genetics

TL;DR: The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.

Abstract: The international FANTOM consortium aims to produce a comprehensive picture of the mammalian transcriptome, based upon an extensive cDNA collection and functional annotation of full-length enriched cDNAs. The previous dataset, FANTOM2, comprised 60,770 full-length enriched cDNAs. Functional annotation revealed that this cDNA dataset contained only about half of the estimated number of mouse protein-coding genes, indicating that a number of cDNAs still remained to be collected and identified. To pursue the complete gene catalog that covers all predicted mouse genes, cloning and sequencing of full-length enriched cDNAs has been continued since FANTOM2. In FANTOM3, 42,031 newly isolated cDNAs were subjected to functional annotation, and the annotation of 4,347 FANTOM2 cDNAs was updated. To accomplish accurate functional annotation, we improved our automated annotation pipeline by introducing new coding sequence prediction programs and developed a Web-based annotation interface for simplifying the annotation procedures to reduce manual annotation errors. Automated coding sequence and function prediction was followed with manual curation and review by expert curators. A total of 102,801 full-length enriched mouse cDNAs were annotated. Out of 102,801 transcripts, 56,722 were functionally annotated as protein coding (including partial or truncated transcripts), providing to our knowledge the greatest current coverage of the mouse proteome by full-length cDNAs. The total number of distinct non-protein-coding transcripts increased to 34,030. The FANTOM3 annotation system, consisting of automated computational prediction, manual curation, and final expert curation, facilitated the comprehensive characterization of the mouse transcriptome, and could be applied to the transcriptomes of other species.

193 citations

Journal Article

Alternate transcription of the Toll-like receptor signaling cascade

Christine A. Wells¹,², Alistair M. Chalk¹,³, Alistair R. R. Forrest², Darrin Taylor², Nic Waddell², Kate Schroder², S. Roy Himes², Geoffrey J. Faulkner², Sandra Lo¹, Takeya Kasukawa, Hideya Kawaji, Chikatoshi Kai, Jun Kawai, Shintaro Katayama, Piero Carninci, Yoshihide Hayashizaki, David A. Hume², Sean M. Grimmond²

Griffith University¹, University of Queensland², Karolinska Institutet³

17 Feb 2006-Genome Biology

TL;DR: Transcriptional evidence of widespread alternate splicing in the Toll-like receptor signaling pathway is provided from a systematic analysis of the FANTOM3 mouse data set, suggesting a surprisingly common role for variant proteins in diversification/repression of inflammatory signaling.

Abstract: Background: Alternate splicing of key signaling molecules in the Toll-like receptor (Tlr) cascade has been shown to dramatically alter the signaling capacity of inflammatory cells, but it is not known how common this mechanism is. We provide transcriptional evidence of widespread alternate splicing in the Toll-like receptor signaling pathway, derived from a systematic analysis of the FANTOM3 mouse data set. Functional annotation of variant proteins was assessed in light of inflammatory signaling in mouse primary macrophages, and the expression of each variant transcript was assessed by splicing arrays. Results: A total of 256 variant transcripts were identified, including novel variants of Tlr4, Ticam1, Tollip, Rac1, Irak1, 2 and 4, Mapk14/p38, Atf2 and Stat1. The expression of variant transcripts was assessed using custom-designed splicing arrays. We functionally tested the expression of Tlr4 transcripts under a range of cytokine conditions via northern and quantitative real-time polymerase chain reaction. The effects of variant Mapk14/p38 protein expression on macrophage survival were demonstrated. Conclusion: Members of the Toll-like receptor signaling pathway are highly alternatively spliced, producing a large number of novel proteins with the potential to functionally alter inflammatory outcomes. These variants are expressed in primary mouse macrophages in response to inflammatory mediators such as interferon-γ and lipopolysaccharide. Our data suggest a surprisingly common role for variant proteins in diversification/repression of inflammatory signaling.

80 citations

Journal Article

Pseudo-messenger RNA: phantoms of the transcriptome.

Martin C. Frith, Laurens G. Wilming¹, Alistair R. R. Forrest², Hideya Kawaji, Sin Lam Tan³,⁴, Claes Wahlestedt⁵,⁶, Vladimir B. Bajic³,⁴, Chikatoshi Kai, Jun Kawai, Piero Carninci, Yoshihide Hayashizaki, Timothy L. Bailey², Lukasz Huminiecki⁶,⁷

Wellcome Trust Sanger Institute¹, University of Queensland², Institute for Infocomm Research Singapore³, South African National Bioinformatics Institute⁴, Scripps Research Institute⁵, Karolinska Institutet⁶, Ludwig Institute for Cancer Research⁷

28 Apr 2006-PLOS Genetics

TL;DR: A large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways are surveyed, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.

Abstract: The mammalian transcriptome harbours shadowy entities that resist classification and analysis. In analogy with pseudogenes, we define pseudo-messenger RNA to be RNA molecules that resemble protein-coding mRNA, but cannot encode full-length proteins owing to disruptions of the reading frame. Using a rigorous computational pipeline, which rules out sequencing errors, we identify 10,679 pseudo-messenger RNAs (approximately half of which are transposon-associated) among the 102,801 FANTOM3 mouse cDNAs: just over 10% of the FANTOM3 transcriptome. These comprise not only transcribed pseudogenes, but also disrupted splice variants of otherwise protein-coding genes. Some may encode truncated proteins, only a minority of which appear subject to nonsense-mediated decay. The presence of an excess of transcripts whose only disruptions are opal stop codons suggests that there are more selenoproteins than currently estimated. We also describe compensatory frameshifts, where a segment of the gene has changed frame but remains translatable. In summary, we survey a large class of non-standard but potentially functional transcripts that are likely to encode genetic information and effect biological processes in novel ways. Many of these transcripts do not correspond cleanly to any identifiable object in the genome, implying fundamental limits to the goal of annotating all functional elements at the genome sequence level.

67 citations

Journal Article

Genome-wide review of transcriptional complexity in mouse protein kinases and phosphatases

Alistair R. R. Forrest¹, Darrin Taylor¹, Mark L. Crowe¹, Alistair M. Chalk¹,², Nic Waddell¹, Gabriel Kolle¹, Geoffrey J. Faulkner¹, Rimantas Kodzius, Shintaro Katayama, Christine A. Wells¹,³, Chikatoshi Kai, Jun Kawai, Piero Carninci, Yoshihide Hayashizaki, Sean M. Grimmond¹

University of Queensland¹, Karolinska Institutet², Griffith University³

26 Jan 2006-Genome Biology

TL;DR: These findings suggest that alternative transcripts of protein kinases and phosphatases are produced that encode different domain structures, and that these variants are likely to play important roles in phosphorylation-dependent signaling pathways.

Abstract: Background: Alternative transcripts of protein kinases and protein phosphatases are known to encode peptides with altered substrate affinities, subcellular localizations, and activities. We undertook a systematic study to catalog the variant transcripts of every protein kinase-like and phosphatase-like locus of mouse (http://variant.imb.uq.edu.au). Results: By reviewing all available transcript evidence, we found that at least 75% of kinase and phosphatase loci in mouse generate alternative splice forms, and that 44% of these loci have well supported alternative 5' exons. In a further analysis of full-length cDNAs, we identified 69% of loci as generating more than one peptide isoform. The 1,469 peptide isoforms generated from these loci correspond to 1,080 unique Interpro domain combinations, many of which lack catalytic or interaction domains. We also report on the existence of likely dominant negative forms for many of the receptor kinases and phosphatases, including some 26 secreted decoys (seven known and 19 novel: Alk, Csf1r, Egfr, Epha1, 3, 5, 7 and 10, Ephb1, Flt1, Flt3, Insr, Insrr, Kdr, Met, Ptk7, Ptprc, Ptprd, Ptprg, Ptprl, Ptprn, Ptprn2, Ptpro, Ptprr, Ptprs, and Ptprz1) and 13 transmembrane forms (four known and nine novel: Axl, Bmpr1a, Csf1r, Epha4, 5, 6 and 7, Ntrk2, Ntrk3, Pdgfra, Ptprk, Ptprm, Ptpru). Finally, by mining public gene expression data (MPSS and microarrays), we confirmed tissue-specific expression of ten of the novel isoforms. Conclusion: These findings suggest that alternative transcripts of protein kinases and phosphatases are produced that encode different domain structures, and that these variants are likely to play important roles in phosphorylation-dependent signaling pathways.

56 citations

Journal Article

Subcellular localization of mammalian type II membrane proteins.

Rajith N. Aturaliya¹, J. Lynn Fink¹, Melissa J. Davis¹, Melvena S. Teasdale¹, Kelly Hanson¹, Kevin C. Miranda¹, Alistair R. R. Forrest¹, Sean M. Grimmond¹, Harukazu Suzuki, Mutsumi Kanamori, Chikatoshi Kai, Jun Kawai, Piero Carninci, Yoshihide Hayashizaki, Rohan D. Teasdale¹

University of Queensland¹

01 May 2006-Traffic

TL;DR: This approach combines mining of published literature to identify subcellular localization data and a high-throughput, polymerase chain reaction (PCR)-based approach to experimentally characterize subcellular localization of type II membrane proteins.

Abstract: Application of a computational membrane organization prediction pipeline, MemO, identified putative type II membrane proteins as proteins predicted to encode a single alpha-helical transmembrane domain (TMD) and no signal peptides. MemO was applied to RIKEN's mouse isoform protein set to identify 1436 non-overlapping genomic regions or transcriptional units (TUs), which encode exclusively type II membrane proteins. Proteins with overlapping predicted InterPro and TMDs were reviewed to discard false positive predictions, resulting in a dataset comprised of 1831 transcripts in 1408 TUs. This dataset was used to develop a systematic protocol to document subcellular localization of type II membrane proteins. This approach combines mining of published literature to identify subcellular localization data and a high-throughput, polymerase chain reaction (PCR)-based approach to experimentally characterize subcellular localization. These approaches have provided localization data for 244 and 169 proteins, respectively. Type II membrane proteins are localized to all major organelle compartments; however, some biases were observed towards the early secretory pathway and punctate structures. Collectively, this study reports the subcellular localization of 26% of the defined dataset. All reported localization data are presented in the LOCATE database (http://www.locate.imb.uq.edu.au).

23 citations

Journal Article

PhosphoregDB: the tissue and sub-cellular distribution of mammalian protein kinases and phosphatases.

Alistair R. R. Forrest¹, Darrin Taylor¹, J. Lynn Fink¹, Milena Gongora¹, Cameron Flegg¹, Rohan D. Teasdale¹, Harukazu Suzuki, Mutsumi Kanamori, Chikatoshi Kai, Yoshihide Hayashizaki, Sean M. Grimmond¹

University of Queensland¹

20 Feb 2006-BMC Bioinformatics

TL;DR: Together these data demonstrate that cell type specific systems exist to regulate protein phosphorylation and that for accurate modelling and for determination of enzyme substrate relationships the co-location of components needs to be considered.

Abstract: Protein kinases and protein phosphatases are the fundamental components of phosphorylation-dependent protein regulatory systems. We have created a database for the protein kinase-like and phosphatase-like loci of mouse (http://phosphoreg.imb.uq.edu.au) that integrates protein sequence, interaction, classification and pathway information with the results of a systematic screen of their sub-cellular localization and tissue-specific expression data mined from the GNF tissue atlas of mouse. The database lets users query where a specific kinase or phosphatase is expressed at both the tissue and sub-cellular levels. Similarly, the interface allows the user to query by tissue, pathway or sub-cellular localization, to reveal which components are co-expressed or co-localized. A review of their expression reveals that 30% of these components are detected in all tissues tested, while 70% show some level of tissue restriction. Hierarchical clustering of the expression data reveals that expression of these genes can be used to separate the samples into tissues of related lineage, including 3 larger clusters of nervous tissue, developing embryo and cells of the immune system. By overlaying the expression, sub-cellular localization and classification data we examine correlations between class, specificity and tissue restriction and show that tyrosine kinases are generally expressed in fewer tissues than serine/threonine kinases. Together these data demonstrate that cell type specific systems exist to regulate protein phosphorylation and that for accurate modelling and for determination of enzyme substrate relationships the co-location of components needs to be considered.

20 citations