Journal Article

BioGPS: an extensible and customizable portal for querying and organizing gene annotation resources.

TL;DR: BioGPS (http://biogps.gnf.org) is introduced as a centralized gene portal for aggregating distributed gene annotation resources; it embraces the principle of community intelligence, enabling any user to easily and directly contribute to the BioGPS platform.
Abstract: Online gene annotation resources are indispensable for analysis of genomics data. However, the landscape of these online resources is highly fragmented, and scientists often visit dozens of these sites for each gene in a candidate gene list. Here, we introduce BioGPS (http://biogps.gnf.org), a centralized gene portal for aggregating distributed gene annotation resources. Moreover, BioGPS embraces the principle of community intelligence, enabling any user to easily and directly contribute to the BioGPS platform.


Citations
Journal Article
TL;DR: An update on the online database resource Search Tool for the Retrieval of Interacting Genes (STRING), which provides uniquely comprehensive coverage of, and easy access to, both experimental and predicted interaction information.
Abstract: An essential prerequisite for any systems-level understanding of cellular functions is to correctly uncover and annotate all functional interactions among proteins in the cell. Toward this goal, remarkable progress has been made in recent years, both in terms of experimental measurements and computational prediction techniques. However, public efforts to collect and present protein interaction information have struggled to keep up with the pace of interaction discovery, partly because protein-protein interaction information can be error-prone and require considerable effort to annotate. Here, we present an update on the online database resource Search Tool for the Retrieval of Interacting Genes (STRING); it provides uniquely comprehensive coverage and ease of access to both experimental as well as predicted interaction information. Interactions in STRING are provided with a confidence score, and accessory information such as protein domains and 3D structures is made available, all within a stable and consistent identifier space. New features in STRING include an interactive network viewer that can cluster networks on demand, updated on-screen previews of structural information including homology models, extensive data updates and strongly improved connectivity and integration with third-party resources. Version 9.0 of STRING covers more than 1100 completely sequenced organisms; the resource can be reached at http://string-db.org.

3,239 citations


Cites background or methods from "BioGPS: an extensible and customiza..."

  • ...A notable example for this is the BioGPS Community Gene Portal System (53); this site provides ‘plugins’ through which users can connect any number of external websites into freely configurable screen layouts....

    [...]

  • ...Second, partner websites can choose to embed the entire STRING website into their own pages (52,53), for example, using HTML inline frames (iframes)....

    [...]

Journal Article
TL;DR: Quantitative transcriptomics (RNA-Seq) is used to classify the tissue-specific expression of genes across a representative set of all major human organs and tissues, and this analysis is combined with antibody-based profiling of the same tissues.

2,512 citations


Cites methods from "BioGPS: an extensible and customiza..."

  • ...On the RNA level, the FANTOM consortium has been initiated to map the transcriptional space of the human genome and several gene expression atlases for RNA expression data have been launched, such as the original work to create a gene atlas by integrating mouse and human expression data from multiple tissues using microarrays (7), the BioGPS portal with expression data from numerous sources (8), the repository ArrayExpress (9) and the RNA-Seq Atlas (10), with transcriptomics data based on deep sequencing from eleven normal human tissues....

    [...]

Journal Article
TL;DR: GeneCards, the human gene compendium, enables researchers to effectively navigate and inter‐relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways and provides a stronger foundation for the GeneCards suite of companion databases and analysis tools.
Abstract: GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc.

2,015 citations


Cites methods from "BioGPS: an extensible and customiza..."

  • ...BioGPS: An extensible and customizable portal for querying and organizing gene annotation resources....

    [...]

  • ...This section has the following subsections: a. mRNA expression graph: Provides normal tissue expression profiles for the gene, via experimental results from BioGPS (Wu et al., 2009), GTEx (Lonsdale et al., 2013), Illumina Body Map, and SAGE....

    [...]

  • ...Color-coded bars indicate the expression level in different tissues, as reported by BioGPS, GTEx, and SAGE....

    [...]

01 Aug 2010
TL;DR: It is reported that mediator and cohesin physically and functionally connect the enhancers and core promoters of active genes in murine embryonic stem cells.
Abstract: Transcription factors control cell-specific gene expression programs through interactions with diverse coactivators and the transcription apparatus. Gene activation may involve DNA loop formation between enhancer-bound transcription factors and the transcription apparatus at the core promoter, but this process is not well understood. Here we report that mediator and cohesin physically and functionally connect the enhancers and core promoters of active genes in murine embryonic stem cells. Mediator, a transcriptional coactivator, forms a complex with cohesin, which can form rings that connect two DNA segments. The cohesin-loading factor Nipbl is associated with mediator–cohesin complexes, providing a means to load cohesin at promoters. DNA looping is observed between the enhancers and promoters occupied by mediator and cohesin. Mediator and cohesin co-occupy different promoters in different cells, thus generating cell-type-specific DNA loops linked to the gene expression program of each cell.

1,771 citations

Journal Article
TL;DR: A new web site with improved tools for pathway browsing and data analysis is developed, and orthology-based inferences of pathways in non-human species are made, applying Ensembl Compara to identify orthologs of curated human proteins in each of 20 other species.
Abstract: Reactome (http://www.reactome.org) is a collaboration among groups at the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University School of Medicine and The European Bioinformatics Institute, to develop an open source curated bioinformatics database of human pathways and reactions. Recently, we developed a new web site with improved tools for pathway browsing and data analysis. The Pathway Browser is a Systems Biology Graphical Notation (SBGN)-based visualization system that supports zooming, scrolling and event highlighting. It exploits PSIQUIC web services to overlay our curated pathways with molecular interaction data from the Reactome Functional Interaction Network and external interaction databases such as IntAct, BioGRID, ChEMBL, iRefIndex, MINT and STRING. Our Pathway and Expression Analysis tools enable ID mapping, pathway assignment and overrepresentation analysis of user-supplied data sets. To support pathway annotation and analysis in other species, we continue to make orthology-based inferences of pathways in non-human species, applying Ensembl Compara to identify orthologs of curated human proteins in each of 20 other species. The resulting inferred pathway sets can be browsed and analyzed with our Species Comparison tool. Collaborations are also underway to create manually curated data sets on the Reactome framework for chicken, Drosophila and rice.

1,460 citations


Cites background from "BioGPS: an extensible and customiza..."

  • ...This year, additional cross-references to RCSB Protein Data Bank (34), Comparative Toxicogenomics Database (35), DockBlaster (36), BioGPS (37) and dbSNP (38) have been added to the protein pages....

    [...]

References
Journal Article
TL;DR: This work overhauled its tool for finding preferential conservation of sequence motifs and applied it to the analysis of human 3'UTRs, increasing by nearly threefold the detected number of preferentially conserved miRNA target sites.
Abstract: MicroRNAs (miRNAs) are small endogenous RNAs that pair to sites in mRNAs to direct post-transcriptional repression. Many sites that match the miRNA seed (nucleotides 2–7), particularly those in 3' untranslated regions (3'UTRs), are preferentially conserved. Here, we overhauled our tool for finding preferential conservation of sequence motifs and applied it to the analysis of human 3'UTRs, increasing by nearly threefold the detected number of preferentially conserved miRNA target sites. The new tool more efficiently incorporates new genomes and more completely controls for background conservation by accounting for mutational biases, dinucleotide conservation rates, and the conservation rates of individual UTRs. The improved background model enabled preferential conservation of a new site type, the “offset 6mer,” to be detected. In total, >45,000 miRNA target sites within human 3'UTRs are conserved above background levels, and >60% of human protein-coding genes have been under selective pressure to maintain pairing to miRNAs. Mammalian-specific miRNAs have far fewer conserved targets than do the more broadly conserved miRNAs, even when considering only more recently emerged targets. Although pairing to the 3' end of miRNAs can compensate for seed mismatches, this class of sites constitutes less than 2% of all preferentially conserved sites detected. The new tool enables statistically powerful analysis of individual miRNA target sites, with the probability of preferentially conserved targeting (PCT) correlating with experimental measurements of repression. Our expanded set of target predictions (including conserved 3'-compensatory sites) is available at the TargetScan website, which displays the PCT for each site and each predicted target.

7,744 citations

Journal Article
TL;DR: KEGG PATHWAY is now supplemented with a new global map of metabolic pathways, which is essentially a combined map of about 120 existing pathway maps, and the KEGG resource is being expanded to suit the needs for practical applications.
Abstract: KEGG (http://www.genome.jp/kegg/) is a database of biological systems that integrates genomic, chemical and systemic functional information. KEGG provides a reference knowledge base for linking genomes to life through the process of PATHWAY mapping, which is to map, for example, a genomic or transcriptomic content of genes to KEGG reference pathways to infer systemic behaviors of the cell or the organism. In addition, KEGG provides a reference knowledge base for linking genomes to the environment, such as for the analysis of drug-target relationships, through the process of BRITE mapping. KEGG BRITE is an ontology database representing functional hierarchies of various biological objects, including molecules, cells, organisms, diseases and drugs, as well as relationships among them. KEGG PATHWAY is now supplemented with a new global map of metabolic pathways, which is essentially a combined map of about 120 existing pathway maps. In addition, smaller pathway modules are defined and stored in KEGG MODULE that also contains other functional units and complexes. The KEGG resource is being expanded to suit the needs for practical applications. KEGG DRUG contains all approved drugs in the US and Japan, and KEGG DISEASE is a new database linking disease genes, pathways, drugs and diagnostic markers.

5,352 citations


"BioGPS: an extensible and customiza..." refers background in this paper

  • ...The 'KEGG' layout shows biological pathways relevant to any particular gene of interest [15]....

    [...]

Journal Article
Ed S. Lein, Michael Hawrylycz, Nancy Ao, Mikael Ayres, Amy Bensinger, Amy Bernard, Andrew F. Boe, Mark S. Boguski, Kevin S. Brockway, Emi J. Byrnes, Lin Chen, Li Chen, Tsuey-Ming Chen, Mei Chi Chin, Jimmy Chong, Brian E. Crook, Aneta Czaplinska, Chinh Dang, Suvro Datta, Nick Dee, Aimee L. Desaki, Tsega Desta, Ellen Diep, Tim A. Dolbeare, Matthew J. Donelan, Hong-Wei Dong, Jennifer G. Dougherty, Ben J. Duncan, Amanda Ebbert, Gregor Eichele, Lili K. Estin, Casey Faber, Benjamin A.C. Facer, Rick Fields, Shanna R. Fischer, Tim P. Fliss, Cliff Frensley, Sabrina N. Gates, Katie J. Glattfelder, Kevin R. Halverson, Matthew R. Hart, John G. Hohmann, Maureen P. Howell, Darren P. Jeung, Rebecca A. Johnson, Patrick T. Karr, Reena Kawal, Jolene Kidney, Rachel H. Knapik, Chihchau L. Kuan, James H. Lake, Annabel R. Laramee, Kirk D. Larsen, Christopher Lau, Tracy Lemon, Agnes J. Liang, Ying Liu, Lon T. Luong, Jesse Michaels, Judith J. Morgan, Rebecca J. Morgan, Marty Mortrud, Nerick Mosqueda, Lydia Ng, Randy Ng, Geralyn J. Orta, Caroline C. Overly, Tu H. Pak, Sheana Parry, Sayan Dev Pathak, Owen C. Pearson, Ralph B. Puchalski, Zackery L. Riley, Hannah R. Rockett, Stephen A. Rowland, Joshua J. Royall, Marcos J. Ruiz, Nadia R. Sarno, Katherine Schaffnit, Nadiya V. Shapovalova, Taz Sivisay, Clifford R. Slaughterbeck, Simon Smith, Kimberly A. Smith, Bryan I. Smith, Andy J. Sodt, Nick N. Stewart, Kenda-Ruth Stumpf, Susan M. Sunkin, Madhavi Sutram, Angelene Tam, Carey D. Teemer, Christina Thaller, Carol L. Thompson, Lee R. Varnam, Axel Visel, Ray M. Whitlock, Paul Wohnoutka, Crissa K. Wolkey, Victoria Y. Wong, Matthew J.A. Wood, Murat B. Yaylaoglu, Rob Young, Brian L. Youngstrom, Xu Feng Yuan, Bin Zhang, Theresa A. Zwingman, Allan R. Jones
11 Jan 2007 - Nature
TL;DR: An anatomically comprehensive digital atlas containing the expression patterns of ∼20,000 genes in the adult mouse brain is described, providing an open, primary data resource for a wide variety of further studies concerning brain organization and function.
Abstract: Molecular approaches to understanding the functional circuitry of the nervous system promise new insights into the relationship between genes, brain and behaviour. The cellular diversity of the brain necessitates a cellular resolution approach towards understanding the functional genomics of the nervous system. We describe here an anatomically comprehensive digital atlas containing the expression patterns of approximately 20,000 genes in the adult mouse brain. Data were generated using automated high-throughput procedures for in situ hybridization and data acquisition, and are publicly accessible online. Newly developed image-based informatics tools allow global genome-scale structural analysis and cross-correlation, as well as identification of regionally enriched genes. Unbiased fine-resolution analysis has identified highly specific cellular markers as well as extensive evidence of cellular heterogeneity not evident in classical neuroanatomical atlases. This highly standardized atlas provides an open, primary data resource for a wide variety of further studies concerning brain organization and function.

4,944 citations


"BioGPS: an extensible and customiza..." refers background in this paper

  • ...In addition, there are a wide variety of gene annotation sites targeting more specific communities, including a database describing the targets of the transcription factor CREB [7], the Allen Brain Atlas showing high-resolution expression information by in situ hybridization in the mouse brain [8], and the TargetScan database for microRNA target prediction [9]....

    [...]

Journal Article
TL;DR: Custom high-density oligonucleotide arrays that interrogate the expression of the vast majority of protein-encoding human and mouse genes are designed and used to profile a panel of 79 human and 61 mouse tissues.
Abstract: The tissue-specific pattern of mRNA expression can indicate important clues about gene function. High-density oligonucleotide arrays offer the opportunity to examine patterns of gene expression on a genome scale. Toward this end, we have designed custom arrays that interrogate the expression of the vast majority of protein-encoding human and mouse genes and have used them to profile a panel of 79 human and 61 mouse tissues. The resulting data set provides the expression patterns for thousands of predicted genes, as well as known and poorly characterized genes, from mice and humans. We have explored this data set for global trends in gene expression, evaluated commonly used lines of evidence in gene prediction methodologies, and investigated patterns indicative of chromosomal organization of transcription. We describe hundreds of regions of correlated transcription and show that some are subject to both tissue and parental allele-specific expression, suggesting a link between spatial expression and imprinting.

3,513 citations


"BioGPS: an extensible and customiza..." refers methods in this paper

  • ...In addition, BioGPS hosts a gene expression plugin that displays reference expression patterns from the Gene Atlas data sets [6] and expression quantitative trait loci studies [20], as well as new data sets for an updated mouse Gene Atlas [GEO:GSE10246] [12] and exon array atlas [GEO:GSE15998]....

    [...]

  • ...Moreover, we have included several reference datasets that have been extensively utilized in the microarray community through our SymAtlas website [6]....

    [...]

  • ...In the case of BioGPS, the default gene annotation report focuses on our reference 'Gene Atlas' data sets, which show gene expression patterns from a diverse set of tissues and cell types [6,12,13]....

    [...]

  • ...Other researchers may query reference Gene Atlas expression data using the SymAtlas web site [6]....

    [...]

Journal Article
TL;DR: The most important new developments in STRING 8 over previous releases include a URL-based programming interface, improved interaction prediction via genomic neighborhood in prokaryotes, and the inclusion of protein structures.
Abstract: Functional partnerships between proteins are at the core of complex cellular phenotypes, and the networks formed by interacting proteins provide researchers with crucial scaffolds for modeling, data reduction and annotation. STRING is a database and web resource dedicated to protein–protein interactions, including both physical and functional interactions. It weights and integrates information from numerous sources, including experimental repositories, computational prediction methods and public text collections, thus acting as a metadatabase that maps all interaction evidence onto a common set of genomes and proteins. The most important new developments in STRING 8 over previous releases include a URL-based programming interface, which can be used to query STRING from other resources, improved interaction prediction via genomic neighborhood in prokaryotes, and the inclusion of protein structures. Version 8.0 of STRING covers about 2.5 million proteins from 630 organisms, providing the most comprehensive view on protein–protein interactions currently available. STRING can be reached at http://string-db.org/.

2,394 citations


"BioGPS: an extensible and customiza..." refers methods in this paper

  • ...visit the STRING database for protein interaction data [5]....

    [...]