scispace - formally typeset
Search or ask a question
Author

Kara Dolinski

Bio: Kara Dolinski is an academic researcher from Princeton University. The author has contributed to research in topics: Genome & Reference genome. The author has an hindex of 47, co-authored 70 publications receiving 47508 citations. Previous affiliations of Kara Dolinski include Stanford University & University of British Columbia.


Papers
More filters
Journal ArticleDOI
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal ArticleDOI
Midori A. Harris, Jennifer I. Clark1, Ireland A1, Jane Lomax1, Michael Ashburner2, Michael Ashburner1, R. Foulger2, R. Foulger1, Karen Eilbeck3, Karen Eilbeck1, Suzanna E. Lewis1, Suzanna E. Lewis3, B. Marshall3, B. Marshall1, Christopher J. Mungall3, Christopher J. Mungall1, J. Richter1, J. Richter3, Gerald M. Rubin3, Gerald M. Rubin1, Judith A. Blake1, Carol J. Bult1, Dolan M1, Drabkin H1, Janan T. Eppig1, Hill Dp1, L. Ni1, Ringwald M1, Rama Balakrishnan4, Rama Balakrishnan1, J. M. Cherry1, J. M. Cherry4, Karen R. Christie4, Karen R. Christie1, Maria C. Costanzo1, Maria C. Costanzo4, Selina S. Dwight1, Selina S. Dwight4, Stacia R. Engel4, Stacia R. Engel1, Dianna G. Fisk1, Dianna G. Fisk4, Jodi E. Hirschman4, Jodi E. Hirschman1, Eurie L. Hong1, Eurie L. Hong4, Robert S. Nash4, Robert S. Nash1, Anand Sethuraman1, Anand Sethuraman4, Chandra L. Theesfeld1, Chandra L. Theesfeld4, David Botstein1, David Botstein5, Kara Dolinski5, Kara Dolinski1, Becket Feierbach5, Becket Feierbach1, Tanya Z. Berardini6, Tanya Z. Berardini1, S. Mundodi1, S. Mundodi6, Seung Y. Rhee6, Seung Y. Rhee1, Rolf Apweiler1, Daniel Barrell1, Camon E1, E. Dimmer1, Lee1, Rex L. Chisholm, Pascale Gaudet1, Pascale Gaudet7, Warren A. Kibbe1, Warren A. Kibbe7, Ranjana Kishore8, Ranjana Kishore1, Erich M. Schwarz8, Erich M. Schwarz1, Paul W. Sternberg1, Paul W. Sternberg8, M. Gwinn1, Hannick L1, Wortman J1, Matthew Berriman9, Matthew Berriman1, Wood9, Wood1, de la Cruz N1, de la Cruz N10, Peter J. Tonellato10, Peter J. Tonellato1, Pankaj Jaiswal11, Pankaj Jaiswal1, Seigfried T1, Seigfried T12, White R1, White R13 
TL;DR: The Gene Ontology (GO) project as discussed by the authors provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences.
Abstract: The Gene Ontology (GO) project (http://www.geneontology.org/) provides structured, controlled vocabularies and classifications that cover several domains of molecular and cellular biology and are freely available for community use in the annotation of genes, gene products and sequences. Many model organism databases and genome annotation groups use the GO and contribute their annotation sets to the GO resource. The GO database integrates the vocabularies and contributed annotations and provides full access to this information in several formats. Members of the GO Consortium continually work collectively, involving outside experts as needed, to expand and update the GO vocabularies. The GO Web resource also provides access to extensive documentation about the GO project and links to applications that use GO data for functional analyses.

3,565 citations

Journal ArticleDOI
TL;DR: A new dedicated aspect of BioGRID annotates genome-wide CRISPR/Cas9-based screens that report gene–phenotype and gene–gene relationships, and captures chemical interaction data, including chemical–protein interactions for human drug targets drawn from the DrugBank database and manually curated bioactive compounds reported in the literature.
Abstract: The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the curation and archival storage of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2018 (build 3.4.164), BioGRID contains records for 1 598 688 biological interactions manually annotated from 55 809 publications for 71 species, as classified by an updated set of controlled vocabularies for experimental detection methods. BioGRID also houses records for >700 000 post-translational modification sites. BioGRID now captures chemical interaction data, including chemical-protein interactions for human drug targets drawn from the DrugBank database and manually curated bioactive compounds reported in the literature. A new dedicated aspect of BioGRID annotates genome-wide CRISPR/Cas9-based screens that report gene-phenotype and gene-gene relationships. An extension of the BioGRID resource called the Open Repository for CRISPR Screens (ORCS) database (https://orcs.thebiogrid.org) currently contains over 500 genome-wide screens carried out in human or mouse cell lines. All data in BioGRID is made freely available without restriction, is directly downloadable in standard formats and can be readily incorporated into existing applications via our web service platforms. BioGRID data are also freely distributed through partner model organism databases and meta-databases.

1,046 citations

Journal ArticleDOI
TL;DR: The Biological General Repository for Interaction Datasets (BioGRID) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species.
Abstract: The Biological General Repository for Interaction Datasets (BioGRID: http//thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.

1,000 citations

Journal ArticleDOI
TL;DR: The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources that enable insights into conserved networks and pathways that are relevant to human health.
Abstract: The Biological General Repository for Interaction Datasets (BioGRID) is a public database that archives and disseminates genetic and protein interaction data from model organisms and humans (http://www.thebiogrid.org). BioGRID currently holds 347 966 interactions (170 162 genetic, 177 804 protein) curated from both high-throughput data sets and individual focused studies, as derived from over 23 000 publications in the primary literature. Complete coverage of the entire literature is maintained for budding yeast (Saccharomyces cerevisiae), fission yeast (Schizosaccharomyces pombe) and thale cress (Arabidopsis thaliana), and efforts to expand curation across multiple metazoan species are underway. The BioGRID houses 48 831 human protein interactions that have been curated from 10 247 publications. Current curation drives are focused on particular areas of biology to enable insights into conserved networks and pathways that are relevant to human health. The BioGRID 3.0 web interface contains new search and display features that enable rapid queries across multiple data types and sources. An automated Interaction Management System (IMS) is used to prioritize, coordinate and track curation across international sites and projects. BioGRID provides interaction data to several model organism databases, resources such as Entrez-Gene and other interaction meta-databases. The entire BioGRID 3.0 data collection may be downloaded in multiple file formats, including PSI MI XML. Source code for BioGRID 3.0 is freely available without any restrictions.

972 citations


Cited by
More filters
Journal ArticleDOI
TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.
Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal ArticleDOI
TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.
Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal ArticleDOI
Eric S. Lander1, Lauren Linton1, Bruce W. Birren1, Chad Nusbaum1  +245 moreInstitutions (29)
15 Feb 2001-Nature
TL;DR: The results of an international collaboration to produce and make freely available a draft sequence of the human genome are reported and an initial analysis is presented, describing some of the insights that can be gleaned from the sequence.
Abstract: The human genome holds an extraordinary trove of information about human development, physiology, medicine and evolution. Here we report the results of an international collaboration to produce and make freely available a draft sequence of the human genome. We also present an initial analysis of the data, describing some of the insights that can be gleaned from the sequence.

22,269 citations

Journal ArticleDOI
TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.
Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal ArticleDOI
TL;DR: An R package, clusterProfiler that automates the process of biological-term classification and the enrichment analysis of gene clusters and can be easily extended to other species and ontologies is presented.
Abstract: Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis Here, we present an R package, clusterProfiler that automates the process of biological-term classification and the enrichment analysis of gene clusters The analysis module and visualization module were combined into a reusable workflow Currently, clusterProfiler supports three species, including humans, mice, and yeast Methods provided in this package can be easily extended to other species and ontologies The clusterProfiler package is released under Artistic-20 License within Bioconductor project The source code and vignette are freely available at http://bioconductororg/packages/release/bioc/html/clusterProfilerhtml

16,644 citations