Author

Allan Peter Davis

Other affiliations: University of California, Los Angeles, University of Utah, Mount Desert Island Biological Laboratory

Bio: Allan Peter Davis is an academic researcher from North Carolina State University. The author has contributed to research in topics: Toxicogenomics & Environmental exposure. The author has an h-index of 28 and has co-authored 43 publications receiving 35566 citations. Previous affiliations of Allan Peter Davis include University of California, Los Angeles & University of Utah.

Papers

Journal Article

Gene Ontology: tool for the unification of biology

M. Ashburner¹, Catherine A. Ball, Judith A. Blake, David Botstein, Heather Butler, J. M. Cherry, Allan Peter Davis, Kara Dolinski, Selina S. Dwight, J.T. Eppig, Midori A. Harris, David P. Hill, Laurie Issel-Tarver, Andrew Kasarskis, Suzanna E. Lewis, John C. Matese, Joel E. Richardson, M. Ringwald, Gerald M. Rubin, Gavin Sherlock

Stanford University¹

01 May 2000-Nature Genetics

TL;DR: The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing.

Abstract: Genomic sequencing has made it clear that a large fraction of the genes specifying the core biological functions are shared by all eukaryotes. Knowledge of the biological role of such shared proteins in one organism can often be transferred to other organisms. The goal of the Gene Ontology Consortium is to produce a dynamic, controlled vocabulary that can be applied to all eukaryotes even as knowledge of gene and protein roles in cells is accumulating and changing. To this end, three independent ontologies accessible on the World-Wide Web (http://www.geneontology.org) are being constructed: biological process, molecular function and cellular component.

35,225 citations

Journal Article

The Comparative Toxicogenomics Database: update 2019.

Allan Peter Davis¹, Cynthia J. Grondin¹, Robin J. Johnson¹, Daniela Sciaky¹, Roy McMorran², Jolene Wiegers¹, Thomas C. Wiegers¹, Carolyn J. Mattingly¹

North Carolina State University¹, Mount Desert Island Biological Laboratory²

08 Jan 2019-Nucleic Acids Research

TL;DR: This biennial update presents a new chemical–phenotype module that codes chemical-induced effects on phenotypes, curated using controlled vocabularies for chemicals, phenotypes, taxa, and anatomical descriptors, and describes new querying and display features for the enhanced chemical–exposure science module, providing greater scope of content and utility.

Abstract: The Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is a premier public resource for literature-based, manually curated associations between chemicals, gene products, phenotypes, diseases, and environmental exposures. In this biennial update, we present our new chemical-phenotype module that codes chemical-induced effects on phenotypes, curated using controlled vocabularies for chemicals, phenotypes, taxa, and anatomical descriptors; this module provides unique opportunities to explore cellular and system-level phenotypes of the pre-disease state and allows users to construct predictive adverse outcome pathways (linking chemical-gene molecular initiating events with phenotypic key events, diseases, and population-level health outcomes). We also report a 46% increase in CTD manually curated content, which when integrated with other datasets yields more than 38 million toxicogenomic relationships. We describe new querying and display features for our enhanced chemical-exposure science module, providing greater scope of content and utility. As well, we discuss an updated MEDIC disease vocabulary with over 1700 new terms and accession identifiers. To accommodate these increases in data content and functionality, CTD has upgraded its computational infrastructure. These updates continue to improve CTD and help inform new testable hypotheses about the etiology and mechanisms underlying environmentally influenced diseases.

716 citations

Journal Article

BioCreative V CDR task corpus: a resource for chemical disease relation extraction

Jiao Li¹, Yueping Sun¹, Robin J. Johnson², Daniela Sciaky², Chih-Hsuan Wei, Robert Leaman, Allan Peter Davis², Carolyn J. Mattingly², Thomas C. Wiegers², Zhiyong Lu

Peking Union Medical College¹, North Carolina State University²

01 Jan 2016-Database

TL;DR: The BC5CDR corpus was successfully used for the BioCreative V challenge tasks and should serve as a valuable resource for the text-mining research community.

Abstract: Community-run, formal evaluations and manually annotated text corpora are critically important for advancing biomedical text-mining research. Recently in BioCreative V, a new challenge was organized for the tasks of disease named entity recognition (DNER) and chemical-induced disease (CID) relation extraction. Given the nature of both tasks, a test collection is required to contain both disease/chemical annotations and relation annotations in the same set of articles. Despite previous efforts in biomedical corpus construction, none was found to be sufficient for the task. Thus, we developed our own corpus called BC5CDR during the challenge by inviting a team of Medical Subject Headings (MeSH) indexers for disease/chemical entity annotation and Comparative Toxicogenomics Database (CTD) curators for CID relation annotation. To ensure high annotation quality and productivity, detailed annotation guidelines and automatic annotation tools were provided. The resulting BC5CDR corpus consists of 1500 PubMed articles with 4409 annotated chemicals, 5818 diseases and 3116 chemical-disease interactions. Each entity annotation includes both the mention text spans and normalized concept identifiers, using MeSH as the controlled vocabulary. To ensure accuracy, the entities were first captured independently by two annotators followed by a consensus annotation: the average inter-annotator agreement (IAA) scores were 87.49% and 96.05% for the disease and chemicals, respectively, in the test set according to the Jaccard similarity coefficient. Our corpus was successfully used for the BioCreative V challenge tasks and should serve as a valuable resource for the text-mining research community. Database URL: http://www.biocreative.org/tasks/biocreative-v/track-3-cdr/.

605 citations

Journal Article

Absence of radius and ulna in mice lacking hoxa-11 and hoxd-11.

Allan Peter Davis¹, David P. Witte², Hsiu Mei Hsieh-Li³, S. Steven Potter², Mario R. Capecchi¹

University of Utah¹, University of Cincinnati², University of Cincinnati Academic Health Center³

29 Jun 1995-Nature

TL;DR: Double mutants are generated which have dramatic phenotypes not apparent in mice homozygous for the individual mutations, and suggest that paralogous Hox genes function together to specify limb outgrowth and patterning along the proximodistal axis.

Abstract: Mice with targeted disruptions [1] in Hox genes have been generated to evaluate the role of the Hox complex in determining the mammalian body plan. This complex of 38 genes encodes transcription factors that specify regional information along the embryonic axes. Early in vertebrate evolution an ancestral complex shared with invertebrates was duplicated twice to give rise to the four linkage groups (Hox A, B, C and D) [2,3]. As a consequence, corresponding genes on the separate linkage groups, called paralogues, are most closely related to each other. Based on sequence similarities, the Hox genes have been subdivided into 13 paralogous groups. The five most 5′ groups (Hox9–13) pattern the posterior region of the vertebrate embryo and the appendicular skeleton [4–18]. Mice with individual mutations in the paralogous genes hoxa-11 and hoxd-11 have been described [15–18]. By breeding these two strains together we have generated double mutants which have dramatic phenotypes not apparent in mice homozygous for the individual mutations. The radius and the ulna of the forelimb are almost entirely eliminated, the axial skeleton shows homeotic transformations, and there are severe kidney defects not present in either single mutant. The limb and axial phenotypes are quantitative: as more mutant alleles are added to the genotype, the phenotype becomes progressively more severe. The appendicular skeleton defects suggest that paralogous Hox genes function together to specify limb outgrowth and patterning along the proximodistal axis.

594 citations

Journal Article

Comparative Toxicogenomics Database (CTD): update 2021.

Allan Peter Davis¹, Cynthia J. Grondin¹, Robin J. Johnson¹, Daniela Sciaky¹, Jolene Wiegers¹, Thomas C. Wiegers¹, Carolyn J. Mattingly¹

North Carolina State University¹

08 Jan 2021-Nucleic Acids Research

TL;DR: This biennial update of the public Comparative Toxicogenomics Database (CTD) reports a 20% increase in CTD curated content and provides 45 million toxicogenomic relationships and introduces new CTD Anatomy pages that allow users to uniquely explore and analyze chemical–phenotype interactions from an anatomical perspective.

Abstract: The public Comparative Toxicogenomics Database (CTD; http://ctdbase.org/) is an innovative digital ecosystem that relates toxicological information for chemicals, genes, phenotypes, diseases, and exposures to advance understanding about human health. Literature-based, manually curated interactions are integrated to create a knowledgebase that harmonizes cross-species heterogeneous data for chemical exposures and their biological repercussions. In this biennial update, we report a 20% increase in CTD curated content and now provide 45 million toxicogenomic relationships for over 16 300 chemicals, 51 300 genes, 5500 phenotypes, 7200 diseases and 163 000 exposure events, from 600 comparative species. Furthermore, we increase the functionality of chemical-phenotype content with new data-tabs on CTD Disease pages (to help fill in knowledge gaps for environmental health) and new phenotype search parameters (for Batch Query and Venn analysis tools). As well, we introduce new CTD Anatomy pages that allow users to uniquely explore and analyze chemical-phenotype interactions from an anatomical perspective. Finally, we have enhanced CTD Chemical pages with new literature-based chemical synonyms (to improve querying) and added 1600 amino acid-based compounds (to increase chemical landscape). Together, these updates continue to augment CTD as a powerful resource for generating testable hypotheses about the etiologies and molecular mechanisms underlying environmentally influenced diseases.

515 citations

Cited by

Journal Article

Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks

Paul Shannon¹, Andrew Markiel, Owen Ozier, Nitin S. Baliga, Jonathan T. Wang, Daniel Ramage, Nada Amin, Benno Schwikowski, Trey Ideker

Institute for Systems Biology¹

01 Nov 2003-Genome Research

TL;DR: Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

Abstract: Cytoscape is an open source software project for integrating biomolecular interaction networks with high-throughput expression data and other molecular states into a unified conceptual framework. Although applicable to any system of molecular components and interactions, Cytoscape is most powerful when used in conjunction with large databases of protein-protein, protein-DNA, and genetic interactions that are increasingly available for humans and model organisms. Cytoscape's software Core provides basic functionality to layout and query the network; to visually integrate the network with expression profiles, phenotypes, and other molecular states; and to link the network to databases of functional annotations. The Core is extensible through a straightforward plug-in architecture, allowing rapid development of additional computational analyses and features. Several case studies of Cytoscape plug-ins are surveyed, including a search for interaction pathways correlating with changes in gene expression, a study of protein complexes involved in cellular recovery to DNA damage, inference of a combined physical/functional interaction network for Halobacterium, and an interface to detailed stochastic/kinetic gene regulatory models.

32,980 citations

Journal Article

limma powers differential expression analyses for RNA-sequencing and microarray studies

Matthew E. Ritchie¹, Belinda Phipson², Di Wu³, Yifang Hu¹, Charity W. Law⁴, Wei Shi¹, Gordon K. Smyth¹,⁵

Walter and Eliza Hall Institute of Medical Research¹, Royal Children's Hospital², Harvard University³, University of Zurich⁴, University of Melbourne⁵

20 Apr 2015-Nucleic Acids Research

TL;DR: The philosophy and design of the limma package is reviewed, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

Abstract: limma is an R/Bioconductor software package that provides an integrated solution for analysing data from gene expression experiments. It contains rich features for handling complex experimental designs and for information borrowing to overcome the problem of small sample sizes. Over the past decade, limma has been a popular choice for gene discovery through differential expression analyses of microarray and high-throughput PCR data. The package contains particularly strong facilities for reading, normalizing and exploring such data. Recently, the capabilities of limma have been significantly expanded in two important directions. First, the package can now perform both differential expression and differential splicing analyses of RNA sequencing (RNA-seq) data. All the downstream analysis tools previously restricted to microarray data are now available for RNA-seq as well. These capabilities allow users to analyse both RNA-seq and microarray data with very similar pipelines. Second, the package is now able to go past the traditional gene-wise expression analyses in a variety of ways, analysing expression profiles in terms of co-regulated sets of genes or in terms of higher-order expression signatures. This provides enhanced possibilities for biological interpretation of gene expression differences. This article reviews the philosophy and design of the limma package, summarizing both new and historical features, with an emphasis on recent enhancements and features that have not been previously described.

22,147 citations

Journal Article

clusterProfiler: an R Package for Comparing Biological Themes Among Gene Clusters

Guangchuang Yu¹, Li Gen Wang, Yanyan Han, Qing-Yu He

Jinan University¹

03 May 2012-Omics A Journal of Integrative Biology

TL;DR: An R package, clusterProfiler, is presented that automates the process of biological-term classification and the enrichment analysis of gene clusters and can be easily extended to other species and ontologies.

Abstract: Increasing quantitative data generated from transcriptomics and proteomics require integrative strategies for analysis. Here, we present an R package, clusterProfiler that automates the process of biological-term classification and the enrichment analysis of gene clusters. The analysis module and visualization module were combined into a reusable workflow. Currently, clusterProfiler supports three species, including humans, mice, and yeast. Methods provided in this package can be easily extended to other species and ontologies. The clusterProfiler package is released under Artistic-2.0 License within Bioconductor project. The source code and vignette are freely available at http://bioconductor.org/packages/release/bioc/html/clusterProfiler.html.

16,644 citations

Journal Article

WGCNA: an R package for weighted correlation network analysis.

Peter Langfelder¹, Steve Horvath¹

University of California, Los Angeles¹

29 Dec 2008-BMC Bioinformatics

TL;DR: The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis that includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software.

Abstract: Correlation networks are increasingly being used in bioinformatics applications. For example, weighted gene co-expression network analysis is a systems biology method for describing the correlation patterns among genes across microarray samples. Weighted correlation network analysis (WGCNA) can be used for finding clusters (modules) of highly correlated genes, for summarizing such clusters using the module eigengene or an intramodular hub gene, for relating modules to one another and to external sample traits (using eigengene network methodology), and for calculating module membership measures. Correlation networks facilitate network based gene screening methods that can be used to identify candidate biomarkers or therapeutic targets. These methods have been successfully applied in various biological contexts, e.g. cancer, mouse genetics, yeast genetics, and analysis of brain imaging data. While parts of the correlation network methodology have been described in separate publications, there is a need to provide a user-friendly, comprehensive, and consistent software implementation and an accompanying tutorial. The WGCNA R software package is a comprehensive collection of R functions for performing various aspects of weighted correlation network analysis. The package includes functions for network construction, module detection, gene selection, calculations of topological properties, data simulation, visualization, and interfacing with external software. Along with the R package we also present R software tutorials. While the methods development was motivated by gene expression data, the underlying data mining approach can be applied to a variety of different settings. The WGCNA package provides R functions for weighted correlation network analysis, e.g. co-expression network analysis of gene expression data. The R package along with its source code and additional material are freely available at http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork/Rpackages/WGCNA.

14,243 citations

Journal Article

Bioinformatics enrichment tools: paths toward the comprehensive functional analysis of large gene lists

Da-Wei Huang¹, Brad T. Sherman¹, Richard A. Lempicki¹

Science Applications International Corporation¹

01 Jan 2009-Nucleic Acids Research

TL;DR: The survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

Abstract: Functional analysis of large gene lists, derived in most cases from emerging high-throughput genomic, proteomic and bioinformatics scanning approaches, is still a challenging and daunting task. The gene-annotation enrichment analysis is a promising high-throughput strategy that increases the likelihood for investigators to identify biological processes most pertinent to their study. Approximately 68 bioinformatics enrichment tools that are currently available in the community are collected in this survey. Tools are uniquely categorized into three major classes, according to their underlying enrichment algorithms. The comprehensive collections, unique tool classifications and associated questions/issues will provide a more comprehensive and up-to-date view regarding the advantages, pitfalls and recent trends in a simpler tool-class level rather than by a tool-by-tool approach. Thus, the survey will help tool designers/developers and experienced end users understand the underlying algorithms and pertinent details of particular tool categories/tools, enabling them to make the best choices for their particular research interests.

13,102 citations
