Author

Paul A. Thiessen

Bio: Paul A. Thiessen is an academic researcher at the National Institutes of Health. The author has contributed to research on PubChem and the Conserved Domain Database. The author has an h-index of 3 and has co-authored 4 publications receiving 2519 citations.

Papers

Journal Article•DOI•

PubChem Substance and Compound databases

Sunghwan Kim¹, Paul A. Thiessen¹, Evan E Bolton¹, Jie Chen¹, Gang Fu¹, Asta Gindulyte¹, Lianyi Han¹, Jane He¹, Siqian He¹, Benjamin A. Shoemaker¹, Jiyao Wang¹, Bo Yu¹, Jian-Jian Zhang¹, Stephen H. Bryant¹

National Institutes of Health¹

04 Jan 2016-Nucleic Acids Research

TL;DR: An overview of the PubChem Substance and Compound databases is provided, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access.

Abstract: PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public repository for information on chemical substances and their biological activities, launched in 2004 as a component of the Molecular Libraries Roadmap Initiatives of the US National Institutes of Health (NIH). For the past 11 years, PubChem has grown to a sizable system, serving as a chemical information resource for the scientific research community. PubChem consists of three inter-linked databases, Substance, Compound and BioAssay. The Substance database contains chemical information deposited by individual data contributors to PubChem, and the Compound database stores unique chemical structures extracted from the Substance database. Biological activity data of chemical substances tested in assay experiments are contained in the BioAssay database. This paper provides an overview of the PubChem Substance and Compound databases, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access. It also gives a brief description of PubChem3D, a resource derived from theoretical three-dimensional structures of compounds in PubChem, as well as PubChemRDF, Resource Description Framework (RDF)-formatted PubChem data for data sharing, analysis and integration with information contained in other databases.

3,328 citations

Journal Article•DOI•

PUG-SOAP and PUG-REST: web services for programmatic access to chemical information in PubChem

Sunghwan Kim¹, Paul A. Thiessen¹, Evan E Bolton¹, Stephen H. Bryant¹

National Institutes of Health¹

01 Jul 2015-Nucleic Acids Research

TL;DR: Two additional general purpose web services can be harnessed in combination to access the data contained in PubChem, which is integrated with the more than thirty databases available within the NCBI Entrez system.

Abstract: PubChem (http://pubchem.ncbi.nlm.nih.gov) is a public repository for information on chemical substances and their biological activities, developed and maintained by the US National Institutes of Health (NIH). PubChem contains more than 180 million depositor-provided chemical substance descriptions, 60 million unique chemical structures and 225 million bioactivity assay results, covering more than 9000 unique protein target sequences. As an information resource for the chemical biology research community, it routinely receives more than 1 million requests per day from an estimated more than 1 million unique users per month. Programmatic access to this vast amount of data is provided by several different systems, including the US National Center for Biotechnology Information (NCBI)'s Entrez Utilities (E-Utilities or E-Utils) and the PubChem Power User Gateway (PUG)-a common gateway interface (CGI) that exchanges data through eXtended Markup Language (XML). Further simplifying programmatic access, PubChem provides two additional general purpose web services: PUG-SOAP, which uses the simple object access protocol (SOAP) and PUG-REST, which is a Representational State Transfer (REST)-style interface. These interfaces can be harnessed in combination to access the data contained in PubChem, which is integrated with the more than thirty databases available within the NCBI Entrez system.

76 citations

Journal Article•DOI•

A structure-based method for protein sequence alignment

Maricel G. Kann¹, Paul A. Thiessen¹, Anna R. Panchenko¹, Alejandro A. Schäffer¹, Stephen F. Altschul¹, Stephen H. Bryant¹

National Institutes of Health¹

15 Apr 2005-Bioinformatics

TL;DR: SALTO aligns protein query sequences to position-specific scoring matrices (PSSMs) using rules for placing and scoring gaps that are consistent with the conserved regions of domain alignments from NCBI's Conserved Domain Database.

Abstract: Motivation: With the continuing rapid growth of protein sequence data, protein sequence comparison methods have become the most widely used tools of bioinformatics. Among these methods are those that use position-specific scoring matrices (PSSMs) to describe protein families. PSSMs can capture information about conserved patterns within families, which can be used to increase the sensitivity of searches for related sequences. Certain types of structural information, however, are not generally captured by PSSM search methods. Here we introduce a program, Structure-based ALignment TOol (SALTO), that aligns protein query sequences to PSSMs using rules for placing and scoring gaps that are consistent with the conserved regions of domain alignments from NCBI's Conserved Domain Database. Results: In most cases, the alignment scores obtained using the local alignment version follow an extreme value distribution. SALTO's performance in finding related sequences and producing accurate alignments is similar to or better than that of IMPALA; one advantage of SALTO is that it imposes an explicit gapping model on each protein family. Availability: A stand-alone version of the program that can generate global or local alignments is available by ftp distribution (ftp://ftp.ncbi.nih.gov/pub/SALTO/), and has been incorporated to Cn3D structure/alignment viewer. Contact: bryant@ncbi.nlm.nih.gov

24 citations

Book Chapter•DOI•

Bridging Chemical and Biological Information: Public Knowledge Spaces

Paul A. Thiessen¹, Wolf‐D. Ihlenfeldt, Evan E Bolton¹, Stephen H. Bryant¹

National Institutes of Health¹

14 Nov 2011

TL;DR: PubChem is probably the most widely known publicly accessible chemical compound database on the World Wide Web, but it is not the first freely available, Web-accessible database providing biological information on the Internet.

Abstract: At the time of this writing, PubChem is probably the most widely known publicly accessible chemical compound database on the World Wide Web (WWW, or just Web). It contains not only chemical structures, but also biological data linked to these structures. PubChem was launched in 2004, but it is certainly not the first freely available, Web-accessible database providing biological information on the Internet. The biological data landscape is complicated by varying definitions of what classes of information should be considered as biological information. Do toxicity data constitute biological information? If yes, should a qualifying database contain actual measurements, or can this information be provided in distilled, abstracted formats, perhaps even as material safety data sheets (MSDSs) or simple handling classifiers? Do we simply

Cited by

Journal Article•DOI•

SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules

Antoine Daina¹, Olivier Michielin¹,²,³, Vincent Zoete¹

Swiss Institute of Bioinformatics¹, Ludwig Institute for Cancer Research², University Hospital of Lausanne³

03 Mar 2017-Scientific Reports

TL;DR: The new SwissADME web tool is presented that gives free access to a pool of fast yet robust predictive models for physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness, including in-house methods such as the BOILED-Egg, iLOGP and Bioavailability Radar.

Abstract: To be effective as a drug, a potent molecule must reach its target in the body in sufficient concentration, and stay there in a bioactive form long enough for the expected biologic events to occur. Drug development involves assessment of absorption, distribution, metabolism and excretion (ADME) increasingly earlier in the discovery process, at a stage when considered compounds are numerous but access to the physical samples is limited. In that context, computer models constitute valid alternatives to experiments. Here, we present the new SwissADME web tool that gives free access to a pool of fast yet robust predictive models for physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness, among which in-house proficient methods such as the BOILED-Egg, iLOGP and Bioavailability Radar. Easy efficient input and interpretation are ensured thanks to a user-friendly interface through the login-free website http://www.swissadme.ch. Specialists, but also nonexpert in cheminformatics or computational chemistry can predict rapidly key parameters for a collection of molecules to support their drug discovery endeavours.

6,135 citations

Journal Article•DOI•

MetaboAnalyst 4.0: towards more transparent and integrative metabolomics analysis.

Jasmine Chong¹, Othman Soufan¹, Carin Li², Iurie Caraus¹, Shuzhao Li³, Guillaume Bourque¹, David S. Wishart², Jianguo Xia¹

McGill University¹, University of Alberta², Emory University³

02 Jul 2018-Nucleic Acids Research

TL;DR: The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions.

Abstract: We present a new update to MetaboAnalyst (version 4.0) for comprehensive metabolomic data analysis, interpretation, and integration with other omics data. Since the last major update in 2015, MetaboAnalyst has continued to evolve based on user feedback and technological advancements in the field. For this year's update, four new key features have been added to MetaboAnalyst 4.0, including: (1) real-time R command tracking and display coupled with the release of a companion MetaboAnalystR package; (2) a MS Peaks to Pathways module for prediction of pathway activity from untargeted mass spectral data using the mummichog algorithm; (3) a Biomarker Meta-analysis module for robust biomarker identification through the combination of multiple metabolomic datasets and (4) a Network Explorer module for integrative analysis of metabolomics, metagenomics, and/or transcriptomics data. The user interface of MetaboAnalyst 4.0 has been reengineered to provide a more modern look and feel, as well as to give more space and flexibility to introduce new functions. The underlying knowledgebases (compound libraries, metabolite sets, and metabolic pathways) have also been updated based on the latest data from the Human Metabolome Database (HMDB). A Docker image of MetaboAnalyst is also available to facilitate download and local installation of MetaboAnalyst. MetaboAnalyst 4.0 is freely available at http://metaboanalyst.ca.

2,857 citations

Journal Article•DOI•

HMDB 4.0: the human metabolome database for 2018.

David S. Wishart¹, Yannick Djoumbou Feunang¹, Ana Marcu¹, An Chi Guo¹, Kevin Y. H. Liang¹, Rosa Vázquez-Fresno¹, Tanvir Sajed¹, Daniel Johnson¹, Carin Li¹, Naama Karu¹, Zinat Sayeeda¹, Elvis J. Lo¹, Nazanin Assempour¹, Mark V. Berjanskii¹, Sandeep Singhal¹, David Arndt¹, Yongjie Liang¹, Hasan Badran¹, Jason R. Grant¹, Arnau Serra-Cayuela¹, Yifeng Liu¹, Rupa Mandal¹, Vanessa Neveu², Allison Pon¹, Craig Knox¹, Michael Wilson¹, Claudine Manach³, Augustin Scalbert²

University of Alberta¹, International Agency for Research on Cancer², Institut national de la recherche agronomique³

04 Jan 2018-Nucleic Acids Research

TL;DR: This year's update to the HMDB, HMDB 4.0, represents the most significant upgrade to the database in its history and should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science.

Abstract: The Human Metabolome Database or HMDB (www.hmdb.ca) is a web-enabled metabolomic database containing comprehensive information about human metabolites along with their biological roles, physiological concentrations, disease associations, chemical reactions, metabolic pathways, and reference spectra. First described in 2007, the HMDB is now considered the standard metabolomic resource for human metabolic studies. Over the past decade the HMDB has continued to grow and evolve in response to emerging needs for metabolomics researchers and continuing changes in web standards. This year's update, HMDB 4.0, represents the most significant upgrade to the database in its history. For instance, the number of fully annotated metabolites has increased by nearly threefold, the number of experimental spectra has grown by almost fourfold and the number of illustrated metabolic pathways has grown by a factor of almost 60. Significant improvements have also been made to the HMDB's chemical taxonomy, chemical ontology, spectral viewing, and spectral/text searching tools. A great deal of brand new data has also been added to HMDB 4.0. This includes large quantities of predicted MS/MS and GC-MS reference spectral data as well as predicted (physiologically feasible) metabolite structures to facilitate novel metabolite identification. Additional information on metabolite-SNP interactions and the influence of drugs on metabolite levels (pharmacometabolomics) has also been added. Many other important improvements in the content, the interface, and the performance of the HMDB website have been made and these should greatly enhance its ease of use and its potential applications in nutrition, biochemistry, clinical chemistry, clinical genetics, medicine, and metabolomics science.

2,608 citations

Journal Article•DOI•

PubChem 2019 update: improved access to chemical data

Sunghwan Kim¹, Jie Chen¹, Tiejun Cheng¹, Asta Gindulyte¹, Jia He¹, Siqian He¹, Qingliang Li¹, Benjamin A. Shoemaker¹, Paul A. Thiessen¹, Bo Yu¹, Leonid Zaslavsky¹, Jian Zhang¹, Evan E Bolton¹

National Institutes of Health¹

08 Jan 2019-Nucleic Acids Research

TL;DR: This paper describes the new developments in PubChem, a key chemical information resource for the biomedical research community, which released new web interfaces, such as PubChem Target View page, Sources page, Bioactivity dyad pages and Patent View page.

Abstract: PubChem (https://pubchem.ncbi.nlm.nih.gov) is a key chemical information resource for the biomedical research community. Substantial improvements were made in the past few years. New data content was added, including spectral information, scientific articles mentioning chemicals, and information for food and agricultural chemicals. PubChem released new web interfaces, such as PubChem Target View page, Sources page, Bioactivity dyad pages and Patent View page. PubChem also released a major update to PubChem Widgets and introduced a new programmatic access interface, called PUG-View. This paper describes these new developments in PubChem.

2,083 citations
