Journal Article
Fragment Database FDB-17.
TL;DR: A much smaller subset of GDB-17 is selected, called the fragment database FDB-17, which contains 10 million fragmentlike molecules evenly covering a broad value range for molecular size, polarity, and stereochemical complexity.
Abstract:
To better understand chemical space, we recently enumerated the database GDB-17, containing 166.4 billion possible molecules of up to 17 atoms of C, N, O, S, and halogen, following simple rules of chemical stability and synthetic feasibility. However, due to the combinatorial explosion caused by systematic enumeration, GDB-17 is strongly biased toward the largest, functionally and stereochemically most complex molecules and is far too large for most virtual screening tools. Herein we selected a much smaller subset of GDB-17, called the fragment database FDB-17, which contains 10 million fragmentlike molecules evenly covering a broad value range for molecular size, polarity, and stereochemical complexity. The database is available at www.gdb.unibe.ch for download and free use, together with an interactive visualization application and a Web-based nearest neighbor search tool to facilitate the selection of new fragment-sized molecules for chemical synthesis.
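The idea of a subset that "evenly covers" several property ranges can be illustrated with a small capped-bin sampling sketch. This is not the authors' selection procedure, which the abstract does not describe. It is a minimal illustration that assumes each molecule has already been reduced to a hypothetical descriptor tuple (size, polarity, stereochemical complexity) and that at most `per_bin` molecules are kept per distinct tuple. The function name `sample_evenly`, the record layout, and the toy data are all assumptions made for this example.

```python
import random
from collections import Counter, defaultdict


def sample_evenly(records, key, per_bin, seed=0):
    """Keep at most `per_bin` records from each bin defined by `key`.

    Bins that hold more than `per_bin` records are randomly downsampled,
    so heavily populated bins no longer dominate the selection.
    """
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    bins = defaultdict(list)
    for rec in records:
        bins[key(rec)].append(rec)

    selected = []
    for bin_key in sorted(bins):
        members = bins[bin_key]
        if len(members) > per_bin:
            members = rng.sample(members, per_bin)
        selected.extend(members)
    return selected


# Toy population (hypothetical): each record is (id, (size, polarity, stereo)).
# It is deliberately skewed toward the largest, most complex bin.
toy = (
    [(f"big{i}", (17, 5, 4)) for i in range(1000)]
    + [(f"mid{i}", (8, 1, 0)) for i in range(10)]
    + [(f"small{i}", (5, 0, 0)) for i in range(3)]
)

subset = sample_evenly(toy, key=lambda rec: rec[1], per_bin=5)
counts = Counter(rec[1] for rec in subset)
# counts: 3 records in (5, 0, 0), 5 in (8, 1, 0), 5 in (17, 5, 4)
print(len(subset), dict(counts))
```

Capping each bin rather than sampling the whole population at a fixed rate is what flattens the distribution: sparsely populated bins are kept whole, while the overrepresented bin is cut down to the same ceiling.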
Citations
Journal Article
Concepts of Artificial Intelligence for Computer-Assisted Drug Discovery
TL;DR: The current state of the art of AI-assisted pharmaceutical discovery is discussed, including applications in structure- and ligand-based virtual screening, de novo drug design, physicochemical and pharmacokinetic property prediction, drug repurposing, and related aspects.
Journal Article
Transfer Learning for Drug Discovery
TL;DR: This perspective provides an overview of transfer learning and its applications in drug discovery, and offers an outlook on the future development and application of transfer learning in this field.
Journal Article
Visualization of very large high-dimensional data sets as minimum spanning trees.
Daniel Probst, Jean-Louis Reymond
TL;DR: This paper applies a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree, to the most used chemistry data sets including databases of molecules such as ChEMBL, FDB17, the Natural Products Atlas, DSSTox, as well as to the MoleculeNet benchmark collection of data sets.
Journal Article
Machine-learning structural and electronic properties of metal halide perovskites using a hierarchical convolutional neural network
TL;DR: It is shown that a well-designed hierarchical ML approach predicts properties of the MHPs with higher fidelity than straightforward methods, underscoring the importance of a careful network design and a hierarchical approach to alleviate issues associated with imbalanced dataset distributions.
Journal Article
The Alexandria library, a quantum-chemical database of molecular properties for force field development.
TL;DR: The Alexandria library is presented as an open and freely accessible database of optimized molecular geometries, frequencies, electrostatic moments up to the hexadecupole, electrostatic potential, polarizabilities, and thermochemistry, obtained from quantum chemistry calculations for 2704 compounds.
References
Journal Article
SMILES, a chemical language and information system. 1. Introduction to methodology and encoding rules
TL;DR: This chapter discusses the construction of Benzenoid and Coronoid Hydrocarbons through the stages of enumeration, classification, and topological properties in a number of computers used for this purpose.
Journal Article
Extended-Connectivity Fingerprints
David Rogers, Mathew Hahn
TL;DR: ECFPs can be calculated very rapidly and can represent an essentially infinite number of different molecular features; a description of their implementation had not previously been presented in the literature.
Journal Article
PubChem Substance and Compound databases
Sunghwan Kim, Paul A. Thiessen, Evan E Bolton, Jie Chen, Gang Fu, Asta Gindulyte, Lianyi Han, Jane He, Siqian He, Benjamin A. Shoemaker, Jiyao Wang, Bo Yu, Jian-Jian Zhang, Stephen H. Bryant
TL;DR: An overview of the PubChem Substance and Compound databases is provided, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access.
Journal Article
ZINC 15 – Ligand Discovery for Everyone
Teague Sterling, John J. Irwin
TL;DR: A suite of ligand annotation, purchasability, target, and biology association tools is incorporated into ZINC and meant for investigators who are not computer specialists; the new analysis tools are easy for nonspecialists to use yet impose few limitations on experts.
Journal Article
DrugBank 4.0: shedding new light on drug metabolism
Vivian Law, Craig Knox, Yannick Djoumbou, Timothy Jewison, An Chi Guo, Yifeng Liu, Adam Maciejewski, David Arndt, Michael Wilson, Vanessa Neveu, Alexandra Tang, Geraldine Gabriel, Carol Ly, Sakina Adamjee, Zerihun T. Dame, Beomsoo Han, You Zhou, David S. Wishart
TL;DR: The latest update of DrugBank, DrugBank 4.0, has been further expanded to contain data on drug metabolism, absorption, distribution, metabolism, excretion and toxicity (ADMET) and other kinds of quantitative structure activity relationships (QSAR) information.