Journal ArticleDOI

ChEMBL: a large-scale bioactivity database for drug discovery

TL;DR: ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds, manually abstracted from the primary literature and then curated and standardized to maximize quality and utility across a wide range of chemical biology and drug-discovery research problems.
Abstract: ChEMBL is an Open Data database containing binding, functional and ADMET information for a large number of drug-like bioactive compounds. These data are manually abstracted from the primary published literature on a regular basis, then further curated and standardized to maximize their quality and utility across a wide range of chemical biology and drug-discovery research problems. Currently, the database contains 5.4 million bioactivity measurements for more than 1 million compounds and 5200 protein targets. Access is available through a web-based interface, data downloads and web services at: https://www.ebi.ac.uk/chembldb.


Citations
Journal ArticleDOI
TL;DR: The particular strengths of TCMSP are the composition of the large number of herbal entries, and the ability to identify drug-target networks and drug-disease networks, which will help reveal the mechanisms of action of Chinese herbs, uncover the nature of TCM theory and develop new herb-oriented drugs.
Abstract: Modern medicine often clashes with traditional medicine such as Chinese herbal medicine because of the limited understanding of the underlying mechanisms of action of the herbs. In an effort to promote integration of both sides and to accelerate the drug discovery from herbal medicines, an efficient systems pharmacology platform that represents ideal information convergence of pharmacochemistry, ADME properties, drug-likeness, drug targets, associated diseases and interaction networks is urgently needed. The traditional Chinese medicine systems pharmacology database and analysis platform (TCMSP) was built based on the framework of systems pharmacology for herbal medicines. It consists of all the 499 Chinese herbs registered in the Chinese pharmacopoeia with 29,384 ingredients, 3,311 targets and 837 associated diseases. Twelve important ADME-related properties like human oral bioavailability, half-life, drug-likeness, Caco-2 permeability, blood-brain barrier and Lipinski’s rule of five are provided for drug screening and evaluation. TCMSP also provides drug targets and diseases of each active compound, which can automatically establish the compound-target and target-disease networks that let users view and analyze the drug action mechanisms. It is designed to fuel the development of herbal medicines and to promote integration of modern medicine and traditional medicine for drug discovery and development. The particular strengths of TCMSP are the composition of the large number of herbal entries, and the ability to identify drug-target networks and drug-disease networks, which will help reveal the mechanisms of action of Chinese herbs, uncover the nature of TCM theory and develop new herb-oriented drugs. TCMSP is freely available at http://sm.nwsuaf.edu.cn/lsp/tcmsp.php .

2,451 citations


Cites methods from "ChEMBL: a large-scale bioactivity d..."

  • ...Structure files of molecules were downloaded from PubChem [18] Compound database, ChEMBL [19] and ChemSpider [20], or produced by ISIS Draw 2.5 (MDL Information Systems, Inc.) and further optimized by Sybyl 6.9 (Tripos, Inc.) with Sybyl force field and default parameters [2,21]....


  • ...Gaulton A, Bellis LJ, Bento AP, Chambers J, Davies M, Hersey A, Light Y, McGlinchey S, Michalovich D, Al-Lazikani B, Overington JP: ChEMBL: a large-scale bioactivity database for drug discovery....


Journal ArticleDOI
TL;DR: The database contains over twenty million commercially available molecules in biologically relevant representations that may be downloaded in popular ready-to-dock formats and subsets and is freely available at zinc.docking.org.
Abstract: ZINC is a free public resource for ligand discovery. The database contains over twenty million commercially available molecules in biologically relevant representations that may be downloaded in popular ready-to-dock formats and subsets. The Web site also enables searches by structure, biological activity, physical property, vendor, catalog number, name, and CAS number. Small custom subsets may be created, edited, shared, docked, downloaded, and conveyed to a vendor for purchase. The database is maintained and curated for a high purchasing success rate and is freely available at zinc.docking.org.

2,144 citations

Journal ArticleDOI
TL;DR: A suite of ligand annotation, purchasability, target, and biology association tools, incorporated into ZINC and meant for investigators who are not computer specialists, offer new analysis tools that are easy for nonspecialists yet with few limitations for experts.
Abstract: Many questions about the biological activity and availability of small molecules remain inaccessible to investigators who could most benefit from their answers. To narrow the gap between chemoinformatics and biology, we have developed a suite of ligand annotation, purchasability, target, and biology association tools, incorporated into ZINC and meant for investigators who are not computer specialists. The new version contains over 120 million purchasable “drug-like” compounds – effectively all organic molecules that are for sale – a quarter of which are available for immediate delivery. ZINC connects purchasable compounds to high-value ones such as metabolites, drugs, natural products, and annotated compounds from the literature. Compounds may be accessed by the genes for which they are annotated as well as the major and minor target classes to which those genes belong. It offers new analysis tools that are easy for nonspecialists yet with few limitations for experts. ZINC retains its original 3D roots – ...

2,115 citations

Journal ArticleDOI
TL;DR: The latest update of DrugBank, DrugBank 4.0, has been further expanded to contain data on drug metabolism, absorption, distribution, metabolism, excretion and toxicity (ADMET) and other kinds of quantitative structure activity relationships (QSAR) information.
Abstract: DrugBank (http://www.drugbank.ca) is a comprehensive online database containing extensive biochemical and pharmacological information about drugs, their mechanisms and their targets. Since it was first described in 2006, DrugBank has rapidly evolved, both in response to user requests and in response to changing trends in drug research and development. Previous versions of DrugBank have been widely used to facilitate drug and in silico drug target discovery. The latest update, DrugBank 4.0, has been further expanded to contain data on drug metabolism, absorption, distribution, metabolism, excretion and toxicity (ADMET) and other kinds of quantitative structure activity relationships (QSAR) information. These enhancements are intended to facilitate research in xenobiotic metabolism (both prediction and characterization), pharmacokinetics, pharmacodynamics and drug design/discovery. For this release, >1200 drug metabolites (including their structures, names, activity, abundance and other detailed data) have been added along with >1300 drug metabolism reactions (including metabolizing enzymes and reaction types) and dozens of drug metabolism pathways. Another 30 predicted or measured ADMET parameters have been added to each DrugCard, bringing the average number of quantitative ADMET values for Food and Drug Administration-approved drugs close to 40. Referential nuclear magnetic resonance and MS spectra have been added for almost 400 drugs as well as spectral and mass matching tools to facilitate compound identification. This expanded collection of drug information is complemented by a number of new or improved search tools, including one that provides a simple analyses of drug-target, -enzyme and -transporter associations to provide insight on drug-drug interactions.

1,799 citations


Cites background from "ChEMBL: a large-scale bioactivity d..."

  • ...Additional experimental drugs were added from a variety of literature sources and online databases, such as ChEMBL (11)....


Journal ArticleDOI
TL;DR: While the intrinsic complexity of natural product-based drug discovery necessitates highly integrated interdisciplinary approaches, the reviewed scientific developments, recent technological advances, and research trends clearly indicate that natural products will be among the most important sources of new drugs in the future.

1,760 citations


Cites background from "ChEMBL: a large-scale bioactivity d..."

  • ...When applying this method to plant constituents, however, it is important to stress that natural products often differ from typical synthetic pharmacologically active molecules (e.g., size, number of aromatic rings, flexibility), that are mainly found in bioactivity databases, such as CHEMBL (Gaulton et al., 2012) or PubChem (Li et al., 2010)....


References
Journal ArticleDOI
TL;DR: This article describes experimental and computational approaches to estimate solubility and permeability in discovery and development settings. Its rule of 5 predicts that poor absorption or permeation is more likely when there are more than 5 H-bond donors, more than 10 H-bond acceptors, the molecular weight is greater than 500, and the calculated Log P (CLogP) is greater than 5 (or MLogP > 4.15).

14,026 citations
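
The rule of 5 summarized in this entry is simple enough to express as a threshold check. The following is a minimal sketch, not the authors' implementation: it assumes the descriptors have already been computed elsewhere, and the `Descriptors` container, the function names, and the example values are all hypothetical. The molecular-weight criterion (greater than 500) comes from the published rule, and treating two or more exceeded thresholds as the alert is a common convention that is assumed here rather than taken from the listing.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for this sketch)."""
    h_bond_donors: int
    h_bond_acceptors: int
    molecular_weight: float  # daltons
    clogp: float             # calculated log P


def rule_of_five_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-5 thresholds are exceeded."""
    checks = [
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
        d.molecular_weight > 500,
        d.clogp > 5,
    ]
    return sum(checks)


def poor_absorption_flag(d: Descriptors, min_violations: int = 2) -> bool:
    """Flag a compound when at least `min_violations` thresholds are exceeded.

    The default of 2 is an assumption of this sketch, not a value from the listing.
    """
    return rule_of_five_violations(d) >= min_violations


# Illustrative inputs only; these numbers are not taken from the source.
small = Descriptors(h_bond_donors=1, h_bond_acceptors=4,
                    molecular_weight=180.2, clogp=1.2)
large = Descriptors(h_bond_donors=7, h_bond_acceptors=12,
                    molecular_weight=650.0, clogp=6.1)

print(rule_of_five_violations(small), poor_absorption_flag(small))
print(rule_of_five_violations(large), poor_absorption_flag(large))
```

All comparisons are strict, so a compound sitting exactly on a threshold (for example, exactly 5 donors) does not count as a violation.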

Journal ArticleDOI
TL;DR: In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI’s website.
Abstract: In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's website. NCBI resources include Entrez, PubMed, PubMed Central, LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR, OrfFinder, Spidey, RefSeq, UniGene, HomoloGene, ProtEST, dbMHC, dbSNP, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker, Evidence Viewer, Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SARS Coronavirus Resource, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD) and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov.

9,604 citations

Journal ArticleDOI
TL;DR: This letter is in response to your two Citizen Petitions, requesting that the Food and Drug Administration (FDA or the Agency) require a cancer warning on cosmetic talc products.
Abstract: This letter is in response to your two Citizen Petitions dated November 17, 1994 and May 13, 2008, requesting that the Food and Drug Administration (FDA or the Agency) require a cancer warning on cosmetic talc products. Your 1994 Petition requests that all cosmetic talc bear labels with a warning such as "Talcum powder causes cancer in laboratory animals. Frequent talc application in the female genital area increases the risk of ovarian cancer." Additionally, your 2008 Petition requests that cosmetic talcum powder products bear labels with a prominent warning such as: "Frequent talc application in the female genital area is responsible for major risks of ovarian cancer." Further, both of your Petitions specifically request, pursuant to 21 CFR 10.30(h)(2), a hearing for you to present scientific evidence in support of this petition.

9,350 citations

Journal ArticleDOI
TL;DR: The method, termed topological PSA (TPSA), provides results which are practically identical with the 3D PSA, while the computation speed is 2-3 orders of magnitude faster and may be used for fast bioavailability screening of virtual libraries having millions of molecules.
Abstract: Molecular polar surface area (PSA), i.e., surface belonging to polar atoms, is a descriptor that was shown to correlate well with passive molecular transport through membranes and, therefore, allows prediction of transport properties of drugs. The calculation of PSA, however, is rather time-consuming because of the necessity to generate a reasonable 3D molecular geometry and the calculation of the surface itself. A new approach for the calculation of the PSA is presented here, based on the summation of tabulated surface contributions of polar fragments. The method, termed topological PSA (TPSA), provides results which are practically identical with the 3D PSA (the correlation coefficient between 3D PSA and fragment-based TPSA for 34 810 molecules from the World Drug Index is 0.99), while the computation speed is 2-3 orders of magnitude faster. The new methodology may, therefore, be used for fast bioavailability screening of virtual libraries having millions of molecules. This article describes the new methodology and shows the results of validation studies based on sets of published absorption data, including intestinal absorption, Caco-2 monolayer penetration, and blood-brain barrier penetration.

2,400 citations
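
The fragment-summation idea behind TPSA in this entry can be illustrated with a short sketch. This is an assumption-laden illustration, not the published method's code: the dictionary keys are hypothetical labels, the four contribution values are recalled from the published fragment table and should be checked against it before any real use, and fragment counts are supplied by hand here, whereas a real implementation would detect polar fragments from the molecular structure.

```python
# Illustrative subset of polar-fragment contributions in square angstroms.
# Keys are hypothetical labels; values are recalled from the published table
# and should be verified against it.
FRAGMENT_TPSA = {
    "hydroxyl_OH": 20.23,
    "carbonyl_O": 17.07,
    "ether_O": 9.23,
    "primary_amine_NH2": 26.02,
}


def tpsa(fragment_counts: dict[str, int]) -> float:
    """Sum tabulated contributions over the polar fragments present in a molecule."""
    total = 0.0
    for name, count in fragment_counts.items():
        if name not in FRAGMENT_TPSA:
            raise KeyError(f"no tabulated contribution for fragment {name!r}")
        total += FRAGMENT_TPSA[name] * count
    return total


# A carboxylic acid group counted as one carbonyl oxygen plus one hydroxyl.
acid = tpsa({"carbonyl_O": 1, "hydroxyl_OH": 1})
print(round(acid, 2))
```

Because the calculation is a table lookup and a sum, it needs no 3D geometry, which is why the abstract reports it as 2-3 orders of magnitude faster than computing the 3D PSA.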

Journal ArticleDOI
TL;DR: The Fifth Edition of the 'Guide to Receptors and Channels' is a compilation of the major pharmacological targets divided into seven sections: G protein-coupled receptors, ligand-gated ion channels, ion channels, catalytic receptors, nuclear receptors, transporters and enzymes.
Abstract: The Fifth Edition of the 'Guide to Receptors and Channels' is a compilation of the major pharmacological targets divided into seven sections: G protein-coupled receptors, ligand-gated ion channels, ion channels, catalytic receptors, nuclear receptors, transporters and enzymes. These are presented with nomenclature guidance and summary information on the best available pharmacological tools, alongside suggestions for further reading. Available alongside this publication is a portal at http://www.GuideToPharmacology.org which is produced in close association with NC-IUPHAR and allows free online access to the information presented in the Fifth Edition.

2,066 citations