
Showing papers by "Alexander Tropsha published in 2014"


Journal ArticleDOI
TL;DR: In this paper, the authors provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive quantitative structure-activity relationship models.
Abstract: Quantitative structure–activity relationship modeling is one of the major computational tools employed in medicinal chemistry. However, throughout its entire history it has drawn both praise and criticism concerning its reliability, limitations, successes, and failures. In this paper, we discuss (i) the development and evolution of QSAR; (ii) the current trends, unsolved problems, and pressing challenges; and (iii) several novel and emerging applications of QSAR modeling. Throughout this discussion, we provide guidelines for QSAR development, validation, and application, which are summarized in best practices for building rigorously validated and externally predictive QSAR models. We hope that this Perspective will help communications between computational and experimental chemists toward collaborative development and use of QSAR models. We also believe that the guidelines presented here will help journal editors and reviewers apply more stringent scientific standards to manuscripts reporting new QSAR stu...

1,314 citations


Journal ArticleDOI
TL;DR: As the production and applications of ENMs rapidly expand, their environmental impacts and effects on human health are becoming increasingly significant, and a thorough understanding of how nanomaterials perturb cells and biological molecules is required.
Abstract: As defined by the European Commission, nanomaterial is a natural, incidental or manufactured material containing particles in an unbound state or as an aggregate or agglomerate in which ≥ 50% of the particles in the number size distribution have one or more external dimensions in the size range 1 to 100 nm. In specific cases and where warranted by concerns for the environment, health, safety or competition, the number size distribution threshold of 50% may be replaced with a threshold between 1 and 50%.1 Engineered nanomaterials (ENMs) refer to man-made nanomaterials. Materials in the nanometer range often possess unique physical, optical, electronic, and biological properties compared with larger particles, such as the strength of graphene,2 the electronic properties of carbon nanotubes (CNTs),3 the antibacterial activity of silver nanoparticles4 and the optical properties of quantum dots (QDs).5 The unique and advanced properties of ENMs have led to a rapid increase in their application. These applications include aerospace and airplanes, energy, architecture, chemicals and coatings, catalysts, environmental protection, computer memory, biomedicine and consumer products. Driven by these demands, the worldwide ENM production volume in 2016 is conservatively estimated in a market report by Future Markets to be 44,267 tons or ≥ $5 billion.6 As the production and applications of ENMs rapidly expand, their environmental impacts and effects on human health are becoming increasingly significant.7 Due to their small sizes, ENMs are easily made airborne.8 However, no accurate method to quantitatively measure their concentration in air currently exists. 
A recently reported incident of severe pulmonary fibrosis caused by inhaled polymer nanoparticles in seven female workers obtained much attention.9 In addition to the release of ENM waste from industrial sites, a major release of ENMs to environmental water occurs due to home and personal use of appliances, cosmetics and personal products, such as shampoo and sunscreen.10 Airborne and aqueous ENMs pose immediate danger to the human respiratory and gastrointestinal systems. ENMs may enter other human organs after they are absorbed into the bloodstream through the gastrointestinal or respiratory systems.11,12 Furthermore, ENMs in cosmetics and personal care products, such as lotion, sunscreen and shampoo may enter human circulation through skin penetration.13 ENMs are very persistent in the environment and are slowly degraded. The dissolved metal ions from ENMs can also revert back to nanoparticles under natural conditions.14 ENMs are stored in plants, microbes and animal organs and can be transferred and accumulated through the food chain.15,16 In addition to the accidental entry of ENMs into human and biological systems, ENMs are also purposefully injected into or enter humans for medicinal and diagnostic purposes.17 Therefore, interactions of ENMs with biological systems are inevitable. In addition to engineered nanomaterials, there are also naturally existing nanomaterials such as proteins and DNA molecules, which are key components of biological systems. These materials, combined with lipids and organic and inorganic small molecules, form the basic units of living systems –cells.18 To elucidate how nanomaterials affect organs and physiological functions, a thorough understanding of how nanomaterials perturb cells and biological molecules is required (Figure 1). 
Rapidly accumulating evidence indicates that ENMs interact with the basic components of biological systems, such as proteins, DNA molecules and cells.19-21 The driving forces for such interactions are quite complex and include the size, shape and surface properties (e.g., hydrophobicity, hydrogen-bonding capability, pi-bonds and stereochemical interactions) of ENMs.22-25 Evidence also indicates that chemical modifications on a nanoparticle’s surface alter its interactions with biological systems.26-28 These observations not only support the hypothesis that basic nano-bio interactions are mainly physicochemical in nature but also provide a powerful approach to controlling the nature and strength of a nanoparticle’s interactions with biological systems. Practically, a thorough understanding of the fundamental chemical interactions between nanoparticles and biological systems has two direct impacts. First, this knowledge will encourage and assist experimental approaches to chemically modify nanoparticle surfaces for various industrial or medicinal applications. Second, a range of chemical information can be combined with computational methods to investigate nano-biological properties and predict desired nanoparticle properties to direct experiments.29-31 The literature regarding nanoparticle-biological system interactions has increased exponentially in the past decade (Figure 2). However, a mechanistic understanding of the chemical basis for such complex interactions is still lacking. This review intends to explore such an understanding in the context of recent publications.
Figure 1: Interactions of nanoparticles with biological systems at different levels. Nanoparticles enter the human body through various pathways, reaching different organs and contacting tissues and cells. All of these interactions are based on nanoparticle-biomacromolecule ...
Figure 2: An analysis of literature statistics indicates growing concern for the topics that are the focus of this review. The number of publications and citations were obtained using the keywords “nanoparticles” and “biological systems” ...
A breakthrough technology cannot prosper without wide acceptance from the public and society; that is, it must pose minimal harm to human health and the environment. Nanotechnology is now facing such a critical challenge. We must elucidate the effects of ENMs on biological systems (such as biological molecules, human cells, organs and physiological systems). Accumulating experimental evidence suggests that nanoparticles interact with biological systems at nearly every level, often causing unwanted physiological consequences. Elucidating these interactions is the goal of this review. This endeavor will help regulate the proper application of ENMs in various products and their release into the environment. A more significant mission of this review is to direct the development of “safe-by-design” ENMs, as their demands for and applications continue to increase.

470 citations


Journal ArticleDOI
TL;DR: Two loci in the major histocompatibility complex are independently associated with clozapine-induced agranulocytosis/granulocytopenia (CIAG), and these associations dovetail with the roles of these genes in immunogenetic phenotypes and adverse drug responses for other medications, and provide insight into the pathophysiology of CIAG.
Abstract: Clozapine-induced agranulocytosis/granulocytopenia (CIAG) is a rare and potentially fatal adverse reaction to the antipsychotic drug clozapine. Here, the authors identify genetic variants in two immune-related genes that may contribute to the pathophysiology of CIAG.

143 citations


Journal ArticleDOI
TL;DR: A simple MODelability Index (MODI) is introduced that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds.
Abstract: We introduce a simple MODelability Index (MODI) that estimates the feasibility of obtaining predictive QSAR models (correct classification rate above 0.7) for a binary data set of bioactive compounds. MODI is defined as an activity class-weighted ratio of the number of nearest-neighbor pairs of compounds with the same activity class versus the total number of pairs. The MODI values were calculated for more than 100 data sets, and the threshold of 0.65 was found to separate the nonmodelable and modelable data sets.
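
The MODI definition above can be illustrated with a short sketch. This is one possible reading of the abstract, not the authors' implementation: it assumes Euclidean distance in descriptor space, takes each compound's single nearest neighbor, and averages over activity classes the fraction of compounds whose nearest neighbor shares their class. The function name `modi` and the toy data in the usage note are hypothetical.

```python
import numpy as np

def modi(X, y):
    """Illustrative MODI: mean over classes of the fraction of compounds
    whose nearest neighbor (Euclidean, assumed) has the same class."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all compounds.
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # A compound must not be its own nearest neighbor.
    np.fill_diagonal(dist, np.inf)
    nearest = dist.argmin(axis=1)
    same_class = y[nearest] == y
    # Class-weighted average: each class contributes equally.
    return float(np.mean([same_class[y == c].mean() for c in np.unique(y)]))
```

For example, `modi([[0], [1], [2.5], [10], [11]], [0, 0, 1, 1, 1])` gives about 0.83, which under the 0.65 threshold reported in the abstract would count as modelable. Averaging per class rather than over all compounds keeps a large majority class from dominating the index.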

106 citations


Journal ArticleDOI
TL;DR: This study aimed to generate predictive and well-characterized quantitative structure-activity relationship (QSAR) models for hERG blockage using the largest publicly available dataset of 11,958 compounds from the ChEMBL database and identified putative hERG blockers and non-blockers among currently marketed drugs.
Abstract: Several non-cardiovascular drugs have been withdrawn from the market due to their inhibition of hERG K+ channels that can potentially lead to severe heart arrhythmia and death. As hERG safety testing is a mandatory FDA-required procedure, there is considerable interest in developing predictive computational tools to identify and filter out potential hERG blockers early in the drug discovery process. In this study, we aimed to generate predictive and well-characterized quantitative structure–activity relationship (QSAR) models for hERG blockage using the largest publicly available dataset of 11,958 compounds from the ChEMBL database. The models have been developed and validated according to OECD guidelines using four types of descriptors and four different machine-learning techniques. The classification accuracies discriminating blockers from non-blockers were as high as 0.83-0.93 on the external set. Model interpretation revealed several SAR rules, which can guide structural optimization of some hERG blockers into non-blockers. We have also applied the generated models to screen the World Drug Index (WDI) database and identified putative hERG blockers and non-blockers among currently marketed drugs. The developed models can reliably identify blockers and non-blockers, which could be useful for the scientific community. A freely accessible web server has been developed allowing users to identify putative hERG blockers and non-blockers in chemical libraries of their interest (http://labmol.farmacia.ufg.br/predherg).

76 citations


Journal ArticleDOI
TL;DR: This study validates what the authors describe as the first small-molecule inhibitors of the yeast phosphatidylinositol transfer protein (PITP) Sec14, shows that these NPPMs are effective in vitro and in vivo, and demonstrates that they exhibit exquisite pathway selectivity in inhibiting phosphoinositide signaling in cells.
Abstract: Sec14-like phosphatidylinositol transfer proteins (PITPs) integrate diverse territories of intracellular lipid metabolism with stimulated phosphatidylinositol-4-phosphate production and are discriminating portals for interrogating phosphoinositide signaling. Yet, neither Sec14-like PITPs nor PITPs in general have been exploited as targets for chemical inhibition for such purposes. Herein, we validate what is to our knowledge the first small-molecule inhibitors (SMIs) of the yeast PITP Sec14. These SMIs are nitrophenyl(4-(2-methoxyphenyl)piperazin-1-yl)methanones (NPPMs) and are effective inhibitors in vitro and in vivo. We further establish that Sec14 is the sole essential NPPM target in yeast and that NPPMs exhibit exquisite targeting specificities for Sec14 (relative to related Sec14-like PITPs), propose a mechanism for how NPPMs exert their inhibitory effects and demonstrate that NPPMs exhibit exquisite pathway selectivity in inhibiting phosphoinositide signaling in cells. These data deliver proof of concept that PITP-directed SMIs offer new and generally applicable avenues for intervening with phosphoinositide signaling pathways with selectivities superior to those afforded by contemporary lipid kinase-directed strategies.

40 citations


Journal ArticleDOI
TL;DR: This study presents the first successful application of QSPR models for the computer-model-driven design of liposomal drugs.

39 citations


Journal ArticleDOI
TL;DR: The novel 5-HT1A actives identified with the QSAR-based virtual screening approach could be potentially developed as novel anxiolytics or potential antischizophrenic drugs.
Abstract: The 5-hydroxytryptamine 1A (5-HT1A) serotonin receptor has been an attractive target for treating mood and anxiety disorders such as schizophrenia. We have developed binary classification quantitative structure-activity relationship (QSAR) models of 5-HT1A receptor binding activity using data retrieved from the PDSP Ki database. The prediction accuracy of these models was estimated by external 5-fold cross-validation as well as using an additional validation set comprising 66 structurally distinct compounds from the World of Molecular Bioactivity database. These validated models were then used to mine three major types of chemical screening libraries, i.e., drug-like libraries, GPCR targeted libraries, and diversity libraries, to identify novel computational hits. The five best hits from each class of libraries were chosen for further experimental testing in radioligand binding assays, and nine of the 15 hits were confirmed to be active experimentally with binding affinity better than 10 μM. The most active compound, Lysergol, from the diversity library showed very high binding affinity (Ki) of 2.3 nM against 5-HT1A receptor. The novel 5-HT1A actives identified with the QSAR-based virtual screening approach could be potentially developed as novel anxiolytics or potential antischizophrenic drugs.

28 citations


Journal ArticleDOI
TL;DR: It is found that two agonist-bound THRβ conformations could effectively discriminate their corresponding ligands from presumed non-binders and one of the agonists from antagonists, which can eliminate the THR-mediated mechanism of action for chemicals of concern.

25 citations


Journal ArticleDOI
TL;DR: The development of The Children's Pharmacy Collaborative is described, which should grow over time, serve as a resource for professionals and families, and stimulate drug-repurposing efforts for a range of pediatric disorders.

18 citations


Journal ArticleDOI
TL;DR: A cheminformatics investigation of diverse drugs with known FGT penetration, using cluster analysis and quantitative structure-activity relationship (QSAR) modeling, is described; the models correctly predicted the penetration class of rilpivirine and dolutegravir.
Abstract: The exposure of oral antiretroviral (ARV) drugs in the female genital tract (FGT) is variable and almost unpredictable. Identifying an efficient method to find compounds with high tissue penetration would streamline the development of regimens for both HIV preexposure prophylaxis and viral reservoir targeting. Here we describe the cheminformatics investigation of diverse drugs with known FGT penetration using cluster analysis and quantitative structure–activity relationships (QSAR) modeling. A literature search over the 1950–2012 period identified 58 compounds (including 21 ARVs and representing 13 drug classes) associated with their actual concentration data for cervical or vaginal tissue, or cervicovaginal fluid. Cluster analysis revealed significant trends in the penetrative ability for certain chemotypes. QSAR models to predict genital tract concentrations normalized to blood plasma concentrations were developed with two machine learning techniques utilizing drugs' molecular descriptors and p...

Journal ArticleDOI
TL;DR: This review discusses several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and concludes that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone.
Abstract: Cheminformatics approaches such as Quantitative Structure Activity Relationship (QSAR) modeling have been used traditionally for predicting chemical toxicity. In recent years, high throughput biological assays have been increasingly employed to elucidate mechanisms of chemical toxicity and predict toxic effects of chemicals in vivo. The data generated in such assays can be considered as biological descriptors of chemicals that can be combined with molecular descriptors and employed in QSAR modeling to improve the accuracy of toxicity prediction. In this review, we discuss several approaches for integrating chemical and biological data for predicting biological effects of chemicals in vivo and compare their performance across several data sets. We conclude that while no method consistently shows superior performance, the integrative approaches rank consistently among the best yet offer enriched interpretation of models over those built with either chemical or biological data alone. We discuss the outlook for such interdisciplinary methods and offer recommendations to further improve the accuracy and interpretability of computational models that predict chemical toxicity.

Journal ArticleDOI
TL;DR: The high-throughput screening (HTS) Navigator software incorporates advanced cheminformatics capabilities such as chemical structure storage and visualization, fast similarity search and chemical neighborhood analysis for retrieved hits.
Abstract: Summary: We report on the development of the high-throughput screening (HTS) Navigator software to analyze and visualize the results of HTS of chemical libraries. The HTS Navigator processes output files from different plate readers' formats, computes the overall HTS matrix, automatically detects hits and has different types of baseline navigation and correction features. The software incorporates advanced cheminformatics capabilities such as chemical structure storage and visualization, fast similarity search and chemical neighborhood analysis for retrieved hits. The software is freely available for academic laboratories. Availability and implementation: http://fourches.web.unc.edu/ Contact: fourches@email.unc.edu Supplementary information: Supplementary data are available at Bioinformatics online.

Book ChapterDOI
01 Jan 2014
TL;DR: Several statistical criteria are proposed, which can with high confidence answer a question, whether it is possible to build a predictive model for a dataset prior to actual modeling, i.e. to establish, whether the dataset is modelable.
Abstract: It is not always possible to build predictive Quantitative Structure-Activity Relationships (QSAR) models for a given chemical dataset. In this work, we propose several statistical criteria that can, with high confidence, answer the question of whether it is possible to build a predictive model for a dataset prior to actual modeling, i.e., establish whether the dataset is modelable. Calculation of these criteria is fast, and using them in QSAR studies could dramatically reduce modelers’ time and efforts, as well as computational resources necessary to build QSAR models for at least some datasets, especially for those which are not modelable. The calculation of modelability criteria is based on the k-nearest neighbors approach. For all datasets, as modelability criteria we have proposed dataset diversity (MODI_DIV) and new activity cliff indices (MODI_ACI). For datasets with binary end points, as modelability criteria we have proposed the correct classification rate (MODI_CCR), with CCR = 0.5 × (sensitivity + specificity), for leave-one-out (LOO) cross-validation in the entire descriptor space, and the correct classification rate for similarity search (MODI_ssCCR) in the entire descriptor space with leave-20%-out (five-fold) cross-validation. For binary datasets, all these modelability criteria were tested on 42 datasets with previously generated QSAR models. The two latter criteria (MODI_CCR and MODI_ssCCR) were found to have high correlation with the predictivity of QSAR models (QSAR_CCR) and were additionally tested on 60 ToxCast end points with QSAR modeling results published recently (Thomas RS, Black MB, Li L, Healy E, Chu T-M, Bao W, Andersen MD, Wolfinger RD. Toxicol Sci: Off J Soc Toxicol 128(2):398–417, 2012). These modelability criteria can be used to classify many datasets as modelable or non-modelable. These criteria can be generalized to datasets with compounds belonging to more than two categories or classes.
Additionally, criteria which take into account errors of prediction, MODI_CAT_i and MODI_CLASS_i, were proposed for datasets with compounds belonging to more than two (i > 2) categories or classes and continuous end points, divided into i > 2 bins. For continuous end points, LOO cross-validation q² for similarity search with different numbers of nearest neighbors in the entire descriptor space (MODI_q²), and similarity search coefficient of determination (MODI_ssR²) in the entire descriptor space were proposed as modelability criteria. Our preliminary studies demonstrated high correlation between the external predictivity of QSAR models (QSAR_R²) and each of the MODI_q² and MODI_ssR². On the other hand, for datasets with any binary or continuous response variable, MODI_DIVs and MODI_ACIs were found to be less useful to establish dataset modelability.
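
A minimal sketch of the binary-end-point criterion described in this abstract, CCR = 0.5 × (sensitivity + specificity) computed from leave-one-out k-nearest-neighbor predictions, might look as follows. It rests on assumptions not stated in the source: Euclidean distance, majority voting among the k neighbors (odd k avoids ties), and class labels coded as 0 and 1. The names `loo_knn_predict` and `ccr` are hypothetical.

```python
import numpy as np

def loo_knn_predict(X, y, k=1):
    """Leave-one-out kNN predictions for binary labels (0/1).
    Euclidean distance and majority vote are assumptions."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=int)
    diff = X[:, None, :] - X[None, :, :]
    dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Setting the diagonal to infinity leaves each compound out of its own vote.
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]
    votes = y[neighbors].mean(axis=1)
    return (votes > 0.5).astype(int)

def ccr(y_true, y_pred):
    """Correct classification rate: 0.5 * (sensitivity + specificity)."""
    y_true = np.asarray(y_true, dtype=int)
    y_pred = np.asarray(y_pred, dtype=int)
    sensitivity = np.mean(y_pred[y_true == 1] == 1)
    specificity = np.mean(y_pred[y_true == 0] == 0)
    return 0.5 * (sensitivity + specificity)
```

With `X = [[0], [1], [2.5], [10], [11]]` and `y = [0, 0, 1, 1, 1]`, `ccr(y, loo_knn_predict(X, y, k=1))` gives about 0.83. Because sensitivity and specificity are averaged with equal weight, the score does not reward simply predicting the majority class on an imbalanced dataset.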

Journal ArticleDOI
TL;DR: Fragment-based descriptors derived from frequent subgraph mining yield validated kNN QSAR models and afford mechanistic interpretation of those models in terms of the essential chemical fragments responsible for the compounds' target property.
Abstract: We present a novel approach to generating fragment-based molecular descriptors. The molecules are represented by labeled undirected chemical graphs. Fast Frequent Subgraph Mining (FFSM) is used to find chemical fragments (subgraphs) that occur in at least a subset of all molecules in a dataset. The collection of frequent subgraphs (FSG) forms a set of dataset-specific descriptors whose values for each molecule are defined by the number of times each frequent fragment occurs in this molecule. We have employed the FSG descriptors to develop variable selection k Nearest Neighbor (kNN) QSAR models of several datasets with binary target property including Maximum Recommended Therapeutic Dose (MRTD), Salmonella Mutagenicity (Ames Genotoxicity), and P-Glycoprotein (PGP) data. Each dataset was divided into training, test, and validation sets to establish the statistical figures of merit reflecting the models' validated predictive power. The classification accuracies of models for both training and test sets for all datasets exceeded 75 %, and the accuracy for the external validation sets exceeded 72 %. The model accuracies were comparable to or better than those reported earlier in the literature for the same datasets. Furthermore, the use of fragment-based descriptors affords mechanistic interpretation of validated QSAR models in terms of essential chemical fragments responsible for the compounds' target property.
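
The descriptor construction described in this abstract (keep fragments that occur in enough molecules, then count occurrences per molecule) can be sketched in simplified form. The sketch does not perform subgraph mining: it assumes each molecule has already been reduced to a list of fragment labels, which stands in for the FFSM step. The function name `fsg_descriptors`, the `min_support` parameter, and the fragment labels in the usage note are hypothetical.

```python
from collections import Counter

def fsg_descriptors(mol_fragments, min_support):
    """Build a count-based descriptor matrix from pre-enumerated fragments.

    mol_fragments: one list of fragment labels per molecule (assumed input;
                   a real pipeline would obtain these by subgraph mining).
    min_support:   minimum number of molecules a fragment must occur in.
    """
    per_molecule = [Counter(frags) for frags in mol_fragments]
    # Support = number of molecules containing the fragment at least once.
    support = Counter()
    for counts in per_molecule:
        support.update(counts.keys())
    frequent = sorted(f for f, s in support.items() if s >= min_support)
    # Descriptor value = number of times the fragment occurs in the molecule.
    matrix = [[counts[f] for f in frequent] for counts in per_molecule]
    return frequent, matrix
```

For instance, calling it on `[["C=O", "C=O", "c1ccccc1", "Cl"], ["C=O", "N"], ["c1ccccc1", "N", "N"]]` with `min_support=2` drops `"Cl"`, which occurs in only one molecule, and returns one row of counts per molecule over the three remaining fragments.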

Journal ArticleDOI
TL;DR: The freely-accessible HTS Navigator software is developed to enable and facilitate the processing and analysis of polypharmacological HTS data and the synergistic potential of different cheminformatics approaches to detect both false-positive and false-negative compounds using neighborhood analysis and target baseline correction factors is emphasized.
Abstract: Many drugs are characterized by polypharmacological mechanisms of action. Thus, prospective drug discovery studies often start by testing large compound libraries in multiple and diverse High-Throughput Screening (HTS) assays. These large heterogeneous data collections pose numerous computational challenges concerning processing, curation, and analysis of untreated output files generated by plate readers. We have developed the freely-accessible HTS Navigator software to enable and facilitate the processing and analysis of polypharmacological HTS data. We report on the capabilities of Navigator and present several case studies where we employed cheminformatics approaches embedded within the Navigator to curate and analyze large datasets of compounds tested toward different panels of targets. Examples include libraries of compounds tested for their inhibition potencies across several CYP450; or for their inhibition of multiple protein kinases; or their binding profiles against multiple GPCRs. We show how to quickly identify and highlight compounds with unique mono- and dual-selectivity for certain targets in the curated HTS matrix. We discuss the problem of experimental variability in HTS data and its consequences for molecular modeling and emphasize the synergistic potential of different cheminformatics approaches to detect both false-positive and false-negative compounds using neighborhood analysis and target baseline correction factors. Finally, we describe the Chemical−Biological Read-Across (CBRA) approach [1], also implemented in the Navigator, to infer the activity of external compounds from both chemical (defined by chemical similarity) and biological (defined by the similarity of HTS profiles) analogues.