Author

Xiang Simon Wang

Bio: Xiang Simon Wang is an academic researcher from Howard University. The author has contributed to research in topics: Virtual screening & Quantitative structure–activity relationship. The author has an h-index of 16 and has co-authored 34 publications receiving 643 citations. Previous affiliations of Xiang Simon Wang include University of Washington & University of North Carolina at Chapel Hill.

Papers
Journal ArticleDOI
TL;DR: This study illustrates the power of the combined QSAR-VS method as a general approach for the effective identification of structurally novel bioactive compounds.
Abstract: Inhibitors of histone deacetylases (HDACIs) have emerged as a new class of drugs for the treatment of human cancers and other diseases because of their effects on cell growth, differentiation, and apoptosis. In this study we have developed several quantitative structure−activity relationship (QSAR) models for 59 chemically diverse histone deacetylase class 1 (HDAC1) inhibitors. The variable selection k nearest neighbor (kNN) and support vector machines (SVM) QSAR modeling approaches using both MolconnZ and MOE chemical descriptors generated from two-dimensional rendering of compounds as chemical graphs have been employed. We have relied on a rigorous model development workflow including the division of the data set into training, test, and external sets and extensive internal and external validation. Highly predictive QSAR models were generated with leave-one-out cross-validated (LOO-CV) q2 and external R2 values as high as 0.80 and 0.87, respectively, using the kNN/MolconnZ approach and 0.93 and 0.87, re...
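The leave-one-out cross-validated q2 quoted in this abstract can be illustrated with a minimal sketch. It assumes the common definition q2 = 1 - PRESS / SS_tot and a plain kNN regressor that averages the activities of the k nearest compounds in Euclidean descriptor space. The function names and toy data are hypothetical, and the variable selection step of the published kNN method is not reproduced here.

```python
import math


def knn_predict(train_X, train_y, query, k=3):
    """Average the activities of the k training compounds nearest to query."""
    ranked = sorted((math.dist(x, query), y) for x, y in zip(train_X, train_y))
    nearest = ranked[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_q2(X, y, k=3):
    """Leave-one-out q2 = 1 - PRESS / SS_tot (assumed definition)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # Hold out compound i and predict it from all remaining compounds.
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        pred = knn_predict(X_train, y_train, X[i], k)
        press += (y[i] - pred) ** 2
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_tot


# Toy one-descriptor data set (hypothetical values, not from the study).
X = [[0.0], [1.0], [2.0], [3.0], [4.0]]
y = [0.0, 1.0, 2.0, 3.0, 4.0]
print(loo_q2(X, y, k=2))
```

In this toy set only the two end points are mispredicted, because their nearest neighbors all lie on one side. A q2 computed this way measures internal consistency only, which is why the abstract reports an external R2 alongside it.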

98 citations

Journal ArticleDOI
TL;DR: A three-dimensional homology molecular model of the mouse MC4 receptor complex with the hAGRP(87-132) ligand docked into the receptor has been developed to identify putative antagonist ligand-receptor interactions. The study also identified a novel subnanomolar melanocortin peptide template Tyr-c[Asp-His-DPhe-Arg-Trp-Asn-Ala-Phe-Dpr]-Tyr-NH(2).
Abstract: Agouti-related protein (AGRP) is one of only two naturally known antagonists of G-protein-coupled receptors (GPCRs) identified to date. Specifically, AGRP antagonizes the brain melanocortin-3 and -4 receptors involved in energy homeostasis. Alpha-melanocyte stimulating hormone (alpha-MSH) is one of the known endogenous agonists for these melanocortin receptors. Insight into putative interactions between the antagonist AGRP amino acids with the melanocortin-4 receptor (MC4R) may be important for the design of unique ligands for the treatment of obesity related diseases and is currently lacking in the literature. A three-dimensional homology molecular model of the mouse MC4 receptor complex with the hAGRP(87-132) ligand docked into the receptor has been developed to identify putative antagonist ligand-receptor interactions. Key putative AGRP-MC4R interactions include the Arg111 of hAGRP(87-132) interacting in a negatively charged pocket located in a cavity formed by transmembrane spanning (TM) helices 1, 2, 3, and 7, capped by the acidic first extracellular loop (EL1) and specifically with the conserved melanocortin receptor residues mMC4R Glu92 (TM2), mMC4R Asp114 (TM3), and mMC4R Asp118 (TM3). Additionally, Phe112 and Phe113 of hAGRP(87-132) putatively interact with an aromatic hydrophobic pocket formed by the mMC4 receptor residues Phe176 (TM4), Phe193 (TM5), Phe253 (TM6), and Phe254 (TM6). To validate the AGRP-mMC4R model complex presented herein from a ligand perspective, we generated nine chimeric peptide ligands based on a modified antagonist template of the hAGRP(109-118) (Tyr-c[Asp-Arg-Phe-Phe-Asn-Ala-Phe-Dpr]-Tyr-NH(2)). In these chimeric ligands, the antagonist AGRP Arg-Phe-Phe residues were replaced by the melanocortin agonist His/D-Phe-Arg-Trp amino acids. These peptides resulted in agonist activity at the mouse melanocortin receptors (mMC1R and mMC3-5Rs). 
The most notable results include the identification of a novel subnanomolar melanocortin peptide template Tyr-c[Asp-His-DPhe-Arg-Trp-Asn-Ala-Phe-Dpr]-Tyr-NH(2) that is equipotent to alpha-MSH at the mMC1, mMC3, and mMC5 receptors but is 30-fold more potent than alpha-MSH at the mMC4R. Additionally, these studies identified a new and novel >200-fold MC4R versus MC3R selective peptide Tyr-c[Asp-D-Phe-Arg-Trp-Asn-Ala-Phe-Dpr]-Tyr-NH(2) template. Furthermore, when the His-DPhe-Arg-Trp sequence is used to replace the hAGRP Arg-Phe-Phe residues in the "mini"-AGRP (hAGRP87-120, C105A) template, a potent nanomolar agonist resulted at the mMC1R and MC3-5Rs.

67 citations

Journal ArticleDOI
TL;DR: It is demonstrated that the use of target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD).
Abstract: Poor performance of scoring functions is a well-known bottleneck in structure-based virtual screening (VS), which is most frequently manifested in the scoring functions' inability to discriminate between true ligands and known nonbinders (therefore designated as binding decoys). This deficiency leads to a large number of false positive hits resulting from VS. We have hypothesized that filtering out or penalizing docking poses recognized as non-native (i.e., pose decoys) should improve the performance of VS in terms of improved identification of true binders. Using several concepts from the field of cheminformatics, we have developed a novel approach to identifying pose decoys from an ensemble of poses generated by computational docking procedures. We demonstrate that the use of a target-specific pose (scoring) filter in combination with a physical force field-based scoring function (MedusaScore) leads to significant improvement of hit rates in VS studies for 12 of the 13 benchmark sets from the clustered version of the Database of Useful Decoys (DUD). This new hybrid scoring function outperforms several conventional structure-based scoring functions, including XSCORE::HMSCORE, ChemScore, PLP, and Chemgauss3, in 6 out of 13 data sets at an early stage of VS (up to 1% decoys of the screening database). We compare our hybrid method with several novel VS methods that were recently reported to have good performances on the same DUD data sets. We find that the retrieved ligands using our method are chemically more diverse in comparison with two ligand-based methods (FieldScreen and FLAP::LBX). We also compare our method with FLAP::RBLB, a high-performance VS method that also utilizes both the receptor and the cognate ligand structures. Interestingly, we find that the top ligands retrieved using our method are highly complementary to those retrieved using FLAP::RBLB, hinting at effective directions for best VS applications.
We suggest that this integrative VS approach combining cheminformatics and molecular mechanics methodologies may be applied to a broad variety of protein targets to improve the outcome of structure-based drug discovery studies.
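Early-stage performance of a scoring function is often summarized with an enrichment factor. The sketch below is one common formulation and not necessarily the metric used in the paper: it compares the hit rate in the top fraction of the ranked list with the hit rate of the whole library. The function name, the cutoff handling, and the toy scores are assumptions for illustration.

```python
def enrichment_factor(scores, labels, fraction=0.01, lower_is_better=True):
    """Hit rate in the top-ranked fraction divided by the overall hit rate.

    scores: docking scores, one per compound
    labels: 1 for a true ligand, 0 for a decoy
    """
    total_hits = sum(labels)
    if total_hits == 0:
        raise ValueError("at least one true ligand is required")
    # Rank compounds so that the best-scoring ones come first.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0],
                    reverse=not lower_is_better)
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_hits / len(ranked))


# Toy library of 10 compounds (hypothetical scores), 2 of them true ligands.
scores = [-9.0, -8.0, -7.0, -6.0, -5.0, -4.0, -3.0, -2.0, -1.0, 0.0]
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(enrichment_factor(scores, labels, fraction=0.2))
```

A value of 1 corresponds to random selection, and the maximum attainable value is bounded by the ratio of library size to number of true ligands, so enrichment factors are only comparable across benchmark sets with similar ligand-to-decoy ratios.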

40 citations

Journal ArticleDOI
TL;DR: The method has greatly reduced the “artificial enrichment” and “analogue bias” of a published GPCRs benchmarking set, i.e., GPCR Ligand Library (GLL)/GPCR Decoy Database (GDD), and addressed an important issue about the ratio of decoys per ligand.
Abstract: Benchmarking data sets have become common in recent years for the purpose of virtual screening, though the main focus had been placed on the structure-based virtual screening (SBVS) approaches. Due to the lack of crystal structures, there is great need for unbiased benchmarking sets to evaluate various ligand-based virtual screening (LBVS) methods for important drug targets such as G protein-coupled receptors (GPCRs). To date these ready-to-apply data sets for LBVS are fairly limited, and the direct usage of benchmarking sets designed for SBVS could introduce biases into the evaluation of LBVS. Herein, we propose an unbiased method to build benchmarking sets for LBVS and validate it on a multitude of GPCR targets. To be more specific, our methods can (1) ensure chemical diversity of ligands, (2) maintain the physicochemical similarity between ligands and decoys, (3) make the decoys dissimilar in chemical topology to all ligands to avoid false negatives, and (4) maximize spatial random distribution of ligan...

39 citations

Journal ArticleDOI
TL;DR: This study asks whether true binders and decoys can be distinguished based only on their structural chemical descriptors, and suggests that validated QSAR models could complement structure-based docking and scoring approaches in identifying promising hits by virtual screening of molecular libraries.
Abstract: The use of inaccurate scoring functions in docking algorithms may result in the selection of compounds with high predicted binding affinity that nevertheless are known experimentally not to bind to the target receptor. Such falsely predicted binders have been termed 'binding decoys'. We posed a question as to whether true binders and decoys could be distinguished based only on their structural chemical descriptors using approaches commonly used in ligand based drug design. We have applied the k-Nearest Neighbor (kNN) classification QSAR approach to a dataset of compounds characterized as binders or binding decoys of AmpC beta-lactamase. Models were subjected to rigorous internal and external validation as part of our standard workflow and a special QSAR modeling scheme was employed that took into account the imbalanced ratio of inhibitors to non-binders (1:4) in this dataset. 342 predictive models were obtained with correct classification rate (CCR) for both training and test sets as high as 0.90 or higher. The prediction accuracy was as high as 100% (CCR = 1.00) for the external validation set composed of 10 compounds (5 true binders and 5 decoys) selected randomly from the original dataset. For an additional external set of 50 known non-binders, we have achieved the CCR of 0.87 using very conservative model applicability domain threshold. The validated binary kNN QSAR models were further employed for mining the NCGC AmpC screening dataset (69653 compounds). The consensus prediction of 64 compounds identified as screening hits in the AmpC PubChem assay disagreed with their annotation in PubChem but was in agreement with the results of secondary assays. At the same time, 15 compounds were identified as potential binders contrary to their annotation in PubChem. Five of them were tested experimentally and showed inhibitory activities in millimolar range with the highest binding constant K(i) of 135 microM. 
Our studies suggest that validated QSAR models could complement structure based docking and scoring approaches in identifying promising hits by virtual screening of molecular libraries.
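The correct classification rate (CCR) reported in this abstract is typically computed as the mean of sensitivity and specificity, which keeps the majority class from dominating the score in an imbalanced set such as the 1:4 binder to non-binder ratio mentioned above. The sketch below assumes that definition; the function name and the toy predictions are hypothetical.

```python
def correct_classification_rate(y_true, y_pred):
    """CCR = 0.5 * (sensitivity + specificity), assumed definition.

    y_true, y_pred: sequences of 1 (binder) and 0 (non-binder / decoy)
    """
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = sum(1 for t in y_true if t == 0)
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    true_pos = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    true_neg = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    sensitivity = true_pos / n_pos
    specificity = true_neg / n_neg
    return 0.5 * (sensitivity + specificity)


# Toy imbalanced set with a 1:4 ratio: 2 binders and 8 non-binders.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 1]
print(correct_classification_rate(y_true, y_pred))
```

Plain accuracy on this toy set would be 0.8 even though half of the binders are missed, whereas the CCR of 0.6875 reflects the weak sensitivity.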

37 citations


Cited by
28 Jul 2005
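TL;DR: PfPMP1 interacts with one or more receptors on infected erythrocytes, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion.
Abstract: Antigenic variation allows many pathogenic microorganisms to evade host immune responses. Plasmodium falciparum erythrocyte surface protein 1 (PfPMP1), expressed on the surface of infected erythrocytes, interacts with one or more receptors on infected erythrocytes, endothelial cells, dendritic cells, and the placenta, and plays a key role in adhesion and immune evasion. The var gene family encodes about 60 members per haploid genome, and switching transcription among different var gene variants provides the molecular basis for antigenic variation.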

18,940 citations

Journal ArticleDOI
TL;DR: Most critical QSAR modeling routines that are regarded as best practices in the field are examined, including procedures used to validate models, both internally and externally, as well as the need to define model applicability domains that should be used when models are employed for the prediction of external compounds or compound libraries.
Abstract: After nearly five decades "in the making", QSAR modeling has established itself as one of the major computational molecular modeling methodologies. As any mature research discipline, QSAR modeling can be characterized by a collection of well defined protocols and procedures that enable the expert application of the method for exploring and exploiting ever growing collections of biologically active chemical compounds. This review examines most critical QSAR modeling routines that we regard as best practices in the field. We discuss these procedures in the context of integrative predictive QSAR modeling workflow that is focused on achieving models of the highest statistical rigor and external predictive power. Specific elements of the workflow consist of data preparation including chemical structure (and when possible, associated biological data) curation, outlier detection, dataset balancing, and model validation. We especially emphasize procedures used to validate models, both internally and externally, as well as the need to define model applicability domains that should be used when models are employed for the prediction of external compounds or compound libraries. Finally, we present several examples of successful applications of QSAR models for virtual screening to identify experimentally confirmed hits.

1,362 citations

Journal ArticleDOI
TL;DR: This Review discusses the mechanisms by which FT and GGT1 inhibitors (FTIs and GGTIs, respectively) affect signal transduction pathways, cell cycle progression, proliferation and cell survival, and strategies to overcome the challenge of identifying tumours that depend on farnesylation for survival.
Abstract: Protein farnesylation and geranylgeranylation, together referred to as prenylation, are lipid post-translational modifications that are required for the transforming activity of many oncogenic proteins, including some RAS family members. This observation prompted the development of inhibitors of farnesyltransferase (FT) and geranylgeranyl-transferase 1 (GGT1) as potential anticancer drugs. In this Review, we discuss the mechanisms by which FT and GGT1 inhibitors (FTIs and GGTIs, respectively) affect signal transduction pathways, cell cycle progression, proliferation and cell survival. In contrast to their preclinical efficacy, only a small subset of patients responds to FTIs. Identifying tumours that depend on farnesylation for survival remains a challenge, and strategies to overcome this are discussed. One GGTI has recently entered the clinic, and the safety and efficacy of GGTIs await results from clinical trials.

517 citations

Posted Content
TL;DR: AtomNet is introduced, the first structure-based, deep convolutional neural network designed to predict the bioactivity of small molecules for drug discovery applications and it is shown that AtomNet outperforms previous docking approaches on a diverse set of benchmarks by a large margin.
Abstract: Deep convolutional neural networks comprise a subclass of deep neural networks (DNN) with a constrained architecture that leverages the spatial and temporal structure of the domain they model. Convolutional networks achieve the best predictive performance in areas such as speech and image recognition by hierarchically composing simple local features into complex models. Although DNNs have been used in drug discovery for QSAR and ligand-based bioactivity predictions, none of these models have benefited from this powerful convolutional architecture. This paper introduces AtomNet, the first structure-based, deep convolutional neural network designed to predict the bioactivity of small molecules for drug discovery applications. We demonstrate how to apply the convolutional concepts of feature locality and hierarchical composition to the modeling of bioactivity and chemical interactions. In further contrast to existing DNN techniques, we show that AtomNet's application of local convolutional filters to structural target information successfully predicts new active molecules for targets with no previously known modulators. Finally, we show that AtomNet outperforms previous docking approaches on a diverse set of benchmarks by a large margin, achieving an AUC greater than 0.9 on 57.8% of the targets in the DUDE benchmark.

452 citations

Journal ArticleDOI
TL;DR: This Perspective summarizes recent technological advances in QSAR modeling, but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics.
Abstract: Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure–activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.

383 citations