Journal Article

Protein and ligand preparation: parameters, protocols, and influence on virtual screening enrichments

12 Apr 2013 - Journal of Computer-Aided Molecular Design (Springer Netherlands) - Vol. 27, Iss. 3, pp. 221-234
TL;DR: It is shown that database enrichment is improved with proper preparation and that neglecting certain steps of the preparation process produces a systematic degradation in enrichments, which can be large for some targets.
Abstract: Structure-based virtual screening plays an important role in drug discovery and complements other screening approaches. In general, protein crystal structures are prepared prior to docking in order to add hydrogen atoms, optimize hydrogen bonds, remove atomic clashes, and perform other operations that are not part of the X-ray crystal structure refinement process. In addition, ligands must be prepared to create 3-dimensional geometries, assign proper bond orders, and generate accessible tautomer and ionization states prior to virtual screening. While the need for proper system preparation is generally accepted in the field, an extensive study of the preparation steps and their effect on virtual screening enrichments has not been performed. In this work, we systematically explore each of the steps involved in preparing a system for virtual screening. We first explore a large number of parameters using the Glide validation set of 36 crystal structures and 1,000 decoys. We then apply a subset of protocols to the DUD database. We show that database enrichment is improved with proper preparation and that neglecting certain steps of the preparation process produces a systematic degradation in enrichments, which can be large for some targets. We provide examples illustrating the structural changes introduced by the preparation that impact database enrichment. While the work presented here was performed with the Protein Preparation Wizard and Glide, the insights and guidance are expected to be generalizable to structure-based virtual screening with other docking methods.
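
Database enrichment of the kind discussed in the abstract is often summarized with an enrichment factor (EF): the hit rate among the top-ranked fraction of a screened database divided by the hit rate of the whole database. The abstract does not say which enrichment metric the authors used, so the following is only a minimal sketch of one common definition. The function name `enrichment_factor`, its parameters, and the convention that a lower docking score is better are all assumptions made for illustration.

```python
def enrichment_factor(scores, is_active, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked database.

    Assumptions (hypothetical, not taken from the paper):
    - `scores` are docking scores where LOWER means better.
    - `is_active` holds 1 for known actives and 0 for decoys.
    """
    if len(scores) != len(is_active) or len(scores) == 0:
        raise ValueError("scores and is_active must be non-empty and equal in length")
    total_actives = sum(is_active)
    if total_actives == 0:
        raise ValueError("at least one active is required")

    # Rank compounds from best (lowest) to worst (highest) score.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])

    # Size of the top slice; always inspect at least one compound.
    n_top = max(1, int(round(fraction * len(ranked))))
    hits_in_top = sum(active for _, active in ranked[:n_top])

    # Hit rate in the top slice relative to the hit rate of the full database.
    return (hits_in_top / n_top) / (total_actives / len(ranked))
```

With 10 compounds, 2 of them active, and both actives ranked first, the top 20% gives an EF of about 5, the maximum possible for that slice. If neither active lands in the top slice, the EF is 0, and random ranking gives an EF near 1.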
Citations
Journal Article
19 Sep 2018 - Neuron
TL;DR: The types of information molecular dynamics simulations can provide and the ways in which they typically motivate further experimental work are described.

964 citations


Cites methods from "Protein and ligand preparation: par..."

  • ...Most of the common simulation software packages include some software for system preparation, and a number of recently introduced or improved software packages simplify the preparation process (Betz, 2017; Jo et al., 2008; Sastry et al., 2013)....

Journal Article
TL;DR: The accuracy of the method, called Geometry, Frequency, Noncovalent, eXtended TB (GFN-xTB), is extensively benchmarked for various systems in comparison with existing semiempirical approaches, and the method is applied to a few representative structural problems in chemistry.
Abstract: We propose a novel, special purpose semiempirical tight binding (TB) method for the calculation of structures, vibrational frequencies, and noncovalent interactions of large molecular systems with 1000 or more atoms. The functional form of the method is related to the self-consistent density functional TB scheme and mostly avoids element-pair-specific parameters. The parametrization covers all spd-block elements and the lanthanides up to Z = 86 using reference data at the hybrid density functional theory level. Key features of the Hamiltonian are the use of partially polarized Gaussian-type orbitals, a double-ζ orbital basis for hydrogen, atomic-shell charges, diagonal third-order charge fluctuations, coordination number-dependent energy levels, a noncovalent halogen-bond potential, and the well-established D3 dispersion correction. The accuracy of the method, called Geometry, Frequency, Noncovalent, eXtended TB (GFN-xTB), is extensively benchmarked for various systems in comparison with existing semiempi...

896 citations

Journal Article
TL;DR: The principles and applications of Virtual Screening (VS) within the context of SBDD are examined, along with procedures ranging from the initial stages of the process, which include receptor and library pre-processing, to docking, scoring, and post-processing of top-scoring hits.
Abstract: Structure-based drug discovery (SBDD) is becoming an essential tool in assisting fast and cost-efficient lead discovery and optimization. The application of rational, structure-based drug design is proven to be more efficient than the traditional way of drug discovery since it aims to understand the molecular basis of a disease and utilizes the knowledge of the three-dimensional structure of the biological target in the process. In this review, we focus on the principles and applications of Virtual Screening (VS) within the context of SBDD and examine different procedures ranging from the initial stages of the process that include receptor and library pre-processing, to docking, scoring and post-processing of top-scoring hits. Recent improvements in structure-based virtual screening (SBVS) efficiency through ensemble docking, induced fit and consensus docking are also discussed. The review highlights advances in the field within the framework of several success studies that have led to nM inhibition directly from VS and provides recent trends in library design as well as discusses limitations of the method. Applications of SBVS in the design of substrates for engineered proteins that enable the discovery of new metabolic and signal transduction pathways and the design of inhibitors of multifunctional proteins are also reviewed. Finally, we contribute two promising VS protocols recently developed by us that aim to increase inhibitor selectivity. In the first protocol, we describe the discovery of micromolar inhibitors through SBVS designed to inhibit the mutant H1047R PI3Kα kinase. Second, we discuss a strategy for the identification of selective binders for the RXRα nuclear receptor. In this protocol, a set of target structures is constructed for ensemble docking based on binding site shape characterization and clustering, aiming to enhance the hit rate of selective inhibitors for the desired protein target through the SBVS process.

597 citations


Cites background from "Protein and ligand preparation: par..."

  • ...The importance of protein preparation in docking performance has been recently reported [11]....

  • ...have explored each of the steps involved in preparing a system for VS [11]....

  • ...To efficiently address the above-mentioned structural issues, several protein preparation schemes have been proposed [5, 11, 12]....

Journal Article
TL;DR: Overall, the ligand binding poses could be identified in most cases by the evaluated docking programs but the ranks of the binding affinities for the entire dataset could not be well predicted by most docking programs.
Abstract: As one of the most popular computational approaches in modern structure-based drug design, molecular docking can be used not only to identify the correct conformation of a ligand within the target binding pocket but also to estimate the strength of the interaction between a target and a ligand. Nowadays, as a variety of docking programs are available for the scientific community, a comprehensive understanding of the advantages and limitations of each docking program is fundamentally important to conduct more reasonable docking studies and docking-based virtual screening. In the present study, based on an extensive dataset of 2002 protein–ligand complexes from the PDBbind database (version 2014), the performance of ten docking programs, including five commercial programs (LigandFit, Glide, GOLD, MOE Dock, and Surflex-Dock) and five academic programs (AutoDock, AutoDock Vina, LeDock, rDock, and UCSF DOCK), was systematically evaluated by examining the accuracies of binding pose prediction (sampling power) and binding affinity estimation (scoring power). Our results showed that GOLD and LeDock had the best sampling power (GOLD: 59.8% accuracy for the top scored poses; LeDock: 80.8% accuracy for the best poses) and AutoDock Vina had the best scoring power (rp/rs of 0.564/0.580 and 0.569/0.584 for the top scored poses and best poses), suggesting that the commercial programs did not show the expected better performance than the academic ones. Overall, the ligand binding poses could be identified in most cases by the evaluated docking programs but the ranks of the binding affinities for the entire dataset could not be well predicted by most docking programs. However, for some types of protein families, relatively high linear correlations between docking scores and experimental binding affinities could be achieved. To our knowledge, this study has been the most extensive evaluation of popular molecular docking programs in the last five years. It is expected that our work can offer useful information for the successful application of these docking tools to different requirements and targets.

582 citations

Journal Article
TL;DR: The 3D structure of a newly discovered enzyme that can digest highly crystalline PET, the primary material used in the manufacture of single-use plastic beverage bottles, in some clothing, and in carpets, is characterized, and it is shown that PETase degrades another semiaromatic polyester, polyethylene-2,5-furandicarboxylate (PEF), which is an emerging, bioderived PET replacement with improved barrier properties.
Abstract: Poly(ethylene terephthalate) (PET) is one of the most abundantly produced synthetic polymers and is accumulating in the environment at a staggering rate as discarded packaging and textiles. The properties that make PET so useful also endow it with an alarming resistance to biodegradation, likely lasting centuries in the environment. Our collective reliance on PET and other plastics means that this buildup will continue unless solutions are found. Recently, a newly discovered bacterium, Ideonella sakaiensis 201-F6, was shown to exhibit the rare ability to grow on PET as a major carbon and energy source. Central to its PET biodegradation capability is a secreted PETase (PET-digesting enzyme). Here, we present a 0.92 Å resolution X-ray crystal structure of PETase, which reveals features common to both cutinases and lipases. PETase retains the ancestral α/β-hydrolase fold but exhibits a more open active-site cleft than homologous cutinases. By narrowing the binding cleft via mutation of two active-site residues to conserved amino acids in cutinases, we surprisingly observe improved PET degradation, suggesting that PETase is not fully optimized for crystalline PET degradation, despite presumably evolving in a PET-rich environment. Additionally, we show that PETase degrades another semiaromatic polyester, polyethylene-2,5-furandicarboxylate (PEF), which is an emerging, bioderived PET replacement with improved barrier properties. In contrast, PETase does not degrade aliphatic polyesters, suggesting that it is generally an aromatic polyesterase. These findings suggest that additional protein engineering to increase PETase performance is realistic and highlight the need for further developments of structure/activity relationships for biodegradation of synthetic polyesters.

545 citations


Cites background from "Protein and ligand preparation: par..."

  • ...Additional details can be found in SI Appendix (62, 63)....

References
Journal Article
TL;DR: The goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and plans for the future development of the resource are described.
Abstract: The Protein Data Bank (PDB; http://www.rcsb.org/pdb/) is the single worldwide archive of structural data of biological macromolecules. This paper describes the goals of the PDB, the systems in place for data deposition and access, how to obtain further information, and near-term plans for the future development of the resource.

34,239 citations

Journal Article
TL;DR: It is shown that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three.
Abstract: A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein–ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein–ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard...

9,322 citations

Journal Article
TL;DR: The Protein Data Bank is a computer-based archival file for macromolecular structures that stores in a uniform format atomic co-ordinates and partial bond connectivities, as derived from crystallographic studies.

7,983 citations

Journal Article
TL;DR: Glide approximates a complete systematic search of the conformational, orientational, and positional space of the docked ligand to find the best docked pose using a model energy function that combines empirical and force-field-based terms.
Abstract: Unlike other methods for docking ligands to the rigid 3D structure of a known protein receptor, Glide approximates a complete systematic search of the conformational, orientational, and positional space of the docked ligand. In this search, an initial rough positioning and scoring phase that dramatically narrows the search space is followed by torsionally flexible energy optimization on an OPLS-AA nonbonded potential grid for a few hundred surviving candidate poses. The very best candidates are further refined via a Monte Carlo sampling of pose conformation; in some cases, this is crucial to obtaining an accurate docked pose. Selection of the best docked pose uses a model energy function that combines empirical and force-field-based terms. Docking accuracy is assessed by redocking ligands from 282 cocrystallized PDB complexes starting from conformationally optimized ligand geometries that bear no memory of the correctly docked pose. Errors in geometry for the top-ranked pose are less than 1 Å in nearly ha...

6,828 citations

Journal Article
TL;DR: GOLD (Genetic Optimisation for Ligand Docking) is an automated ligand docking program that uses a genetic algorithm to explore the full range of ligand conformational flexibility with partial flexibility of the protein, and satisfies the fundamental requirement that the ligand must displace loosely bound water on binding.

5,882 citations
