
Showing papers by "Alexander Tropsha published in 1999"


Journal ArticleDOI
TL;DR: The success of all of the QSAR methods indicates the presence of an intrinsic structure-activity relationship in this group of compounds and affords more robust design and prediction of biological activities of novel D(1) ligands.
Abstract: Several quantitative structure-activity relationship (QSAR) methods were applied to 29 chemically diverse D(1) dopamine antagonists. In addition to conventional 3D comparative molecular field analysis (CoMFA), cross-validated R(2) guided region selection (q(2)-GRS) CoMFA (see ref 1) was employed, as were two novel variable selection QSAR methods recently developed in one of our laboratories. These latter methods included genetic algorithm-partial least squares (GA-PLS) and K nearest neighbor (KNN) procedures (see refs 2-4), which utilize 2D topological descriptors of chemical structures. Each QSAR approach resulted in a highly predictive model, with cross-validated R(2) (q(2)) values of 0.57 for CoMFA, 0.54 for q(2)-GRS, 0.73 for GA-PLS, and 0.79 for KNN. The success of all of the QSAR methods indicates the presence of an intrinsic structure-activity relationship in this group of compounds and affords more robust design and prediction of biological activities of novel D(1) ligands.

85 citations
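
The cross-validated R(2) (q(2)) values quoted in the abstract above follow the usual definition q(2) = 1 - PRESS / SS, where PRESS is the sum of squared errors of held-out predictions and SS is the total sum of squares of the observed activities. The sketch below is only an illustration of that statistic, assuming leave-one-out cross-validation and a plain k-nearest-neighbor predictor over all descriptors. It does not reproduce the variable selection used in the paper's KNN or GA-PLS procedures, and the names `knn_predict` and `loo_q2` are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Predict activity as the mean activity of the k nearest training compounds."""
    # Euclidean distance in descriptor space, paired with each compound's activity.
    dists = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    nearest = dists[:k]
    return sum(y for _, y in nearest) / len(nearest)


def loo_q2(xs, ys, k=3):
    """Leave-one-out cross-validated R^2: q2 = 1 - PRESS / SS_total."""
    mean_y = sum(ys) / len(ys)
    press = 0.0
    for i in range(len(xs)):
        # Hold out compound i and predict it from all remaining compounds.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        pred = knn_predict(train_x, train_y, xs[i], k)
        press += (ys[i] - pred) ** 2
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_total


if __name__ == "__main__":
    # Toy one-descriptor data set (made up for illustration, not from the paper).
    descriptors = [(0.0,), (1.0,), (2.0,), (3.0,), (4.0,)]
    activities = [0.0, 1.0, 2.0, 3.0, 4.0]
    print(loo_q2(descriptors, activities, k=2))
```

A q(2) near 1 means held-out activities are predicted almost as well as fitted ones. A value at or below 0 means the model predicts no better than the mean activity.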


Journal ArticleDOI
TL;DR: This work developed a novel method for molecular diversity sampling called SAGE (simulated annealing guided evaluation of molecular diversity), and showed that when the percentage of active points was low, the hit rates obtained by SAGE were always higher than those obtained by random sampling.
Abstract: We have developed a novel method for molecular diversity sampling called SAGE (simulated annealing guided evaluation of molecular diversity). Compounds in chemical databases or virtual combinatorial libraries are conventionally represented as points in multidimensional descriptor space. The SAGE algorithm selects a desired number of optimally diverse points (compounds) from a database. The diversity of a subset of points is measured by a specially designed diversity function, and the most diverse subset is selected using Simulated Annealing (SA) as the optimization tool. Application of SAGE to two simulated data sets of randomly distributed points in two-dimensional space afforded diverse and representative selection as judged by visual inspection. SAGE was also applied, in comparison with random sampling, to two other simulated data sets with points distributed among many clusters. We found that SAGE sampling covered significantly more clusters than the random sampling. By defining a fraction of data poi...

53 citations
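
The selection procedure described in the abstract above (pick a fixed number of points, score the subset with a diversity function, optimize with simulated annealing) can be sketched as follows. This is a minimal illustration, not the SAGE implementation: the abstract does not give the "specially designed diversity function", so the smallest pairwise distance within the subset is used as a stand-in, and the geometric cooling schedule, swap move, and all function and parameter names are assumptions.

```python
import math
import random


def diversity(points, subset):
    """Stand-in diversity score: smallest pairwise distance within the subset."""
    return min(
        math.dist(points[a], points[b])
        for i, a in enumerate(subset)
        for b in subset[i + 1:]
    )


def anneal_diverse_subset(points, m, steps=5000, t_start=1.0, t_end=1e-3, seed=0):
    """Select m of the points (2 <= m < len(points)) by simulated annealing."""
    rng = random.Random(seed)
    current = rng.sample(range(len(points)), m)
    current_score = diversity(points, current)
    best, best_score = list(current), current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / (steps - 1))
        # Move: swap one selected point for one currently unselected point.
        candidate = list(current)
        outside = [i for i in range(len(points)) if i not in current]
        candidate[rng.randrange(m)] = rng.choice(outside)
        cand_score = diversity(points, candidate)
        delta = cand_score - current_score
        # Metropolis criterion: always accept improvements, sometimes accept losses.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current, current_score = candidate, cand_score
            if current_score > best_score:
                best, best_score = list(current), current_score
    return sorted(best), best_score


if __name__ == "__main__":
    # Random points in two-dimensional descriptor space (synthetic data).
    gen = random.Random(1)
    pts = [(gen.random(), gen.random()) for _ in range(200)]
    chosen, score = anneal_diverse_subset(pts, m=10)
    print(chosen, round(score, 3))
```

Tracking the best subset seen, rather than returning the final state, keeps the result from degrading if the last accepted move was a downhill one.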


Journal ArticleDOI
TL;DR: The synthesis and biological evaluation of additional PAT analogues are reported in order to identify differences in binding at these two sites, and a revision of the previous comparative molecular field analysis study of the PAT ligands is reported that yields a highly predictive model for 66 compounds with a cross-validated R(2) (q(2)) value of 0.67.
Abstract: A series of 1-phenyl-3-amino-1,2,3,4-tetrahydronaphthalenes (1-phenyl-3-aminotetralins, PATs) previously was found to modulate tyrosine hydroxylase activity and dopamine synthesis in rodent forebra...

40 citations


Journal ArticleDOI
01 Sep 1999-Proteins
TL;DR: A comparison of two models of the random‐coil state based on statistical distributions from the structural database and the molecular dynamics simulations indicates that the database distributions are greatly influenced by long‐range interactions and dominated by specific recognizable elements of protein structure.
Abstract: This study presents a comparison of two models of the random-coil state, one based on statistical distributions from the structural database and the other based on molecular dynamics simulations. The database model relies on the assumption that the random- or statistical-coil state of a particular residue can be described by its conformational distribution in a sufficiently diverse subset of protein structures. The molecular dynamics model is based on distributions from molecular simulations carried out on “dipeptide” models (single residues with N-terminal acetyl and C-terminal N′-methyl amide blocking groups). A comparison of the two models for the residues Ala, Asn, Asp, Gly, and Val indicates that the database distributions are greatly influenced by long-range interactions and dominated by specific recognizable elements of protein structure. In contrast, the limited structural scope of the dipeptide models presents the extreme case of a peptide under the influence of only short-range interactions. The models were evaluated by comparing scalar coupling constants calculated from the conformational distributions with experimentally determined values for unstructured peptides. Although the models gave different distributions, they agreed similarly well with experiment. This comparison emphasizes the differences and limitations in each model and highlights the difficulty in presenting an accurate picture of the random-coil state. Proteins 1999;36:407–418. © 1999 Wiley-Liss, Inc.

27 citations
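
The abstract above evaluates both random-coil models by calculating scalar coupling constants from conformational distributions. A common way to do this is to average a Karplus-type relation over the sampled backbone dihedral angles. The sketch below assumes the coupling in question is the three-bond HN-Hα coupling as a function of φ, and its default coefficients are one published parametrization chosen purely for illustration. The abstract states neither which couplings nor which coefficients were used, so treat both, along with the function names, as assumptions.

```python
import math


def karplus_j(phi_deg, a=6.51, b=-1.76, c=1.60):
    """Karplus relation J(phi) = a*cos^2(theta) + b*cos(theta) + c, theta = phi - 60 deg.

    The default a, b, c (in Hz) are illustrative assumptions, not values from the paper.
    """
    theta = math.radians(phi_deg - 60.0)
    return a * math.cos(theta) ** 2 + b * math.cos(theta) + c


def ensemble_average_j(phi_samples_deg, **coeffs):
    """Average the coupling over a sampled phi distribution (database or MD)."""
    values = [karplus_j(phi, **coeffs) for phi in phi_samples_deg]
    return sum(values) / len(values)


if __name__ == "__main__":
    # Hypothetical phi samples in degrees; real input would come from either model.
    samples = [-60.0, -75.0, -120.0, -150.0, -65.0]
    print(round(ensemble_average_j(samples), 2))
```

Because the average runs over the whole distribution, two quite different distributions can yield similar mean couplings, which is consistent with the abstract's observation that both models agreed similarly well with experiment.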



Journal ArticleDOI
TL;DR: A generalized linear response (GLR) method was developed and applied to hydration free energy calculations in this article, which is more general than the existing linear response methods for free energy calculation because it applies the linear response approximation to electrostatic and van der Waals interactions.
Abstract: A generalized linear response (GLR) method was developed and applied to hydration free energy calculations. According to this method, the atomic hydration can be described as a two‐step process. In the first step a point particle is introduced into water, which, according to the scaled particle theory, creates a cavity with the size of a water molecule. The free energy change of this step for the simple point charge (SPC) water model can be calculated as $1.49\,k_B T$. In the second step the introduced point particle is transformed into a solute atom. The free energy change of this step can be calculated by the linear response approximation, which is applied to van der Waals and electrostatic interactions, as $\langle V_{Ha} \rangle_{0.5}$. Here $V_{Ha}$ is the solute–water interaction function, and $\langle \cdots \rangle_{0.5}$ denotes the ensemble average at the midpoint of the thermodynamic path between the point particle state and the hydration state. The GLR method was tested by the calculation of hydration free energies of several neutral organic compounds. The results of the calculation were in close agreement with the experiment and were also comparable with those obtained by the conventional free energy simulation method; the computational cost was decreased by about one order of magnitude. The GLR approach is more general than the existing linear response methods for free energy calculations because it applies the linear response approximation to electrostatic and van der Waals interactions and does not incorporate any empirically determined parameters. ©1999 John Wiley & Sons, Inc. J Comput Chem 20: 749–759, 1999

4 citations
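
Read literally, the two-step process in the abstract above suggests a hydration free energy estimate that is the sum of the cavity term (1.49 kBT for SPC water) and the midpoint ensemble average of the solute-water interaction energy. The sketch below only does that arithmetic, assuming the two contributions are simply added and that the midpoint interaction energies have already been sampled by a separate simulation. The function name, units (kcal/mol), and default temperature are assumptions.

```python
# Boltzmann constant expressed per mole (the gas constant), in kcal/(mol K).
K_B_KCAL_PER_MOL_K = 0.0019872041


def glr_hydration_free_energy(v_ha_midpoint_samples, temperature_k=298.15):
    """Estimate a hydration free energy (kcal/mol) as 1.49*kB*T + <V_Ha>_0.5.

    v_ha_midpoint_samples: solute-water interaction energies (kcal/mol) sampled at
    the midpoint of the path between the point particle and the hydrated solute.
    """
    kbt = K_B_KCAL_PER_MOL_K * temperature_k
    # Step 1: cavity creation by the point particle (value quoted for SPC water).
    cavity_term = 1.49 * kbt
    # Step 2: linear response term, the ensemble average at the path midpoint.
    mean_v_ha = sum(v_ha_midpoint_samples) / len(v_ha_midpoint_samples)
    return cavity_term + mean_v_ha


if __name__ == "__main__":
    # Hypothetical midpoint samples; real values would come from a simulation.
    samples = [-3.1, -2.8, -3.4, -2.9]
    print(round(glr_hydration_free_energy(samples), 3))
```

Only one ensemble (the midpoint state) has to be simulated in this scheme, which is in line with the roughly tenfold cost reduction the abstract reports relative to conventional free energy simulations.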


Proceedings ArticleDOI
01 Dec 1999
TL;DR: Both computational chemists and computational biologists address this challenge by developing fast and accurate methods of biomolecular database analysis to enhance their ability to discover or design lead molecules of pharmaceutical significance.
Abstract: We live in the time of "-ics" sciences. Genomics, proteomics, genetics, bioinformatics, and cheminformatics are a few examples of recent terms that refer to relatively new and rapidly growing areas of both academic and industrial research. This growth implies an unprecedented accumulation of biomolecular information stored in ever-growing databases. These include both gene databases and various databases of organic molecules, which contain millions of individual molecular entries. All this molecular information has to be stored, manipulated, understood, and used in a rational way in designing new drugs. Computational analysis of molecular diversity and similarity, database mining, and combinatorial library design have become some of the most vital areas of biocomputing. It is well known that many genes and their products have an interesting yet unidentified function, and the analysis of sequence-structure-function relationships has been a traditional area of bioinformatics. Perhaps it is less obvious, but experimental medicinal chemists face, in many respects, the same type of problems as experimental biologists. Practically every chemical has a certain, frequently unknown, biological effect (cf. gene sequences with unknown function). There is a huge array of chemical molecules (cf. millions of genes), many of which could potentially be drugs, but their specificity against a particular biological target is yet to be determined (as, in many cases, is the target itself). The computational aspects of macromolecular vs. chemical database analysis appear strikingly common yet complementary. Indeed, the exact challenge is that of matchmaking: how to find the right drug for the right target? Both computational chemists and computational biologists address this challenge by developing fast and accurate methods of biomolecular database analysis to enhance our ability to discover or design lead molecules of pharmaceutical significance.
These approaches rely on our understanding of three-dimensional structure of both organic and biological macromolecules and the development of rigorous quantitative models that explain experimental structure-activity (for organic molecules) and sequence-structure

1 citation