
Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function


Garrett M. Morris¹, David S. Goodsell¹, Robert Scott Halliday², Ruth Huey¹, William E. Hart³, Richard K. Belew⁴, Arthur J. Olson¹

Scripps Research Institute¹, Hewlett-Packard², Sandia National Laboratories³, University of California, San Diego⁴

15 Nov 1998-Journal of Computational Chemistry (Wiley)-Vol. 19, Iss: 14, pp 1639-1662

TL;DR: It is shown that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three.


Abstract: A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein-ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein-ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard error of 9.11 kJ mol⁻¹ (2.177 kcal mol⁻¹) and was chosen as the new energy function.


Summary (1 min read)


Introduction

Automated docking is widely used for the prediction of biomolecular complexes in structure/function analysis and in molecular design.
Dozens of effective methods are available, incorporating different trade-offs in molecular representation, energy evaluation, and conformational sampling to provide predictions with a reasonable computational effort.
In their hands, AutoDock3 has proven to be effective in roughly half of the complexes that the authors have studied.
The remaining half show significant motion of the receptor upon binding, and thus have required a more sophisticated model of motion in the receptor, typically performed outside of AutoDock3.
This capability also provides an effective method for analysis of covalently attached ligands.

Results and Discussion

The authors' first test of AutoDock4 is a redocking experiment using a set of 188 diverse protein-ligand complexes.
In 100 of 188 complexes, the docked conformation with lowest energy was within 3.5 Å RMSD of the crystallographic conformation.
(C) Cross docking with ARG8 treated as flexible in the protease.
Roughly 2/3 of the small inhibitors were docked successfully, and the mid-size ones were very successful.
The block at lower right shows docking of cyclic urea inhibitors with protease structures without the structural water.

Conclusions

Dependence on grid-based energy evaluation is a major limitation of AutoDock4.
It is required to allow rapid evaluation of binding energies during the docking simulation, but it places a severe restriction on the representation of the target macromolecule: all of the atoms included in the grid must be treated as rigid.
The off-grid modeling of specific sidechains is a method for incorporating limited flexibility within this paradigm, and the results presented here show that it will be effective in some cases.
Adding flexibility presents several problems: (1) the calculation of the receptor energy is more computationally intensive since flexible regions must be evaluated by a full pairwise energy evaluation, and (2) the conformational space is larger, and hence, there is more potential for false positives.


Figures (13)

FIGURE 1. This figure illustrates genotypic and phenotypic search, and contrasts Darwinian and Lamarckian search [27]. The space of the genotypes is represented by the lower horizontal line, and the space of the phenotypes is represented by the upper horizontal line. Genotypes are mapped to phenotypes by a developmental mapping function. The fitness function is f(x). The result of applying the genotypic mutation operator to the parent's genotype is shown on the right-hand side of the diagram, and has the corresponding phenotype shown. Local search is shown on the left-hand side. It is normally performed in phenotypic space and employs information about the fitness landscape. Sufficient iterations of the local search arrive at a local minimum, and an inverse mapping function is used to convert from its phenotype to its corresponding genotype. In the case of molecular docking, however, local search is performed by continuously converting from the genotype to the phenotype, so inverse mapping is not required. The genotype of the parent is replaced by the resulting genotype, however, in accordance with Lamarckian principles.

Figure 3. Performance of the new force field. These graphs are histograms showing the number of complexes with a given error in the predicted free energy of binding. Values of zero correspond to complexes that are perfectly predicted, positive values are cases where the predicted energy is too favorable (too negative). The random curve was generated by using a random number between 0 and 1 for values of the h-bond, dispersion/repulsion, electrostatic, and desolvation energies for complexes in the calibration set, and then deriving parameters based on these randomized energies. (A) Results for the calibration set. (B) Results for the HIV protease test set.

FIGURE 4. A comparison of the lowest energy structure found by each search method and the crystal structure. The latter is shown in black. The simulated annealing results are rendered with a striped texture, the genetic algorithm results are shaded gray, and the Lamarckian genetic algorithm results are white. Oxygen atoms are shown as spheres; other heteroatoms are not shown. Note that simulated annealing failed in the last two test cases, 4hmg and 4dfr, but both the genetic algorithm and the Lamarckian genetic algorithm succeeded.

Figure 5. Performance of the new force field in redocking experiments. Each point in the graph represents one protein–ligand complex in the calibration set. Open circles are cases where the conformation of best predicted energy is within 2.5 Å of the crystallographic conformation. Dots are cases where AutoDock finds a conformation within 2.5 Å of the crystallographic conformation, but it is not the best energy. Complexes that were not successfully redocked are shown with an X. In each case, 50 docking simulations were performed and results that were within 2.0 Å of each other were clustered. The vertical axis shows the number of clusters found for each complex (ideally, if AutoDock was able to find the global minimum structure, we would see one cluster). The horizontal axis shows the number of torsional degrees of freedom in the ligand. Note that AutoDock fails with ligands with greater than 15–20 rotatable bonds.

Table 5. Parameters Calibrated with the Reduced set of 155 Complexes.

Table 3. Parameters for the Force Field.


AutoDock-related material

1. Morris, G. M., Goodsell, D. S., Halliday, R. S., Huey, R., Hart, W. E., Belew, R. K. and Olson, A. J. Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function. J. Comput. Chem., 1998, 19, 1639-1662.

2. Huey, R., Morris, G. M., Olson, A. J. and Goodsell, D. S. A Semiempirical Free Energy Force Field with Charge-Based Desolvation. J. Comput. Chem., 2007, 28, 1145-1152.

3. Morris, G. M., Huey, R., Lindstrom, W., Sanner, M. F., Belew, R. K., Goodsell, D. S. and Olson, A. J. AutoDock4 and AutoDockTools4: Automated Docking with Selective Receptor Flexibility. J. Comput. Chem., 2009, 30, 2785-2791.


Automated Docking Using a Lamarckian Genetic Algorithm and an Empirical Binding Free Energy Function

GARRETT M. MORRIS, DAVID S. GOODSELL, ROBERT S. HALLIDAY, RUTH HUEY, WILLIAM E. HART, RICHARD K. BELEW, ARTHUR J. OLSON

Department of Molecular Biology, MB-5, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, California 92037-1000

Hewlett-Packard, San Diego, California

Applied Mathematics Department, Sandia National Laboratories, Albuquerque, NM

Department of Computer Science & Engineering, University of California, San Diego, La Jolla, CA

Received February 1998; accepted 24 June 1998

ABSTRACT: A novel and robust automated docking method that predicts the bound conformations of flexible ligands to macromolecular targets has been developed and tested, in combination with a new scoring function that estimates the free energy change upon binding. Interestingly, this method applies a Lamarckian model of genetics, in which environmental adaptations of an individual's phenotype are reverse transcribed into its genotype and become heritable traits (sic). We consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein-ligand test systems having known three-dimensional structure. We show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three. The empirical free energy function was calibrated using a set of 30 structurally known protein-ligand complexes with experimentally determined binding constants. Linear regression analysis of the observed binding constants in terms of a wide variety of structure-derived molecular properties was performed. The final model had a residual standard error of 9.11 kJ mol⁻¹ (2.177 kcal mol⁻¹) and was chosen as the new energy function. The new search methods and empirical free energy function are available in AUTODOCK, version 3.0. © 1998 John Wiley & Sons, Inc. J Comput Chem 19: 1639-1662, 1998

Correspondence to: A. J. Olson; e-mail: olson@scripps.edu

Contract/grant sponsor: National Institutes of Health; contract/grant numbers: GM48870, RR08065

Journal of Computational Chemistry, Vol. 19, No. 14, 1639-1662 (1998)

© 1998 John Wiley & Sons, Inc. CCC 0192-8651/98/141639-24

Keywords: automated docking; binding affinity; drug design; genetic algorithm; flexible small molecule protein interaction

Introduction

A fast atom-based computational docking tool is essential to most techniques for structure-based drug design [1, 2]. Reported techniques for automated docking fall into two broad categories: matching methods and docking simulation methods. Matching methods create a model of the active site, typically including sites of hydrogen bonding and sites that are sterically accessible, and then attempt to dock a given inhibitor structure into the model as a rigid body by matching its geometry to that of the active site. The most successful example of this approach is DOCK [4, 5], which is efficient enough to screen entire chemical databases rapidly for lead compounds. The second class of docking techniques models the docking of a ligand to a target in greater detail: the ligand begins randomly outside the protein, and explores translations, orientations, and conformations until an ideal site is found. These techniques are typically slower than the matching techniques, but they allow flexibility within the ligand to be modeled and can utilize more detailed molecular mechanics to calculate the energy of the ligand in the context of the putative active site. They allow computational chemists to investigate modifications of lead molecules suggested by the chemical intuition and expertise of organic synthetic chemists.

AUTODOCK [6, 7] is an example of the latter, more physically detailed, flexible docking technique. Previous releases of AUTODOCK combine a rapid grid-based method for energy evaluation [8, 9], precalculating ligand-protein pairwise interaction energies so that they may be used as a look-up table during simulation, with a Monte Carlo simulated annealing search [10, 11] for optimal conformations of ligands. AUTODOCK has been applied with great success in the prediction of bound conformations of enzyme-inhibitor complexes [12, 13], peptide-antibody complexes, and even protein-protein interactions; these and other applications have been reviewed elsewhere.
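The look-up-table idea can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the toy 12-6 potential and its parameters, the single probe atom type, the nearest-grid-point lookup, and the function names `pair_energy`, `build_grid`, `lookup`, and `ligand_energy` are hypothetical and are not taken from AUTODOCK.

```python
import math

def pair_energy(r, eps=0.15, r_eq=4.0):
    """Toy 12-6 pair energy with illustrative parameters (not AUTODOCK's)."""
    r = max(r, 0.5)              # clamp so the energy stays finite at r -> 0
    s = (r_eq / r) ** 6
    return eps * (s * s - 2.0 * s)

def build_grid(receptor_atoms, origin, spacing, n_points):
    """Precompute the probe-receptor energy at every point of a cubic grid."""
    grid = {}
    for i in range(n_points):
        for j in range(n_points):
            for k in range(n_points):
                p = (origin[0] + i * spacing,
                     origin[1] + j * spacing,
                     origin[2] + k * spacing)
                grid[(i, j, k)] = sum(pair_energy(math.dist(p, a))
                                      for a in receptor_atoms)
    return grid

def lookup(grid, origin, spacing, n_points, point):
    """Return the precomputed energy at the grid point nearest to `point`."""
    idx = tuple(min(max(round((c - o) / spacing), 0), n_points - 1)
                for c, o in zip(point, origin))
    return grid[idx]

def ligand_energy(grid, origin, spacing, n_points, ligand_atoms):
    """Sum table look-ups over ligand atoms; no pairwise loop at docking time."""
    return sum(lookup(grid, origin, spacing, n_points, a) for a in ligand_atoms)
```

The cost of the pairwise sum is paid once in `build_grid`; each later energy evaluation is a handful of dictionary look-ups, which is the trade-off the paragraph above describes.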

We initiated the current work to remedy two limitations of AUTODOCK. (i) We have found that the simulated annealing search method performs well with ligands that have roughly eight rotatable bonds or fewer: problems with more degrees of freedom rapidly become intractable. This demanded a more efficient search method. (ii) AUTODOCK is often used to obtain unbiased dockings of flexible inhibitors in enzyme active sites: in computer-assisted drug design, novel modifications of such lead molecules can be investigated computationally. Like many other computational approaches, AUTODOCK performs well in predicting relative quantities and rankings for series of similar molecules; however, it has not been possible to estimate in AUTODOCK whether a ligand will bind with a millimolar, micromolar, or nanomolar binding constant. Earlier versions of AUTODOCK used a set of traditional molecular mechanics force-field parameters that were not directly correlated with observed binding free energies; hence, we needed to develop a force field that could be used to predict such quantities.

Molecular docking is a difficult optimization problem, requiring efficient sampling across the entire range of positional, orientational, and conformational possibilities. Genetic algorithms (GA) fulfill the role of global search particularly well, and are increasingly being applied to problems that suffer from combinatorial explosions due to their many degrees of freedom. Both canonical genetic algorithms [17-21] and evolutionary programming methods have been shown to be successful in both drug design and docking.

In this report, we describe two major advances that are included in the new release of AUTODOCK, version 3.0. The first is the addition of three new search methods: a genetic algorithm; a local search method; and a novel, adaptive global-local search method based on Lamarckian genetics, the Lamarckian genetic algorithm (LGA). The second advance is an empirical binding free energy force field that allows the prediction of binding free energies, and hence binding constants, for docked ligands.


Methods

GENETIC ALGORITHMS

Genetic algorithms use ideas based on the language of natural genetics and biological evolution. In the case of molecular docking, the particular arrangement of a ligand and a protein can be defined by a set of values describing the translation, orientation, and conformation of the ligand with respect to the protein: these are the ligand's state variables and, in the GA, each state variable corresponds to a gene. The ligand's state corresponds to the genotype, whereas its atomic coordinates correspond to the phenotype. In molecular docking, the fitness is the total interaction energy of the ligand with the protein, and is evaluated using the energy function. Random pairs of individuals are mated using a process of crossover, in which new individuals inherit genes from either parent. In addition, some offspring undergo random mutation, in which one gene changes by a random amount. Selection of the offspring of the current generation occurs based on the individual's fitness: thus, solutions better suited to their environment reproduce, whereas poorer suited ones die.

A variety of approaches have been adopted to improve the efficiency of the genetic algorithm. Classical genetic algorithms represent the genome as a fixed-length bit string, and employ binary crossover and binary mutation to generate new individuals in the population. Unfortunately, in many problems, such binary operators can generate values that are often outside the domain of interest, leading to gross inefficiencies in the search. The use of real encodings helps to limit the genetic algorithm to reasonable domains. Alternative genetic algorithms have been reported that employ more complicated representations and more sophisticated operators besides crossover and mutation. Some of these retain the binary representation, but must employ decoders and repair algorithms to avoid building illegal individuals from the chromosome, and these are frequently computationally intensive. However, the search performance of the genetic algorithm can be improved by introducing a local search method [26, 27].

HYBRID SEARCH METHODS IN AUTODOCK

Earlier versions of AUTODOCK used optimized variants of simulated annealing [6, 7]. Simulated annealing may be viewed as having both global and local search aspects, performing a more global search early in the run, when higher temperatures allow transitions over energy barriers separating energetic valleys, and later on performing a more local search when lower temperatures place more focus on local optimization in the current valley. AUTODOCK 3.0 retains the functionality of earlier versions, but adds the options of using a genetic algorithm (GA) for global searching, a local search (LS) method to perform energy minimization, or a combination of both, and builds on the work of Belew and Hart [27, 28]. The local search method is based on that of Solis and Wets, which has the advantage that it does not require gradient information about the local energy landscape, thus facilitating torsional space search. In addition, the local search method is adaptive, in that it adjusts the step size depending upon the recent history of energies: a user-defined number of consecutive failures, or increases in energy, cause the step size to be doubled; conversely, a user-defined number of consecutive successes, or decreases in energy, cause the step size to be halved. The hybrid of the GA method with the adaptive LS method together form the so-called Lamarckian genetic algorithm (LGA), which has enhanced performance relative to simulated annealing and GA alone [21, 26], and is described in detail later. Thus, the addition of these new GA-based docking methods enhances AUTODOCK, and allows problems with more degrees of freedom to be tackled. Furthermore, it is now possible to use the same force field as is used in docking to perform energy minimization of ligands.

IMPLEMENTATION

In our implementation of the genetic algorithm, the chromosome is composed of a string of real-valued genes: three Cartesian coordinates for the ligand translation; four variables defining a quaternion specifying the ligand orientation; and one real value for each ligand torsion, in that order. Quaternions are used to define the orientation of the ligand, to avoid the gimbal lock problem experienced with Euler angles. The order of the genes that encode the torsion angles is defined by the torsion tree created by AUTOTORS, a preparatory program used to select rotatable bonds in the ligand. Thus, there is a one-to-one mapping from the ligand's state variables to the genes of the individual's chromosome.
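The gene layout described above (three translation genes, four orientation genes, then one gene per torsion) can be sketched in Python as follows. The class `LigandState` and the helpers `to_chromosome` and `from_chromosome` are hypothetical names introduced for illustration, and the order of the four quaternion components is an assumption; the text only fixes the order of the three gene groups.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class LigandState:
    """State variables of a docked ligand (illustrative container)."""
    translation: Tuple[float, float, float]        # x, y, z
    quaternion: Tuple[float, float, float, float]  # orientation, 4 values
    torsions: List[float]                          # one angle per rotatable bond

def to_chromosome(state: LigandState) -> List[float]:
    """Flatten the state into genes: translation, quaternion, torsions."""
    return [*state.translation, *state.quaternion, *state.torsions]

def from_chromosome(genes: List[float]) -> LigandState:
    """Inverse of to_chromosome; the first 7 genes are always present."""
    if len(genes) < 7:
        raise ValueError("a chromosome needs at least 3 + 4 genes")
    return LigandState(tuple(genes[0:3]), tuple(genes[3:7]), list(genes[7:]))
```

Because the mapping is one-to-one, `from_chromosome(to_chromosome(s))` returns a state equal to `s`, and a ligand with k torsions always yields 7 + k genes.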

The genetic algorithm begins by creating a random population of individuals, where the user defines the number of individuals in the population. For each random individual in the initial population, each of the three translation genes for x, y, and z is given a uniformly distributed random value between the minimum and maximum x, y, and z extents of the grid maps, respectively; the four genes defining the orientation are given a random quaternion, consisting of a random unit vector and a random rotation angle between −180° and +180°; and the torsion angle genes, if any, are given random values between −180° and +180°. Furthermore, a new random number generator has been introduced that is hardware-independent. It is used in the LS, GA, and LGA search engines, and allows results to be reproduced on any hardware platform given the same seed values. The creation of the random initial population is followed by a loop over generations, repeating until the maximum number of generations or the maximum number of energy evaluations is reached, whichever comes first. A generation consists of five stages: mapping and fitness evaluation, selection, crossover, mutation, and elitist selection, in that order. In the Lamarckian GA, each generation is followed by local search, being performed on a user-defined proportion of the population. Each of these stages is discussed in more detail in what follows.
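A minimal Python sketch of the random initialisation described above is given here. The function names are hypothetical, Python's seeded `random.Random` stands in for the paper's hardware-independent generator, and converting the random axis and angle into a unit quaternion (axis scaled by the sine of the half-angle, plus the cosine of the half-angle) is an assumption about how the four orientation genes are stored.

```python
import math
import random

def random_unit_vector(rng):
    """Uniformly distributed direction: normalise three Gaussian components."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]

def random_individual(grid_min, grid_max, n_torsions, rng):
    """One chromosome: 3 translation, 4 orientation, n_torsions torsion genes."""
    translation = [rng.uniform(lo, hi) for lo, hi in zip(grid_min, grid_max)]
    axis = random_unit_vector(rng)
    half_angle = math.radians(rng.uniform(-180.0, 180.0)) / 2.0
    quaternion = [math.sin(half_angle) * a for a in axis] + [math.cos(half_angle)]
    torsions = [rng.uniform(-180.0, 180.0) for _ in range(n_torsions)]
    return translation + quaternion + torsions

def random_population(size, grid_min, grid_max, n_torsions, seed):
    """Seeded initial population; the same seed reproduces the same genes."""
    rng = random.Random(seed)
    return [random_individual(grid_min, grid_max, n_torsions, rng)
            for _ in range(size)]
```

Calling `random_population` twice with the same seed returns identical populations, which mirrors the reproducibility property the text attributes to the new generator.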

Mapping translates from each individual's genotype to its corresponding phenotype, and occurs over the entire population. This allows each individual's fitness to be evaluated. The fitness is the sum of the intermolecular interaction energy between the ligand and the protein, and the intramolecular interaction energy of the ligand. The physicochemical nature of the energy evaluation function is described in detail later. Every time an individual's energy is calculated, either during global or local search, a count of the total number of energy evaluations is incremented.

This is followed, in our implementation, by proportional selection to decide which individuals will reproduce. Thus, individuals that have better-than-average fitness receive proportionally more offspring, in accordance with:

$$n_o = \frac{f_w - f_i}{f_w - \langle f \rangle}, \qquad f_w \neq \langle f \rangle$$

where $n_o$ is the integer number of offspring to be allocated to the individual; $f_i$ is the fitness of the individual (i.e., the energy of the ligand); $f_w$ is the fitness of the worst individual, or highest energy, in the last N generations (N is a user-definable parameter, typically 10); and $\langle f \rangle$ is the mean fitness of the population. Because the worst fitness, $f_w$, will always be larger than either $f_i$ or $\langle f \rangle$, except when $f_i = f_w$, then for individuals that have a fitness lower than the mean, $f_i < \langle f \rangle$, the numerator in this equation, $f_w - f_i$, will always be greater than the denominator, $f_w - \langle f \rangle$, and thus such individuals will be allocated at least one offspring, and thus will be able to reproduce. AUTODOCK checks for $f_w = \langle f \rangle$ beforehand, and if true, the population is assumed to have converged, and the docking is terminated.
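As a small worked sketch of the proportional selection rule, the Python function below evaluates the ratio of (worst fitness minus individual fitness) to (worst fitness minus mean fitness) for each individual. The name `offspring_counts` and its argument `worst_recent` are hypothetical, and truncating the ratio to an integer is an assumption, since the text states only that the number of offspring is an integer.

```python
def offspring_counts(energies, worst_recent):
    """Offspring allocated to each individual by proportional selection.

    energies     -- current energies (lower is fitter)
    worst_recent -- highest energy seen in the last N generations
    Returns None when the worst fitness equals the mean, which the text
    treats as convergence of the population.
    """
    mean = sum(energies) / len(energies)
    if worst_recent == mean:
        return None  # converged: the docking would be terminated
    return [int((worst_recent - e) / (worst_recent - mean)) for e in energies]
```

For energies of -10, -5, and 0 with a worst recent energy of 0, the mean is -5, so the ratios are 2, 1, and 0: the best individual receives two offspring, the average one receives one, and the worst receives none.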

Crossover and mutation are performed on random members of the population according to user-defined rates of crossover and mutation. First, crossover is performed. Two-point crossover is used, with breaks occurring only between genes, never within a gene; this prevents erratic changes in the real values of the genes. Thus, both parents' chromosomes would be broken into three pieces at the same gene positions, each piece containing one or more genes; for instance, ABC and abc. The chromosomes of the resulting offspring after two-point crossover would be AbC and aBc. These offspring replace the parents in the population, keeping the population size constant. Crossover is followed by mutation; because the translational, orientational, and torsional genes are represented by real variables, the classical bit-flip mutation would be inappropriate. Instead, mutation is performed by adding a random real number that has a Cauchy distribution to the variable, the distribution being given by:

$$C(\alpha, \beta, x) = \frac{\beta}{\pi\left(\beta^2 + (x - \alpha)^2\right)}, \qquad \alpha \ge 0,\ \beta > 0,\ -\infty < x < \infty$$

where $\alpha$ and $\beta$ are parameters that affect the mean and spread of the distribution. The Cauchy distribution has a bias toward small deviates, but, unlike the Gaussian distribution, it has thick tails that enable it to generate large changes occasionally.
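The two operators can be sketched in Python as below. The function names are hypothetical; drawing the Cauchy deviate by inverting its cumulative distribution, and applying the mutation rate independently to each gene, are assumptions made for this sketch rather than details stated in the text.

```python
import math
import random

def two_point_crossover(parent_a, parent_b, i, j):
    """Swap the middle piece [i:j]; breaks fall only between whole genes."""
    if len(parent_a) != len(parent_b) or not (0 < i < j < len(parent_a)):
        raise ValueError("need equal lengths and 0 < i < j < length")
    child_1 = parent_a[:i] + parent_b[i:j] + parent_a[j:]
    child_2 = parent_b[:i] + parent_a[i:j] + parent_b[j:]
    return child_1, child_2

def cauchy_deviate(alpha, beta, rng):
    """Sample from C(alpha, beta, x) by inverting its cumulative distribution."""
    u = rng.random()
    return alpha + beta * math.tan(math.pi * (u - 0.5))

def mutate(genes, rate, alpha, beta, rng):
    """Add a Cauchy deviate to each gene with probability `rate`."""
    return [g + cauchy_deviate(alpha, beta, rng) if rng.random() < rate else g
            for g in genes]
```

With parents `list("ABC")` and `list("abc")` and break points 1 and 2, the children are AbC and aBc, matching the example in the text. Most deviates are small, but the heavy tails of the Cauchy distribution occasionally produce a large jump.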

An optional user-defined integer parameter elitism determines how many of the top individuals automatically survive into the next generation. If the elitism parameter is non-zero, the new population that has resulted from the proportional selection, crossover, and mutation is sorted according to its fitness; the fitness of new individuals


Frequently Asked Questions (12)

Q1. What contributions have the authors mentioned in the paper "Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function"?

The authors consider three search methods, Monte Carlo simulated annealing, a traditional genetic algorithm, and the Lamarckian genetic algorithm, and compare their performance in dockings of seven protein-ligand test systems having known three-dimensional structure. The authors show that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckian genetic algorithm is the most efficient, reliable, and successful of the three.

Q2. How many atoms were used to determine the similarity of the docked conformations?

The user-defined root-mean-square positional deviation (rmsd) tolerance was used to determine if two docked conformations were similar enough to be included in the same cluster, and symmetrically related atoms in the ligand were considered.

Q3. How many test systems were used to test the docking procedure?

Six of the seven test systems used to test the docking procedure, which were originally used to test AUTODOCK, version 2.4 [7], were also in the training set of 30 protein-ligand complexes; therefore, to validate the chosen coefficients, linear regression was repeated for the set of 24 protein-ligand complexes, excluding the 6 overlapping test systems.

Q4. How many retries were used to generate a low energy random initial state?

The maximum initial energy allowed was 0.0 kcal mol⁻¹, and the maximum number of retries was 1000, used to generate a low energy random initial state to begin each simulated annealing docking.

Q5. What is the atomic solvation parameter for a given atom?

The solvation parameter for a given atom (S, used in the equation above) is calculated as:

$$S_i = \mathrm{ASP}_i + \mathrm{QASP}\,|q_i| \qquad (4)$$

where $q_i$ is the atomic charge and ASP and QASP are the atomic solvation parameters derived here.

Q6. What was the charge assignment for terminal phosphate groups?

Charges on terminal phosphate groups were assigned improperly, with a total charge of 0.5, so the remaining 0.5 charge was split manually between the four surrounding oxygen atoms.

Q7. What is the term for the loss of torsional entropy upon binding?

The term for the loss of torsional entropy upon binding ($\Delta S_{conf}$) is directly proportional to the number of rotatable bonds in the molecule ($N_{tors}$):

$$\Delta S_{conf} = W_{conf} N_{tors} \qquad (3)$$

The number of rotatable bonds includes all torsional degrees of freedom, including rotation of polar hydrogen atoms on hydroxyl groups and the like.

Q8. How many complexes were used to calibrate the free energy function of AUTODOCK?

Thirty protein-ligand complexes with published binding constants were used in the calibration of AUTODOCK's free energy function (listed in a table in the paper), and were chosen from the set of 45 used by Böhm [54].

Q9. What are the first two terms for the bound and unbound states of the ligand?

The first two terms are intramolecular energies for the bound and unbound states of the ligand, and the following two terms are intramolecular energies for the bound and unbound states of the protein.

Q10. What is the rmsd of the lowest energy found by any search method?

The crystallographic rmsd of the lowest energy found by any search method for each of the protein-ligand test systems was within 1.14 Å of the crystal structure.

Q11. What is the reason for the large discrepancy?

This large discrepancy may be due to the conformational rearrangements of streptavidin upon binding biotin, which are neglected in the docking simulation and binding free energy calculation.

Q12. How many ligands were not predicted correctly by AutoDock 4?

The remaining 28 complexes were not predicted correctly by AutoDock 4, in most cases because they were very large ligands with greater than 15 degrees of torsional freedom (see Fig. 5).
