scispace - formally typeset
Open Access

Software News and Update A Semiempirical Free Energy Force Field with Charge-Based Desolvation

TLDR
In this article, a semi-empirical free energy force field for use in AutoDock4 and similar grid-based docking methods is presented, based on a comprehensive thermodynamic model that allows incorporation of intramolecular energies into the predicted free energy of binding.
Abstract
The authors describe the development and testing of a semiempirical free energy force field for use in AutoDock4 and similar grid-based docking methods. The force field is based on a comprehensive thermodynamic model that allows incorporation of intramolecular energies into the predicted free energy of binding. It also incorpo- rates a charge-based method for evaluation of desolvation designed to use a typical set of atom types. The method has been calibrated on a set of 188 diverse protein-ligand complexes of known structure and binding energy, and tested on a set of 100 complexes of ligands with retroviral proteases. The force field shows improvement in redock- ing simulations over the previous AutoDock3 force field.

read more

Content maybe subject to copyright    Report

Software News and Update
A Semiempirical Free Energy Force Field
with Charge-Based Desolvation
RUTH HUEY, GARRETT M. MORRIS, ARTHUR J. OLSON, DAVID S. GOODSELL
Department of Molecular Biology, Scripps Research Institute, La Jolla, California 92102
Received 26 January 2006; Revised 13 March 2006; Accepted 24 April 2006
DOI 10.1002/jcc.20634
Published online 1 February 2007 in Wiley InterScience (www.interscience.wiley.com).
Abstract: The authors describe the development and testing of a semiempirical free energy force field for use in
AutoDock4 and similar grid-based docking methods. The force field is based on a comprehensive thermodynamic
model that allows incorporation of intramolecular energies into the predicted free energy of binding. It also incorpo-
rates a charge-based method for evaluation of desolvation designed to use a typical set of atom types. The method
has been calibrated on a set of 188 diverse protein–ligand complexes of known structure and binding energy, and
tested on a set of 100 complexes of ligands with retroviral proteases. The force field shows improvement in redock-
ing simulations over the previous AutoDock3 force field.
q 2007 Wiley Periodicals, Inc. J Comput Chem 28: 1145–1152, 2007
Key words: computational docking; prediction of free energy; force fields; desolvation models; computer-aided
drug design; AutoDock4
Introduction
Computational docking currently plays an essential role in drug
design and in the study of macromolecular structure and interac-
tion.
1–5
Docking simulations require two basic methods: a search
method for exploring the conformational space available to the
system and a force field to evaluate the energetics of each con-
formation. Given the desire to minimize computational effort,
there is a trade-off between these two elements. We may choose
to use a highly sophisticated force field while searching only a
small portion of the conformational space. This is the approach
taken by methods such as molecular dynamics and free energy
perturbation, which use physically based energy functions com-
bined with full atomic simulation to yield accurate estimates of
the energetics of molecular processes.
6
However, these methods
are currently too computationally intensive to allow blind dock-
ing of a ligand to a protein. Computational docking is typically
performed by employing a simpler force field and exploring a
wider region of conformational space. This is the approach taken
by AutoDock (http://autodock.scripps.edu) and other computa-
tional docking methods, which may predict bound conformation
with no a priori knowledge of the binding site or its location on
the macromolecule.
7–9
Empirical free energy force fields, which define simple func-
tional forms for ligand–protein interactions, and semiempirical
free energy force fields, which combine traditional molecular
mechanics force fields with empirical weights and/or empirical
functional forms, have been used by a wide variety of computa-
tional docking methods.
7–9
These force fields provide a fast
method to rank potential inhibitor candidates or bound states
based on an empirical score. In some cases, this score may be
calibrated to yield an estimate of the free energy of binding.
The AutoDock3 force field
10
is an example. It uses a molecu-
lar mechanics approach to evaluate enthalpic contributions such
as dispersion/repulsion and hydrogen bonding and an empirical
approach to evaluate the entropic contribution of changes in sol-
vation and conformational mobility. Empirical weights are
applied to each of the components based on calibration against a
set of known binding constants. The final semiempirical force
field is designed to yield an estimate of the binding constant.
We describe here the development and testing of an
improved semiempirical free energy force field for AutoDock4,
designed to address several significant limitations of the Auto-
Dock3 force field. The new force field, which has been cali-
brated using a large set of diverse protein–ligand complexes,
includes two major advances. The first is the use of an improved
thermodynamic model of the binding process, which now allows
inclusion of intramolecular terms in the estimated free energy.
Second, the force field includes a full desolvation model that
includes terms for all atom types, including the favorable ener-
getics of desolvating carbon atoms as well as the unfavorable
Contract/grant sponsor: National Institutes of Health; contract/grant num-
bers: R01 GM069832, P01 GM48870
Correspondence to: D.S. Goodsell; e-mail: goodsell@scripps.edu
q 2007 Wiley Periodicals, Inc.

energetics of desolvating polar and charged atoms. The force
field also incorporates an improved model of directionality in
hydrogen bonds, now predicting the proper alignment of groups
with multiple hydrogen bonds such as DNA bases.
11
Methods
Overview
The semiempirical free energy force field estimates the ener-
getics of the process of binding of two (or more) molecules in a
water environment using pair-wise terms to evaluate the interac-
tion between the two molecules and an empirical method to esti-
mate the contribution of the surrounding water. This differs
from a traditional molecular mechanics force field, which also
relies on pair-wise atomic terms, but typically uses explicit
water molecules to evaluate solvation contributions. The goal of
the empirical free energy force field is to capture the complex
enthalpic and entropic contributions in a limited number of eas-
ily evaluated terms.
The approach taken in AutoDock is shown in Figure 1. The
free energy of binding is estimated to be equal to the difference
between (1) the energy of the ligand and the protein in a sepa-
rated unbound state and (2) the energy of the ligand–protein
complex. This is broken into two steps for the evaluation: we
evaluate the intramolecular energetics of the transition from the
unbound state to the bound conformation for each of the mole-
cules separately, and then evaluate the intermolecular energetics
of bringing the two molecules together into the bound complex.
The force field includes six pair-wise evaluations (V) and an
estimate of the conformational entropy lost upon binding (DS
conf
):
G V
LL
bound
V
LL
unbound
V
PP
bound
V
PP
unbound
V
PL
bound
V
PL
unbound
S
conf
1
In this equation, L refers to the ‘ligand’ and P refers to the
‘protein’ in a protein–ligand complex; note, however, that the
approach is equally valid for any types of molecules in a com-
plex. The first two terms are intramolecular energies for the
bound and unbound states of the ligand, and the following two
terms are intramolecular energies for the bound and unbound
states of the protein. The change in intermolecular energy
between the bound and unbound states is in the third parenthe-
ses. It is assumed that the two molecules are sufficiently distant
from one another in the unbound state that V
P–L
unbound
is zero. In
the current study, we did not allow motion in the protein, so the
bound state of the protein is identical with the protein unbound
state, and the difference in their intramolecular energy is zero.
The pair-wise atomic terms include evaluations for disper-
sion/repulsion, hydrogen bonding, electrostatics, and desolvation:
V W
vdw
X
i;j
A
ij
r
12
ij
B
ij
r
6
ij
!
W
hbound
X
i;j
Et
C
ij
r
12
ij
D
ij
r
10
ij
!
W
elec
X
i;j
q
i
q
j
"r
ij
r
ij
W
sol
X
i;j
S
i
V
j
S
j
V
i
e
r
2
ij
=2
2
2
The weighting constants W are the ones that are optimized to
calibrate the empirical free energy based on a set of experimen-
tally characterized complexes. The first term is a typical 6/12
potential for dispersion/repulsion interactions. Parameters A and
B were taken from the Amber force field.
12
The second term is
a directional H-bond term based on a 10/12 potential.
13
The pa-
rameters C and D are assigned to give a maximal well depth of
5 kcal/mol at 1.9 A
˚
for OH and NH, and a depth of 1 kcal/
mol at 2.5 A
˚
for SH. Directionality of the hydrogen bond
interaction E(t) is dependent on the angle t away from ideal
bonding geometry and is described fully in previous work.
10,14
Directionality is further enhanced by limiting the number of
hydrogen bonds available to each point in the grid to the actual
Figure 1. The force field evaluates binding in two steps. The ligand and protein start in an unbound con-
formation. The first step evaluates the intramolecular energetics of the transition from these unbound states
to the conformation that the ligand or protein will adopt in the bound complex. The second step evaluates
the intermolecular energetics of combining the ligand and protein in their bound conformations.
1146 Huey et al.
Vol. 28, No. 6
Journal of Computational Chemistry
Journal of Computational Chemistry DOI 10.1002/jcc

number of hydrogen bonds that could be formed.
11
Electrostatic
interactions are evaluated with a screened Coulomb potential
identical with that used in AutoDock3.
15
The final term is a des-
olvation potential based on the volume (V) of the atoms sur-
rounding a given atom, weighted by a solvation parameter (S)
and an exponential term based on the distance.
16
As in the origi-
nal report, the distance weighting factor is set to 3.5 A
˚
.
As in our previous work, this force field is calibrated for a
united atom model, which explicitly includes heavy atoms and
polar hydrogen atoms. Intramolecular energies are calculated for
all pairs of atoms within the ligand (or protein, if it has free tor-
sional degrees of freedom), excluding 1–2, 1–3, and 1–4 interac-
tions.
The term for the loss of torsional entropy upon binding
(DS
conf
) is directly proportional to the number of rotatable bonds
in the molecule (N
tors
):
S
conf
W
conf
N
tors
(3)
The number of rotatable bonds include all torsional degrees
of freedom, including rotation of polar hydrogen atoms on
hydroxyl groups and the like.
Desolvation
The desolvation term is evaluated using the general approach of
Wesson and Eisenberg.
17
Two pieces of information are needed
(1) an atomic solvation parameter for each atom type, which is
an estimate of the energy needed to transfer the atom between a
fully hydrated state and a fully buried state [S
i
in eq. (2)] and
(2) an estimate of the amount of desolvation when the ligand is
docked [V
i
in eq. (2)]. The amount of desolvation is calculated
using a volume-summing method similar to the Stouten et al.
method.
16
We have developed a modified approach for the
atomic solvation parameters based on the chemical type and the
atomic charge of the atom. This approach is designed to use the
simple set atom types used in AutoDock and other docking
methods. Incorporation of the atomic charge into the solvation
parameter removes the need to use two discrete charged and
uncharged atom types for oxygen and nitrogen. This is particu-
larly useful with hybridized charged groups such as carboxylic
acids, which require arbitrary assignment of a formal charge on
one oxygen or the other when using previous approaches.
The solvation parameter for a given atom (S, used in the
equation above) is calculated as:
S
i
ASP
i
QASP jq
i
j (4)
where q
i
is the atomic charge and ASP and QASP are the
atomic solvation parameters derived here. The ASP is calibrated
using six atom types: aliphatic carbons (C), aromatic carbons
(A), nitrogen, oxygen, sulfur, and hydrogen. A single QASP is
calibrated over the set of charges on all atom types.
The estimation of the extent of desolvation is calibrated to
use atomic volumes calculated as a sphere with radius equal to
the contact radius (C/A, 2.00; N, 1.75; O, 1.60; S, 2.00). The
amount of shielding in a typical protein was evaluated using the
188 proteins in the calibration set (described later). For each
atom in the protein, the volume term in the free energy equation
was evaluated:
V
i
X
k6i
V
k
e
r
ik
2
=2
2
(5)
where the sum is performed over all atoms k in the protein,
excluding all atoms in the same amino acid residue as i. The
maximal value of DV for each amino acid type over the entire
set of proteins was then determined. These values were used to
perform a least-squares fit of the model to a set of experimental
vacuum-to-water transfer energies,
18
to determine values for the
atomic solvation parameters ASP and QASP. The program R
(http://www.r-project.org) was used to perform the fit. Several
formulations were tested. The best results were obtained using
two carbon types, a nonaromatic (C) and an aromatic (A) type
(Table1), where the aromatic type is limited to carbon atoms
within aromatic ring structures. The maximum error was 1.88
kcal/mol in that case. Using a single carbon type, the maximum
error increased to 3.00 kcal/mol, and use of a single type for all
atoms, along with the charge, gave a maximum error of 4.12
kcal/mol.
We used a simple approximation for incorporation of addi-
tional atom types in the desolvation model. The ASP is assigned
to the average of the values from the six atom types used in the
calibration and the same QASP is applied.
Unbound States
This method relies on assignment of an unbound state for the
ligand and protein. In this work, we tested three approaches to
the unbound state, as shown in Figure 2. These states are simple
approximations to the ensemble of unbound conformations, mak-
ing a few extreme assumptions about which conformations dom-
inate the energetics of the ensemble.
The first approach (the ‘extended’ state) is a fully extended
conformation, which models a fully solvated conformation with
few internal contacts. A short optimization was performed on
the ligand in isolation using a uniform potential inversely pro-
portional to the distance between each pair of atoms. This
pushes all atoms as far away from one another as possible.
Table 1. Calibration of the Desolvation Model.
Type ASP (std error)
C 0.00143 (0.00019)
A 0.00052 (0.00012)
N 0.00162 (0.00182)
O 0.00251 (0.00189)
H 0.00051 (0.00052)
S 0.00214 (0.00118)
QASP 0.01097 (0.00263). Note that these values must be multiplied
by the empirically determined weighting factor [W
sol
in eq. (2)] for the
estimate of the free energy.
1147Semiempirical Free Energy Force Field
Journal of Computational Chemistry DOI 10.1002/jcc

The second approach (the ‘compact’ state) is a minimized
conformation that has substantial internal contacts, modeling a
folded state for the unbound ligand. We wished to include the
new solvation model in the determination of this unbound state,
so we used AutoDock4 with values of the energetic parameters
taken from the calibration using the extended state. A short
Lamarckian genetic algorithm conformational search was per-
formed, using an empty affinity grid. As expected, these confor-
mations tend to bury hydrophobic portions inside and form inter-
nal hydrogen bond interactions.
The final approach (the ‘bound’ state) uses the assumption
used in AutoDock3 and many other docking methods. In this, it
is assumed that the conformation of the unbound state is identi-
cal to the conformation of the bound state.
Coordinate Sets
The force field was calibrated and tested using a large collection
of protein complexes for which experimental information on
binding strength is available. These complexes were taken from
two sources.
The force field was calibrated on a set of 188 complexes.
Binding data were obtained from the Ligand–Protein Database
(http://lpdb.scripps.edu), and coordinates were obtained from the
Protein Data Bank (http://www.pdb.org). These complexes were
checked and corrected if necessary for the proper biological
unit, and files were regularized to remove alternate locations and
to have consistent naming of atoms. Hydrogen atoms were then
added automatically using Babel,
19
atomic charges were added
using the Gasteiger PEOE method,
20
and then nonpolar hydro-
gen atoms were merged. The Gasteiger method was chosen for
its fast and easy operation and ready availability as part of Ba-
bel. Formal charges were assigned to metal ions. Charges on ter-
minal phosphate groups were assigned improperly, with a total
charge of 0.5, so the remaining 0.5 charge was split man-
ually between the four surrounding oxygen atoms. Babel also
treated a sulfonamide group in several ligands as neutral; a sin-
gle negative charge, as reported in the original structure reports,
was split manually between the oxygen and nitrogen atoms in
these groups. Ligands were processed in ADT (the graphical
interface to AutoDock, http://autodock.scripps.edu/resources/adt)
to assign atom types and torsion degrees of freedom. Finally, a
short optimization of the ligand was performed using the local
search capability of AutoDock3, to relax any unacceptable con-
tacts in the crystallographic conformation.
Binding data for a test set of 100 retroviral protease com-
plexes was obtained from the PDBBind database (http://www.
pdbbind.org), and coordinates were obtained from the Protein
Data Bank. They were processed similarly to the calibration set.
Redocking
Redocking experiments were performed with AutoDock4 and
the new empirical free energy force field. For each complex, 50
docking experiments were performed using the Lamarckian
genetic algorithm conformational search with the default param-
eters from AutoDock3. A maximum of 25 million energy evalu-
ations was applied for each experiment. The results were clus-
tered using a tolerance of 2.0 A
˚
. For comparison, docking
experiments with the same parameters were performed with
AutoDock3.
Results and Discussion
Performance
The force field was calibrated on a set of 188 complexes tabu-
lated in the Ligand–Protein Database. Calibration with the three
Figure 2. Comparison of the extended, compact, and bound confor-
mations of the HIV protease inhibitor indinavir, taken from PDB
entry 1hsg. Note that the extended and bound states are quite simi-
lar, and that the hydrophobic groups have formed a cluster in the
compact state.
1148 Huey et al.
Vol. 28, No. 6
Journal of Computational Chemistry
Journal of Computational Chemistry DOI 10.1002/jcc

different unbound states gave similar results, as shown in Figure
3. Table 2 shows standard errors for the three unbound states for
several sets. The first line is standard error over the entire set of
188 complexes used in the calibration. The second line is the
standard error calculated for the set of 155 complexes that do
not have close ligand–metal contacts, using parameters cali-
brated against the whole set. The performance is significantly
better on this smaller set of proteins. The third line is the stand-
ard error when evaluating binding energies for a naive set of
100 HIV protease complexes, not used in the calibration. The
final four lines are results from a cross-validation study, where
1/4 of the complexes were removed from the calibration set, and
then evaluated with the parameters calibrated with the remaining
3/4 of the set.
Table 3 shows parameters for the three calibrations, including
standard errors and t values for the parameters. The compact
state is clearly the worst of the three. The extended and bound
states have similar performance, with the bound state showing
slightly better statistics in the calibration and the cross-valida-
tion, but poorer performance with the HIV test set. Based on
these results, we have chosen the extended conformation as the
default unbound state in AutoDock4. However, it is possible
within AutoDock4 to use any of the models for the unbound
state, with the appropriate values for the force field parameters,
if there is specific information for a given application about the
nature of the unbound state.
The new force field requires slightly more computation time
for the energy evaluation than the previous AutoDock3 force
field. It requires precalculation of the energy of the unbound
state, which is not performed in AutoDock3. During the docking
simulation, it requires one additional operation to look-up the
charge-dependent desolvation component from a precalculated
AutoGrid map.
Magnitude of Terms
The total predicted free energy is surprisingly well distributed
between the four pair-wise terms, as shown in Table 4. The dis-
persion/repulsion term provides an average of about 0.3 to
0.5 kcal/mol per nonhydrogen ligand atom. Hydrogen bonding
provides approximately 0.6 kcal/mol for single hydrogen
bonds and twice that with oxygen atoms that accept two hydro-
gen bonds. Electrostatic energies range widely, but give an over-
all average that is favorable by a few tenths of a kcal/mol. How-
ever, in the best cases of interactions with metal ions, electro-
statics can provide significant stabilization of several kcal/mol.
Desolvation of hydrophobic groups provides a favorable interac-
tion of 0.13 kcal/mol in the best cases and the worst case of
Figure 3. Performance of the new force field. These graphs are histograms showing the number of
complexes with a given error in the predicted free energy of binding. Values of zero correspond to
complexes that are perfectly predicted, positive values are cases where the predicted energy is too
favorable (too negative). The random curve was generated by using a random number between 0 and 1
for values of the h-bond, dispersion/repulsion, electrostatic, and desolvation energies for complexes in
the calibration set, and then deriving parameters based on these randomized energies. (A) Results for
the calibration set. (B) Results for the HIV protease test set.
Table 2. Calibration Results.
Coordinate set Extended Compact Bound AutoDock3
Std error (188 complexes) 2.62 2.72 2.52 2.63
Std error (155 complexes) 2.35 2.51 2.25 2.33
Std error (HIV test set) 2.80 2.52 2.99 3.48
Std error (cross validation) 2.41 2.57 2.36
2.44 2.47 2.36
2.83 2.92 2.73
3.24 3.38 3.07
Values for standard errors are in kcal/mol.
1149Semiempirical Free Energy Force Field
Journal of Computational Chemistry DOI 10.1002/jcc

Citations
More filters
Journal ArticleDOI

AutoDock4 and AutoDockTools4: Automated docking with selective receptor flexibility

TL;DR: AutoDock4 incorporates limited flexibility in the receptor and its utility in analysis of covalently bound ligands is reported, using both a grid‐based docking method and a modification of the flexible sidechain technique.
Journal ArticleDOI

SwissParam: a fast force field generation tool for small organic molecules.

TL;DR: A fast force field generation tool, called SwissParam, able to generate, for arbitrary small organic molecule, topologies, and parameters based on the Merck molecular force field, but in a functional form that is compatible with the CHARMM force field is presented.
Journal ArticleDOI

Computational Methods in Drug Discovery

TL;DR: Computer-aided drug discovery/design methods have played a major role in the development of therapeutically important small molecules for over three decades and theory behind the most important methods and recent successful applications are discussed.
Journal ArticleDOI

Ligand docking and binding site analysis with PyMOL and Autodock/Vina.

TL;DR: An interface between the popular molecular graphics system PyMOL and the molecular docking suites Autodock and Vina is presented and it is demonstrated how the combination of docking and visualization can aid structure-based drug design efforts.
Journal ArticleDOI

Resveratrol Ameliorates Aging-Related Metabolic Phenotypes by Inhibiting cAMP Phosphodiesterases

TL;DR: It is reported that the metabolic effects of resveratrol result from competitive inhibition of cAMP-degrading phosphodiesterases, leading to elevated cAMP levels, and administration of PDE4 inhibitors may also protect against and ameliorate the symptoms of metabolic diseases associated with aging.
References
More filters
Journal ArticleDOI

Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function

TL;DR: It is shown that both the traditional and Lamarckian genetic algorithms can handle ligands with more degrees of freedom than the simulated annealing method used in earlier versions of AUTODOCK, and that the Lamarckia genetic algorithm is the most efficient, reliable, and successful of the three.
Journal ArticleDOI

A new force field for molecular mechanical simulation of nucleic acids and proteins

TL;DR: In this paper, a force field for simulation of nucleic acids and proteins is presented, which is based on the ECEPP, UNECEPP, and EPEN energy refinement software.
Journal ArticleDOI

Iterative partial equalization of orbital electronegativity – a rapid access to atomic charges

TL;DR: In this article, a method for the rapid calculation of atomic charges in σ-bonded and nonconjugated π-systems is presented, where only the connectivities of the atoms are considered.
Journal ArticleDOI

Docking and scoring in virtual screening for drug discovery: methods and applications.

TL;DR: Key concepts and specific features of small-molecule–protein docking methods are reviewed, selected applications are highlighted and recent advances that aim to address the acknowledged limitations of established approaches are discussed.
Journal ArticleDOI

A computational procedure for determining energetically favorable binding sites on biologically important macromolecules.

TL;DR: The interaction of a probe group with a protein of known structure is computed at sample positions throughout and around the macromolecule, giving an array of energy values.
Related Papers (5)