Journal Article

Structural requirements for potential HIV-integrase inhibitors identified using pharmacophore-based virtual screening and molecular dynamics studies

23 Feb 2016-Molecular BioSystems (The Royal Society of Chemistry)-Vol. 12, Iss: 3, pp 982-993
TL;DR: The study suggested that the screened compounds might be promising HIV-integrase inhibitors and new chemical entities obtained from the NCI database will be subjected to experimental studies to confirm potential inhibition of HIV integrase.
Abstract: Acquired immunodeficiency syndrome (AIDS) is a life-threatening disease which is a collection of symptoms and infections caused by a retrovirus, human immunodeficiency virus (HIV). There is currently no curative treatment and therapy is reliant on the use of existing anti-retroviral drugs. Pharmacoinformatics approaches have already proven their pivotal role in the pharmaceutical industry for lead identification and optimization. In the current study, we analysed the binding preferences and inhibitory activity of HIV-integrase inhibitors using pharmacoinformatics. A set of 30 compounds were selected as the training set of a total 540 molecules for pharmacophore model generation. The final model was validated by statistical parameters and further used for virtual screening. The best mapped model (R = 0.940, RMSD = 2.847, Q2 = 0.912, se = 0.498, Rpred2 = 0.847 and rm(test)2 = 0.636) explained that two hydrogen bond acceptor and one aromatic ring features were crucial for the inhibition of HIV-integrase. From virtual screening, initial hits were sorted using a number of parameters and finally two compounds were proposed as promising HIV-integrase inhibitors. Drug-likeness properties of the final screened compounds were compared to FDA approved HIV-integrase inhibitors. HIV-integrase structure in complex with the most active and final screened compounds were subjected to 50 ns molecular dynamics (MD) simulation studies to check comparative stability of the complexes. The study suggested that the screened compounds might be promising HIV-integrase inhibitors. The new chemical entities obtained from the NCI database will be subjected to experimental studies to confirm potential inhibition of HIV integrase.

Summary (4 min read)

Introduction

  • Human immunodeficiency virus (HIV) is the aetiological agent of acquired immunodeficiency syndrome (AIDS) which destroys the immune system of the body leaving the victim vulnerable to infections, malignancies and neurological disorder.
  • To date, highly active antiretroviral therapy 6 , a combined therapy using the above classes of inhibitors, is widely used for patients with advanced infection but has failed to eradicate the virus.
  • Several research groups worldwide identified integrase inhibitors 11-17 using pharmacoinformatics approaches for potential application for HIV therapy.
  • Pharmacophore models are widely used in drug discovery: they provide valuable information for studying SAR and reveal the mechanism of the ligand-target relationship by deducing the nature of functional groups and non-covalent bonding patterns 20 .

Materials and methods

  • At present, several popular commercial and freeware packages are used for ligand-based methods to derive 3D pharmacophore models and to help estimate biological activities.
  • Discovery Studio 4.0 (DS) is commercially available software containing several module packages and is widely used in pharmacoinformatics drug discovery 25-28 .
  • The 3D QSAR Pharmacophore Generation module takes structure and activity data for a set of potential HIV-integrase ligands as input to create hypotheses.
  • The HypoGen allows identification of hypotheses that are common to the ‘active’ molecules of training set but absent in the ‘inactive’ molecules, whilst HipHop identifies hypotheses present both in ‘active’ and ‘inactive’ compounds.
  • In the present work the HypoGen module was used to generate the hypotheses.

Dataset

  • It was also kept in mind that no compounds were common in any two training sets except for the most active and least active molecules.
  • For each compound, the coordinates were corrected, atoms were typed and energy was minimized using the modified CHARMm force field 30, 31 .

Pharmacophore model generation

  • In order to generate the pharmacophore space model the 3D QSAR Pharmacophore Model Generation module of DS was used.
  • Of the BEST and FAST methods, the BEST method was chosen to obtain multiple acceptable conformations, as it provides complete and enhanced coverage of conformational space through rigorous energy minimization and optimization of the conformations by the poling algorithm 32 .
  • In the final step, the remaining hypotheses are refined by small perturbations to improve their scores 33, 35 .
  • All four sets were used to develop the pharmacophore models and statistical parameters were calculated based on training and test set molecules.
  • The structures and biological activities of the test set compounds of Set 1 are provided in Table S5 in the supplementary information, while all the data regarding the training set (Set 1) molecules are reported in Fig. 1 (2D chemical structures of the training set compounds, with activity values (IC50) given in parentheses).

Internal validation

  • Leave-one-out (LOO) cross-validation is one of the important internal validation protocols for the selected model, in which one compound is deleted from the training set in each cycle and the model is redeveloped using the rest of the compounds with the same parameters used in the original model.
  • The activity of the deleted compound was calculated based on the newly developed model.
  • The above procedure was applied to all molecules of the training set and the predicted activities were recorded.
  • The modified r2 parameter (r2m(LOO)) measures the degree of deviation of the predicted activity from the observed ones.

Cost function analysis

  • To select the final pharmacophore model, several parameters were employed at the time of hypothesis generation; these included spacing, uncertainty, and weight variation.
  • The spacing represents the minimum inter-feature distance permissible in the final hypothesis, while the weight variation reflects the level of magnitude explored by the hypothesis, in which every feature represents some degree of magnitude of the biological activity of the compound.
  • The uncertainty reflects the error of prediction and signifies the standard deviation of the error cost.
  • The weight cost is directly proportional to the deviation of the weight variation from its input value.
  • The configuration cost represents the entropy of the hypothesis space, and it is reported that its value should be <17 for a good pharmacophore model.

Test set prediction

  • Any robust pharmacoinformatics model should have the capability to predict the activities of the compounds other than training set.
  • In case of moderately active compounds, 17 and 6 compounds were underestimated as highly active and overestimated as least active respectively.
  • The remaining 450 compounds were classified in their observed and predicted activity correctly (Table S5 in supplementary file) which suggests that the selected model was able to provide accurate estimation for the biological activities of external compounds.
  • The correlation (R) between observed and estimated activity of test compounds was found to be 0.915 and the R 2 pred value of 0.847 with error of prediction (sp) of 0.697.
  • The r 2 m(test) and Δr 2 m(test) were found to be 0.636 and 0.130 respectively, explaining that selected model has adequate predictive potential.

Decoy set

  • Decoy set validation of pharmacophore model is one of the important approaches to evaluate the screening capability of the model.
  • Hypo1 was used to screen a set of 900 HIV-integrase decoys obtained with DecoyFinder 1.1, combined with 80 active HIV-integrase inhibitors.
  • The ROC plot derived for the model is given in Fig. 5 and indicates that actives and decoys are well classified.
  • The average EF1% value for the pharmacophore model was found to be 9.80, which indicates that the model identified active compounds very well and that the top 1% of hits is enriched with active compounds.
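The enrichment factor quoted above can be illustrated with a minimal sketch. It assumes the commonly used definition, EF = (actives in the top fraction / compounds in the top fraction) / (total actives / total compounds); the summary does not state which variant the authors used. The database size (900 decoys plus 80 actives) is from the text, while the count of 8 actives among the top 10 ranked compounds is a hypothetical value chosen only for illustration, and the function name is ours.

```python
def enrichment_factor(actives_in_top, n_top, total_actives, n_total):
    """Enrichment factor under the common definition (an assumption here):
    hit rate in the top-ranked subset divided by the hit rate in the whole set."""
    if n_top <= 0 or total_actives <= 0 or n_total <= 0:
        raise ValueError("counts must be positive")
    hit_rate_top = actives_in_top / n_top
    hit_rate_all = total_actives / n_total
    return hit_rate_top / hit_rate_all


# Database size from the text: 900 decoys combined with 80 actives.
n_total = 900 + 80

# Hypothetical ranking outcome (NOT from the paper): 8 actives among the
# top 10 compounds, i.e. roughly the top 1% of 980.
ef_1pct = enrichment_factor(actives_in_top=8, n_top=10,
                            total_actives=80, n_total=n_total)

# Upper bound for this database: every top-ranked compound is an active.
ef_max = enrichment_factor(actives_in_top=10, n_top=10,
                           total_actives=80, n_total=n_total)

print(round(ef_1pct, 2), round(ef_max, 2))
```

Under this definition the largest value attainable for a 980-compound set with 80 actives is 12.25, which gives a scale against which an EF1% of 9.80 can be read.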

Molecular docking

  • The molecular docking study predicts the preferred orientation of a molecule at the receptor site of the macromolecule.
  • The crystal structure of HIV integrase (PDB ID: 1QS4) was retrieved from the RCSB Protein Data Bank.
  • Self-docking 54 is one of the important techniques for validating a docking protocol: the bound ligand is docked at the catalytic site of the protein, and the conformer of the original bound ligand is superimposed on the docked poses to calculate root mean square deviation (RMSD) values.
  • The RMSD value was found to be 1.406 Å, which indicated that the selected docking protocol was validated.
  • Asp116 was also found to be important, forming one hydrogen bond and one bump interaction with NSC651812.
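The self-docking check described above reduces to an RMSD between matched atoms of the crystallographic ligand and the re-docked pose. The following is a minimal sketch, assuming both poses are already in the same coordinate frame with atoms listed in the same order; the coordinates are made up for illustration, and the 2.0 Å acceptance cutoff is a commonly quoted rule of thumb rather than a value taken from the paper.

```python
import math


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two equally ordered lists of
    (x, y, z) coordinates, without any re-alignment."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Hypothetical heavy-atom coordinates in angstroms (illustration only).
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.4, 0.0)]
docked_pose = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (1.5, 1.4, 1.0)]

value = rmsd(crystal_pose, docked_pose)
ASSUMED_CUTOFF = 2.0  # rule-of-thumb threshold, an assumption
print(f"RMSD = {value:.3f} A, accepted = {value <= ASSUMED_CUTOFF}")
```

With such a cutoff, the reported value of 1.406 Å would count as a successful reproduction of the bound pose.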

Molecular Dynamics

  • AMBER 12 48 was used to perform the MD simulations of the selected docked poses and part of the trajectory analysis.
  • The generalized AMBER force field was used for preparation of both ligand and receptor.
  • Each system was minimized for 500 steps each of the conjugate gradient and steepest descent methods.
  • In order to compare the drug-likeness of the screened compounds with existing Food and Drug Administration (FDA) approved HIV integrase inhibitors different parameters including dockscore, estimated activity, fit value, molecular weight, logP, violation of Lipinski’s rule of five, molecular volume, molecular refractivity, number of H-bonds and number of bump interactions were analysed.
  • LogP measures the hydrophobicity of the molecules.

Results and Discussion

  • The HypoGen module was used to develop the pharmacophore model based on training set (ntr = 30) compounds selected from the whole dataset.
  • The Feature Mapping protocol of DS was used to select the pharmacophoric features; ‘HBA’, ‘HBD’, ‘H’ and ‘R’ were identified as the required chemical features and were given as input to the 3D QSAR pharmacophore generation, with the minimum and maximum feature values set to ‘0’ and ‘5’ respectively.
  • Table 1 shows that a high correlation coefficient, a lower rmsd, the highest cost difference and the minimum error values were observed for Hypo1 in comparison to the other hypotheses.
  • The predicted activity of the training set molecules explained that one active compound was overestimated as moderately active and two moderately active molecules were underestimated as active compounds.
  • Therefore it can be postulated that the HBA and R features, with their critical inter-feature distances (Fig. 2c), will be crucial in designing or synthesizing new chemical entities as HIV integrase inhibitors.

Fischer’s randomization results (Table 3: validation, correlation, total cost)

  • From Table 3, the average of correlation coefficient for all 19 trials was found to be 0.642.
  • It was also observed that the total costs of randomized runs were much higher than the total cost of Hypo 1.
  • The above discussion undoubtedly demonstrated that the selected pharmacophore model was not produced by chance.

Virtual screening

  • Virtual screening is a powerful technique for finding potential HIV integrase inhibitors and an effective alternative to high-throughput screening methodologies.
  • ‘Maximum Hits’ was set to 600 for each screening method.
  • After deletion of redundant molecules, the remaining 1121 compounds were fitted to the pharmacophore model by the Ligand Pharmacophore Mapping protocol of DS with maximum omitted feature set to ‘0’.
  • Furthermore, Lipinski’s rule of five 52 and Veber’s rule 53 were checked for 13 compounds.
  • The remaining 5 compounds further were taken into consideration for molecular docking study in the active site of HIV integrase (PDB ID: 1QS4).
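The drug-likeness filters mentioned above can be sketched as simple threshold checks. This is an illustrative sketch only: it uses the usual textbook cutoffs (Lipinski: molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10; Veber: rotatable bonds ≤ 10 and polar surface area ≤ 140 Å²), the descriptor values are hypothetical, and the authors' DS protocol may apply the rules differently.

```python
def lipinski_violations(mol_weight, logp, h_donors, h_acceptors):
    """Count violations of Lipinski's rule of five (textbook cutoffs)."""
    checks = [
        mol_weight > 500,
        logp > 5,
        h_donors > 5,
        h_acceptors > 10,
    ]
    return sum(checks)


def passes_veber(rotatable_bonds, polar_surface_area):
    """Veber's criteria: at most 10 rotatable bonds and PSA of at most 140 A^2."""
    return rotatable_bonds <= 10 and polar_surface_area <= 140


# Hypothetical descriptor values for two screening hits (not from the paper).
hits = {
    "hit_A": dict(mol_weight=412.4, logp=2.1, h_donors=2, h_acceptors=7,
                  rotatable_bonds=5, polar_surface_area=98.0),
    "hit_B": dict(mol_weight=612.7, logp=5.8, h_donors=3, h_acceptors=9,
                  rotatable_bonds=12, polar_surface_area=151.0),
}

results = {}
for name, d in hits.items():
    n_viol = lipinski_violations(d["mol_weight"], d["logp"],
                                 d["h_donors"], d["h_acceptors"])
    veber_ok = passes_veber(d["rotatable_bonds"], d["polar_surface_area"])
    results[name] = (n_viol, veber_ok)
    print(name, "Lipinski violations:", n_viol, "Veber:", veber_ok)
```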

Molecular dynamics

  • Molecular dynamics studies were performed to analyse the stability of the docked complexes of HIV-integrase with H13, NSC91705 and NSC651812.
  • Both the molecular docking and the lowest-energy complexes from the MD simulations of the screened compounds highlight the importance of the Asp64 and Asp116 residues in the active site cavity.
  • To compare the drug-likeness of the screened compounds with FDA approved HIV-integrase inhibitors, different parameters of H13, Dolutegravir, Elvitegravir, Raltegravir, NSC91705 and NSC651812 were calculated and are reported in Table 4.

Conclusion

  • Pharmacophore-based virtual screening studies were carried out to identify potential molecules for therapeutic application in HIV/AIDS.
  • Hypotheses were validated using R 2 pred, sp, R 2 m(test), Δr 2 m(test), Fischer’s randomization and decoy set, and finally Hypo1 was selected as the best model.
  • In the molecular docking study, a number of binding interactions were observed between final screened compounds and catalytic amino acid residues of HIV-integrase.
  • RMSD, RMSF, potential energy and total energy were recorded for the most active compound and the final screened molecules.


a Department of Chemical Pathology, Faculty of Health Sciences, University of Pretoria and National Health Laboratory Service Tshwane Academic Division, Pretoria, South Africa.
b Division of Chemical Pathology, University of Cape Town, South Africa.
Correspondence should be addressed to T.S. Pillay, Department of Chemical Pathology,
Faculty of Health Sciences, University of Pretoria, Private Bag X323, Arcadia, Pretoria,
0007
Email: tspillay@gmail.com
Phone: +27-12-319-2114
Fax: +27-12-3283600
Structural requirements for potential HIV-integrase inhibitors
identified using pharmacophore-based virtual screening and
molecular dynamics studies
Md Ataul Islam a, Tahir S. Pillay *a,b
Acquired immunodeficiency syndrome (AIDS) is a life-threatening disease which is a collection of symptoms and infections
caused by a retrovirus, human immunodeficiency virus (HIV). There is currently no curative treatment and therapy is
reliant on the use of existing anti-retroviral drugs. Pharmacoinformatics approaches have already proven their pivotal role
in the pharmaceutical industry for lead identification and optimization. In the current study, we analysed the binding
preferences and inhibitory activity of HIV-integrase inhibitors using pharmacoinformatics. A set of 30 compounds were
selected as the training set of a total 540 molecules for pharmacophore model generation. The final model was validated
by statistical parameters and further used for virtual screening. The best mapped model (R = 0.940, rmsd = 2.847, Q2 = 0.912, se = 0.498, R2pred = 0.847 and r2m(test) = 0.636) explained that two hydrogen bond acceptor and one aromatic ring
features were crucial for the inhibition of HIV-integrase. From virtual screening, initial hits were sorted using a number of
parameters and finally two compounds were proposed as promising HIV-integrase inhibitors. Drug-likeness properties of
the final screened compounds were compared to FDA approved HIV-integrase inhibitors. HIV-integrase structure in
complex with the most active and final screened compounds were subjected to 50ns molecular dynamics (MD) simulation
studies to check comparative stability of the complexes. The study suggested that the screened compounds might be
promising HIV-integrase inhibitors. The new chemical entities obtained from the NCI database will be subjected to
experimental studies to confirm potential inhibition of HIV integrase.
Introduction
Human immunodeficiency virus (HIV) is the aetiological agent of acquired immunodeficiency syndrome (AIDS), which destroys the immune system of the body, leaving the victim vulnerable to infections, malignancies and neurological disorders. Owing to its rapid spread it has become a serious global threat, and there is no curative treatment for this fatal disease. According to statistics from the World Health Organization (WHO), a total of 37.20 million people are living with AIDS and 1.70 million people died in 2013 alone. Currently, there are three categories of therapeutic anti-HIV drugs based on their inhibitory mechanisms 1 : nucleoside reverse transcriptase inhibitors (NRTIs) 2 , non-nucleoside reverse transcriptase inhibitors (NNRTIs) 3 , and protease inhibitors (PIs) 4, 5 . To date, highly active antiretroviral therapy (HAART) 6 , a combined therapy using the above classes of inhibitors, is widely used for patients with advanced infection but has failed to eradicate the virus. HAART is intended to slow down viral replication and lower the patient’s total burden of HIV infection, but this treatment is not entirely cost effective and is often out of reach of people worldwide. The genome of HIV encodes three enzymes, viz. protease, reverse transcriptase and integrase. Integrase is a 32 kDa enzyme made of three functional domains: an N-terminal domain, a catalytic core domain and a less conserved C-terminal domain 7, 8 . HIV integrase has no equivalent counterpart or sequence homologue in the human host cell and consequently can be considered an attractive drug target 9 . After the approval of Raltegravir 10 as an anti-HIV drug, several integrase inhibitors emerged as a promising class of therapeutics for the treatment of AIDS. Raltegravir and several other inhibitors have been identified as possessing anti-HIV integrase activity, but these have adverse effects on prolonged use, and the development of drug resistance drives the need to explore novel chemical scaffolds for the treatment of AIDS. The computational methods in drug discovery, collectively termed pharmacoinformatics, include structure-activity relationship (SAR), pharmacophore, virtual screening and molecular docking studies, and these have proven their pivotal role in the pharmaceutical industry for lead identification and optimization. Several research groups worldwide have identified integrase inhibitors 11-17 using pharmacoinformatics approaches for potential application in HIV therapy. Consistent with the objective of developing new potent and less toxic integrase inhibitors, the current research explores the binding preferences of the inhibitory molecules of HIV integrase in terms of a pharmacophore space modelling study and virtual screening, along with molecular docking and molecular dynamics.
A pharmacophore model is a collection of steric and electronic features that provides an intuitive way of depicting and understanding the binding properties of small molecules, along with an explanation of the optimum supramolecular interactions with a specific biological target that activate (or block) its biological response 18, 19 . The pharmacophore concept can also be defined in terms of the kinds of interaction observed in molecular recognition, i.e., hydrophobic, hydrogen bonding, and charge interactions. For HIV integrase inhibitors, the hydrogen bond acceptor (HBA) and donor (HBD), hydrophobic (H) and aromatic ring (RA) pharmacophore features were found to be the important functional features associated with selectivity and potency. Pharmacophore models are widely used in drug discovery: they provide valuable information for studying SAR and reveal the mechanism of the ligand-target relationship by deducing the nature of functional groups and non-covalent bonding patterns 20 . They can also be used in virtual screening to identify potential molecules, to predict the activity of newly synthesized compounds before animal experiments, or to understand the possible mechanism of action 21, 22 . In the current study, an attempt was made to identify the pharmacophore hypothesis using the HypoGen module 23 , based on key chemical features of HIV-integrase inhibitors with inhibition constants covering a satisfactorily wide range of magnitude. The model was validated using several statistical approaches, including Fischer’s randomization and test set prediction. The validated model was used for virtual screening to select virtual hits from a structural database. A molecular docking study was also performed to elucidate the binding interactions and preferred orientation of the proposed potential molecules. The potential of the work is shown by the identification of two potential lead molecules as integrase inhibitors. Finally, a molecular dynamics study was performed to analyse the stability and precise binding interactions of the screened molecules inside the receptor cavity of HIV integrase.
Materials and methods
At present, several popular commercial and freeware packages are used for ligand-based methods to derive 3D pharmacophore models and to help estimate biological activities. Here we used Discovery Studio 4.0 (DS) 24 for the 3D QSAR pharmacophore, virtual screening and molecular docking studies. This is commercially available software containing several module packages and is widely used in pharmacoinformatics drug discovery 25-28 . The 3D QSAR Pharmacophore Generation module takes structure and activity data for a set of potential HIV-integrase ligands as input to create hypotheses. Two modules, HypoGen and HipHop, are used for ligand-based pharmacophore modelling. HypoGen allows identification of hypotheses that are common to the ‘active’ molecules of the training set but absent in the ‘inactive’ molecules, whilst HipHop identifies hypotheses present both in ‘active’ and ‘inactive’ compounds. In the present work the HypoGen module was used to generate the hypotheses.
Dataset
A collection of 1437 HIV-1 integrase inhibitors with inhibitory activity data (IC50) was downloaded from BindingDB (http://www.bindingdb.org/). Duplicates and compounds without definite activity values were deleted, and finally 540 compounds were considered as the whole dataset for the study. Training and test set compounds were separated from the whole dataset for pharmacophore model generation and validation of the generated model, respectively. The molecules of the dataset have a wide range of IC50, from 2.000 to 1000000.000 nM. For simplicity the whole dataset was divided into three sets on the basis of inhibitory activity values: highly active (IC50 < 100.000 nM, +++), moderately active (100.000 ≤ IC50 < 1000.000 nM, ++) and least active (IC50 ≥ 1000.000 nM, +). For selection of the training set for pharmacophore model generation in DS, the basic guidelines laid down by Li et al. 29 were followed. The guidelines are: a) molecules should be selected to provide clear and concise information on structural features and range of activity, b) at least 16 diverse molecules should be considered for the training set to ensure statistical significance and avoid chance correlation, c) the training set must include the most and the least active molecules and d) the biological activity data of the molecules should span at least 4 orders of magnitude. Following the above guidelines, four training sets (Set 1, Set 2, Set 3 and Set 4) were generated containing 30 compounds each. Care was taken that no compounds were common to any two training sets except for the most active and least active molecules. The remaining 410 molecules were considered as test set molecules for each set and used for assessing the performance of the pharmacophore model. The 2D/3D visualizer 24 of DS was used to generate three-dimensional coordinates of the compounds. For each compound, the coordinates were corrected, atoms were typed and energy was minimized using the modified CHARMm force field 30, 31 . Several packages of DS were used for the pharmacophore, virtual screening and molecular docking studies.
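The activity binning and the "4 orders of magnitude" guideline described in this section can be expressed as a short sketch. The class boundaries (100 nM and 1000 nM) and the three example IC50 values (H13, H22 and H29 from Fig. 1) are taken from the text; the function names are ours and the snippet is only an illustration, not the authors' workflow.

```python
import math


def activity_class(ic50_nm):
    """Bin an IC50 value (nM) into the three activity classes of the text."""
    if ic50_nm < 100.0:
        return "+++"  # highly active
    if ic50_nm < 1000.0:
        return "++"   # moderately active
    return "+"        # least active


def orders_of_magnitude(ic50_values_nm):
    """Span of the activity data in log10 units."""
    return math.log10(max(ic50_values_nm) / min(ic50_values_nm))


# Three training set compounds with IC50 values as listed in Fig. 1.
training_subset = {"H13": 2.0, "H22": 140.0, "H29": 1_000_000.0}

classes = {name: activity_class(v) for name, v in training_subset.items()}
span = orders_of_magnitude(list(training_subset.values()))

print(classes)
print(f"activity span: {span:.2f} log units (guideline: at least 4)")
```

Because the training set must contain both the most active (2 nM) and the least active (1000000 nM) compounds, the span guideline is met by construction.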
Pharmacophore model generation
In order to generate the pharmacophore space model, the 3D QSAR Pharmacophore Model Generation module of DS was used. Conformations of the training set molecules were generated by the Cat-Conf program of the DS software package. Of the BEST and FAST methods, the BEST method was chosen to obtain multiple acceptable conformations, as it provides complete and enhanced coverage of conformational space through rigorous energy minimization and optimization of the conformations by the poling algorithm 32 . In the BEST algorithm, the chemical features are arranged in space instead of simply the arrangement of atoms 33 . Feature mapping was used to predict the favourable features of the highly active compounds of the dataset. The mapped features were given as input features for pharmacophore model generation. Using the conformers along with the chemical features, the module operates in two modes, HipHop and HypoGen. The HipHop approach generates pharmacophore models using active compounds only, while the HypoGen approach considers both active and inactive compounds in order to find a hypothesis which is common to the active molecules and absent in the inactive compounds 33 . The top ten hypotheses are generated by HypoGen, taking into account the training set, conformational models and chemical features, through three steps: constructive, subtractive and optimization 34 . In the first step, hypotheses are generated that are common to the most active compounds; in the subtractive phase, hypotheses that also fit the inactive compounds are removed. In the final step, the remaining hypotheses are refined by small perturbations to improve their scores 33, 35 . The best hypothesis was selected based on the best correlation coefficient (R), low root mean square deviation (rmsd), cost function analysis and good predictive ability.
All four sets were used to develop pharmacophore models, and statistical parameters were calculated based on training and test set molecules. Statistical results are depicted in Table S2 in the Supplementary file. It was observed that Set 1 gives better statistical results compared to Sets 2-4. Hence Set 1 (ntr = 30, Fig. 1) was considered as the training set in the current study. In the remaining sections, “training set” refers to the Set 1 compounds and “test set” to the test compounds of Set 1. Training set molecules in SMILES format with observed and estimated activity along with fit values of Set 2, Set 3 and Set 4 are depicted in Tables S2, S3 and S4 respectively in the Supplementary data. The structures and biological activities of the test set compounds of Set 1 are provided in Table S5 in the supplementary information, while all the data regarding the training set (Set 1) molecules are reported in Fig. 1. The activity value distribution of the training set molecules of Set 1 is given in Figure S1 (Supplementary file).
Fig. 1 2D chemical structures of the training set compounds; the activity values (IC50) are given in parentheses.
Validation of pharmacophore model
In order to check the predictivity and applicability as well as
robustness of the pharmacoinformatics model, the pharmacophore
hypotheses developed were validated by five different methods, (1)
internal validation, (2) cost function analysis, (3) Fischer’s
randomization test, (4) test set prediction and (5) decoy set.
Internal validation
Leave-one-out (LOO) cross-validation is one of the important internal validation protocols for the selected model, in which one compound is deleted from the training set in each cycle and the model is redeveloped using the rest of the compounds with the same parameters used in the original model. The activity of the deleted compound is calculated based on the newly developed model. The above procedure is applied to all molecules of the training set and the predicted activities are recorded. Two important statistical parameters, the LOO cross-validated correlation coefficient (Q2) and the error of estimation (se), were calculated based upon the predicted activity of the training compounds. It is reported that high Q2 (>0.5) and low se indicate better predictive ability 36 .
Another parameter, the modified r2 (r2m(LOO)) developed by Roy et al. 37, 38 , was calculated to confirm the good predictive ability for the training set molecules. This parameter measures the degree of deviation of the predicted activity from the observed ones. It was stated that a model may be considered acceptable with r2m(LOO) > 0.5.
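The two internal validation statistics can be sketched numerically as follows. This is a minimal sketch under stated assumptions: Q2 is taken as 1 - PRESS/SS with SS computed about the mean observed activity, and r2m is taken in one commonly cited form, r2 × (1 - sqrt(|r2 - r02|)), where r02 is the determination coefficient of the regression through the origin of observed on predicted values. The paper does not spell out these formulas, other conventions exist, and the activity values below are hypothetical.

```python
import math


def q2_loo(observed, loo_predicted):
    """Q2 = 1 - PRESS / SS, with SS taken about the mean observed activity."""
    mean_obs = sum(observed) / len(observed)
    press = sum((o - p) ** 2 for o, p in zip(observed, loo_predicted))
    ss_total = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - press / ss_total


def squared_pearson(x, y):
    """Squared Pearson correlation coefficient between two sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


def rm2(observed, predicted):
    """Modified r2 in one commonly cited form (an assumption, see lead-in)."""
    r2 = squared_pearson(observed, predicted)
    # Slope of the regression line through the origin (observed vs predicted).
    k = sum(o * p for o, p in zip(observed, predicted)) / sum(p * p for p in predicted)
    mean_obs = sum(observed) / len(observed)
    r02 = 1.0 - (
        sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
        / sum((o - mean_obs) ** 2 for o in observed)
    )
    return r2 * (1.0 - math.sqrt(abs(r2 - r02)))


# Hypothetical activities on a log scale (illustration only).
obs = [8.7, 8.3, 7.9, 6.6, 5.6, 4.0]
loo_pred = [8.4, 8.5, 7.5, 6.9, 5.2, 4.6]

print(f"Q2 = {q2_loo(obs, loo_pred):.3f}, r2m(LOO) = {rm2(obs, loo_pred):.3f}")
```

Both statistics would then be compared with the 0.5 thresholds quoted in the text.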
Cost function analysis
To select the final pharmacophore model, several parameters were employed at the time of hypothesis generation; these included spacing, uncertainty, and weight variation. The spacing represents the minimum inter-feature distance permissible in the final hypothesis, while the weight variation reflects the level of magnitude explored by the hypothesis, in which every feature represents some degree of magnitude of the biological activity of the compound. The default values of spacing and weight variation are 300 and 0.3 respectively, but in some cases they are varied from 400 to 100 and from 1 to 2 respectively. The uncertainty reflects the error of prediction and signifies the standard deviation of the error cost. The default value of the uncertainty parameter is 3, but it may vary from 1.5 to 4.0 depending upon the nature of the work. Cost function analysis is an important aspect of the selection of the final model; it minimizes three terms, viz., weight cost, error cost, and configuration cost. The weight cost is directly proportional to the deviation of the weight variation from its input value. The error cost is the deviation between the predicted activity and the experimentally determined activity of the molecules in the training set. A fixed cost is determined by the complexity of the hypothesis space. The configuration cost represents the entropy of the hypothesis space, and it is reported that its value should be <17 for a good pharmacophore model. The HypoGen module also calculates the cost of the null hypothesis, which is the assumption that there is no relationship in the data and that the experimental activities are distributed about their mean. It has been shown that a higher (>60) cost difference (cost difference = null cost - total cost) indicates that the hypothesis does not reflect a chance correlation.
[Fig. 1, structure drawings not reproduced. Training set compounds with activity values (IC50): H1 (60.000 nM), H2 (62.000 nM), H3 (5.000 nM), H4 (30.000 nM), H5 (52.000 nM), H6 (28.000 nM), H7 (4.000 nM), H8 (11.000 nM), H9 (5.000 nM), H10 (3.000 nM), H11 (44.000 nM), H12 (10.000 nM), H13 (2.000 nM), H14 (12.000 nM), H15 (3.000 nM), H16 (6.000 nM), H17 (12.000 nM), H18 (20.000 nM), H19 (8.000 nM), H20 (2400.000 nM), H21 (250.000 nM), H22 (140.000 nM), H23 (1900.000 nM), H24 (97000.000 nM), H25 (12.800 nM), H26 (28000.000 nM), H27 (25.000 nM), H28 (10000.000 nM), H29 (1000000.000 nM), H30 (15600.000 nM).]
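The two cost-based acceptance criteria quoted in this section (a cost difference above 60 and a configuration cost below 17) can be checked with a few lines of code. The sketch below uses only those two thresholds from the text; the cost values are hypothetical and the function name is ours.

```python
def assess_hypothesis(null_cost, total_cost, configuration_cost):
    """Apply the two cost criteria described in the text to one hypothesis."""
    cost_difference = null_cost - total_cost
    return {
        "cost_difference": cost_difference,
        "unlikely_chance_correlation": cost_difference > 60,
        "configuration_cost_ok": configuration_cost < 17,
    }


# Hypothetical cost values in bits (NOT taken from the paper's tables).
report = assess_hypothesis(null_cost=250.0, total_cost=140.0,
                           configuration_cost=15.5)
print(report)
```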
Fischer’s randomization test
To check that the relationship between the chemical structures and the
biological activity of the training set molecules is statistically
significant, Fischer’s randomization test was performed. In this
method, the biological activity values were scrambled and reassigned
to the training set molecules. Pharmacophore hypotheses were then
generated with the scrambled activity values, using the same
pharmacophoric features and constraints that were used to generate the
original pharmacophore hypotheses. If a randomization run yields a
higher correlation coefficient and/or better statistical parameters
than the original hypothesis, the original hypothesis may be
considered to have been developed by chance. The number of random
spreadsheets generated depends on the statistical significance chosen
for the randomization run. The statistical significance is given by
the following equation.
$$\text{Significance} = \left[1 - \frac{1 + a}{b}\right] \qquad (1)$$
where a denotes the number of hypotheses with a total cost lower than
that of the best hypothesis, and b denotes the total number of HypoGen
runs, that is, the initial run plus the random runs. For example, at
the 95% confidence level, 19 random spreadsheets are generated (b =
20), and each generated spreadsheet is submitted to HypoGen using the
same parameters as the initial run. In the present study, the
developed pharmacophore model was tested at the 95% confidence level,
which produced 19 spreadsheets.
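
As an illustration only, the significance formula and the scrambling step described above can be sketched in a few lines of Python. The function names (`fischer_significance`, `scramble_activities`) are hypothetical and are not part of HypoGen or DS; the sketch assumes that b counts the initial run plus the random runs, as in the b = 20 example.

```python
import random


def fischer_significance(a, b):
    """Significance = 1 - (1 + a) / b, as in equation (1).

    a: number of random-run hypotheses with a total cost lower than
       that of the best original hypothesis
    b: total number of runs (initial run + random runs)
    """
    return 1.0 - (1 + a) / b


def scramble_activities(activities, seed=None):
    """Return a shuffled copy of the activity values.

    This mimics building one random spreadsheet: the same values are
    kept but reassigned to different molecules.
    """
    rng = random.Random(seed)
    shuffled = list(activities)
    rng.shuffle(shuffled)
    return shuffled


# 19 random spreadsheets plus the initial run give b = 20; if no random
# run beats the original hypothesis (a = 0) the significance is 0.95.
significance = fischer_significance(a=0, b=20)

# Hypothetical activity values (nM), used only to show the scrambling.
example_activities = [60.0, 62.0, 5.0, 30.0, 52.0]
random_spreadsheet = scramble_activities(example_activities, seed=1)
```

Each scrambled list would then be submitted to hypothesis generation with unchanged settings; that step depends on the modelling software and is not sketched here.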
Test set prediction
Judging the external predictivity of the model beyond the training set
molecules is an important validation step. Accordingly, in the present
work 410 test compounds were predicted using the developed
pharmacophore model with the Ligand Pharmacophore Mapping protocol in
DS; the results are given in Table S1 (Supplementary file). The
quality of prediction of the pharmacophore model was judged on the
basis of two statistical parameters, $R^2_{pred}$ (correlation
coefficient) and $s_p$ (error of prediction) [39, 40]. The
$R^2_{pred}$ value depends on the average experimental activity of the
training set molecules. Because this parameter depends on an average
value, a high value may be achieved for compounds with a wide range of
activity values without guaranteeing that the estimated activity
values are close to the experimental ones. As a result, despite a good
overall correlation, there may be a significant numerical difference
between the two values. To better indicate the predictivity of the
pharmacoinformatics model, modified $r^2$ ($r^2_{m(test)}$) [41, 42]
values were calculated (threshold value = 0.5).
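
The two test set statistics can be illustrated with a short Python sketch. The source does not give the formulas, so the definitions used here are assumptions based on the forms commonly cited in the QSAR literature: $R^2_{pred}$ compares squared prediction errors on the test set with squared deviations of the test set activities from the training set mean, and $r^2_m = r^2 (1 - \sqrt{|r^2 - r^2_0|})$, where $r^2_0$ comes from a regression through the origin. The function names and the activity values are hypothetical.

```python
from math import sqrt


def r2_pred(obs_test, pred_test, train_mean):
    """Predictive R^2 (assumed form): 1 - SS_residual / SS_total.

    SS_total uses the mean activity of the TRAINING set, which is why
    the statistic depends on that average.
    """
    ss_res = sum((o - p) ** 2 for o, p in zip(obs_test, pred_test))
    ss_tot = sum((o - train_mean) ** 2 for o in obs_test)
    return 1.0 - ss_res / ss_tot


def r2_m(obs, pred):
    """Modified r^2 (assumed form): r^2 * (1 - sqrt(|r^2 - r0^2|))."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    s_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    s_oo = sum((o - mean_o) ** 2 for o in obs)
    s_pp = sum((p - mean_p) ** 2 for p in pred)
    r2 = s_op ** 2 / (s_oo * s_pp)
    # Slope of the observed-vs-predicted regression forced through the origin.
    k = sum(o * p for o, p in zip(obs, pred)) / sum(p * p for p in pred)
    r0_2 = 1.0 - sum((o - k * p) ** 2 for o, p in zip(obs, pred)) / s_oo
    return r2 * (1.0 - sqrt(abs(r2 - r0_2)))


# Hypothetical activity values: every prediction is shifted by 0.5, so
# the correlation is perfect but the numerical agreement is not.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.5, 2.5, 3.5, 4.5]

r2_pred_value = r2_pred(observed, predicted, train_mean=2.5)
r2_m_value = r2_m(observed, predicted)
```

In this toy case the correlation between observed and predicted values is perfect, yet the modified statistic falls below 1 because it penalizes the constant offset, which is the kind of numerical difference the paragraph above describes.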
Decoy set
To check the screening efficiency of the selected pharmacophore model,
it was validated using the decoy set approach. The decoy set method
checks how well the model selects active molecules over inactive ones
when screening a combined set of active and inactive molecules. For
this purpose, a set of decoys was generated with DecoyFinder1.1 [43].
Decoys physically resemble active inhibitors but differ chemically, to
avoid bias in the enrichment factor calculation. Decoys were selected
on the basis of five parameters of the active inhibitors: molecular
weight, number of rotatable bonds, hydrogen-bond donor count,
hydrogen-bond acceptor count, and the octanol–water partition
coefficient. To discriminate decoys from active inhibitors chemically,
MACCS fingerprints were calculated and compared by their maximum
Tanimoto coefficient values. The screening dataset, consisting of 80
active HIV-integrase inhibitors and 900 decoys obtained from
DecoyFinder, was queried with the selected hypothesis. Different
statistical parameters, including the accuracy and enrichment factor
(EF), were calculated to validate the pharmacophore model. Molecules
screened by the pharmacophore model were ranked on the basis of fit
value. To assess the effectiveness of screening, the enrichment of the
dataset was calculated. The EF reflects the number of known active
inhibitors retrieved from a given part of the screened database. In
the current study, EF (1%) was calculated from the top 1% of hits.
Another parameter, the Boltzmann-enhanced discrimination of receiver
operating characteristic (BEDROC), which gives the significance of the
dataset screening, was also calculated. The BEDROC is a comprehensive
form of the receiver operating characteristic (ROC), which recognises
problems in the screening method. Calculation of the enrichment factor
and BEDROC is described by Bhayye et al. [44].
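
A minimal Python sketch of an enrichment factor calculation on a list ranked by fit value follows. It assumes the widely used definition, EF = (actives in the top fraction / molecules in the top fraction) / (total actives / total molecules); the exact procedure used in the study is the one described in ref. 44 and may differ in detail. The function name, the rounding of the top fraction, and the toy data are hypothetical.

```python
def enrichment_factor(scored, fraction=0.01):
    """Enrichment factor for the top `fraction` of a ranked screen.

    scored: list of (fit_value, is_active) pairs, one per molecule
    fraction: part of the ranked list to inspect (0.01 = top 1%)
    """
    # Rank molecules from highest to lowest fit value.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    n_total = len(ranked)
    n_actives = sum(1 for _, active in ranked if active)
    # Size of the inspected top slice; at least one molecule (assumption).
    n_top = max(1, round(fraction * n_total))
    hits_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_top / n_top) / (n_actives / n_total)


# Toy data: 10 molecules, 2 actives, both ranked first by fit value.
toy_screen = [
    (5.0, True), (4.5, True), (4.0, False), (3.5, False), (3.0, False),
    (2.5, False), (2.0, False), (1.5, False), (1.0, False), (0.5, False),
]
toy_ef = enrichment_factor(toy_screen, fraction=0.2)
```

For a dataset of 80 actives and 900 decoys, this definition caps the EF at 980/80 = 12.25, reached when every molecule in the inspected slice is active.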
Virtual screening
