Showing papers in "Sar and Qsar in Environmental Research in 2016"


Journal ArticleDOI
TL;DR: The OECD QSAR Toolbox is a software application intended to be used by governments, the chemical industry and other stakeholders in filling gaps in (eco)toxicity data needed for assessing the hazards of chemicals.
Abstract: The OECD QSAR Toolbox is a software application intended to be used by governments, the chemical industry and other stakeholders in filling gaps in (eco)toxicity data needed for assessing the hazards of chemicals. The development and release of the Toolbox is a cornerstone in the computerization of hazard assessment, providing an 'all inclusive' tool for the application of category approaches, such as read-across and trend analysis, in a single software application, free of charge. The Toolbox incorporates theoretical knowledge, experimental data and computational tools from various sources into a logical workflow. The main steps of this workflow are substance identification, identification of relevant structural characteristics and potential toxic mechanisms of interaction (i.e. profiling), identification of other chemicals that have the same structural characteristics and/or mechanism (i.e. building a category), data collection for the chemicals in the category and use of the existing experimental data to fill the data gap(s). The description of the Toolbox workflow and its main functionalities is the scope of the present article.

115 citations


Journal ArticleDOI
TL;DR: The development of an automated KNIME workflow to curate and correct errors in the structure and identity of chemicals using the publicly available PHYSPROP physicochemical properties and environmental fate datasets is described.
Abstract: The increasing availability of large collections of chemical structures and associated experimental data provides an opportunity to build robust QSAR models for applications in different fields. On...

81 citations


Journal ArticleDOI
TL;DR: It is demonstrated that AD is not a monolithic concept and can be broken down into three well-defined sub-domains assessing confidence at the model, prediction and decision levels, respectively.
Abstract: In recent years the applicability domain (AD) of a prediction system has become an important concern in (Q)SAR modelling, especially in the context of human safety assessment. Today AD is an active research topic, and many methods have been designed to estimate the adequacy of a model and the confidence in its outcome for a given prediction task. Unfortunately, the wide spectrum of techniques developed for this purpose is based on various definitions of the concept of AD, often taking into account different types of information. This variety of methodologies confuses the end users and makes the comparison of the AD for different models almost impossible. In this article, we demonstrate that AD is not a monolithic concept and can be broken down into three well-defined sub-domains assessing confidence at the model, prediction and decision levels, respectively. By leveraging this separation of concerns we have an opportunity to clarify, formalize and extend the definition of AD. We propose a framework that captures this new vision with the aim to initiate a global effort to converge towards a common AD definition within the (Q)SAR community.

76 citations


Journal ArticleDOI
TL;DR: The study confirmed the utility of protein corona composition to predict the bioactivity of gold nanoparticles, identified the main proteins that act as promoters or inhibitors of cell association, and showed which modelling strategies could be used to support new toxicological studies on gold-based nanomaterials.
Abstract: The understanding of the mechanisms and interactions that occur when nanomaterials enter biological systems is important to improve their future use. The adsorption of proteins from biological fluids in a physiological environment to form a corona on the surface of nanoparticles represents a key step that influences nanoparticle behaviour. In this study, the quantitative description of the composition of the protein corona was used to study the effect on cell association induced by 84 surface-modified gold nanoparticles of different sizes. Quantitative relationships between the protein corona and the activity of the gold nanoparticles were modelled by using several machine learning-based linear and non-linear approaches. Models based on a selection of only six serum proteins had robust and predictive results. The Projection Pursuit Regression method performed best (r(2) = 0.91; Q(2)loo = 0.81; r(2)ext = 0.79). The present study confirmed the utility of protein corona composition to predict the bioactivity of gold nanoparticles and identified the main proteins that act as promoters or inhibitors of cell association. In addition, the comparison of several techniques showed which strategies offer the best results in prediction and could be used to support new toxicological studies on gold-based nanomaterials.

44 citations


Journal ArticleDOI
TL;DR: It is evident from the results that the proposed penalized linear regression model with L1/2-norm may possibly be a promising penalized method in the field of computational chemistry research, especially when the number of molecular descriptors exceeds the number of compounds.
Abstract: In high-dimensional quantitative structure–activity relationship (QSAR) modelling, penalization methods have been a popular choice to simultaneously address molecular descriptor selection and QSAR model estimation. In this study, a penalized linear regression model with L1/2-norm is proposed. Furthermore, the local linear approximation algorithm is utilized to avoid the non-convexity of the proposed method. The potential applicability of the proposed method is tested on several benchmark data sets. Compared with other commonly used penalized methods, the proposed method can not only obtain the best predictive ability, but also provide an easily interpretable QSAR model. In addition, it is noteworthy that the results obtained in terms of applicability domain and Y-randomization test provide an efficient and a robust QSAR model. It is evident from the results that the proposed method may possibly be a promising penalized method in the field of computational chemistry research, especially when the nu...

30 citations


Journal ArticleDOI
TL;DR: Logistic regression performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds.
Abstract: The paper highlights the use of the logistic regression (LR) method in the construction of acceptable, statistically significant, robust and predictive models for the classification of chemicals according to their aquatic toxic modes of action. Essentials accounting for a reliable model were all considered carefully. The model predictors were selected by stepwise forward linear discriminant analysis (LDA) from a combined pool of experimental data and chemical structure-based descriptors calculated by the CODESSA and DRAGON software packages. Model predictive ability was validated both internally and externally. The applicability domain was checked by the leverage approach to verify prediction reliability. The obtained models are simple and easy to interpret. In general, LR performs much better than LDA and seems to be more attractive for the prediction of the more toxic compounds, i.e. compounds that exhibit excess toxicity versus non-polar narcotic compounds and more reactive compounds versus less reactive compounds. In addition, model fit and regression diagnostics were assessed through an influence plot, which reflects the hat-values, studentized residuals and Cook's distance statistics of each sample. Overdispersion was also checked for the LR model. The relationships between the descriptors and the aquatic toxic behaviour of compounds are also discussed.

28 citations
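
The leverage approach named in the abstract above can be illustrated with a minimal sketch. It assumes a one-descriptor linear model with an intercept, for which the leverage has a closed form, and it uses the widely quoted warning cut-off h* = 3(p + 1)/n; the abstract does not state which cut-off the authors applied, and the descriptor values and function names below are hypothetical.

```python
def leverage(x_train, x_query):
    """Leverage of a query compound for a one-descriptor model with intercept.

    For simple linear regression, h = 1/n + (x - mean)^2 / Sxx.
    """
    n = len(x_train)
    mean_x = sum(x_train) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x_train)
    return 1.0 / n + (x_query - mean_x) ** 2 / sxx


def warning_leverage(n_train, n_descriptors):
    """Assumed convention: h* = 3(p + 1)/n, with p descriptors and n compounds."""
    return 3.0 * (n_descriptors + 1) / n_train


def in_domain(x_train, x_query):
    """True if the query compound's leverage does not exceed the cut-off."""
    return leverage(x_train, x_query) <= warning_leverage(len(x_train), 1)


# Hypothetical descriptor values for ten training compounds.
train = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]

print(in_domain(train, 6.0))   # True: close to the centre of the training set
print(in_domain(train, 15.0))  # False: far outside the training range
```

With several descriptors the same idea applies through the hat matrix, h = x(XᵀX)⁻¹xᵀ, which needs a linear algebra library rather than the closed form used here.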


Journal ArticleDOI
TL;DR: The purpose of this study was to investigate the effect of drug and disease features as well as data fusion to predict drug–disease interactions and proposed a computational method named scored mean kernel fusion (SMKF), which uses a new method to score the average aggregation operator called scored mean.
Abstract: Prediction of drug-disease associations is one of the current fields in drug repositioning that has turned into a challenging topic in pharmaceutical science. Several available computational methods use network-based and machine learning approaches to reposition old drugs for new indications. However, they often ignore features of drugs and diseases as well as the priority and importance of each feature, relation, or interactions between features and the degree of uncertainty. When predicting unknown drug-disease interactions there are diverse data sources and multiple features available that can provide more accurate and reliable results. This information can be collectively mined using data fusion methods and aggregation operators. Therefore, we can use the feature fusion method to make high-level features. We have proposed a computational method named scored mean kernel fusion (SMKF), which uses a new method to score the average aggregation operator called scored mean. To predict novel drug indications, this method systematically combines multiple features related to drugs or diseases at two levels: the drug-drug level and the drug-disease level. The purpose of this study was to investigate the effect of drug and disease features as well as data fusion to predict drug-disease interactions. The method was validated against a well-established drug-disease gold-standard dataset. When compared with the available methods, our proposed method outperformed them and competed well in performance with area under the curve (AUC) of 0.91, F-measure of 84.9% and Matthews correlation coefficient of 70.31%.

26 citations


Journal ArticleDOI
TL;DR: The predictive boundaries for the derived models are rigorously defined by using the conformal prediction framework, thus no ambiguity exists as to the level of similarity needed for new compounds to be in or out of the predictive boundaries of the derived models where reliable predictions can be expected.
Abstract: A fundamental element when deriving a robust and predictive in silico model is not only the statistical quality of the model in question but, equally important, the estimate of its predictive bound ...

25 citations


Journal ArticleDOI
TL;DR: It was concluded that a para-substituted benzene ring with bulkier electron-donating groups and aminoalkyl chains are required for higher inhibitory capacity against M. tuberculosis.
Abstract: We earlier reported thiophene-containing trisubstituted methanes (TRSMs) as novel cores carrying anti-tubercular activity, and identified S006-830 as the phenotypic lead with potent bactericidal activity against single- and multi-drug resistant clinical isolates of Mycobacterium tuberculosis (M. tb). In this work, we carried out additional synthesis of several TRSMs. The reaction scheme essentially followed the Grignard reaction and Friedel-Crafts alkylation, followed by insertion of a dialkylaminoethyl chain. We also performed microbiological evaluations including in vitro screening against the virulent strain M. tb H37Rv, cytotoxicity assessment in the Vero C-1008 cell line, and 3D-QSAR studies with comparative molecular field analysis (CoMFA) and comparative molecular similarity index analysis (CoMSIA). CoMFA and CoMSIA models yielded good statistical results in terms of q2 and r2 values, suggesting the validity of the models. It was concluded that a para-substituted benzene ring with bulkier electron-donating groups and aminoalkyl chains are required for higher inhibitory capacity against M. tuberculosis. We believe that these insights will rationally guide the design of newer, optimal, TRSMs.

24 citations


Journal ArticleDOI
TL;DR: A large dataset consisting of 963 organic compounds with acute toxicity towards fathead minnow was studied with a QSAR approach and a global prediction model for compounds without known mode of action and two local models for organic compounds that exhibit narcosis toxicity and excess toxicity were developed.
Abstract: Acute fathead minnow toxicity is an important basis of hazard and risk assessment for compounds in the aquatic environment. In this paper, a large dataset consisting of 963 organic compounds with acute toxicity towards fathead minnow was studied with a QSAR approach. All molecular structures of compounds were optimized by the hybrid density functional theory method. Dragon molecular descriptors and log Kow were selected to describe molecular information. Genetic algorithm and multiple linear regression analysis were combined to develop models. A global prediction model for compounds without known mode of action and two local models for organic compounds that exhibit narcosis toxicity and excess toxicity were developed, respectively. For all developed models, internal validations were performed by cross-validation and external validations were implemented by the setting of validation set. In addition, applicability domains of models were evaluated using a leverage method and outliers were listed and checked using toxicological knowledge.

23 citations


Journal ArticleDOI
TL;DR: This article reports on one method that can be used to build robust models for the prediction of compounds’ properties from their chemical structure that has been developed by combining a genetic algorithm, a counter-propagation artificial neural network and cross-validation.
Abstract: Large worldwide use of chemicals has caused great concern about their possible adverse effects on human health, flora and fauna. Increased production of new chemicals has also increased demand for their risk assessment. Traditionally, results from animal tests have been used to assess toxicity of chemicals. However, such methods are ethically questionable since they involve killing and causing suffering of the test animals. Therefore, new in silico methods are being sought to replace the traditional in vivo and in vitro testing methods. In this article we report on one method that can be used to build robust models for the prediction of compounds' properties from their chemical structure. The method has been developed by combining a genetic algorithm, a counter-propagation artificial neural network and cross-validation. It has been tested using existing data on toxicity to fathead minnow (Pimephales promelas). The results show that the method may give reliable results for chemicals belonging to the applicability domain of the developed models. Therefore, it can aid the risk assessment of chemicals and consequently reduce demand for animal tests.

Journal ArticleDOI
TL;DR: In this article, a quantitative activity-activity relationship (QAAR) model was proposed to predict toxicity of pharmaceutical and personal care products to a particular species using available experimental toxicity data from a different species, thus reducing the tests on organisms of higher trophic level.
Abstract: Pharmaceutical and Personal Care Products (PPCPs) have become a class of contaminants of emerging concern because they are ubiquitously detected in surface water and soil, where they can affect wildlife. Ecotoxicological data are only available for a few PPCPs, thus modelling approaches are essential tools to maximize the information contained in the existing data. In silico methods may be helpful in filling data gaps for the toxicity of PPCPs towards various ecological indicator organisms. The good correlation between toxicity toward Daphnia magna and that toward two fish species (Pimephales promelas and Oncorhynchus mykiss), improved by the addition of one theoretical molecular descriptor, allowed us to develop predictive models to investigate the relationship between toxicities in different species. The aim of this work is to propose quantitative activity-activity relationship (QAAR) models, developed in QSARINS and validated for their external predictivity. Such models can be used to predict the toxicity of PPCPs to a particular species using available experimental toxicity data from a different species, thus reducing the tests on organisms of higher trophic level. Similarly, good QAAR models, implemented by molecular descriptors to improve the quality, are proposed here for fish interspecies. We also comment on the relevance of autocorrelation descriptors in improving all studied interspecies correlations.

Journal ArticleDOI
TL;DR: A round-robin exercise was conducted within the CALEIDOS LIFE project to assess the hazard posed by a substance, applying in silico methods and read-across approaches based on three endpoints: mutagenicity, bioconcentration factor and fish acute toxicity.
Abstract: A round-robin exercise was conducted within the CALEIDOS LIFE project. The participants were invited to assess the hazard posed by a substance, applying in silico methods and read-across approaches. The exercise was based on three endpoints: mutagenicity, bioconcentration factor and fish acute toxicity. Nine chemicals were assigned for each endpoint and the participants were invited to complete a specific questionnaire communicating their conclusions. The interesting aspect of this exercise is the justification behind the answers more than the final prediction in itself. Which tools were used? How did the approach selected affect the final answer?

Journal ArticleDOI
TL;DR: Thirty-seven quinoxaline derivatives were selected to develop a significant pharmacophore model with good certainty and five different leads have been suggested as putative novel candidates for the exploration of potent Gly/NMDA receptor antagonists.
Abstract: The Gly/NMDA receptor has become known as a potential target for the management of neurodegenerative diseases. Discovery of Gly/NMDA antagonists has thus attracted much attention in recent years. In the present research, a cheminformatics approach has been used to determine structural requirements for Gly/NMDA antagonism and to identify potential antagonists. Here, 37 quinoxaline derivatives were selected to develop a significant pharmacophore model with good certainty. The selected model was validated by leave-one-out cross-validation, an external test set, decoy set and Y-randomization test. Applicability domain was verified by the standardization approach. The validated 3D-QSAR model was used to screen virtual hits from the ZINC database by pharmacophore mapping. Molecular docking was used for assessment of receptor-ligand binding modes and binding affinities. The GlideScore and molecular interactions with critical amino acids were considered as crucial features to identify final hits. Furthermore, hits were analysed for in silico pharmacokinetic parameters and Lipinski's rule of five, demonstrating their potential as drug-like candidates. The PubChem and SciFinder search tools were used to authenticate the novelty of leads retrieved. Finally, five different leads have been suggested as putative novel candidates for the exploration of potent Gly/NMDA receptor antagonists.

Journal ArticleDOI
TL;DR: The generated 3D-QSAR and HQSAR models, activity cliff analysis, molecular docking and dynamic studies for dual target protein inhibitors provide key structural scaffolds that serve as building blocks in designing drug-like molecules for neurodegenerative diseases.
Abstract: Dual inhibition of A2A and MAO-B is an emerging strategy in neurodegenerative diseases, such as Alzheimer’s disease (AD) and Parkinson’s disease (PD). In this study, atom-based three-dimensional quantitative structure–activity relationship (3D-QSAR) and hologram quantitative structure–activity relationship (HQSAR) models were generated with benzothiazine and deazaxanthine derivatives. Based on activity against A2A and MAO-B, two statistically significant 3D-QSAR models (r2 = 0.96, q2 = 0.76 and r2 = 0.91, q2 = 0.63) and HQSAR models (r2 = 0.93, q2 = 0.68 and r2 = 0.97, q2 = 0.58) were developed. In an activity cliff analysis, structural outliers were identified by calculating the Mahalanobis distance for a pair of compounds with A2A and MAO-B inhibitory activities. The generated 3D-QSAR and HQSAR models, activity cliff analysis, molecular docking and dynamic studies for dual target protein inhibitors provide key structural scaffolds that serve as building blocks in designing drug-like molecules for...

Journal ArticleDOI
Mare Oja, Uko Maran
TL;DR: This study focuses on neutral and amphoteric compounds and their membrane permeabilities across the range of pH values found in the human intestine and reveals that membrane permeability depends on multiple structural characteristics: the partition coefficient, hydrogen bond properties and the shape of the molecules.
Abstract: Human intestinal absorption is a key property for orally administered drugs and is dependent on pH. This study focuses on neutral and amphoteric compounds and their membrane permeabilities across the range of pH values found in the human intestine. The membrane permeability values for 15 neutral and 60 amphoteric compounds at pH 3, 5, 7.4 and 9 were measured using the parallel artificial membrane permeability assay (PAMPA). For each data series the quantitative structure–permeability relationships were developed and analysed. The results show that the membrane permeability of neutral compounds is attributed to a single structural characteristic, the hydrogen bond donor ability. Amphoteric compounds are more complex because of their chemical constitution, and therefore require three-parameter models to describe and predict membrane permeability. Analysis of the models for amphoteric compounds reveals that membrane permeability depends on multiple structural characteristics: the partition coefficien...

Journal ArticleDOI
TL;DR: This model gives an insight into the relationships between structural physicochemical properties and aquatic toxicity as well as a satisfactory quantitative structure–property correlation, allowing prediction of aquatic toxicity scores of ILs.
Abstract: VolSurf+ in silico physicochemical descriptors for both the cationic and the anionic counterparts of ionic liquids (ILs) have been derived. These descriptors, suitable for molecular modelling of IL structures which, due to their amphiphilic nature, interact strongly with biological matrices, can be related to aquatic toxicity by means of a partial least squares statistical model. This model gives an insight into the relationships between structural physicochemical properties and aquatic toxicity as well as a satisfactory quantitative structure-property correlation, allowing prediction of aquatic toxicity scores of ILs.

Journal ArticleDOI
TL;DR: The results suggest that the developed QSPR models are reliable for predicting the PPB affinity of structurally diverse chemicals and can be useful for initial screening of candidate molecules in the drug development process.
Abstract: The prediction of the plasma protein binding (PPB) affinity of chemicals is of paramount significance in the drug development process. In this study, ensemble machine learning-based QSPR models have been established for a four-category classification and PPB affinity prediction of diverse compounds using a large PPB dataset of 930 compounds and in accordance with the OECD guidelines. The structural diversity of the chemicals was tested by the Tanimoto similarity index. The external predictive power of the developed QSPR models was evaluated through internal and external validations. In the QSPR models, XLogP was the most important descriptor. In the test data, the classification QSPR models rendered an accuracy of >93%, while the regression QSPR models yielded r(2) of >0.920 between the measured and predicted PPB affinities, with the root mean squared error <9.77. Values of statistical coefficients derived for the test data were above their threshold limits, thus lending high confidence to this analysis. The QSPR models in this study performed better than those of any of the previous studies. The results suggest that the developed QSPR models are reliable for predicting the PPB affinity of structurally diverse chemicals. They can be useful for initial screening of candidate molecules in the drug development process.
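
The Tanimoto similarity index mentioned in the abstract above compares two binary fingerprints as the number of shared "on" bits divided by the number of bits set in either. A minimal sketch follows, assuming each fingerprint is given as a set of bit positions; the bit values and the function name are hypothetical, and the abstract does not say which fingerprint type the authors used.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto similarity between two sets of 'on' fingerprint bit positions."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints are identical.
        return 1.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


# Hypothetical fingerprints for two compounds.
mol_a = {1, 4, 7, 9}
mol_b = {1, 4, 8}

# Two shared bits out of five distinct bits in total.
print(tanimoto(mol_a, mol_b))  # 0.4
```

Low pairwise values across a dataset are usually read as evidence of structural diversity, which is the use described in the abstract.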

Journal ArticleDOI
TL;DR: The study indicated that the hydrogen bonding ability, atom polarizabilities and ring complexity are predominant factors for inhibitors’ antimalarial activity.
Abstract: Plasmodium falciparum, the most fatal parasite that causes malaria, is responsible for over one million deaths per year. P. falciparum dihydroorotate dehydrogenase (PfDHODH) has been validated as a promising drug development target for antimalarial therapy since it catalyzes the rate-limiting step for DNA and RNA biosynthesis. In this study, we investigated the quantitative structure-activity relationships (QSAR) of the antimalarial activity of PfDHODH inhibitors by generating four computational models using a multilinear regression (MLR) and a support vector machine (SVM) based on a dataset of 255 PfDHODH inhibitors. All the models display good prediction quality with a leave-one-out q(2) >0.66, a correlation coefficient (r) >0.85 on both training sets and test sets, and a mean square error (MSE) <0.32 on training sets and <0.37 on test sets, respectively. The study indicated that the hydrogen bonding ability, atom polarizabilities and ring complexity are predominant factors for inhibitors' antimalarial activity. The models are capable of predicting inhibitors' antimalarial activity and the molecular descriptors for building the models could be helpful in the development of new antimalarial drugs.

Journal ArticleDOI
TL;DR: Five in silico principal properties (PPs) for 218 heterocyclic cations and four PPs for 38 organic and inorganic anionic counterparts of ionic liquids (ILs) were derived by the VolSurf+ approach, indicating the possibility to extend the predictive model to a set of 520 ILs.
Abstract: Five in silico principal properties (PPs) for 218 heterocyclic cations and four PPs for 38 organic and inorganic anionic counterparts of ionic liquids (ILs) were derived by the VolSurf+ approach. VolSurf+ physicochemical descriptors take into account several cationic structural features of ILs such as heterocyclic aromatic and non-aromatic cationic cores, alkyl chain length, presence of oxygen atoms in the substituents as well as the properties of a wide variety of inorganic and organic anions. Combination of these cation and anion PPs can provide descriptors for over 8000 ILs, thus allowing the development of QSPR models for IL cytotoxicity (IPC-81 rat cell line) and enzyme toxicity (acetylcholinesterase inhibition). The adoption of a Partial Least Squares approach, relating PPs and toxicities, provided affordable predictions for ILs in both learning and external validation sets, implying the possibility to extend the predictive model to a set of 520 ILs. This allows us to establish priorities in selecting ILs for experimental hazard assessment as required by the REACH regulation.

Journal ArticleDOI
TL;DR: The proposed QSRR models outperformed the previous reports, and the temperature-dependent models offered a much wider applicability domain and can be useful tools in predicting the reactivities of chemicals towards NO3 radicals in the atmosphere, hence, their persistence and exposure risk assessment.
Abstract: Experimental determinations of the rate constants of the reaction of NO3 with a large number of organic chemicals are tedious, and time and resource intensive; and the development of computational methods has widely been advocated. In this study, we have developed room-temperature (298 K) and temperature-dependent quantitative structure-reactivity relationship (QSRR) models based on the ensemble learning approaches (decision tree forest (DTF) and decision treeboost (DTB)) for predicting the rate constant of the reaction of NO3 radicals with diverse organic chemicals, under OECD guidelines. Predictive powers of the developed models were established in terms of statistical coefficients. In the test phase, the QSRR models yielded a correlation (r(2)) of >0.94 between experimental and predicted rate constants. The applicability domains of the constructed models were determined. An attempt has been made to provide the mechanistic interpretation of the selected features for QSRR development. The proposed QSRR models outperformed the previous reports, and the temperature-dependent models offered a much wider applicability domain. This is the first report presenting a temperature-dependent QSRR model for predicting the nitrate radical reaction rate constant at different temperatures. The proposed models can be useful tools in predicting the reactivities of chemicals towards NO3 radicals in the atmosphere, hence, their persistence and exposure risk assessment.

Journal ArticleDOI
TL;DR: A QSAR model with uncertainty estimates is used to predict biodegradability for a set of substances from a publicly available data set, allowing for a more complete assessment of the model than would be possible through a traditional statistical analysis.
Abstract: The ability to determine the biodegradability of chemicals without resorting to expensive tests is ecologically and economically desirable. Models based on quantitative structure–activity relations (QSAR) provide some promise in this direction. However, QSAR models in the literature rarely provide uncertainty estimates in more detail than aggregated statistics such as the sensitivity and specificity of the model’s predictions. Almost never is there a means of assessing the uncertainty in an individual prediction. Without an uncertainty estimate, it is impossible to assess the trustworthiness of any particular prediction, which leaves the model with a low utility for regulatory purposes. In the present work, a QSAR model with uncertainty estimates is used to predict biodegradability for a set of substances from a publicly available data set. Separation was performed using a partial least squares discriminant analysis model, and the uncertainty was estimated using bootstrapping. The uncertainty pred...

Journal ArticleDOI
TL;DR: The electron-acceptance chemical potential and the maximum positive charge of the hydrogen atom are found to be the most important descriptors for the joint toxicity of aromatic compounds.
Abstract: Four types of reactivity indices were employed to construct quantitative structure–activity relationships for the assessment of toxicity of organic chemical mixtures. Results of analysis indicated that the maximum positive charge of the hydrogen atom and the inverse of the apolar surface area are the most important descriptors for the toxicity of mixtures of benzene and its derivatives to Vibrio fischeri. The toxicity of mixtures of aromatic compounds to the green alga Scenedesmus obliquus is mainly affected by the electron flow and electrostatic interactions. The electron-acceptance chemical potential and the maximum positive charge of the hydrogen atom are found to be the most important descriptors for the joint toxicity of aromatic compounds.

Journal ArticleDOI
TL;DR: This study showed that recursive random forests are very efficient in variable selection and for the development of predictive in silico models of liver toxicity and over 95% redundant descriptors could be reduced from modelling for all the chemical, biological and hybrid models in this study.
Abstract: In this study, recursive random forests were used to build classification models for mouse liver toxicity. The mouse liver toxicity endpoint (67 toxic and 166 non-toxic) was a composition of four in vivo chronic systemic and carcinogenic toxicity endpoints (non-proliferative, neoplastic, proliferative and gross pathology). A multiple under-sampling approach and a shifted classification threshold of 0.288 (non-toxic < 0.288 and toxic ≥ 0.288) were used to cope with the unbalanced data. Our study showed that recursive random forests are very efficient in variable selection and for the development of predictive in silico models. Generally, over 95% redundant descriptors could be reduced from modelling for all the chemical, biological and hybrid models in this study. The predictive performance of chemical models (CCR of 0.73) is comparable with hybrid model performance (CCR of 0.74). Descriptors related to the octanol-water partition coefficient are vital for model performance. The in vitro endpoint of CYP2A2 played a key role in the development and interpretation of hybrid models. Identifying high-throughput screening assays relevant to liver toxicity would be key for improving in silico models of liver toxicity.
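
The shifted classification threshold and the CCR figures quoted in the abstract above can be sketched as follows. The threshold of 0.288 and its direction (toxic ≥ 0.288) come from the abstract; the reading of CCR as the mean of sensitivity and specificity is an assumption based on common QSAR usage, and the scores, labels and function names are hypothetical.

```python
def classify(scores, threshold=0.288):
    """Label a compound toxic (1) when its predicted score is >= threshold."""
    return [1 if s >= threshold else 0 for s in scores]


def ccr(y_true, y_pred):
    """Correct classification rate, assumed here to be (sensitivity + specificity) / 2."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    sensitivity = tp / n_pos
    specificity = tn / n_neg
    return (sensitivity + specificity) / 2.0


# Hypothetical model scores and observed classes (1 = toxic, 0 = non-toxic).
scores = [0.10, 0.25, 0.30, 0.60, 0.288, 0.20]
observed = [0, 0, 0, 1, 1, 1]

predicted = classify(scores)
print(predicted)
print(ccr(observed, predicted))
```

Lowering the threshold below 0.5 trades specificity for sensitivity, which is one way to compensate when toxic compounds are the minority class, as in the 67 toxic versus 166 non-toxic split reported above.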

Journal ArticleDOI
TL;DR: Hydrogen bond donating interaction (bB) and cavity formation and dispersion interactions (vV) stood out as the two most influential factors controlling the adsorption of SOCs onto SWCNTs.
Abstract: The linear solvation energy relationship (LSER) was applied to predict the adsorption coefficient (K) of synthetic organic compounds (SOCs) on single-walled carbon nanotubes (SWCNTs). A total of 40 log K values were used to develop and validate the LSER model. The adsorption data for 34 SOCs were collected from 13 published articles and the other six were obtained in our experiment. The optimal model composed of four descriptors was developed by a stepwise multiple linear regression (MLR) method. The adjusted r² (r²adj) and root mean square error (RMSE) were 0.84 and 0.49, respectively, indicating a good fit. The leave-one-out cross-validation Q² was 0.79, suggesting the robustness of the model was satisfactory. The external Q² and RMSE (RMSEext) were 0.72 and 0.50, respectively, showing the model's strong predictive ability. Hydrogen bond donating interaction (bB) and cavity formation and dispersion interactions (vV) stood out as the two most influential factors controlling the adsorption...
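
The fit statistics named in this abstract can be computed from observed and predicted log K values as sketched below. This is a sketch under common definitions, not the authors' procedure: RMSE is divided by n here (some papers divide by the residual degrees of freedom), the external Q² is referenced to the training-set mean, and all numbers are invented.

```python
import math


def r_squared(y_obs, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_obs = sum(y_obs) / len(y_obs)
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mean_obs) ** 2 for o in y_obs)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_descriptors):
    """Penalise r2 for the number of descriptors in the model."""
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_descriptors - 1)


def rmse(y_obs, y_pred):
    """Root mean square error, dividing by n (an assumption)."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    return math.sqrt(ss_res / len(y_obs))


def q2_external(y_ext, y_ext_pred, y_train_mean):
    """External Q2, with the total sum of squares taken around the training mean."""
    ss_res = sum((o - p) ** 2 for o, p in zip(y_ext, y_ext_pred))
    ss_tot = sum((o - y_train_mean) ** 2 for o in y_ext)
    return 1 - ss_res / ss_tot


# Invented values, used only to exercise the functions.
y_obs = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]
r2 = r_squared(y_obs, y_pred)
r2_adj = adjusted_r_squared(r2, n_samples=4, n_descriptors=1)
err = rmse(y_obs, y_pred)
q2_ext = q2_external([2.0, 3.0], [2.5, 3.0], y_train_mean=2.5)
print(r2, r2_adj, err, q2_ext)
```

The leave-one-out Q² uses the same formula as `r_squared`, with each prediction made by a model refitted without that compound.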

Journal ArticleDOI
TL;DR: Three-dimensional quantitative structure–activity relationship (3D-QSAR) modelling was conducted on a series of leucine-rich repeat kinase 2 (LRRK2) antagonists using CoMFA and CoMSIA methods to provide useful insights for designing novel and potent LRRK2 inhibitors.
Abstract: Three-dimensional quantitative structure-activity relationship (3D-QSAR) modelling was conducted on a series of leucine-rich repeat kinase 2 (LRRK2) antagonists using CoMFA and CoMSIA methods. The data set, which consisted of 37 molecules, was divided into training and test subsets by using a hierarchical clustering method. Both CoMFA and CoMSIA models were derived using a training set on the basis of the common substructure-based alignment. The optimum PLS models built by CoMFA and CoMSIA provided satisfactory statistical results (CoMFA: q² = 0.589, r² = 0.927; CoMSIA: q² = 0.473, r² = 0.802). The external predictive ability of the models was evaluated by using seven compounds. Moreover, an external evaluation set with known experimental data was used to evaluate the external predictive ability of the proposed models. The statistical parameters indicated that CoMFA (after region focusing) has high predictive ability in comparison with standard CoMFA and CoMSIA models. Molecular docking was also performed on the most active compound to investigate the existence of interactions between the most active inhibitor and the LRRK2 receptor. Based on the obtained results and CoMFA contour maps, some features were introduced to provide useful insights for designing novel and potent LRRK2 inhibitors.

Journal ArticleDOI
TL;DR: Pharmacophore-based virtual screening combined with a docking study and biological evaluation is described as a rational strategy for identifying novel hits as antimalarial agents; the reported hits showed good potential to become novel antimalarial agents.
Abstract: Plasmodium falciparum dihydroorotate dehydrogenase (PfDHODH) catalyses the fourth reaction of de novo pyrimidine biosynthesis in parasites, and represents an important target for the treatment of malaria. In this study, we describe pharmacophore-based virtual screening combined with a docking study and biological evaluation as a rational strategy for identification of novel hits as antimalarial agents. Pharmacophore models were established using the GALAHAD module from known PfDHODH inhibitors with IC50 values ranging from 0.033 μM to 142 μM. The best pharmacophore model consisted of three hydrogen bond acceptors, one hydrogen bond donor and one hydrophobic feature. The pharmacophore models were validated through receiver operating characteristic and Güner-Henry scoring methods. The best pharmacophore model was used as a 3D search query against the IBS database. Several compounds with different structures (scaffolds) were retrieved as hit molecules. Among these compounds, those with a QFIT value of more than 81 were docked in the PfDHODH enzyme to further explore their binding modes. In silico pharmacokinetic properties and toxicities were predicted for the best docked molecules. Finally, the identified hits were evaluated in vivo for their antimalarial activity in a parasite inhibition assay. The hits reported here showed good potential to become novel antimalarial agents.
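
The Güner-Henry (GH) score mentioned above is commonly written in terms of four counts from a decoy screening run. The Python sketch below uses that common form; the parameter names and the example counts are illustrative assumptions, not values from this study.

```python
def guner_henry_score(n_database, n_actives, n_hits, n_active_hits):
    """GH score in its commonly cited form.

    n_database    : total compounds in the validation database (D)
    n_actives     : known actives in the database (A)
    n_hits        : compounds retrieved by the pharmacophore query (Ht)
    n_active_hits : retrieved compounds that are known actives (Ha)
    """
    # Weighted combination of yield of actives (Ha/Ht) and recall (Ha/A).
    yield_recall = n_active_hits * (3 * n_actives + n_hits) / (4 * n_hits * n_actives)
    # Penalty for inactive compounds retrieved as hits.
    false_positive_penalty = 1 - (n_hits - n_active_hits) / (n_database - n_actives)
    return yield_recall * false_positive_penalty


# Hypothetical screening run: 1000 compounds, 50 actives, 60 hits, 40 of them active.
gh = guner_henry_score(n_database=1000, n_actives=50, n_hits=60, n_active_hits=40)
print(round(gh, 3))
```

A query that retrieves every active and nothing else scores 1, and the score falls as actives are missed or inactive compounds are retrieved.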

Journal ArticleDOI
Shuling Yu, Shufang Gao, Ying Gan, Yi Zhang, X Ruan, Yali Wang, L Yang, Jiahua Shi
TL;DR: Two different methods, multiple linear regression based on descriptors generated using Dragon software and hologram quantitative structure–activity relationships, were employed to predict suspended particulate matter (SPM) derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of PCBs.
Abstract: Quantitative structure-property relationship modelling can be a valuable alternative method to replace or reduce experimental testing. In particular, some endpoints such as octanol-water (KOW) and organic carbon-water (KOC) partition coefficients of polychlorinated biphenyls (PCBs) are easier to predict and various models have been already developed. In this paper, two different methods, which are multiple linear regression based on the descriptors generated using Dragon software and hologram quantitative structure-activity relationships, were employed to predict suspended particulate matter (SPM) derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of 209 PCBs. The predictive ability of the derived models was validated using a test set. The performances of all these models were compared with EPI Suite™ software. The results indicated that the proposed models were robust and satisfactory, and could provide feasible and promising tools for the rapid assessment of the SPM derived log KOC and generator column, shake flask and slow stirring method derived log KOW values of PCBs.

Journal ArticleDOI
TL;DR: It can be suggested that the proposed N-tuple topological/geometric cutoffs constitute relevant criteria for generating MDs codifying particular atomic relations, ultimately useful in enhancing the modelling capacity of the QuBiLS-MIDAS 3D-MDs.
Abstract: Novel N-tuple topological/geometric cutoffs to consider specific inter-atomic relations in the QuBiLS-MIDAS framework are introduced in this manuscript. These molecular cutoffs permit the taking into account of relations between more than two atoms by using (dis-)similarity multi-metrics and the concepts related with topological and Euclidean-geometric distances. To this end, the kth two-, three- and four-tuple topological and geometric neighbourhood quotient (NQ) total (or local-fragment) spatial-(dis)similarity matrices are defined, to represent 3D information corresponding to the relations between two, three and four atoms of the molecular structures that satisfy certain cutoff criteria. First, an analysis of a diverse chemical space for the most common values of topological/Euclidean-geometric distances, bond/dihedral angles, triangle/quadrilateral perimeters, triangle area and volume was performed in order to determine the intervals to take into account in the cutoff procedures. A variability analysis based on Shannon's entropy reveals that better distribution patterns are attained with the descriptors based on the cutoffs proposed (QuBiLS-MIDAS NQ-MDs) with regard to the results obtained when all inter-atomic relations are considered (QuBiLS-MIDAS KA-MDs - 'Keep All'). A principal component analysis shows that the novel molecular cutoffs codify chemical information captured by the respective QuBiLS-MIDAS KA-MDs, as well as information not captured by the latter. Lastly, a QSAR study to obtain deeper knowledge of the contribution of the proposed methods was carried out, using four molecular datasets (steroids (STER), angiotensin converting enzyme (ACE), thermolysin inhibitors (THER) and thrombin inhibitors (THR)) widely used as benchmarks in the evaluation of several methodologies. One to four variable QSAR models based on multiple linear regression were developed for each compound dataset following the original division into training and test sets. 
The results obtained reveal that the novel cutoff procedures yield superior performances relative to those of the QuBiLS-MIDAS KA-MDs in the prediction of the biological activities considered. From the results achieved, it can be suggested that the proposed N-tuple topological/geometric cutoffs constitute relevant criteria for generating MDs codifying particular atomic relations, ultimately useful in enhancing the modelling capacity of the QuBiLS-MIDAS 3D-MDs.
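
A variability analysis based on Shannon's entropy, as described in this abstract, can be sketched as follows. The abstract does not state the discretisation scheme, so equal-width binning is an assumption here, and the function name and sample values are hypothetical.

```python
import math


def shannon_entropy(values, n_bins):
    """Shannon entropy (in bits) of a descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant descriptor carries no information.
        return 0.0
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp the maximum value into the last bin.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    n = len(values)
    return -sum((c / n) * math.log2(c / n) for c in counts if c > 0)


# Hypothetical descriptor values across four molecules.
spread = shannon_entropy([0.0, 1.0, 2.0, 3.0], n_bins=4)
constant = shannon_entropy([5.0, 5.0, 5.0], n_bins=4)
print(spread, constant)
```

Descriptors whose values spread evenly over the bins approach the maximum entropy of log2(n_bins), while near-constant descriptors score close to zero.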

Journal ArticleDOI
TL;DR: Inclusion of structural and physicochemical descriptors such as those in QSAAR models can improve models for extrapolation from acute to chronic toxicity, and the resulting QSAAR framework is expected to be useful for the further development of chronic toxicity estimation models.
Abstract: We constructed models for acute to chronic estimation of the Daphnia magna reproductive toxicities of chemical substances from their Daphnia magna acute immobilization toxicities. The models combined the acute toxicities with structural and physicochemical descriptors. We used multiregression analysis and selected the descriptors for the models by means of a genetic algorithm. Of the best 100 models (as indicated by the lack of fit score), 90% included the following descriptors: acute toxicity (i.e. an activity parameter), distribution coefficient (log D) and structural indicator variables that indicate the presence of -NH2 attached to aromatic carbon and the presence of a chlorine atom. We compared the predictive abilities of five of these quantitative structure-activity-activity relationship (QSAAR) acute to chronic estimation models with the predictive ability of a simple linear regression model. The comparison revealed that inclusion of structural and physicochemical descriptors such as those in QSAAR models can improve models for extrapolation from acute to chronic toxicity. Our results also provide a QSAAR framework that is expected to be useful for the further development of chronic toxicity estimation models.