A Novel Logic‐Based Approach for Quantitative Toxicology Prediction.
Abstract: There is a pressing need for accurate in silico methods to predict the toxicity of molecules that are being introduced into the environment or are being developed into new pharmaceuticals. Predictive toxicology is in the realm of structure activity relationships (SAR), and many approaches have been used to derive such SAR. Previous work has shown that inductive logic programming (ILP) is a powerful approach that circumvents several major difficulties, such as molecular superposition, faced by some other SAR methods. The ILP approach reasons with chemical substructures within a relational framework and yields chemically understandable rules. Here, we report a general new approach, support vector inductive logic programming (SVILP), which extends the essentially qualitative ILP-based SAR to quantitative modeling. First, ILP is used to learn rules, the predictions of which are then used within a novel kernel to derive a support-vector generalization model. For a highly heterogeneous dataset of 576 molecules ...
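The two-stage idea in the abstract (learn rules first, then build a kernel from their predictions) can be illustrated with a small sketch. Each molecule is represented by the set of rules that cover it, and similarity is computed from shared rules. The rule identifiers, the toy molecules, and the specific kernel form (a Gaussian over the number of rules on which two molecules disagree) are assumptions for illustration only; the abstract does not give the actual SVILP kernel.

```python
import math

# Hypothetical rule coverage: molecule id -> ids of the ILP rules that fire on it.
coverage = {
    "mol_a": {"r1", "r2"},
    "mol_b": {"r1", "r2", "r3"},
    "mol_c": {"r4"},
}

def shared_rule_count(a, b):
    """Number of rules that cover both molecules (a linear kernel on rule indicators)."""
    return len(coverage[a] & coverage[b])

def rule_kernel(a, b, sigma=1.0):
    """Gaussian kernel on the distance induced by the shared-rule count (assumed form)."""
    # Squared distance in the binary rule-indicator space:
    # k(a, a) - 2 k(a, b) + k(b, b), with k = shared_rule_count.
    sq_dist = (shared_rule_count(a, a)
               - 2 * shared_rule_count(a, b)
               + shared_rule_count(b, b))
    return math.exp(-sq_dist / (2 * sigma ** 2))

def gram_matrix(molecules, sigma=1.0):
    """Pairwise kernel values, in the order given by `molecules`."""
    return [[rule_kernel(a, b, sigma) for b in molecules] for a in molecules]

if __name__ == "__main__":
    for row in gram_matrix(["mol_a", "mol_b", "mol_c"]):
        print([round(v, 3) for v in row])
```

A Gram matrix of this kind could then be handed to any support-vector regressor that accepts a precomputed kernel, which is one plausible way to realize the "support-vector generalization model" step.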
Summary
- With more than 70 000 chemicals in use today and many more being synthesized, it is vital that there are effective methods to assess the effect of these compounds on the environment and on human health.
- Using DSSTox (ref. 10), a recently available toxicity dataset that provides the toxicities of 576 chemicals for fathead minnow, the authors show that SVILP yields significantly better accuracies than ILP, regression from chemical descriptors, and the industry-standard method TOPKAT.
- Importantly, the learned logic rules are readily amenable to interpretation as chemical substructures related to activity and thereby provide extensive chemical insights.
- The SVILP approach (ref. 9) uses ILP for learning logic rules, followed by quantitative modeling based on support vector technology, as shown in Figure 1.
- The logic relations identify the chemical fragments according to the atom and bond details of the MOL2 structures.
- These learned rules form the input for quantitative prediction using the newly developed method SVILP.
- In the five-fold evaluation, one fold is used as the test set and the other four folds are used for training.
- All molecules above the mean value of toxicities in the training set are considered to be positive (more toxic), and the remaining are considered to be negative (less toxic).
- The average accuracies of predictions over five folds using the chemical descriptor method (CHEM), ILP rules in combination with PLS, and SVILP are given in Table 2a.
- In the second part of this study, the molecules were classified into two groups based on their toxicities: toxic (pLC50 ≥ mean) and nontoxic (pLC50 < mean), where "mean" is the average toxicity of the molecules in the training set.
- For the majority of rules, the distances between the chemical fragments are also defined, thereby identifying the relative location of the …
- Table footnote: C is the compression; p and n are the numbers of positives and negatives covered by the rule, respectively.
- Such chlorinated compounds show toxicity, particularly when the chlorine is part of an aromatic compound.
- In the previous sections, the authors compared SVILP with four methods: ILP, CHEM, PLS, and TOPKAT.
- The authors introduced a new quantitative logic-based method, support vector inductive logic programming (SVILP), which uses logic-based technology to learn logic rules, followed by regression.
- The results of this study on a large, public, and diverse dataset show that SVILP predicts the toxicities with higher accuracy than other tested models.
- One could interpret the higher accuracy of the SVILP and PLS as a consequence of using more features.
- The rules are chemically understandable and describe the chemical alerts which are the cause of activity/toxicity.
- The program automatically and consistently detects chemical substructures and properties by construction of rules which are general.
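
The evaluation protocol described in the summary (five folds, with toxic/nontoxic labels defined by the mean toxicity of the training folds) can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the shuffling seed, the toy pLC50 values, and the choice to treat a value exactly equal to the mean as toxic are all hypothetical and not taken from the paper.

```python
import random

def five_fold_splits(n_items, n_folds=5, seed=0):
    """Yield (train_idx, test_idx) pairs; each fold serves as the test set once."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)          # assumed: random fold assignment
    folds = [idx[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [i for j, fold in enumerate(folds) if j != k for i in fold]
        yield train, test

def label_by_training_mean(values, train_idx):
    """Return (mean, labels): 1 = more toxic, 0 = less toxic.

    The threshold is the mean over the training molecules only, so the
    test fold does not influence the labels' cut-off.
    """
    mean = sum(values[i] for i in train_idx) / len(train_idx)
    # Assumption: a value equal to the mean counts as toxic.
    return mean, [1 if v >= mean else 0 for v in values]

if __name__ == "__main__":
    # Hypothetical pLC50 values for ten molecules.
    plc50 = [2.1, 3.4, 4.0, 5.2, 3.3, 6.1, 2.8, 4.7, 3.9, 5.5]
    for train, test in five_fold_splits(len(plc50)):
        mean, labels = label_by_training_mean(plc50, train)
        print(sorted(test), round(mean, 3), [labels[i] for i in sorted(test)])
```

Because the threshold is recomputed per fold, the same molecule can receive a different label in different folds when its toxicity lies close to the mean.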
Cites background from "A Novel Logic‐Based Approach for Qu..."
...A large body of both review and research articles exists for these technologies (see, for example, Cronin et al., 2003; Veith, 2004; Helma, 2005; Piclin et al., 2006; Simon-Hettich et al., 2006; Amini et al., 2007; Aronov et al., 2007; Bender et al., 2007; Custer et al., 2007; Ecker and Chiba, 2007; Ekins, 2007; Serafimova et al., 2007; Enoch et al., 2008; Kavlock et al., 2008; Merlot, 2008; Pavan and Worth, 2008; Benfenati et al., 2009; Green and Naven, 2009; Nigsch et al., 2009; Spreafico et al., 2009; Valerio, 2009; Rossato et al., 2010; Cronin and Madden, 2010; Bars et al., 2011; Vuorinen et al., 2013; Gupta et al., 2013; Roncaglioni et al., 2013; Shah and Greene, 2014; Toropov et al., 2014; Schilter et al., 2014; Singh and Gupta, 2014; Ekins, 2014)....
Cites methods from "A Novel Logic‐Based Approach for Qu..."
... described the application of ‘support vector inductive logic programming’ (SVILP) that combined inductive logic programming (ILP) and SVMs in the context of compound toxicity prediction....
...On the basis of the resulting rules, SVR models were trained to facilitate quantitative toxicity predictions ....
Cites methods or result from "A Novel Logic‐Based Approach for Qu..."
...The problem of estimating the toxicity of drugs has been addressed mainly by three methods: i) regression from physical-chemical properties; ii) expert systems; and iii) machine learning [10, 11]....
...In  the ILP (inductive logic programming) approach was used with support vector machines to extend the essentially qualitative ILP-based SAR to quantitative modelling....
...Although similar studies have been reported [3, 4, 10, 14], they did not assess the relevancy of molecular descriptors in terms of toxicity prediction....
...Besides the commercially available programs, other studies have been published using machine learning approaches [3, 13, 10, 6, 11]....