Journal ArticleDOI

Support vector machines for drug discovery.

01 Jan 2014-Expert Opinion on Drug Discovery (Taylor & Francis)-Vol. 9, Iss: 1, pp 93-104
TL;DR: SVMs are currently among the best-performing approaches for chemical and biological property prediction and for the computational identification of active compounds; it is anticipated that their use in drug discovery will further increase.
Abstract: Introduction: Support vector machines (SVMs) are supervised machine learning algorithms for binary class label prediction and regression-based prediction of property values. In recent years, SVMs h...
Citations
Journal ArticleDOI
TL;DR: In silico prediction of ADMET is an important component of pharmaceutical R&D and has advanced alongside the progress of chemoinformatics, which has evolved from traditional chemometrics to advanced machine learning methods.

255 citations

Journal ArticleDOI
TL;DR: In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques are discussed, and the applications and methods that produce promising results are reviewed.
Abstract: The advancements of information technology and related processing techniques have created a fertile base for progress in many scientific fields and industries. In the fields of drug discovery and development, machine learning techniques have been used for the development of novel drug candidates. The methods for designing drug targets and novel drug discovery now routinely combine machine learning and deep learning algorithms to enhance the efficiency, efficacy, and quality of developed outputs. The generation and incorporation of big data, through technologies such as high-throughput screening and high-throughput computational analysis of databases used for both lead and target discovery, has increased the reliability of techniques incorporating machine learning and deep learning. The use of such virtual screening and comprehensive online information has also been highlighted in the development of lead synthesis pathways. In this review, machine learning and deep learning algorithms utilized in drug discovery and associated techniques are discussed, and the applications and methods that produce promising results are reviewed.

131 citations


Cites background or methods from "Support vector machines for drug di..."

  • ...An optimal hyperplane is attained by maximizing the margin between classes in N-dimensional space (where N is the number of features); this hyperplane classifies data points by defining the decision boundary [51]....


  • ...For drug-target interaction prediction, the approach is specifically designed to integrate information on ligands and proteins of interest as an essential component of SVM modeling [51]....

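The margin concept in the excerpt above lends itself to a short illustration. The sketch below (toy 2D data and hand-picked hyperplanes, all invented for illustration) classifies points with a linear decision function w·x + b and computes the geometric margin of each candidate hyperplane, showing why maximum-margin training prefers one separating hyperplane over another:

```python
import math

def decision(w, b, x):
    """Linear decision function w.x + b; its sign is the predicted class."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def geometric_margin(w, b, X, y):
    """Smallest distance from any training point to the hyperplane
    w.x + b = 0; SVM training maximizes this quantity."""
    norm = math.sqrt(sum(wj * wj for wj in w))
    return min(yi * decision(w, b, xi) / norm for xi, yi in zip(X, y))

# Toy 2D data: two well-separated clusters (invented for illustration)
X = [[1.0, 1.0], [2.0, 1.5], [6.0, 6.0], [7.0, 6.5]]
y = [-1, -1, 1, 1]

# Two candidate separating hyperplanes; an SVM would pick the first,
# because it leaves the larger margin.
h1 = ([1.0, 1.0], -8.0)  # x1 + x2 = 8, roughly between the clusters
h2 = ([1.0, 0.0], -3.0)  # x1 = 3, separates but skims the negative class

for w, b in (h1, h2):
    preds = [1 if decision(w, b, x) >= 0 else -1 for x in X]
    print(preds, round(geometric_margin(w, b, X, y), 3))
# -> [-1, -1, 1, 1] 2.828
# -> [-1, -1, 1, 1] 1.0
```

A full SVM additionally solves a quadratic program to find the margin-maximizing w and b; the snippet only evaluates two hand-picked candidates.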

Journal ArticleDOI
TL;DR: The aim of this review is to show how QSAR modeling can be applied in novel drug discovery, design and lead optimization in the absence of 3D structures of specific drug targets.
Abstract: Introduction: Quantitative structure–activity relationship (QSAR) modeling is one of the most popular computer-aided tools employed in medicinal chemistry for drug discovery and lead optimization. It is especially powerful in the absence of 3D structures of specific drug targets. QSAR methods have drawn considerable attention since they were first introduced.Areas covered: In this review, the authors provide a brief discussion of the basic principles of QSAR, model development and model validation. They also highlight the current applications of QSAR in different fields, particularly in virtual screening, rational drug design and multi-target QSAR. Finally, in view of recent controversies, the authors detail the challenges faced by QSAR modeling and the relevant solutions. The aim of this review is to show how QSAR modeling can be applied in novel drug discovery, design and lead optimization.Expert opinion: QSAR should intentionally be used as a powerful tool for fragment-based drug design platform...

90 citations

Journal ArticleDOI
04 Oct 2017
TL;DR: This work compared SVM and SVR calculations for the same compound data sets to evaluate which features are responsible for predictions; on the basis of systematic feature weight analysis, rather surprising results were obtained.
Abstract: In computational chemistry and chemoinformatics, the support vector machine (SVM) algorithm is among the most widely used machine learning methods for the identification of new active compounds. In addition, support vector regression (SVR) has become a preferred approach for modeling nonlinear structure–activity relationships and predicting compound potency values. For the closely related SVM and SVR methods, fingerprints (i.e., bit string or feature set representations of chemical structure and properties) are generally preferred descriptors. Herein, we have compared SVM and SVR calculations for the same compound data sets to evaluate which features are responsible for predictions. On the basis of systematic feature weight analysis, rather surprising results were obtained. Fingerprint features were frequently identified that contributed differently to the corresponding SVM and SVR models. The overlap between feature sets determining the predictive performance of SVM and SVR was only very small. Furthermo...

65 citations
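The feature weight analysis described here exploits the fact that, for a linear kernel, a trained SVM's weight on fingerprint bit j can be recovered from the dual solution as w_j = Σ_i α_i y_i x_ij (analogously for SVR, with α_i − α_i*). A minimal sketch with invented dual coefficients and 6-bit fingerprints, not data from the paper:

```python
# Recover per-feature weights of a linear-kernel SVM from its dual solution:
# w_j = sum_i alpha_i * y_i * x_ij.  Large |w_j| flags fingerprint bits that
# drive predictions.  Alphas, labels and fingerprints are invented.

def feature_weights(alphas, labels, fingerprints):
    d = len(fingerprints[0])
    return [sum(a * y * fp[j] for a, y, fp in zip(alphas, labels, fingerprints))
            for j in range(d)]

fps = [  # three "support vectors" as 6-bit fingerprints
    [1, 0, 1, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 1, 0, 0, 0, 1],
]
labels = [+1, +1, -1]
alphas = [0.5, 0.3, 0.8]  # dual coefficients (would come from SVM training)

w = feature_weights(alphas, labels, fps)
print([round(wj, 2) for wj in w])  # -> [-0.3, -0.5, 0.8, 0.5, 0.3, -0.8]
```

Comparing the weight vectors of an SVM and an SVR model trained on the same compounds is, in essence, the comparison the abstract describes.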

Journal ArticleDOI
TL;DR: Testing SVM as a classification tool on a real-life drug discovery problem revealed that it could be a useful method for classification tasks in the early phases of drug discovery.

63 citations


Cites background or methods from "Support vector machines for drug di..."


  • ...More detailed information can be found in Heikamp and Bajorath [29]....


  • ...For over a decade, various ML methods have been applied to biology, chemistry and drug discovery [29]....


References
Book
Vladimir Vapnik
01 Jan 1995
TL;DR: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?
Abstract: Setting of the learning problem; consistency of learning processes; bounds on the rate of convergence of learning processes; controlling the generalization ability of learning processes; constructing learning algorithms; what is important in learning theory?

40,147 citations


"Support vector machines for drug di..." refers background in this paper

  • ...The balance between these two objectives is important for the generalization performance of SVM models on new, previously unseen data [9,10]....


Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.

37,861 citations

Journal ArticleDOI
TL;DR: Several arguments supporting the observed high accuracy of SVMs are reviewed, and numerous examples and proofs of most of the key theorems are given.
Abstract: The tutorial starts with an overview of the concepts of VC dimension and structural risk minimization. We then describe linear Support Vector Machines (SVMs) for separable and non-separable data, working through a non-trivial example in detail. We describe a mechanical analogy, and discuss when SVM solutions are unique and when they are global. We describe how support vector training can be practically implemented, and discuss in detail the kernel mapping technique which is used to construct SVM solutions which are nonlinear in the data. We show how Support Vector machines can have very large (even infinite) VC dimension by computing the VC dimension for homogeneous polynomial and Gaussian radial basis function kernels. While very high VC dimension would normally bode ill for generalization performance, and while at present there exists no theory which shows that good generalization performance is guaranteed for SVMs, there are several arguments which support the observed high accuracy of SVMs, which we review. Results of some experiments which were inspired by these arguments are also presented. We give numerous examples and proofs of most of the key theorems. There is new material, and I hope that the reader will find that even old material is cast in a fresh light.

15,696 citations

Proceedings ArticleDOI
01 Jul 1992
TL;DR: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented, applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions.
Abstract: A training algorithm that maximizes the margin between the training patterns and the decision boundary is presented. The technique is applicable to a wide variety of classification functions, including Perceptrons, polynomials, and Radial Basis Functions. The effective number of parameters is adjusted automatically to match the complexity of the problem. The solution is expressed as a linear combination of supporting patterns. These are the subset of training patterns that are closest to the decision boundary. Bounds on the generalization performance based on the leave-one-out method and the VC-dimension are given. Experimental results on optical character recognition problems demonstrate the good generalization obtained when compared with other learning algorithms.

11,211 citations


"Support vector machines for drug di..." refers methods in this paper

  • ...In order to enable SVM model building in such cases, the so-called kernel trick [12] is applied, which facilitates the derivation of a nonlinear decision function in the input space....

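The kernel trick cited in this excerpt can be checked numerically with a degree-2 polynomial kernel: k(x, z) = (x·z + 1)² evaluated in the 2-dimensional input space equals the inner product of an explicit map into a 6-dimensional feature space, which the kernel never has to construct. A small sketch with arbitrary example vectors:

```python
import math

def poly_kernel(x, z):
    """Degree-2 polynomial kernel on 2D inputs: (x.z + 1)^2."""
    return (x[0] * z[0] + x[1] * z[1] + 1.0) ** 2

def phi(x):
    """Explicit 6D feature map whose inner product the kernel reproduces."""
    r2 = math.sqrt(2.0)
    return [x[0] ** 2, x[1] ** 2, r2 * x[0] * x[1], r2 * x[0], r2 * x[1], 1.0]

x, z = [1.0, 2.0], [3.0, 0.5]
lhs = poly_kernel(x, z)                           # implicit: 2D arithmetic
rhs = sum(a * b for a, b in zip(phi(x), phi(z)))  # explicit: 6D arithmetic
print(round(lhs, 6), round(rhs, 6))  # -> 25.0 25.0 (equal up to rounding)
```

With a Gaussian kernel the implicit feature space is infinite-dimensional, which is why the trick, rather than an explicit map, is what makes nonlinear SVMs practical.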

Journal ArticleDOI
TL;DR: This tutorial gives an overview of the basic ideas underlying Support Vector (SV) machines for function estimation, and includes a summary of currently used algorithms for training SV machines, covering both the quadratic programming part and advanced methods for dealing with large datasets.
Abstract: In this tutorial we give an overview of the basic ideas underlying Support Vector (SV) machines for function estimation. Furthermore, we include a summary of currently used algorithms for training SV machines, covering both the quadratic (or convex) programming part and advanced methods for dealing with large datasets. Finally, we mention some modifications and extensions that have been applied to the standard SV algorithm, and discuss the aspect of regularization from a SV perspective.

10,696 citations
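A central ingredient of SV function estimation as summarized above is the ε-insensitive loss: residuals inside an ε-tube around the prediction cost nothing, and only points outside the tube incur loss (and become support vectors). A minimal sketch with invented values:

```python
# Epsilon-insensitive loss used by support vector regression:
# residuals with |y - f(x)| <= eps are ignored entirely; larger residuals
# are penalized linearly.  All numbers below are invented for illustration.

def eps_insensitive_loss(y_true, y_pred, eps=0.5):
    return [max(0.0, abs(t - p) - eps) for t, p in zip(y_true, y_pred)]

y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.2, 2.6, 2.9, 5.1]  # residuals: 0.2, 0.6, 0.1, 1.1

losses = eps_insensitive_loss(y_true, y_pred)
print([round(l, 2) for l in losses])  # -> [0.0, 0.1, 0.0, 0.6]
```

Only the second and fourth points lie outside the tube; they alone would contribute loss and enter the fitted model as support vectors.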