scispace - formally typeset
Open AccessJournal ArticleDOI

Machine learning methods in chemoinformatics

TLDR
This discussion is methods‐based and focused on some algorithms that chemoinformatics researchers frequently use, particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k‐Nearest Neighbors and naïve Bayes classifiers.
Abstract
Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure–activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naive Bayes classifiers. WIREs Comput Mol Sci 2014, 4:468–481. How to cite this article: WIREs Comput Mol Sci 2014, 4:468–481. doi:10.1002/wcms.1183

read more

Citations
More filters
Journal ArticleDOI

SwissADME: A free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules

TL;DR: The new SwissADME web tool is presented that gives free access to a pool of fast yet robust predictive models for physicochemical properties, pharmacokinetics, drug-likeness and medicinal chemistry friendliness, among which in-house proficient methods such as the BOILED-Egg, iLOGP and Bioavailability Radar are presented.
Journal ArticleDOI

MoleculeNet: a benchmark for molecular machine learning

TL;DR: A large scale benchmark for molecular machine learning consisting of multiple public datasets, metrics, featurizations and learning algorithms.
Journal ArticleDOI

Generating Focused Molecule Libraries for Drug Discovery with Recurrent Neural Networks

TL;DR: This work shows that recurrent neural networks can be trained as generative models for molecular structures, similar to statistical language models in natural language processing, and demonstrates that the properties of the generated molecules correlate very well with those of the molecules used to train the model.
Journal ArticleDOI

Machine-learning approaches in drug discovery: methods and applications.

TL;DR: This work focuses on machine-learning techniques within the context of ligand-based VS (LBVS), providing a detailed view of the current state of the art in this field and highlighting not only the problematic issues, but also the successes and opportunities for further advances.
Journal ArticleDOI

Machine Learning in Computer-Aided Synthesis Planning

TL;DR: Two critical aspects of CASP and recent machine learning approaches to both challenges are focused on, including the problem of retrosynthetic planning and anticipating the products of chemical reactions, which can be used to validate proposed reactions in a computer-generated synthesis plan.
References
More filters
Journal Article

R: A language and environment for statistical computing.

R Core Team
- 01 Jan 2014 - 
TL;DR: Copyright (©) 1999–2012 R Foundation for Statistical Computing; permission is granted to make and distribute verbatim copies of this manual provided the copyright notice and permission notice are preserved on all copies.
Journal ArticleDOI

Random Forests

TL;DR: Internal estimates monitor error, strength, and correlation and these are used to show the response to increasing the number of features used in the forest, and are also applicable to regression.
Posted Content

Improving neural networks by preventing co-adaptation of feature detectors

TL;DR: The authors randomly omits half of the feature detectors on each training case to prevent complex co-adaptations in which a feature detector is only helpful in the context of several other specific feature detectors.
Journal ArticleDOI

A comparison of methods for multiclass support vector machines

TL;DR: Decomposition implementations for two "all-together" multiclass SVM methods are given and it is shown that for large problems methods by considering all data at once in general need fewer support vectors.
Journal ArticleDOI

A systematic analysis of performance measures for classification tasks

TL;DR: This paper presents a systematic analysis of twenty four performance measures used in the complete spectrum of Machine Learning classification tasks, i.e., binary, multi-class,multi-labelled, and hierarchical, to produce a measure invariance taxonomy with respect to all relevant label distribution changes in a classification problem.
Related Papers (5)