Topic

Quantitative structure–activity relationship

About: Quantitative structure–activity relationship is a research topic. To date, 7,560 publications have appeared within this topic, receiving 144,670 citations. The topic is also known as QSAR and Quantitative Structure-Activity Relationship.


Papers
Journal ArticleDOI
TL;DR: 21 types of error that continue to be perpetrated in the QSAR/QSPR literature are identified and each is discussed, with examples (including some of the authors' own).
Abstract: Although thousands of quantitative structure–activity and structure–property relationships (QSARs/QSPRs) have been published, as well as numerous papers on the correct procedures for QSAR/QSPR analysis, many analyses are still carried out incorrectly, or in a less than satisfactory manner. We have identified 21 types of error that continue to be perpetrated in the QSAR/QSPR literature, and each of these is discussed, with examples (including some of our own). Where appropriate, we make recommendations for avoiding errors and for improving and enhancing QSAR/QSPR analyses.

389 citations

Book ChapterDOI
14 Feb 2007
TL;DR: Support vector machines represent an extension to nonlinear models of the generalized portrait algorithm developed by Vapnik and Lerner, and are a group of supervised learning methods that can be applied to classification or regression.
Abstract: Kernel-based techniques (such as support vector machines, Bayes point machines, kernel principal component analysis, and Gaussian processes) represent a major development in machine learning algorithms. Support vector machines (SVM) are a group of supervised learning methods that can be applied to classification or regression. In a short period of time, SVM found numerous applications in chemistry, such as in drug design (discriminating between ligands and nonligands, inhibitors and noninhibitors, etc.), quantitative structure-activity relationships (QSAR, where SVM regression is used to predict various physical, chemical, or biological properties), chemometrics (optimization of chromatographic separation or compound concentration prediction from spectral data as examples), sensors (for qualitative and quantitative prediction from sensor data), chemical engineering (fault detection and modeling of industrial processes), and text mining (automatic recognition of scientific information). Support vector machines represent an extension to nonlinear models of the generalized portrait algorithm developed by Vapnik and Lerner. The SVM algorithm is based on the statistical learning theory and the Vapnik–Chervonenkis

375 citations

Journal ArticleDOI
TL;DR: This critical review re-examines the strategy and the output of the modern QSAR modeling approaches and provides examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurate prediction of compound properties for molecules not included in the training sets.
Abstract: Quantitative Structure Activity Relationship (QSAR) modeling has traditionally been applied as an evaluative approach, i.e., with the focus on developing retrospective and explanatory models of existing data. Model extrapolation was considered, if only in a hypothetical sense, in terms of potential modifications of known biologically active chemicals that could improve compounds' activity. This critical review re-examines the strategy and the output of the modern QSAR modeling approaches. We provide examples and arguments suggesting that current methodologies may afford robust and validated models capable of accurate prediction of compound properties for molecules not included in the training sets. We discuss a data-analytical modeling workflow developed in our laboratory that incorporates modules for combinatorial QSAR model development (i.e., using all possible binary combinations of available descriptor sets and statistical data modeling techniques), rigorous model validation, and virtual screening of available chemical databases to identify novel biologically active compounds. Our approach places particular emphasis on model validation as well as the need to define model applicability domains in the chemistry space. We present examples of studies where the application of rigorously validated QSAR models to virtual screening identified computational hits that were confirmed by subsequent experimental investigations. The emerging focus of QSAR modeling on target property forecasting brings it forward as a predictive, as opposed to evaluative, modeling approach.

369 citations
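
The emphasis on model validation and applicability domains in the abstract above can be illustrated with a minimal sketch. It is not the workflow described by the authors: it assumes a single hypothetical descriptor, an ordinary least-squares line, leave-one-out cross-validation, and a simple range check as the applicability domain. All function names and data values are made up for illustration.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x with one descriptor."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope  # (intercept, slope)


def loo_q2(xs, ys):
    """Leave-one-out q^2 = 1 - PRESS / SS_total.

    SS_total uses the mean of all activities; other conventions exist.
    """
    press = 0.0
    for i in range(len(xs)):
        # Refit the model without compound i, then predict compound i.
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        a, b = fit_line(train_x, train_y)
        press += (ys[i] - (a + b * xs[i])) ** 2
    y_bar = mean(ys)
    ss_total = sum((y - y_bar) ** 2 for y in ys)
    return 1.0 - press / ss_total


def in_domain(x, train_xs):
    """Range-based applicability domain (an assumption, one of many options)."""
    return min(train_xs) <= x <= max(train_xs)


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [2.1, 3.9, 6.2, 7.8, 10.1]

q2 = loo_q2(descriptor, activity)
inside = in_domain(3.5, descriptor)
outside = in_domain(9.0, descriptor)
```

A new compound would only be predicted when in_domain returns True; for a compound outside the training range the prediction is an extrapolation and should be flagged rather than trusted.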

Journal ArticleDOI
TL;DR: This review focuses on the methodologies for constructing the three main components of a QSAR model, namely the methods for describing the molecular structure of compounds, for selecting informative descriptors, and for predicting activity.
Abstract: Virtual filtering and screening of combinatorial libraries have recently gained attention as methods complementing high-throughput screening and combinatorial chemistry. These chemoinformatic techniques rely heavily on quantitative structure-activity relationship (QSAR) analysis, a field with established methodology and a successful history. In this review, we discuss the computational methods for building QSAR models. We start by outlining their usefulness in high-throughput screening and identifying the general scheme of a QSAR model. We then focus on the methodologies for constructing the three main components of a QSAR model, namely the methods for describing the molecular structure of compounds, for selecting informative descriptors, and for predicting activity. We present both well-established methods and techniques recently introduced into the QSAR domain.

360 citations

Journal ArticleDOI
TL;DR: Partial least-squares (PLS) regression and a genetic algorithm (GA) are applied in series to perform data reduction and identify the manifold of top 3D-QSAR models for a training set.
Abstract: 4D-QSAR analysis incorporates conformational and alignment freedom into the development of 3D-QSAR models for training sets of structure−activity data by performing ensemble averaging, the fourth “dimension”. The descriptors in 4D-QSAR analysis are the grid cell (spatial) occupancy measures of the atoms composing each molecule in the training set realized from the sampling of conformation and alignment spaces. Grid cell occupancy descriptors can be generated for any atom type, group, and/or model pharmacophore. A single “active” conformation can be postulated for each compound in the training set and combined with the optimal alignment for use in other molecular design applications including other 3D-QSAR methods. The influence of the conformational entropy of each compound on its activity can be estimated. Serial use of partial least-squares, PLS, regression and a genetic algorithm, GA, is used to perform data reduction and identify the manifold of top 3D-QSAR models for a training set. The unique manifo...

359 citations
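
The grid cell occupancy descriptors mentioned in the abstract above can be sketched roughly as follows. This is a simplified illustration, not the published 4D-QSAR procedure: it assumes conformations are already aligned, ignores atom types, and defines occupancy as the fraction of sampled conformations in which a cell contains at least one atom. The function names, cell size, and coordinates are hypothetical.

```python
from collections import Counter
from math import floor


def grid_cell(point, cell_size):
    """Map a 3D coordinate to the integer index of its cubic grid cell."""
    return tuple(floor(c / cell_size) for c in point)


def occupancy_descriptors(conformations, cell_size=1.0):
    """Ensemble-averaged occupancy per grid cell.

    conformations: list of conformations, each a list of (x, y, z) atoms.
    Returns {cell_index: fraction of conformations occupying that cell}.
    """
    counts = Counter()
    for atoms in conformations:
        # A cell counts once per conformation, however many atoms it holds.
        occupied = {grid_cell(atom, cell_size) for atom in atoms}
        counts.update(occupied)
    n = len(conformations)
    return {cell: count / n for cell, count in counts.items()}


# Two hypothetical two-atom conformations of one molecule.
ensemble = [
    [(0.2, 0.2, 0.2), (1.5, 0.1, 0.1)],
    [(0.4, 0.3, 0.1), (0.2, 1.7, 0.3)],
]
descriptors = occupancy_descriptors(ensemble)
```

In this toy ensemble the cell at the origin is occupied in both conformations, while the two neighbouring cells are each occupied in only one, so the averaged values separate the rigid part of the molecule from the flexible part.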


Network Information
Related Topics (5)
Ligand (biochemistry): 26.5K papers, 1M citations, 80% related
Aqueous solution: 189.5K papers, 3.4M citations, 77% related
Active site: 28.6K papers, 1.1M citations, 77% related
Alkyl: 223.5K papers, 2M citations, 77% related
Reaction rate constant: 42.9K papers, 1M citations, 76% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    510
2022    1,020
2021    284
2020    356
2019    334
2018    313