Journal Article

Beware of q2

TLDR
It is argued that a high value of LOO q2 is a necessary but not a sufficient condition for a model to have high predictive power, and that this is a general property of QSAR models developed using LOO cross-validation.
Abstract
Validation is a crucial aspect of any quantitative structure-activity relationship (QSAR) modeling effort. This paper examines one of the most popular validation criteria, the leave-one-out cross-validated R2 (LOO q2). A high value of this statistic (q2 > 0.5) is often taken as proof of a model's high predictive ability. We show that this assumption is generally incorrect. For 3D QSAR, the lack of correlation between a high LOO q2 and high predictive ability has been established earlier [Pharm. Acta Helv. 70 (1995) 149; J. Chemomet. 10 (1996) 95; J. Med. Chem. 41 (1998) 2553]. Here we use two-dimensional (2D) molecular descriptors and the k-nearest-neighbors (kNN) QSAR method to analyze several datasets. For none of the datasets did we find a correlation between q2 for the training set and predictive ability for the test set. Thus, a high LOO q2 appears to be a necessary but not a sufficient condition for a model to have high predictive power. We argue that this is a general property of QSAR models developed using LOO cross-validation. We emphasize that external validation is the only way to establish a reliable QSAR model, and we formulate a set of criteria for evaluating the predictive ability of QSAR models.
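To make the two quantities compared in the abstract concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: the kNN predictor is a plain unweighted mean over Euclidean neighbors with no variable selection, the external statistic is taken to be the squared Pearson correlation between observed and predicted test-set activities, and the data are synthetic. The function names (`knn_predict`, `loo_q2`, `external_r2`) are hypothetical. LOO q2 is computed with the usual definition, 1 - PRESS / SS_total, where each training compound is predicted by a model built without it.

```python
import numpy as np

def knn_predict(X_train, y_train, x, k=3):
    """Predict the activity of x as the unweighted mean activity of its
    k nearest training compounds (Euclidean distance in descriptor space)."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    return y_train[nearest].mean()

def loo_q2(X, y, k=3):
    """Leave-one-out cross-validated R2 (q2) on the training set:
    q2 = 1 - sum((y_i - yhat_i)^2) / sum((y_i - mean(y))^2),
    where yhat_i is predicted with compound i left out."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i  # every compound except i
        preds[i] = knn_predict(X[mask], y[mask], X[i], k)
    press = np.sum((y - preds) ** 2)
    ss_total = np.sum((y - y.mean()) ** 2)
    return 1.0 - press / ss_total

def external_r2(X_train, y_train, X_test, y_test, k=3):
    """Squared Pearson correlation between observed and predicted
    activities of an external test set (one common choice; an assumption here)."""
    preds = np.array([knn_predict(X_train, y_train, x, k) for x in X_test])
    r = np.corrcoef(y_test, preds)[0, 1]
    return r ** 2

# Synthetic illustration only: 5 random descriptors, activity driven by the first.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 5))
y = X[:, 0] + 0.1 * rng.normal(size=40)
X_tr, y_tr, X_te, y_te = X[:30], y[:30], X[30:], y[30:]

print("LOO q2 (training set):", loo_q2(X_tr, y_tr))
print("R2 (external test set):", external_r2(X_tr, y_tr, X_te, y_te))
```

Reporting both numbers side by side is the point of the sketch: q2 uses only the training set, so it cannot by itself show how the model behaves on compounds it has never seen.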


Citations
Journal Article

The Importance of Being Earnest: Validation is the Absolute Essential for Successful Application and Interpretation of QSPR Models

TL;DR: A set of simple guidelines for developing validated and predictive QSPR models is presented. The guidelines highlight the need to establish the domain of model applicability in chemical space, so that molecules with potentially unreliable predictions can be flagged, and discuss some algorithms that can be used for this purpose.
Journal Article

Principles of QSAR models validation: internal and external

TL;DR: Evidence is presented that only models that have been validated externally, after their internal validation, can be considered reliable and applicable for both external prediction and regulatory purposes.
Journal Article

Best Practices for QSAR Model Development, Validation, and Exploitation.

TL;DR: Most critical QSAR modeling routines that are regarded as best practices in the field are examined, including procedures used to validate models, both internally and externally, as well as the need to define model applicability domains that should be used when models are employed for the prediction of external compounds or compound libraries.
Journal Article

Recent advances and applications of machine learning in solid-state materials science

TL;DR: A comprehensive overview and analysis of the most recent research in machine learning principles, algorithms, descriptors, and databases in materials science, and proposes solutions and future research paths for various challenges in computational materials science.