Journal Article
Multivariable prognostic models: issues in developing models, evaluating assumptions and adequacy, and measuring and reducing errors
TL;DR: This article discusses an easily interpretable index of predictive discrimination and methods for assessing calibration of predicted survival probabilities. The methods apply to all regression models but are particularly needed for binary, ordinal, and time-to-event outcomes.
Abstract:
Multivariable regression models are powerful tools that are used frequently in studies of clinical outcomes. These models can use a mixture of categorical and continuous variables and can handle partially observed (censored) responses. However, uncritical application of modelling techniques can result in models that poorly fit the dataset at hand, or, even more likely, inaccurately predict outcomes on new subjects. One must know how to measure qualities of a model's fit in order to avoid poorly fitted or overfitted models. Measurement of predictive accuracy can be difficult for survival time data in the presence of censoring. We discuss an easily interpretable index of predictive discrimination as well as methods for assessing calibration of predicted survival probabilities. Both types of predictive accuracy should be unbiasedly validated using bootstrapping or cross-validation, before using predictions in a new data series. We discuss some of the hazards of poorly fitted and overfitted regression models and present one modelling strategy that avoids many of the problems discussed. The methods described are applicable to all regression models, but are particularly needed for binary, ordinal, and time-to-event outcomes. Methods are illustrated with a survival analysis in prostate cancer using Cox regression.
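The abstract does not name the discrimination index here. A common choice for censored survival data is the concordance (c) index, and the sketch below assumes that is the kind of index meant. It is a minimal illustration, not the authors' implementation: the function name `concordance_index`, the toy data, and the simplification of skipping pairs with tied survival times are all assumptions made for this example.

```python
def concordance_index(times, events, risk_scores):
    """Concordance (c) index for right-censored data (illustrative sketch).

    times        -- observed follow-up time for each subject
    events       -- 1 if the event was observed, 0 if the subject was censored
    risk_scores  -- model output where a HIGHER score means HIGHER predicted risk

    A pair (i, j) is usable only when subject i has the shorter time and
    subject i's event was actually observed; otherwise the ordering of the
    two true survival times is unknown. Pairs with tied times are skipped
    (a simplifying assumption of this sketch).
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            if times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0   # earlier failure had the higher predicted risk
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5   # tied predictions count as half
    if usable == 0:
        raise ValueError("no usable pairs: cannot compute the index")
    return concordant / usable


# Hypothetical toy data: four subjects, the third one censored at time 6.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk = [0.9, 0.7, 0.8, 0.1]
c = concordance_index(times, events, risk)
print(round(c, 3))
```

A value of 0.5 corresponds to random predictions and 1.0 to perfect ranking of subjects by risk. Computing the index on the same data used to fit the model tends to be optimistic, which is why the abstract calls for validation by bootstrapping or cross-validation.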
Citations
Journal Article
Prediction of Coronary Heart Disease Using Risk Factor Categories
Peter W.F. Wilson, Ralph B. D'Agostino, Daniel Levy, Albert M. Belanger, Halit Silbershatz, William B. Kannel
TL;DR: A simple coronary disease prediction algorithm was developed using categorical variables, which allows physicians to predict multivariate CHD risk in patients without overt CHD.
Journal Article
Predictive habitat distribution models in ecology
TL;DR: A review of predictive habitat distribution modeling is presented, which shows that a wide array of models has been developed to cover aspects as diverse as biogeography, conservation biology, climate change research, and habitat or species management.
Journal Article
General Cardiovascular Risk Profile for Use in Primary Care: The Framingham Heart Study
Ralph B. D'Agostino, Ramachandran S. Vasan, Michael J. Pencina, Philip A. Wolf, Mark R. Cobain, Joseph M. Massaro, William B. Kannel
TL;DR: A sex-specific multivariable risk factor algorithm can be conveniently used to assess general CVD risk and risk of individual CVD events (coronary, cerebrovascular, and peripheral arterial disease and heart failure) and can be used to quantify risk and to guide preventive care.
Journal Article
Evaluating the added predictive ability of a new marker: From area under the ROC curve to reclassification and beyond
TL;DR: Two new measures, one based on integrated sensitivity and specificity and the other on reclassification tables, are introduced. They offer incremental information over the AUC, and the authors propose that they be considered in addition to the AUC when assessing the performance of newer biomarkers.
Journal Article
Validation of Clinical Classification Schemes for Predicting Stroke: Results From the National Registry of Atrial Fibrillation
Brian F. Gage, Amy D. Waterman, William D. Shannon, Michael Boechler, Michael W. Rich, Martha J. Radford
TL;DR: The two existing classification schemes, and especially a new stroke risk index, CHADS2, can quantify risk of stroke for patients who have AF and may aid in selection of antithrombotic therapy.
References
Book
An Introduction to the Bootstrap
Bradley Efron, Robert Tibshirani
TL;DR: This article presents bootstrap methods for estimation, using simple arguments, with Minitab macros for implementing these methods, as well as some examples of how these methods could be used for estimation purposes.
Book
Applied Logistic Regression
David W. Hosmer, Stanley Lemeshow
TL;DR: Hosmer and Lemeshow provide an accessible introduction to the logistic regression model while incorporating advances of the last decade, including a variety of software packages for the analysis of data sets.
Journal Article
The meaning and use of the area under a receiver operating characteristic (ROC) curve.
TL;DR: A representation and interpretation of the area under a receiver operating characteristic (ROC) curve obtained by the "rating" method, or by mathematical predictions based on patient characteristics, is presented. In such a setting, the area represents the probability that a randomly chosen diseased subject is (correctly) rated or ranked with greater suspicion than a randomly chosen non-diseased subject.
Book
Principal Component Analysis
TL;DR: In this article, the authors present a graphical representation of data using Principal Component Analysis (PCA) for time series and other non-independent data, as well as a generalization and adaptation of principal component analysis.
Journal Article
Robust Locally Weighted Regression and Smoothing Scatterplots
TL;DR: Robust locally weighted regression is a method for smoothing a scatterplot, in which the fitted value at x_k is the value of a polynomial fit to the data using weighted least squares, where the weight for (x_i, y_i) is large if x_i is close to x_k and small if it is not.