Improvements on Cross-Validation: The .632+ Bootstrap Method

doi:10.1080/01621459.1997.10474007

Journal Article

Bradley Efron, +1 more

- 01 Jun 1997 -

Journal of the American Statistical Asso...

- Vol. 92, Iss: 438, pp 548-560

TLDR

A particular bootstrap method, the .632+ rule, is shown to substantially outperform cross-validation in a catalog of 24 simulation experiments; the paper also considers estimating the variability of an error rate estimate.

Abstract:

A training set of data has been used to construct a rule for predicting future responses. What is the error rate of this rule? This is an important question both for comparing models and for assessing a final selected model. The traditional answer to this question is given by cross-validation. The cross-validation estimate of prediction error is nearly unbiased but can be highly variable. Here we discuss bootstrap estimates of prediction error, which can be thought of as smoothed versions of cross-validation. We show that a particular bootstrap method, the .632+ rule, substantially outperforms cross-validation in a catalog of 24 simulation experiments. Besides providing point estimates, we also consider estimating the variability of an error rate estimate. All of the results here are nonparametric and apply to any possible prediction rule; however, we study only classification problems with 0–1 loss in detail. Our simulations include “smooth” prediction rules like Fisher's linear discriminant fun...
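The abstract does not spell out the estimator, so the following is only a minimal Python sketch of how a .632+ style estimate could be computed for 0-1 loss, following the commonly cited form of the rule: the apparent (training) error and the leave-one-out bootstrap error are blended with a weight that grows with the relative overfitting rate. The function name `err632plus`, the `fit` callback convention, the toy nearest-centroid classifier, and the clipping of the bootstrap error at the no-information rate are assumptions made for illustration, not the authors' code.

```python
import random


def zero_one_loss(y_true, y_pred):
    return 0.0 if y_true == y_pred else 1.0


def err632plus(xs, ys, fit, n_boot=50, seed=0):
    """Sketch of a .632+ style error estimate for 0-1 loss.

    `fit(xs, ys)` is a hypothetical callback that returns predict(x) -> label.
    """
    rng = random.Random(seed)
    n = len(xs)

    # Apparent error: the rule is fitted and evaluated on the same data.
    rule = fit(xs, ys)
    preds = [rule(x) for x in xs]
    err_bar = sum(zero_one_loss(y, p) for y, p in zip(ys, preds)) / n

    # No-information error rate: loss over all (response, prediction) pairings.
    gamma = sum(zero_one_loss(y, p) for y in ys for p in preds) / (n * n)

    # Leave-one-out bootstrap error: each point is scored only by rules
    # trained on bootstrap samples that do not contain it.
    loss_sum = [0.0] * n
    count = [0] * n
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        in_sample = set(idx)
        boot_rule = fit([xs[i] for i in idx], [ys[i] for i in idx])
        for i in range(n):
            if i not in in_sample:
                loss_sum[i] += zero_one_loss(ys[i], boot_rule(xs[i]))
                count[i] += 1
    scored = [loss_sum[i] / count[i] for i in range(n) if count[i] > 0]
    if not scored:
        raise ValueError("no point was ever left out; increase n_boot or n")
    err1 = sum(scored) / len(scored)

    # Assumption: clip at gamma so the overfitting rate r stays in [0, 1].
    err1_clipped = min(err1, gamma)
    if err1_clipped > err_bar and gamma > err_bar:
        r = (err1_clipped - err_bar) / (gamma - err_bar)
    else:
        r = 0.0

    # The weight moves from .632 (no overfitting) to 1 (r = 1).
    w = 0.632 / (1.0 - 0.368 * r)
    return (1.0 - w) * err_bar + w * err1_clipped


def fit_nearest_centroid(xs, ys):
    """Toy 1-D classifier used only to exercise the sketch."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    centroids = {label: sum(v) / len(v) for label, v in groups.items()}
    return lambda x: min(centroids, key=lambda label: abs(x - centroids[label]))


if __name__ == "__main__":
    xs = [0.1, 0.4, 0.5, 0.9, 1.1, 1.3, 1.6, 2.0, 2.2, 2.5]
    ys = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]
    print(err632plus(xs, ys, fit_nearest_centroid, n_boot=100, seed=0))
```

With the weight fixed at .632 this reduces to the plain .632 estimate; the adaptive weight is what lets the estimate lean toward the bootstrap error when the fitted rule overfits heavily.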

Citations

Journal Article

Differential expression analysis for sequence count data.

Simon Anders, +1 more

- 27 Oct 2010 -

Genome Biology

TL;DR: A method based on the negative binomial distribution, with variance and mean linked by local regression, is proposed and an implementation, DESeq, as an R/Bioconductor package is presented.

Journal Article

Least angle regression

Bradley Efron, +19 more

- 01 Apr 2004 -

Annals of Statistics

TL;DR: A publicly available algorithm that requires only the same order of magnitude of computational effort as ordinary least squares applied to the full set of covariates is described.

Book

Regression Modeling Strategies

Frank E. Harrell Jr.

TL;DR: Regression models are frequently used to develop diagnostic, prognostic, and health resource utilization models in clinical, health services, outcomes, pharmacoeconomic, and epidemiologic research, and in a multitude of non-health-related areas.

Book

Applied Predictive Modeling

Max Kuhn, +1 more

TL;DR: This research presents a novel and scalable approach called “Smartfitting” that automates the labor-intensive, time-consuming, and therefore expensive process of designing and implementing statistical models for regression.

Journal Article

An introduction to kernel-based learning algorithms

Klaus-Robert Müller, +4 more

- 01 Mar 2001 -

IEEE Transactions on Neural Networks

TL;DR: This paper provides an introduction to support vector machines, kernel Fisher discriminant analysis, and kernel principal component analysis, as examples for successful kernel-based learning methods.

References

Book

An introduction to the bootstrap

Bradley Efron, +1 more

TL;DR: This article presents bootstrap methods for estimation, using simple arguments, with Minitab macros for implementing these methods, as well as some examples of how these methods could be used for estimation purposes.

Journal Article

Classification and Regression Trees.

John Van Ryzin, +4 more

- 01 Mar 1986 -

Journal of the American Statistical Asso...

Journal Article

Bagging predictors

Leo Breiman

TL;DR: Tests on real and simulated data sets using classification and regression trees and subset selection in linear regression show that bagging can give substantial gains in accuracy.

Book

Classification and regression trees

Leo Breiman

TL;DR: This monograph focuses on the methodology used to construct tree-structured rules, covering the use of trees as a data analysis method and, in a more mathematical framework, proving some of their fundamental properties.

Journal Article

Bootstrap Methods: Another Look at the Jackknife

Bradley Efron

- 01 Jan 1979 -

Annals of Statistics

TL;DR: In this article, the author discusses the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.
