Open Access · Journal Article · DOI

A bias correction for the minimum error rate in cross-validation

Ryan J. Tibshirani, +1 more
01 Jun 2009 · Vol. 3, Iss. 2, pp. 822-829
TLDR
A simple method is proposed for estimating the downward bias of the minimum cross-validation error as an estimate of the test error at that same value of the tuning parameter.
Abstract
Tuning parameters in supervised learning problems are often estimated by cross-validation. The minimum value of the cross-validation error can be biased downward as an estimate of the test error at that same value of the tuning parameter. We propose a simple method for the estimation of this bias that uses information from the cross-validation process. As a result, it requires essentially no additional computation. We apply our bias estimate to a number of popular classifiers in various settings, and examine its performance.
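The abstract does not spell out the estimator, but the idea can be illustrated with a short sketch. The Python snippet below assumes the per-fold validation errors over a grid of tuning-parameter values were recorded during cross-validation; the function name min_cv_error_bias and the toy data are illustrative, not taken from the paper. One natural bias estimate of this kind averages, over folds, the gap between each fold's error at the globally selected parameter and at that fold's own best parameter, so it reuses quantities the cross-validation run already produced.

```python
import numpy as np

def min_cv_error_bias(fold_errors):
    """Sketch of a bias estimate for the minimum cross-validation error.

    fold_errors : array of shape (K, J) holding the validation error of
    fold k at the j-th value of the tuning-parameter grid.
    """
    fold_errors = np.asarray(fold_errors, dtype=float)
    mean_curve = fold_errors.mean(axis=0)      # usual CV error curve over the grid
    j_hat = int(np.argmin(mean_curve))         # globally selected tuning index
    per_fold_best = fold_errors.min(axis=1)    # each fold's own minimum error
    bias_hat = np.mean(fold_errors[:, j_hat] - per_fold_best)
    return mean_curve[j_hat], bias_hat

# Toy usage: 5 folds, 10 grid values of the tuning parameter.
rng = np.random.default_rng(0)
errs = 0.3 + 0.05 * rng.standard_normal((5, 10))
cv_min, bias = min_cv_error_bias(errs)
print(f"min CV error: {cv_min:.3f}, bias-corrected: {cv_min + bias:.3f}")
```

As in the abstract, nothing beyond the per-fold error curves is needed, so the correction costs essentially no extra computation.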



Citations
Journal Article · DOI

Cross-validation pitfalls when selecting and assessing regression and classification models

TL;DR: An algorithm for repeated grid-search V-fold cross-validation for parameter tuning in classification and regression, and a repeated nested cross-validation algorithm for model assessment, are described and evaluated.
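As a rough illustration of the nested cross-validation idea in the TL;DR above, the sketch below separates hyperparameter tuning (inner loop, grid search) from model assessment (outer loop). It uses scikit-learn's GridSearchCV and cross_val_score with an SVM on a built-in dataset as stand-ins; these choices are assumptions of the sketch, not the algorithm evaluated in the cited paper, and the repeated variant would simply rerun the outer loop over different random splits.

```python
# Minimal nested cross-validation sketch: the inner loop tunes hyperparameters
# by grid search, the outer loop estimates the error of the whole procedure.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

inner_cv = KFold(n_splits=5, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=5, shuffle=True, random_state=1)

param_grid = {"C": [0.1, 1, 10], "gamma": [1e-3, 1e-4]}
tuned_model = GridSearchCV(SVC(), param_grid, cv=inner_cv)  # inner loop: tuning

# Outer loop: each fold refits the tuned model and scores it on held-out data.
outer_scores = cross_val_score(tuned_model, X, y, cv=outer_cv)
print("nested CV accuracy: %.3f +/- %.3f" % (outer_scores.mean(), outer_scores.std()))
```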
Journal Article · DOI

A survey of Bayesian predictive methods for model assessment, selection and comparison

TL;DR: A unified review of Bayesian predictive model assessment and selection methods, and of methods closely related to them, with an emphasis on how each method approximates the expected utility of using a Bayesian model for the purpose of predicting future data.
Journal Article · DOI

Large-Scale Inference: Empirical Bayes Methods for Estimation, Testing, and Prediction

TL;DR: Technical aspects are not the focus of Principles of Applied Statistics, which also explains why it does not dwell intently on nonparametric models.
Journal Article · DOI

Automatic classification of prostate cancer Gleason scores from multiparametric magnetic resonance images

TL;DR: Machine learning-based automatic classification of prostate cancer aggressiveness, combining apparent diffusion coefficient (ADC) and T2-weighted (T2-w) MRI-based texture features with sample augmentation, is shown to yield reasonably accurate classification of Gleason patterns.
Journal Article · DOI

Shrinking the cross-section

TL;DR: In this paper, a robust stochastic discount factor (SDF) summarizing the joint explanatory power of a large number of cross-sectional stock return predictors is proposed.
References
Book

An introduction to the bootstrap

TL;DR: This article presents bootstrap methods for estimation using simple arguments, along with Minitab macros for implementing the methods and examples of their use.
Book

Classification and regression trees

Leo Breiman
TL;DR: The methodology used to construct tree-structured rules is the focus of this monograph, which covers the use of trees as a data analysis method and, in a more mathematical framework, proves some of their fundamental properties.
Journal Article · DOI

Bootstrap Methods: Another Look at the Jackknife

TL;DR: In this article, the authors discuss the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.
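For readers who want the basic recipe behind this reference, here is a minimal nonparametric-bootstrap sketch that approximates the sampling distribution of a statistic computed from the observed data by resampling with replacement; the choice of the median as the statistic, the exponential toy data, and the helper name bootstrap_distribution are illustrative assumptions, not details from the cited paper.

```python
import numpy as np

def bootstrap_distribution(x, statistic, n_boot=2000, seed=0):
    """Approximate the sampling distribution of statistic(x) by resampling
    the observed data with replacement (nonparametric bootstrap)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    reps = np.empty(n_boot)
    for b in range(n_boot):
        reps[b] = statistic(rng.choice(x, size=x.size, replace=True))
    return reps

# Illustrative use: standard error and percentile interval for the median.
data = np.random.default_rng(1).exponential(scale=2.0, size=50)
reps = bootstrap_distribution(data, np.median)
print("bootstrap SE:", reps.std(ddof=1))
print("95% percentile interval:", np.percentile(reps, [2.5, 97.5]))
```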
Journal Article · DOI

An Introduction to the Bootstrap

Scott D. Grimshaw
01 Aug 1995
TL;DR: Statistical theory attacks the problem from both ends: it provides optimal methods for finding a real signal in a noisy background, and strict checks against the overinterpretation of random patterns.