Open Access · Journal Article (DOI)

Predictive inference with the jackknife

TLDR
In this article, the authors introduce the jackknife+, a method for constructing predictive confidence intervals that uses the leave-one-out predictions at the test point to account for variability in the fitted regression function. Assuming exchangeable training samples, this crucial modification permits rigorous coverage guarantees regardless of the distribution of the data points, for any algorithm that treats the training points symmetrically.
Abstract
This paper introduces the jackknife+, which is a novel method for constructing predictive confidence intervals. Whereas the jackknife outputs an interval centered at the predicted response of a test point, with the width of the interval determined by the quantiles of leave-one-out residuals, the jackknife+ also uses the leave-one-out predictions at the test point to account for the variability in the fitted regression function. Assuming exchangeable training samples, we prove that this crucial modification permits rigorous coverage guarantees regardless of the distribution of the data points, for any algorithm that treats the training points symmetrically. Such guarantees are not possible for the original jackknife, and we demonstrate examples where the coverage rate may actually vanish. Our theoretical and empirical analysis reveals that the jackknife and the jackknife+ intervals achieve nearly exact coverage and have similar lengths whenever the fitting algorithm obeys some form of stability. Further, we extend the jackknife+ to $K$-fold cross validation and similarly establish rigorous coverage properties. Our methods are related to cross-conformal prediction proposed by Vovk (Ann Math Artif Intell 74 (2015) 9–28), and we discuss connections.
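The construction described in the abstract can be sketched concretely: for each training point $i$, fit the regressor on the data with point $i$ held out, record the leave-one-out residual $R_i$ and the leave-one-out prediction at the test point, then take empirical quantiles of the shifted predictions. Below is a minimal NumPy sketch, assuming an ordinary least-squares base regressor; the function name and interface are illustrative, not from the paper.

```python
import numpy as np

def jackknife_plus_interval(X, y, x_test, alpha=0.1):
    """Jackknife+ prediction interval with a least-squares base regressor.

    For each i: fit on the data minus point i, record the leave-one-out
    residual R_i = |y_i - mu_{-i}(x_i)| and the leave-one-out prediction
    mu_{-i}(x_test), then take empirical quantiles of
    mu_{-i}(x_test) - R_i (lower) and mu_{-i}(x_test) + R_i (upper).
    """
    n = len(y)
    lo, hi = np.empty(n), np.empty(n)
    for i in range(n):
        mask = np.arange(n) != i
        # least-squares fit with intercept on the remaining n-1 points
        A = np.column_stack([np.ones(n - 1), X[mask]])
        beta, *_ = np.linalg.lstsq(A, y[mask], rcond=None)
        mu_i = beta[0] + X[i] @ beta[1:]        # LOO prediction at x_i
        r_i = abs(y[i] - mu_i)                  # LOO residual
        mu_test = beta[0] + x_test @ beta[1:]   # LOO prediction at x_test
        lo[i], hi[i] = mu_test - r_i, mu_test + r_i
    # order statistics at levels floor(alpha*(n+1)) and ceil((1-alpha)*(n+1))
    k_lo = int(np.floor(alpha * (n + 1)))
    k_hi = int(np.ceil((1 - alpha) * (n + 1)))
    return np.sort(lo)[max(k_lo - 1, 0)], np.sort(hi)[min(k_hi - 1, n - 1)]
```

Replacing the least-squares fit with any symmetric fitting algorithm preserves the distribution-free coverage guarantee; the $O(n)$ refits are the method's main computational cost, which the paper's CV+ extension reduces by using $K$-fold rather than leave-one-out splits.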



Citations
Posted Content

Classification with Valid and Adaptive Coverage

TL;DR: A novel conformity score is developed and explicitly demonstrated to be powerful and intuitive for classification problems, though its underlying principle is potentially far more general.
Posted Content

The Augmented Synthetic Control Method

TL;DR: Augmented SCM, as proposed in this paper, uses an outcome model to estimate the bias due to imperfect pre-treatment fit and then de-biases the original SCM estimate; the resulting estimator can be expressed as a solution to a modified synthetic control problem that allows negative weights on some donor units.
Journal ArticleDOI

Distribution-Free, Risk-Controlling Prediction Sets

TL;DR: In this paper, a method is proposed for generating set-valued predictions from a black-box predictor that control the expected loss on future test points at a user-specified level.
Posted Content

Nested conformal prediction and quantile out-of-bag ensemble methods

TL;DR: This work provides an alternate view of conformal prediction that starts with a sequence of nested sets and calibrates them to find a valid prediction region, and uses the framework to derive a new algorithm that combines four ideas: quantile regression, cross-conformalization, ensemble methods and out-of-bag predictions.
Posted Content

Conformal Inference of Counterfactuals and Individual Treatment Effects

TL;DR: This work proposes a conformal inference-based approach that can produce reliable interval estimates for counterfactuals and individual treatment effects under the potential outcome framework and achieves the desired coverage with reasonably short intervals.
References
Journal Article

Scikit-learn: Machine Learning in Python

TL;DR: Scikit-learn is a Python module integrating a wide range of state-of-the-art machine learning algorithms for medium-scale supervised and unsupervised problems, focusing on bringing machine learning to non-specialists using a general-purpose high-level language.
Journal ArticleDOI

Bootstrap Methods: Another Look at the Jackknife

TL;DR: In this article, the authors discuss the problem of estimating the sampling distribution of a pre-specified random variable R(X, F) on the basis of the observed data x.
Journal ArticleDOI

Cross-Validatory Choice and Assessment of Statistical Predictions

TL;DR: In this article, a generalized form of the cross-validation criterion is applied to the choice and assessment of prediction using the data-analytic concept of a prescription, and examples used to illustrate the application are drawn from the problem areas of univariate estimation, linear regression and analysis of variance.