Open Access Posted Content

High-Dimensional Asymptotics of Prediction: Ridge Regression and Classification

TLDR
This paper provides a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model, in a high-dimensional asymptotic regime, and finds that predictive accuracy has a nuanced dependence on the eigenvalue distribution of the covariance matrix.
Abstract
We provide a unified analysis of the predictive risk of ridge regression and regularized discriminant analysis in a dense random effects model. We work in a high-dimensional asymptotic regime where $p, n \to \infty$ and $p/n \to \gamma \in (0, \, \infty)$, and allow for arbitrary covariance among the features. For both methods, we provide an explicit and efficiently computable expression for the limiting predictive risk, which depends only on the spectrum of the feature-covariance matrix, the signal strength, and the aspect ratio $\gamma$. Especially in the case of regularized discriminant analysis, we find that predictive accuracy has a nuanced dependence on the eigenvalue distribution of the covariance matrix, suggesting that analyses based on the operator norm of the covariance matrix may not be sharp. Our results also uncover several qualitative insights about both methods: for example, with ridge regression, there is an exact inverse relation between the limiting predictive risk and the limiting estimation risk given a fixed signal strength. Our analysis builds on recent advances in random matrix theory.
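The limiting-risk formulas themselves require the paper's random-matrix machinery, but the regime is easy to see in simulation. Below is a minimal sketch (not the authors' code; the parameter names and the identity feature covariance are illustrative choices) showing that the out-of-sample risk of ridge regression stabilizes as $p, n \to \infty$ with $p/n = \gamma$ fixed, under a dense random-effects signal $\beta \sim N(0, (\alpha^2/p) I)$.

```python
# Minimal simulation sketch of ridge prediction risk in the proportional
# regime p/n -> gamma, under a dense random-effects model. Illustrative
# choices: identity feature covariance, alpha^2 = sigma^2 = lambda = 1.
import numpy as np

rng = np.random.default_rng(0)
gamma, alpha2, sigma2, lam = 2.0, 1.0, 1.0, 1.0  # aspect ratio, signal, noise, ridge

def ridge_pred_risk(n, n_test=2000):
    p = int(gamma * n)
    beta = rng.normal(scale=np.sqrt(alpha2 / p), size=p)   # dense random effects
    X = rng.normal(size=(n, p))
    y = X @ beta + rng.normal(scale=np.sqrt(sigma2), size=n)
    # ridge estimate: (X'X + n*lam*I)^{-1} X'y
    beta_hat = np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)
    X_test = rng.normal(size=(n_test, p))
    y_test = X_test @ beta + rng.normal(scale=np.sqrt(sigma2), size=n_test)
    return np.mean((y_test - X_test @ beta_hat) ** 2)

# The empirical risk stabilizes as n, p grow with p/n = gamma fixed,
# as the limiting-risk result predicts.
for n in (100, 200, 400, 800):
    print(n, ridge_pred_risk(n))
```

Replacing the standard-normal features with draws from an arbitrary covariance $\Sigma$ is the natural way to probe the spectrum dependence the abstract describes.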


Citations
Posted Content

Surprises in High-Dimensional Ridgeless Least Squares Interpolation.

TL;DR: This paper recovers, in a precise quantitative way, several phenomena that have been observed in large-scale neural networks and kernel machines, including the "double descent" behavior of the prediction risk and the potential benefits of overparametrization.
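As a hedged illustration (my sketch, not the paper's setup; the signal and noise levels are arbitrary choices), the double-descent curve can be reproduced by sweeping the model size of a minimum-norm least-squares interpolant past the interpolation threshold $p = n$:

```python
# Sketch of "double descent" for ridgeless (minimum-norm) least squares:
# excess test risk spikes near p/n = 1 and falls again for p >> n.
import numpy as np

rng = np.random.default_rng(1)
n, n_test, sigma = 100, 1000, 0.5

for p in (20, 50, 90, 100, 110, 200, 400, 1000):
    beta = rng.normal(size=p) / np.sqrt(p)      # dense signal, unit strength
    X = rng.normal(size=(n, p))
    y = X @ beta + sigma * rng.normal(size=n)
    beta_hat = np.linalg.pinv(X) @ y            # min-norm least squares fit
    X_test = rng.normal(size=(n_test, p))
    err = np.mean((X_test @ (beta_hat - beta)) ** 2)
    print(f"p/n = {p/n:5.2f}   excess risk = {err:.3f}")
```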
Journal ArticleDOI

High-dimensional regression adjustments in randomized experiments

TL;DR: This work studies the problem of treatment effect estimation in randomized experiments with high-dimensional covariate information and shows that essentially any risk-consistent regression adjustment can be used to obtain efficient estimates of the average treatment effect.
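A sketch of the idea under stated assumptions: in a Bernoulli(1/2)-randomized experiment, a risk-consistent fit (here a cross-validated lasso, purely as an example of such an adjustment, not the paper's specific estimator) is plugged into an augmented regression-adjusted estimate of the average treatment effect. All names and parameter values below are illustrative.

```python
# Illustrative regression-adjusted ATE estimate in a randomized experiment
# with high-dimensional covariates (p > n), known propensity 1/2.
import numpy as np
from sklearn.linear_model import LassoCV

rng = np.random.default_rng(2)
n, p, tau = 400, 600, 1.0                     # tau = true average treatment effect
X = rng.normal(size=(n, p))
beta = np.zeros(p); beta[:10] = 1.0           # sparse baseline response
W = rng.integers(0, 2, size=n)                # Bernoulli(1/2) treatment assignment
Y = X @ beta + tau * W + rng.normal(size=n)

mu1 = LassoCV(cv=5).fit(X[W == 1], Y[W == 1])  # fitted response under treatment
mu0 = LassoCV(cv=5).fit(X[W == 0], Y[W == 0])  # fitted response under control
m1, m0 = mu1.predict(X), mu0.predict(X)
# Augmented (doubly-robust style) estimator; 2 = 1/propensity = 1/(1/2):
tau_hat = np.mean(m1 - m0) \
    + np.mean(np.where(W == 1, 2 * (Y - m1), 0)) \
    - np.mean(np.where(W == 0, 2 * (Y - m0), 0))
print("adjusted ATE estimate:", tau_hat)
```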
Posted Content

Optimal Regularization Can Mitigate Double Descent

TL;DR: This work proves that for certain linear regression models with an isotropic data distribution, optimally tuned $\ell_2$ regularization achieves monotonic test performance as either the sample size or the model size grows, and demonstrates empirically that optimally tuned regularization can mitigate double descent for more general models, including neural networks.
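A minimal sketch of the claim (assumptions mine: isotropic Gaussian design, dense signal, a small $\lambda$ grid): at each model size, the best-tuned ridge risk stays roughly flat where the ridgeless risk spikes near $p = n$.

```python
# Sketch: tuning the ridge penalty removes the double-descent spike.
import numpy as np

rng = np.random.default_rng(3)
n, n_test, sigma = 100, 1000, 0.5
lams = [0.0, 1e-3, 1e-2, 1e-1, 1.0, 10.0]     # illustrative lambda grid

def test_risk(p, lam):
    beta = rng.normal(size=p) / np.sqrt(p)
    X = rng.normal(size=(n, p))
    y = X @ beta + sigma * rng.normal(size=n)
    if lam == 0.0:
        b = np.linalg.pinv(X) @ y              # ridgeless (min-norm) limit
    else:
        b = np.linalg.solve(X.T @ X + n * lam * np.eye(p), X.T @ y)
    X_test = rng.normal(size=(n_test, p))
    return np.mean((X_test @ (b - beta)) ** 2)

for p in (50, 90, 100, 110, 200, 400):
    risks = [test_risk(p, lam) for lam in lams]
    print(f"p={p:4d}  ridgeless={risks[0]:8.3f}  best-tuned={min(risks):.3f}")
```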
Posted Content

Benign overfitting in ridge regression

TL;DR: This work provides non-asymptotic generalization bounds for overparametrized ridge regression that depend on the arbitrary covariance structure of the data, and shows that those bounds are tight for a range of regularization parameter values.
Posted Content

lassopack: Model selection and prediction with regularized regression in Stata

TL;DR: lassopack is a suite of Stata programs for regularized regression that implements the lasso, square-root lasso, elastic net, ridge regression, and post-estimation OLS.
References
Journal Article

ImageNet Large Scale Visual Recognition Challenge

TL;DR: The ImageNet Large Scale Visual Recognition Challenge (ILSVRC) is a benchmark in object category classification and detection on hundreds of object categories and millions of images; it has been run annually from 2010 to the present, attracting participation from more than fifty institutions.
Book

An Introduction to Multivariate Statistical Analysis

TL;DR: This book develops the distribution of the mean vector and the covariance matrix and of the generalized T2-statistic, and treats testing the independence of sets of variates.
Journal Article

Ridge regression: biased estimation for nonorthogonal problems

TL;DR: This paper proposes ridge regression, an estimation procedure based on adding small positive quantities to the diagonal of X′X, together with the ridge trace, a method for showing in two dimensions the effects of nonorthogonality.
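A minimal sketch of the estimator described above, with illustrative data: the ridge estimate is obtained by adding a positive constant k to the diagonal of X′X, and sweeping k traces out the coefficient paths (the ridge trace).

```python
# Ridge estimation by augmenting the diagonal of X'X, plus a tiny
# demonstration on nearly collinear (nonorthogonal) columns.
import numpy as np

def ridge(X, y, k):
    """Solve (X'X + kI) beta = X'y for a given ridge constant k."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + k * np.eye(p), X.T @ y)

rng = np.random.default_rng(4)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + 0.01 * rng.normal(size=50)])  # nearly collinear
y = X @ np.array([1.0, 1.0]) + 0.1 * rng.normal(size=50)
for k in (0.0, 0.01, 0.1, 1.0):
    print(k, ridge(X, y, k))   # coefficients stabilize as k grows (ridge trace)
```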
Book Chapter

On the Uniform Convergence of Relative Frequencies of Events to Their Probabilities

TL;DR: This chapter reproduces B. Seckler's English translation of the paper in which Vapnik and Chervonenkis gave proofs for the innovative results they had obtained in draft form in July 1966 and announced in 1968 in their note in Soviet Mathematics Doklady.
Journal Article

The Dantzig selector: Statistical estimation when p is much larger than n

TL;DR: In many important statistical applications, the number of variables or parameters p is much larger than the number of observations n; this paper shows that the proposed Dantzig selector can nevertheless estimate β reliably from the noisy data y.
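The Dantzig selector minimizes $\|\beta\|_1$ subject to $\|X^\top(y - X\beta)\|_\infty \le \lambda$, which is a linear program. Below is a hedged sketch (not the authors' implementation; problem sizes and $\lambda$ are illustrative) using the standard split $\beta = u - v$ with $u, v \ge 0$.

```python
# LP sketch of the Dantzig selector via scipy's linprog.
import numpy as np
from scipy.optimize import linprog

def dantzig_selector(X, y, lam):
    n, p = X.shape
    G, g = X.T @ X, X.T @ y
    c = np.ones(2 * p)                         # objective: sum(u) + sum(v)
    A_ub = np.vstack([np.hstack([G, -G]),      #  G(u - v) <= lam + X'y
                      np.hstack([-G, G])])     # -G(u - v) <= lam - X'y
    b_ub = np.concatenate([lam + g, lam - g])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=(0, None), method="highs")
    assert res.success
    u, v = res.x[:p], res.x[p:]
    return u - v

rng = np.random.default_rng(5)
n, p = 50, 200                                 # p much larger than n
beta = np.zeros(p); beta[:5] = 3.0             # sparse ground truth
X = rng.normal(size=(n, p)) / np.sqrt(n)       # roughly unit-norm columns
y = X @ beta + 0.1 * rng.normal(size=n)
beta_hat = dantzig_selector(X, y, lam=0.4)
print("largest recovered coefficients:", np.sort(np.abs(beta_hat))[-5:])
```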