Open Access Journal Article

The sparsity and bias of the Lasso selection in high-dimensional linear regression

Cun-Hui Zhang, +1 more
01 Aug 2008 - Vol. 36, Iss. 4, pp. 1567–1594
TL;DR
This article showed that the LASSO selects a model of the correct order of dimensionality, controls the bias of the selected model at a level determined by the contributions of small regression coefficients and threshold bias, and selects all coefficients of greater order than the bias.
Abstract
Meinshausen and Bühlmann [Ann. Statist. 34 (2006) 1436–1462] showed that, for neighborhood selection in Gaussian graphical models, under a neighborhood stability condition, the LASSO is consistent even when the number of variables is of greater order than the sample size. Zhao and Yu [J. Mach. Learn. Res. 7 (2006) 2541–2567] formalized the neighborhood stability condition in the context of linear regression as a strong irrepresentable condition and showed that under this condition the LASSO selects exactly the set of nonzero regression coefficients, provided that these coefficients are bounded away from zero at a certain rate. In this paper, the regression coefficients outside an ideal model are assumed to be small, but not necessarily zero. Under a sparse Riesz condition on the correlation of design variables, we prove that the LASSO selects a model of the correct order of dimensionality, controls the bias of the selected model at a level determined by the contributions of small regression coefficients and threshold bias, and selects all coefficients of greater order than the bias of the selected model. Moreover, as a consequence of this rate consistency of the LASSO in model selection, it is proved that the sum of error squares for the mean response and the ℓα-loss for the regression coefficients converge at the best possible rates under the given conditions. An interesting aspect of our results is that the logarithm of the number of variables can be of the same order as the sample size for certain random dependent designs.
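To make the sparsity and bias statements concrete, here is a small illustrative simulation, not taken from the paper: a design with more variables than observations, a handful of large coefficients forming the "ideal model", many small coefficients outside it, and a lasso fit at a threshold-level penalty. All names (n, p, q, lam) and the data-generating choices are assumptions made for this sketch.

import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
n, p, q = 200, 500, 10            # sample size, dimension, size of the "ideal model"
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:q] = 2.0                    # large coefficients (the ideal model)
beta[q:q + 50] = 0.02             # small coefficients outside the ideal model
y = X @ beta + rng.standard_normal(n)

lam = np.sqrt(2 * np.log(p) / n)  # penalty at the usual sqrt(2 log p / n) threshold level
fit = Lasso(alpha=lam).fit(X, y)

selected = np.flatnonzero(fit.coef_)
print("ideal model size:", q)
print("selected model size:", selected.size)                      # same order as q
print("large coefficients selected:", np.isin(np.arange(q), selected).sum())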



Citations
Journal Article

Sure independence screening for ultrahigh dimensional feature space

TL;DR: In this article, the authors introduce the concept of sure screening and propose a screening method based on correlation learning, called sure independence screening, to reduce dimensionality from ultrahigh to a moderate scale below the sample size.
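A minimal sketch of the correlation-based screening step, under the assumptions that columns are standardized and that the kept model size d follows the common n / log(n) convention; the function name and defaults are illustrative, not the paper's notation.

import numpy as np

def sis_screen(X, y, d=None):
    """Rank variables by absolute marginal correlation with y; keep the top d."""
    n, p = X.shape
    if d is None:
        d = int(n / np.log(n))            # a common default, not a prescription
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)
    ys = (y - y.mean()) / y.std()
    corr = np.abs(Xs.T @ ys) / n          # componentwise sample correlations
    return np.argsort(corr)[::-1][:d]     # indices of the d most correlated variables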
Posted Content

An Overview of Multi-Task Learning in Deep Neural Networks

Sebastian Ruder
15 Jun 2017
TL;DR: This article seeks to help ML practitioners apply MTL by shedding light on how MTL works and providing guidelines for choosing appropriate auxiliary tasks, particularly in deep neural networks.
Journal Article

Extended Bayesian information criteria for model selection with large model spaces

TL;DR: This paper re-examines the Bayesian paradigm for model selection and proposes an extended family of Bayesian information criteria that take into account both the number of unknown parameters and the complexity of the model space.
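A hedged sketch of an extended BIC-type score for a Gaussian linear model: the usual BIC terms plus a penalty that grows with the size of the model space, here through the log of the number of candidate models with k of p variables. The function name, arguments, and default gamma are assumptions of the sketch.

import numpy as np
from scipy.special import gammaln

def ebic(rss, n, p, k, gamma=0.5):
    """Extended BIC-type score for a model with k of p variables and residual sum of squares rss."""
    log_n_models = gammaln(p + 1) - gammaln(k + 1) - gammaln(p - k + 1)   # log C(p, k)
    return n * np.log(rss / n) + k * np.log(n) + 2 * gamma * log_n_models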
Proceedings Article

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

TL;DR: A unified framework for establishing consistency and convergence rates of regularized M-estimators under high-dimensional scaling is provided; one main theorem is stated and shown to recover several existing results as well as yield several new ones.
References
Journal Article

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models, called the lasso, is proposed; it minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant.
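As an illustration of the penalized (Lagrangian) form of this constrained problem, here is a compact cyclic coordinate-descent sketch with soft-thresholding. It is written for clarity rather than efficiency, and the function names and the 1/(2n) scaling of the squared-error term are assumptions of the sketch, not the paper's notation.

import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def lasso_cd(X, y, lam, n_iter=100):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y.astype(float).copy()                   # residual y - X beta (beta starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            r += X[:, j] * beta[j]               # partial residual excluding variable j
            z = X[:, j] @ r / n
            beta[j] = soft_threshold(z, lam) / (X[:, j] @ X[:, j] / n)
            r -= X[:, j] * beta[j]
    return beta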
Journal Article

The adaptive lasso and its oracle properties

TL;DR: A new version of the lasso is proposed, called the adaptive lasso, where adaptive weights are used for penalizing different coefficients in the ℓ1 penalty, and the nonnegative garotte is shown to be consistent for variable selection.
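A common way to handle the weighted ℓ1 penalty is to rescale the design columns by the adaptive weights and call an ordinary lasso solver; the sketch below does this with a ridge fit as the initial estimator, which is one of several reasonable choices rather than the paper's prescription. The function name, the ridge initializer, and the small constant guarding against zero initial coefficients are assumptions of the sketch.

import numpy as np
from sklearn.linear_model import Lasso, Ridge

def adaptive_lasso(X, y, lam, gamma=1.0):
    """Weighted l1 penalty handled by rescaling columns and calling a plain lasso solver."""
    beta_init = Ridge(alpha=1.0).fit(X, y).coef_          # one possible initial estimator
    w = 1.0 / (np.abs(beta_init) ** gamma + 1e-8)         # adaptive weights
    Xw = X / w                                            # divide column j by w_j
    fit = Lasso(alpha=lam).fit(Xw, y)
    return fit.coef_ / w                                  # map back to the original scale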
Journal Article

High-dimensional graphs and variable selection with the Lasso

TL;DR: It is shown that neighborhood selection with the Lasso, which amounts to variable selection in a Gaussian linear model for each node, is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs.
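A minimal sketch of the node-wise regression idea: fit a lasso of each variable on all the others and take the nonzero coefficients as its estimated neighborhood, then combine neighborhoods into an edge set (here with an OR rule, one common choice). The function name and the combination rule are assumptions of the sketch.

import numpy as np
from sklearn.linear_model import Lasso

def neighborhood_selection(X, lam):
    """Regress each variable on the others with the lasso; nonzero coefficients define its neighborhood."""
    n, p = X.shape
    adj = np.zeros((p, p), dtype=bool)
    for j in range(p):
        others = np.delete(np.arange(p), j)
        coef = Lasso(alpha=lam).fit(X[:, others], X[:, j]).coef_
        adj[j, others] = coef != 0
    return adj | adj.T                        # OR rule for combining neighborhoods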
Journal Article

On Model Selection Consistency of Lasso

TL;DR: It is proved that a single condition, which is called the Irrepresentable Condition, is almost necessary and sufficient for Lasso to select the true model both in the classical fixed p setting and in the large p setting as the sample size n gets large.
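One common statement of the strong irrepresentable condition, for the Gram matrix C = X'X/n partitioned over the true support S and its complement, is ||C_{S^c,S} C_{S,S}^{-1} sign(beta_S)||_inf <= 1 - eta for some eta > 0. The sketch below computes the slack in this inequality for a given design and support; the function name and return convention are illustrative assumptions.

import numpy as np

def irrepresentable_slack(X, support, sign_beta_S):
    """Return 1 - ||C_{S^c,S} C_{S,S}^{-1} sign(beta_S)||_inf for C = X'X / n."""
    n, p = X.shape
    C = X.T @ X / n
    S = np.asarray(support)
    Sc = np.setdiff1d(np.arange(p), S)
    v = C[np.ix_(Sc, S)] @ np.linalg.solve(C[np.ix_(S, S)], sign_beta_S)
    return 1.0 - np.abs(v).max()              # positive slack means the condition holds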