Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$-Balls
TLDR
In this paper, the authors study the minimax rates of convergence for estimating $\beta^*$ in either $\ell_2$-loss or $\ell_2$-prediction loss, assuming that $\beta^*$ belongs to an $\ell_q$-ball $\mathbb{B}_q(R_q)$ for some $q \in [0, 1]$.
Abstract
Consider the high-dimensional linear regression model $y = X\beta^* + w$, where $y \in \mathbb{R}^n$ is an observation vector, $X \in \mathbb{R}^{n \times d}$ is a design matrix with $d > n$, $\beta^* \in \mathbb{R}^d$ is an unknown regression vector, and $w \sim N(0, \sigma^2 I)$ is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating $\beta^*$ in either $\ell_2$-loss or $\ell_2$-prediction loss, assuming that $\beta^*$ belongs to an $\ell_q$-ball $\mathbb{B}_q(R_q)$ for some $q \in [0, 1]$. It is shown that under suitable regularity conditions on the design matrix $X$, the minimax optimal rate in both $\ell_2$-loss and $\ell_2$-prediction loss scales as $\Theta\bigl(R_q (\log d / n)^{1 - q/2}\bigr)$. The analysis reveals that conditions on the design matrix $X$ enter into the rates for $\ell_2$-error and $\ell_2$-prediction error in complementary ways in the upper and lower bounds. The proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls $\mathbb{B}_q(R_q)$, whereas the proofs of the upper bounds are constructive, involving direct analysis of least squares over $\ell_q$-balls. For the special case $q = 0$, corresponding to models with an exact sparsity constraint, the results show that although computationally efficient $\ell_1$-based methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix $X$ than optimal algorithms involving least squares over the $\ell_0$-ball.
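The rate claimed in the abstract can be written in display form; the $\ell_q$-ball below carries its standard definition (for $q = 0$, the sum counts the nonzero entries of $\beta$):

```latex
\inf_{\hat{\beta}} \; \sup_{\beta^* \in \mathbb{B}_q(R_q)}
  \mathbb{E}\,\|\hat{\beta} - \beta^*\|_2^2
  \;\asymp\; R_q \left( \frac{\log d}{n} \right)^{1 - q/2},
\qquad
\mathbb{B}_q(R_q) \;=\; \Bigl\{ \beta \in \mathbb{R}^d : \textstyle\sum_{j=1}^{d} |\beta_j|^q \le R_q \Bigr\}.
```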
Citations
Proceedings Article
A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers
TL;DR: A unified framework for establishing consistency and convergence rates for regularized M-estimators under high-dimensional scaling is provided; one main theorem is stated, and it is shown how it can be used to re-derive several existing results and to obtain several new ones.
Journal ArticleDOI
A Unified Framework for High-Dimensional Analysis of $M$-Estimators with Decomposable Regularizers
TL;DR: In this paper, a unified framework for establishing consistency and convergence rates for regularized M$-estimators under high-dimensional scaling was provided, which can be used to re-derive some existing results.
Book
High-Dimensional Statistics: A Non-Asymptotic Viewpoint
TL;DR: This book provides a self-contained introduction to the area of high-dimensional statistics, aimed at the first-year graduate level, and includes chapters focused on core methodology and theory, including tail bounds, concentration inequalities, uniform laws and empirical processes, and random matrices.
Journal ArticleDOI
Restricted Eigenvalue Properties for Correlated Gaussian Designs
TL;DR: This paper proves directly that the restricted nullspace and eigenvalue conditions hold with high probability for quite general classes of Gaussian matrices whose predictors may be highly dependent, and for which restricted isometry conditions can therefore be violated with high probability.
Journal ArticleDOI
Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions
TL;DR: In this paper, a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems is analyzed, and optimal rates in high dimensions are established.
References
Book
Elements of information theory
Thomas M. Cover, Joy A. Thomas +1 more
TL;DR: The authors examine the role of entropy, inequality, and randomness in the design and construction of codes.
Journal ArticleDOI
Regression Shrinkage and Selection via the Lasso
TL;DR: A new method for estimation in linear models called the lasso, which minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant, is proposed.
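As a concrete illustration (not code from either paper), the lasso objective described above can be minimized with a short proximal-gradient (ISTA) loop. The synthetic instance, the variable names, and the $\sqrt{\log d / n}$ choice of regularization level, echoing the scaling in the abstract, are all illustrative assumptions:

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of t * ||.||_1 (coordinatewise soft thresholding)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def lasso_ista(X, y, lam, n_iter=500):
    """Minimize (1/2n)||y - Xb||^2 + lam * ||b||_1 by proximal gradient (ISTA)."""
    n, d = X.shape
    step = n / np.linalg.norm(X, 2) ** 2   # 1 / Lipschitz constant of the gradient
    b = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ b - y) / n       # gradient of the quadratic part
        b = soft_threshold(b - step * grad, step * lam)
    return b

# Synthetic hard-sparse instance (q = 0): d > n, five nonzero coefficients.
rng = np.random.default_rng(0)
n, d, sigma = 100, 200, 0.1
beta_star = np.zeros(d)
beta_star[:5] = 1.0
X = rng.standard_normal((n, d))
y = X @ beta_star + sigma * rng.standard_normal(n)
lam = 2 * sigma * np.sqrt(np.log(d) / n)   # sqrt(log d / n) scaling from the theory
beta_hat = lasso_ista(X, y, lam)
```

On such well-conditioned Gaussian designs the restricted eigenvalue conditions discussed below hold with high probability, and the estimate recovers `beta_star` up to a small error.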
Journal ArticleDOI
Atomic Decomposition by Basis Pursuit
TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest l1 norm of coefficients among all such decompositions.
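The noiseless basis pursuit principle summarized above has a standard formulation (notation mine: $s$ is the signal, $\Phi$ the dictionary, $\alpha$ the coefficient vector):

```latex
\min_{\alpha} \; \|\alpha\|_1
\quad \text{subject to} \quad \Phi \alpha = s.
```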
Journal ArticleDOI
High-dimensional graphs and variable selection with the Lasso
TL;DR: It is shown that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs and is hence equivalent to variable selection for Gaussian linear models.
Journal ArticleDOI
The Dantzig selector: Statistical estimation when p is much larger than n
Emmanuel J. Candès, Terence Tao +1 more
TL;DR: In many important statistical applications, the number of variables or parameters p is much larger than the total number of observations n; the authors show that it is nevertheless possible to estimate β reliably based on the noisy data y.
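For context, the Dantzig selector of the cited paper is the linear program below (stated here from the original reference, in the notation of the abstract above); $\lambda$ is a tuning parameter, typically chosen proportional to $\sqrt{\log d}$:

```latex
\hat{\beta} \;=\; \arg\min_{\beta \in \mathbb{R}^d} \; \|\beta\|_1
\quad \text{subject to} \quad
\bigl\| X^\top (y - X\beta) \bigr\|_\infty \;\le\; \lambda \,\sigma.
```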
Related Papers (5)
The Dantzig selector: Statistical estimation when p is much larger than n
Emmanuel J. Candès, Terence Tao +1 more
Variable Selection via Nonconcave Penalized Likelihood and its Oracle Properties
Jianqing Fan, Runze Li +1 more