Open Access Journal Article

Minimax Rates of Estimation for High-Dimensional Linear Regression Over $\ell_q$ -Balls

TLDR
In this paper, the authors study the minimax rates of convergence for estimating $\beta^*$ in either $\ell_2$-loss or $\ell_2$-prediction loss, assuming that $\beta^*$ belongs to an $\ell_q$-ball $\mathbb{B}_q(R_q)$ for some $q \in [0, 1]$.
Abstract
Consider the high-dimensional linear regression model $y = X\beta^* + w$, where $y \in \mathbb{R}^n$ is an observation vector, $X \in \mathbb{R}^{n \times d}$ is a design matrix with $d > n$, $\beta^* \in \mathbb{R}^d$ is an unknown regression vector, and $w \sim N(0, \sigma^2 I)$ is additive Gaussian noise. This paper studies the minimax rates of convergence for estimating $\beta^*$ in either $\ell_2$-loss or $\ell_2$-prediction loss, assuming that $\beta^*$ belongs to an $\ell_q$-ball $\mathbb{B}_q(R_q)$ for some $q \in [0, 1]$. It is shown that, under suitable regularity conditions on the design matrix $X$, the minimax optimal rate in both $\ell_2$-loss and $\ell_2$-prediction loss scales as $\Theta\bigl(R_q (\frac{\log d}{n})^{1 - q/2}\bigr)$. The analysis in this paper reveals that conditions on the design matrix $X$ enter into the rates for $\ell_2$-error and $\ell_2$-prediction error in complementary ways in the upper and lower bounds. Our proofs of the lower bounds are information-theoretic in nature, based on Fano's inequality and results on the metric entropy of the balls $\mathbb{B}_q(R_q)$, whereas our proofs of the upper bounds are constructive, involving direct analysis of least squares over $\ell_q$-balls. For the special case $q = 0$, corresponding to models with an exact sparsity constraint, our results show that although computationally efficient $\ell_1$-based methods can achieve the minimax rates up to constant factors, they require slightly stronger assumptions on the design matrix $X$ than optimal algorithms involving least squares over the $\ell_0$-ball.
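For concreteness, the rate quoted above can be written out in display form (a restatement of the abstract's claim for $\ell_2$-loss; the $\ell_2$-prediction version is analogous with $\frac{1}{n}\|X(\hat{\beta} - \beta^*)\|_2^2$ in place of the $\ell_2$-error, and the paper's theorems spell out the precise regularity conditions):

$$\min_{\hat{\beta}} \max_{\beta^* \in \mathbb{B}_q(R_q)} \mathbb{E}\,\|\hat{\beta} - \beta^*\|_2^2 \;\asymp\; R_q \left(\frac{\log d}{n}\right)^{1 - q/2}, \qquad \mathbb{B}_q(R_q) = \Bigl\{\beta \in \mathbb{R}^d : \sum_{j=1}^d |\beta_j|^q \le R_q\Bigr\},$$

where the minimum ranges over all estimators $\hat{\beta}$ based on the data $(y, X)$. For $q = 0$, the ball consists of vectors with at most $R_0$ nonzero entries, and the rate reduces to the familiar sparse scaling $R_0 (\log d)/n$.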


Citations
Proceedings Article

A unified framework for high-dimensional analysis of M-estimators with decomposable regularizers

TL;DR: A unified framework for establishing consistency and convergence rates for regularized M-estimators under high-dimensional scaling is provided; one main theorem is stated and then shown to re-derive several existing results as well as to yield several new ones.
Journal Article

A Unified Framework for High-Dimensional Analysis of $M$-Estimators with Decomposable Regularizers

TL;DR: In this paper, a unified framework for establishing consistency and convergence rates for regularized $M$-estimators under high-dimensional scaling is provided, which can be used to re-derive several existing results.
Book

High-Dimensional Statistics: A Non-Asymptotic Viewpoint

TL;DR: This book provides a self-contained introduction to the area of high-dimensional statistics, aimed at the first-year graduate level, and includes chapters focused on core methodology and theory, including tail bounds, concentration inequalities, uniform laws and empirical processes, and random matrices.
Journal Article

Restricted Eigenvalue Properties for Correlated Gaussian Designs

TL;DR: This paper proves directly that the restricted nullspace and eigenvalue conditions hold with high probability for quite general classes of Gaussian design matrices whose predictors may be highly dependent, and for which restricted isometry conditions can be violated with high probability.
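For orientation, one common statement of the restricted eigenvalue (RE) condition referred to here (the cone constant, set to 3 below, and the exact form vary across papers) requires that, for some $\kappa > 0$ and a given support set $S$,

$$\frac{1}{n}\|X\theta\|_2^2 \;\ge\; \kappa\,\|\theta\|_2^2 \qquad \text{for all } \theta \in \mathbb{R}^d \text{ with } \|\theta_{S^c}\|_1 \le 3\,\|\theta_S\|_1.$$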
Journal Article

Noisy matrix decomposition via convex relaxation: Optimal rates in high dimensions

TL;DR: In this paper, a class of estimators based on convex relaxation for solving high-dimensional matrix decomposition problems is analyzed, and optimal rates of convergence in high dimensions are established.
References
Book

Elements of information theory

TL;DR: The authors examine the roles of entropy, inequalities, and randomness in the design and construction of codes.
Journal Article

Regression Shrinkage and Selection via the Lasso

TL;DR: A new method for estimation in linear models called the lasso is proposed; it minimizes the residual sum of squares subject to the sum of the absolute values of the coefficients being less than a constant.
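In the constrained form described in this summary, the lasso solves (a standard statement, with $t \ge 0$ the constraint level; equivalently, by Lagrangian duality, one may solve the penalized form $\min_\beta \|y - X\beta\|_2^2 + \lambda\|\beta\|_1$):

$$\hat{\beta}^{\mathrm{lasso}} \in \arg\min_{\beta \in \mathbb{R}^d} \|y - X\beta\|_2^2 \quad \text{subject to} \quad \sum_{j=1}^d |\beta_j| \le t.$$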
Journal Article

Atomic Decomposition by Basis Pursuit

TL;DR: Basis Pursuit (BP) is a principle for decomposing a signal into an "optimal" superposition of dictionary elements, where optimal means having the smallest $\ell_1$ norm of coefficients among all such decompositions.
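Stated as an optimization problem (generic notation, with $\Phi$ the dictionary, $s$ the signal, and $\alpha$ the coefficient vector), basis pursuit solves

$$\min_{\alpha} \|\alpha\|_1 \quad \text{subject to} \quad \Phi\alpha = s.$$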
Journal Article

High-dimensional graphs and variable selection with the Lasso

TL;DR: It is shown that neighborhood selection with the Lasso is a computationally attractive alternative to standard covariance selection for sparse high-dimensional graphs; neighborhood selection estimates the conditional independence restrictions separately for each node in the graph and is hence equivalent to variable selection for Gaussian linear models.
Journal Article

The Dantzig selector: Statistical estimation when p is much larger than n

TL;DR: In many important statistical applications, the number of variables or parameters p is much larger than the total number of observations n; the authors show that it is nevertheless possible to estimate β reliably from the noisy data y when β is sufficiently sparse.
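For reference, the Dantzig selector is the $\ell_1$-minimization program (standard formulation; $\lambda_p$ is a threshold of order $\sqrt{2\log p}$ and $\sigma$ the noise level)

$$\min_{\tilde{\beta} \in \mathbb{R}^p} \|\tilde{\beta}\|_1 \quad \text{subject to} \quad \|X^T(y - X\tilde{\beta})\|_\infty \le \lambda_p\,\sigma.$$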