Sparse PCA: Optimal rates and adaptive estimation
TLDR
This paper considers both minimax and adaptive estimation of the principal subspace in the high-dimensional setting and establishes the optimal rates of convergence for estimating the subspace. The rates are sharp with respect to all the parameters, providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate.

Abstract
Principal component analysis (PCA) is one of the most commonly used statistical procedures, with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high-dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace, which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and applying Fano's lemma. The rate-optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.
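The estimation problem described in the abstract can be made concrete with a small simulation; a minimal sketch, assuming a rank-one spiked covariance model with a sparse leading eigenvector and measuring error by the sin-theta (projection) distance used in this literature (the model parameters below are illustrative, not from the paper):

```python
import numpy as np

# Toy illustration of the estimation problem (not the paper's estimator):
# data from a spiked covariance model with a sparse principal direction.
rng = np.random.default_rng(1)
n, p, k = 500, 40, 4                          # samples, dimension, sparsity
v = np.zeros(p)
v[:k] = 1.0 / np.sqrt(k)                      # sparse leading eigenvector
Sigma = np.eye(p) + 4.0 * np.outer(v, v)      # spiked population covariance
X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
S = X.T @ X / n                               # sample covariance (mean zero)
v_hat = np.linalg.eigh(S)[1][:, -1]           # plain PCA estimate of v
# sin-theta distance between true and estimated one-dimensional subspaces
loss = np.linalg.norm(np.outer(v, v) - np.outer(v_hat, v_hat)) / np.sqrt(2)
```

The paper's point is precisely how this loss behaves, and how to attain the optimal rate, when p is large relative to n and v is sparse; ordinary PCA as above is only adequate in the well-conditioned regime simulated here.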
Citations
Journal Article
A useful variant of the Davis–Kahan theorem for statisticians
TL;DR: In this paper, the authors present a variant of the Davis–Kahan theorem that relies only on a population eigenvalue separation condition, making it more natural and convenient for direct application in statistical contexts, and show that it improves on the usual bound in many cases.
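For context, a commonly quoted single-eigenvector form of such a bound, stated here only as an illustration of what "population eigenvalue separation" means (the paper's general statement covers eigenspaces and gives precise constants), is

    sin θ(v̂_j, v_j) ≤ 2 ‖Σ̂ − Σ‖_op / min(λ_{j−1} − λ_j, λ_j − λ_{j+1}),

where λ_1 ≥ λ_2 ≥ … are the eigenvalues of the population matrix Σ, v_j and v̂_j are the j-th population and sample eigenvectors, and the denominator involves only population eigengaps rather than sample quantities.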
Posted Content
Fast low-rank estimation by projected gradient descent: General statistical and algorithmic guarantees
Yudong Chen, Martin J. Wainwright
TL;DR: This work provides a simple set of conditions under which projected gradient descent, when given a suitable initialization, converges geometrically to a statistically useful solution to the factorized optimization problem with rank constraints.
Journal Article
Truncated power method for sparse eigenvalue problems
Xiao-Tong Yuan, Tong Zhang
TL;DR: In this paper, the authors propose a truncated power method that can approximately solve the underlying nonconvex optimization problem in sparse eigenvalue problems, namely extracting the dominant (largest) sparse eigenvector with at most k non-zero components.
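The truncated power iteration summarized above is simple to sketch: alternate a power step with a truncation that keeps only the k largest-magnitude entries. A minimal illustration, assuming a symmetric input matrix and a diagonal-based initialization heuristic (the initialization and the toy matrix are assumptions, not the authors' code):

```python
import numpy as np

def truncated_power_method(A, k, n_iter=100):
    """Sketch of truncated power iteration for a k-sparse leading
    eigenvector of a symmetric matrix A (illustrative only)."""
    p = A.shape[0]
    x = np.zeros(p)
    init = np.argsort(np.diag(A))[-k:]    # start from k largest diagonal entries
    x[init] = 1.0 / np.sqrt(k)
    for _ in range(n_iter):
        y = A @ x                         # power step
        idx = np.argsort(np.abs(y))[-k:]  # truncate: keep k largest magnitudes
        x = np.zeros(p)
        x[idx] = y[idx]
        x /= np.linalg.norm(x)            # renormalize to the unit sphere
    return x

# Toy check: a planted 3-sparse leading eigenvector is recovered
p, k = 10, 3
v = np.zeros(p)
v[:k] = 1.0 / np.sqrt(k)
A = 5.0 * np.outer(v, v) + np.eye(p)
x = truncated_power_method(A, k)
```

Each iterate stays k-sparse and unit-norm by construction, which is what makes the method scalable despite the nonconvexity of the underlying problem.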
Journal Article
An overview of the estimation of large covariance and precision matrices
Jianqing Fan, Yuan Liao, Han Liu
TL;DR: In this article, the authors provide a selective review of several recent developments on the estimation of large covariance and precision matrices, focusing on two general approaches: a rank-based method and a factor-model based method.
Proceedings Article
Complexity Theoretic Lower Bounds for Sparse Principal Component Detection
TL;DR: The performance of a test is measured by the smallest signal strength that it can detect. A computationally efficient method based on semidefinite programming is proposed, and it is proved that the statistical performance of this test cannot be strictly improved by any computationally efficient method.
References
Journal Article
PCA consistency in high dimension, low sample size context
TL;DR: In this paper, the authors investigate the asymptotic behavior of the Principal Component (PC) directions in high dimension, low sample size (HDLSS) data and show that if the first few eigenvalues of a population covariance matrix are large enough compared to the others, then the corresponding estimated PC directions are consistent or converge to the appropriate subspace (subspace consistency).
Journal Article
High-dimensional analysis of semidefinite relaxations for sparse principal components
TL;DR: This paper analyzes a simple and computationally inexpensive diagonal cut-off method, establishing a threshold in the rescaled sample size θ_diag = n/[k² log(p − k)] separating success from failure, and proves that a more complex semidefinite programming (SDP) relaxation due to d'Aspremont et al. succeeds once the sample size is of the order given by θ_sdp = n/[k log(p − k)].
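The diagonal cut-off idea analyzed in that paper is easy to sketch: keep only the k coordinates with the largest sample variances, then run ordinary PCA on the corresponding submatrix. A minimal illustration (the toy data and parameter choices below are assumptions, not the paper's experiments):

```python
import numpy as np

def diagonal_cutoff_pca(X, k):
    """Sketch of the diagonal cut-off method: select the k coordinates
    with the largest sample variances, then take the leading eigenvector
    of the sample covariance restricted to those coordinates."""
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S = Xc.T @ Xc / n                      # sample covariance
    idx = np.argsort(np.diag(S))[-k:]      # k largest sample variances
    w, V = np.linalg.eigh(S[np.ix_(idx, idx)])  # PCA on the submatrix
    v = np.zeros(p)
    v[idx] = V[:, -1]                      # embed back into R^p
    return v

# Spiked-model check: a 3-sparse leading eigenvector is recovered
rng = np.random.default_rng(0)
n, p, k = 2000, 50, 3
v = np.zeros(p)
v[:k] = 1.0 / np.sqrt(k)
X = 3.0 * np.outer(rng.standard_normal(n), v) + rng.standard_normal((n, p))
v_hat = diagonal_cutoff_pca(X, k)
```

The paper's threshold result quantifies exactly when this variance-based screening step succeeds: roughly, n must exceed the order of k² log(p − k), whereas the SDP relaxation needs only the order of k log(p − k).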
Journal Article
Optimal rates of convergence for sparse covariance matrix estimation
T. Tony Cai, Harrison H. Zhou
TL;DR: In this paper, the authors derive a rate-sharp minimax lower bound for estimating sparse covariance matrices under a range of matrix operator norm and Bregman divergence losses, and show that a thresholding estimator attains the optimal rate of convergence under the spectral norm.
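The thresholding estimator referenced here can be sketched as entrywise hard thresholding of the sample covariance; a minimal illustration (the threshold value below is arbitrary; rate-optimal choices in this literature scale like sqrt(log p / n)):

```python
import numpy as np

def hard_threshold_cov(S, lam):
    """Sketch of an entrywise hard-thresholding covariance estimator:
    off-diagonal entries smaller than lam in magnitude are set to zero
    (illustrative; the paper analyzes such estimators in general)."""
    T = np.where(np.abs(S) >= lam, S, 0.0)
    np.fill_diagonal(T, np.diag(S))   # never threshold the variances
    return T

S = np.array([[2.0, 0.1, 0.7],
              [0.1, 3.0, 0.0],
              [0.7, 0.0, 1.5]])
T = hard_threshold_cov(S, 0.5)
```

Small off-diagonal entries (here 0.1) are zeroed out while large ones (0.7) and the diagonal survive, which is what yields the optimal spectral-norm rate over sparse covariance classes.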
Journal Article
Optimal detection of sparse principal components in high dimension
TL;DR: In this paper, a finite-sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix is performed, based on a sparse eigenvalue statistic.