Sparse PCA: Optimal rates and adaptive estimation
TLDR
The authors consider both minimax and adaptive estimation of the principal subspace in the high-dimensional setting and establish optimal rates of convergence for estimating the subspace that are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate.
Abstract
Principal component analysis (PCA) is one of the most commonly used statistical procedures with a wide range of applications. This paper considers both minimax and adaptive estimation of the principal subspace in the high dimensional setting. Under mild technical conditions, we first establish the optimal rates of convergence for estimating the principal subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate. The lower bound is obtained by calculating the local metric entropy and an application of Fano’s lemma. The rate optimal estimator is constructed using aggregation, which, however, might not be computationally feasible. We then introduce an adaptive procedure for estimating the principal subspace which is fully data driven and can be computed efficiently. It is shown that the estimator attains the optimal rates of convergence simultaneously over a large collection of the parameter spaces. A key idea in our construction is a reduction scheme which reduces the sparse PCA problem to a high-dimensional multivariate regression problem. This method is potentially also useful for other related problems.
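To make the sparse-PCA setting concrete, the following is a minimal sketch of the classical diagonal-thresholding baseline: select coordinates whose sample variance exceeds a noise-level threshold, then run ordinary PCA on the selected block. This is an illustrative baseline only, not the paper's aggregation estimator or its adaptive regression-based procedure; the function name and the threshold constant `alpha` are assumptions for the example.

```python
import numpy as np

def sparse_pca_threshold(X, k, alpha=3.0):
    """Toy sparse-PCA baseline via diagonal thresholding.

    Illustrative sketch only: selects high-variance coordinates,
    then does ordinary PCA on the selected submatrix. Not the
    adaptive procedure proposed in the paper.
    """
    n, p = X.shape
    S = X.T @ X / n                           # sample covariance
    # keep coordinates whose variance exceeds the noise level
    thresh = 1.0 + alpha * np.sqrt(np.log(p) / n)
    keep = np.where(np.diag(S) > thresh)[0]
    if keep.size < k:                         # fall back to top variances
        keep = np.argsort(np.diag(S))[::-1][:k]
    # ordinary PCA on the selected coordinates
    vals, vecs = np.linalg.eigh(S[np.ix_(keep, keep)])
    U = np.zeros((p, k))
    U[keep, :] = vecs[:, ::-1][:, :k]         # top-k eigenvectors, zero-padded
    return U                                  # orthonormal basis estimate
```

On a spiked-covariance model with a sparse leading eigenvector, this recovers the support and direction well when the spike is strong, which is the regime the minimax rates in the paper quantify precisely.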
Citations
Posted Content
Symmetry, Saddle Points, and Global Geometry of Nonconvex Matrix Factorization
TL;DR: A general theory for studying the geometry of nonconvex objective functions with underlying symmetric structures is proposed and the locations of stationary points and the null space of the associated Hessian matrices are characterized via the lens of invariant groups.
Proceedings Article
Tighten after Relax: Minimax-Optimal Sparse PCA in Polynomial Time
Zhaoran Wang,Huanran Lu,Han Liu +2 more
TL;DR: This paper proposes a two-stage sparse PCA procedure that attains the optimal principal subspace estimator in polynomial time and motivates a general paradigm of tackling nonconvex statistical learning problems with provable statistical guarantees.
Journal ArticleDOI
Sparsistency and agnostic inference in sparse PCA
Jing Lei,Vincent Q. Vu +1 more
TL;DR: The properties of the recently proposed Fantope projection and selection (FPS) method in the high-dimensional setting are investigated and it is shown that FPS provides a sparse, linear dimension-reducing transformation that is close to the best possible in terms of maximizing the predictive covariance.
Dissertation
Principled approaches to robust machine learning and beyond
TL;DR: This thesis devises two novel, but similarly inspired, algorithmic paradigms for estimation in high dimensions in the presence of a small number of adversarially added data points; both are the first efficient algorithms that achieve (nearly) optimal error bounds for a number of fundamental statistical tasks such as mean estimation and covariance estimation.
Posted Content
Nonconvex Statistical Optimization: Minimax-Optimal Sparse PCA in Polynomial Time.
Zhaoran Wang,Huanran Lu,Han Liu +2 more
TL;DR: This framework motivates a general paradigm for solving many complex statistical problems which involve nonconvex optimization with provable guarantees and applies to the non-spiked covariance models, and adapts to non-Gaussianity as well as dependent data settings.
References
Book
Elements of information theory
Thomas M. Cover,Joy A. Thomas +1 more
TL;DR: The authors examine the role of entropy, inequality, and randomness in the design and construction of codes.
Book
Matrix Analysis
Roger A. Horn,Charles R. Johnson +1 more
TL;DR: In this book, the authors present results from both classic and recent matrix analysis, using canonical forms as a unifying theme, and demonstrate their importance in a variety of applications across linear algebra and matrix theory.
Book
An Introduction to Multivariate Statistical Analysis
TL;DR: This book analyzes the distribution of the mean vector, the covariance matrix, and the generalized T²-statistic, along with the independence of sets of variates.
Journal ArticleDOI
Introduction to Multivariate Statistical Analysis.
William G. Madow,T. W. Anderson +1 more
Journal ArticleDOI
Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization
TL;DR: It is shown that if a certain restricted isometry property holds for the linear transformation defining the constraints, the minimum-rank solution can be recovered by solving a convex optimization problem, namely, the minimization of the nuclear norm over the given affine space.
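The nuclear-norm approach in the reference above is usually solved with proximal methods whose core step is singular value thresholding (soft-thresholding of singular values), which is the proximal operator of the nuclear norm. The sketch below shows that single building block, not the full recovery algorithm from the cited paper; the function name is an assumption for the example.

```python
import numpy as np

def svt(M, tau):
    """Singular value thresholding: prox of tau * ||.||_* at M.

    Soft-thresholds the singular values of M by tau, which shrinks
    the nuclear norm and tends to reduce rank. This is the standard
    inner step of proximal-gradient solvers for nuclear-norm problems.
    """
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return U @ np.diag(np.maximum(s - tau, 0.0)) @ Vt
```

Iterating `svt` on gradient steps of a data-fit term gives a simple proximal-gradient solver for affine rank-minimization relaxations of the kind the reference analyzes.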