Open Access Journal Article

Optimal detection of sparse principal components in high dimension

Quentin Berthet, +1 more
01 Aug 2013, Vol. 41, Iss. 4, pp. 1780-1815
TLDR
This paper performs a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix and proposes a minimax optimal test based on a sparse eigenvalue statistic.
Abstract
We perform a finite sample analysis of the detection levels for sparse principal components of a high-dimensional covariance matrix. Our minimax optimal test is based on a sparse eigenvalue statistic. Alas, computing this test is known to be NP-complete in general, and we describe a computationally efficient alternative test using convex relaxations. Our relaxation is also proved to detect sparse principal components at near-optimal detection levels, and it performs well on simulated datasets. Moreover, using polynomial-time reductions from theoretical computer science, we provide significant evidence that our results cannot be improved, thus revealing an inherent trade-off between statistical and computational performance.
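
As an illustration of why the sparse eigenvalue statistic is costly to compute, here is a minimal sketch. It assumes the statistic is the k-sparse largest eigenvalue of a covariance matrix, that is, the maximum of v' S v over unit vectors v with at most k nonzero entries; the abstract does not spell out the definition, and the function name `k_sparse_largest_eigenvalue` is hypothetical. The sketch enumerates every support of size k, so its cost grows combinatorially with the dimension. It is not the authors' test or their convex relaxation.

```python
from itertools import combinations

import numpy as np


def k_sparse_largest_eigenvalue(cov, k):
    """Brute-force maximum of v' cov v over unit vectors v with at most k nonzeros.

    Assumption for illustration: `cov` is a symmetric (p x p) matrix and
    1 <= k <= p. The search visits all C(p, k) supports.
    """
    p = cov.shape[0]
    best = -np.inf
    for support in combinations(range(p), k):
        idx = np.array(support)
        # Principal submatrix restricted to the candidate support.
        sub = cov[np.ix_(idx, idx)]
        # eigvalsh returns eigenvalues of a symmetric matrix in ascending order.
        best = max(best, np.linalg.eigvalsh(sub)[-1])
    return best


# Hypothetical example: identity covariance plus a spike of strength 2
# along a unit vector supported on the first two coordinates.
v = np.zeros(4)
v[:2] = 1.0 / np.sqrt(2.0)
spiked = np.eye(4) + 2.0 * np.outer(v, v)
statistic = k_sparse_largest_eigenvalue(spiked, 2)
```

In this toy case the value is approximately 1 + 2 = 3 for the spiked matrix and 1 for the identity, so the gap between the two is what a detection test would try to pick up from data. Maximizing over supports of size exactly k is enough here, because enlarging a principal submatrix never decreases its largest eigenvalue.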

Citations
Journal Article

Machine learning: Trends, perspectives, and prospects

TL;DR: The adoption of data-intensive machine-learning methods can be found throughout science, technology and commerce, leading to more evidence-based decision-making across many walks of life, including health care, manufacturing, education, financial modeling, policing, and marketing.
Book

Community Detection and Stochastic Block Models

TL;DR: The recent developments that establish the fundamental limits for community detection in the stochastic block model are surveyed, both with respect to information-theoretic and computational thresholds, and for various recovery requirements such as exact, partial and weak recovery.
Journal Article

Sparse PCA: Optimal rates and adaptive estimation

TL;DR: In this paper, the authors considered both minimax and adaptive estimation of the principal subspace in the high dimensional setting and established the optimal rates of convergence for estimating the subspace which are sharp with respect to all the parameters, thus providing a complete characterization of the difficulty of the estimation problem in terms of the convergence rate.
Proceedings Article

Complexity Theoretic Lower Bounds for Sparse Principal Component Detection

TL;DR: The performance of a test is measured by the smallest signal strength that it can detect. A computationally efficient method based on semidefinite programming is proposed, and it is proved that the statistical performance of this test cannot be strictly improved by any computationally efficient method.
Journal Article

Estimating structured high-dimensional covariance and precision matrices: Optimal rates and adaptive estimation

TL;DR: Minimax rates of convergence for estimating several classes of structured covariance and precision matrices, including bandable, Toeplitz, and sparse covariance matrices as well as sparse precision matrices, are given under the spectral norm loss.