
Showing papers by "Joel A. Tropp published in 2011"


Journal ArticleDOI
TL;DR: This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
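For readers who want to see the shape of the two-stage scheme the abstract describes, here is a minimal numpy sketch. The function name, the Gaussian test matrix, and the oversampling parameter p are illustrative choices, not prescriptions from the paper:

```python
import numpy as np

def randomized_svd(A, k, p=10):
    """Rank-k SVD via a randomized range finder; p is an oversampling margin."""
    m, n = A.shape
    # Stage A: sample the range of A with a Gaussian test matrix.
    Omega = np.random.randn(n, k + p)
    Y = A @ Omega                      # Y captures most of the action of A
    Q, _ = np.linalg.qr(Y)             # orthonormal basis for the sampled range
    # Stage B: compress A to the subspace, then factor the small matrix.
    B = Q.T @ A                        # (k+p) x n reduced matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub
    return U[:, :k], s[:k], Vt[:k, :]

# Example: approximate a 500 x 300 matrix of numerical rank ~20.
A = np.random.randn(500, 20) @ np.random.randn(20, 300)
U, s, Vt = randomized_svd(A, k=20)
print(np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A))  # tiny relative error
```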

3,248 citations


01 Jan 2011
TL;DR: In this article, the authors present a modular framework for constructing randomized algorithms that compute partial matrix decompositions, which use random sampling to identify a subspace that captures most of the action of a matrix.
Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.

494 citations


Journal ArticleDOI
TL;DR: In this article, an improved analysis of a structured dimension reduction map called the subsampled randomized Hadamard transform is presented, and the new proof is much simpler than previous approaches, and it offers optimal constants in the estimate on the number of dimensions required for the embedding.
Abstract: This paper presents an improved analysis of a structured dimension-reduction map called the subsampled randomized Hadamard transform. This argument demonstrates that the map preserves the Euclidean geometry of an entire subspace of vectors. The new proof is much simpler than previous approaches, and it offers — for the first time — optimal constants in the estimate on the number of dimensions required for the embedding.
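As a concrete illustration of the map under discussion, the following numpy sketch builds a dense SRHT and applies it to an orthonormal basis. A real implementation would replace the explicit Hadamard matrix with a fast Walsh-Hadamard transform, and the scaling convention here is one common choice, not necessarily the paper's:

```python
import numpy as np
from scipy.linalg import hadamard

def srht(X, r, seed=0):
    """Apply a subsampled randomized Hadamard transform to the rows of X.

    X has n rows with n a power of 2; r mixed rows are kept.
    """
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    D = rng.choice([-1.0, 1.0], size=n)           # random sign flips
    H = hadamard(n) / np.sqrt(n)                  # orthonormal Hadamard matrix
    mixed = H @ (D[:, None] * X)                  # HDx: spreads energy evenly
    rows = rng.choice(n, size=r, replace=False)   # uniform row subsampling
    return np.sqrt(n / r) * mixed[rows]           # rescale to preserve norms

# Embed a 10-dimensional subspace of R^1024 into R^64.
X = np.linalg.qr(np.random.randn(1024, 10))[0]
Y = srht(X, r=64)
print(np.linalg.svd(Y, compute_uv=False))         # singular values near 1
```

The printed singular values staying close to 1 is exactly the subspace-embedding property the abstract refers to: the Euclidean geometry of the whole 10-dimensional subspace is approximately preserved.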

350 citations


Journal ArticleDOI
TL;DR: Oliveira, as mentioned in this paper, showed that the large-deviation behavior of a martingale is controlled by the predictable quadratic variation and a uniform upper bound for the martingale difference sequence.
Abstract: Freedman's inequality is a martingale counterpart to Bernstein's inequality. This result shows that the large-deviation behavior of a martingale is controlled by the predictable quadratic variation and a uniform upper bound for the martingale difference sequence. Oliveira has recently established a natural extension of Freedman's inequality that provides tail bounds for the maximum singular value of a matrix-valued martingale. This note describes a different proof of the matrix Freedman inequality that depends on a deep theorem of Lieb from matrix analysis. This argument delivers sharp constants in the matrix Freedman inequality, and it also yields tail bounds for other types of matrix martingales. The new techniques are adapted from recent work by the present author.
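For reference, the matrix Freedman inequality the abstract describes takes roughly the following form in the self-adjoint case (a paraphrase of the standard statement, not a quotation from the note):

```latex
% Matrix Freedman inequality (self-adjoint case, paraphrased):
% (Y_k) is a martingale in d x d self-adjoint matrices with Y_0 = 0, whose
% difference sequence X_k = Y_k - Y_{k-1} satisfies lambda_max(X_k) <= R.
\[
  W_k := \sum_{j=1}^{k} \mathbb{E}\bigl[ X_j^2 \mid \mathcal{F}_{j-1} \bigr]
  \qquad \text{(predictable quadratic variation)}
\]
\[
  \mathbb{P}\Bigl\{ \exists k \ge 0 :\ \lambda_{\max}(Y_k) \ge t
    \ \text{and}\ \lVert W_k \rVert \le \sigma^2 \Bigr\}
  \;\le\; d \cdot \exp\!\Bigl( \frac{-t^2/2}{\sigma^2 + Rt/3} \Bigr).
\]
```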

148 citations


Journal ArticleDOI
TL;DR: In this article, the authors proposed two approaches for robust principal component analysis based on semidefinite programming: the first seeks directions of large spread in the data while damping the effect of outliers, and the second separates corrupted observations out of a low-rank model for the data.
Abstract: The performance of principal component analysis suffers badly in the presence of outliers. This paper proposes two novel approaches for robust principal component analysis based on semidefinite programming. The first method, maximum mean absolute deviation rounding, seeks directions of large spread in the data while damping the effect of outliers. The second method produces a low-leverage decomposition of the data that attempts to form a low-rank model for the data by separating out corrupted observations. This paper also presents efficient computational methods for solving these semidefinite programs. Numerical experiments confirm the value of these new techniques.
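The failure mode in the opening sentence is easy to reproduce. The numpy snippet below (an illustration only, not the paper's semidefinite-programming methods) shows a single gross outlier swinging the leading principal direction:

```python
import numpy as np

rng = np.random.default_rng(1)
# Clean data: 200 points spread mostly along the x-axis.
X = rng.normal(size=(200, 2)) * np.array([5.0, 1.0])

def top_direction(X):
    """Leading principal direction of mean-centered data."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Vt[0]

print(top_direction(X))                   # close to (+-1, 0)
X_bad = np.vstack([X, [0.0, 500.0]])      # add one gross outlier
print(top_direction(X_bad))               # swings toward (0, +-1)
```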

125 citations


Posted Content
TL;DR: Oliveira, as discussed by the authors, showed that the large-deviation behavior of a martingale is controlled by the predictable quadratic variation and a uniform upper bound for the martingale difference sequence.
Abstract: Freedman's inequality is a martingale counterpart to Bernstein's inequality. This result shows that the large-deviation behavior of a martingale is controlled by the predictable quadratic variation and a uniform upper bound for the martingale difference sequence. Oliveira has recently established a natural extension of Freedman's inequality that provides tail bounds for the maximum singular value of a matrix-valued martingale. This note describes a different proof of the matrix Freedman inequality that depends on a deep theorem of Lieb from matrix analysis. This argument delivers sharp constants in the matrix Freedman inequality, and it also yields tail bounds for other types of matrix martingales. The new techniques are adapted from recent work by the present author.

106 citations


ReportDOI
16 Jan 2011
TL;DR: In this article, the authors present probability inequalities for sums of adapted sequences of random, self-adjoint matrices; the results frame simple, easily verifiable hypotheses on the summands and yield strong conclusions about the large-deviation behavior of the maximum eigenvalue of the sum.
Abstract: This report presents probability inequalities for sums of adapted sequences of random, self-adjoint matrices. The results frame simple, easily verifiable hypotheses on the summands, and they yield strong conclusions about the large-deviation behavior of the maximum eigenvalue of the sum. The methods also specialize to sums of independent random matrices.
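The independent-sum specialization mentioned in the last sentence includes the matrix Bernstein bound, which in its standard self-adjoint form reads roughly as follows (a paraphrase, not a quotation from the report):

```latex
% Matrix Bernstein bound (independent-sum specialization, paraphrased):
% X_1, ..., X_n independent, zero-mean, self-adjoint d x d random matrices
% with lambda_max(X_k) <= R almost surely.
\[
  \sigma^2 := \Bigl\lVert \sum_{k} \mathbb{E}\, X_k^2 \Bigr\rVert,
  \qquad
  \mathbb{P}\Bigl\{ \lambda_{\max}\Bigl( \sum_{k} X_k \Bigr) \ge t \Bigr\}
  \;\le\; d \cdot \exp\!\Bigl( \frac{-t^2/2}{\sigma^2 + Rt/3} \Bigr).
\]
```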

76 citations


Journal ArticleDOI
TL;DR: This paper establishes the restricted isometry property for a Gabor system generated by n² time–frequency shifts of a random window function in n dimensions by showing that the sth order restricted isometry constant of the associated n × n² Gabor synthesis matrix is small.
Abstract: This paper establishes the restricted isometry property for a Gabor system generated by n² time–frequency shifts of a random window function in n dimensions. The sth order restricted isometry constant of the associated n × n² Gabor synthesis matrix is small provided that s ≤ c n^(2/3) / log² n. This bound provides a qualitative improvement over previous estimates, which achieve only quadratic scaling of the sparsity s with respect to n. The proof depends on an estimate for the expected supremum of a second-order chaos.
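A direct, unoptimized construction of the synthesis matrix makes the setup concrete. In this sketch a normalized complex Gaussian vector stands in for the paper's random window, which may differ from the distribution actually analyzed:

```python
import numpy as np

def gabor_synthesis_matrix(g):
    """All n^2 time-frequency shifts of a window g in C^n, as columns.

    Column (k, l) is g cyclically shifted by k samples and modulated
    by the l-th discrete frequency.
    """
    n = len(g)
    cols = []
    for k in range(n):                              # time shifts
        shifted = np.roll(g, k)
        for l in range(n):                          # frequency shifts
            mod = np.exp(2j * np.pi * l * np.arange(n) / n)
            cols.append(mod * shifted)
    return np.column_stack(cols)                    # n x n^2

# A random window, normalized to the complex unit sphere.
n = 16
g = np.random.randn(n) + 1j * np.random.randn(n)
g /= np.linalg.norm(g)
Phi = gabor_synthesis_matrix(g)
print(Phi.shape)                                    # (16, 256)
```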

67 citations


ReportDOI
TL;DR: The minimax Laplace transform, as discussed by the authors, is a modification of the cumulant-based matrix Laplace transform method that yields both upper and lower bounds on each eigenvalue of a sum of random self-adjoint matrices.
Abstract: This work introduces the minimax Laplace transform method, a modification of the cumulant-based matrix Laplace transform method developed in [Tro11c] that yields both upper and lower bounds on each eigenvalue of a sum of random self-adjoint matrices. This machinery is used to derive eigenvalue analogs of the classical Chernoff, Bennett, and Bernstein bounds. Two examples demonstrate the efficacy of the minimax Laplace transform. The first concerns the effects of column sparsification on the spectrum of a matrix with orthonormal rows. Here, the behavior of the singular values can be described in terms of coherence-like quantities. The second example addresses the question of relative accuracy in the estimation of eigenvalues of the covariance matrix of a random process. Standard results on the convergence of sample covariance matrices provide bounds on the number of samples needed to obtain relative accuracy in the spectral norm, but these results only guarantee relative accuracy in the estimate of the maximum eigenvalue. The minimax Laplace transform argument establishes that if the lowest eigenvalues decay sufficiently fast, Ω(ε^(-2) κ_l^2 l log p) samples, where κ_l = λ_1(C)/λ_l(C), are sufficient to ensure that the dominant l eigenvalues of the covariance matrix of an N(0,C) random vector are estimated to within a factor of 1 ± ε with high probability.
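The estimation problem in the second example is easy to probe numerically. The snippet below (an illustration of the problem, not of the minimax Laplace transform method itself) checks the per-eigenvalue relative error of the sample covariance for a fast-decaying spectrum:

```python
import numpy as np

rng = np.random.default_rng(2)
p = 100
# Covariance with fast-decaying spectrum, as in the abstract's setting.
eigs = 1.0 / np.arange(1, p + 1) ** 2
C = np.diag(eigs)

n = 2000                                  # number of samples
X = rng.multivariate_normal(np.zeros(p), C, size=n)
C_hat = X.T @ X / n                       # sample covariance (known zero mean)

l = 5                                     # dominant eigenvalues to check
est = np.linalg.eigvalsh(C_hat)[::-1][:l]
print(np.abs(est - eigs[:l]) / eigs[:l])  # per-eigenvalue relative error
```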

56 citations


Posted Content
TL;DR: In this paper, a short analysis of the masked covariance estimator by means of a matrix concentration inequality is provided, and it is shown that n = O(B log² p) samples suffice to estimate a banded covariance matrix with bandwidth B up to a relative spectral-norm error.
Abstract: Covariance estimation becomes challenging in the regime where the number p of variables outstrips the number n of samples available to construct the estimate. One way to circumvent this problem is to assume that the covariance matrix is nearly sparse and to focus on estimating only the significant entries. To analyze this approach, Levina and Vershynin (2011) introduce a formalism called masked covariance estimation, where each entry of the sample covariance estimator is reweighted to reflect an a priori assessment of its importance. This paper provides a short analysis of the masked sample covariance estimator by means of a matrix concentration inequality. The main result applies to general distributions with at least four moments. Specialized to the case of a Gaussian distribution, the theory offers qualitative improvements over earlier work. For example, the new results show that n = O(B log^2 p) samples suffice to estimate a banded covariance matrix with bandwidth B up to a relative spectral-norm error, in contrast to the sample complexity n = O(B log^5 p) obtained by Levina and Vershynin.
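A banded mask is the simplest instance of the formalism. The sketch below (illustrative only; the function name and conventions are not from the paper) builds the masked sample covariance estimator for bandwidth B:

```python
import numpy as np

def masked_sample_covariance(X, B):
    """Banded masked covariance estimator: keep entries within bandwidth B.

    X is an n x p data matrix (rows are zero-mean samples); the mask M
    zeroes out entries (i, j) of the sample covariance with |i - j| >= B.
    """
    n, p = X.shape
    S = X.T @ X / n                                   # sample covariance
    idx = np.arange(p)
    M = (np.abs(idx[:, None] - idx[None, :]) < B).astype(float)
    return M * S                                      # entrywise reweighting

# Example: 50 variables, bandwidth 3.
rng = np.random.default_rng(3)
X = rng.normal(size=(500, 50))
print(masked_sample_covariance(X, B=3)[:4, :4])       # banded estimate
```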

41 citations


Journal ArticleDOI
TL;DR: A succinct proof of a 1973 theorem of Lieb that establishes the concavity of a certain trace function is provided; the proof relies on a deep result from quantum information theory (the joint convexity of quantum relative entropy) as well as a recent argument due to Carlen and Lieb.
Abstract: This note provides a succinct proof of a 1973 theorem of Lieb that establishes the concavity of a certain trace function. The development relies on a deep result from quantum information theory, the joint convexity of quantum relative entropy, as well as a recent argument due to Carlen and Lieb.
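For context, the trace function in question can be stated as follows (a standard phrasing of Lieb's 1973 theorem, not a quotation from the note):

```latex
% Lieb's 1973 theorem (the trace function in question):
% for any fixed self-adjoint matrix H, the map
\[
  A \;\longmapsto\; \operatorname{tr} \exp\bigl( H + \log A \bigr)
\]
% is concave on the cone of positive-definite matrices.
```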

Journal ArticleDOI
TL;DR: In this article, it was shown that it is possible to bound the expectation of a random matrix drawn from the Stiefel manifold in terms of the expected norm of a standard Gaussian matrix with the same dimensions.
Abstract: This note demonstrates that it is possible to bound the expectation of an arbitrary norm of a random matrix drawn from the Stiefel manifold in terms of the expected norm of a standard Gaussian matrix with the same dimensions. A related comparison holds for any convex function of a random matrix drawn from the Stiefel manifold. For certain norms, a reversed inequality is also valid.

Journal ArticleDOI
TL;DR: In this article, the authors established the restricted isometry property for finite dimensional Gabor systems, that is, for families of time shifts of a randomly chosen window function, and developed bounds for a corresponding chaos process.
Abstract: We establish the restricted isometry property for finite dimensional Gabor systems, that is, for families of time–frequency shifts of a randomly chosen window function. We show that the $s$-th order restricted isometry constant of the associated $n \times n^2$ Gabor synthesis matrix is small provided $s \leq c \, n^{2/3} / \log^2 n$. This improves on previous estimates that exhibit quadratic scaling of $n$ in $s$. Our proof develops bounds for a corresponding chaos process.