
Showing papers by "Joel A. Tropp" published in 2012


Journal ArticleDOI
TL;DR: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices and provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid.
Abstract: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices. These results place simple and easily verifiable hypotheses on the summands, and they deliver strong conclusions about the large-deviation behavior of the maximum eigenvalue of the sum. Tail bounds for the norm of a sum of random rectangular matrices follow as an immediate corollary. The proof techniques also yield some information about matrix-valued martingales. In other words, this paper provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid. The matrix inequalities promise the same diversity of application, ease of use, and strength of conclusion that have made the scalar inequalities so valuable.
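
For orientation, one representative member of this family is the matrix Bernstein inequality for bounded summands. The statement below paraphrases the bounded case: the X_k are independent, centered, self-adjoint d-dimensional random matrices with λ_max(X_k) ≤ R almost surely, and t ≥ 0.

```latex
% Matrix Bernstein, bounded case: X_k independent, self-adjoint, d x d,
% with E[X_k] = 0 and lambda_max(X_k) <= R almost surely.
\mathbb{P}\left\{ \lambda_{\max}\Bigl(\textstyle\sum_k X_k\Bigr) \ge t \right\}
  \;\le\; d \cdot \exp\!\left( \frac{-t^2/2}{\sigma^2 + Rt/3} \right),
\qquad \sigma^2 := \Bigl\lVert \textstyle\sum_k \mathbb{E}\bigl[X_k^2\bigr] \Bigr\rVert .
```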

1,675 citations


Journal ArticleDOI
TL;DR: This paper demonstrates that the sth-order restricted isometry constant is small when the number m of samples satisfies m ≳ (s log n)^(3/2), where n is the length of the pulse.

212 citations


Posted Content
TL;DR: In this article, a data-driven nonnegative matrix factorization (NMF) algorithm based on linear programming is proposed, where the most salient features in the data are used to express the remaining features.
Abstract: This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that satisfies X ≈ CX along with some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.
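
As a concrete illustration of the self-expression model X ≈ CX, the sketch below builds an exactly separable nonnegative matrix, picks out salient rows with a successive-projection heuristic (a stand-in for the paper's linear program, not the algorithm itself), and then expresses every row as a nonnegative combination of the selected ones. All names are illustrative.

```python
import numpy as np
from scipy.optimize import nnls

# Toy separable data: the first r rows of X are the salient features themselves,
# and every other row is a convex combination of them.
rng = np.random.default_rng(0)
r, n, p = 3, 12, 8
H = rng.random((r, p))                      # salient features
W = rng.random((n, r))
W[:r] = np.eye(r)                           # embed the features as rows of X
W[r:] /= W[r:].sum(axis=1, keepdims=True)   # mixtures are convex combinations
X = W @ H

def select_rows(X, r):
    """Successive projection: repeatedly take the row of largest norm and
    project the remaining rows off its direction (a stand-in for the LP)."""
    R, chosen = X.copy(), []
    for _ in range(r):
        j = int(np.argmax(np.linalg.norm(R, axis=1)))
        chosen.append(j)
        u = R[j] / np.linalg.norm(R[j])
        R = R - np.outer(R @ u, u)
    return chosen

rows = select_rows(X, r)

# Express every row of X in the selected rows: X ~= C X, with C supported
# on the chosen columns and nonnegative weights found by NNLS.
C = np.zeros((n, n))
for i in range(n):
    coeffs, _ = nnls(X[rows].T, X[i])
    C[i, rows] = coeffs

print(np.allclose(C @ X, X, atol=1e-8))     # True on exactly separable data
```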

200 citations


Proceedings Article
03 Dec 2012
TL;DR: A data-driven model for the factorization, in which the most salient features in the data are used to express the remaining features; the method extends to more general noise models and leads to efficient, scalable algorithms.
Abstract: This paper describes a new approach, based on linear programming, for computing nonnegative matrix factorizations (NMFs). The key idea is a data-driven model for the factorization where the most salient features in the data are used to express the remaining features. More precisely, given a data matrix X, the algorithm identifies a matrix C that satisfies X ≈ CX and some linear constraints. The constraints are chosen to ensure that the matrix C selects features; these features can then be used to find a low-rank NMF of X. A theoretical analysis demonstrates that this approach has guarantees similar to those of the recent NMF algorithm of Arora et al. (2012). In contrast with this earlier work, the proposed method extends to more general noise models and leads to efficient, scalable algorithms. Experiments with synthetic and real datasets provide evidence that the new approach is also superior in practice. An optimized C++ implementation can factor a multigigabyte matrix in a matter of minutes.

119 citations


Journal ArticleDOI
TL;DR: In this article, a convex optimization problem, called REAPER, is described, which can reliably fit a low-dimensional model to data consisting of noisy inliers and outliers; it reaches the convex formulation through a relaxation of the set of orthogonal projectors.
Abstract: Consider a dataset of vector-valued observations that consists of noisy inliers, which are explained well by a low-dimensional subspace, along with some number of outliers. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. This approach parameterizes linear subspaces using orthogonal projectors, and it uses a relaxation of the set of orthogonal projectors to reach the convex formulation. The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data. In addition, when the inliers lie near a low-dimensional subspace, there is a rigorous theory that describes when REAPER can approximate this subspace.
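
For reference, the convex program has the following shape (paraphrased from the paper's formulation; the x_i denote the observations and d the target subspace dimension, with the constraint set relaxing the nonconvex family of rank-d orthogonal projectors):

```latex
\begin{aligned}
\underset{P}{\text{minimize}} \quad & \sum_{i} \bigl\lVert x_i - P x_i \bigr\rVert_2 \\
\text{subject to} \quad & 0 \preceq P \preceq \mathrm{I}, \qquad \operatorname{tr} P = d .
\end{aligned}
```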

98 citations


Posted Content
TL;DR: In this paper, a framework based on convex optimization is proposed for demixing: identifying two structured signals given only their sum and prior information about their structures.
Abstract: Demixing refers to the challenge of identifying two structured signals given only the sum of the two signals and prior information about their structures. Examples include the problem of separating a signal that is sparse with respect to one basis from a signal that is sparse with respect to a second basis, and the problem of decomposing an observed matrix into a low-rank matrix plus a sparse matrix. This paper describes and analyzes a framework, based on convex optimization, for solving these demixing problems, and many others. This work introduces a randomized signal model which ensures that the two structures are incoherent, i.e., generically oriented. For an observation from this model, this approach identifies a summary statistic that reflects the complexity of a particular signal. The difficulty of separating two structured, incoherent signals depends only on the total complexity of the two structures. Some applications include (i) demixing two signals that are sparse in mutually incoherent bases; (ii) decoding spread-spectrum transmissions in the presence of impulsive errors; and (iii) removing sparse corruptions from a low-rank matrix. In each case, the theoretical analysis of the convex demixing method closely matches its empirical behavior.
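
A minimal sketch of application (i) appears below: two components, one sparse in the standard basis and one sparse in the (incoherent) DCT basis, are separated from their sum by minimizing a sum of l1 norms subject to the observation constraint. This uses cvxpy as a generic solver and illustrates the convex demixing idea rather than the paper's own code; recovery is typically exact for sufficiently sparse, generically oriented components.

```python
import numpy as np
import cvxpy as cp
from scipy.fft import dct

rng = np.random.default_rng(1)
n = 128
D = dct(np.eye(n), axis=0, norm="ortho")   # orthonormal DCT basis, incoherent with the identity

# Ground truth: a few spikes plus a few DCT atoms.
x0 = np.zeros(n); x0[rng.choice(n, 3, replace=False)] = rng.standard_normal(3)
y0 = np.zeros(n); y0[rng.choice(n, 3, replace=False)] = rng.standard_normal(3)
z = x0 + D @ y0                            # the observed superposition

# Convex demixing: minimize a sum of sparsity surrogates subject to the data.
x, y = cp.Variable(n), cp.Variable(n)
problem = cp.Problem(cp.Minimize(cp.norm1(x) + cp.norm1(y)), [x + D @ y == z])
problem.solve()

print(np.allclose(x.value, x0, atol=1e-5), np.allclose(y.value, y0, atol=1e-5))
```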

85 citations


ReportDOI
17 Feb 2012
TL;DR: The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data.
Abstract: Consider a dataset of vector-valued observations that consists of a modest number of noisy inliers, which are explained well by a low-dimensional subspace, along with a large number of outliers, which have no linear structure. This work describes a convex optimization problem, called REAPER, that can reliably fit a low-dimensional model to this type of data. The paper provides an efficient algorithm for solving the REAPER problem, and it documents numerical experiments which confirm that REAPER can dependably find linear structure in synthetic and natural data. In addition, when the inliers are contained in a low-dimensional subspace, there is a rigorous theory that describes when REAPER can recover the subspace exactly.

57 citations


Posted Content
08 May 2012
TL;DR: This work introduces a randomized signal model which ensures that the two structures are incoherent, i.e., generically oriented, and it describes and analyzes a framework, based on convex optimization, for solving deconvolution problems, among many others.

44 citations


Journal ArticleDOI
TL;DR: In this paper, a short analysis of the masked covariance estimator by means of a matrix concentration inequality is provided, and it is shown that n = O(B log^2 p) samples suffice to estimate a banded covariance matrix with bandwidth B up to a relative spectral-norm error.
Abstract: Covariance estimation becomes challenging in the regime where the number p of variables outstrips the number n of samples available to construct the estimate. One way to circumvent this problem is to assume that the covariance matrix is nearly sparse and to focus on estimating only the significant entries. To analyze this approach, Levina and Vershynin (2011) introduce a formalism called masked covariance estimation, where each entry of the sample covariance estimator is reweighted to reflect an a priori assessment of its importance. This paper provides a short analysis of the masked sample covariance estimator by means of a matrix concentration inequality. The main result applies to general distributions with at least four moments. Specialized to the case of a Gaussian distribution, the theory offers qualitative improvements over earlier work. For example, the new results show that n = O(B log^2 p) samples suffice to estimate a banded covariance matrix with bandwidth B up to a relative spectral-norm error, in contrast to the sample complexity n = O(B log^5 p) obtained by Levina and Vershynin.
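
The masked estimator itself is simple to write down; the sketch below implements the formalism as described, reweighting each entry of the sample covariance by a mask M, here a banded 0-1 mask with bandwidth B. Names are illustrative.

```python
import numpy as np

def masked_sample_covariance(samples, mask):
    """Entrywise reweighting of the sample covariance by the mask:
    each entry of Sigma_hat is multiplied by the corresponding mask entry."""
    centered = samples - samples.mean(axis=0)
    sigma_hat = centered.T @ centered / len(samples)
    return mask * sigma_hat

# A banded 0-1 mask with bandwidth B keeps only entries near the diagonal.
p, B = 50, 3
idx = np.arange(p)
banded_mask = (np.abs(idx[:, None] - idx[None, :]) <= B).astype(float)

samples = np.random.default_rng(2).standard_normal((200, p))   # n = 200 draws
estimate = masked_sample_covariance(samples, banded_mask)
```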

41 citations


Journal ArticleDOI
01 May 2012
TL;DR: In this article, the authors provide a succinct proof of a 1973 theorem of Lieb that establishes the concavity of a trace function, relying on a deep result from quantum information theory, the joint convexity of quantum relative entropy, as well as a recent argument due to Carlen and Lieb.
Abstract: This paper provides a succinct proof of a 1973 theorem of Lieb that establishes the concavity of a certain trace function. The development relies on a deep result from quantum information theory, the joint convexity of quantum relative entropy, as well as a recent argument due to Carlen and Lieb.
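
The theorem in question states that, for each fixed self-adjoint matrix H, the trace function below is concave on the cone of positive-definite matrices:

```latex
A \;\longmapsto\; \operatorname{tr} \exp\bigl( H + \log A \bigr),
\qquad A \succ 0 .
```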

37 citations


Journal ArticleDOI
TL;DR: In this paper, a matrix extension of the scalar concentration theory developed by Sourav Chatterjee using Stein's method of exchangeable pairs is presented, yielding matrix generalizations of the classical inequalities due to Hoeffding, Bernstein, Khintchine, and Rosenthal.
Abstract: This paper derives exponential concentration inequalities and polynomial moment inequalities for the spectral norm of a random matrix. The analysis requires a matrix extension of the scalar concentration theory developed by Sourav Chatterjee using Stein's method of exchangeable pairs. When applied to a sum of independent random matrices, this approach yields matrix generalizations of the classical inequalities due to Hoeffding, Bernstein, Khintchine and Rosenthal. The same technique delivers bounds for sums of dependent random matrices and more general matrix-valued functions of dependent random variables.

ReportDOI
01 Feb 2012
TL;DR: In this article, a new analysis of the masked sample covariance estimator is presented, based on the matrix Laplace transform method; the main result applies to general subgaussian distributions.
Abstract: Covariance estimation becomes challenging in the regime where the number p of variables outstrips the number n of samples available to construct the estimate. One way to circumvent this problem is to assume that the covariance matrix is nearly sparse and to focus on estimating only the significant entries. To analyze this approach, Levina and Vershynin (2011) introduce a formalism called masked covariance estimation, where each entry of the sample covariance estimator is reweighted to reflect an a priori assessment of its importance. This paper provides a new analysis of the masked sample covariance estimator based on the matrix Laplace transform method. The main result applies to general subgaussian distributions. Specialized to the case of a Gaussian distribution, the theory offers qualitative improvements over earlier work. For example, the new results show that n = O(B log^2 p) samples suffice to estimate a banded covariance matrix with bandwidth B up to a relative spectral-norm error, in contrast to the sample complexity n = O(B log^5 p) obtained by Levina and Vershynin.

Journal ArticleDOI
TL;DR: In this paper, it is shown that the expectation of an arbitrary norm of a random matrix drawn from the Stiefel manifold can be bounded in terms of the expected norm of a standard Gaussian matrix with the same dimensions.
Abstract: This note demonstrates that it is possible to bound the expectation of an arbitrary norm of a random matrix drawn from the Stiefel manifold in terms of the expected norm of a standard Gaussian matrix with the same dimensions. A related comparison holds for any convex function of a random matrix drawn from the Stiefel manifold. For certain norms, a reversed inequality is also valid.