Open Access Journal Article

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

TL;DR
This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.
Abstract
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
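The two-stage approach described in the abstract (randomized range finding followed by deterministic factorization of the compressed matrix) is compact enough to sketch directly. Below is a minimal NumPy sketch of a basic randomized SVD in that style; the function name and the oversampling parameter `p` are illustrative choices, not notation from the paper.

```python
import numpy as np

def randomized_svd(A, k, p=10):
    """Approximate rank-k SVD of A via random sampling.

    Stage A: sample the range of A with a Gaussian test matrix and
    orthonormalize, so that A is approximately Q @ Q.T @ A.
    Stage B: compress A to that subspace and factor deterministically.
    """
    m, n = A.shape
    Omega = np.random.standard_normal((n, k + p))  # Gaussian test matrix
    Y = A @ Omega                                  # sample the range of A
    Q, _ = np.linalg.qr(Y)                         # orthonormal basis for the samples
    B = Q.T @ A                                    # small (k+p) x n compressed matrix
    U_hat, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ U_hat)[:, :k], s[:k], Vt[:k, :]    # lift left factors back to R^m
```

When the singular values decay slowly, accuracy can be improved by a few power iterations, i.e., sampling $(AA^T)^q A$ instead of $A$.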


Citations
Journal Article

Exact matrix completion via convex optimization

TL;DR: In this paper, a convex program is used to find the matrix of minimum nuclear norm that is consistent with the observed entries of a low-rank matrix; the authors show that most low-rank matrices can be recovered exactly from most sufficiently large sets of sampled entries.
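To make the convex program concrete, here is a minimal CVXPY sketch of nuclear-norm minimization subject to agreement on the observed entries; the synthetic low-rank matrix, the 50% sampling rate, and the solver defaults are assumptions for the illustration, not choices from the paper.

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n, r = 30, 30, 2
M = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank-r ground truth
mask = (rng.random((m, n)) < 0.5).astype(float)                # observed-entry pattern

X = cp.Variable((m, n))
objective = cp.Minimize(cp.normNuc(X))                        # nuclear norm = sum of singular values
constraints = [cp.multiply(mask, X) == cp.multiply(mask, M)]  # match observed entries
cp.Problem(objective, constraints).solve()

print(np.linalg.norm(X.value - M) / np.linalg.norm(M))        # relative recovery error
```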
Journal Article

User-Friendly Tail Bounds for Sums of Random Matrices

TL;DR: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices and provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid.
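For reference, a representative bound from this line of work is the matrix Bernstein inequality, stated here in its standard form as a worked illustration: if $X_1, \ldots, X_N$ are independent, zero-mean, self-adjoint random $d \times d$ matrices with $\|X_i\| \le L$ almost surely, and $\sigma^2 = \big\| \sum_i \mathbb{E}[X_i^2] \big\|$, then for all $t \ge 0$,

$$\mathbb{P}\Big\{ \lambda_{\max}\Big( \sum_i X_i \Big) \ge t \Big\} \le d \cdot \exp\Big( \frac{-t^2/2}{\sigma^2 + Lt/3} \Big),$$

which reduces to the scalar Bernstein inequality when $d = 1$.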
Journal Article

Ising formulations of many NP problems

TL;DR: This work collects and extends mappings to the Ising model from partitioning, covering, and satisfiability, and provides Ising formulations for many NP-complete and NP-hard problems, including all of Karp's 21 NP-complete problems.
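As a worked instance of the style of mapping collected in that paper, partitioning a set of positive numbers $\{n_1, \ldots, n_N\}$ into two subsets of equal sum corresponds to the Ising Hamiltonian

$$H = A \Big( \sum_{i=1}^{N} n_i s_i \Big)^2, \qquad s_i \in \{-1, +1\},$$

where the spin $s_i$ labels which side of the partition $n_i$ falls on and $A > 0$ is an overall scale; the ground state achieves $H = 0$ exactly when a perfect partition exists.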
Journal Article

Modal Analysis of Fluid Flows: An Overview

TL;DR: The intent of this document is to provide an introduction to modal analysis that is accessible to the larger fluid dynamics community; it presents a brief overview of several well-established techniques.
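Among the well-established techniques such an overview covers, the proper orthogonal decomposition (POD) is the one most directly tied to the SVD and can be sketched in a few lines. The following minimal NumPy sketch assumes flow snapshots stacked as the columns of a matrix; the function name is illustrative.

```python
import numpy as np

def pod_modes(snapshots, k):
    """Leading POD modes of a snapshot matrix (state dim x num snapshots).

    Subtract the temporal mean, then take an SVD of the fluctuations:
    left singular vectors are the spatial modes, and singular values
    rank the modes by captured energy.
    """
    mean = snapshots.mean(axis=1, keepdims=True)
    X = snapshots - mean                            # fluctuation field
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :k], s[:k]                          # modes and their amplitudes
```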
Journal Article

RASL: Robust Alignment by Sparse and Low-Rank Decomposition for Linearly Correlated Images

TL;DR: This paper reduces the batch image-alignment problem, an extremely challenging optimization problem, to a sequence of convex programs that minimize the sum of the $\ell_1$-norm and the nuclear norm of the two component matrices, which can be solved efficiently by scalable convex optimization techniques.
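Concretely, the underlying problem has the form

$$\min_{A, E, \tau} \; \|A\|_* + \lambda \|E\|_1 \quad \text{subject to} \quad D \circ \tau = A + E,$$

where $D \circ \tau$ denotes the stack of images after applying the transformations $\tau$, $A$ is the low-rank aligned component, and $E$ is the sparse error; the nonconvex constraint is handled by linearizing in $\tau$ and solving the resulting convex program at each iteration.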
References
Book

Matrix Computations

Gene H. Golub, Charles F. Van Loan
Book

Matrix Analysis

TL;DR: In this book, the authors present results of both classic and recent matrix analysis, using canonical forms as a unifying theme, and demonstrate their importance in a variety of applications.
Book

The Elements of Statistical Learning: Data Mining, Inference, and Prediction

TL;DR: In this book, the authors describe the important ideas in data mining, inference, and prediction within a common conceptual framework; the emphasis is on concepts rather than mathematics, with a liberal use of color graphics.
Book

Compressed sensing

TL;DR: It is possible to design $n = O(N \log(m))$ nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the $N$ most important coefficients, and a good approximation to those $N$ important coefficients is extracted from the $n$ measurements by solving a linear program known in signal processing as Basis Pursuit.
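The Basis Pursuit program referred to is the convex problem

$$\min_{x} \|x\|_1 \quad \text{subject to} \quad \Phi x = y,$$

where $y = \Phi x_0$ collects the $n = O(N \log(m))$ nonadaptive measurements; splitting $x$ into its positive and negative parts turns the $\ell_1$ objective into a linear program.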