Open Access

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

TLDR
In this article, the authors present a modular framework for constructing randomized algorithms that compute partial matrix decompositions, which use random sampling to identify a subspace that captures most of the action of a matrix.
Abstract
Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed—either explicitly or implicitly—to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops), in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
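The snippet below is a minimal NumPy sketch of the two-stage approach described in the abstract: random sampling to obtain a basis for a subspace that captures most of the action of the matrix, followed by a deterministic factorization of the compressed matrix. The function name, the oversampling parameter, and the Gaussian test matrix are illustrative choices rather than the paper's prescriptions; in particular, the O(mn log(k)) cost quoted above requires a structured (e.g., subsampled random Fourier) test matrix, whereas the Gaussian sampling used here costs O(mnk).

```python
# Minimal sketch of a randomized partial SVD, assuming a dense NumPy matrix A.
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    rng = np.random.default_rng(seed)
    m, n = A.shape
    # Stage A: sample the range of A with a random Gaussian test matrix.
    Omega = rng.standard_normal((n, k + oversample))
    Y = A @ Omega                      # captures most of the action of A
    Q, _ = np.linalg.qr(Y)             # orthonormal basis for the sampled subspace
    # Stage B: compress A to the subspace and factor deterministically.
    B = Q.T @ A                        # (k + oversample) x n reduced matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ Ub                         # lift the left factors back to R^m
    return U[:, :k], s[:k], Vt[:k, :]

# Usage: rank-5 approximation of a random 500 x 200 matrix.
A = np.random.default_rng(1).standard_normal((500, 200))
U, s, Vt = randomized_svd(A, k=5)
print(np.linalg.norm(A - (U * s) @ Vt))   # error of the rank-5 approximation
```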


Citations
Proceedings Article

Low rank approximation and regression in input sparsity time

TL;DR: The fastest known algorithms are obtained for overconstrained least-squares regression, low-rank approximation, approximating all leverage scores, and ℓp-regression.
Journal Article

Low-Rank Approximation and Regression in Input Sparsity Time

TL;DR: A new distribution over m × n matrices S is designed so that, for any fixed n × d matrix A of rank r, with probability at least 9/10, ∥SAx∥₂ = (1 ± ε)∥Ax∥₂ simultaneously for all x ∈ ℝ^d.
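As a rough illustration of the kind of sparse subspace embedding this summary describes, the sketch below builds S implicitly: each column of S carries a single random ±1 entry, so the product SA can be formed in one pass over the rows of A. The helper name and the sketch size are assumptions for illustration, not parameters taken from the paper.

```python
# Minimal sketch of a sparse (CountSketch-style) embedding applied to a tall matrix A.
import numpy as np

def sparse_embedding(A, sketch_rows, seed=0):
    rng = np.random.default_rng(seed)
    n, d = A.shape
    rows = rng.integers(0, sketch_rows, size=n)   # target row of S for each column of S
    signs = rng.choice([-1.0, 1.0], size=n)       # random sign for each column of S
    SA = np.zeros((sketch_rows, d))
    for i in range(n):                            # one pass over the rows of A
        SA[rows[i]] += signs[i] * A[i]
    return SA

# Usage: the sketch approximately preserves ||Ax|| for a fixed vector x.
rng = np.random.default_rng(1)
A = rng.standard_normal((10000, 20))
x = rng.standard_normal(20)
SA = sparse_embedding(A, sketch_rows=2000)
print(np.linalg.norm(A @ x), np.linalg.norm(SA @ x))
```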
Journal Article

Reducing Snapshots to Points: A Visual Analytics Approach to Dynamic Network Exploration

TL;DR: This approach enables users to detect stable states, recurring states, and outlier topologies, and to gain knowledge about the transitions between states and the network evolution in general; it is demonstrated on artificial and real-world dynamic networks.
Proceedings Article

Fine-grained visual categorization via multi-stage metric learning

TL;DR: This paper proposes a multi-stage metric learning framework that divides the large-scale high-dimensional learning problem into a series of simple subproblems, achieving O(d) computational complexity.
Posted Content

Low Rank Approximation and Regression in Input Sparsity Time

TL;DR: In this article, a new distribution over poly(r ε⁻¹) × n matrices S, called sparse embedding matrices, is proposed; the product SA can be computed in O(nnz(A)) + poly(d ε⁻¹) time, where nnz(A) denotes the number of nonzero entries of A.
References
Book

Matrix Analysis

TL;DR: The authors present results of both classic and recent matrix analysis, using canonical forms as a unifying theme, and demonstrate their importance in a variety of applications of linear algebra and matrix theory.
Book

Compressed sensing

TL;DR: It is possible to design n = O(N log(m)) nonadaptive measurements that allow reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients; a good approximation to those N coefficients can then be extracted from the n measurements by solving a linear program, known in signal processing as Basis Pursuit.
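A small worked example of the recovery step mentioned here, under illustrative assumptions: a sparse signal is measured with a random Gaussian matrix and reconstructed by Basis Pursuit, written as a linear program and handed to scipy.optimize.linprog. The variable names, problem sizes, and choice of measurement matrix are not taken from the paper.

```python
# Minimal Basis Pursuit sketch: min ||x||_1 subject to Phi @ x = y, solved as an LP.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n_meas, sparsity = 200, 60, 5            # signal length, measurements, nonzeros
x_true = np.zeros(m)
x_true[rng.choice(m, sparsity, replace=False)] = rng.standard_normal(sparsity)
Phi = rng.standard_normal((n_meas, m)) / np.sqrt(n_meas)   # nonadaptive measurement matrix
y = Phi @ x_true                                           # observed measurements

# Split x = u - v with u, v >= 0 so that ||x||_1 = sum(u + v) is linear.
c = np.ones(2 * m)
A_eq = np.hstack([Phi, -Phi])
res = linprog(c, A_eq=A_eq, b_eq=y, bounds=(0, None), method="highs")
x_hat = res.x[:m] - res.x[m:]
print(np.linalg.norm(x_hat - x_true))       # small when enough measurements are taken
```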
Journal Article

The Monte Carlo method.

TL;DR: In this paper, the authors present a statistical approach to the study of integro-differential equations that occur in various branches of the natural sciences, such as biology and chemistry.
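As a bare-bones reminder of what the method amounts to in practice, the snippet below estimates a definite integral by averaging random samples; the integrand and sample count are arbitrary illustrative choices, not drawn from the paper.

```python
# Minimal Monte Carlo sketch: estimate the integral of exp(-x^2) on [0, 1] by sampling.
import numpy as np

rng = np.random.default_rng(0)
N = 100_000
samples = rng.uniform(0.0, 1.0, size=N)             # uniform draws on [0, 1]
values = np.exp(-samples**2)                        # integrand evaluated at the samples
estimate = values.mean()                            # Monte Carlo average
stderr = values.std() / np.sqrt(N)                  # statistical error decays like 1/sqrt(N)
print(estimate, "+/-", stderr)                      # true value is about 0.7468
```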