Author

Nathan Halko

Bio: Nathan Halko is an academic researcher from the University of Colorado Boulder. The author has contributed to research on topics including singular value decomposition and QR decomposition. The author has an h-index of 6 and has co-authored 6 publications receiving 5,561 citations.

Papers

Journal Article•DOI•

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp¹•Institutions (1)

California Institute of Technology¹

01 May 2011-SIAM Review

TL;DR: This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation, and presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions.

Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the $k$ dominant components of the singular value decomposition of an $m \times n$ matrix. (i) For a dense input matrix, randomized algorithms require $O(mn \log(k))$ floating-point operations (flops) in contrast to $O(mnk)$ for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to $O(k)$ passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.
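The two-stage framework described in this abstract (random sampling to find a subspace, then a deterministic factorization of the compressed matrix) can be sketched roughly as follows. This is a minimal illustration assuming NumPy and a Gaussian test matrix; the function name `randomized_svd` and the parameters `oversample` and `seed` are hypothetical choices made here, not taken from the paper. The Gaussian variant shown costs on the order of $mnk$ flops; the $O(mn \log(k))$ count quoted above relies on a structured random matrix, which this sketch does not implement.

```python
import numpy as np

def randomized_svd(A, k, oversample=10, seed=0):
    """Approximate the k dominant singular triplets of A (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    ell = min(n, k + oversample)  # number of random samples (assumed oversampling rule)

    # Stage A: random sampling identifies a subspace capturing most of the action of A.
    Omega = rng.standard_normal((n, ell))  # Gaussian test matrix
    Y = A @ Omega                          # samples of the range of A
    Q, _ = np.linalg.qr(Y)                 # orthonormal basis for the sampled range

    # Stage B: compress A to that subspace and factor the small matrix deterministically.
    B = Q.T @ A
    U_small, s, Vt = np.linalg.svd(B, full_matrices=False)
    U = Q @ U_small
    return U[:, :k], s[:k], Vt[:k, :]

# Usage on a synthetic matrix of exact rank 5 (made-up test data).
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 5)) @ rng.standard_normal((5, 80))
U, s, Vt = randomized_svd(A, k=5)
rel_err = np.linalg.norm(A - (U * s) @ Vt) / np.linalg.norm(A)
print(f"relative error: {rel_err:.2e}")
```

Because the test matrix has exact rank 5, the sampled subspace contains its whole range and the reconstruction error is at the level of rounding. For matrices with slowly decaying singular values the error is larger, and the sketch above makes no attempt to control it.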

3,248 citations

Posted Content•

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp¹•Institutions (1)

California Institute of Technology¹

22 Sep 2009-arXiv: Numerical Analysis

TL;DR: This article presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. The methods use random sampling to identify a subspace that captures most of the action of a matrix; the input matrix is then compressed to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization.

2,356 citations

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp

01 Jan 2011

TL;DR: The authors present a modular framework for constructing randomized algorithms that compute partial matrix decompositions; these methods use random sampling to identify a subspace that captures most of the action of a matrix.

Abstract: Low-rank matrix approximations, such as the truncated singular value decomposition and the rank-revealing QR decomposition, play a central role in data analysis and scientific computing. This work surveys and extends recent research which demonstrates that randomization offers a powerful tool for performing low-rank matrix approximation. These techniques exploit modern computational architectures more fully than classical methods and open the possibility of dealing with truly massive data sets. This paper presents a modular framework for constructing randomized algorithms that compute partial matrix decompositions. These methods use random sampling to identify a subspace that captures most of the action of a matrix. The input matrix is then compressed, either explicitly or implicitly, to this subspace, and the reduced matrix is manipulated deterministically to obtain the desired low-rank factorization. In many cases, this approach beats its classical competitors in terms of accuracy, robustness, and/or speed. These claims are supported by extensive numerical experiments and a detailed error analysis. The specific benefits of randomized techniques depend on the computational environment. Consider the model problem of finding the k dominant components of the singular value decomposition of an m × n matrix. (i) For a dense input matrix, randomized algorithms require O(mn log(k)) floating-point operations (flops) in contrast to O(mnk) for classical algorithms. (ii) For a sparse input matrix, the flop count matches classical Krylov subspace methods, but the randomized approach is more robust and can easily be reorganized to exploit multiprocessor architectures. (iii) For a matrix that is too large to fit in fast memory, the randomized techniques require only a constant number of passes over the data, as opposed to O(k) passes for classical algorithms. In fact, it is sometimes possible to perform matrix approximation with a single pass over the data.

494 citations

Journal Article•DOI•

An Algorithm for the Principal Component Analysis of Large Data Sets

Nathan Halko, Per-Gunnar Martinsson, Yoel Shkolnisky, Mark Tygert¹•Institutions (1)

Yale University¹

01 Sep 2011-SIAM Journal on Scientific Computing

TL;DR: This work adapts a randomized method for principal component analysis (PCA) to data sets that are too large to be stored in random-access memory (RAM), and reports on the performance of the algorithm.

Abstract: Recently popularized randomized methods for principal component analysis (PCA) efficiently and reliably produce nearly optimal accuracy—even on parallel processors—unlike the classical (deterministic) alternatives. We adapt one of these randomized methods for use with data sets that are too large to be stored in random-access memory (RAM). (The traditional terminology is that our procedure works efficiently out-of-core.) We illustrate the performance of the algorithm via several numerical examples. For example, we report on the PCA of a data set stored on disk that is so large that less than a hundredth of it can fit in our computer's RAM.

281 citations

Randomized methods for computing low-rank approximations of matrices

Per-Gunnar Martinsson¹, Nathan Halko¹•Institutions (1)

University of Colorado Boulder¹

01 Jan 2012

TL;DR: The dissertation describes a set of randomized techniques for rapidly constructing a low-rank approximation to a matrix and presents a parallelized randomized scheme for computing a reduced rank Singular Value Decomposition.

Abstract: Randomized sampling techniques have recently proved capable of efficiently solving many standard problems in linear algebra, and enabling computations at scales far larger than what was previously possible. The new algorithms are designed from the bottom up to perform well in modern computing environments where the expense of communication is the primary constraint. In extreme cases, the algorithms can even be made to work in a streaming environment where the matrix is not stored at all, and each element can be seen only once. The dissertation describes a set of randomized techniques for rapidly constructing a low-rank approximation to a matrix. The algorithms are presented in a modular framework that first computes an approximation to the range of the matrix via randomized sampling. Secondly, the matrix is projected to the approximate range, and a factorization (SVD, QR, LU, etc.) of the resulting low-rank matrix is computed via variations of classical deterministic methods. Theoretical performance bounds are provided. Particular attention is given to very large scale computations where the matrix does not fit in RAM on a single workstation. Algorithms are developed for the case where the original matrix must be stored out-of-core but where the factors of the approximation fit in RAM. Numerical examples are provided that perform Principal Component Analysis of a data set that is so large that less than one hundredth of it can fit in the RAM of a standard laptop computer. Furthermore, the dissertation presents a parallelized randomized scheme for computing a reduced rank Singular Value Decomposition. By parallelizing and distributing both the randomized sampling stage and the processing of the factors in the approximate factorization, the method requires an amount of memory per node which is independent of both dimensions of the input matrix. 
Numerical experiments are performed on Hadoop clusters of computers in Amazon's Elastic Compute Cloud with up to 64 total cores. Finally, we directly compare the performance and accuracy of the randomized algorithm with the classical Lanczos method on extremely large, sparse matrices and substantiate the claim that randomized methods are superior in this environment.

43 citations

Cited by

Journal Article•DOI•

Finding Structure with Randomness: Probabilistic Algorithms for Constructing Approximate Matrix Decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp¹•Institutions (1)

California Institute of Technology¹

01 May 2011-SIAM Review

3,248 citations

Posted Content•

Finding structure with randomness: Probabilistic algorithms for constructing approximate matrix decompositions

Nathan Halko, Per-Gunnar Martinsson, Joel A. Tropp¹•Institutions (1)

California Institute of Technology¹

22 Sep 2009-arXiv: Numerical Analysis

2,356 citations

Journal Article•DOI•

Exact matrix completion via convex optimization

Emmanuel J. Candès¹, Benjamin Recht²•Institutions (2)

Stanford University¹, University of Wisconsin-Madison²

01 Jun 2012-Communications of The ACM

TL;DR: This paper shows that, in very general settings, all missing entries of a low-rank matrix can be recovered perfectly from most sufficiently large subsets of observed entries by solving a convex program that finds the minimum-nuclear-norm matrix agreeing with those entries.

Abstract: Suppose that one observes an incomplete subset of entries selected from a low-rank matrix. When is it possible to complete the matrix and recover the entries that have not been seen? We demonstrate that in very general settings, one can perfectly recover all of the missing entries from most sufficiently large subsets by solving a convex programming problem that finds the matrix with the minimum nuclear norm agreeing with the observed entries. The techniques used in this analysis draw upon parallels in the field of compressed sensing, demonstrating that objects other than signals and images can be perfectly reconstructed from very limited information.
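The convex program mentioned in this abstract can be written out as follows. The symbols are notation assumed here for illustration: $M$ is the unknown low-rank matrix, $\Omega$ the set of index pairs whose entries are observed, and $\sigma_i(X)$ the singular values of the candidate matrix $X$.

```latex
\begin{aligned}
  &\underset{X}{\text{minimize}}   && \|X\|_{*} \;=\; \sum_{i} \sigma_i(X) \\
  &\text{subject to}               && X_{ij} = M_{ij}, \quad (i,j) \in \Omega .
\end{aligned}
```

The nuclear norm $\|X\|_{*}$ serves as a convex surrogate for the rank of $X$, which is what makes the problem tractable.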

2,327 citations

Journal Article•DOI•

User-Friendly Tail Bounds for Sums of Random Matrices

Joel A. Tropp¹•Institutions (1)

California Institute of Technology¹

01 Aug 2012-Foundations of Computational Mathematics

TL;DR: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices and provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid.

Abstract: This paper presents new probability inequalities for sums of independent, random, self-adjoint matrices. These results place simple and easily verifiable hypotheses on the summands, and they deliver strong conclusions about the large-deviation behavior of the maximum eigenvalue of the sum. Tail bounds for the norm of a sum of random rectangular matrices follow as an immediate corollary. The proof techniques also yield some information about matrix-valued martingales. In other words, this paper provides noncommutative generalizations of the classical bounds associated with the names Azuma, Bennett, Bernstein, Chernoff, Hoeffding, and McDiarmid. The matrix inequalities promise the same diversity of application, ease of use, and strength of conclusion that have made the scalar inequalities so valuable.
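As one concrete instance of the kind of inequality this abstract describes, the bounded-case matrix Bernstein bound is commonly stated as below. The notation is assumed here for illustration: $X_k$ are independent, random, self-adjoint $d \times d$ matrices with $\mathbb{E}\,X_k = 0$ and $\lambda_{\max}(X_k) \le R$ almost surely.

```latex
\mathbb{P}\Bigl\{ \lambda_{\max}\Bigl( \sum_{k} X_k \Bigr) \ge t \Bigr\}
  \;\le\; d \cdot \exp\!\Bigl( \frac{-t^{2}/2}{\sigma^{2} + R t / 3} \Bigr)
  \quad \text{for all } t \ge 0,
\qquad
\sigma^{2} := \Bigl\| \sum_{k} \mathbb{E}\bigl( X_k^{2} \bigr) \Bigr\| .
```

Setting $d = 1$ recovers the shape of the classical scalar Bernstein inequality; the dimensional factor $d$ is the price paid for working with matrices.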

1,675 citations

Journal Article•DOI•

Ising formulations of many NP problems

Andrew Lucas¹•Institutions (1)

Harvard University¹

01 Feb 2014-Frontiers in Physics

TL;DR: This work collects and extends mappings to the Ising model from partitioning, covering and satisfiability, and provides Ising formulations for many NP-complete and NP-hard problems, including all of Karp's 21 NP-complete problems.

Abstract: We provide Ising formulations for many NP-complete and NP-hard problems, including all of Karp's 21 NP-complete problems. This collects and extends mappings to the Ising model from partitioning, covering and satisfiability. In each case, the required number of spins is at most cubic in the size of the problem. This work may be useful in designing adiabatic quantum optimization algorithms.
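To make the idea of an Ising formulation concrete, here is a small sketch for number partitioning, one of the partitioning problems of the kind the abstract mentions. It assumes the standard energy $H = (\sum_i n_i s_i)^2$ with spins $s_i \in \{-1, +1\}$, so that a zero-energy state corresponds to a perfect split. The function names and the input list are hypothetical, and the brute-force search stands in for an actual optimizer purely for illustration.

```python
from itertools import product

def ising_energy(numbers, spins):
    """Energy H = (sum_i n_i * s_i)^2 for spins s_i in {-1, +1}."""
    return sum(n * s for n, s in zip(numbers, spins)) ** 2

def ground_state(numbers):
    """Brute-force minimiser over all spin assignments.

    Exponential in len(numbers); only meant to illustrate the mapping.
    """
    return min(
        product((-1, 1), repeat=len(numbers)),
        key=lambda spins: ising_energy(numbers, spins),
    )

# Usage on a made-up instance whose total is 10, so a 5/5 split exists.
numbers = [3, 1, 1, 2, 2, 1]
spins = ground_state(numbers)
energy = ising_energy(numbers, spins)
left = [n for n, s in zip(numbers, spins) if s == 1]
right = [n for n, s in zip(numbers, spins) if s == -1]
print(energy, left, right)
```

Expanding the square gives pairwise couplings proportional to $n_i n_j$, and this instance uses one spin per number, well within the cubic bound quoted in the abstract.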

1,604 citations
