Journal Article
Parallel QR Decomposition of a rectangular matrix
Abstract:
We show that the greedy algorithm introduced in [1] and [5] to perform the parallel QR decomposition of a dense rectangular matrix of size m×n is optimal. Then we assume that m/n² tends to zero as m and n go to infinity, and prove that the complexity of such a decomposition is asymptotically 2n, when an unlimited number of processors is available.
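The step count discussed in the abstract can be explored with a small simulation. The sketch below is not the authors' code. It assumes a simplified model in which every Givens rotation costs one time step, an unlimited number of processors is available, and a rotation may combine any two rows that have the same number of leading zeros. The function name `greedy_givens_steps` and the bookkeeping array `count` are illustrative choices, not taken from the paper.

```python
def greedy_givens_steps(m: int, n: int) -> int:
    """Count parallel time steps of a greedy Givens elimination.

    Assumed model (a simplification, not the paper's exact formulation):
    - the matrix is dense, m x n, with m >= n >= 1;
    - a rotation pairs two rows with the same number k of leading zeros
      and zeroes the entry in column k of one of them;
    - all rotations on disjoint row pairs run in the same time step.
    """
    if n < 1 or m < n:
        raise ValueError("expected m >= n >= 1")

    # count[k] = number of rows whose first k entries are already zero
    count = [0] * (n + 1)
    count[0] = m

    steps = 0
    # Column k is finished once at most one row still has exactly k leading zeros.
    while any(count[k] > 1 for k in range(n)):
        # Greedy choice: in every column, annihilate as many entries as possible,
        # based on the state at the beginning of the step.
        moved = [count[k] // 2 for k in range(n)]
        for k in range(n):
            count[k] -= moved[k]
            count[k + 1] += moved[k]
        steps += 1
    return steps


if __name__ == "__main__":
    # Print steps / n for a few square sizes, to compare informally
    # with the asymptotic value 2 stated in the abstract.
    for n in (8, 64, 512):
        s = greedy_givens_steps(n, n)
        print(n, s, s / n)
```

Under this model, a 3×2 matrix needs three steps: two successive rotations clear the first column, and a third clears the remaining subdiagonal entry of the second column.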
Citations
Journal Article
Communication-optimal Parallel and Sequential QR and LU Factorizations
TL;DR: Two parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR are presented.
Posted Content
Communication-optimal parallel and sequential QR and LU factorizations
TL;DR: In this article, the authors present parallel and sequential dense QR factorization algorithms that are both optimal (up to polylogarithmic factors) in the amount of communication they perform, and just as stable as Householder QR.
Proceedings Article
Algorithm-based fault tolerance for dense matrix factorizations
TL;DR: A new hybrid approach based on Algorithm-Based Fault Tolerance (ABFT) is presented to help matrix factorization algorithms survive fail-stop failures; theoretical analysis shows that the fault tolerance overhead decreases sharply as the number of computing units and the problem size scale up.
Journal Article
Distributed orthogonal factorization: Givens and Householder algorithms
Alex Pothen, Padma Raghavan +1 more
TL;DR: The hybrid algorithm is the fastest algorithm overall, since its arithmetic cost is lower than the Householder algorithms and its communication cost does not increase with the column length of the matrix.
Posted Content
Communication-avoiding parallel and sequential QR factorizations
TL;DR: Both parallel and sequential performance results show that TSQR outperforms competing methods; CAQR (Communication-Avoiding QR), which factors general rectangular matrices distributed in a two-dimensional block cyclic layout, removes a latency bottleneck in ScaLAPACK's current parallel approach.
References
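Journal Article
Very high-speed computing systems
TL;DR: In this paper, the authors classify very high-speed computers as follows: 1) Single Instruction Stream-Single Data Stream (SISD), 2) Single Instruction Stream-Multiple Data Stream (SIMD), 3) Multiple Instruction Stream-Single Data Stream (MISD), 4) Multiple Instruction Stream-Multiple Data Stream (MIMD).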
Journal Article
A Survey of Parallel Algorithms in Numerical Linear Algebra
TL;DR: A comprehensive survey of parallel techniques for problems in linear algebra is given. Specific topics include relevant computer models and their consequences for programs, evaluation of arithmetic expressions, solution of general and special linear systems of equations, and computation of eigenvalues.
Journal Article
On Stable Parallel Linear System Solvers
Ahmed H. Sameh, David J. Kuck +1 more
TL;DR: Three stable parallel algorithms for solving dense and tridiagonal systems of linear equations are discussed and one of the algorithms presented here is superior to the best previous algorithm in that with a modest increase in time.
Journal Article
Solving Linear Algebraic Equations on an MIMD Computer
TL;DR: Two practical parallel algorithms for solving systems of dense linear equations on an MIMD computer are presented, based on Gaussian elimination and Givens transformations, which are numerically stable and have been tested on the Denelcor HEP machine.
Journal Article
An alternative Givens ordering
J. J. Modi, M. R. Clarke +1 more
TL;DR: In this paper, a new Givens ordering is shown, empirically and by an approximate theoretical analysis, to take appreciably fewer stages than the standard Givens ordering.