Journal ArticleDOI
A direct tridiagonal solver based on Givens rotations for GPU architectures
Ioannis E. Venetis, Alexandros Kouris, Alexandros Sobczyk, Efstratios Gallopoulos, Ahmed H. Sameh
- Vol. 49, pp 101-116
TLDR
A parallel solver for general tridiagonal irreducible systems and its CUDA implementation are described, indicating that g-Spike is competitive in runtime with existing GPU methods, and can provide acceptable results when other methods cannot be applied or fail.
Abstract:
A parallel solver for general tridiagonal irreducible systems is described. Solver based on Spike framework and Givens-QR with occasional low-rank modification. Modifications handle singularities exposed by QR in blocks of the parallel partition. The GPU implementation has similar performance to existing methods. Method returns accurate results when current GPU tridiagonal solvers fail.
g-Spike, a parallel algorithm for solving general nonsymmetric tridiagonal systems for the GPU, and its CUDA implementation are described. The solver is based on the Spike framework, applying Givens rotations and QR factorization without pivoting. It also implements a low-rank modification strategy to compute the Spike DS decomposition even when the partitioning defines singular submatrices along the diagonal. The method is also used to solve the reduced system resulting from the Spike partitioning. Numerical experiments with problems of high order indicate that g-Spike is competitive in runtime with existing GPU methods, and can provide acceptable results when other methods cannot be applied or fail.
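The abstract's core building block, QR factorization of a tridiagonal matrix by Givens rotations without pivoting, can be illustrated with a short sequential sketch. This is not the g-Spike algorithm or its CUDA code: it omits the Spike partitioning, the reduced system, and the low-rank modification, and it simply raises an error if a zero appears on the diagonal of R. The function name `givens_tridiagonal_solve` and the three-array storage layout are assumptions made for this illustration.

```python
import math


def givens_tridiagonal_solve(lower, diag, upper, rhs):
    """Solve a tridiagonal system A x = rhs via Givens-QR (no pivoting).

    Hypothetical layout: lower[i] = A[i][i-1] (lower[0] unused),
    diag[i] = A[i][i], upper[i] = A[i][i+1] (upper[n-1] unused).
    """
    n = len(diag)
    # R is upper triangular with three nonzero diagonals:
    # r0 (main), r1 (first superdiagonal), r2 (second superdiagonal).
    r0 = [float(v) for v in diag]
    r1 = [float(upper[i]) if i < n - 1 else 0.0 for i in range(n)]
    r2 = [0.0] * n
    y = [float(v) for v in rhs]  # becomes Q^T * rhs

    for i in range(n - 1):
        a, b = r0[i], float(lower[i + 1])
        h = math.hypot(a, b)
        if h == 0.0:
            continue  # column is already zero; R[i][i] stays zero
        c, s = a / h, b / h
        # Rotate rows i and i+1 so that the subdiagonal entry becomes zero.
        r0[i] = h
        t1, t2 = r1[i], r0[i + 1]
        r1[i] = c * t1 + s * t2
        r0[i + 1] = -s * t1 + c * t2
        t3 = r1[i + 1]  # row i has no entry in column i+2 before this step
        r2[i] = s * t3
        r1[i + 1] = c * t3
        y[i], y[i + 1] = c * y[i] + s * y[i + 1], -s * y[i] + c * y[i + 1]

    # Back substitution with the three diagonals of R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        acc = y[i]
        if i + 1 < n:
            acc -= r1[i] * x[i + 1]
        if i + 2 < n:
            acc -= r2[i] * x[i + 2]
        if r0[i] == 0.0:
            raise ZeroDivisionError("R is singular; this sketch has no fallback")
        x[i] = acc / r0[i]
    return x


# A nonsingular system with zeros on the main diagonal, where Gaussian
# elimination without pivoting would divide by zero at the first step.
x = givens_tridiagonal_solve([0, 1, 1], [0, 0, 1], [1, 1, 0], [2, 4, 5])
print(x)
```

The example matrix is irreducible (all off-diagonal entries are nonzero) and has exact solution (1, 2, 3). The rotations need no row interchanges, which suits the fixed data access patterns of GPU kernels.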
Citations
Journal Article
Parallel Algorithms for Banded Linear Systems.
TL;DR: In this paper, a partitioned Gaussian elimination algorithm with partial pivoting is proposed for multiprocessors with small to moderate numbers of processing elements, where the submatrices in the chosen partitioning may be rank-deficient and the algorithm more complex than those which have been proposed for diagonally dominant and symmetric positive-definite systems.
Journal ArticleDOI
Solving Large Problem Sizes of Index-Digit Algorithms on GPU: FFT and Tridiagonal System Solvers
TL;DR: A tuning strategy has been applied to develop flexible Multi-Stage (MS) algorithms for the Fast Fourier Transform (FFT) algorithm and a tridiagonal system solver on the GPU that outperforms other well-known and commonly used state-of-the-art libraries.
Journal ArticleDOI
A parallel multithreaded sparse triangular linear system solver
İlke Çuğu, Murat Manguoglu
TL;DR: A parallel sparse triangular linear system solver based on the Spike algorithm that outperforms Intel’s Math Kernel Library (MKL) on a multicore architecture and shows the effect of various sparse matrix reordering schemes.
DissertationDOI
Dense and sparse parallel linear algebra algorithms on graphics processing units
TL;DR: This thesis studies the use of graphics processing units as computer accelerators in the field of linear algebra, and implements several GPU algorithms for solving linear systems of equations whose matrices have a block-tridiagonal structure.
Journal ArticleDOI
MPI-CUDA parallel linear solvers for block-tridiagonal matrices in the context of SLEPc's eigensolvers
A. Lamas Daviña, Jose E. Roman
TL;DR: This work aims to compare different direct linear solvers that can exploit the block-tridiagonal structure and develops a parallel implementation based on MPI in the context of the SLEPc library.
References
Book
Programming Massively Parallel Processors: A Hands-on Approach
David B. Kirk, Wen-mei W. Hwu
TL;DR: Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture, and explores in detail various techniques for constructing parallel programs.
GPU Computing
TL;DR: The background, hardware, and programming model for GPU computing are described, the state of the art in tools and techniques is summarized, and four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications are presented.
Book
ScaLAPACK Users' Guide
L. S. Blackford, Jae-Young Choi, A. Cleary, Eduardo D'Azevedo, James Demmel, Inderjit S. Dhillon, Jack Dongarra, Sven Hammarling, Greg Henry, A. Petitet, K. Stanley, David W. Walker, R. C. Whaley
Journal ArticleDOI
Methods for modifying matrix factorizations.
TL;DR: Several methods are described for modifying Cholesky factors and a new algorithm is presented for modifying the complete orthogonal factorization of a general matrix, from which the conventional QR factors are obtained as a special case.
Journal ArticleDOI
On Stable Parallel Linear System Solvers
Ahmed H. Sameh, David J. Kuck
TL;DR: Three stable parallel algorithms for solving dense and tridiagonal systems of linear equations are discussed and one of the algorithms presented here is superior to the best previous algorithm in that with a modest increase in time.