Journal Article

A direct tridiagonal solver based on Givens rotations for GPU architectures

TLDR
A parallel solver for general tridiagonal irreducible systems and its CUDA implementation are described, indicating that g-Spike is competitive in runtime with existing GPU methods, and can provide acceptable results when other methods cannot be applied or fail.
Abstract
A parallel solver for general tridiagonal irreducible systems is described. Solver based on Spike framework and Givens-QR with occasional low-rank modification. Modifications handle singularities exposed by QR in blocks of the parallel partition. The GPU implementation has similar performance to existing methods. Method returns accurate results when current GPU tridiagonal solvers fail.

g-Spike, a parallel algorithm for solving general nonsymmetric tridiagonal systems for the GPU, and its CUDA implementation are described. The solver is based on the Spike framework, applying Givens rotations and QR factorization without pivoting. It also implements a low-rank modification strategy to compute the Spike DS decomposition even when the partitioning defines singular submatrices along the diagonal. The method is also used to solve the reduced system resulting from the Spike partitioning. Numerical experiments with problems of high order indicate that g-Spike is competitive in runtime with existing GPU methods, and can provide acceptable results when other methods cannot be applied or fail.
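The Givens-QR building block mentioned in the abstract can be illustrated with a minimal serial sketch. This is not the g-Spike algorithm or its CUDA implementation: it omits the Spike partitioning, the low-rank modification, and all parallelism, and it only shows how a single tridiagonal system can be solved with Givens rotations and no pivoting. The function name `solve_tridiagonal_givens` and its argument layout are assumptions made for illustration.

```python
import math


def solve_tridiagonal_givens(sub, diag, sup, rhs):
    """Solve a tridiagonal system A x = rhs by QR with Givens rotations.

    Hypothetical helper for illustration only (serial, no Spike partitioning).
    sub[i]  = A[i+1][i]   (length n-1)
    diag[i] = A[i][i]     (length n)
    sup[i]  = A[i][i+1]   (length n-1)
    """
    n = len(diag)
    d = [float(v) for v in diag]         # becomes the main diagonal of R
    u = [float(v) for v in sup] + [0.0]  # becomes the first superdiagonal of R
    w = [0.0] * n                        # second superdiagonal of R (fill-in)
    y = [float(v) for v in rhs]          # becomes Q^T * rhs

    for i in range(n - 1):
        lower = float(sub[i])
        # r >= |lower|, so for an irreducible matrix (lower != 0) the
        # first n-1 diagonal entries of R are always nonzero.
        r = math.hypot(d[i], lower)
        if r == 0.0:
            continue  # nothing to eliminate in this column
        cs, sn = d[i] / r, lower / r

        u_old, d_next, next_sup = u[i], d[i + 1], u[i + 1]
        # Rotate rows i and i+1 to zero out A[i+1][i].
        d[i] = r
        u[i] = cs * u_old + sn * d_next
        w[i] = sn * next_sup
        d[i + 1] = -sn * u_old + cs * d_next
        u[i + 1] = cs * next_sup
        y[i], y[i + 1] = cs * y[i] + sn * y[i + 1], -sn * y[i] + cs * y[i + 1]

    # Back substitution with the three-diagonal upper triangular factor R.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        if d[i] == 0.0:
            raise ZeroDivisionError("R has a zero pivot: the matrix is singular")
        s = y[i]
        if i + 1 < n:
            s -= u[i] * x[i + 1]
        if i + 2 < n:
            s -= w[i] * x[i + 2]
        x[i] = s / d[i]
    return x
```

As a usage example, `solve_tridiagonal_givens([1, 1], [0, 0, 1], [1, 1], [2, 4, 5])` solves a system whose leading diagonal entry is zero, so Gaussian elimination without pivoting would break down at the first step, while the rotations proceed and the result is approximately `[1.0, 2.0, 3.0]`.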


Citations
Journal Article

Parallel Algorithms for Banded Linear Systems.

TL;DR: In this paper, a partitioned Gaussian elimination algorithm with partial pivoting is proposed for multiprocessors with small to moderate numbers of processing elements, where the submatrices in the chosen partitioning may be rank-deficient and the algorithm more complex than those which have been proposed for diagonally dominant and symmetric positive-definite systems.
Journal Article

Solving Large Problem Sizes of Index-Digit Algorithms on GPU: FFT and Tridiagonal System Solvers

TL;DR: A tuning strategy has been applied to develop flexible Multi-Stage (MS) algorithms for the Fast Fourier Transform (FFT) algorithm and a tridiagonal system solver on the GPU that outperforms other well-known and commonly used state-of-the-art libraries.
Journal Article

A parallel multithreaded sparse triangular linear system solver

TL;DR: A parallel sparse triangular linear system solver based on the Spike algorithm that outperforms Intel’s Math Kernel Library (MKL) on a multicore architecture and shows the effect of various sparse matrix reordering schemes.
Dissertation

Dense and sparse parallel linear algebra algorithms on graphics processing units

TL;DR: This thesis studies the use of graphics processing units as computer accelerators in the field of linear algebra, and implements several GPU algorithms for solving linear systems of equations whose matrices have a block-tridiagonal structure.
Journal Article

MPI-CUDA parallel linear solvers for block-tridiagonal matrices in the context of SLEPc's eigensolvers

TL;DR: This work aims to compare different direct linear solvers that can exploit the block-tridiagonal structure and develops a parallel implementation based on MPI in the context of the SLEPc library.
References
Book

Programming Massively Parallel Processors: A Hands-on Approach

TL;DR: Programming Massively Parallel Processors: A Hands-on Approach shows students and professionals alike the basic concepts of parallel programming and GPU architecture, and explores in detail various techniques for constructing parallel programs.

GPU Computing

TL;DR: The background, hardware, and programming model for GPU computing are described, the state of the art in tools and techniques is summarized, and four GPU computing successes in game physics and computational biophysics that deliver order-of-magnitude performance gains over optimized CPU applications are presented.
Book

ScaLAPACK Users' Guide

Journal Article

Methods for modifying matrix factorizations.

TL;DR: Several methods are described for modifying Cholesky factors and a new algorithm is presented for modifying the complete orthogonal factorization of a general matrix, from which the conventional QR factors are obtained as a special case.
Journal Article

On Stable Parallel Linear System Solvers

TL;DR: Three stable parallel algorithms for solving dense and tridiagonal systems of linear equations are discussed and one of the algorithms presented here is superior to the best previous algorithm in that with a modest increase in time.