Piyush Sao

Researcher at Oak Ridge National Laboratory

Publications -  20
Citations -  288

Piyush Sao is an academic researcher at Oak Ridge National Laboratory. The author has contributed to research in topics: Computer science & Solver. The author has an h-index of 6 and has co-authored 16 publications receiving 213 citations. Previous affiliations of Piyush Sao include Georgia Institute of Technology.

Papers
Proceedings Article

Self-stabilizing iterative solvers

TL;DR: This paper shows how to use self-stabilization, an idea that originates in distributed control, to build fault-tolerant iterative solvers; the approach shows promise as a more general tool for constructing resilient solvers.
Book Chapter

A Distributed CPU-GPU Sparse Direct Solver

TL;DR: This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting.
Journal Article

Traversing large graphs on GPUs with unified memory

TL;DR: HALO (Harmonic Locality Ordering) is a lightweight offline graph reordering algorithm that can be used as a pre-processing step for static graphs. It targets large directed real-world graphs in addition to undirected graphs, whereas prior methods account only for the latter.
Proceedings Article

A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices

TL;DR: A new algorithm improves the strong scalability of right-looking sparse LU factorization on distributed-memory systems. Using a three-dimensional MPI process grid, it aggressively exploits elimination-tree parallelism and trades increased memory for reduced per-process communication.
Proceedings Article

A supernodal all-pairs shortest path algorithm

TL;DR: The key idea in this approach is to use the known algebraic relationship between Floyd-Warshall and Gaussian elimination, and import several algorithmic techniques from sparse Cholesky factorization, namely, fill-in reducing ordering, symbolic analysis, supernodal traversal, and elimination tree parallelism, which reduce computation, improve locality and enhance parallelism.
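
The algebraic relationship mentioned in the summary above can be seen in the textbook dense form of Floyd-Warshall, sketched below. This is only an illustration of the loop structure the two methods share, not the supernodal algorithm from the paper; the function name `floyd_warshall` and the example graph are assumptions made for this sketch. The update inside the triple loop has the same shape as the Schur-complement update in Gaussian elimination, with (min, +) taking the place of (+, ×).

```python
import math

def floyd_warshall(dist):
    """All-pairs shortest paths on a dense matrix of edge weights.

    dist[i][j] is the weight of edge i -> j, or math.inf if absent.
    Returns a new matrix of shortest-path distances.
    """
    n = len(dist)
    d = [row[:] for row in dist]  # work on a copy
    for k in range(n):            # "pivot" vertex, like a pivot in elimination
        for i in range(n):
            for j in range(n):
                # Min-plus analogue of the elimination update
                # a[i][j] -= a[i][k] * a[k][j] / a[k][k]
                d[i][j] = min(d[i][j], d[i][k] + d[k][j])
    return d

INF = math.inf
# Hypothetical 4-vertex directed graph used only for illustration.
graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
result = floyd_warshall(graph)
```

In this dense form every vertex is eliminated against every pair; the summary indicates that the paper's contribution lies in avoiding much of that work through ordering, symbolic analysis, and supernodal traversal, none of which is shown here.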