Piyush Sao
Researcher at Oak Ridge National Laboratory
Publications - 20
Citations - 288
Piyush Sao is an academic researcher from Oak Ridge National Laboratory. The author has contributed to research in topics: Computer science & Solver. The author has an h-index of 6 and has co-authored 16 publications receiving 213 citations. Previous affiliations of Piyush Sao include Georgia Institute of Technology.
Papers
Proceedings ArticleDOI
Self-stabilizing iterative solvers
Piyush Sao,Richard Vuduc +1 more
TL;DR: This paper shows how to use the idea of self-stabilization, which originates in the context of distributed control, to make iterative solvers fault-tolerant; the approach has promise to become a useful tool for constructing resilient solvers more generally.
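To make the idea concrete, here is a minimal sketch of a conjugate gradient loop with a periodic correction step that rebuilds the residual from the current iterate, so that a transient corruption of the solver state is eventually washed out. This is an illustration in the spirit of self-stabilization, not the algorithm from the paper: the function name `cg_with_periodic_correction`, the parameters `correct_every` and `fault_at`, and the fault model (adding 1.0 to one residual entry) are all assumptions made for this example.

```python
def matvec(A, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def cg_with_periodic_correction(A, b, iters=50, correct_every=3,
                                tol=1e-12, fault_at=None):
    """Conjugate gradient for a symmetric positive definite A.

    Every `correct_every` iterations the residual r and direction p are
    rebuilt from x alone (hypothetical correction step), which restarts
    CG from the current iterate and discards any corrupted derived state.
    """
    x = [0.0] * len(b)
    r = b[:]            # residual b - A x for x = 0
    p = r[:]            # search direction
    rs = dot(r, r)      # squared residual norm
    for it in range(1, iters + 1):
        if fault_at == it:
            r[0] += 1.0  # simulated transient fault (assumed fault model)
        if it % correct_every == 0:
            # Correction step: recompute the derived state from x only.
            r = [bi - ai for bi, ai in zip(b, matvec(A, x))]
            p = r[:]
            rs = dot(r, r)
        if rs <= tol:
            break
        q = matvec(A, p)
        alpha = rs / dot(p, q)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * qi for ri, qi in zip(r, q)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x_clean = cg_with_periodic_correction(A, b)
x_faulty = cg_with_periodic_correction(A, b, fault_at=1)
```

In this sketch both runs should converge to the same solution of the 2-by-2 system, because the correction step depends only on `x`, `A`, and `b`, not on the corrupted residual.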
Book ChapterDOI
A Distributed CPU-GPU Sparse Direct Solver
TL;DR: This paper presents the first hybrid MPI+OpenMP+CUDA implementation of a distributed memory right-looking unsymmetric sparse direct solver (i.e., sparse LU factorization) that uses static pivoting.
Journal ArticleDOI
Traversing large graphs on GPUs with unified memory
TL;DR: A lightweight offline graph reordering algorithm, HALO (Harmonic Locality Ordering), is proposed that can be used as a pre-processing step for static graphs. It specifically aims to cover large directed real-world graphs in addition to undirected graphs, whereas prior methods only account for the latter.
Proceedings ArticleDOI
A Communication-Avoiding 3D LU Factorization Algorithm for Sparse Matrices
TL;DR: A new algorithm is presented that improves the strong scalability of right-looking sparse LU factorization on distributed memory systems. Using a three-dimensional MPI process grid, it aggressively exploits elimination tree parallelism and trades off increased memory for reduced per-process communication.
Proceedings ArticleDOI
A supernodal all-pairs shortest path algorithm
TL;DR: The key idea in this approach is to use the known algebraic relationship between Floyd-Warshall and Gaussian elimination and to import several algorithmic techniques from sparse Cholesky factorization (namely, fill-in reducing ordering, symbolic analysis, supernodal traversal, and elimination tree parallelism), which reduce computation, improve locality, and enhance parallelism.
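The algebraic relationship mentioned above can be seen in the textbook dense Floyd-Warshall algorithm, sketched below. Its triple loop has the same shape as Gaussian elimination, with the pivot vertex `k` playing the role of the pivot row and column, and the update `a[i][j] - a[i][k] * a[k][j] / a[k][k]` replaced by its (min, +) analogue. This is only the dense baseline under that analogy, not the supernodal algorithm from the paper; the function name and the example graph are chosen here for illustration.

```python
import math

INF = math.inf


def floyd_warshall(dist):
    """All-pairs shortest paths for a dense distance matrix (list of lists).

    dist[i][j] is the edge weight from i to j, INF if there is no edge,
    and 0 on the diagonal. Returns a new matrix of shortest distances.
    """
    n = len(dist)
    d = [row[:] for row in dist]
    for k in range(n):              # pivot vertex, like the pivot in elimination
        for i in range(n):
            dik = d[i][k]
            if dik == INF:
                continue            # no path i -> k, so nothing to update in row i
            for j in range(n):
                # (min, +) counterpart of the elimination update:
                # "multiply" becomes +, "subtract and accumulate" becomes min.
                cand = dik + d[k][j]
                if cand < d[i][j]:
                    d[i][j] = cand
    return d


graph = [
    [0,   3,   INF, 7],
    [8,   0,   2,   INF],
    [5,   INF, 0,   1],
    [2,   INF, INF, 0],
]
shortest = floyd_warshall(graph)
```

Skipping rows where `d[i][k]` is infinite is the dense counterpart of skipping structural zeros, which is the kind of sparsity that fill-reducing orderings and symbolic analysis are meant to exploit.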