Distributed Memory Graph Coloring Algorithms for Multiple GPUs
Ian Bogle, Erik G. Boman, Karen D. Devine, Sivasankaran Rajamanickam, George M. Slota +4 more
- pp 54-62
TLDR
In this article, the authors present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs.
Abstract:
Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring on a single GPU or in distributed memory, but hybrid MPI+GPU algorithms have been unexplored until this work, to the best of our knowledge. We present several MPI+GPU coloring approaches that use implementations of the distributed coloring algorithms of Gebremedhin et al. and the shared-memory algorithms of Deveci et al. The on-node parallel coloring uses implementations in KokkosKernels, which provide parallelization for both multicore CPUs and GPUs. We further extend our approaches to solve for distance-2 coloring, giving the first known distributed and multi-GPU algorithm for this problem. In addition, we propose novel methods to reduce communication in distributed graph coloring. Our experiments show that our approaches operate efficiently on inputs too large to fit on a single GPU and scale up to graphs with 76.7 billion edges running on 128 GPUs.
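To make the abstract's terminology concrete, the sketch below shows a serial first-fit greedy coloring in Python. It is an illustration only, not the authors' MPI+GPU algorithms and not the KokkosKernels API; the names `greedy_coloring` and `color_classes` and the small path graph are assumptions introduced here. With `distance=1`, adjacent vertices receive different colors, so each color class is a set of independent vertices that could be updated in parallel. With `distance=2`, vertices that share a neighbor must also differ.

```python
from collections import defaultdict


def greedy_coloring(adj, distance=1):
    """Serial first-fit greedy coloring (illustrative, hypothetical helper).

    adj maps each vertex to an iterable of its neighbors (undirected graph).
    distance=1 forbids colors of neighbors; distance=2 also forbids colors
    of neighbors-of-neighbors.
    """
    colors = {}
    for v in adj:
        forbidden = set()
        for u in adj[v]:
            if u in colors:
                forbidden.add(colors[u])
            if distance == 2:
                for w in adj[u]:
                    if w != v and w in colors:
                        forbidden.add(colors[w])
        # Pick the smallest color not used in the forbidden neighborhood.
        c = 0
        while c in forbidden:
            c += 1
        colors[v] = c
    return colors


def color_classes(colors):
    """Group vertices by color; each group is one parallel update set."""
    classes = defaultdict(list)
    for v, c in colors.items():
        classes[c].append(v)
    return dict(classes)


# Path graph 0 - 1 - 2 - 3 (assumed example input).
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
d1 = greedy_coloring(path)              # distance-1 coloring
d2 = greedy_coloring(path, distance=2)  # distance-2 coloring
print(color_classes(d1))
print(color_classes(d2))
```

On this path graph the distance-1 coloring needs two colors and the distance-2 coloring needs three, since vertices 0 and 2 share the neighbor 1. A distributed variant would additionally have to exchange colors of boundary vertices between processes and resolve conflicts, which this serial sketch does not attempt.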
Citations
Journal ArticleDOI
EXAGRAPH: Graph and combinatorial methods for enabling exascale applications
Seher Acer, Ariful Azad, Erik G. Boman, Aydin Buluc, Karen D. Devine, S. M. Ferdous, Nitin A. Gawande, Sayan Ghosh, Mahantesh Halappanavar, Ananth Kalyanaraman, Arif O. Khan, Marco Minutoli, Alex Pothen, Sivasankaran Rajamanickam, Oguz Selvitopi, Nathan R. Tallent, Antonino Tumeo +19 more
TL;DR: This paper surveys the algorithmic and software development activities performed under the auspices of ExaGraph from both a combinatorial and an algebraic perspective, and details the recent efforts in porting the algorithms to manycore accelerator (GPU) architectures.
Proceedings ArticleDOI
Parallel Vertex Color Update on Large Dynamic Networks
TL;DR: In this article, a GPU-based parallel algorithm to efficiently update vertex coloring on large dynamic networks is presented. However, the algorithm is limited to a single GPU and is not suitable for large networks.
References
Journal ArticleDOI
The University of Florida sparse matrix collection
Timothy A. Davis, Yifan Hu +1 more
TL;DR: The University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications, is described and a new multilevel coarsening scheme is proposed to facilitate this task.
Journal ArticleDOI
New methods to color the vertices of a graph
TL;DR: An exact method is given that performs better than the Randall-Brown algorithm and can color larger graphs; the new heuristic methods, the classical methods, and the exact method are compared.
Journal ArticleDOI
An overview of the Trilinos project
Michael A. Heroux, Roscoe A. Bartlett, Vicki E. Howle, Robert J. Hoekstra, Jonathan Joseph Hu, Tamara G. Kolda, Richard B. Lehoucq, Kevin Long, Roger P. Pawlowski, Eric T. Phipps, Andrew G. Salinger, Heidi K. Thornquist, Raymond S. Tuminaro, James M. Willenbring, Alan B. Williams, Kendall S. Stanley +15 more
TL;DR: The overall Trilinos design is presented, describing the use of abstract interfaces and default concrete implementations and how packages can be combined to rapidly develop new algorithms.
Journal ArticleDOI
On Colouring the Nodes of a Network
TL;DR: Let N be a network (or linear graph) such that at each node not more than n lines meet (where n > 2), and no line has both ends at the same node.
Journal ArticleDOI
Kokkos: Enabling manycore performance portability through polymorphic memory access patterns
TL;DR: Kokkos’ abstractions are described, its application programmer interface (API) is summarized, performance results for unit-test kernels and mini-applications are presented, and an incremental strategy for migrating legacy C++ codes to Kokkos is outlined.