A block-asynchronous relaxation method for graphics processing units
TLDR
This paper develops asynchronous iteration algorithms in CUDA, compares them with parallel implementations of synchronous relaxation methods on CPU- or GPU-based systems, and identifies the high potential of the asynchronous methods for Exascale computing.
About:
This article is published in Journal of Parallel and Distributed Computing. The article was published on 2013-12-01 and is currently open access. It has received 28 citations to date. The article focuses on the topics: Asynchronous communication & CUDA.
Citations
Proceedings Article
Self-stabilizing iterative solvers
Piyush Sao, Richard Vuduc
TL;DR: It is shown how to use the idea of self-stabilization, which originates in the context of distributed control, to make fault-tolerant iterative solvers; the approach has promise to become a useful tool for constructing resilient solvers more generally.
Book Chapter
Iterative Sparse Triangular Solves for Preconditioning
TL;DR: This work proposes using an iterative approach for solving sparse triangular systems when an approximation is suitable, and demonstrates the performance gains that this approach can have on GPUs in the context of solving sparse linear systems with a preconditioned Krylov subspace method.
Journal Article
Automatic Recognition of Acute Myelogenous Leukemia in Blood Microscopic Images Using K-means Clustering and Support Vector Machine.
TL;DR: The results show that the proposed algorithm has achieved an acceptable performance for diagnosis of AML and its common subtypes and can be used as an assistant diagnostic tool for pathologists.
Book Chapter
Asynchronous Iterative Algorithm for Computing Incomplete Factorizations on GPUs
TL;DR: This paper presents a GPU implementation of an asynchronous iterative algorithm for computing incomplete factorizations that considers several non-traditional techniques that can be important for asynchronous algorithms to optimize convergence and data locality.
Journal Article
Linear Algebra Software for Large-Scale Accelerated Multicore Computing
Ahmad Abdelfattah, Hartwig Anzt, Jack Dongarra, Mark Gates, Azzam Haidar, Jakub Kurzak, Piotr Luszczek, Stanimire Tomov, Ichitaro Yamazaki, Asim YarKhan
TL;DR: The state-of-the-art design and implementation practices for the acceleration of the predominant linear algebra algorithms on large-scale accelerated multicore systems are presented, and the development of innovative linear algebra algorithms using three technologies (mixed precision arithmetic, batched operations, and asynchronous iterations) that are currently of high interest for accelerated multicore systems is emphasized.
References
Journal Article
Toward Exascale Resilience
TL;DR: This white paper synthesizes the motivations, observations and research issues considered determinant by several complementary HPC experts in applications, programming models, distributed systems and system management.
Journal Article
On asynchronous iterations
Andreas Frommer, Daniel B. Szyld
TL;DR: Certain models of asynchronous iterations, which arise naturally on parallel computers, are reviewed using a common theoretical framework, with applications including nonsingular linear systems, nonlinear systems, and initial value problems.
Monograph
Parallel Scientific Computing in C++ and MPI: A Seamless Approach to Parallel Algorithms and their Implementation
TL;DR: This book provides a seamless approach to numerical algorithms, modern programming techniques and parallel computing and places equal emphasis on the discretization of partial differential equations and on solvers.
Proceedings Article
Architectural core salvaging in a multi-core processor for hard-error tolerance
TL;DR: It is shown that even if some individual cores cannot execute certain operations, a CPU die can be instruction-set-architecture (ISA) compliant, that is, execute all of the instructions required by its ISA, by exploiting natural cross-core redundancy.