Kokkos: Enabling manycore performance portability through polymorphic memory access patterns
TLDR
Kokkos’ abstractions are described, its application programmer interface (API) is summarized, performance results for unit-test kernels and mini-applications are presented, and an incremental strategy for migrating legacy C++ codes to Kokkos is outlined.
About:
This article is published in the Journal of Parallel and Distributed Computing. The article was published on 2014-12-01 and is currently open access. It has received 682 citations to date. The article focuses on the topics: Software portability & Data parallelism.
Citations
Journal Article
Quantum ESPRESSO toward the exascale.
Paolo Giannozzi, Oscar Baseggio, Pietro Bonfà, Davide Brunato, Roberto Car, Ivan Carnimeo, Carlo Cavazzoni, Stefano de Gironcoli, Pietro Delugas, Fabrizio Ferrari Ruffino, Andrea Ferretti, Nicola Marzari, Iurii Timrov, Andrea Urru, Stefano Baroni
TL;DR: Quantum ESPRESSO is an open-source distribution of computer codes for quantum-mechanical materials modeling, based on density-functional theory, pseudopotentials, and plane waves. The paper presents a motivation for, and a brief review of, the ongoing effort to port Quantum ESPRESSO onto heterogeneous architectures based on hardware accelerators, which will overcome the energy constraints that are currently hindering the way toward exascale computing.
Journal Article
The Athena++ Adaptive Mesh Refinement Framework: Design and Magnetohydrodynamic Solvers
TL;DR: The design and implementation of a new framework for adaptive mesh refinement (AMR) calculations is described, intended primarily for applications in astrophysical fluid dynamics, but its flexible and modular design enables its use for a wide variety of physics.
Journal Article
Direct simulation Monte Carlo on petaflop supercomputers and beyond
Steven J. Plimpton, Stan Gerald Moore, Arnaud Borner, A. K. Stagg, Timothy P. Koehler, John R. Torczynski, Michail A. Gallis
TL;DR: SPARTA is an implementation of the Direct Simulation Monte Carlo (DSMC) method for modeling rarefied gas dynamics in a variety of scenarios; it can operate in parallel at the scale of many billions of particles or grid cells.
References
Journal Article
Fast parallel algorithms for short-range molecular dynamics
TL;DR: Three parallel algorithms for classical molecular dynamics are presented; each can be implemented on any distributed-memory parallel machine that allows message passing of data between independently executing processors. Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Book Chapter
Efficient management of parallelism in object-oriented numerical software libraries
TL;DR: The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries. The concepts discussed are appropriate for all scalable computing systems, and the libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.
Journal Article
StarPU: a unified platform for task scheduling on heterogeneous multicore architectures
TL;DR: StarPU is a runtime system that provides a high-level unified execution model, giving numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware and to easily develop and tune powerful scheduling algorithms.