Kokkos: Enabling manycore performance portability through polymorphic memory access patterns

doi:10.1016/J.JPDC.2014.07.003

Open Access Journal Article

H. Carter Edwards, +2 more

- 01 Dec 2014 -

Journal of Parallel and Distributed Comp...

- Vol. 74, Iss: 12, pp 3202-3216

TLDR

Kokkos’ abstractions are described, its application programmer interface (API) is summarized, performance results for unit-test kernels and mini-applications are presented, and an incremental strategy for migrating legacy C++ codes to Kokkos is outlined.

About:

This article is published in the Journal of Parallel and Distributed Computing. It was published on 2014-12-01 and is currently open access. It has received 682 citations to date. The article focuses on the topics: Software portability & Data parallelism.

Citations

Fast parallel algorithms for short-range molecular dynamics

Steven J. Plimpton

TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.

Journal Article

Quantum ESPRESSO toward the exascale.

Paolo Giannozzi, +14 more

- 21 Apr 2020 -

Journal of Chemical Physics

TL;DR: The motivation for, and a brief review of, the ongoing effort to port Quantum ESPRESSO onto heterogeneous architectures based on hardware accelerators is presented; such architectures are expected to overcome the energy constraints currently hindering progress toward exascale computing.

Journal Article

The Athena++ Adaptive Mesh Refinement Framework: Design and Magnetohydrodynamic Solvers

James M. Stone, +4 more

- 26 Jun 2020 -

Astrophysical Journal Supplement Series

TL;DR: The design and implementation of a new framework for adaptive mesh refinement (AMR) calculations is described, intended primarily for applications in astrophysical fluid dynamics, but its flexible and modular design enables its use for a wide variety of physics.

Journal Article

Direct simulation Monte Carlo on petaflop supercomputers and beyond

Steven J. Plimpton, +6 more

- 01 Aug 2019 -

Physics of Fluids

TL;DR: SPARTA is an implementation of the Direct Simulation Monte Carlo (DSMC) method for modeling rarefied gas dynamics in a variety of scenarios; it can operate in parallel at the scale of many billions of particles or grid cells.

References

Journal Article

Fast parallel algorithms for short-range molecular dynamics

Steven J. Plimpton

- 01 Mar 1995 -

Journal of Computational Physics

TL;DR: Three parallel algorithms for classical molecular dynamics are presented; each can be implemented on any distributed-memory parallel machine that supports message passing of data between independently executing processors.

Book Chapter

Efficient management of parallelism in object-oriented numerical software libraries

Satish Balay, +3 more

TL;DR: The PETSc 2.0 package uses object-oriented programming to conceal the details of message passing, without concealing the parallelism, in a high-quality set of numerical software libraries.

Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries.

Satish Balay, +3 more

TL;DR: The concepts discussed are appropriate for all scalable computing systems; the libraries provide many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.

Journal Article

StarPU: a unified platform for task scheduling on heterogeneous multicore architectures

Cédric Augonnet, +3 more

TL;DR: StarPU is a runtime system that provides a high-level, unified execution model, giving numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware and to develop and tune powerful scheduling algorithms easily.
