Kokkos: Enabling manycore performance portability through polymorphic memory access patterns

TLDR
Kokkos’ abstractions are described, its application programmer interface (API) is summarized, performance results for unit-test kernels and mini-applications are presented, and an incremental strategy for migrating legacy C++ codes to Kokkos is outlined.
About
This article was published in the Journal of Parallel and Distributed Computing on 2014-12-01 and is currently open access. It has received 682 citations to date. The article focuses on the topics: Software portability & Data parallelism.

Citations

Fast parallel algorithms for short-range molecular dynamics

TL;DR: Comparing the results to the fastest reported vectorized Cray Y-MP and C90 algorithm shows that the current generation of parallel machines is competitive with conventional vector supercomputers even for small problems.
Journal Article

Quantum ESPRESSO toward the exascale.

TL;DR: The motivation for, and a brief review of, the ongoing effort to port Quantum ESPRESSO onto heterogeneous architectures based on hardware accelerators are presented; these architectures will overcome the energy constraints that are currently hindering the way toward exascale computing.
Journal Article

Quantum ESPRESSO toward the exascale.

TL;DR: Quantum ESPRESSO is an open-source distribution of computer codes for quantum-mechanical materials modeling, based on density-functional theory, pseudopotentials, and plane waves.
Journal Article

The Athena++ Adaptive Mesh Refinement Framework: Design and Magnetohydrodynamic Solvers

TL;DR: The design and implementation of a new framework for adaptive mesh refinement (AMR) calculations is described, intended primarily for applications in astrophysical fluid dynamics, but its flexible and modular design enables its use for a wide variety of physics.
Journal Article

Direct simulation Monte Carlo on petaflop supercomputers and beyond

TL;DR: SPARTA is an implementation of the Direct Simulation Monte Carlo (DSMC) method for modeling rarefied gas dynamics in a variety of scenarios; it can operate in parallel at the scale of many billions of particles or grid cells.
References
Journal Article

Fast parallel algorithms for short-range molecular dynamics

TL;DR: Three parallel algorithms for classical molecular dynamics are presented; each can be implemented on any distributed-memory parallel machine that allows message passing of data between independently executing processors.

Book Chapter

Efficient management of parallelism in object-oriented numerical software libraries

TL;DR: The PETSc 2.0 package uses object-oriented programming to conceal the details of the message passing, without concealing the parallelism, in a high-quality set of numerical software libraries.

Efficient Management of Parallelism in Object-Oriented Numerical Software Libraries.

TL;DR: The concepts discussed are appropriate for all scalable computing systems; the library provides many of the data structures and numerical kernels required for the scalable solution of PDEs, offering performance portability.
Journal Article

StarPU: a unified platform for task scheduling on heterogeneous multicore architectures

TL;DR: StarPU is a runtime system that provides a high-level unified execution model, giving numerical kernel designers a convenient way to generate parallel tasks over heterogeneous hardware and to easily develop and tune powerful scheduling algorithms.