SharP: Towards Programming Extreme-Scale Systems with Hierarchical Heterogeneous Memory
Manjunath Gorentla Venkata, Ferrol Aderholdt, Zachary W. Parchman, +2 more
- pp. 145–154
TL;DR
This work proposes and develops the programming abstraction called SHARed data-structure centric Programming abstraction (SharP), a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and a unified programming abstraction for Big-Compute and Big-Data applications.

Abstract
Pre-exascale systems are expected to have a significant amount of hierarchical and heterogeneous on-node memory, and this trend in system architecture is expected to continue into the exascale era. Along with hierarchical-heterogeneous memory, such a system typically has a high-performing network and a compute accelerator. This system architecture is effective not only for running traditional High Performance Computing (HPC) applications (Big-Compute), but also for running data-intensive HPC applications and Big-Data applications. As a consequence, there is a growing desire to have a single system serve the needs of both Big-Compute and Big-Data applications. Though the system architecture supports the convergence of Big-Compute and Big-Data, programming models have yet to evolve to support either hierarchical-heterogeneous memory systems or this convergence. In this work, we propose and develop the programming abstraction called SHARed data-structure centric Programming abstraction (SharP) to address both of these goals, i.e., to provide (1) a simple, usable, and portable abstraction for hierarchical-heterogeneous memory and (2) a unified programming abstraction for Big-Compute and Big-Data applications. To evaluate SharP, we implement a Stencil benchmark using SharP, port QMCPack, a petascale-capable application, and adapt the Memcached ecosystem, a popular Big-Data framework, to use SharP, and quantify the performance and productivity advantages. Additionally, we demonstrate the simplicity of using SharP on different memories including DRAM, High-Bandwidth Memory (HBM), and Non-Volatile Random Access Memory (NVRAM).
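The abstract does not show SharP's actual API, so the following is only a conceptual sketch of the core idea it describes: data-structure-centric placement across a hierarchy of heterogeneous memories (HBM, DRAM, NVRAM), where an allocation carries a tier-preference list and falls back down the hierarchy when a preferred tier is full. All names, capacities, and the `tier_alloc` function are hypothetical illustrations, not part of SharP.

```python
# Conceptual sketch only -- not SharP's API. Models placing data
# structures on hierarchical-heterogeneous memory tiers with fallback.

# Scaled-down illustrative capacities, in MiB (hypothetical numbers).
TIER_FREE = {"HBM": 16, "DRAM": 192, "NVRAM": 1024}

def tier_alloc(size_mib, prefs):
    """Place an allocation on the first preferred tier with enough
    free space, falling back down the preference list; return the
    chosen tier name, or None if nothing fits."""
    for tier in prefs:
        if TIER_FREE[tier] >= size_mib:
            TIER_FREE[tier] -= size_mib
            return tier
    return None

# A bandwidth-critical stencil halo prefers HBM, falling back to DRAM;
# a large checkpoint buffer targets NVRAM directly.
print(tier_alloc(8, ["HBM", "DRAM"]))    # -> HBM (fits)
print(tier_alloc(64, ["HBM", "DRAM"]))   # -> DRAM (only 8 MiB of HBM left)
print(tier_alloc(512, ["NVRAM"]))        # -> NVRAM
```

The design point this illustrates is that the application expresses *what* a data structure needs (a tier preference), while the runtime decides *where* it lands, which is what makes the same code portable across nodes with different memory hierarchies.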
Citations
Journal Article
A Case For Intra-rack Resource Disaggregation in HPC
George Michelogiannakis, Benjamin Klenk, Brandon Cook, Min Yee Teh, Madeleine Glick, Larry R. Dennison, Keren Bergman, John Shalf, +7 more
TL;DR: It is shown that for a rack (cabinet) configuration and applications similar to Cori, a central processing unit with intra-rack disaggregation has a 99.5% probability of finding all the resources it requires inside its rack.
Journal Article
Approaches of enhancing interoperations among high performance computing and big data analytics via augmentation
TL;DR: This paper sheds light upon how big data frameworks can be ported to HPC platforms as a preliminary step towards the convergence of big data and exascale computing ecosystem.
Proceedings Article
SharP Hash: A High-Performing Distributed Hash for Extreme-Scale Systems
TL;DR: SharP Hash's high performance is obtained through the use of high-performing networks and one-sided semantics and its performance characteristics are demonstrated with a synthetic micro-benchmark and implementation of a Key Value (KV) store, Memcached.
Proceedings Article
Optimizing Data Aggregation by Leveraging the Deep Memory Hierarchy on Large-scale Systems
TL;DR: This paper presents a topology and memory-aware data movement library performing data aggregation on large-scale systems and demonstrates how this approach can decrease the I/O time of a classic workflow by 26%.
Journal Article
Efficient Intra-Rack Resource Disaggregation for HPC Using Co-Packaged DWDM Photonics
George Michelogiannakis, Yehia Arafa, B. D. Cook, Liang Yuan Dai, Abdel-Hameed A. Badawy, Madeleine Glick, Yuyang Wang, Keren Bergman, John Shalf, +8 more
TL;DR: In this article, the authors show that modern photonic components can be co-designed with modern HPC racks to implement flexible intra-rack resource disaggregation, fully meeting the bit error rate (BER) requirements and the high escape bandwidth of all chip types with negligible power overhead.
References
Journal Article
Co-array Fortran for parallel programming
Robert W. Numrich, John Reid, +1 more
TL;DR: The Co-Array Fortran extension is introduced; examples are given to illustrate how clear, powerful, and flexible it can be; and a technical definition is provided.
MPI: A Message-Passing Interface
TL;DR: An overview of MPI, a proposed standard message passing interface for MIMD distributed memory concurrent computers, which includes point-to-point and collective communication routines, as well as support for process groups, communication contexts, and application topologies is presented.
Proceedings Article
hwloc: A Generic Framework for Managing Hardware Affinities in HPC Applications
François Broquedis, Jérôme Clet-Ortega, Stéphanie Moreaud, Nathalie Furmento, Brice Goglin, Guillaume Mercier, Samuel Thibault, Raymond Namyst, +7 more
TL;DR: The Hardware Locality (hwloc) software is introduced, which gathers hardware information about processors, caches, memory nodes, and more, and exposes it to applications and runtime systems in an abstracted and portable hierarchical manner.
Book Chapter
Exascale computing technology challenges
TL;DR: The technology challenges on the road to exascale, their underlying causes, and their effect on the future of HPC system design are described.
Proceedings Article
Global Arrays: a portable "shared-memory" programming model for distributed memory computers
TL;DR: The key concept of GA is that it provides a portable interface through which each process in a MIMD parallel program can asynchronously access logical blocks of physically distributed matrices, with no need for explicit cooperation by other processes.
Related Papers (5)
Decentralized Offload-based Execution on Memory-centric Compute Cores
Saambhavi Baskaran, Jack Sampson, +1 more