Main Memory in HPC: Do We Need More or Could We Live with Less?
Darko Zivanovic, Milan Pavlovic, Milan Radulovic, Hyun-Sung Shin, Jong-Pil Son, Sally A. McKee, Paul M. Carpenter, Petar Radojković, Eduard Ayguadé
TLDR
In this article, the authors analyzed the memory capacity requirements of important HPC benchmarks and applications and found that most of the HPC applications under study have per-core memory footprints in the range of hundreds of megabytes, but they also detected applications and use cases that require gigabytes per core.
Abstract
An important aspect of High-Performance Computing (HPC) system design is the choice of main memory capacity. This choice becomes increasingly important now that 3D-stacked memories are entering the market. Compared with conventional Dual In-line Memory Modules (DIMMs), 3D memory chiplets provide better performance and energy efficiency but lower memory capacities. Therefore, the adoption of 3D-stacked memories in the HPC domain depends on whether we can find use cases that require much less memory than is available now. This study analyzes the memory capacity requirements of important HPC benchmarks and applications. We find that the High-Performance Conjugate Gradients (HPCG) benchmark could be an important success story for 3D-stacked memories in HPC, but High-Performance Linpack (HPL) is likely to be constrained by 3D memory capacity. The study also emphasizes that the analysis of memory footprints of production HPC applications is complex and that it requires an understanding of application scalability and target category, i.e., whether the users target capability or capacity computing. The results show that most of the HPC applications under study have per-core memory footprints in the range of hundreds of megabytes, but we also detect applications and use cases that require gigabytes per core. Overall, the study identifies the HPC applications and use cases with memory footprints that could be provided by 3D-stacked memory chiplets, making a first step toward adoption of this novel technology in the HPC domain.
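To make the HPL capacity argument concrete, the sketch below (an illustration, not taken from the article; the 64 GiB node memory, 32 cores, and 75% fill factor are assumed round numbers) uses the standard HPL memory model, where the benchmark factors a dense N×N double-precision matrix occupying 8N² bytes:

```python
import math

GIB = 2**30

def hpl_matrix_dim(mem_bytes, fill=0.75):
    """Largest N such that an 8*N^2-byte float64 matrix fits in fill * mem_bytes."""
    return int(math.sqrt(fill * mem_bytes / 8))

def per_core_footprint_gib(n, cores):
    """Dense N x N float64 matrix (8*N^2 bytes) split evenly across cores."""
    return 8 * n * n / cores / GIB

node_mem = 64 * GIB           # assumed DIMM capacity per node
cores = 32                    # assumed cores per node
n = hpl_matrix_dim(node_mem)  # problem sized to fill 75% of node memory
print(round(per_core_footprint_gib(n, cores), 2))  # prints 1.5
```

A run sized this way needs roughly 1.5 GiB per core, which illustrates why HPL footprints land in the gigabytes-per-core regime that strains 3D-stacked capacities, whereas HPCG's sparse working set stays far smaller.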
Citations
Proceedings ArticleDOI
Quantifying Memory Underutilization in HPC Systems and Using it to Improve Performance via Architecture Support
Gagandeep Panwar, Da Zhang, Yihan Pang, Mai Dahshan, Nathan DeBardeleben, Binoy Ravindran, Xun Jian
TL;DR: This paper performs the first large-scale study of system-level memory utilization in the context of HPC systems and presents the first exploration of architectural techniques to improve memory utilization specifically for HPC systems.
Journal ArticleDOI
A Case For Intra-rack Resource Disaggregation in HPC
George Michelogiannakis, Benjamin Klenk, Brandon Cook, Min Yee Teh, Madeleine Glick, Larry R. Dennison, Keren Bergman, John Shalf
TL;DR: It is shown that, for a rack (cabinet) configuration and applications similar to Cori, a central processing unit with intra-rack disaggregation has a 99.5% probability of finding all the resources it requires inside its rack.
Journal ArticleDOI
Pricing schemes for energy-efficient HPC systems: Design and exploration
TL;DR: Energy efficiency is of paramount importance for the sustainability of high-performance computing (HPC) systems: energy consumption limits the peak performance of supercomputers and accounts for a...
Book ChapterDOI
A Survey of Application Memory Usage on a National Supercomputer: An Analysis of Memory Requirements on ARCHER
TL;DR: Analysis of memory use by software application type reveals differences in memory use between periodic electronic structure, atomistic N-body, grid-based climate modelling, and grid-based CFD applications.
Journal ArticleDOI
Pricing Schemes for Energy-Efficient HPC Systems: Design and Exploration
TL;DR: In this article, the authors present a parametrized model to analyze the impact of frequency scaling on energy and to assess the potential total cost benefits for the HPC facility and the user.
References
Proceedings ArticleDOI
The SPLASH-2 programs: characterization and methodological considerations
TL;DR: This paper quantitatively characterizes the SPLASH-2 programs in terms of fundamental properties and architectural interactions that are important to understanding them well, including computational load balance, communication-to-computation ratio and traffic needs, important working-set sizes, and issues related to spatial locality.
Proceedings ArticleDOI
The PARSEC benchmark suite: characterization and architectural implications
TL;DR: This paper presents and characterizes the Princeton Application Repository for Shared-Memory Computers (PARSEC), a benchmark suite for studies of Chip-Multiprocessors (CMPs), and shows that the benchmark suite covers a wide spectrum of working sets, locality, data sharing, synchronization and off-chip traffic.
The Landscape of Parallel Computing Research: A View from Berkeley
Krste Asanovic, Ras Bodik, Bryan Catanzaro, Joseph Gebis, Parry Husbands, Kurt Keutzer, David A. Patterson, William Plishker, John Shalf, Samuel Williams, Katherine Yelick
TL;DR: The parallel landscape is framed with seven questions, and the following recommendations are made for exploring the design space rapidly: the overarching goal should be to make it easy to write programs that execute efficiently on highly parallel computing systems, and the target should be 1000s of cores per chip, as such chips are built from the processing elements that are most efficient in MIPS (Million Instructions per Second) per watt, MIPS per area of silicon, and MIPS per development dollar.
Journal ArticleDOI
The LINPACK Benchmark: past, present and future
TL;DR: Aside from the LINPACK benchmark suite, the TOP500 list and the HPL code are presented, and information is given on how to interpret the results of the benchmark and how those results fit into the performance-evaluation process.
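As a small illustration of how LINPACK results are typically interpreted (the node figures below are assumed round numbers, not from any real system or from this reference), the theoretical peak Rpeak follows from core count, clock rate, and FLOPs per cycle, and the commonly reported efficiency is the ratio of the measured Rmax to that peak:

```python
def rpeak_gflops(cores, ghz, flops_per_cycle):
    """Theoretical peak: every core retiring its maximum FLOPs each cycle."""
    return cores * ghz * flops_per_cycle

def hpl_efficiency(rmax_gflops, rpeak):
    """Fraction of theoretical peak actually achieved by the HPL run."""
    return rmax_gflops / rpeak

# Assumed example node: 32 cores at 2.5 GHz, 16 double-precision FLOPs/cycle
peak = rpeak_gflops(32, 2.5, 16)         # 1280.0 Gflop/s
print(peak, hpl_efficiency(1024, peak))  # prints 1280.0 0.8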