Proceedings Article

BATMAN: techniques for maximizing system bandwidth of memory systems with stacked-DRAM

TLDR
This paper proposes Bandwidth-Aware Tiered-Memory Management (BATMAN), a runtime mechanism that manages the distribution of memory accesses in a tiered-memory system by explicitly controlling data movement. BATMAN incurs only an eight-byte hardware overhead and requires negligible software modification.
Abstract
Tiered-memory systems consist of high-bandwidth 3D-DRAM and high-capacity commodity-DRAM. Conventional designs attempt to improve system performance by maximizing the number of memory accesses serviced by 3D-DRAM. However, when the commodity-DRAM bandwidth is a significant fraction of overall system bandwidth, these techniques use the total bandwidth offered by the tiered-memory system inefficiently and yield sub-optimal performance. In such situations, performance can be improved by distributing memory accesses between the two memories in proportion to the bandwidth of each. Ideally, we want a simple and effective runtime mechanism that achieves the desired access distribution without requiring significant hardware or software support. This paper proposes Bandwidth-Aware Tiered-Memory Management (BATMAN), a runtime mechanism that manages the distribution of memory accesses in a tiered-memory system by explicitly controlling data movement. BATMAN monitors the number of accesses to both memories, and when the number of 3D-DRAM accesses exceeds the desired threshold, BATMAN disallows data movement from commodity-DRAM to 3D-DRAM and proactively moves data from 3D-DRAM to commodity-DRAM. We demonstrate BATMAN on systems that architect the 3D-DRAM as either a hardware-managed cache (cache mode) or a part of the OS-visible memory space (flat mode). Our evaluations on a system with 4GB of 3D-DRAM and 32GB of commodity-DRAM show that BATMAN improves performance by an average of 11% and 10%, and energy-delay product by 13% and 11%, for systems in the cache and flat modes, respectively. BATMAN incurs only an eight-byte hardware overhead and requires negligible software modification.
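The control policy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the names `target_fraction` and `AccessMonitor`, the two-counter layout, and the bandwidth figures in the usage note are assumptions made for illustration only. The sketch assumes the desired threshold is the 3D-DRAM share of total bandwidth, which is one reading of "proportional to the bandwidth of each memory".

```python
from dataclasses import dataclass


def target_fraction(bw_3d: float, bw_commodity: float) -> float:
    """Share of accesses 3D-DRAM should serve so that each memory's
    load is proportional to its bandwidth (assumed threshold rule)."""
    return bw_3d / (bw_3d + bw_commodity)


@dataclass
class AccessMonitor:
    """Hypothetical monitor: counts accesses to each memory tier."""
    target: float            # desired 3D-DRAM access fraction
    accesses_3d: int = 0     # accesses serviced by 3D-DRAM
    accesses_commodity: int = 0  # accesses serviced by commodity-DRAM

    def record(self, served_by_3d: bool) -> None:
        if served_by_3d:
            self.accesses_3d += 1
        else:
            self.accesses_commodity += 1

    def fraction_3d(self) -> float:
        total = self.accesses_3d + self.accesses_commodity
        return self.accesses_3d / total if total else 0.0

    def allow_move_to_3d(self) -> bool:
        """Data may move commodity-DRAM -> 3D-DRAM only while the
        3D-DRAM share of accesses does not exceed the target."""
        return self.fraction_3d() <= self.target

    def should_move_out_of_3d(self) -> bool:
        """Above the target, proactively move data from 3D-DRAM
        back to commodity-DRAM."""
        return self.fraction_3d() > self.target
```

As a usage example with made-up numbers: if 3D-DRAM offered four times the bandwidth of commodity-DRAM, `target_fraction(4.0, 1.0)` would give a target of 0.8. A monitor that has recorded 90 accesses to 3D-DRAM and 10 to commodity-DRAM would then report a 3D-DRAM share of 0.9, so `allow_move_to_3d()` returns `False` and `should_move_out_of_3d()` returns `True`.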


Citations
Proceedings Article

Nimble Page Management for Tiered Memory Systems

TL;DR: This work proposes and implements a general purpose OS-integrated multi-level memory management system that reuses current OS page tracking structures to tier pages directly between memories with no additional monitoring overhead and augments this system with four additional optimizations.
Journal Article

A survey of optimization techniques for thermal-aware 3D processors

TL;DR: This paper shows that the thermal impact on 3D processors is manageable by adopting thermal-aware techniques, thus bringing 3D processors into the mainstream in the near future.
Proceedings Article

Kleio: A Hybrid Memory Page Scheduler with Machine Intelligence

TL;DR: Kleio is a hybrid page scheduler that combines existing, lightweight, history-based data tiering methods for hybrid memory, with novel intelligent placement decisions based on deep neural networks, which provides hybrid memory systems with fast and effective neural network training and prediction accuracy levels.
Proceedings Article

PageSeer: Using Page Walks to Trigger Page Swaps in Hybrid Memory Systems

TL;DR: This work proposes PageSeer, a scheme that effectively hides the swap overhead and services many requests from the DRAM, and that also initiates other types of page swaps, building a complete solution for hybrid memory.
Proceedings Article

Hybrid2: Combining Caching and Migration in Hybrid Memory Systems

TL;DR: Hybrid2 is proposed, a new hybrid memory system architecture that combines a DRAM cache with a migration scheme that alleviates the metadata overheads of both DRAM caches and migration using a common mechanism.