Proceedings Article
PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches
Yuejian Xie, Gabriel H. Loh
Vol. 37, Iss. 3, pp. 174-183
TL;DR: This work proposes a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing all with a single mechanism.
Abstract:
Many multi-core processors employ a large last-level cache (LLC) shared among the multiple cores. Past research has demonstrated that sharing-oblivious cache management policies (e.g., LRU) can lead to poor performance and fairness when the multiple cores compete for the limited LLC capacity. Different memory access patterns can cause cache contention in different ways, and various techniques have been proposed to target some of these behaviors. In this work, we propose a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing all with a single mechanism. By handling multiple types of memory behaviors, our proposed technique outperforms techniques that target only either capacity partitioning or adaptive insertion.
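The abstract does not spell out the mechanism, so the following is only a minimal sketch of how insertion and promotion policies could jointly approximate partitioning. It assumes each cache set is kept as a priority list, that each core inserts new lines at a per-core position counted from the low-priority end, and that a hit moves a line up by a single position with some probability. The class name `PseudoPartitionedSet` and the parameters `insert_pos` and `promote_prob` are hypothetical and chosen for illustration.

```python
import random


class PseudoPartitionedSet:
    """One cache set kept as a priority list; index 0 is the next victim."""

    def __init__(self, ways, insert_pos, promote_prob=0.75, rng=None):
        self.ways = ways
        # Assumed per-core insertion position (1 = lowest priority).
        self.insert_pos = insert_pos
        # Assumed probability of promoting a line on a hit.
        self.promote_prob = promote_prob
        self.rng = rng or random.Random(0)
        self.lines = []  # (core, tag) pairs, lowest priority first

    def access(self, core, tag):
        """Return True on a hit and False on a miss."""
        entry = (core, tag)
        if entry in self.lines:
            i = self.lines.index(entry)
            # Promotion: move up one position only, and only probabilistically.
            if i + 1 < len(self.lines) and self.rng.random() < self.promote_prob:
                self.lines[i], self.lines[i + 1] = self.lines[i + 1], self.lines[i]
            return True
        # Miss: evict the lowest-priority line if the set is full.
        if len(self.lines) == self.ways:
            self.lines.pop(0)
        # Insertion: place the new line at this core's position.
        pos = min(self.insert_pos[core] - 1, len(self.lines))
        self.lines.insert(pos, entry)
        return False


if __name__ == "__main__":
    # Core 0 inserts at position 3, core 1 (a streaming core) at position 1.
    cache_set = PseudoPartitionedSet(4, {0: 3, 1: 1}, promote_prob=1.0)
    for tag in "abc":
        cache_set.access(0, tag)
    for tag in range(10):
        cache_set.access(1, tag)  # core 1 only displaces its own lines
    print(cache_set.lines)
```

In this sketch, a core with a low insertion position tends to evict its own lines first, which behaves like a small partition, while lines that are reused climb slowly and can take capacity from other cores. No way of the set is strictly reserved, which is why the partitioning is only approximate.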
Citations
Proceedings Article
High performance cache replacement using re-reference interval prediction (RRIP)
TL;DR: This paper proposes Static RRIP, which is scan-resistant, and Dynamic RRIP (DRRIP), which is both scan-resistant and thrash-resistant. Both require only 2 bits per cache block and integrate easily into existing LRU approximations found in modern processors.
Proceedings Article
Q-clouds: managing performance interference effects for QoS-aware clouds
TL;DR: This paper develops Q-Clouds, a QoS-aware control framework that tunes resource allocations to mitigate performance interference effects. Q-Clouds uses online feedback to build a multi-input multi-output (MIMO) model that captures performance interference interactions, and uses that model to perform closed-loop resource management.
Proceedings Article
Bubble-Up: increasing utilization in modern warehouse scale computers via sensible co-locations
TL;DR: This paper presents Bubble-Up, a characterization methodology that enables accurate prediction of the performance degradation that results from contention for shared resources in the memory subsystem. Bubble-Up can predict the performance interference between co-located applications to within 1% to 2% of the actual performance degradation.
Proceedings Article
Heracles: improving resource efficiency at scale
TL;DR: This paper presents Heracles, a feedback-based controller that enables the safe colocation of best-effort tasks alongside a latency-critical service. Heracles dynamically manages multiple hardware and software isolation mechanisms to ensure that the latency-sensitive job meets its latency targets while maximizing the resources given to best-effort tasks.
Proceedings Article
SHiP: signature-based hit predictor for high performance caching
Carole-Jean Wu, Aamer Jaleel, William C. Hasenplaugh, Margaret Martonosi, Simon C. Steely, Joel Emer
TL;DR: This paper proposes a novel Signature-based Hit Predictor (SHiP) to learn the re-reference behavior of cache lines belonging to each signature, and finds that SHiP offers substantial improvements over the baseline LRU replacement and state-of-the-art replacement policy proposals.
References
Proceedings Article
MiBench: A free, commercially representative embedded benchmark suite
Matthew R. Guthaus, Jeff Ringenberg, Daniel J. Ernst, Todd Austin, Trevor Mudge, Richard B. Brown
TL;DR: A new version of SimpleScalar that has been adapted to the ARM instruction set is used to characterize the performance of the benchmarks using configurations similar to current and next generation embedded processors.
Proceedings Article
MediaBench: a tool for evaluating and synthesizing multimedia and communications systems
TL;DR: MediaBench is a benchmark suite designed to fill the gap between the compiler community and embedded applications developers. It was constructed through a three-step process: intuition and market driven initial selection, experimental measurement, and integration with system synthesis algorithms to establish usefulness.
Journal Article
SimpleScalar: an infrastructure for computer system modeling
TL;DR: The SimpleScalar tool set provides an infrastructure for simulation and architectural modeling that can model a variety of platforms ranging from simple unpipelined processors to detailed dynamically scheduled microarchitectures with multiple-level memory hierarchies.
Proceedings Article
Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches
TL;DR: In this article, the authors propose a low-overhead, runtime mechanism that partitions a shared cache between multiple applications depending on the reduction in cache misses that each application is likely to obtain for a given amount of cache resources.
Journal Article
Drowsy caches: simple techniques for reducing leakage power
TL;DR: It is argued that the use of drowsy caches can simplify the design and control of low-leakage caches, and avoid the need to completely turn off selected cache lines and lose their state.