Dynamic cache management in multi-core architectures through run-time adaptation
Fazal Hameed, Lars Bauer, Jörg Henkel, et al.
pp. 485–490
TLDR
This work proposes a dynamic cache management scheme for the LLC in NUCA-based architectures that reduces inter-partition contention and provides efficient cache sharing by adapting migration, insertion, and promotion policies in response to the dynamic requirements of individual applications with different cache access behaviors.

Abstract
Non-Uniform Cache Access (NUCA) architectures provide a potential solution to reduce the average latency of the last-level cache (LLC), where the cache is organized into per-core local and remote partitions. Recent research has demonstrated the benefits of cooperative cache sharing among local and remote partitions. However, ignoring the cache access patterns of concurrently executing applications sharing the local and remote partitions can cause inter-partition contention that reduces overall instruction throughput. We propose a dynamic cache management scheme for the LLC in NUCA-based architectures that reduces inter-partition contention. Our proposed scheme provides efficient cache sharing by adapting migration, insertion, and promotion policies in response to the dynamic requirements of individual applications with different cache access behaviors. Our adaptive cache management scheme allows individual cores to steal cache capacity from remote partitions to achieve better resource utilization. On average, our proposed scheme increases performance (instructions per cycle) by 28% (minimum 8.4%, maximum 75%) compared to a private LLC organization.
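As a rough illustration of the kind of adaptation the abstract describes, the sketch below (our own simplification, not the paper's actual algorithm) models a single LRU-stack cache partition whose insertion and promotion decisions depend on an application's observed reuse:

```python
class AdaptivePartition:
    """Illustrative LRU-stack cache partition whose insertion and
    promotion policies adapt to the application's observed reuse.
    Hypothetical sketch only; thresholds and policies are assumptions."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.stack = []          # index 0 = MRU end, last index = LRU end
        self.hits = 0
        self.accesses = 0

    def reuse_ratio(self):
        return self.hits / self.accesses if self.accesses else 0.0

    def access(self, block):
        """Returns True on a hit, False on a miss."""
        self.accesses += 1
        if block in self.stack:
            self.hits += 1
            i = self.stack.index(block)
            # Promotion policy: a high-reuse application promotes straight
            # to MRU; a low-reuse one is promoted by one position only.
            if self.reuse_ratio() > 0.5:
                self.stack.pop(i)
                self.stack.insert(0, block)
            elif i > 0:
                self.stack[i - 1], self.stack[i] = self.stack[i], self.stack[i - 1]
            return True
        # Insertion policy: low-reuse (streaming) applications insert near
        # the LRU end so they cannot flush the partition; high-reuse
        # applications insert at MRU.
        pos = 0 if self.reuse_ratio() > 0.5 else max(len(self.stack) - 1, 0)
        self.stack.insert(pos, block)
        if len(self.stack) > self.capacity:
            self.stack.pop()     # evict the LRU block
        return False
```

Under these assumed policies, a streaming application with a low hit ratio inserts near the LRU end and is promoted one position at a time, so it cannot evict the working set of a high-reuse application sharing the same partition.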
Citations
Proceedings Article
Reliable software for unreliable hardware: embedded code generation aiming at reliability
TL;DR: A compilation technique for reliability-aware software transformations is presented, which yields 60%–80% fewer application failures, averaged over various fault injection scenarios and fault rates.
Proceedings Article
Reducing inter-core cache contention with an adaptive bank mapping policy in DRAM cache
TL;DR: This work proposes an adaptive bank mapping policy that responds to the diverse requirements of applications with different cache access behaviors and, as a result, reduces inter-core cache contention in DRAM-based cache architectures.
Journal Article
Dynamic Cache Pooling in 3D Multicore Processors
TL;DR: This paper introduces a 3D multicore architecture that provides poolable cache resources, together with a runtime management policy that improves energy efficiency in 3D systems by exploiting the flexible heterogeneity of those cache resources.
Journal Article
Sharing and Hit based Prioritizing Replacement Algorithm for Multi-Threaded Applications
Muthukumar S, Jawahar P. K, et al.
TL;DR: A Sharing and Hit Based Prioritizing (SHP) replacement strategy that takes the sharing status of data elements into account when making replacement decisions, showing an average improvement in overall hit rate compared to the LRU algorithm.
Journal Article
Comparison of Cache Page Replacement Techniques to Enhance Cache Memory Performance
TL;DR: The purpose is to simulate the FIFO, LRU, RANDOM, and SECOND CHANCE replacement policies and to compare their results on application traces such as bzip, swim, and gcc (taken from the SPEC2000 benchmark).
References
Proceedings Article
Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches
TL;DR: In this article, the authors propose a low-overhead, runtime mechanism that partitions a shared cache between multiple applications depending on the reduction in cache misses that each application is likely to obtain for a given amount of cache resources.
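The marginal-utility idea behind this partitioning scheme can be sketched with a simple greedy allocator (a simplification of the mechanism; the function name and the miss-curve representation are ours):

```python
def partition_ways(total_ways, miss_curves):
    """Greedy utility-based allocation: repeatedly give the next cache
    way to the application whose miss count drops the most by receiving it.
    miss_curves[a][w] = misses of application a when given w ways.
    Simplified sketch of the utility-based partitioning idea."""
    alloc = [0] * len(miss_curves)
    for _ in range(total_ways):
        # marginal utility of one additional way, per application
        best = max(range(len(miss_curves)),
                   key=lambda a: miss_curves[a][alloc[a]] - miss_curves[a][alloc[a] + 1])
        alloc[best] += 1
    return alloc
```

For example, an application whose misses fall steeply with each extra way wins ways early, while a streaming application with a flat miss curve is confined to a small allocation.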
Proceedings Article
Predicting inter-thread cache contention on a chip multi-processor architecture
TL;DR: Three performance models are proposed that predict the impact of cache sharing on co-scheduled threads and the most accurate model, the inductive probability model, achieves an average error of only 3.9%.
Journal Article
Interconnections in Multi-Core Architectures: Understanding Mechanisms, Overheads and Scaling
TL;DR: Examining the area, power, performance, and design issues of the on-chip interconnects in a chip multiprocessor shows that treating the interconnect as an entity to be independently architected and optimized does not yield the best multi-core design.
Proceedings Article
Gaining insights into multicore cache partitioning: Bridging the gap between simulation and real systems
TL;DR: This paper comprehensively evaluates several representative cache partitioning schemes with different optimization objectives, including performance, fairness, and quality of service (QoS), and provides new insights into their dynamic behaviors and interaction effects.
Proceedings Article
PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches
Yuejian Xie, Gabriel H. Loh, et al.
TL;DR: This work proposes a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing all with a single mechanism.
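A minimal single-set model of this insertion/promotion idea might look like the following (parameter names and defaults are our assumptions, not taken from the paper):

```python
import random

class PIPPSet:
    """Illustrative single-set model of promotion/insertion
    pseudo-partitioning: each core inserts new blocks at the stack
    position equal to its target allocation (counted from the LRU end),
    and a hit promotes the block by one position with probability
    p_prom. Hypothetical sketch; names and defaults are assumptions."""

    def __init__(self, num_ways, targets, p_prom=0.75, seed=0):
        self.num_ways = num_ways
        self.targets = targets           # targets[core] = allocated ways
        self.p_prom = p_prom
        self.rng = random.Random(seed)
        self.stack = []                  # index 0 = LRU end

    def access(self, core, block):
        """Returns True on a hit, False on a miss."""
        if block in self.stack:
            i = self.stack.index(block)
            # probabilistic single-step promotion toward MRU
            if i < len(self.stack) - 1 and self.rng.random() < self.p_prom:
                self.stack[i], self.stack[i + 1] = self.stack[i + 1], self.stack[i]
            return True
        if len(self.stack) >= self.num_ways:
            self.stack.pop(0)            # evict the LRU block
        pos = min(self.targets[core], len(self.stack))
        self.stack.insert(pos, block)    # insert at the core's target depth
        return False
```

Because a core with a small allocation inserts close to the LRU end and blocks climb only one position per hit, cores approximate their target partition sizes without any explicit per-core way restrictions.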
Related Papers (5)
A flexible data to L2 cache mapping approach for future multicore processors
Lei Jin, Hyunjin Lee, Sangyeun Cho, et al.