Open Access Journal Article

Cache Exclusivity and Sharing: Theory and Optimization

TLDR
A new metric called the victim footprint (VFP) is presented. It is measured once per program in a solo execution, and the per-program measurements can then be combined to compute the performance of any exclusive cache hierarchy, replacing parallel testing with theoretical analysis.
Abstract
A problem on multicore systems is cache sharing, where the cache occupancy of a program depends on the cache usage of its peer programs. An exclusive cache hierarchy, as used on AMD processors, is an effective solution that gives each processor core a large private cache while still letting it benefit from the shared cache. The shared cache stores the “victims” (i.e., data evicted from private caches), so performance depends on how the victims of co-run programs interact in the shared cache.

This article presents a new metric called the victim footprint (VFP). It is measured once per program in a solo execution, and the per-program measurements can then be combined to compute the performance of any exclusive cache hierarchy, replacing parallel testing with theoretical analysis. The work evaluates the VFP by using it to analyze cache sharing among parallel mixes of sequential programs, comparing the accuracy of the theory against hardware counter results, and measuring the benefit of exclusivity-aware analysis and optimization.
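To make the victim mechanism concrete, the following is a minimal toy simulation of an exclusive two-level hierarchy. It is not the article's VFP model or measurement method. The class name `ExclusiveHierarchy`, fully associative LRU replacement at both levels, and the assumption that co-run programs touch disjoint data are all illustrative choices made here, not details taken from the article.

```python
from collections import OrderedDict


class ExclusiveHierarchy:
    """Toy model (assumed design): one private LRU cache per core plus
    one shared LRU cache that holds only victims of the private caches."""

    def __init__(self, num_cores, private_size, shared_size):
        self.private = [OrderedDict() for _ in range(num_cores)]
        self.shared = OrderedDict()
        self.private_size = private_size
        self.shared_size = shared_size

    def access(self, core, block):
        """Access `block` from `core`; return where the access was served."""
        # Tag blocks with the core id: programs are assumed not to share data.
        key = (core, block)
        priv = self.private[core]

        if key in priv:
            priv.move_to_end(key)  # refresh LRU position
            return "private_hit"

        if key in self.shared:
            # Exclusivity: the block moves up and leaves the shared cache.
            del self.shared[key]
            outcome = "shared_hit"
        else:
            outcome = "miss"

        priv[key] = True
        if len(priv) > self.private_size:
            # The least recently used private block becomes a victim
            # and is placed in the shared cache.
            victim, _ = priv.popitem(last=False)
            self.shared[victim] = True
            if len(self.shared) > self.shared_size:
                # Victims of all cores compete for the same shared space.
                self.shared.popitem(last=False)
        return outcome
```

For example, with one core, `private_size=2`, and `shared_size=2`, the access sequence `a, b, c, a` misses three times and then hits in the shared cache, because `a` was evicted from the private cache as a victim when `c` arrived. Adding a second core whose victims fill the shared cache would push `a` out earlier, which is the interference effect the abstract describes.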


Citations
Proceedings Article

DCAPS: dynamic cache allocation with partial sharing

TL;DR: This paper proposes Dynamic Cache Allocation with Partial Sharing (DCAPS), a framework that dynamically monitors and predicts a multi-programmed workload's cache demand and reallocates the LLC given a performance target. It can optimize for a wide range of performance targets and can scale to a large core count.
Proceedings Article

Locality analysis through static parallel sampling

TL;DR: A new approach to locality analysis based on static parallel sampling is described. It can predict precise miss ratio curves at cache-line granularity for complex loops with non-linear array references and even branches.
Journal Article

A Relational Theory of Locality

TL;DR: This article categorizes locality definitions in three groups and shows whether and how they can be interconverted, and gives a new measurement algorithm that is asymptotically more time/space efficient than previous approaches.
Book Chapter

HiFlipVX: An Open Source High-Level Synthesis FPGA Library for Image Processing

TL;DR: This work presents a highly optimized, parametrizable and streaming capable HLS open-source library for FPGAs called HiFlipVX that achieves an efficient resource utilization and a significant scalability, also in comparison to the reference (xfOpenCV), as shown in the evaluation.
Journal Article

Working Set Analytics

TL;DR: This tutorial traces the development of working set theory from its origins to the present day, and presents the powerful, linear-time algorithms for computing working set statistics and applying them to the design of memory systems.