Proceedings ArticleDOI
Performance evaluation of exclusive cache hierarchies
Ying Zheng, Brian Davis, M. Jordan +2 more
- pp 89-96
TL;DR: The results of two-level cache memory simulations are presented, the impact of exclusive caching on system performance is examined, and the results indicate that significant performance advantages can be gained for some benchmarks through the use of an exclusive organization.
Abstract:
Memory hierarchy performance, specifically cache memory capacity, is a constraining factor in the performance of modern computers. This paper presents the results of two-level cache memory simulations and examines the impact of exclusive caching on system performance. Exclusive caching enables higher capacity with the same cache area by eliminating redundant copies. The experiments presented compare an exclusive cache hierarchy with an inclusive cache hierarchy utilizing similar L1 and L2 parameters. Experiments indicate that significant performance advantages can be gained for some benchmarks through the use of an exclusive organization. The performance differences are illustrated using the L2 cache miss and execution time metrics. The most significant improvement shown is a 16% reduction in execution time, with an average reduction of 8% for the smallest cache configuration tested. With an equal-size victim buffer (exclusive) and victim cache (inclusive), some benchmarks show increased execution time for exclusive caches, because a victim cache can reduce conflict misses significantly while a victim buffer can introduce worst-case penalties. Considering the inconsistent performance improvement, the increased complexity of an exclusive cache hierarchy needs to be justified based upon the specifics of the application and system.
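The capacity argument in the abstract can be illustrated with a toy model. The following is a hypothetical sketch, not the paper's simulator: the sizes, block-address trace, fully-associative organization, and LRU replacement are all simplifying assumptions chosen to make the inclusive/exclusive contrast visible in a few lines.

```python
# Toy two-level cache model (illustrative only, not the paper's simulator).
# Inclusive mode duplicates every L2-resident block that is also in L1;
# exclusive mode moves blocks between levels, so effective capacity is L1 + L2.
from collections import OrderedDict

class TwoLevel:
    def __init__(self, l1_lines, l2_lines, exclusive):
        self.l1 = OrderedDict()          # block -> None, LRU order (front = LRU)
        self.l2 = OrderedDict()
        self.l1_lines = l1_lines
        self.l2_lines = l2_lines
        self.exclusive = exclusive
        self.l2_misses = 0               # accesses that go all the way to memory

    def access(self, block):
        if block in self.l1:             # L1 hit
            self.l1.move_to_end(block)
            return
        if block in self.l2:             # L2 hit
            if self.exclusive:
                del self.l2[block]       # move, don't copy: no duplicate allowed
            else:
                self.l2.move_to_end(block)
        else:                            # miss in both levels
            self.l2_misses += 1
            if not self.exclusive:
                self._fill_l2(block)     # inclusive: fill both levels
        self._fill_l1(block)

    def _fill_l1(self, block):
        self.l1[block] = None
        if len(self.l1) > self.l1_lines:
            victim, _ = self.l1.popitem(last=False)
            if self.exclusive:
                self._fill_l2(victim)    # exclusive: L2 holds L1's victims

    def _fill_l2(self, block):
        self.l2[block] = None
        if len(self.l2) > self.l2_lines:
            victim, _ = self.l2.popitem(last=False)
            if not self.exclusive:
                self.l1.pop(victim, None)  # inclusion: back-invalidate L1 copy
```

Sweeping a 12-block working set twice through a 4-line L1 and 8-line L2 shows the effect: the exclusive hierarchy holds all 12 blocks (only the 12 compulsory misses occur), while the inclusive hierarchy's effective capacity is just the 8-line L2, so the second pass thrashes.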
Citations
Proceedings ArticleDOI
Achieving Non-Inclusive Cache Performance with Inclusive Caches: Temporal Locality Aware (TLA) Cache Management Policies
TL;DR: This work proposes Temporal Locality Aware (TLA) cache management policies to allow an inclusive LLC to be aware of the temporal locality of lines in the core caches and shows that these policies improve inclusive cache performance without requiring any additional hardware structures.
Proceedings ArticleDOI
Bypass and insertion algorithms for exclusive last-level caches
TL;DR: Detailed execution-driven simulation results show that a combination of the best insertion and bypass policies delivers an improvement of up to 61.2% and on average 3.5% in terms of instructions retired per cycle for single-threaded dynamic instruction traces running on a 2 MB 16-way exclusive LLC compared to a baseline exclusive design in the presence of well-tuned multi-stream hardware prefetchers.
Proceedings ArticleDOI
High performing cache hierarchies for server workloads: Relaxing inclusion to capture the latency benefits of exclusive caches
TL;DR: This paper investigates increasing the size of the smaller private caches in the hierarchy, rather than the shared LLC, to improve average cache access latency for workloads whose working set fits into the larger private cache, while retaining the benefits of a shared LLC.
Journal ArticleDOI
FLEXclusion: balancing cache capacity and on-chip bandwidth via flexible exclusion
TL;DR: FLEXclusion is proposed, a design that dynamically selects between exclusion and non-inclusion depending on workload behavior and reduces the on-chip LLC insertion traffic by 72.6% and improves performance by 5.9% when implemented with negligible hardware changes.
Non-Inclusion Property in Multi-level Caches Revisited
TL;DR: This paper argues that the inclusion property, a prime candidate for simplifying memory coherence protocols in multiprocessor systems, makes inefficient use of cache memory real estate on the chip due to duplication of data on multiple levels of cache.
References
Book
Computer Architecture: A Quantitative Approach
TL;DR: This best-selling title, considered for over a decade to be essential reading for every serious student and practitioner of computer design, has been updated throughout to address the most important trends facing computer designers today.
Journal ArticleDOI
Cache Memories
TL;DR: Specific aspects of cache memories investigated include: the cache fetch algorithm (demand versus prefetch), the placement and replacement algorithms, line size, store-through versus copy-back updating of main memory, cold-start versus warm-start miss ratios, multicache consistency, the effect of input/output through the cache, the behavior of split data/instruction caches, and cache size.
Book
Parallel Computer Architecture: A Hardware/Software Approach
TL;DR: This book explains the forces behind this convergence of shared-memory, message-passing, data parallel, and data-driven computing architectures and provides comprehensive discussions of parallel programming for high performance and of workload-driven evaluation, based on understanding hardware-software interactions.
Proceedings ArticleDOI
Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers
TL;DR: In this article, a hardware technique to improve cache performance is presented: a small fully-associative cache is placed between a cache and its refill path, and prefetched data are placed in this buffer rather than in the cache itself.
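The victim-cache variant of this idea, which the abstract above contrasts with a victim buffer, can be sketched in a few lines. This is a hypothetical toy model: the direct-mapped main cache, modulo indexing, and LRU victim buffer are illustrative assumptions, not the paper's evaluated design.

```python
# Toy victim cache (illustrative sketch): a direct-mapped cache backed by a
# tiny fully-associative LRU buffer that parks conflict victims, so lines
# that ping-pong between two addresses can be recovered without a memory miss.
from collections import OrderedDict

class VictimCached:
    def __init__(self, sets, victim_entries):
        self.sets = sets
        self.lines = [None] * sets       # direct-mapped: one tag per set
        self.victims = OrderedDict()     # fully associative, LRU order
        self.victim_entries = victim_entries
        self.misses = 0                  # misses that go to memory

    def access(self, block):
        idx = block % self.sets          # modulo indexing (toy assumption)
        if self.lines[idx] == block:
            return                       # direct-mapped hit
        if block in self.victims:
            del self.victims[block]      # victim-cache hit: swap, no memory miss
        else:
            self.misses += 1             # true miss: fetch from memory
        evicted = self.lines[idx]
        self.lines[idx] = block
        if evicted is not None:          # displaced line parks in the victim cache
            self.victims[evicted] = None
            if len(self.victims) > self.victim_entries:
                self.victims.popitem(last=False)
```

Alternating between two blocks that map to the same set (e.g. blocks 0 and 4 with 4 sets) costs only the two compulsory misses, since each displaced line is found in the victim cache on the next access; without the victim cache, every access would miss.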
ReportDOI
My Cache or Yours? Making Storage More Exclusive
Theodore M. Wong, John Wilkes +1 more
TL;DR: In this article, the authors explore the benefits of a simple scheme to achieve exclusive caching, in which a data block is cached at either a client or the disk array, but not both.