Topic

Cache invalidation

About: Cache invalidation is a research topic. Over its lifetime, 10,539 publications have been published on this topic, receiving 245,409 citations.

Papers

Proceedings Article

Learning-based optimization of cache content in a small cell base station

Pol Blasco¹, Deniz Gunduz¹

Imperial College London¹

10 Jun 2014

TL;DR: The authors study optimal cache content placement in a wireless small cell base station (sBS) with limited backhaul capacity, where the placement is optimized based on the demand history.

Abstract: Optimal cache content placement in a wireless small cell base station (sBS) with limited backhaul capacity is studied. The sBS has a large cache memory and provides content-level selective offloading by delivering high data rate contents to users in its coverage area. The goal of the sBS content controller (CC) is to store the most popular contents in the sBS cache memory such that the maximum amount of data can be fetched directly from the sBS, not relying on the limited backhaul resources during peak traffic periods. If the popularity profile is known in advance, the problem reduces to a knapsack problem. However, it is assumed in this work that the popularity profile of the files is not known by the CC, and it can only observe the instantaneous demand for the cached content. Hence, the cache content placement is optimised based on the demand history. By refreshing the cache content at regular time intervals, the CC tries to learn the popularity profile, while exploiting the limited cache capacity in the best way possible. Three algorithms are studied for this cache content placement problem, leading to different exploitation-exploration trade-offs. We provide extensive numerical simulations in order to study the time-evolution of these algorithms, and the impact of the system parameters, such as the number of files, the number of users, the cache size, and the skewness of the popularity profile, on the performance. It is shown that the proposed algorithms quickly learn the popularity profile for a wide range of system parameters.

322 citations
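
The exploration-exploitation trade-off described in the abstract above can be illustrated with a small simulation. This is only a sketch under stated assumptions, not one of the three algorithms the paper studies: it uses a generic epsilon-greedy rule, assumes all files have equal size, and the names `choose_cache`, `simulate_hit_rate` and the `epsilon` parameter are hypothetical. As in the abstract, the controller observes demand only for files that are currently cached.

```python
import random


def choose_cache(estimates, cache_size, epsilon, rng):
    """Fill the cache slot by slot: usually the best-looking file, sometimes a random one."""
    ranked = sorted(range(len(estimates)), key=lambda f: estimates[f], reverse=True)
    cache = []
    for _ in range(cache_size):
        remaining = [f for f in ranked if f not in cache]
        if rng.random() < epsilon:
            cache.append(rng.choice(remaining))  # explore: try a random uncached file
        else:
            cache.append(remaining[0])           # exploit: highest estimated demand
    return cache


def simulate_hit_rate(popularity, cache_size, rounds, requests_per_round,
                      epsilon=0.1, seed=0):
    """Return the fraction of requests served from the cache over all rounds."""
    rng = random.Random(seed)
    n = len(popularity)
    demand_sum = [0.0] * n   # total observed demand while the file was cached
    times_cached = [0] * n   # number of rounds the file spent in the cache
    hits = 0
    for _ in range(rounds):
        # Files never cached get an optimistic estimate so each is tried at least once.
        estimates = [demand_sum[f] / times_cached[f] if times_cached[f] else float("inf")
                     for f in range(n)]
        cache = choose_cache(estimates, cache_size, epsilon, rng)
        requests = rng.choices(range(n), weights=popularity, k=requests_per_round)
        for f in cache:
            served = requests.count(f)  # demand is observable only for cached files
            demand_sum[f] += served
            times_cached[f] += 1
            hits += served
    return hits / (rounds * requests_per_round)
```

Calling `simulate_hit_rate([1.0 / (r + 1) for r in range(20)], cache_size=4, rounds=200, requests_per_round=50)` runs the sketch on a skewed (Zipf-like) popularity profile; raising `epsilon` spends more cache slots on learning and fewer on serving known-popular files.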

Book Chapter

Reducing Memory and Traffic Requirements for Scalable Directory-Based Cache Coherence Schemes*

Anoop Gupta¹, Wolf-Dietrich Weber¹, Todd C. Mowry¹

Stanford University¹

01 Jan 1992

TL;DR: As multiprocessors are scaled beyond single bus systems, there is renewed interest in directory-based cache coherence schemes, which rely on a directory to keep track of all processors caching a memory block. Schemes that use a limited number of pointers per directory entry reduce directory memory but often cause excessive invalidation traffic.

Abstract: As multiprocessors are scaled beyond single bus systems, there is renewed interest in directory-based cache coherence schemes. These schemes rely on a directory to keep track of all processors caching a memory block. When a write to that block occurs, point-to-point invalidation messages are sent to keep the caches coherent. A straightforward way of recording the identities of processors caching a memory block is to use a bit vector per memory block, with one bit per processor. Unfortunately, when the main memory grows linearly with the number of processors, the total size of the directory memory grows as the square of the number of processors, which is prohibitive for large machines. To remedy this problem, several schemes that use a limited number of pointers per directory entry have been suggested. These schemes often cause excessive invalidation traffic.

321 citations
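
The quadratic growth mentioned in the abstract above follows from simple arithmetic, sketched here. The function names and the `blocks_per_node` parameter are hypothetical, and the pointer width (enough bits to name any one processor) is an assumption made for illustration rather than a detail taken from the chapter.

```python
def full_bit_vector_directory_bits(processors, blocks_per_node):
    """One presence bit per processor for every memory block in the machine."""
    total_blocks = processors * blocks_per_node  # memory grows linearly with processors
    return total_blocks * processors             # so the directory grows quadratically


def limited_pointer_directory_bits(processors, blocks_per_node, pointers):
    """A fixed number of pointers per block, each wide enough to name one processor."""
    total_blocks = processors * blocks_per_node
    pointer_width = max(1, (processors - 1).bit_length())  # ceil(log2(processors))
    return total_blocks * pointers * pointer_width
```

With 1024 blocks per node, doubling the machine from 64 to 128 processors quadruples the bit-vector directory, while a directory with 4 pointers per entry grows only slightly faster than linearly. The price, as the abstract notes, is extra invalidation traffic when a block has more sharers than pointers.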

Proceedings Article

Adaptive insertion policies for managing shared caches

Aamer Jaleel¹, William C. Hasenplaugh¹, Moinuddin K. Qureshi², Julien Sebot¹, Simon C. Steely¹, Joel Emer¹

Intel¹, IBM²

25 Oct 2008

TL;DR: This paper proposes Thread-Aware Dynamic Insertion Policy (TADIP), an adaptive insertion policy that can take into account the memory requirements of each of the concurrently executing applications and provides performance benefits similar to doubling the size of an LRU-managed cache.

Abstract: Chip Multiprocessors (CMPs) allow different applications to concurrently execute on a single chip. When applications with differing demands for memory compete for a shared cache, the conventional LRU replacement policy can significantly degrade cache performance when the aggregate working set size is greater than the shared cache. In such cases, shared cache performance can be significantly improved by preserving the entire working set of applications that can co-exist in the cache and preserving some portion of the working set of the remaining applications. This paper investigates the use of adaptive insertion policies to manage shared caches. We show that directly extending the recently proposed dynamic insertion policy (DIP) is inadequate for shared caches since DIP is unaware of the characteristics of individual applications. We propose Thread-Aware Dynamic Insertion Policy (TADIP) that can take into account the memory requirements of each of the concurrently executing applications. Our evaluation with multi-programmed workloads for 2-core, 4-core, 8-core, and 16-core CMPs shows that a TADIP-managed shared cache improves overall throughput by as much as 94%, 64%, 26%, and 16%, respectively (on average 14%, 18%, 15%, and 17%) over the baseline LRU policy. The performance benefit of TADIP is 2.6x compared to DIP and 1.3x compared to the recently proposed Utility-based Cache Partitioning (UCP) scheme. We also show that a TADIP-managed shared cache provides performance benefits similar to doubling the size of an LRU-managed cache. Furthermore, TADIP requires a total storage overhead of less than two bytes per core, does not require changes to the existing cache structure, and performs similarly to LRU for LRU-friendly workloads.

321 citations
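
The abstract above does not spell out how an insertion policy works, so the following is a sketch of the general idea only, not of DIP or TADIP themselves. It assumes that "insertion policy" means choosing where in the recency order a newly fetched line is placed, and compares conventional insertion at the most recently used (MRU) position with insertion at the least recently used (LRU) position for a single fully associative set. The function name `count_hits` and its parameters are hypothetical.

```python
def count_hits(accesses, capacity, insert_at_mru):
    """Simulate one fully associative set; index 0 is MRU, the last index is LRU."""
    stack = []
    hits = 0
    for line in accesses:
        if line in stack:
            hits += 1
            stack.remove(line)
            stack.insert(0, line)      # a hit always promotes the line to MRU
        else:
            if len(stack) == capacity:
                stack.pop()            # evict the line in the LRU position
            if insert_at_mru:
                stack.insert(0, line)  # conventional LRU: new lines start at MRU
            else:
                stack.append(line)     # alternative: new lines start at LRU
    return hits


# A cyclic working set of 5 lines that does not fit in a 4-line cache.
cyclic = [line for _ in range(100) for line in range(5)]
lru_hits = count_hits(cyclic, capacity=4, insert_at_mru=True)
lru_position_hits = count_hits(cyclic, capacity=4, insert_at_mru=False)
```

On this thrashing pattern, MRU insertion never hits because every line is evicted just before it is reused, while LRU-position insertion keeps part of the working set resident. This is the kind of behavior the abstract refers to when it says LRU degrades once the aggregate working set exceeds the cache.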

Journal Article

A NUCA Substrate for Flexible CMP Cache Sharing

Jaehyuk Huh¹, Changkyu Kim, Hazim Shafi², Lixin Zhang³, Doug Burger, Stephen W. Keckler

Advanced Micro Devices¹, Microsoft², IBM³

01 Aug 2007, IEEE Transactions on Parallel and Distributed Systems

TL;DR: It is demonstrated that migratory dynamic NUCA approaches improve performance significantly for a subset of the workloads at the cost of increased complexity, especially as per-application cache partitioning strategies are applied.

Abstract: We propose an organization for the on-chip memory system of a chip multiprocessor in which 16 processors share a 16-Mbyte pool of 64 level-2 (L2) cache banks. The L2 cache is organized as a nonuniform cache architecture (NUCA) array with a switched network embedded in it for high performance. We show that this organization can support a spectrum of degrees of sharing: unshared, in which each processor owns a private portion of the cache, thus reducing hit latency, and completely shared, in which every processor shares the entire cache, thus minimizing misses, and every point in between. We measure the optimal degree of sharing for different cache bank mapping policies and also evaluate a per-application cache partitioning strategy. We conclude that a static NUCA organization with sharing degrees of 2 or 4 works best across a suite of commercial and scientific parallel workloads. We demonstrate that migratory dynamic NUCA approaches improve performance significantly for a subset of the workloads at the cost of increased complexity, especially as per-application cache partitioning strategies are applied. We also evaluate the energy efficiency of each design point in terms of network traffic, bank accesses, and external memory accesses.

319 citations
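
The spectrum of sharing degrees in the abstract above can be made concrete with a little arithmetic. This sketch assumes that banks and capacity are split evenly among groups of processors, which the abstract does not state explicitly; the function name `sharing_layout` and the dictionary keys are hypothetical.

```python
PROCESSORS = 16      # from the abstract
BANKS = 64           # level-2 cache banks, from the abstract
TOTAL_MBYTES = 16    # total L2 capacity, from the abstract


def sharing_layout(sharing_degree):
    """Return how the L2 is divided when `sharing_degree` processors share one partition."""
    if sharing_degree <= 0 or PROCESSORS % sharing_degree != 0:
        raise ValueError("sharing degree must evenly divide the processor count")
    groups = PROCESSORS // sharing_degree
    return {
        "groups": groups,
        "banks_per_group": BANKS // groups,          # assumes an even split of banks
        "mbytes_per_group": TOTAL_MBYTES / groups,   # assumes an even split of capacity
    }
```

Under these assumptions, a sharing degree of 1 gives each processor a private 1-Mbyte slice of 4 banks, a degree of 16 gives one fully shared 16-Mbyte cache, and the degrees of 2 or 4 that the abstract reports as working best fall in between.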

Journal Article

A Case for MLP-Aware Cache Replacement

Moinuddin K. Qureshi¹, Daniel N. Lynch¹, Onur Mutlu¹, Yale N. Patt¹

University of Texas at Austin¹

01 May 2006

TL;DR: Evaluations with the SPEC CPU2000 benchmarks show that MLP-aware cache replacement can improve performance by as much as 23%. The paper also proposes a novel, low-hardware overhead mechanism called Sampling Based Adaptive Replacement (SBAR) to dynamically choose between an MLP-aware and a traditional replacement policy, depending on which one is more effective at reducing the number of memory related stalls.

Abstract: Performance loss due to long-latency memory accesses can be reduced by servicing multiple memory accesses concurrently. The notion of generating and servicing long-latency cache misses in parallel is called Memory Level Parallelism (MLP). MLP is not uniform across cache misses - some misses occur in isolation while some occur in parallel with other misses. Isolated misses are more costly on performance than parallel misses. However, traditional cache replacement is not aware of the MLP-dependent cost differential between different misses. Cache replacement, if made MLP-aware, can improve performance by reducing the number of performance-critical isolated misses. This paper makes two key contributions. First, it proposes a framework for MLP-aware cache replacement by using a runtime technique to compute the MLP-based cost for each cache miss. It then describes a simple cache replacement mechanism that takes both MLP-based cost and recency into account. Second, it proposes a novel, low-hardware overhead mechanism called Sampling Based Adaptive Replacement (SBAR), to dynamically choose between an MLP-aware and a traditional replacement policy, depending on which one is more effective at reducing the number of memory related stalls. Evaluations with the SPEC CPU2000 benchmarks show that MLP-aware cache replacement can improve performance by as much as 23%.

316 citations
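
The abstract above describes a replacement mechanism that weighs both MLP-based cost and recency when choosing a victim. One way to picture this is a score that adds the two, sketched below. The linear form, the `cost_weight` value, and the names `Line` and `pick_victim` are assumptions made for illustration; the abstract does not give the actual formula.

```python
from dataclasses import dataclass


@dataclass
class Line:
    tag: str
    recency: int   # 0 = least recently used; larger values are more recent
    mlp_cost: int  # assumed quantized cost of a miss on this line; isolated misses cost more


def pick_victim(lines, cost_weight=4):
    """Evict the line with the lowest combined recency and weighted miss cost."""
    return min(lines, key=lambda line: line.recency + cost_weight * line.mlp_cost)


cache_set = [
    Line("A", recency=0, mlp_cost=3),  # oldest line, but expensive to miss on
    Line("B", recency=1, mlp_cost=0),
    Line("C", recency=2, mlp_cost=0),
    Line("D", recency=3, mlp_cost=1),
]
```

With `cost_weight=0` the rule degenerates to plain LRU and evicts line A. With the default weight it evicts line B instead, keeping the line whose miss would be costly because it is likely to occur in isolation.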

Performance metrics

Papers: 10,702

Citations: 250,710

No. of papers in the topic in previous years
Year	Papers
2023	44
2022	117
2021	4
2020	8
2019	7
2018	20
