Topic

Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Proceedings ArticleDOI
20 Jun 2005
TL;DR: It is demonstrated that migratory, dynamic NUCA approaches improve performance significantly for a subset of the workloads at the cost of increased power consumption and complexity, especially as per-application cache partitioning strategies are applied.
Abstract: We propose an organization for the on-chip memory system of a chip multiprocessor, in which 16 processors share a 16MB pool of 256 L2 cache banks. The L2 cache is organized as a non-uniform cache architecture (NUCA) array with a switched network embedded in it for high performance. We show that this organization can support the spectrum of degrees of sharing: unshared, in which each processor has a private portion of the cache, thus reducing hit latency; completely shared, in which every processor shares the entire cache, thus minimizing misses; and every point in between. We find the optimal degree of sharing for a number of cache bank mapping policies, and also evaluate a per-application cache partitioning strategy. We conclude that a static NUCA organization with sharing degrees of two or four works best across a suite of commercial and scientific parallel workloads. We also demonstrate that migratory, dynamic NUCA approaches improve performance significantly for a subset of the workloads at the cost of increased power consumption and complexity, especially as per-application cache partitioning strategies are applied.

218 citations
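To illustrate the bank-mapping idea behind this paper, here is a minimal sketch (not the authors' code) of how a static NUCA policy with a configurable sharing degree could map an address to one of 256 L2 banks shared by 16 processors. The modular hashing and parameter names are illustrative assumptions; only the 16-processor, 256-bank configuration comes from the abstract.

```python
# Hypothetical sketch of static NUCA bank selection with a configurable
# sharing degree. Parameters and hashing are assumptions for illustration.

NUM_CPUS = 16
NUM_BANKS = 256
LINE_BYTES = 64

def bank_for(cpu_id: int, addr: int, sharing_degree: int) -> int:
    """Map a physical address to an L2 bank for a given sharing degree.

    sharing_degree = 1  -> each CPU has a private pool of banks (lowest hit latency)
    sharing_degree = 16 -> all CPUs share every bank (fewest misses)
    """
    assert sharing_degree in (1, 2, 4, 8, 16)
    pools = NUM_CPUS // sharing_degree      # number of disjoint bank pools
    banks_per_pool = NUM_BANKS // pools     # banks available to each pool
    pool = cpu_id // sharing_degree         # CPUs in the same pool share banks
    line = addr // LINE_BYTES
    return pool * banks_per_pool + (line % banks_per_pool)

if __name__ == "__main__":
    # With degree 4, CPUs 0-3 interleave lines over banks 0-63, CPUs 4-7 over 64-127, ...
    for degree in (1, 4, 16):
        print(degree, bank_for(cpu_id=5, addr=0x12345, sharing_degree=degree))
```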

Journal ArticleDOI
S. McFarling
01 Apr 1989
TL;DR: This paper presents an optimization algorithm for reducing instruction cache misses that uses profile information to reposition programs in memory so that a direct-mapped cache behaves much like an optimal cache with full associativity and full knowledge of the future.
Abstract: This paper presents an optimization algorithm for reducing instruction cache misses. The algorithm uses profile information to reposition programs in memory so that a direct-mapped cache behaves much like an optimal cache with full associativity and full knowledge of the future. For best results, the cache should have a mechanism for excluding certain instructions designated by the compiler. This paper first presents a reduced form of the algorithm. This form is shown to produce an optimal miss rate for programs without conditionals and with a tree call graph, assuming basic blocks can be reordered at will. If conditionals are allowed, but there are no loops within conditionals, the algorithm does as well as an optimal cache for the worst-case execution of the program consistent with the profile information. Next, the algorithm is extended with heuristics for general programs. The effectiveness of these heuristics is demonstrated with empirical results for a set of 10 programs for various cache sizes. The improvement depends on cache size. For a 512-word cache, miss rates for a direct-mapped instruction cache are halved. For an 8K-word cache, miss rates fall by over 75%. Over a wide range of cache sizes the algorithm is as effective as increasing the cache size by a factor of three. For 512 words, the algorithm generates only 32% more misses than an optimal cache. Optimized programs on a direct-mapped cache have lower miss rates than unoptimized programs on set-associative caches of the same size.

217 citations
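The placement idea can be shown with a small sketch. This is not McFarling's algorithm, only a greedy hot-first layout under made-up profile data (procedure names, sizes, and execution counts are invented), illustrating why contiguous placement of frequently executed procedures reduces conflict misses in a direct-mapped instruction cache.

```python
# Hypothetical sketch of profile-guided code placement (illustration only):
# lay hot procedures out contiguously so the hot working set maps to
# distinct sets of a direct-mapped instruction cache.

CACHE_LINES = 512          # assumed direct-mapped I-cache size, in lines

def place(procedures):
    """procedures: list of (name, size_in_lines, exec_count).
    Hotter procedures are placed first and back to back, so the hottest code
    occupies distinct cache sets for as long as it fits."""
    layout, cursor = {}, 0
    for name, size, _count in sorted(procedures, key=lambda p: -p[2]):
        layout[name] = cursor
        cursor += size
    return layout

def conflicting_sets(layout, sizes):
    """Count cache sets claimed by more than one placed procedure."""
    owners = {}
    for name, start in layout.items():
        for line in range(start, start + sizes[name]):
            owners.setdefault(line % CACHE_LINES, set()).add(name)
    return sum(1 for procs in owners.values() if len(procs) > 1)

if __name__ == "__main__":
    procs = [("main", 40, 10), ("hot_loop", 200, 900),
             ("helper", 120, 850), ("cold_init", 300, 1)]
    layout = place(procs)
    sizes = {name: size for name, size, _ in procs}
    print(layout)
    print("conflicting cache sets:", conflicting_sets(layout, sizes))
```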

Proceedings ArticleDOI
13 Dec 2014
TL;DR: A novel random fill cache architecture is proposed that replaces demand fetch with random cache fill within a configurable neighborhood window, and it is shown to provide information-theoretic security against reuse based attacks.
Abstract: Correctly functioning caches have been shown to leak critical secrets like encryption keys, through various types of cache side-channel attacks. This nullifies the security provided by strong encryption and allows confidentiality breaches, impersonation attacks and fake services. Hence, future cache designs must consider security, ideally without degrading performance and power efficiency. We introduce a new classification of cache side channel attacks: contention based attacks and reuse based attacks. Previous secure cache designs target only contention based attacks, and we show that they cannot defend against reuse based attacks. We show the surprising insight that the fundamental demand fetch policy of a cache is a security vulnerability that causes the success of reuse based attacks. We propose a novel random fill cache architecture that replaces demand fetch with random cache fill within a configurable neighborhood window. We show that our random fill cache does not degrade performance, and in fact, improves the performance for some types of applications. We also show that it provides information-theoretic security against reuse based attacks.

217 citations
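A toy model of the random-fill policy, offered as an assumption-laden sketch rather than the paper's design: on a miss the demanded line is forwarded to the processor but not cached, and a line chosen at random from a configurable neighborhood window is filled instead. The window size and the fully associative random-eviction model below are illustrative assumptions.

```python
# Hypothetical sketch of a random-fill cache (illustration only): the fill
# decision is decorrelated from the demand access that caused the miss.

import random

class RandomFillCache:
    def __init__(self, num_lines=64, window=8, seed=0):
        self.lines = set()        # resident line addresses (fully associative toy model)
        self.capacity = num_lines
        self.window = window
        self.rng = random.Random(seed)

    def access(self, line_addr: int) -> bool:
        """Return True on hit. On a miss, fill a random neighbor rather than
        the demanded line itself."""
        if line_addr in self.lines:
            return True
        fill = line_addr + self.rng.randint(-self.window, self.window)
        if fill not in self.lines:
            if len(self.lines) >= self.capacity:      # evict a random victim
                self.lines.remove(self.rng.choice(tuple(self.lines)))
            self.lines.add(fill)
        return False

if __name__ == "__main__":
    cache = RandomFillCache()
    hits = sum(cache.access(a) for a in [0, 1, 2, 3] * 200)
    print("hit rate:", hits / 800)
```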

Proceedings ArticleDOI
16 Sep 2006
TL;DR: This paper designs architectural support for the OS to efficiently manage shared caches under a wide variety of policies, and demonstrates that the scheme can support policies that provide passive performance differentiation, reactive fairness by miss-rate equalization, and reactive performance differentiation.
Abstract: The role of the operating system (OS) in managing shared resources such as CPU time, memory, peripherals, and even energy is well motivated and understood [23]. Unfortunately, one key resource — lower-level shared cache in chip multi-processors — is commonly managed purely in hardware by rudimentary replacement policies such as least-recently-used (LRU). The rigid nature of the hardware cache management policy poses a serious problem since there is no single best cache management policy across all sharing scenarios. For example, the cache management policy for a scenario where applications from a single organization are running under "best effort" performance expectation is likely to be different from the policy for a scenario where applications from competing business entities (say, at a third party data center) are running under a minimum service level expectation. When it comes to managing shared caches, there is an inherent tension between flexibility and performance. On one hand, managing the shared cache in the OS offers immense policy flexibility since it may be implemented in software. Unfortunately, it is prohibitively expensive in terms of performance for the OS to be involved in managing temporally fine-grain events such as cache allocation. On the other hand, sophisticated hardware-only cache management techniques to achieve fair sharing or throughput maximization have been proposed. But they offer no policy flexibility. This paper addresses this problem by designing architectural support for OS to efficiently manage shared caches with a wide variety of policies. Our scheme consists of a hardware cache quota management mechanism, an OS interface and a set of OS level quota orchestration policies. The hardware mechanism guarantees that OS-specified quotas are enforced in shared caches, thus eliminating the need for (and the performance penalty of) temporally fine-grained OS intervention. The OS retains policy flexibility since it can tune the quotas during regularly scheduled OS interventions. We demonstrate that our scheme can support a wide range of policies including policies that provide (a) passive performance differentiation, (b) reactive fairness by miss-rate equalization and (c) reactive performance differentiation.

215 citations
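A minimal sketch of the quota-enforcement idea argued for in this paper, assuming a simple fully associative LRU model, invented application names, and the convention that every application is registered with a nonzero quota. It shows only the policy behavior (an application at its quota replaces its own lines), not the paper's hardware mechanism or OS interface.

```python
# Hypothetical sketch of quota-enforced replacement in a shared cache.
# All names and the LRU model are illustrative assumptions.

from collections import OrderedDict

class QuotaCache:
    def __init__(self, total_lines: int, quotas: dict):
        assert sum(quotas.values()) <= total_lines
        self.total = total_lines
        self.quotas = dict(quotas)     # app_id -> max resident lines
        self.lru = OrderedDict()       # line_addr -> app_id, in LRU order

    def _usage(self, app):
        return sum(1 for owner in self.lru.values() if owner == app)

    def set_quota(self, app, lines):
        """OS-level knob, tuned at scheduling granularity."""
        self.quotas[app] = lines

    def access(self, app, line_addr) -> bool:
        if line_addr in self.lru:
            self.lru.move_to_end(line_addr)
            return True
        # Victim selection: the requester's own LRU line if it is at quota,
        # otherwise (if the cache is full) the LRU line of an over-quota app.
        if self._usage(app) >= self.quotas.get(app, 0):
            victim = next(a for a, owner in self.lru.items() if owner == app)
            del self.lru[victim]
        elif len(self.lru) >= self.total:
            victim = next(a for a, owner in self.lru.items()
                          if self._usage(owner) > self.quotas.get(owner, 0))
            del self.lru[victim]
        self.lru[line_addr] = app
        return False

if __name__ == "__main__":
    cache = QuotaCache(total_lines=8, quotas={"db": 6, "stream": 2})
    for addr in range(20):             # a streaming app cannot pollute db's share
        cache.access("stream", addr)
    print("stream lines:", cache._usage("stream"))   # stays at its quota of 2
```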

Proceedings Article
12 Sep 1994
TL;DR: It is shown that there are significant benefits in redesigning traditional query processing algorithms so that they can make better use of the cache; the new algorithms run 8%-200% faster than the traditional ones.
Abstract: The current main memory (DRAM) access speeds lag far behind CPU speeds. Cache memory, made of static RAM, is being used in today’s architectures to bridge this gap. It provides access latencies of 2-4 processor cycles, in contrast to main memory, which requires 15-25 cycles. Therefore, the performance of the CPU depends upon how well the cache can be utilized. We show that there are significant benefits in redesigning our traditional query processing algorithms so that they can make better use of the cache. The new algorithms run 8%-200% faster than the traditional ones.

215 citations
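One concrete cache-conscious technique in the spirit of this paper, sketched under assumptions (the partition count, the relation layout, and the helper name partitioned_hash_join are illustrative, not taken from the paper): partitioning a hash join so that each partition's hash table stays cache-resident while it is probed.

```python
# Hypothetical sketch of a cache-conscious partitioned hash join
# (illustration only; not the paper's algorithms or code).

def partitioned_hash_join(r, s, key_r, key_s, num_partitions=64):
    """Join relations r and s (lists of dicts) on key_r == key_s.
    Both inputs are first split into small partitions so that the hash table
    built for each partition stays cache-resident while it is probed."""
    r_parts = [[] for _ in range(num_partitions)]
    s_parts = [[] for _ in range(num_partitions)]
    for row in r:
        r_parts[hash(row[key_r]) % num_partitions].append(row)
    for row in s:
        s_parts[hash(row[key_s]) % num_partitions].append(row)

    result = []
    for rp, sp in zip(r_parts, s_parts):
        table = {}
        for row in rp:                    # build a small, cache-sized hash table
            table.setdefault(row[key_r], []).append(row)
        for row in sp:                    # probe it while it is still hot
            for match in table.get(row[key_s], []):
                result.append({**match, **row})
    return result

if __name__ == "__main__":
    orders = [{"cust": i % 10, "amount": i} for i in range(100)]
    customers = [{"cust": i, "name": f"c{i}"} for i in range(10)]
    print(len(partitioned_hash_join(customers, orders, "cust", "cust")))  # 100
```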


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (93% related)
Compiler: 26.3K papers, 578.5K citations (89% related)
Scalability: 50.9K papers, 931.6K citations (87% related)
Server: 79.5K papers, 1.4M citations (86% related)
Static routing: 25.7K papers, 576.7K citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    42
2022    110
2021    12
2020    20
2019    15
2018    30