Topic

Cache invalidation

About: Cache invalidation is a research topic. Over its lifetime, 10,539 publications have been published on this topic, receiving 245,409 citations.

Papers

Proceedings Article•DOI•

PIPP: promotion/insertion pseudo-partitioning of multi-core shared caches

Yuejian Xie¹, Gabriel H. Loh¹•Institutions (1)

Georgia Institute of Technology¹

20 Jun 2009

TL;DR: This work proposes a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing all with a single mechanism.

Abstract: Many multi-core processors employ a large last-level cache (LLC) shared among the multiple cores. Past research has demonstrated that sharing-oblivious cache management policies (e.g., LRU) can lead to poor performance and fairness when the multiple cores compete for the limited LLC capacity. Different memory access patterns can cause cache contention in different ways, and various techniques have been proposed to target some of these behaviors. In this work, we propose a new cache management approach that combines dynamic insertion and promotion policies to provide the benefits of cache partitioning, adaptive insertion, and capacity stealing all with a single mechanism. By handling multiple types of memory behaviors, our proposed technique outperforms techniques that target only either capacity partitioning or adaptive insertion.
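
The abstract names insertion and promotion policies without giving their details, so the following is only a toy sketch of the general idea: each core inserts new lines at its own position in a per-set priority list, and a hit moves a line up by one position. The class name, the `insert_pos` mapping, and the single-step promotion rule are assumptions made for illustration, not the authors' exact mechanism.

```python
class PseudoPartitionedSet:
    """One cache set kept as a priority list; index 0 is the next victim."""

    def __init__(self, ways, insert_pos):
        self.ways = ways
        # Hypothetical parameter: core id -> insertion position (0 = lowest priority).
        self.insert_pos = insert_pos
        self.lines = []  # list of (tag, core), ordered from lowest to highest priority

    def tags(self):
        return [tag for tag, _ in self.lines]

    def access(self, tag, core):
        """Return True on a hit, False on a miss."""
        for i, (t, _) in enumerate(self.lines):
            if t == tag:
                # Hit: promote the line by a single position (assumed rule).
                if i + 1 < len(self.lines):
                    self.lines[i], self.lines[i + 1] = self.lines[i + 1], self.lines[i]
                return True
        # Miss: evict the lowest-priority line if the set is full.
        if len(self.lines) == self.ways:
            self.lines.pop(0)
        # Insert at the core's position, clamped to the current occupancy.
        pos = min(self.insert_pos[core], len(self.lines))
        self.lines.insert(pos, (tag, core))
        return False
```

In this sketch a core with a higher insertion position has its new lines placed further from eviction, which approximates giving it a larger share of the set without enforcing a hard partition.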

334 citations

Patent•

Remote file services network-infrastructure cache

William M. Pitts, Joel R. Rigler, Robert E. Lister

23 Jul 1998

TL;DR: The NI Cache is a network-infrastructure cache that provides proxy file services to a plurality of client workstations concurrently requesting access to file data stored on a server, receiving their requests through a network interface.

Abstract: A network-infrastructure cache ("NI Cache") transparently provides proxy file services to a plurality of client workstations concurrently requesting access to file data stored on a server. The NI Cache includes a network interface that connects to a digital computer network. A file-request service-module of the NI Cache receives and responds to network-file-services-protocol requests from workstations through the network interface. A cache, also included in the NI Cache, stores data that is transmitted back to the workstations. A file-request generation-module, also included in the NI Cache, transmits requests for data to the server, and receives responses from the server that include data missing from the cache.

331 citations

Journal Article•DOI•

Victim Replication: Maximizing Capacity while Hiding Wire Delay in Tiled Chip Multiprocessors

Michael Zhang¹, Krste Asanovic¹•Institutions (1)

Massachusetts Institute of Technology¹

01 May 2005

TL;DR: This paper presents a new cache management policy, victim replication, which combines the advantages of private and shared schemes, and shows that victim replication reduces the average memory access latency of the shared L2 cache by an average of 16% for multi-threaded benchmarks and 24% for single-threaded benchmarks.

Abstract: In this paper, we consider tiled chip multiprocessors (CMP) where each tile contains a slice of the total on-chip L2 cache storage and tiles are connected by an on-chip network. The L2 slices can be managed using two basic schemes: 1) each slice is treated as a private L2 cache for the tile; 2) all slices are treated as a single large L2 cache shared by all tiles. Private L2 caches provide the lowest hit latency but reduce the total effective cache capacity, as each tile creates local copies of any line it touches. A shared L2 cache increases the effective cache capacity for shared data, but incurs long hit latencies when L2 data is on a remote tile. We present a new cache management policy, victim replication, which combines the advantages of private and shared schemes. Victim replication is a variant of the shared scheme which attempts to keep copies of local primary cache victims within the local L2 cache slice. Hits to these replicated copies reduce the effective latency of the shared L2 cache, while retaining the benefits of a higher effective capacity for shared data. We evaluate the various schemes using full-system simulation of both single-threaded and multi-threaded benchmarks running on an 8-processor tiled CMP. We show that victim replication reduces the average memory access latency of the shared L2 cache by an average of 16% for multi-threaded benchmarks and 24% for single-threaded benchmarks, providing better overall performance than either private or shared schemes.

331 citations

Proceedings Article•DOI•

A new memory monitoring scheme for memory-aware scheduling and partitioning

G.E. Suh¹, Srinivas Devadas¹, Larry Rudolph¹•Institutions (1)

Massachusetts Institute of Technology¹

02 Feb 2002

TL;DR: This work describes a scheme that accurately estimates the isolated miss-rate of each process as a function of cache size under the standard LRU replacement policy; the estimates can be used to schedule jobs or to partition the cache to minimize the overall miss-rate.

Abstract: We propose a low overhead, online memory monitoring scheme utilizing a set of novel hardware counters. The counters indicate the marginal gain in cache hits as the size of the cache is increased, which gives the cache miss-rate as a function of cache size. Using the counters, we describe a scheme that enables an accurate estimate of the isolated miss-rates of each process as a function of cache size under the standard LRU replacement policy. This information can be used to schedule jobs or to partition the cache to minimize the overall miss-rate. The data collected by the monitors can also be used by an analytical model of cache and memory behavior to produce a more accurate overall miss-rate for the collection of processes sharing a cache in both time and space. This overall miss-rate can be used to improve scheduling and partitioning schemes.
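
The relationship between marginal-gain counters and the miss-rate curve can be illustrated with a small sketch. It assumes one counter per LRU stack position, where counter k holds the extra hits gained by growing the cache from k to k+1 ways. The function names and the greedy allocation rule are illustrative assumptions, not the scheme from the paper, and the greedy rule is only guaranteed to be optimal when each process's marginal gains are non-increasing.

```python
def miss_curve(marginal_hits, total_accesses):
    """Return misses[k]: the number of misses with k ways under LRU."""
    misses = [total_accesses]  # with zero ways, every access misses
    for gain in marginal_hits:
        misses.append(misses[-1] - gain)
    return misses


def greedy_partition(counters, total_ways):
    """Hand out ways one at a time to the process with the largest next gain.

    counters: process name -> list of marginal hit gains per added way.
    """
    alloc = {p: 0 for p in counters}
    for _ in range(total_ways):
        best = max(
            (p for p in counters if alloc[p] < len(counters[p])),
            key=lambda p: counters[p][alloc[p]],
            default=None,
        )
        if best is None:  # every process already has all the ways it can use
            break
        alloc[best] += 1
    return alloc
```

For example, with hypothetical counters `{"A": [50, 30, 10, 5], "B": [40, 35, 2, 1]}` and four ways to share, the greedy rule gives two ways to each process.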

325 citations

Proceedings Article•DOI•

A highly configurable cache architecture for embedded systems

Chuanjun Zhang¹, Frank Vahid¹, Walid Najjar¹•Institutions (1)

University of California, Riverside¹

01 May 2003

TL;DR: This work introduces a novel cache architecture intended for embedded microprocessor platforms that can be configured by software to be direct-mapped, two-way, or four-way set associative, using a technique the authors call way concatenation, having very little size or performance overhead.

Abstract: Energy consumption is a major concern in many embedded computing systems. Several studies have shown that cache memories account for about 50% of the total energy consumed in these systems. The performance of a given cache architecture is largely determined by the behavior of the application using that cache. Desktop systems have to accommodate a very wide range of applications and therefore the manufacturer usually sets the cache architecture as a compromise given current applications, technology and cost. Unlike desktop systems, embedded systems are designed to run a small range of well-defined applications. In this context, a cache architecture that is tuned for that narrow range of applications can have both increased performance and lower energy consumption. We introduce a novel cache architecture intended for embedded microprocessor platforms. The cache can be configured by software to be direct-mapped, two-way, or four-way set associative, using a technique we call way concatenation, having very little size or performance overhead. We show that the proposed cache architecture reduces energy caused by dynamic power compared to a way-shutdown cache. Furthermore, we extend the cache architecture to also support a way shutdown method designed to reduce the energy from static power that is increasing in importance in newer CMOS technologies. Our study of 23 programs drawn from Powerstone, MediaBench and Spec2000 shows that tuning the cache's configuration saves energy for every program compared to conventional four-way set-associative as well as direct-mapped caches, with average savings of 40% compared to a four-way conventional cache.
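
As background for the configurable associativity described above, the sketch below shows the standard address arithmetic for a cache of fixed capacity: changing the number of ways changes the number of sets, and therefore how an address splits into tag and set index. It does not model the way-concatenation hardware itself. The function name and the sizes used in the usage note (8 KB capacity, 32-byte lines) are assumptions chosen for illustration.

```python
def split_address(addr, cache_bytes, line_bytes, ways):
    """Split an address into (tag, set index, byte offset) for a given associativity."""
    num_sets = cache_bytes // (line_bytes * ways)
    block = addr // line_bytes          # which cache-line-sized block the address is in
    return block // num_sets, block % num_sets, addr % line_bytes
```

With these assumed sizes, address `0x1234` maps to set 145 when the cache is direct-mapped and to set 17 when it is two-way or four-way, with the tag absorbing the bits that no longer select a set.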

323 citations

Metrics

Papers: 10,702
Citations: 250,710

No. of papers in the topic in previous years
Year	Papers
2023	44
2022	117
2021	4
2020	8
2019	7
2018	20
