Topic

Cache invalidation

About: Cache invalidation is a research topic. Over the lifetime, 10539 publications have been published within this topic receiving 245409 citations.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Impact of Cache Partitioning on Multi-tasking Real Time Embedded Systems

[...]

Bach D. Bui¹, Marco Caccamo¹, Lui Sha¹, J. Martinez²•Institutions (2)

University of Illinois at Urbana–Champaign¹, Lockheed Martin Corporation²

25 Aug 2008

TL;DR: In this work, the cache partitioning problem is presented as an optimization problem whose solution sets the size of each cache partition and assigns tasks to partitions such that system worst-case utilization is minimized thus increasing real-time schedulability.

...read moreread less

Abstract: Cache partitioning techniques have been proposed in the past as a solution for the cache interference problem. Due to qualitative differences with general purpose platforms, real-time embedded systems need to minimize task real-time utilization (function of execution time and period) instead of only minimizing the number of cache misses. In this work, the partitioning problem is presented as an optimization problem whose solution sets the size of each cache partition and assigns tasks to partitions such that system worst-case utilization is minimized thus increasing real-time schedulability. Since the problem is NP-Hard, a genetic algorithm is presented to find a near optimal solution. A case study and experiments show that in a typical real-time embedded system, the proposed algorithm is able to reduce the worst-case utilization by 15% (on average) if compared to the case when the system uses a shared cache or a proportional cache partitioned environment.

...read moreread less

109 citations

Journal Article•DOI•

Nonuniform cache architectures for wire-delay dominated on-chip caches

[...]

Changkyu Kim, Doug Burger, Stephen W. Keckler

01 Nov 2003-IEEE Micro

TL;DR: The authors propose several designs that treat the cache as a network of banks and facilitate nonuniform accesses to different physical regions that offer low-latency access, increased scalability, and greater performance stability than conventional uniform access cache architectures.

...read moreread less

Abstract: Nonuniform cache access designs solve the on-chip wire delay problem for future large integrated caches. By embedding a network in the cache, NUCA designs let data migrate within the cache, clustering the working set nearest the processor. The authors propose several designs that treat the cache as a network of banks and facilitate nonuniform accesses to different physical regions. NUCA architectures offer low-latency access, increased scalability, and greater performance stability than conventional uniform access cache architectures.

...read moreread less

109 citations

Patent•

Method and system for coherently caching I/O devices across a network

[...]

James Ian Percival

22 Nov 2004

TL;DR: In this paper, the authors propose a cache that keeps regularly accessed disk I/O data within RAM that forms part of a computer systems main memory, depending on the size of the access.

...read moreread less

Abstract: The cache keeps regularly accessed disk I/O data within RAM that forms part of a computer systems main memory. The cache operates across a network of computers systems, maintaining cache coherency for the disk I/O devices that are shared by the multiple computer systems within that network. Read access for disk I/O data that is contained within the RAM is returned much faster than would occur if the disk I/O device was accessed directly. The data is held in one of three areas of the RAM for the cache, dependent on the size of the I/O access. The total RAM containing the three areas for the cache does not occupy a fixed amount of a computers main memory. The RAM for the cache grows to contain more disk I/O data on demand and shrinks when more of the main memory is required by the computer system for other uses. The user of the cache is allowed to specify which size of I/O access is allocated to the three areas for the RAM, along with a limit for the total amount of main memory that will be used by the cache at any one time.

...read moreread less

109 citations

Proceedings Article•DOI•

Locality-Driven Dynamic GPU Cache Bypassing

[...]

Chao Li¹, Shuaiwen Leon Song², Hongwen Dai¹, Albert Sidelnik³, Siva Kumar Sastry Hari³, Huiyang Zhou¹ - Show less +2 more•Institutions (3)

North Carolina State University¹, Pacific Northwest National Laboratory², Nvidia³

08 Jun 2015

TL;DR: This paper presents a design that integrates locality filtering based on reuse characteristics of GPU workloads into the decoupled tag store of the existing L1 D-cache through simple and cost-effective hardware extensions.

...read moreread less

Abstract: This paper presents novel cache optimizations for massively parallel, throughput-oriented architectures like GPUs. L1 data caches (L1 D-caches) are critical resources for providing high-bandwidth and low-latency data accesses. However, the high number of simultaneous requests from single-instruction multiple-thread (SIMT) cores makes the limited capacity of L1 D-caches a performance and energy bottleneck, especially for memory-intensive applications. We observe that the memory access streams to L1 D-caches for many applications contain a significant amount of requests with low reuse, which greatly reduce the cache efficacy. Existing GPU cache management schemes are either based on conditional/reactive solutions or hit-rate based designs specifically developed for CPU last level caches, which can limit overall performance. To overcome these challenges, we propose an efficient locality monitoring mechanism to dynamically filter the access stream on cache insertion such that only the data with high reuse and short reuse distances are stored in the L1 D-cache. Specifically, we present a design that integrates locality filtering based on reuse characteristics of GPU workloads into the decoupled tag store of the existing L1 D-cache through simple and cost-effective hardware extensions. Results show that our proposed design can dramatically reduce cache contention and achieve up to 56.8% and an average of 30.3% performance improvement over the baseline architecture, for a range of highly-optimized cache-unfriendly applications with minor area overhead and better energy efficiency. Our design also significantly outperforms the state-of-the-art CPU and GPU bypassing schemes (especially for irregular applications), without generating extra L2 and DRAM level contention.

...read moreread less

109 citations

Proceedings Article•DOI•

Secure Hierarchy-Aware Cache Replacement Policy (SHARP): Defending Against Cache-Based Side Channel Atacks

[...]

Mengjia Yan¹, Bhargava Gopireddy¹, Thomas Shull¹, Josep Torrellas¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

24 Jun 2017

TL;DR: This paper proposes to alter the line replacement algorithm of the shared cache, to prevent a process from creating inclusion victims in the caches of cores running other processes, and calls it SHARP (Secure Hierarchy-Aware cache Replacement Policy).

...read moreread less

Abstract: In cache-based side channel attacks, a spy that shares a cache with a victim probes cache locations to extract information on the victim's access patterns. For example, in evict+reload, the spy repeatedly evicts and then reloads a probe address, checking if the victim has accessed the address in between the two operations. While there are many proposals to combat these cache attacks, they all have limitations: they either hurt performance, require programmer intervention, or can only defend against some types of attacks.This paper makes the following observation for an environment with an inclusive cache hierarchy: when the spy evicts the probe address from the shared cache, the address will also be evicted from the private cache of the victim process, creating an inclusion victim. Consequently, to disable cache attacks, this paper proposes to alter the line replacement algorithm of the shared cache, to prevent a process from creating inclusion victims in the caches of cores running other processes. By enforcing this rule, the spy cannot evict the probe address from the shared cache and, hence, cannot glimpse any information on the victim's access patterns. We call our proposal SHARP (Secure Hierarchy-Aware cache Replacement Policy). SHARP efficiently defends against all existing cross-core shared-cache attacks, needs only minimal hardware modifications, and requires no code modifications. We implement SHARP in a cycle-level full-system simulator. We show that it protects against real-world attacks, and that it introduces negligible average performance degradation.

...read moreread less

109 citations

Collapse

Network Information

Performance

Metrics

10,702

Papers

250,710

Citations

No. of papers in the topic in previous years
Year	Papers
2023	44
2022	117
2021	4
2020	8
2019	7
2018	20

Cache invalidation

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics