Topic: Cache invalidation

About: Cache invalidation is a research topic. Over its lifetime, 10,539 publications have been published on this topic, receiving 245,409 citations.


Papers
Journal Article (DOI)
TL;DR: Two schemes for implementing associativity greater than two are proposed: the sequential multicolumn cache, an extension of the column-associative cache, and the parallel multicolumn cache, both of which can effectively reduce the average access time.
Abstract: In the race to improve cache performance, many researchers have proposed schemes that increase a cache's associativity. The associativity of a cache is the number of places in the cache where a block may reside. In a direct-mapped cache, which has an associativity of 1, there is only one location to search for a match for each reference. In a cache with associativity n-an n-way set-associative cache-there are n locations. Increasing associativity reduces the miss rate by decreasing the number of conflict, or interference, misses. The column-associative cache and the predictive sequential associative cache seem to have achieved near-optimal performance for an associativity of two. Increasing associativity beyond two, therefore, is one of the most important ways to further improve cache performance. We propose two schemes for implementing associativity greater than two: the sequential multicolumn cache, which is an extension of the column-associative cache, and the parallel multicolumn cache. For an associativity of four, they achieve the low miss rate of a four-way set-associative cache. Our simulation results show that both schemes can effectively reduce the average access time.
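To make the mechanics concrete, here is a minimal sketch (our illustration, not code from the paper) of an n-way set-associative cache with LRU replacement; the class name and address trace are invented for the example. With associativity four, four blocks that would all conflict in a direct-mapped cache can coexist in one set.

```python
from collections import OrderedDict

class SetAssociativeCache:
    """Toy n-way set-associative cache with LRU replacement."""

    def __init__(self, num_sets, ways):
        self.num_sets = num_sets
        self.ways = ways
        self.sets = [OrderedDict() for _ in range(num_sets)]  # tag -> True, LRU-ordered

    def access(self, block_addr):
        """Return True on a hit, False on a miss (filling the line on a miss)."""
        index = block_addr % self.num_sets   # which set to search
        tag = block_addr // self.num_sets    # identifies the block within the set
        lines = self.sets[index]
        if tag in lines:
            lines.move_to_end(tag)           # refresh LRU position
            return True
        if len(lines) >= self.ways:
            lines.popitem(last=False)        # evict the least recently used line
        lines[tag] = True
        return False

cache = SetAssociativeCache(num_sets=64, ways=4)
trace = [0, 64, 128, 192, 0, 64]  # four blocks that all map to set 0
hits = sum(cache.access(a) for a in trace)
print(f"hits: {hits}/{len(trace)}")  # 2/6: the reuses of blocks 0 and 64 hit
```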

84 citations

Proceedings Article (DOI)
09 Mar 2015
TL;DR: A priority-based cache allocation (PCAL) scheme is proposed that provides preferential cache capacity to a subset of high-priority threads while simultaneously allowing lower-priority threads to execute without contending for the cache.
Abstract: GPUs employ massive multithreading and fast context switching to provide high throughput and hide memory latency. Multithreading can increase contention for various system resources, however, which may result in suboptimal utilization of shared resources. Previous research has proposed variants of throttling thread-level parallelism to reduce cache contention and improve performance. Throttling approaches can, however, lead to under-utilizing thread contexts, on-chip interconnect, and off-chip memory bandwidth. This paper proposes to tightly couple the thread scheduling mechanism with the cache management algorithms such that GPU cache pollution is minimized while off-chip memory throughput is enhanced. We propose priority-based cache allocation (PCAL), which provides preferential cache capacity to a subset of high-priority threads while simultaneously allowing lower-priority threads to execute without contending for the cache. By tuning thread-level parallelism while optimizing both caching efficiency and other shared resource usage, PCAL builds upon previous thread throttling approaches, improving overall performance by an average of 17% and a maximum of 51%.
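The core PCAL idea, as we read it, can be sketched as a fill policy gated by thread priority: low-priority threads may hit in the cache but bypass it on a miss, so they contribute throughput without polluting the cache. The class below is our simplified illustration, not the paper's implementation.

```python
from collections import OrderedDict

class PriorityCache:
    """Toy cache in which only high-priority threads may allocate lines."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()  # addr -> True, LRU-ordered

    def access(self, addr, high_priority):
        if addr in self.lines:
            self.lines.move_to_end(addr)     # a hit counts for any priority
            return "hit"
        if not high_priority:
            return "bypass"                  # miss serviced without allocating
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)   # evict the LRU line
        self.lines[addr] = True
        return "miss+fill"

cache = PriorityCache(capacity=2)
print(cache.access(0, high_priority=True))   # miss+fill
print(cache.access(1, high_priority=True))   # miss+fill
print(cache.access(9, high_priority=False))  # bypass: no cache pollution
print(cache.access(0, high_priority=True))   # hit
```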

84 citations

Proceedings Article (DOI)
04 Jun 2011
TL;DR: The parallel cache-oblivious (PCO) model is presented, a relatively simple modification to the CO model that can be used to account for costs on a broad range of cache hierarchies; a new scheduler is also described that attains provably good cache performance and runtime on parallel machine models with hierarchical caches.
Abstract: For nested-parallel computations with low depth (span, critical path length), analyzing the work, depth, and sequential cache complexity suffices to attain reasonably strong bounds on the parallel runtime and cache complexity on machine models with either shared or private caches. These bounds, however, do not extend to general hierarchical caches, due to limitations in (i) the cache-oblivious (CO) model used to analyze cache complexity and (ii) the schedulers used to map computation tasks to processors. This paper presents the parallel cache-oblivious (PCO) model, a relatively simple modification to the CO model that can be used to account for costs on a broad range of cache hierarchies. The first change is to avoid capturing artificial data sharing among parallel threads, and the second is to account for parallelism-memory imbalances within tasks. Despite the more restrictive nature of PCO compared to CO, many algorithms have the same asymptotic cache complexity bounds. The paper then describes a new scheduler for hierarchical caches, which extends recent work on "space-bounded schedulers" to allow for computations with arbitrary work imbalance among parallel subtasks. This scheduler attains provably good cache performance and runtime on parallel machine models with hierarchical caches, for nested-parallel computations analyzed using the PCO model. We show that under reasonable assumptions our scheduler is "work efficient" in the sense that the costs of the cache misses are evenly balanced across the processors, i.e., the runtime can be determined within a constant factor by taking the total cost of the cache misses analyzed for a computation and dividing it by the number of processors. In contrast, to further support our model, we show that no scheduler can achieve such bounds (optimizing for both cache misses and runtime) if work, depth, and sequential cache complexity are the only parameters used to analyze a computation.
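As a much-simplified illustration of the space-bounded scheduling idea (our sketch; the paper's scheduler handles nesting and work imbalance far more carefully), each task is pinned to the smallest cache level whose capacity covers the task's memory footprint, so its working set stays resident there. The level sizes below are invented for the example.

```python
# Assumed, illustrative cache hierarchy: (name, capacity in bytes).
CACHE_LEVELS = [("L1", 32_768), ("L2", 262_144), ("L3", 8_388_608)]

def assign_level(task_footprint_bytes):
    """Pick the smallest cache level whose capacity covers the task's footprint."""
    for name, capacity in CACHE_LEVELS:
        if task_footprint_bytes <= capacity:
            return name
    return "DRAM"  # the footprint exceeds every cache level

for footprint in (16_384, 100_000, 16_777_216):
    print(f"{footprint:>10} bytes -> {assign_level(footprint)}")
```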

84 citations

Patent
30 Sep 1999
TL;DR: A multiple-level cache structure and caching method is presented that distributes I/O processing loads, including caching operations, between processors to provide higher-performance processing, especially in a server environment.
Abstract: This invention provides a multiple-level cache structure and multiple-level caching method that distributes I/O processing loads, including caching operations, between processors to provide higher-performance I/O processing, especially in a server environment. A method of achieving optimal data throughput by taking full advantage of multiple processing resources is disclosed. A method for managing the allocation of the data caches to optimize host access time and parity generation is disclosed. A cache allocation for RAID stripes that guarantees fast access times for the XOR engine by ensuring that all cache lines are allocated from the same cache level is disclosed. Allocation of cache lines for RAID levels that do not require parity generation, performed in such a manner as to maximize utilization of memory bandwidth, is disclosed. Parity generation optimized to use the processor least utilized at the time the cache lines are allocated, thereby providing dynamic load balancing among the multiple processing resources, is disclosed. An inventive cache line descriptor for maintaining information about which cache data pool a cache line resides within, including enhancements that allow movement of cache data from one cache level to another, is disclosed. A cache line descriptor with enhancements for tracking the cache within which a RAID stripe's cache line siblings reside is disclosed. Systems, apparatus, computer program products, and methods to support these aspects, alone and in combination, are also provided.
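The patent publishes no code, but the cache line descriptor it describes might look roughly like the following hypothetical sketch; all field and method names are ours, not the patent's.

```python
from dataclasses import dataclass, field

@dataclass
class CacheLineDescriptor:
    """Hypothetical descriptor tracking where a cache line lives."""
    lba: int                     # logical block address of the cached data
    cache_level: int             # cache level currently holding the line
    pool_id: int                 # data pool within that level
    stripe_id: int               # RAID stripe this line belongs to
    sibling_locations: dict = field(default_factory=dict)  # sibling lba -> cache level

    def migrate(self, new_level, new_pool):
        """Move the line to another cache level, updating the descriptor."""
        self.cache_level = new_level
        self.pool_id = new_pool

desc = CacheLineDescriptor(lba=4096, cache_level=1, pool_id=0, stripe_id=7)
desc.migrate(new_level=2, new_pool=3)
print(desc)
```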

84 citations

Patent
29 May 2002
TL;DR: A method and apparatus in a data processing system for caching data in an internal cache and in an external cache is presented: a set of fragments is received for caching, and a storage location is identified for each fragment based on the rate of change of its data.
Abstract: A method and apparatus in a data processing system for caching data in an internal cache and in an external cache. A set of fragments is received for caching. A location is identified to store each fragment within the plurality of fragments based on a rate of change of data in each fragment. The set of fragments is stored in the internal cache and the external cache using the location identified for each fragment within the plurality of fragments.
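A minimal sketch of the selection step under one plausible reading of the patent; the policy direction (frequently changing fragments stay internal, stable fragments also propagate to the external cache) and the threshold are our assumptions, not taken from the patent text.

```python
# Assumed policy: every fragment is cached internally; only fragments whose
# data changes slowly are also pushed to the (farther, shared) external cache.
CHANGES_PER_HOUR_THRESHOLD = 10  # illustrative cutoff, not from the patent

internal_cache, external_cache = {}, {}

def store_fragment(fragment_id, content, changes_per_hour):
    internal_cache[fragment_id] = content         # always cached internally
    if changes_per_hour < CHANGES_PER_HOUR_THRESHOLD:
        external_cache[fragment_id] = content     # stable enough to cache externally

store_fragment("page_header", "<div>...</div>", changes_per_hour=1)
store_fragment("stock_ticker", "<span>...</span>", changes_per_hour=600)
print(sorted(external_cache))  # only the stable fragment propagated
```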

84 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (93% related)
Scalability: 50.9K papers, 931.6K citations (88% related)
Server: 79.5K papers, 1.4M citations (88% related)
Network packet: 159.7K papers, 2.2M citations (83% related)
Dynamic Source Routing: 32.2K papers, 695.7K citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:
Year: Papers
2023: 44
2022: 117
2021: 4
2020: 8
2019: 7
2018: 20