Topic

Cache invalidation

About: Cache invalidation is a research topic. Over its lifetime, 10,539 publications have been published on this topic, receiving 245,409 citations.


Papers
Journal ArticleDOI
TL;DR: An efficient compiler framework for cache bypassing on GPUs is proposed and efficient algorithms that judiciously select global load instructions for cache access or bypass are presented.
Abstract: Graphics processing units (GPUs) have become ubiquitous for general purpose applications due to their tremendous computing power. Initially, GPUs employed only scratchpad memory as on-chip memory. Though scratchpad memory benefits many applications, it is not ideal for general purpose applications with irregular memory accesses. Hence, GPU vendors have introduced caches in conjunction with scratchpad memory in recent generations of GPUs. The caches on GPUs are highly configurable: the programmer or compiler can explicitly control cache access or bypass for global load instructions. This configurability opens up opportunities for optimizing cache performance. In this paper, we propose an efficient compiler framework for cache bypassing on GPUs. Our objective is to efficiently utilize the configurable cache and improve the overall performance of general purpose GPU applications. To achieve this goal, we first characterize GPU cache utilization and develop performance metrics to estimate cache reuse and memory traffic. Next, we present efficient algorithms that judiciously select global load instructions for cache access or bypass. Finally, we present techniques to explore the unified cache and shared memory design space. We integrate our techniques into an automatic compiler framework that leverages the parallel thread execution (PTX) instruction set architecture to enable cache bypassing for GPUs. Experimental evaluation on an NVIDIA GTX680 using a variety of applications demonstrates that, compared to cache-all and bypass-all solutions, our techniques improve performance by 4.6% to 13.1% for a 16 KB L1 cache.

60 citations
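The selection step described in the abstract lends itself to a small illustration. The C++ sketch below shows how per-load reuse estimates might drive a cache-or-bypass decision; the GlobalLoad fields, the threshold, and the load names are hypothetical assumptions rather than the paper's actual framework, and in a real compiler the bypassed loads would be lowered to cache-bypassing PTX load variants rather than reported as flags.

```cpp
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Hypothetical per-load profile: how a compiler framework might summarize
// a global load instruction before deciding to cache or bypass it.
struct GlobalLoad {
    std::string name;        // identifier of the load site (illustrative)
    double estimatedReuses;  // expected L1 hits per cache line brought in
    double bytesPerAccess;   // memory traffic generated by one access
};

// Sketch of the selection step: loads whose estimated reuse does not justify
// occupying L1 capacity are marked for bypass. The threshold is illustrative.
std::vector<bool> selectBypass(const std::vector<GlobalLoad>& loads,
                               double reuseThreshold = 1.0) {
    std::vector<bool> bypass;
    bypass.reserve(loads.size());
    for (const auto& ld : loads)
        bypass.push_back(ld.estimatedReuses < reuseThreshold);
    return bypass;
}

int main() {
    std::vector<GlobalLoad> loads = {
        {"load_A", 3.2, 4.0},   // reused often -> cache in L1
        {"load_B", 0.1, 4.0},   // streaming access -> bypass L1
    };
    auto decision = selectBypass(loads);
    for (std::size_t i = 0; i < loads.size(); ++i)
        std::cout << loads[i].name
                  << (decision[i] ? " -> bypass L1\n" : " -> cache in L1\n");
}
```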

Proceedings ArticleDOI
01 Oct 2006
TL;DR: This work deconstructs and compares the two dominant existing approaches for L1 data cache (L1D) error protection and presents a new error protection scheme, called the punctured ECC recovery cache (PERC), that achieves the best features of both existing schemes.
Abstract: We deconstruct and compare the two dominant existing approaches for L1 data cache (L1D) error protection, with respect to performance, L2 cache bandwidth, power, and area. The two approaches are: (1) parity on the L1D with write-through to an ECC-protected L2, and (2) ECC protection on the L1D. Qualitatively, the first approach requires a write-through L1D, which places a large bandwidth and power demand on the L2. The second approach adds more bits in the L1D for error protection, which adds to the L1D's area and power while degrading its performance. Our quantitative results show that the relative costs of the second approach are small and that its benefits outweigh these costs. We also present a new error protection scheme, called the Punctured ECC Recovery Cache (PERC), that achieves the best features of both existing schemes.

60 citations
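To make the trade-off concrete, here is a minimal C++ sketch of scheme (1): parity on the L1D detects (but cannot correct) an error, and the write-through policy guarantees a clean copy in the ECC-protected L2 to refill from. The word layout and function names are illustrative assumptions, not the paper's implementation or the PERC design.

```cpp
#include <cstdint>
#include <iostream>

// Even-parity bit over a 32-bit word: 1 if the number of set bits is odd.
// This is the detection-only protection that scheme (1) keeps on the L1D.
uint32_t parityBit(uint32_t word) {
    word ^= word >> 16;
    word ^= word >> 8;
    word ^= word >> 4;
    word ^= word >> 2;
    word ^= word >> 1;
    return word & 1u;
}

// Hypothetical L1D word with its stored parity bit (names are illustrative).
struct L1Word {
    uint32_t data;
    uint32_t storedParity;
};

// On a read, a parity mismatch signals an error. Because the L1D is
// write-through, the ECC-protected L2 always holds an up-to-date copy,
// so recovery is simply a refill from L2 (modeled here as a parameter).
uint32_t readWord(const L1Word& w, uint32_t cleanCopyFromL2) {
    if (parityBit(w.data) != w.storedParity) {
        std::cout << "parity error detected, refilling from L2\n";
        return cleanCopyFromL2;
    }
    return w.data;
}

int main() {
    uint32_t value = 0xDEADBEEF;
    L1Word good{value, parityBit(value)};
    L1Word flipped{value ^ 0x10u, parityBit(value)};  // single-bit upset
    std::cout << std::hex << readWord(good, value) << "\n";
    std::cout << std::hex << readWord(flipped, value) << "\n";
}
```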

Patent
Sandra K. Johnson
26 Sep 2003
TL;DR: In this article, the authors present a method, system, and program for maintaining data in distributed caches, where a copy of an object is maintained in at least one cache, wherein multiple caches may have different versions of the object, and wherein the objects are capable of having modifiable data units.
Abstract: Provided are a method, system, and program for maintaining data in distributed caches. A copy of an object is maintained in at least one cache, wherein multiple caches may have different versions of the object, and wherein the objects are capable of having modifiable data units. Update information is maintained for each object maintained in each cache, wherein the update information for each object in each cache indicates the object, the cache including the object, and indicates whether each data unit in the object was modified. After receiving a modification to a target data unit in one target object in one target cache, the update information for the target object and target cache is updated to indicate that the target data unit is modified, wherein the update information for the target object in any other cache indicates that the target data unit is not modified.

60 citations
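A minimal sketch of the update information described in the abstract, assuming an illustrative in-memory layout: one modified-flag per data unit, keyed by cache and object identifiers. All class and method names are hypothetical, not taken from the patent.

```cpp
#include <cstddef>
#include <iostream>
#include <map>
#include <string>
#include <utility>
#include <vector>

// For each (cache, object) pair, a flag per data unit recording whether that
// unit was modified in that cache.
class UpdateInfo {
public:
    void registerCopy(const std::string& cache, const std::string& object,
                      std::size_t numUnits) {
        modified_[{cache, object}].assign(numUnits, false);
    }

    // After a write to one data unit in one target cache, only that cache's
    // entry marks the unit as modified; other caches still show it unmodified.
    void recordModification(const std::string& cache, const std::string& object,
                            std::size_t unit) {
        modified_.at({cache, object})[unit] = true;
    }

    bool isModified(const std::string& cache, const std::string& object,
                    std::size_t unit) const {
        return modified_.at({cache, object})[unit];
    }

private:
    std::map<std::pair<std::string, std::string>, std::vector<bool>> modified_;
};

int main() {
    UpdateInfo info;
    info.registerCopy("cacheA", "obj1", 4);
    info.registerCopy("cacheB", "obj1", 4);
    info.recordModification("cacheA", "obj1", 2);
    std::cout << info.isModified("cacheA", "obj1", 2) << " "    // 1: modified here
              << info.isModified("cacheB", "obj1", 2) << "\n";  // 0: not modified here
}
```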

Patent
08 May 2013
TL;DR: In this article, a way table is provided for identifying which of a plurality of cache ways stores required data, and each way table entry corresponds to one of the translation lookaside buffer (TLB) entries of the TLB and identifies, for each memory location of the page associated with the corresponding TLB entry, which cache way stores the data associated with that memory location.
Abstract: A data processing apparatus has a cache and a translation lookaside buffer (TLB). A way table is provided for identifying which of a plurality of cache ways stores required data. Each way table entry corresponds to one of the TLB entries of the TLB and identifies, for each memory location of the page associated with the corresponding TLB entry, which cache way stores the data associated with that memory location. Also, the cache may be capable of servicing M access requests in the same processing cycle. An arbiter may select pending access requests for servicing by the cache in a way that ensures that the selected pending access requests specify a maximum of N different virtual page addresses, where N < M.

60 citations
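A rough C++ model of the way-table idea, assuming an illustrative geometry of 4 KiB pages, 64-byte lines, and a sentinel value for lines not present in the cache; the structures and names are a sketch, not the patent's exact design.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <iostream>
#include <optional>

// Illustrative geometry: 4 KiB pages, 64-byte lines, small set-associative cache.
constexpr std::size_t kLinesPerPage = 4096 / 64;
constexpr uint8_t kNoWay = 0xFF;  // "not cached in any way"

// One way-table entry mirrors one TLB entry: for every line of the page it
// records which cache way (if any) currently holds that line.
struct WayTableEntry {
    std::array<uint8_t, kLinesPerPage> way;
    WayTableEntry() { way.fill(kNoWay); }
};

// On an access that hits in the TLB, the matching way-table entry tells the
// cache which single way to enable, instead of probing every way.
std::optional<unsigned> lookupWay(const WayTableEntry& e, uint64_t pageOffset) {
    uint8_t w = e.way[pageOffset / 64];
    if (w == kNoWay) return std::nullopt;
    return w;
}

int main() {
    WayTableEntry entry;
    entry.way[3] = 2;  // the line at page offsets 192..255 lives in way 2
    if (auto w = lookupWay(entry, 200))
        std::cout << "probe only way " << *w << "\n";
    if (!lookupWay(entry, 0))
        std::cout << "way table says not cached: probe all ways or miss\n";
}
```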

Patent
04 Feb 2013
TL;DR: In this article, a cache controller is coupled to at least two flash bricks, each comprising a flash memory; metadata indicates which flash brick's flash memory caches each data unit, and the metadata is used to determine the flash bricks on which the cache controller caches received data units.
Abstract: Provided is a method for managing cache memory to cache data units in at least one storage device. A cache controller is coupled to at least two flash bricks, each comprising a flash memory. Metadata indicates a mapping of the data units to the flash bricks caching the data units, wherein the metadata is used to determine the flash bricks on which the cache controller caches received data units. The metadata is updated to indicate the flash brick having the flash memory on which data units are cached.

60 citations
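As a rough illustration, the metadata could be modeled as a directory mapping each data unit to the flash brick caching it. The class below is a C++ sketch with a hypothetical placement policy (hashing the logical block address across bricks); it is not the claimed implementation.

```cpp
#include <cstdint>
#include <iostream>
#include <optional>
#include <unordered_map>

// Sketch of the cache controller's metadata: a map from a data unit's
// identifier (here, its logical block address) to the flash brick whose
// flash memory caches it. Names and policy are illustrative assumptions.
class FlashCacheDirectory {
public:
    explicit FlashCacheDirectory(unsigned numBricks) : numBricks_(numBricks) {}

    // Choose a brick for a newly cached data unit and record the mapping.
    unsigned cacheDataUnit(uint64_t lba) {
        unsigned brick = static_cast<unsigned>(lba % numBricks_);
        brickOf_[lba] = brick;
        return brick;
    }

    // Later reads consult the metadata to find which brick holds the unit.
    std::optional<unsigned> findBrick(uint64_t lba) const {
        auto it = brickOf_.find(lba);
        if (it == brickOf_.end()) return std::nullopt;
        return it->second;
    }

private:
    unsigned numBricks_;
    std::unordered_map<uint64_t, unsigned> brickOf_;
};

int main() {
    FlashCacheDirectory dir(2);  // a controller coupled to two flash bricks
    dir.cacheDataUnit(42);
    if (auto b = dir.findBrick(42))
        std::cout << "data unit 42 cached on flash brick " << *b << "\n";
}
```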


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 93% related
Scalability: 50.9K papers, 931.6K citations, 88% related
Server: 79.5K papers, 1.4M citations, 88% related
Network packet: 159.7K papers, 2.2M citations, 83% related
Dynamic Source Routing: 32.2K papers, 695.7K citations, 83% related
Performance Metrics
No. of papers in the topic in previous years

Year: Papers
2023: 44
2022: 117
2021: 4
2020: 8
2019: 7
2018: 20