Topic

Cache invalidation

About: Cache invalidation is a research topic. Over the lifetime, 10539 publications have been published within this topic receiving 245409 citations.


Papers
Proceedings Article
16 Feb 2004
TL;DR: This work introduces on-chip hardware implementing an efficient cache tuning heuristic that automatically and dynamically tunes the cache to an executing program, completely transparently to the programmer.
Abstract: Memory accesses can account for about half of a microprocessor system's power consumption. Customizing a microprocessor cache's total size, line size, and associativity to a particular program is well known to have tremendous benefits for performance and power. Customizing caches has until recently been restricted to core-based flows, in which a new chip will be fabricated. However, several configurable cache architectures have been proposed recently for use in pre-fabricated microprocessor platforms. Tuning those caches to a program is, however, still a cumbersome task left to designers, assisted in part by recent computer-aided design (CAD) tuning aids. We propose to move that CAD on-chip, which can greatly increase the acceptance of configurable caches. We introduce on-chip hardware implementing an efficient cache tuning heuristic that can automatically, transparently, and dynamically tune the cache to an executing program. We carefully designed the heuristic to avoid any cache flushing, since flushing is costly in both power and performance. By simulating numerous Powerstone and MediaBench benchmarks, we show that such a dynamic self-tuning cache can reduce memory-access energy by 45% to 55% on average, and by as much as 97%, compared with a four-way set-associative base cache, completely transparently to the programmer.

91 citations

Patent
28 Feb 1997
TL;DR: An active cache memory for use with microprocessors is disclosed. It can perform transfers from external random access memory independently of the microprocessor, encache misaligned references, and transfer data to the microprocessor in bursts.
Abstract: An active cache memory for use with microprocessors is disclosed. The cache is external to the microprocessor and forms a second-level cache which is novel in that it is capable of performing transfers from external random access memory independently of the microprocessor, of encaching misaligned references, and of transferring data to the microprocessor in bursts.

90 citations

Proceedings Article
22 Apr 2001
TL;DR: The experimental results show that the proposed techniques are effective in supporting coarse-grain cache management and reducing server response times for tested applications.
Abstract: Caching dynamic pages at a server site is beneficial in reducing server resource demands, and it also helps dynamic page caching at proxy sites. Previous work has used fine-grain dependence graphs among individual dynamic pages and underlying data sets to enforce result consistency. This paper proposes a complementary solution for applications that require coarse-grain cache management. The key idea is to partition dynamic pages into classes based on URL patterns so that an application can specify page identification and data dependence, and invoke invalidation for a class of dynamic pages. To make this scheme time-efficient with a small space requirement, lazy invalidation is used to minimize slow disk accesses when IDs of dynamic pages are stored in memory in a digest format. Selective precomputing is further proposed to refresh stale pages and smooth load peaks. A data structure is developed for efficient URL class searching during lazy or eager invalidation. This paper also presents the design and implementation of a caching system called Cachuma, which integrates the above techniques, runs in tandem with standard Web servers, and allows Web sites to add dynamic page caching capability with minimal changes. The experimental results show that the proposed techniques are effective in supporting coarse-grain cache management and reducing server response times for tested applications.

90 citations
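
The class-based invalidation described in the abstract above can be illustrated with a minimal sketch. It is not Cachuma's implementation: the `ClassCache` name, the glob-style patterns, and the per-class version counter used to make invalidation lazy are all assumptions chosen for illustration. Invalidating a class only bumps a counter, and a stale page is discarded the first time it is requested.

```python
import fnmatch


class ClassCache:
    """Toy page cache with coarse-grain, lazily applied invalidation."""

    def __init__(self, patterns):
        # patterns: hypothetical mapping of class name -> URL glob pattern
        self.patterns = dict(patterns)
        self.versions = {name: 0 for name in self.patterns}
        self.pages = {}  # url -> (class name, class version at store time, body)

    def classify(self, url):
        # First matching pattern wins; None means the URL belongs to no class.
        for name, pattern in self.patterns.items():
            if fnmatch.fnmatchcase(url, pattern):
                return name
        return None

    def put(self, url, body):
        name = self.classify(url)
        if name is not None:  # pages outside every class are not cached
            self.pages[url] = (name, self.versions[name], body)

    def invalidate_class(self, name):
        # Constant time: no per-page work happens at invalidation time.
        self.versions[name] += 1

    def get(self, url):
        entry = self.pages.get(url)
        if entry is None:
            return None
        name, version, body = entry
        if version != self.versions[name]:
            del self.pages[url]  # stale: discard lazily, on first access
            return None
        return body
```

An application would call `invalidate_class("catalog")` after updating the data behind every page matching, say, `/catalog/*`, instead of tracking dependencies page by page.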

Patent
29 May 1997
TL;DR: A virtual data storage system adaptively throttles transfers into a cache storage to prevent an overrun. A storage throttle is computed when cache free space drops below a threshold, and a recall throttle is computed from the cache free space and the number of storage devices reserved for recalling data files from the set of storage volumes.
Abstract: A virtual data storage system provides a method and apparatus for adaptively throttling transfers into a cache storage to prevent an overrun in the cache storage. The virtual data storage system includes a storage interface appearing as a set of addressable, virtual storage devices, a cache storage for initially storing host-originated data files, storage devices for eventually storing the data files on a set of storage volumes, and a storage manager for directing the data files between the cache storage and the storage devices. An amount of available space in the cache storage, or a cache free space, is monitored against an adjustable cache space threshold. A storage throttle is computed when the cache free space drops below the cache space threshold. Additionally, a recall throttle is computed based on the cache free space and a number of storage devices reserved for recalling data files from the set of storage volumes. A maximum value of the storage throttle and the recall throttle is used to delay the storing of data files and the recalling of data files into the cache storage and to prevent overrunning the cache storage by completely depleting the cache free space.

90 citations
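
The throttling scheme in the abstract above can be sketched as follows. The abstract states only that a storage throttle and a recall throttle are computed and that the larger of the two is applied; the linear formulas, the `space_per_recall` parameter, and the `max_delay` scale below are assumptions made for illustration, not the patent's method.

```python
def storage_throttle(free_space, threshold, max_delay):
    """Delay that grows linearly as free space falls below the threshold (assumed shape)."""
    if free_space >= threshold:
        return 0.0
    return max_delay * (threshold - free_space) / threshold


def recall_throttle(free_space, recall_devices, space_per_recall, max_delay):
    """Delay when free space cannot cover one recall per reserved device (assumed rule)."""
    needed = recall_devices * space_per_recall
    if needed == 0 or free_space >= needed:
        return 0.0
    return max_delay * (needed - free_space) / needed


def cache_delay(free_space, threshold, recall_devices, space_per_recall, max_delay=1.0):
    # The step taken from the abstract: apply the larger of the two throttles.
    return max(
        storage_throttle(free_space, threshold, max_delay),
        recall_throttle(free_space, recall_devices, space_per_recall, max_delay),
    )
```

Taking the maximum means whichever pressure is more severe, incoming host writes or pending recalls, determines how long transfers into the cache are delayed.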

Patent
16 Oct 2001
TL;DR: A coherency protocol for a multiprocessor data processing system lets a processor operate on super-coherent data from its local cache, with processing logic that returns to coherent operations with other processing units responsive to an occurrence of a pre-determined condition.
Abstract: A multiprocessor data processing system comprises a plurality of processing units, a plurality of caches, each affiliated with one of the processing units, and processing logic that, responsive to receipt of a first system bus response to a coherency operation, causes the requesting processor to execute operations utilizing super-coherent data. The data processing system further includes logic that eventually returns to coherent operations with other processing units responsive to an occurrence of a pre-determined condition. The coherency protocol of the data processing system includes a first coherency state that indicates that modification of data within a shared cache line of a second cache of a second processor has been snooped on a system bus of the data processing system. When the cache line is in the first coherency state, subsequent requests for the cache line are issued as a Z1 read on a system bus, and one of two responses is received. If the response to the Z1 read indicates that the first processor should utilize local data currently available within the cache line, the first coherency state is changed to a second coherency state that indicates to the first processor that subsequent requests for the cache line should utilize the data within the local cache and not be issued to the system interconnect. Coherency state transitions to the second coherency state are completed via the coherency protocol of the data processing system. Super-coherent data is provided to the processor from the cache line of the local cache whenever the second coherency state is set for the cache line and a request is received.

90 citations
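
The two coherency states in the abstract above can be modeled as a small state machine. This is a rough sketch under stated assumptions: the state names, the `bus` callable, and the `VALID` state are hypothetical, and the abstract does not say what happens on the second possible response to the Z1 read, so the sketch simply assumes the line receives a fresh copy. The later return to coherent operation is not modeled.

```python
from enum import Enum, auto


class State(Enum):
    MOD_SNOOPED = auto()  # "first coherency state" in the abstract
    USE_LOCAL = auto()    # "second coherency state" in the abstract
    VALID = auto()        # hypothetical ordinary coherent state (not in the abstract)


class Line:
    def __init__(self, data):
        self.data = data
        self.state = State.MOD_SNOOPED
        self.bus_reads = 0  # number of Z1 reads issued on the bus

    def read(self, bus):
        # bus: hypothetical callable returning ("use_local", None) or ("data", value)
        if self.state is State.MOD_SNOOPED:
            self.bus_reads += 1
            kind, value = bus()
            if kind == "use_local":
                self.state = State.USE_LOCAL
            else:
                # Assumed handling of the other response: take the fresh copy.
                self.data = value
                self.state = State.VALID
        # USE_LOCAL and VALID reads are served locally without a bus request.
        return self.data
```

Once a line reaches `USE_LOCAL`, repeated reads return the local (super-coherent) data and generate no further bus traffic, which is the effect the abstract attributes to the second coherency state.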


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 93% related
Scalability: 50.9K papers, 931.6K citations, 88% related
Server: 79.5K papers, 1.4M citations, 88% related
Network packet: 159.7K papers, 2.2M citations, 83% related
Dynamic Source Routing: 32.2K papers, 695.7K citations, 83% related
Performance
Metrics
No. of papers in the topic in previous years

Year    Papers
2023    44
2022    117
2021    4
2020    8
2019    7
2018    20