Topic: Cache invalidation

About: Cache invalidation is a research topic. Over its lifetime, 10,539 publications have been published on this topic, receiving 245,409 citations.


Papers
Patent
James Gerald Brenza
01 May 1986
TL;DR: In this patent, a common directory and an L1 control array (L1CA) are provided for the CPU to access both the L1 and L2 caches; the directory is addressed by logical addresses, each of which is either a real/absolute address or a virtual address, according to whichever address mode the CPU is in.
Abstract: A data processing system contains a multi-level storage hierarchy in which the two highest hierarchy levels (e.g. L1 and L2) are private (not shared) to a single CPU, in order to be in close proximity to each other and to the CPU. Each cache has a data line length convenient to the respective cache. A common directory and an L1 control array (L1CA) are provided for the CPU to access both the L1 and L2 caches. The common directory contains, and is addressed by, the logical addresses that the CPU requests, each of which is either a real/absolute address or a virtual address, according to whichever address mode the CPU is in. Each entry in the directory contains a logical address representation derived from a logical address that previously missed in the directory. A CPU request "hits" in the directory if its requested address is in any private cache (e.g. in L1 or L2). A line presence field (LPF) is included in each directory entry to aid in determining a hit in the L1 cache. The L1CA contains L1 cache information to supplement the corresponding common directory entry; the L1CA is used during an L1 LRU castout, but is not in the critical path of an L1 or L2 hit. A translation lookaside buffer (TLB) is not used to determine cache hits. The TLB output is used only during the infrequent times that a CPU request misses in the cache directory, and the translated address (i.e. absolute address) is then used to access the data in a synonym location in the same cache, or in main storage, or in the L1 or L2 cache of another CPU in a multiprocessor system using synonym/cross-interrogate directories.

134 citations
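To make the directory scheme concrete, here is a minimal Python sketch (my own simplification, not the patented design) of a common directory tagged by logical address, whose entries carry a line presence field marking which L1-sized sublines of an L2 line also reside in L1. The 64-byte L1 line and 4:1 L2/L1 line-size ratio are illustrative assumptions.

```python
# Sketch (not the patented design) of a common L1/L2 directory whose
# entries are tagged by logical address and carry a line presence field
# (LPF): one bit per L1-sized subline of an L2 line that is also in L1.
# The 64-byte line and 4:1 subline ratio are illustrative assumptions.

L1_LINE = 64
SUBLINES = 4  # assumed number of L1 lines per L2 line

class DirectoryEntry:
    def __init__(self, logical_tag, lpf=0):
        self.logical_tag = logical_tag  # derived from a logical address
        self.lpf = lpf                  # one presence bit per L1 subline

class CommonDirectory:
    def __init__(self):
        self.entries = {}  # logical_tag -> DirectoryEntry

    def lookup(self, logical_addr):
        """Return 'L1', 'L2', or 'miss'. A directory hit means the line
        is in some private cache; the LPF decides whether it is also in
        L1. Only on a miss would the TLB be consulted for translation."""
        tag = logical_addr // (L1_LINE * SUBLINES)
        subline = (logical_addr // L1_LINE) % SUBLINES
        entry = self.entries.get(tag)
        if entry is None:
            return 'miss'
        return 'L1' if entry.lpf & (1 << subline) else 'L2'

d = CommonDirectory()
d.entries[0] = DirectoryEntry(logical_tag=0, lpf=0b0001)  # subline 0 in L1
print(d.lookup(0), d.lookup(64), d.lookup(4096))          # -> L1 L2 miss
```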

Proceedings ArticleDOI
14 Feb 2004
TL;DR: An in-depth analysis of the pathological behavior of cache hashing functions is presented, and two new hashing functions, prime modulo and prime displacement, are proposed that are resistant to pathological behavior and yet able to eliminate the worst-case conflict behavior in the L2 cache.
Abstract: Using alternative cache indexing/hashing functions is a popular technique to reduce conflict misses by achieving a more uniform cache access distribution across the sets in the cache. Although various alternative hashing functions have been demonstrated to eliminate the worst case conflict behavior, no study has really analyzed the pathological behavior of such hashing functions that often result in performance slowdown. We present an in-depth analysis of the pathological behavior of cache hashing functions. Based on the analysis, we propose two new hashing functions: prime modulo and prime displacement that are resistant to pathological behavior and yet are able to eliminate the worst case conflict behavior in the L2 cache. We show that these two schemes can be implemented in fast hardware using a set of narrow add operations, with negligible fragmentation in the L2 cache. We evaluate the schemes on 23 memory intensive applications. For applications that have nonuniform cache accesses, both prime modulo and prime displacement hashing achieve an average speedup of 1.27 compared to traditional hashing, without slowing down any of the 23 benchmarks. We also evaluate using multiple prime displacement hashing functions in conjunction with a skewed associative L2 cache. The skewed associative cache achieves a better average speedup at the cost of some pathological behavior that slows down four applications by up to 7%.

133 citations
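As a rough illustration of the two index functions named above, the sketch below contrasts conventional power-of-two indexing with prime modulo and prime displacement indexing. The set count and the primes are my own choices for illustration, not the paper's configuration.

```python
# Hedged sketch of the two proposed indexing schemes; NUM_SETS and the
# primes are illustrative assumptions, not the paper's parameters.

NUM_SETS = 512   # a 512-set L2 cache for illustration
P_MOD = 509      # largest prime below NUM_SETS
P_DISP = 17      # small prime used as the displacement multiplier

def conventional_index(block_addr):
    return block_addr % NUM_SETS            # plain modulo-2^k indexing

def prime_modulo_index(block_addr):
    # Index with a prime close to the set count; sets P_MOD..NUM_SETS-1
    # go unused, the small fragmentation the paper says is negligible.
    return block_addr % P_MOD

def prime_displacement_index(block_addr):
    # Displace the conventional index by tag * prime before wrapping,
    # scattering addresses that would otherwise pile into one set.
    tag = block_addr // NUM_SETS
    return (block_addr + tag * P_DISP) % NUM_SETS
```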

Journal ArticleDOI
TL;DR: In this article, an analytical model for the program behavior of a multitasked system is introduced, including the behavior of each process and the interactions between processes with regard to the sharing of data blocks.
Abstract: In many commercial multiprocessor systems, each processor accesses the memory through a private cache. One problem that could limit the extensibility of the system and its performance is the enforcement of cache coherence. A mechanism must exist which prevents the existence of several different copies of the same data block in different private caches. In this paper, we present an in-depth analysis of the effects of cache coherency in multiprocessors. A novel analytical model for the program behavior of a multitasked system is introduced. The model includes the behavior of each process and the interactions between processes with regard to the sharing of data blocks. An approximation is developed to derive the main effects of the cache coherency contributing to degradations in system performance.

133 citations
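The abstract does not reproduce the analytical model itself, but the effect it quantifies, performance lost to keeping shared blocks coherent, can be shown with a toy write-invalidate simulation. This is my own illustration, not the paper's model; the trace format and counters are assumptions.

```python
# Toy write-invalidate simulation (not the paper's analytical model):
# count the invalidations that coherence enforcement forces when
# several processes write to the same data blocks.

def simulate(accesses, num_cpus):
    """accesses: list of (cpu, block, is_write). Each private cache is a
    set of resident blocks; a write invalidates all other copies."""
    caches = [set() for _ in range(num_cpus)]
    invalidations = 0
    for cpu, block, is_write in accesses:
        if is_write:
            for other, c in enumerate(caches):
                if other != cpu and block in c:
                    c.discard(block)      # coherence traffic
                    invalidations += 1
        caches[cpu].add(block)
    return invalidations

# Two CPUs ping-ponging writes on one shared block:
trace = [(0, 'A', True), (1, 'A', True)] * 4
print(simulate(trace, 2))   # -> 7 invalidations
```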

Dissertation
01 Jan 1985
TL;DR: The model shows that the majority of the cache misses that OPT avoids over LRU come from the most-recently-discarded lines of the LRU cache, which leads to three realizable near-optimal replacement algorithms that try to duplicate the replacement decisions made by OPT.
Abstract: This thesis describes a model used to analyze the replacement decisions made by LRU and OPT (Least-Recently-Used and an optimal replacement-algorithm). The model identifies a set of lines in the LRU cache that are dead, that is, lines that must leave the cache before they can be rereferenced. The model shows that the majority of the cache misses that OPT avoids over LRU come from the most-recently-discarded lines of the LRU cache. Also shown is that a very small set of lines account for the majority of the misses that OPT avoids over LRU. OPT requires perfect knowledge of the future and is not realizable, but our results lead to three realizable near-optimal replacement algorithms. These new algorithms try to duplicate the replacement decisions made by OPT. Simulation results, using a trace-tape and cache simulator, show that these new algorithms achieve up to eight percent fewer misses than LRU and obtain about 20 percent of the miss reduction that OPT obtains. Also presented in the thesis are two new trace-tape reduction techniques. Simulation results show that reductions in trace-tape length of two orders of magnitude are possible with little or no simulation error introduced.

133 citations
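For intuition, the sketch below (an illustration, not the thesis's trace-driven simulator) compares LRU against OPT on a short reference trace. OPT inspects the entire future trace to pick its victim, which is exactly why it is not realizable online.

```python
# LRU vs. OPT (Belady's optimal replacement) on a reference trace.
# Illustrative only; the trace and capacity are assumptions.

from collections import OrderedDict

def lru_misses(trace, capacity):
    cache, misses = OrderedDict(), 0
    for block in trace:
        if block in cache:
            cache.move_to_end(block)        # refresh recency on a hit
        else:
            misses += 1
            if len(cache) >= capacity:
                cache.popitem(last=False)   # evict least recently used
            cache[block] = True
    return misses

def opt_misses(trace, capacity):
    cache, misses = set(), 0
    for i, block in enumerate(trace):
        if block in cache:
            continue
        misses += 1
        if len(cache) >= capacity:
            # Evict the block whose next use lies farthest in the future.
            def next_use(b):
                try:
                    return trace.index(b, i + 1)
                except ValueError:
                    return float('inf')     # never used again
            cache.discard(max(cache, key=next_use))
        cache.add(block)
    return misses

trace = list('ABCDABCD')
print(lru_misses(trace, 3), opt_misses(trace, 3))   # -> 8 5
```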

Proceedings ArticleDOI
01 May 2013
TL;DR: This work focuses on the cache allocation problem, namely how to distribute cache capacity across routers under a constrained total storage budget for the network; it formulates this problem as a content placement problem and obtains the exact optimal solution by a two-step method.
Abstract: Content-Centric Networking (CCN) is a promising framework for evolving the current network architecture, advocating ubiquitous in-network caching to enhance content delivery. Consequently, in CCN, each router has storage space to cache frequently requested content. In this work, we focus on the cache allocation problem: namely, how to distribute the cache capacity across routers under a constrained total storage budget for the network. We formulate this problem as a content placement problem and obtain the exact optimal solution by a two-step method. Through simulations, we use this algorithm to investigate the factors that affect the optimal cache allocation in CCN, such as the network topology and the popularity of content. We find that a highly heterogeneous topology tends to put most of the capacity over a few central nodes. On the other hand, heterogeneous content popularity has the opposite effect, by spreading capacity across far more nodes. Using our findings, we make observations on how network operators could best deploy CCN cache capacity.

133 citations
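The abstract does not spell out the paper's exact two-step optimization; as a hedged stand-in, the greedy sketch below allocates a total slot budget by marginal benefit (request rate times hop savings, both hypothetical inputs) and reproduces the reported tendency of heterogeneous topologies to concentrate capacity at central nodes.

```python
# Greedy cache-allocation sketch (a heuristic stand-in, NOT the paper's
# exact two-step optimum). demand[r] lists per-content (request_rate,
# hop_savings) pairs at router r; caching one such content at r saves
# rate * hops upstream transfers. All inputs here are hypothetical.

def allocate(budget, demand):
    gains = [(rate * hops, r)
             for r, items in demand.items()
             for rate, hops in items]
    gains.sort(reverse=True)            # most beneficial placements first
    alloc = {r: 0 for r in demand}
    for _, r in gains[:budget]:
        alloc[r] += 1                   # spend one cache slot per placement
    return alloc

# A central router aggregates demand; edge routers see less of it:
demand = {'core':  [(10, 3), (8, 3), (5, 3)],
          'edge1': [(4, 1), (1, 1)],
          'edge2': [(3, 1)]}
print(allocate(4, demand))   # -> {'core': 3, 'edge1': 1, 'edge2': 0}
```

As in the paper's finding, most of the budget lands on the central node when demand is heterogeneous across the topology.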


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (93% related)
Scalability: 50.9K papers, 931.6K citations (88% related)
Server: 79.5K papers, 1.4M citations (88% related)
Network packet: 159.7K papers, 2.2M citations (83% related)
Dynamic Source Routing: 32.2K papers, 695.7K citations (83% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    44
2022    117
2021    4
2020    8
2019    7
2018    20