Topic

Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Patent
Mason Cabot
20 Dec 2005
TL;DR: In this article, a cache line eviction mechanism is proposed that enables programmers to mark portions of code with different cache priority levels based on anticipated or measured access patterns for those code portions.
Abstract: A method and apparatus to enable programmatic control of cache line eviction policies. A mechanism is provided that enables programmers to mark portions of code with different cache priority levels based on anticipated or measured access patterns for those code portions. Corresponding cues to assist in effecting the cache eviction policies associated with given priority levels are embedded in machine code generated from source- and/or assembly-level code. Cache architectures are provided that partition cache space into multiple pools, each pool being assigned a different priority. In response to execution of a memory access instruction, an appropriate cache pool is selected and searched based on information contained in the instruction's cue. On a cache miss, a cache line is selected from that pool to be evicted using a cache eviction policy associated with the pool. Implementations of the mechanism are described for both n-way set-associative caches and fully-associative caches.
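The pool-based eviction scheme is straightforward to sketch in software. The following Python toy is illustrative only: the class name, the LRU policy per pool, and the pool-size parameters are assumptions, not the patented implementation. It partitions the cache into per-priority pools and evicts only within the pool selected by the instruction's cue.

from collections import OrderedDict

class PriorityPoolCache:
    """Toy cache partitioned into pools, one per priority level."""
    def __init__(self, pool_sizes):
        # pool_sizes maps a priority level to that pool's capacity in lines
        self.pools = {level: OrderedDict() for level in pool_sizes}
        self.capacity = dict(pool_sizes)

    def access(self, addr, priority):
        # the priority would come from the cue embedded in the instruction
        pool = self.pools[priority]
        if addr in pool:
            pool.move_to_end(addr)            # hit: refresh LRU position
            return "hit"
        if len(pool) >= self.capacity[priority]:
            pool.popitem(last=False)          # miss: evict LRU line from this pool only
        pool[addr] = True                     # fill the selected pool
        return "miss"

cache = PriorityPoolCache({0: 2, 1: 8})       # higher-priority code gets a bigger pool
cache.access(0x1000, priority=1)

Because eviction is confined to the pool named by the cue, low-priority code cannot pollute the lines reserved for high-priority code, which is the point of the mechanism.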

123 citations

Journal ArticleDOI
TL;DR: This paper surveys current cache coherence mechanisms and identifies several issues critical to their design; hybrid strategies are also presented that can enhance the performance of the multiprocessor memory system by combining several different coherence mechanisms into a single system.
Abstract: Private data caches have not been as effective in reducing the average memory delay in multiprocessors as in uniprocessors, due to data spreading among the processors and due to the cache coherence problem. A wide variety of mechanisms have been proposed for maintaining cache coherence in large-scale shared memory multiprocessors, making it difficult to compare their performance and implementation implications. To help the computer architect understand some of the trade-offs involved, this paper surveys current cache coherence mechanisms and identifies several issues critical to their design. These design issues include: 1) the coherence detection strategy, through which possibly incoherent memory accesses are detected either statically at compile-time or dynamically at run-time; 2) the coherence enforcement strategy, such as updating or invalidating, that is used to ensure that stale cache entries are never referenced by a processor; 3) how the precision of block-sharing information can be changed to trade off the implementation cost and the performance of the coherence mechanism; and 4) how the cache block size affects the performance of the memory system. Trace-driven simulations are used to compare the performance and implementation impacts of these different issues. In addition, hybrid strategies are presented that can enhance the performance of the multiprocessor memory system by combining several different coherence mechanisms into a single system.
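Of the design issues listed, the enforcement strategy is the easiest to illustrate. Below is a minimal Python sketch of write-invalidate enforcement; the class names are hypothetical, and a real protocol would track per-block states (e.g. MSI/MESI) rather than bare values.

class Cache:
    def __init__(self):
        self.lines = {}                      # addr -> cached value

class Bus:
    def __init__(self, caches):
        self.caches = caches

    def write(self, writer, addr, value):
        for cache in self.caches:
            if cache is not writer:
                cache.lines.pop(addr, None)  # invalidate stale copies
        writer.lines[addr] = value           # writer keeps the only cached copy

c0, c1 = Cache(), Cache()
bus = Bus([c0, c1])
c1.lines[0x40] = "stale"
bus.write(c0, 0x40, "fresh")
assert 0x40 not in c1.lines                  # c1 can never read the stale value

An update-based strategy would instead push the new value into peer caches on each write; the survey's hybrid strategies combine such mechanisms within one system.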

123 citations

Patent
17 Jan 1997
TL;DR: In this article, the authors propose allocation circuitry for a set-associative cache memory that allocates entries of a common set in a branch prediction table (BPT) to branch prediction information for related branch instructions.
Abstract: Allocation circuitry for allocating entries within a set-associative cache memory is disclosed. The set-associative cache memory comprises N ways, each way having M entries and corresponding entries in each of the N ways constituting a set of entries. The allocation circuitry has a first circuit which identifies related data units by identifying a probability that the related data units may be successively read from the cache memory. A second circuit within the allocation circuitry allocates the corresponding entries in each of the ways to the related data units, so that related data units are stored in a common set of entries. Accordingly, the related data units will be simultaneously outputted from the set-associative cache memory, and are thus concurrently available for processing. The invention may find application in allocating entries of a common set in a branch prediction table (BPT) to branch prediction information for related branch instructions.
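A rough Python sketch of the allocation idea follows; the grouping step and all names are assumptions, not the patent's circuitry. Related units are placed in corresponding entries of a single set, one per way, so a single set access outputs all of them at once.

N_WAYS, N_SETS = 4, 256

class SetAssociativeCache:
    def __init__(self):
        # ways[w][s] is the entry in way w of set s (None when empty)
        self.ways = [[None] * N_SETS for _ in range(N_WAYS)]

    def allocate_related(self, set_index, units):
        # place up to N_WAYS related units in corresponding entries of one set
        for way, unit in zip(range(N_WAYS), units):
            self.ways[way][set_index] = unit

    def read_set(self, set_index):
        # a single set access outputs all ways of the set simultaneously
        return [self.ways[w][set_index] for w in range(N_WAYS)]

bpt = SetAssociativeCache()
bpt.allocate_related(7, ["pred_brA", "pred_brB", "pred_brC"])
print(bpt.read_set(7))    # all related branch predictions available concurrently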

122 citations

Patent
22 Jun 1989
TL;DR: In this paper, a multiprocessor system is described in which cache memory means couples a CPU to a bus means and indicates the status of each stored data unit: whether it has been modified since being fetched from main memory, and whether it may also be present in another CPU's cache.
Abstract: A system is described wherein a CPU, a main memory means and a bus means are provided. Cache memory means is employed to couple the CPU to the bus means and is further provided with means to indicate the status of a data unit stored within the cache memory means. One status indication tells whether the contents of a storage position have been modified since those contents were received from main memory, and another indicates whether the contents of the storage position may be present in another cache memory means. Control means are provided to assure that when a data unit from a CPU is received and stored in the CPU's associated cache memory means, and that data unit is indicated as being also stored in a cache memory means associated with another CPU, the data unit is also written into main memory means. During that process, other cache memory means monitor the bus means and update their corresponding data units. Bus monitor means are provided to monitor all writes to and reads from main memory to aid in assuring system-wide data integrity.
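A hedged Python sketch of the described status bits (field and class names are illustrative, not the patent's terminology): a write to a line marked shared is written through to main memory while peer caches snoop the bus and update their copies.

class Line:
    def __init__(self, value):
        self.value = value
        self.modified = False    # changed since it was fetched from main memory?
        self.shared = False      # possibly present in another CPU's cache?

class SnoopyCache:
    def __init__(self, memory):
        self.lines = {}
        self.memory = memory
        self.peers = []          # other caches snooping the same bus

    def write(self, addr, value):
        line = self.lines.setdefault(addr, Line(value))
        line.value, line.modified = value, True
        if line.shared:
            self.memory[addr] = value                 # write through to main memory
            for peer in self.peers:                   # peers snoop the bus write...
                if addr in peer.lines:
                    peer.lines[addr].value = value    # ...and update their copies

memory = {}
c0, c1 = SnoopyCache(memory), SnoopyCache(memory)
c0.peers, c1.peers = [c1], [c0]
for c in (c0, c1):
    c.lines[0x80] = Line("old")
    c.lines[0x80].shared = True
c0.write(0x80, "new")
assert memory[0x80] == "new" and c1.lines[0x80].value == "new"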

122 citations

Proceedings ArticleDOI
01 Dec 2007
TL;DR: The Tagless Hit Instruction Cache (TH-IC) is proposed, a technique that completely eliminates the performance penalty associated with filter caches and further reduces energy consumption by not accessing the tag array on cache hits.
Abstract: Very small instruction caches have been shown to greatly reduce fetch energy. However, for many applications the use of a small filter cache can lead to an unacceptable increase in execution time. In this paper, we propose the Tagless Hit Instruction Cache (TH-IC), a technique for completely eliminating the performance penalty associated with filter caches, as well as a further reduction in energy consumption due to not having to access the tag array on cache hits. Using a few metadata bits per line, we are able to more efficiently track the cache contents and guarantee when hits will occur in our small TH-IC. When a hit is not guaranteed, we can instead fetch directly from the L1 instruction cache, eliminating any additional cycles due to a TH-IC miss. Experimental results show that the overall processor energy consumption can be significantly reduced due to the faster application running time and the elimination of tag comparisons for most of the accesses.
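A much-simplified Python sketch of the guarantee follows, reduced to a single metadata case (a fetch staying within the same line as the previous fetch); the paper's TH-IC tracks more transitions, e.g. bits covering sequential and branch-target moves between lines. When a hit is guaranteed, the tag check is skipped; otherwise the fetch goes straight to the L1 instruction cache, so a TH-IC miss never costs an extra cycle.

LINE_SIZE = 16                   # bytes per line in the tiny cache (assumed)

class TaglessHitIC:
    def __init__(self):
        self.lines = {}          # line number -> cached instructions
        self.last_line = None    # line of the previous fetch

    def fetch(self, pc, l1_fetch):
        line = pc // LINE_SIZE
        if line == self.last_line and line in self.lines:
            return self.lines[line], "guaranteed hit, no tag check"
        data = l1_fetch(pc)      # not guaranteed: fetch directly from L1, no miss cycle
        self.lines[line] = data  # fill the tiny cache for future guarantees
        self.last_line = line
        return data, "fetched from L1"

ic = TaglessHitIC()
ic.fetch(0x00, lambda pc: f"insns@{pc:#x}")
print(ic.fetch(0x04, lambda pc: f"insns@{pc:#x}")[1])   # same line -> guaranteed hit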

122 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (93% related)
Compiler: 26.3K papers, 578.5K citations (89% related)
Scalability: 50.9K papers, 931.6K citations (87% related)
Server: 79.5K papers, 1.4M citations (86% related)
Static routing: 25.7K papers, 576.7K citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    42
2022    110
2021    12
2020    20
2019    15
2018    30