
Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Journal ArticleDOI
TL;DR: This work presents techniques that take the parameters of the data cache into account when organizing scalar and array variables declared in embedded code into memory, with the objective of improving data cache performance.
Abstract: Code generation for embedded processors opens up the possibility for several performance optimization techniques that have been ignored by traditional compilers due to compilation time constraints. We present techniques that take into account the parameters of the data caches for organizing scalar and array variables declared in embedded code into memory, with the objective of improving data cache performance. We present techniques for clustering variables to minimize compulsory cache misses, and for solving the memory assignment problem to minimize conflict cache misses. Our experiments with benchmark code kernels from DSP and other domains on the CW4001 embedded processor from LSI Logic indicate significant improvements in data cache performance by the application of our memory organization technique.
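To make the conflict-miss part of this concrete, here is a minimal C sketch (illustrative parameters, not taken from the paper) of the kind of placement decision such a memory-assignment technique automates: inserting one cache line of padding so that two arrays accessed together stop mapping to the same sets of a direct-mapped data cache.

#include <stdio.h>

/* Hypothetical direct-mapped data cache: 4 KB with 32-byte lines. */
#define N 1024                         /* 1024 floats = 4 KB = whole cache */
#define LINE_FLOATS (32 / sizeof(float))

/* a[] and b[] are each exactly one cache in size, so without padding
 * a[i] and b[i] map to the same set and evict each other on every
 * iteration (conflict misses).  One line of padding shifts b[] so the
 * interleaved accesses land in different sets.  (Compilers and linkers
 * may reorder globals; a real tool would control placement explicitly,
 * e.g. via a struct or linker script.) */
float a[N];
float pad[LINE_FLOATS];
float b[N];

int main(void) {
    float sum = 0.0f;
    for (int i = 0; i < N; i++)
        sum += a[i] * b[i];            /* interleaved a/b accesses */
    printf("%f\n", sum);
    return 0;
}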

85 citations

Journal ArticleDOI
TL;DR: A cache-based computer system employs a fast, small memory interposed between the usual processor and main memory; compared with paging systems, the cache provides a smaller ratio of memory access times and holds the processor idle while blocks of data are transferred from main memory to cache, rather than switching to another task.
Abstract: A cache-based computer system employs a fast, small memory (the "cache") interposed between the usual processor and main memory. At any given time the cache contains as much as possible of the instructions and data the processor needs; as new information is needed, it is brought from main memory to the cache, displacing old information. The processor tends to operate with a memory of cache speed but with main-memory cost-per-bit. This configuration has analogies with other systems employing memory hierarchies, such as "paging" or "virtual memory" systems. In contrast with the latter, a cache is managed by hardware algorithms, deals with smaller blocks of data (32 bytes, for example, rather than 4096), provides a smaller ratio of memory access times (5:1 rather than 1000:1), and, because of the last factor, holds the processor idle while blocks of data are being transferred from main memory to cache rather than switching to another task. These are important differences, and may suffice to make the cache-based system cost-effective in many situations where paging is not.
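The "hardware algorithms" the article contrasts with paging boil down to simple address arithmetic. A minimal C sketch of a direct-mapped lookup with 32-byte lines, as in the article's example (the rest of the geometry is illustrative):

#include <stdbool.h>
#include <stdint.h>

#define LINE_SIZE 32   /* bytes per line, as in the article's example */
#define NUM_SETS  128  /* 128 sets x 32 B = 4 KB direct-mapped cache  */

struct line {
    bool     valid;
    uint32_t tag;
    uint8_t  data[LINE_SIZE];
};

static struct line cache[NUM_SETS];

/* True on a hit.  On a miss the hardware fetches the line from main
 * memory, displacing whatever the set held before (the "old
 * information" of the abstract). */
bool lookup(uint32_t addr) {
    uint32_t set = (addr / LINE_SIZE) % NUM_SETS;   /* index bits     */
    uint32_t tag = addr / (LINE_SIZE * NUM_SETS);   /* remaining bits */
    return cache[set].valid && cache[set].tag == tag;
}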

85 citations

Proceedings ArticleDOI
04 Jan 1995
TL;DR: Experimental results suggest that block buffering and Gray code addressing are well suited to instruction caches, which tend to be accessed in consecutive sequence, and that the proposed techniques together can achieve an order-of-magnitude energy reduction in caches.
Abstract: Caches usually consume a significant amount of energy in modern microprocessors (e.g., superpipelined or superscalar processors). In this paper, we examine contemporary cache design techniques and provide an analytical model for estimating cache energy consumption. We also present several novel techniques for designing energy-efficient caches, which include block buffering, cache sub-banking, and Gray code addressing. Experimental results suggest that both the block buffering and Gray code addressing techniques are ideal for instruction cache designs, which tend to be accessed in a consecutive sequence. Cache sub-banking is ideal for both instruction and data caches. Overall, these techniques can achieve an order of magnitude energy reduction on caches.
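Gray code addressing helps because consecutive values differ in exactly one bit under the Gray encoding (g = b ^ (b >> 1)), so sequential instruction fetches toggle fewer address bits and cause less switching activity. A short C demonstration (my own illustration, using the GCC/Clang popcount builtin):

#include <stdint.h>
#include <stdio.h>

/* Binary-to-Gray conversion: consecutive values differ in one bit. */
static uint32_t to_gray(uint32_t b) { return b ^ (b >> 1); }

int main(void) {
    unsigned bin_flips = 0, gray_flips = 0;
    for (uint32_t i = 1; i < 16; i++) {
        bin_flips  += __builtin_popcount(i ^ (i - 1));
        gray_flips += __builtin_popcount(to_gray(i) ^ to_gray(i - 1));
    }
    /* Stepping through addresses 0..15 toggles 26 bits in binary but
     * only 15 in Gray code (exactly 1 per step), which is why Gray
     * addressing suits consecutively accessed instruction caches. */
    printf("binary transitions: %u, Gray transitions: %u\n",
           bin_flips, gray_flips);
    return 0;
}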

85 citations

Patent
23 May 1991
TL;DR: A directory-based protocol is presented for maintaining data coherency in a multiprocessing (MP) system having a number of processors with associated write-back caches, a multistage interconnection network (MIN) leading to a shared memory, and a global directory associated with the main memory that keeps track of state and control information for cache lines.
Abstract: A directory-based protocol is provided for maintaining data coherency in a multiprocessing (MP) system having a number of processors with associated write-back caches, a multistage interconnection network (MIN) leading to a shared memory, and a global directory associated with the main memory to keep track of state and control information of cache lines. Upon a request by a requesting cache for a cache line that has been exclusively modified by a source cache, two buffers situated in the global directory collectively intercept the modified data words of the cache line during the write-back to memory. A modified word buffer captures the modified words within the cache line, while a line buffer stores the old cache line transferred from memory during the write-back operation. Finally, the line buffer and the modified word buffer together provide the entire modified line to the requesting cache.
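A rough software analogue of the merge the two buffers perform (a hypothetical data structure, not the patent's hardware): dirty words come from the modified word buffer captured during write-back, and every other word comes from the old line held in the line buffer.

#include <stdint.h>

#define WORDS_PER_LINE 8   /* illustrative: 8 words of 32 bits per line */

/* Stand-in for the modified word buffer: the words captured during
 * write-back plus a mask of which words were actually dirty. */
struct word_buffer {
    uint32_t word[WORDS_PER_LINE];
    uint8_t  dirty_mask;             /* bit i set => word i modified */
};

/* Assemble the full up-to-date line for the requesting cache: dirty
 * words from the word buffer, everything else from the old line the
 * line buffer captured from memory. */
void merge_line(const uint32_t old_line[WORDS_PER_LINE],
                const struct word_buffer *mods,
                uint32_t out[WORDS_PER_LINE]) {
    for (int i = 0; i < WORDS_PER_LINE; i++)
        out[i] = (mods->dirty_mask & (1u << i)) ? mods->word[i]
                                                : old_line[i];
}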

85 citations

Proceedings ArticleDOI
10 Oct 1999
TL;DR: This work extends the work proposed by J. Kin et al. (1997), in which an extra, small cache is inserted between the CPU data path and the L1 cache and serves to filter most of the references initiated from the CPU.
Abstract: Energy dissipated in on-chip caches represents a substantial portion of the energy budget of today's processors. Extrapolating current trends, this portion is likely to increase in the near future, since the devices devoted to the caches occupy an increasingly large percentage of the total area of the chip. We extend the work proposed by J. Kin et al. (1997), in which an extra, small cache (called a filter cache) is inserted between the CPU data path and the L1 cache and serves to filter most of the references initiated by the CPU. In our scheme, the compiler is used to generate code that exploits the new memory hierarchy and reduces the possibility of a miss in the extra cache. Experimental results across a wide range of SPEC95 benchmarks show that this cache, which we call the L-Cache, has a small performance overhead with respect to the scheme without any extra caches, and provides substantial energy savings. The L-Cache is placed between the CPU and the I-Cache; the D-Cache subsystem is not modified. Since the L-Cache is much smaller, and thus has a smaller access time, than the I-Cache, this scheme can also be used for performance improvement provided that the hit rate in the L-Cache is very high. In our experimental results, we show that the L-Cache does indeed improve performance in some cases.
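The trade-off the authors evaluate can be sketched with a back-of-the-envelope model: every fetch probes the small L-Cache, and only misses also pay the I-Cache cost, so energy drops even at modest hit rates while latency improves only once the hit rate is high. The numbers below are illustrative, not the paper's measurements.

#include <stdio.h>

int main(void) {
    /* Illustrative per-access costs in arbitrary units (assumed, not
     * the paper's figures). */
    const double e_small = 0.2, e_l1 = 1.0;  /* energy: L-Cache, I-Cache */
    const double t_small = 1.0, t_l1 = 2.0;  /* latency in cycles        */

    for (double h = 0.80; h < 1.001; h += 0.05) {
        /* Every fetch probes the small cache; a miss also pays L1. */
        double energy = e_small + (1.0 - h) * e_l1;
        double time   = t_small + (1.0 - h) * t_l1;
        printf("hit rate %.2f: energy %.2f (baseline %.2f), "
               "time %.2f (baseline %.2f)\n",
               h, energy, e_l1, time, t_l1);
    }
    /* With these numbers energy always improves, while latency beats
     * the baseline only once the hit rate exceeds 0.5; hence the
     * paper's emphasis on a very high L-Cache hit rate. */
    return 0;
}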

85 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations (93% related)
Compiler: 26.3K papers, 578.5K citations (89% related)
Scalability: 50.9K papers, 931.6K citations (87% related)
Server: 79.5K papers, 1.4M citations (86% related)
Static routing: 25.7K papers, 576.7K citations (84% related)
Performance Metrics
No. of papers in the topic in previous years:

Year  Papers
2023  42
2022  110
2021  12
2020  20
2019  15
2018  30