Topic

Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Patent
Marcus L. Kornegay, Ngan N. Pham
22 Nov 2006
TL;DR: In this article, a method for cache management of lines of binary code stored in the cache is described, in which a cache manager identifies the line or lines to evict by consulting the eviction and reload counts kept in the cache directory.
Abstract: A method for cache management is disclosed. The method can assign or determine identifiers for lines of binary code that are, or will be, stored in cache. The method can create a cache directory that utilizes the identifier to keep an eviction count and/or a reload count for cached lines. Thus, each time a line is entered into, or evicted from, the cache, the cache eviction log can be amended accordingly. When a processor receives or creates an instruction requesting that a line be evicted from cache, a cache manager can identify the line or lines of binary code to be evicted by accessing the cache directory, and the line(s) can then be evicted.

67 citations
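The patent above amounts to a small bookkeeping data structure: a directory keyed by line identifier that tracks how often each line is reloaded and evicted, which the cache manager consults when choosing a victim. A minimal sketch of that idea follows; the class name, method names, and the "fewest reloads" victim policy are illustrative assumptions, not taken from the patent.

```python
# Sketch of a cache directory keeping per-line eviction/reload counts, as
# described in the patent abstract above. Names and the victim-selection
# policy are illustrative assumptions.

class CacheDirectory:
    def __init__(self):
        self.counts = {}  # line_id -> {"evictions": n, "reloads": n}

    def _entry(self, line_id):
        return self.counts.setdefault(line_id, {"evictions": 0, "reloads": 0})

    def record_reload(self, line_id):
        self._entry(line_id)["reloads"] += 1      # line entered into the cache

    def record_eviction(self, line_id):
        self._entry(line_id)["evictions"] += 1    # line removed from the cache

    def pick_victim(self, resident_lines):
        # One possible policy (an assumption): evict the resident line that
        # has been reloaded least often.
        return min(resident_lines, key=lambda lid: self._entry(lid)["reloads"])
```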

Patent
22 Feb 1988
TL;DR: In this article, an on-chip VLSI cache architecture is presented, including a single-port, late-select cache array organized as an n-way set-associative cache (with n congruence classes).
Abstract: An on-chip VLSI cache architecture including a single-port, late-select cache array organized as an n-way set-associative cache (having n congruence classes), together with a plurality of functionally integrated units on-chip in addition to the cache array. A normal read/write CPU access function provides an architectural organization that allows the chip to be used in (1) a fast "late-select" operation, which may be provided with any desired degree of set-associativity while achieving an effective one-cycle write operation, and (2) a cache reload function, which provides a highly parallel store-back and reload operation to substantially reduce the reload time, particularly for a store-in cache organization. The cache chip organization and architecture provide a late-select cache with a nearly transparent, multiple-word reload by incorporating a cache-reload buffer, a store-back buffer, and a load-through function, all included on the cache array chip for reloading, and a delayed write-enable for achieving an effective one-cycle write operation. Two separate decoder functions are integrated on the chip: one for cache access for normal read/write operations to and from the CPU, and one for cache reload, which also provides interim access to data that has been transferred out of main memory to the chip but not yet reloaded into the cache array. These two decoders provide the different accessing modes required by the CPU and main memory operations.

67 citations
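The architecture above is built around an n-way set-associative array with n congruence classes plus on-chip reload buffers. The sketch below models only the basic congruence-class lookup and fill that such an organization implies; the sizes are arbitrary, and the cache-reload buffer, store-back buffer, and late-select timing are not modeled.

```python
# Toy n-way set-associative lookup: an address selects one congruence class
# (set), and its tag is compared against the tags resident in that class.
# Sizes are arbitrary; on-chip reload/store-back buffers are not modeled.

NUM_SETS = 128      # number of congruence classes
WAYS = 4            # degree of set-associativity
LINE_BYTES = 64

cache = [[] for _ in range(NUM_SETS)]   # cache[set_index] holds up to WAYS tags

def split_address(addr):
    set_index = (addr // LINE_BYTES) % NUM_SETS
    tag = addr // (LINE_BYTES * NUM_SETS)
    return tag, set_index

def lookup(addr):
    tag, set_index = split_address(addr)
    return tag in cache[set_index]       # hit if the tag is resident in the class

def reload(addr):
    tag, set_index = split_address(addr)
    ways = cache[set_index]
    if tag not in ways:
        if len(ways) == WAYS:
            ways.pop(0)                  # simple FIFO replacement within the class
        ways.append(tag)
```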

Patent
24 Dec 1998
TL;DR: Cache windowing as mentioned in this paper divides a large level 1 cache into smaller regions called windows, allowing the cache to provide more data to the CPU faster, and provides high CPU utilization rates for those processing applications where locality of memory references is poor.

Abstract: A computer level 1 cache memory design with cache windowing divides a large level 1 cache into smaller regions called windows, allowing the cache to provide more data to the CPU faster. Cache windowing provides the fast access times of a small level 1 cache through fewer, shorter paths and less circuitry than a large cache with multiple associative cache sets. Cache windowing allows context switching to occur with a simple change in cache window designation, eliminating the wait for cache reloading. Simulations of real cache implementations show an average of approximately 30% improvement in CPU throughput with cache windowing, scaling with CPU speed increases. The resulting system 1) maintains or improves CPU utilization rates as CPU speeds increase, 2) provides large level 1 caches while maintaining cache access times of one CPU clock cycle, and 3) provides high CPU utilization rates for those processing applications where locality of memory references is poor (e.g., networking applications).

67 citations
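The key move in cache windowing is that a context switch only changes which window is active, rather than flushing and reloading the cache. A toy model of that idea is below; the window count and function names are invented for illustration and do not come from the patent.

```python
# Toy model of cache windowing: a large L1 array is split into fixed windows,
# and a context switch merely updates the active-window designation, so each
# context finds its cached lines still in place when it resumes. All
# parameters and names are invented for illustration.

NUM_WINDOWS = 8

windows = [{} for _ in range(NUM_WINDOWS)]   # each window maps addr -> data
active_window = 0

def context_switch(window_id):
    global active_window
    active_window = window_id                # no cache flush, no reload wait

def fill(addr, data):
    windows[active_window][addr] = data

def access(addr):
    return windows[active_window].get(addr)  # data on a hit, None on a miss
```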

Patent
23 Sep 2004
TL;DR: In this article, the authors describe methods and apparatus, including computer program products, that implement a centralized cache storage for runtime systems, where a computer program product can include instructions operable to receive a request at a centralized shared cache framework to store an entity; cache the entity in a shared memory in response to the request; and retrieve the stored entity from the shared memory.
Abstract: Described herein are methods and apparatus, including computer program products, that implement a centralized cache storage for runtime systems. A computer program product can include instructions operable to receive a request at a centralized shared cache framework to store an entity; cache the entity in a shared memory in response to the request, where the shared memory is operable to store the entity such that the entity is accessible to runtime systems, and caching the entity in the shared memory comprises storing the entity for one of the runtime systems; receive a request at the centralized shared cache framework to retrieve the entity from the shared memory; and if the entity is stored in the shared memory, retrieve the stored entity from the shared memory, where the centralized shared cache framework is operable to retrieve the entity for any of the runtime systems.

67 citations
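The abstract describes a store/retrieve interface over shared memory: an entity cached by one runtime system is visible to any other because it lives in the shared store. A minimal interface sketch under that reading is shown below; the class and method names are hypothetical, and an in-process dictionary stands in for the real shared-memory segment.

```python
# Hypothetical sketch of a centralized shared cache framework: entities are
# cached in a shared store so that any runtime system can retrieve them.
# A plain dict stands in for the actual shared-memory segment.

class SharedCacheFramework:
    def __init__(self):
        self._shared_memory = {}

    def store(self, key, entity):
        # Cache the entity on behalf of one runtime system.
        self._shared_memory[key] = entity

    def retrieve(self, key):
        # Any runtime system can retrieve it; returns None if absent.
        return self._shared_memory.get(key)

# Usage: one runtime system stores, another retrieves.
framework = SharedCacheFramework()
framework.store("session:42", {"user": "alice"})
assert framework.retrieve("session:42") == {"user": "alice"}
```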

Proceedings ArticleDOI
07 Oct 2013
TL;DR: Shared last-level caches, widely used in chip multiprocessors (CMPs), face two fundamental limitations: the latency and energy of shared caches degrade as the system scales up, and workloads sharing the CMP suffer from interference in shared cache accesses; prior techniques address one issue while ignoring or worsening the other.
Abstract: Shared last-level caches, widely used in chip-multiprocessors (CMPs), face two fundamental limitations. First, the latency and energy of shared caches degrade as the system scales up. Second, when multiple workloads share the CMP, they suffer from interference in shared cache accesses. Unfortunately, prior research addressing one issue either ignores or worsens the other: NUCA techniques reduce access latency but are prone to hotspots and interference, and cache partitioning techniques only provide isolation but do not reduce access latency. We present Jigsaw, a technique that jointly addresses the scalability and interference problems of shared caches. Hardware lets software define shares, collections of cache bank partitions that act as virtual caches, and map data to shares. Shares give software full control over both data placement and capacity allocation. Jigsaw implements efficient hardware support for share management, monitoring, and adaptation. We propose novel resource-management algorithms and use them to develop a system-level runtime that leverages Jigsaw to both maximize cache utilization and place data close to where it is used. We evaluate Jigsaw using extensive simulations of 16- and 64-core tiled CMPs. Jigsaw improves performance by up to 2.2x (18% avg) over a conventional shared cache, and significantly outperforms state-of-the-art NUCA and partitioning techniques.

67 citations
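Jigsaw's central abstraction is the share: a software-defined collection of cache-bank partitions that acts as a virtual cache, with data mapped to shares so that both placement and capacity are under software control. The following is only a rough software-level sketch of that mapping, not Jigsaw's hardware or runtime; the bank assignments and helper functions are illustrative.

```python
# Rough sketch of Jigsaw-style "shares": each share is a set of
# (bank, allocated_ways) partitions acting as a virtual cache, and data is
# mapped only to the banks of its own share. All values are illustrative.

shares = {
    "workload_A": [(0, 4), (1, 4)],            # banks near workload A's cores
    "workload_B": [(8, 2), (9, 2), (10, 2)],   # disjoint banks: no interference
}

def bank_for(share_name, addr):
    # Spread a share's data over its own banks; other shares' banks are never
    # used, giving isolation while keeping data close to where it is used.
    banks = [bank for bank, _ways in shares[share_name]]
    return banks[hash(addr) % len(banks)]

def capacity_kb(share_name, kb_per_way=32):
    # Total capacity allocated to the share (assumption: 32 KB per way).
    return sum(ways for _bank, ways in shares[share_name]) * kb_per_way
```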


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 93% related
Compiler: 26.3K papers, 578.5K citations, 89% related
Scalability: 50.9K papers, 931.6K citations, 87% related
Server: 79.5K papers, 1.4M citations, 86% related
Static routing: 25.7K papers, 576.7K citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year: Papers
2023: 42
2022: 110
2021: 12
2020: 20
2019: 15
2018: 30