Topic

Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Patent
20 Oct 1993
TL;DR: This patent describes a cache memory system that dynamically assigns segments of cache memory to correspond to segments of the mass storage device, accepts data written by the host into portions of the assigned segments, and determines whether the elapsed time since any modified data was written to the cache memory exceeds a predetermined period of time.
Abstract: A method for operating a cache memory system which has a high speed cache memory and a mass storage device that operate in a highly efficient manner with a host device. The system operates to dynamically assign segments of the cache memory to correspond to segments of the mass storage device, accept data written by the host into portions of the assigned segments of the cache memory, and determine if the elapsed time since any modified data has been written to the cache memory exceeds a predetermined period of time, or if the number of modified segments to be written to the mass storage device exceeds a preset limit. If so, the cache memory system enables a transfer mechanism to cause modified data to be written from the cache memory to the mass storage device, based on the location of segments relative to a currently selected track of the mass storage device. Movement of updated data from the cache memory (solid state storage) to the mass storage device (which may be, for example, a magnetic disk) and of prefetched data from the mass storage to the cache memory is done on a timely, but unobtrusive, basis as a background task. A direct, private channel between the cache memory and the mass storage device prevents communications between these two media from conflicting with transmission of data between the host and the cache memory system. A set of microprocessors manages and oversees the data transmission and storage. Data integrity is maintained in the event of a power interruption via a battery assisted, automatic and intelligent shutdown procedure.
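The write-back trigger described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented mechanism: the names `FlushPolicy`, `max_dirty_age_s`, `max_dirty_segments`, and `flush_order` are hypothetical, the elapsed time is assumed to be measured from the oldest unflushed write, and the nearest-track-first ordering is an assumption about what "based on the location of segments relative to a currently selected track" might mean.

```python
from dataclasses import dataclass


@dataclass
class FlushPolicy:
    # Hypothetical parameters standing in for the abstract's
    # "predetermined period of time" and "preset limit".
    max_dirty_age_s: float
    max_dirty_segments: int

    def should_flush(self, now_s, oldest_dirty_write_s, dirty_segments):
        """Return True when modified data should be written back to disk."""
        if dirty_segments == 0:
            return False  # nothing modified, nothing to write back
        too_old = (now_s - oldest_dirty_write_s) > self.max_dirty_age_s
        too_many = dirty_segments > self.max_dirty_segments
        return too_old or too_many


def flush_order(dirty_tracks, current_track):
    """Order dirty segments by distance from the currently selected track.

    Assumption: nearest-first ordering, to keep head movement small.
    """
    return sorted(dirty_tracks, key=lambda t: abs(t - current_track))


policy = FlushPolicy(max_dirty_age_s=5.0, max_dirty_segments=3)
print(policy.should_flush(now_s=10.0, oldest_dirty_write_s=4.0, dirty_segments=1))
print(flush_order([10, 52, 48, 90], current_track=50))
```

Either condition alone is enough to start a write-back in this sketch, which matches the "or" in the abstract.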

86 citations

Journal ArticleDOI
Peter Sanders1
TL;DR: In this article, a fast priority queue for external memory and cached memory that is based on k-way merging is proposed, which is at least two times faster than an optimized implementation of binary heaps and 4-ary heaps for large inputs.
Abstract: The cache hierarchy prevalent in today's high performance processors has to be taken into account in order to design algorithms that perform well in practice. This paper advocates the adaptation of external memory algorithms to this purpose. This idea and the practical issues involved are exemplified by engineering a fast priority queue suited to external memory and cached memory that is based on k-way merging. It improves previous external memory algorithms by constant factors crucial for transferring it to cached memory. Running in the cache hierarchy of a workstation the algorithm is at least two times faster than an optimized implementation of binary heaps and 4-ary heaps for large inputs.
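As a rough illustration of the k-way merging that this priority queue builds on, the sketch below merges k sorted runs into one sorted stream. It is not the paper's data structure: the function name `k_way_merge` is hypothetical, and the small selection heap is simply the most convenient choice in Python. The point is that each run is read sequentially, which is the access pattern that suits both external memory and caches.

```python
import heapq


def k_way_merge(runs):
    """Merge k sorted lists into one sorted stream.

    Only one "current" element per run is held in the selection heap,
    so each run is scanned front to back exactly once.
    """
    # Heap entries are (value, run index, position within that run).
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    while heap:
        value, run_idx, pos = heapq.heappop(heap)
        yield value
        pos += 1
        if pos < len(runs[run_idx]):
            heapq.heappush(heap, (runs[run_idx][pos], run_idx, pos))


print(list(k_way_merge([[1, 4, 9], [2, 3, 10], [], [5]])))
```

Python's standard library offers the same operation as `heapq.merge`.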

86 citations

Proceedings ArticleDOI
09 Mar 2011
TL;DR: A novel approach is presented that efficiently analyzes interactions between threads to determine thread correlation and detect true and false sharing, and is able to improve the performance of some applications up to a factor of 12x and shed light on the obstacles that prevent their performance from scaling to many cores.
Abstract: In today's multi-core systems, cache contention due to true and false sharing can cause unexpected and significant performance degradation. A detailed understanding of a given multi-threaded application's behavior is required to precisely identify such performance bottlenecks. Traditionally, however, such diagnostic information can only be obtained after lengthy simulation of the memory hierarchy. In this paper, we present a novel approach that efficiently analyzes interactions between threads to determine thread correlation and detect true and false sharing. It is based on the following key insight: although the slowdown caused by cache contention depends on factors including the thread-to-core binding and parameters of the memory hierarchy, the amount of data sharing is primarily a function of the cache line size and application behavior. Using memory shadowing and dynamic instrumentation, we implemented a tool that obtains detailed sharing information between threads without simulating the full complexity of the memory hierarchy. The runtime overhead of our approach, a 5x slowdown on average relative to native execution, is significantly less than that of detailed cache simulation. The information collected allows programmers to identify the degree of cache contention in an application, the correlation among its threads, and the sources of significant false sharing. Using our approach, we were able to improve the performance of some applications up to a factor of 12x. For other contention-intensive applications, we were able to shed light on the obstacles that prevent their performance from scaling to many cores.
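The key insight above, that sharing depends mainly on cache line size and application behavior, can be illustrated with a toy classifier over a memory access trace. This is only a sketch under stated assumptions, not the paper's tool: the trace format `(thread_id, address, is_write)`, the 64-byte line size, and the function name `classify_sharing` are all hypothetical, and accesses are treated as single-byte.

```python
from collections import defaultdict

LINE_SIZE = 64  # assumed cache line size in bytes


def classify_sharing(trace, line_size=LINE_SIZE):
    """Label each contended cache line as "true" or "false" sharing.

    trace: iterable of (thread_id, address, is_write) tuples.
    A line is contended if at least two threads touch it and at least
    one access is a write. It is true sharing if some address on the
    line is touched by more than one thread, otherwise false sharing.
    """
    threads_by_line = defaultdict(set)
    threads_by_addr = defaultdict(set)
    written_lines = set()
    for tid, addr, is_write in trace:
        line = addr // line_size
        threads_by_line[line].add(tid)
        threads_by_addr[addr].add(tid)
        if is_write:
            written_lines.add(line)

    result = {}
    for line, tids in threads_by_line.items():
        if len(tids) < 2 or line not in written_lines:
            continue  # private or read-only line: no contention
        same_addr = any(
            len(addr_tids) > 1
            for addr, addr_tids in threads_by_addr.items()
            if addr // line_size == line
        )
        result[line] = "true" if same_addr else "false"
    return result


trace = [
    (0, 0, True), (1, 8, True),     # different addresses, same line
    (0, 64, True), (1, 64, False),  # same address, two threads
    (0, 128, True),                 # touched by one thread only
]
print(classify_sharing(trace))
```

Note that nothing in this sketch models the cache itself, only which threads touch which bytes of which line.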

86 citations

Patent
Lishing Liu1
12 Sep 1986
TL;DR: In this patent, a method and apparatus are provided for associating in cache directories the Control Domain Identifications (CDIDs) of software covered by each cache line. Through this provision, and/or the addition of Identifications of users actively using lines, cache coherence of certain data is controlled without performing conventional Cross-Interrogates (XIs), provided the accesses to such objects are properly synchronized with locking-type concurrency controls.
Abstract: A method and apparatus is provided for associating in cache directories the Control Domain Identifications (CDIDs) of software covered by each cache line. Through the use of such provision and/or the addition of Identifications of users actively using lines, cache coherence of certain data is controlled without performing conventional Cross-Interrogates (XIs), if the accesses to such objects are properly synchronized with locking type concurrency controls. Software protocols to caches are provided for the resource kernel to control the flushing of released cache lines. The parameters of these protocols are high level Domain Identifications and Task Identifications.

86 citations

Proceedings ArticleDOI
01 Oct 2011
TL;DR: This paper proposes to integrate SRAM with STT-RAM to construct a novel hybrid cache architecture for CMPs and proposes dedicated microarchitectural mechanisms to make the hybrid cache robust to workloads with different write patterns.
Abstract: Modern high performance Chip Multiprocessor (CMP) systems rely on a large on-chip cache hierarchy. As technology scales down, the leakage power of present SRAM based caches gradually dominates the on-chip power consumption, which can severely jeopardize system performance. The emerging nonvolatile Spin Transfer Torque RAM (STT-RAM) is a promising candidate for large on-chip caches because of its ultra low leakage power. However, write operations on STT-RAM suffer from considerably higher energy as well as longer latency compared with SRAM, which makes STT-RAM a poor fit for write-intensive workloads. In this paper, we propose to integrate SRAM with STT-RAM to construct a novel hybrid cache architecture for CMPs. We also propose dedicated microarchitectural mechanisms to make the hybrid cache robust to workloads with different write patterns. Extensive simulation results demonstrate that the proposed hybrid scheme is adaptive to variations of workloads. Overall power consumption is reduced by 37.1% and performance is improved by 23.6% on average compared with SRAM based static NUCA under the same area configuration.

86 citations


Network Information
Related Topics (5)
Cache: 59.1K papers, 976.6K citations, 93% related
Compiler: 26.3K papers, 578.5K citations, 89% related
Scalability: 50.9K papers, 931.6K citations, 87% related
Server: 79.5K papers, 1.4M citations, 86% related
Static routing: 25.7K papers, 576.7K citations, 84% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    42
2022    110
2021    12
2020    20
2019    15
2018    30