Topic
Cache pollution
About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.
22 Sep 2009
TL;DR: Two strategies offering a kernel-assisted, single-copy model with support for noncontiguous and asynchronous transfers are introduced, which outperform the standard transfer method in the MPICH2 implementation when no cache is shared between the processing cores or when very large messages are being transferred.
Abstract: The emergence of multicore processors raises the need to efficiently transfer large amounts of data between local processes. MPICH2 is a highly portable MPI implementation whose large-message communication schemes suffer from high CPU utilization and cache pollution because of the use of a double-buffering strategy, common to many MPI implementations. We introduce two strategies offering a kernel-assisted, single-copy model with support for noncontiguous and asynchronous transfers. The first one uses the now widely available vmsplice Linux system call; the second one further improves performance thanks to a custom kernel module called KNEM. The latter also offers I/OAT copy offload, which is dynamically enabled depending on both hardware cache characteristics and message size. These new solutions outperform the standard transfer method in the MPICH2 implementation when no cache is shared between the processing cores or when very large messages are being transferred. Collective communication operations show a dramatic improvement, and the IS NAS parallel benchmark shows a 25% speedup and better cache efficiency.
61 citations
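The difference between double buffering and a single-copy model can be sketched with a toy byte-counting model. This is not MPICH2, vmsplice, or KNEM code: the function names, the `chunk` size, and the use of an in-process `bytearray` as the "shared" intermediate buffer are all assumptions made for illustration.

```python
def double_buffer_transfer(src, dst, chunk=4):
    """Toy model: move src into dst through a small intermediate buffer.

    Every byte is copied twice (sender -> shared buffer, shared buffer ->
    receiver), which is what costs CPU time and displaces useful cache lines.
    """
    shared = bytearray(chunk)  # hypothetical shared intermediate buffer
    copied = 0
    for off in range(0, len(src), chunk):
        n = min(chunk, len(src) - off)
        shared[:n] = src[off:off + n]   # copy 1: sender into shared buffer
        dst[off:off + n] = shared[:n]   # copy 2: shared buffer into receiver
        copied += 2 * n
    return copied


def single_copy_transfer(src, dst):
    """Toy model: copy directly from the source to the destination buffer."""
    dst[:len(src)] = src
    return len(src)


message = bytes(range(10))
via_buffer = bytearray(len(message))
direct = bytearray(len(message))
bytes_moved_double = double_buffer_transfer(message, via_buffer)
bytes_moved_single = single_copy_transfer(message, direct)
```

In this model the double-buffered path moves twice as many bytes as the direct path for the same message, which is the overhead the kernel-assisted strategies aim to remove.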
•
01 Aug 2001
TL;DR: A microprocessor includes a control unit and a cache that stores data for the control unit; the cache is selectively configurable as either a single cache or a partitioned cache with a locked cache portion and a normal cache portion.
Abstract: A microprocessor including a control unit and a cache connected with the control unit for storing data to be used by the control, wherein the cache is selectively configurable as either a single cache or as a partitioned cache having a locked cache portion and a normal cache portion. The normal cache portion is controlled by a hardware implemented automatic replacement process. The locked cache portion is locked so that the automatic replacement process cannot modify the contents of the locked cache. An instruction is provided in the instruction set that enables software to selectively allocate lines in the locked cache portion to correspond to locations in an external memory, thereby enabling the locked cache portion to be completely managed by software.
61 citations
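A minimal software model of such a partitioned cache might look like the following. The class name, method names, and the choice of LRU for the normal portion are assumptions for illustration; the abstract only states that the normal portion uses a hardware automatic replacement process and that locked lines are allocated by software.

```python
from collections import OrderedDict


class PartitionedCache:
    """Toy model of a cache with a locked portion and a normal portion."""

    def __init__(self, normal_lines, locked_lines=0):
        self.normal_capacity = normal_lines
        self.locked_capacity = locked_lines  # 0 models the single-cache mode
        self.normal = OrderedDict()  # address -> data, oldest entry first
        self.locked = {}             # address -> data, never auto-evicted

    def lock_line(self, addr, data):
        """Software-managed allocation of a line in the locked portion."""
        if addr not in self.locked and len(self.locked) >= self.locked_capacity:
            raise ValueError("locked portion is full")
        self.normal.pop(addr, None)  # keep a single copy of the line
        self.locked[addr] = data

    def access(self, addr, load):
        """Return (data, hit). `load` fetches the line from external memory."""
        if addr in self.locked:
            return self.locked[addr], True
        if addr in self.normal:
            self.normal.move_to_end(addr)  # refresh LRU position
            return self.normal[addr], True
        data = load(addr)
        if self.normal_capacity > 0:
            if len(self.normal) >= self.normal_capacity:
                self.normal.popitem(last=False)  # evict least recently used
            self.normal[addr] = data
        return data, False


cache = PartitionedCache(normal_lines=2, locked_lines=1)
cache.lock_line(100, "hot")
for a in range(10):                      # streaming accesses churn the normal portion
    cache.access(a, lambda addr: f"mem{addr}")
locked_result = cache.access(100, lambda addr: f"mem{addr}")
streamed_result = cache.access(0, lambda addr: f"mem{addr}")
```

The streaming loop evicts early lines from the normal portion, while the locked line at address 100 still hits because automatic replacement never touches it.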
•
21 Feb 1990
TL;DR: A cache management system is described for a computer system with a central processing unit, a main memory, and a cache memory. It transfers page-size blocks of information and overlaps the write of information to the cache memory with the read of information from main memory.
Abstract: A cache management system for a computer system having a central processing unit, a main memory, and cache memory including a memory management unit for transferring page size blocks of information, apparatus for reading information from main memory, apparatus for writing information to the cache memory, and apparatus for overlapping the write of information to the cache memory to occur during the read of information from the main memory.
61 citations
•
IBM
TL;DR: Balanced cache performance is provided in a data processing system with a first processor, a second processor, a first cache memory, a second cache memory shared between the two, and a control circuit that prevents replacing, from a second-level cache congruence class, all sets that are in the first cache.
Abstract: The present invention provides balanced cache performance in a data processing system. The data processing system includes a first processor, a second processor, a first cache memory, a second cache memory, and a control circuit. The first processor is connected to the first cache memory, which serves as a first level cache for the first processor. The second processor and the first cache memory are connected to the second cache memory, which serves as a second level cache for the first processor and as a first level cache for the second processor. Replacement of a set in the second cache memory results in the set being invalidated in the first cache memory. The control circuit is connected to the second level cache and prevents replacing from a second level cache congruence class all sets that are in the first cache.
61 citations
•
14 Apr 2003
TL;DR: A read cache is primed with initial data, and a read pre-fetch operation is triggered in response to a read cache hit on that initial data.
Abstract: Exemplary systems and methods include pre-fetching data in response to a read cache hit. Various exemplary methods include priming a read cache with initial data, and triggering a read pre-fetch operation in response to a read cache hit upon the initial data in the read cache. Another exemplary implementation includes a storage device having a read cache and a trigger module that causes a pre-fetch of data from a mass storage medium in response to a read cache hit upon data in the read cache.
61 citations
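The hit-triggered pre-fetch described in this entry can be sketched as a small simulation. The class and attribute names are hypothetical, and the choice to pre-fetch the next sequential blocks (`prefetch_len`) is an assumption: the abstract does not say which data is fetched. The cache here is unbounded to keep the sketch short.

```python
class PrefetchingReadCache:
    """Toy read cache that pre-fetches only when a read hits the cache."""

    def __init__(self, storage, prefetch_len=2):
        self.storage = storage            # list standing in for mass storage
        self.prefetch_len = prefetch_len  # assumed: number of blocks to read ahead
        self.cache = {}                   # block number -> data
        self.storage_reads = 0

    def _fetch(self, block):
        """Load one block from storage if it exists and is not cached yet."""
        if block not in self.cache and 0 <= block < len(self.storage):
            self.cache[block] = self.storage[block]
            self.storage_reads += 1

    def prime(self, blocks):
        """Prime the read cache with initial data."""
        for block in blocks:
            self._fetch(block)

    def read(self, block):
        """Return (data, hit) for a valid block number."""
        hit = block in self.cache
        if hit:
            # A hit is the trigger: read ahead the following blocks.
            for nxt in range(block + 1, block + 1 + self.prefetch_len):
                self._fetch(nxt)
        else:
            self._fetch(block)  # plain miss: load the block, no read-ahead
        return self.cache[block], hit


storage = [f"blk{i}" for i in range(8)]
rc = PrefetchingReadCache(storage, prefetch_len=2)
rc.prime([0])
first = rc.read(0)    # hit on primed data, triggers pre-fetch of blocks 1 and 2
second = rc.read(1)   # hit on pre-fetched data, extends read-ahead to block 3
third = rc.read(5)    # miss: block 5 is loaded, nothing is pre-fetched
```

Tying the pre-fetch to hits rather than misses means an isolated random read (block 5 above) pulls in only the block requested, which limits how much unneeded data enters the cache.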