Topic
Cache pollution
About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published on this topic, receiving 262,139 citations.
Papers
••
30 Mar 1998
TL;DR: This paper investigates optimizations for unstructured iterative applications in which the computational structure remains static or changes only slightly through iterations, and reorganizes the data elements to obtain better memory system performance without modifying code fragments.
Abstract: The increasing gap in processor and memory speeds has forced microprocessors to rely on deep cache hierarchies to keep the processors from starving for data. For many applications, this results in a wide disparity between sustained and peak achievable speed. Applications need to be tuned to processor and memory system architectures for cache locality, memory layout, and data prefetch and reuse. In this paper we investigate optimizations for unstructured iterative applications in which the computational structure remains static or changes only slightly through iterations. Our methods reorganize the data elements to obtain better memory system performance without modifying code fragments. Our experimental results show that the overall time can be reduced significantly using our optimizations. Further, the overhead of our methods is small enough that they are applicable even if the computational structure does not substantially change for tens of iterations.
66 citations
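The idea of reorganizing data elements without touching the computation can be illustrated with a small sketch. It assumes a breadth-first renumbering of graph nodes, which is one common reordering choice and not necessarily the method used in the paper; the names `bfs_order` and `edge_span` are hypothetical. The sum of index distances across edges serves as a rough proxy for how far apart connected elements sit in memory.

```python
from collections import deque


def bfs_order(num_nodes, edges):
    """Return new_index, where new_index[old] is the BFS position of node old."""
    adj = [[] for _ in range(num_nodes)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    new_index = [-1] * num_nodes
    next_id = 0
    for start in range(num_nodes):  # cover every connected component
        if new_index[start] != -1:
            continue
        new_index[start] = next_id
        next_id += 1
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if new_index[v] == -1:
                    new_index[v] = next_id
                    next_id += 1
                    queue.append(v)
    return new_index


def edge_span(edges, index=None):
    """Sum of |i - j| over all edges, optionally after renumbering."""
    if index is None:
        return sum(abs(u - v) for u, v in edges)
    return sum(abs(index[u] - index[v]) for u, v in edges)


# A path graph whose nodes were stored in a scattered order.
edges = [(0, 4), (4, 1), (1, 5), (5, 2), (2, 3)]
order = bfs_order(6, edges)
print(edge_span(edges), edge_span(edges, order))  # prints "15 5"
```

Only the data layout changes here: the edge list and any loop that walks it stay the same, and neighbours simply end up closer together after renumbering.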
••
23 May 2016
TL;DR: MEMTUNE dynamically tunes computation/caching memory partitions at runtime based on workload memory demand and in-memory data cache needs, and if needed, the scheduling information from the analytic framework is leveraged to evict data that will not be needed in the near future.
Abstract: Memory is a crucial resource for big data processing frameworks such as Spark and M3R, where the memory is used both for computation and for caching intermediate storage data. Consequently, optimizing memory is the key to extracting high performance. The extant approach is to statically split the memory for computation and caching based on workload profiling. This approach is unable to capture the varying workload characteristics and dynamic memory demands. Another factor that affects caching efficiency is the choice of data placement and eviction policy. The extant LRU policy is oblivious of task scheduling information from the analytic frameworks, and thus can lead to lost optimization opportunities. In this paper, we address the above issues by designing MEMTUNE, a dynamic memory manager for in-memory data analytics. MEMTUNE dynamically tunes computation/caching memory partitions at runtime based on workload memory demand and in-memory data cache needs. Moreover, if needed, the scheduling information from the analytic framework is leveraged to evict data that will not be needed in the near future. Finally, MEMTUNE also supports task-level data prefetching with a configurable window size to more effectively overlap computation with I/O. Our experiments show that MEMTUNE improves memory utilization, yields an overall performance gain of up to 46%, and achieves a cache hit ratio of up to 41% compared to standard Spark.
66 citations
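The contrast between plain LRU and scheduling-aware eviction can be sketched as follows. This is a minimal illustration and not MEMTUNE's implementation: the class `ScheduleAwareCache` and the `upcoming` parameter (the set of block ids that pending tasks are expected to read) are assumptions made for the example.

```python
from collections import OrderedDict


class ScheduleAwareCache:
    """LRU cache that prefers to evict blocks no pending task will read."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # block_id -> data, least recently used first

    def get(self, block_id):
        if block_id not in self.blocks:
            return None
        self.blocks.move_to_end(block_id)  # mark as most recently used
        return self.blocks[block_id]

    def put(self, block_id, data, upcoming=frozenset()):
        """Insert a block; `upcoming` is the (assumed) scheduler hint."""
        if block_id in self.blocks:
            self.blocks[block_id] = data
            self.blocks.move_to_end(block_id)
            return
        if len(self.blocks) >= self.capacity:
            self._evict(upcoming)
        self.blocks[block_id] = data

    def _evict(self, upcoming):
        # Oldest block that the scheduler does not expect to be read soon.
        victim = next((b for b in self.blocks if b not in upcoming), None)
        if victim is None:
            self.blocks.popitem(last=False)  # everything is needed: plain LRU
        else:
            del self.blocks[victim]


cache = ScheduleAwareCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
# Plain LRU would evict "a" here; the hint keeps it and evicts "b" instead.
cache.put("c", 3, upcoming={"a"})
print(list(cache.blocks))  # prints "['a', 'c']"
```

Falling back to LRU when every cached block is still needed keeps the policy no worse than the baseline in that case.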
•
06 Jun 2008
TL;DR: In this article, a shared code caching engine receives native code comprising at least a portion of a single module of the application program, and stores runtime data corresponding to the native code in a cache data file in the non-volatile memory.
Abstract: Computer code from an application program comprising a plurality of modules that each comprise a separately loadable file is code cached in a shared and persistent caching system. A shared code caching engine receives native code comprising at least a portion of a single module of the application program, and stores runtime data corresponding to the native code in a cache data file in the non-volatile memory. The engine then converts the cache data file into a code cache file and enables the code cache file to be pre-loaded as a runtime code cache. These steps are repeated to store a plurality of separate code cache files at different locations in non-volatile memory.
66 citations
••
TL;DR: In this article, the authors describe a recovery cache that has been built for the PDP-11 family of machines, which is designed to be an "add-on" unit which requires no hardware alterations to the host CPU but intersects the bus between the CPU and the memory modules.
Abstract: Backward error recovery is an integral part of the recovery block scheme that has been advanced as a method for providing tolerance against faults in software; the recovery cache has been proposed as a mechanism for providing this error recovery capability. This correspondence describes a recovery cache that has been built for the PDP-11 family of machines. This recovery cache has been designed to be an "add-on" unit which requires no hardware alterations to the host CPU but which intersects the bus between the CPU and the memory modules. Specially designed hardware enables concurrent operation of the recovery cache and the host system, and aims to minimize the overheads imposed on the host.
66 citations
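The backward error recovery that a recovery cache supports can be sketched in software. The paper describes a hardware unit on the PDP-11 bus, so this is only an analogy under stated assumptions: memory is modelled as a dictionary, and the class `RecoveryCache` with its method names is hypothetical. The key behaviour is that the prior value of a location is saved on the first write after a recovery point, so a failed block can be rolled back.

```python
_MISSING = object()  # marks locations that did not exist before the write


class RecoveryCache:
    """Saves prior values on first write so state can be rolled back."""

    def __init__(self, memory):
        self.memory = memory  # dict: address -> value
        self.saved = []       # stack of {address: prior value}, one per recovery point

    def establish(self):
        """Open a new recovery point."""
        self.saved.append({})

    def write(self, addr, value):
        # Record the old value only on the first write since the recovery point.
        if self.saved and addr not in self.saved[-1]:
            self.saved[-1][addr] = self.memory.get(addr, _MISSING)
        self.memory[addr] = value

    def restore(self):
        """Roll back to the most recent recovery point."""
        for addr, old in self.saved.pop().items():
            if old is _MISSING:
                self.memory.pop(addr, None)
            else:
                self.memory[addr] = old

    def discard(self):
        """Accept the results; keep prior values the enclosing point still needs."""
        inner = self.saved.pop()
        if self.saved:
            for addr, old in inner.items():
                self.saved[-1].setdefault(addr, old)


memory = {"x": 1}
rc = RecoveryCache(memory)
rc.establish()
rc.write("x", 2)
rc.write("x", 3)   # second write: the saved prior value stays 1
rc.write("y", 9)
rc.restore()       # acceptance test failed, so roll back
print(memory)      # prints "{'x': 1}"
```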
••
14 Jun 1993
TL;DR: In order to improve cache hit ratios, set-associative caches are used in some of the new superscalar microprocessors.
Abstract: During the past decade, microprocessor peak performance has increased at a tremendous rate using the RISC concept, higher and higher clock frequencies, and parallel/pipelined instruction issuing. As the gap between the main memory access time and the potential average instruction time is always increasing, it has become very important to improve the behavior of the caches, particularly when no secondary cache is used (i.e., on all low-cost microprocessor systems). In order to improve cache hit ratios, set-associative caches are used in some of the new superscalar microprocessors.
66 citations
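Why set associativity can raise the hit ratio is easy to see with a toy simulation. The sketch below is not taken from the paper: the function `hit_ratio`, the LRU replacement policy, and the access trace are all assumptions chosen to show two addresses that conflict in a direct-mapped cache but coexist in a 2-way set-associative cache of the same size.

```python
def hit_ratio(addresses, num_lines, ways, line_size=1):
    """Simulate a cache with LRU replacement and return hits / accesses."""
    num_sets = num_lines // ways
    sets = [[] for _ in range(num_sets)]  # each list holds tags, LRU first
    hits = 0
    for addr in addresses:
        block = addr // line_size
        index, tag = block % num_sets, block // num_sets
        lines = sets[index]
        if tag in lines:
            hits += 1
            lines.remove(tag)      # re-appended below as most recently used
        elif len(lines) == ways:
            lines.pop(0)           # set is full: evict the least recently used tag
        lines.append(tag)
    return hits / len(addresses)


# Addresses 0 and 8 map to the same line of an 8-line direct-mapped cache.
trace = [0, 8, 0, 8, 0, 8, 0, 8]
print(hit_ratio(trace, num_lines=8, ways=1))  # prints "0.0"
print(hit_ratio(trace, num_lines=8, ways=2))  # prints "0.75"
```

With one way, each access evicts the block the next access needs; with two ways, both blocks fit in the same set and only the first two accesses miss.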