Topic

Cache pollution

About: Cache pollution is a research topic. Over its lifetime, 11,353 publications have been published within this topic, receiving 262,139 citations.


Papers
Proceedings ArticleDOI
15 Jun 1999
TL;DR: Presents a systematic and quantitative approach that uses software-implemented fault injection to guide the design and implementation of a fault-tolerant system, improving robustness in the presence of operating system errors.
Abstract: Fault injection is typically used to characterize failures and to validate and compare fault-tolerant mechanisms. However, fault injection is rarely used to guide the design and implementation of a fault-tolerant system. We present a systematic and quantitative approach for using software-implemented fault injection to guide the design and implementation of a fault-tolerant system. Our system design goal is to build a write-back file cache on Intel PCs that is as reliable as a write-through file cache. We follow an iterative approach to improve robustness in the presence of operating system errors. In each iteration, we measure the reliability of the system, analyze the fault symptoms that lead to data corruption, and apply fault-tolerant mechanisms that address the fault symptoms. Our initial system is 13 times less reliable than a write-through file cache. The result of several iterations is a design that is both more reliable (1.9% vs. 3.1% corruption rate) and 5-9 times as fast as a write-through file cache.

82 citations
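The iterative approach the abstract describes (measure reliability, analyze the symptoms behind corruption, add a mechanism, repeat) can be pictured as a small fault-injection harness. Below is a minimal Python sketch; the `inject_fault`, `run_workload`, `data_corrupted`, and `apply_mechanism` hooks are hypothetical stand-ins for the instrumented system, not anything from the paper.

```python
import random

# Illustrative fault classes for the paper's setting: corrupt OS state,
# then check whether cached file data survives.
FAULT_TYPES = ["bit_flip", "bad_pointer", "dropped_write"]

def run_campaign(system, trials=1000):
    """One measurement pass: inject faults, run the workload, and
    record which fault symptoms led to data corruption."""
    symptoms = {}
    corrupted = 0
    for _ in range(trials):
        fault = random.choice(FAULT_TYPES)
        system.reset()
        system.inject_fault(fault)      # assumed hook on the test system
        system.run_workload()           # assumed hook
        if system.data_corrupted():     # assumed hook
            corrupted += 1
            symptoms[fault] = symptoms.get(fault, 0) + 1
    return corrupted / trials, symptoms

def harden(system, baseline_rate):
    """Iterate until the write-back cache's corruption rate matches
    a write-through baseline, hardening the worst symptom each round."""
    rate, symptoms = run_campaign(system)
    while rate > baseline_rate:
        worst = max(symptoms, key=symptoms.get)
        system.apply_mechanism(worst)   # assumed hook: add a guard
        rate, symptoms = run_campaign(system)
    return rate
```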

Proceedings ArticleDOI
02 Feb 2002
TL;DR: Based on a new characterisation of data reuse across multiple loop nests, a method, a prototype implementation and experimental results are presented for analysing the cache behaviour of whole programs with regular computations; the method can be used to guide compiler locality optimisations and to improve cache simulation performance.
Abstract: Based on a new characterisation of data reuse across multiple loop nests, we present a method, a prototype implementation and some experimental results for analysing the cache behaviour of whole programs with regular computations. Validation against cache simulation using real codes shows the efficiency and accuracy of our method. The largest program we have analysed, Applu from SPECfp95, has 3868 lines, 16 subroutines and 2565 references. In the case of a 32KB cache with a 32B line size, our method obtains the miss ratio with an absolute error of about 0.80% in about 128 seconds, while the simulator used runs for nearly 5 hours on a 933MHz Pentium III PC. Our method can be used to guide compiler locality optimisations and improve cache simulation performance.

82 citations
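For context, the trace-driven cache simulation that the paper's analytical model is validated against can be pictured in a few lines of Python. This is a generic simulator sketch (direct-mapped for brevity, with the paper's 32KB cache and 32B lines), not the authors' tool:

```python
CACHE_SIZE = 32 * 1024   # 32 KB, as in the paper's experiment
LINE_SIZE = 32           # 32 B lines
NUM_LINES = CACHE_SIZE // LINE_SIZE

def miss_ratio(addresses):
    """Trace-driven simulation of a direct-mapped cache: the slow
    baseline an analytical model produces the same numbers as."""
    tags = [None] * NUM_LINES
    misses = 0
    for addr in addresses:
        line = addr // LINE_SIZE
        idx = line % NUM_LINES
        tag = line // NUM_LINES
        if tags[idx] != tag:
            misses += 1
            tags[idx] = tag
    return misses / len(addresses)

# Example: a unit-stride sweep of 8-byte elements misses once per
# 32 B line, i.e. once every four accesses.
trace = [i * 8 for i in range(100_000)]
print(f"miss ratio: {miss_ratio(trace):.4f}")   # ~0.25
```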

Proceedings ArticleDOI
15 Jun 2005
TL;DR: This work exploits the widespread parallelism and regular communication patterns in stream programs to formulate a set of cache-aware optimizations that automatically improve instruction and data locality in the context of the Synchronous Dataflow model.
Abstract: Effective use of the memory hierarchy is critical for achieving high performance on embedded systems. We focus on the class of streaming applications, which is increasingly prevalent in the embedded domain. We exploit the widespread parallelism and regular communication patterns in stream programs to formulate a set of cache-aware optimizations that automatically improve instruction and data locality. Our work is in the context of the Synchronous Dataflow model, in which a program is described as a graph of independent actors that communicate over channels. The communication rates between actors are known at compile time, allowing the compiler to statically model the caching behavior. We present three cache-aware optimizations: 1) execution scaling, which judiciously repeats actor executions to improve instruction locality, 2) cache-aware fusion, which combines adjacent actors while respecting instruction cache constraints, and 3) scalar replacement, which converts certain data buffers into a sequence of scalar variables that can be register allocated. The optimizations are founded upon a simple and intuitive model that quantifies the temporal locality for a sequence of actor executions. Our implementation of cache-aware optimizations in the StreamIt compiler yields a 249% average speedup (over unoptimized code) for our streaming benchmark suite on a StrongARM 1110 processor. The optimizations also yield a 154% speedup on a Pentium 3 and a 152% speedup on an Itanium 2.

82 citations
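Of the three optimizations, execution scaling is the simplest to illustrate: firing each actor several times in a row keeps its code hot in the instruction cache, at the cost of larger inter-actor buffers. A minimal Python sketch with hypothetical actor and buffer types (StreamIt itself is a separate language and compiler):

```python
from collections import deque

class Actor:
    """Hypothetical stand-in for a synchronous-dataflow actor:
    consumes `pop` items and produces `push` items per firing."""
    def __init__(self, name, fn, pop, push):
        self.name, self.fn, self.pop, self.push = name, fn, pop, push

def run_scaled(actors, source, scale=16):
    """Execution scaling: fire each actor `scale` times consecutively
    so its instructions stay resident in the I-cache. A single pass
    suffices here only because `scale` covers all the input data."""
    buffers = [deque(source)] + [deque() for _ in actors]
    for i, actor in enumerate(actors):
        inbuf, outbuf = buffers[i], buffers[i + 1]
        for _ in range(scale):
            if len(inbuf) < actor.pop:
                break
            args = [inbuf.popleft() for _ in range(actor.pop)]
            outbuf.extend(actor.fn(*args))
    return list(buffers[-1])

# Example pipeline: duplicate each item, then sum adjacent pairs.
dup = Actor("dup", lambda x: (x, x), pop=1, push=2)
add = Actor("add", lambda a, b: (a + b,), pop=2, push=1)
print(run_scaled([dup, add], source=range(16), scale=16))
```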

Patent
Jamshed H. Mirza1
15 Apr 1991
TL;DR: In this patent, a cache bypass mechanism automatically avoids caching data for instructions whose data references, for whatever reason, exhibit a low cache hit ratio; a record of each instruction's recent behavior is used to decide whether its future references should be cached or not.
Abstract: A cache bypass mechanism automatically avoids caching of data for instructions whose data references, for whatever reason, exhibit a low cache hit ratio. The mechanism keeps a record of an instruction's behavior in the immediate past, and this record is used to decide whether its future references should be cached or not. If an instruction is experiencing a poor cache hit ratio, it is marked as non-cacheable, and its data references are made to bypass the cache. This avoids the additional penalty of unnecessarily fetching the remaining words in the line, reduces the demand on the memory bandwidth, avoids flushing the cache of useful data and, in parallel processing environments, prevents line thrashing. The cache management scheme is automatic and requires no compiler or user intervention.

82 citations
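The record-keeping the patent describes boils down to a small per-instruction table of recent hit/miss outcomes. A minimal Python sketch follows; the window size and bypass threshold are illustrative assumptions, not values from the patent:

```python
class BypassPredictor:
    """Per-instruction hit-ratio tracker: instructions whose recent
    references hit poorly are marked non-cacheable so their data
    bypasses the cache instead of polluting it."""
    def __init__(self, window=32, threshold=0.25):
        self.history = {}        # instruction PC -> recent hit/miss bits
        self.window = window
        self.threshold = threshold

    def record(self, pc, hit):
        h = self.history.setdefault(pc, [])
        h.append(hit)
        if len(h) > self.window:
            h.pop(0)             # keep only the immediate past

    def should_bypass(self, pc):
        h = self.history.get(pc)
        if not h or len(h) < self.window:
            return False         # not enough history: cache normally
        return sum(h) / len(h) < self.threshold

# Usage: consulted on each memory reference before allocating a line.
pred = BypassPredictor()
for _ in range(32):
    pred.record(pc=0x4004F0, hit=False)  # streaming access, never hits
print(pred.should_bypass(0x4004F0))      # True: route around the cache
```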

Patent
22 Jun 1992
TL;DR: In this patent, an intelligent cache memory system and associated method for reducing central processing unit (CPU) idle time are proposed; the system performs prefetches based on the data fetching characteristics of the CPU.
Abstract: An intelligent cache memory system and associated method for reducing central processing unit (CPU) idle time. The system performs prefetches based on the data fetching characteristics of the CPU. The system includes cache control logic, a first and a second cache memory, each having a number of cache lines, and a first and a second cache tag array, each having cache tag entries corresponding to the cache lines. The cache tag entries comprise cache tags and valid bits; the cache tag entries of the second cache tag array further comprise interest bits. In addition to their traditional functions, the cache tags and the valid bits, in conjunction with the interest bits, are used to track the data fetching history of the CPU. For each read cycle, the cache control logic returns the data being fetched by the CPU from either the first or the second cache memory or the main memory. Additionally, the cache control logic initiates prefetches and updates the data fetching history conditionally. The data fetched from either the second cache memory or the main memory are also stored in the first cache memory, whereas the data prefetched are stored in the second cache memory. Prefetch is conditioned on the data fetching history, while the data fetching history update is conditioned on where the data requested by the CPU are fetched. As a result, CPU idle time is further reduced and system performance is further improved.

81 citations
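A rough Python sketch of the arrangement: a first (demand) cache, a second (prefetch) cache whose entries carry interest bits, and prefetches conditioned on where recent fetches were satisfied. The next-line prefetch policy here is a simplifying assumption, not the patent's exact logic:

```python
class PrefetchingCache:
    """Two cache memories as in the patent: demand-fetched lines go to
    the first cache, speculatively prefetched lines (tagged with an
    interest bit) go to the second."""
    def __init__(self):
        self.demand = {}     # line address -> data (first cache memory)
        self.prefetch = {}   # line address -> (data, interest_bit)

    def read(self, line, memory):
        if line in self.demand:
            return self.demand[line]           # hit in first cache
        if line in self.prefetch:
            data, _ = self.prefetch.pop(line)  # prefetch proved useful:
            self.demand[line] = data           # promote to first cache
            self._prefetch(line + 1, memory)   # history says keep going
            return data
        data = memory[line]                    # miss: fetch from memory
        self.demand[line] = data
        self._prefetch(line + 1, memory)       # speculate on next line
        return data

    def _prefetch(self, line, memory):
        if line not in memory or line in self.demand or line in self.prefetch:
            return
        # Interest bit set: this line was brought in speculatively.
        self.prefetch[line] = (memory[line], True)

# Usage: a sequential scan hits the prefetch cache after the first miss.
memory = {i: f"data{i}" for i in range(8)}
cache = PrefetchingCache()
for i in range(4):
    cache.read(i, memory)
```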


Network Information
Related Topics (5)

Topic            Papers    Citations    Related
Cache            59.1K     976.6K       93%
Compiler         26.3K     578.5K       89%
Scalability      50.9K     931.6K       87%
Server           79.5K     1.4M         86%
Static routing   25.7K     576.7K       84%
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    42
2022    110
2021    12
2020    20
2019    15
2018    30