Topic

Cache algorithms

About: Cache algorithms is a research topic. Over its lifetime, 14,321 publications have been published within this topic, receiving 320,796 citations.

Papers

Journal Article

Summary cache: a scalable wide-area web cache sharing protocol

Li Fan¹, Pei Cao², Jussara M. Almeida¹, Andrei Z. Broder

Institutions: University of Wisconsin-Madison¹, Cisco Systems, Inc.²

01 Jun 2000 - IEEE/ACM Transactions on Networking

TL;DR: This paper demonstrates the benefits of cache sharing, measures the overhead of the existing protocols, and proposes a new protocol called "summary cache", which reduces the number of intercache protocol messages, reduces the bandwidth consumption, and eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP.

Abstract: The sharing of caches among Web proxies is an important technique to reduce Web traffic and alleviate network bottlenecks. Nevertheless it is not widely deployed due to the overhead of existing protocols. In this paper we demonstrate the benefits of cache sharing, measure the overhead of the existing protocols, and propose a new protocol called "summary cache". In this new protocol, each proxy keeps a summary of the cache directory of each participating proxy, and checks these summaries for potential hits before sending any queries. Two factors contribute to our protocol's low overhead: the summaries are updated only periodically, and the directory representations are very economical, as low as 8 bits per entry. Using trace-driven simulations and a prototype implementation, we show that, compared to existing protocols such as the Internet cache protocol (ICP), summary cache reduces the number of intercache protocol messages by a factor of 25 to 60, reduces the bandwidth consumption by over 50%, eliminates 30% to 95% of the protocol CPU overhead, all while maintaining almost the same cache hit ratios as ICP. Hence summary cache scales to a large number of proxies. (This paper is a revision of Fan et al. 1998; we add more data and analysis in this version.)

2,174 citations
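
The summary check described in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each summary is a Bloom filter (the representation the paper builds on), and the class name `BloomSummary`, the helper `peers_to_query`, the SHA-256-based hashing, and the filter sizes are all hypothetical choices made for this example.

```python
import hashlib


class BloomSummary:
    """Compact, lossy summary of one proxy's cache directory (hypothetical helper)."""

    def __init__(self, num_bits=1024, num_hashes=4):
        if not 1 <= num_hashes <= 8:
            raise ValueError("this sketch derives at most 8 positions per URL")
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, url):
        # Assumption: slice one SHA-256 digest into 4-byte chunks to get the bit positions.
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        for i in range(self.num_hashes):
            chunk = digest[4 * i: 4 * i + 4]
            yield int.from_bytes(chunk, "big") % self.num_bits

    def add(self, url):
        for pos in self._positions(url):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, url):
        # True means "possibly cached"; False means "definitely not cached".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(url))


def peers_to_query(url, peer_summaries):
    """Return only the peers whose local summary copy reports a possible hit."""
    return [name for name, summary in peer_summaries.items() if summary.might_contain(url)]


# Hypothetical usage: proxy A has cached the page, proxy B has cached nothing.
summary_a, summary_b = BloomSummary(), BloomSummary()
summary_a.add("http://example.org/index.html")
candidates = peers_to_query("http://example.org/index.html", {"A": summary_a, "B": summary_b})
```

A Bloom filter never reports a cached URL as absent, but it can report false positives, so a proxy using such summaries still has to handle a peer replying that it does not hold the document.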

Proceedings Article

Improving direct-mapped cache performance by the addition of a small fully-associative cache and prefetch buffers

Norman P. Jouppi

01 May 1990

TL;DR: This paper presents hardware techniques to improve cache performance: miss caches and victim caches, which place a small fully-associative cache between a cache and its refill path, and stream buffers, which hold prefetched data in a buffer rather than in the cache.

Abstract: Projections of computer technology forecast processors with peak performance of 1,000 MIPS in the relatively near future. These processors could easily lose half or more of their performance in the memory hierarchy if the hierarchy design is based on conventional caching techniques. This paper presents hardware techniques to improve the performance of caches. Miss caching places a small fully-associative cache between a cache and its refill path. Misses in the cache that hit in the miss cache have only a one cycle miss penalty, as opposed to a many cycle miss penalty without the miss cache. Small miss caches of 2 to 5 entries are shown to be very effective in removing mapping conflict misses in first-level direct-mapped caches. Victim caching is an improvement to miss caching that loads the small fully-associative cache with the victim of a miss and not the requested line. Small victim caches of 1 to 5 entries are even more effective at removing conflict misses than miss caching. Stream buffers prefetch cache lines starting at a cache miss address. The prefetched data is placed in the buffer and not in the cache. Stream buffers are useful in removing capacity and compulsory cache misses, as well as some instruction cache conflict misses. Stream buffers are more effective than previously investigated prefetch techniques at using the next slower level in the memory hierarchy when it is pipelined. An extension to the basic stream buffer, called multi-way stream buffers, is introduced. Multi-way stream buffers are useful for prefetching along multiple intertwined data reference streams. Together, victim caches and stream buffers reduce the miss rate of the first level in the cache hierarchy by a factor of two to three on a set of six large benchmarks.

1,481 citations
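
The victim-caching idea from the abstract above can be illustrated with a small software simulation. This is a sketch under stated assumptions, not the paper's hardware design: the class name `DirectMappedWithVictim`, the modulo indexing on block addresses, and the LRU replacement inside the victim cache are choices made for this example.

```python
from collections import OrderedDict


class DirectMappedWithVictim:
    """Direct-mapped cache backed by a small fully-associative victim cache."""

    def __init__(self, num_sets=4, victim_entries=2):
        self.num_sets = num_sets
        self.lines = [None] * num_sets      # one resident block address per set
        self.victims = OrderedDict()        # victim cache, oldest entry first
        self.victim_entries = victim_entries

    def access(self, block):
        index = block % self.num_sets       # assumption: simple modulo indexing
        resident = self.lines[index]
        if resident == block:
            return "hit"
        if block in self.victims:
            # The requested line moves back into the main cache (swap with resident).
            del self.victims[block]
            outcome = "victim_hit"
        else:
            outcome = "miss"
        self.lines[index] = block
        if resident is not None:
            self._keep_victim(resident)     # the evicted line becomes the victim
        return outcome

    def _keep_victim(self, block):
        self.victims[block] = True
        if len(self.victims) > self.victim_entries:
            self.victims.popitem(last=False)   # drop the oldest victim


# Blocks 0 and 4 conflict in a 4-set direct-mapped cache.
trace = [0, 4, 0, 4]
with_victim = DirectMappedWithVictim(num_sets=4, victim_entries=2)
without_victim = DirectMappedWithVictim(num_sets=4, victim_entries=0)
results_with = [with_victim.access(b) for b in trace]
results_without = [without_victim.access(b) for b in trace]
```

With the victim cache, the two conflicting blocks ping-pong between the main cache and the victim cache instead of going to the next memory level on every access, which is the conflict-miss case the abstract targets.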

Proceedings Article

Utility-Based Cache Partitioning: A Low-Overhead, High-Performance, Runtime Mechanism to Partition Shared Caches

Moinuddin K. Qureshi, Yale N. Patt

09 Dec 2006

TL;DR: In this article, the authors propose a low-overhead, runtime mechanism that partitions a shared cache between multiple applications depending on the reduction in cache misses that each application is likely to obtain for a given amount of cache resources.

Abstract: This paper investigates the problem of partitioning a shared cache between multiple concurrently executing applications. The commonly used LRU policy implicitly partitions a shared cache on a demand basis, giving more cache resources to the application that has a high demand and fewer cache resources to the application that has a low demand. However, a higher demand for cache resources does not always correlate with a higher performance from additional cache resources. It is beneficial for performance to invest cache resources in the application that benefits more from the cache resources rather than in the application that has more demand for the cache resources. This paper proposes utility-based cache partitioning (UCP), a low-overhead, runtime mechanism that partitions a shared cache between multiple applications depending on the reduction in cache misses that each application is likely to obtain for a given amount of cache resources. The proposed mechanism monitors each application at runtime using a novel, cost-effective, hardware circuit that requires less than 2kB of storage. The information collected by the monitoring circuits is used by a partitioning algorithm to decide the amount of cache resources allocated to each application. Our evaluation, with 20 multiprogrammed workloads, shows that UCP improves performance of a dual-core system by up to 23% and on average 11% over LRU-based cache partitioning.

1,083 citations
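
The utility-driven allocation described in the abstract above can be sketched as follows. This is only an illustration of the principle: it uses a plain greedy allocation by marginal hit gain, which is an assumption made for this example and not necessarily the partitioning algorithm used in the paper. The function name `greedy_partition` and the counter values are hypothetical.

```python
def greedy_partition(way_hits, total_ways):
    """Split total_ways among applications by marginal utility.

    way_hits[app][i] is the number of extra hits the application would gain
    from its (i+1)-th way, e.g. as read from per-application monitoring counters.
    """
    if total_ways < len(way_hits):
        raise ValueError("need at least one way per application")
    # Assumption: every application keeps at least one way.
    alloc = {app: 1 for app in way_hits}

    def marginal(app):
        # Hits gained if this application received one more way.
        owned = alloc[app]
        return way_hits[app][owned] if owned < len(way_hits[app]) else 0

    for _ in range(total_ways - len(way_hits)):
        best = max(alloc, key=marginal)     # ties go to the first application
        alloc[best] += 1
    return alloc


# Hypothetical counters: A saturates quickly, B keeps benefiting from more ways.
example = greedy_partition(
    {"A": [100, 50, 20, 5, 1, 0, 0, 0], "B": [30, 28, 27, 26, 25, 24, 23, 22]},
    total_ways=8,
)
# example == {"A": 2, "B": 6}
```

Greedy allocation of this kind is only guaranteed to be optimal when each application's gain per extra way never increases, so it should be read as the simplest instance of the idea rather than a complete mechanism.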

Proceedings Article

Cost-aware WWW proxy caching algorithms

Pei Cao¹, Sandy Irani²

Institutions: University of Wisconsin-Madison¹, University of California, Irvine²

08 Dec 1997

TL;DR: This paper introduces GreedyDual-Size, a web cache replacement algorithm that incorporates locality with cost and size concerns in a simple, non-parameterized fashion and can potentially improve the performance of main-memory caching of Web documents.

Abstract: Web caches can not only reduce network traffic and downloading latency, but can also affect the distribution of web traffic over the network through cost-aware caching. This paper introduces GreedyDual-Size, which incorporates locality with cost and size concerns in a simple and non-parameterized fashion for high performance. Trace-driven simulations show that with the appropriate cost definition, GreedyDual-Size outperforms existing web cache replacement algorithms in many aspects, including hit ratios, latency reduction and network cost reduction. In addition, GreedyDual-Size can potentially improve the performance of main-memory caching of Web documents.

1,048 citations
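
The policy introduced in the abstract above can be sketched as below. The sketch follows the commonly described form of GreedyDual-Size, in which each cached document carries a value H = L + cost/size, the document with the smallest H is evicted, and L is raised to the evicted H. The class name `GreedyDualSize`, the uniform default cost of 1, and the linear scan for the victim are assumptions for illustration; a priority queue would be the usual choice for large caches.

```python
class GreedyDualSize:
    """Size- and cost-aware replacement: evict the document with the lowest H value."""

    def __init__(self, capacity):
        self.capacity = capacity    # total bytes available
        self.used = 0
        self.inflation = 0.0        # the running value L
        self.entries = {}           # key -> (H, size, cost)

    def access(self, key, size, cost=1.0):
        """Record an access; return True on a hit, False on a miss."""
        hit = key in self.entries
        if hit:
            _, size, cost = self.entries[key]
        else:
            if size > self.capacity:
                return False        # document can never fit; do not cache it
            while self.used + size > self.capacity:
                victim = min(self.entries, key=lambda k: self.entries[k][0])
                self.inflation = self.entries[victim][0]   # L rises to the evicted H
                self.used -= self.entries[victim][1]
                del self.entries[victim]
            self.used += size
        # On both hits and insertions, H is (re)set to L + cost/size.
        self.entries[key] = (self.inflation + cost / size, size, cost)
        return hit


# Hypothetical usage with a 10-byte cache and unit cost per document.
cache = GreedyDualSize(capacity=10)
cache.access("big", size=8)      # H = 1/8
cache.access("small", size=2)    # H = 1/2
cache.access("new", size=4)      # evicts "big", the entry with the lowest H
```

With a cost of 1 for every document the policy favours small documents and so targets hit ratio; passing a download latency or network cost instead shifts what the cache optimizes, which is the cost-aware aspect the abstract refers to.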

Proceedings Article

FLUSH+RELOAD: a high resolution, low noise, L3 cache side-channel attack

Yuval Yarom¹, Katrina Falkner¹

Institutions: University of Adelaide¹

20 Aug 2014

TL;DR: This paper presents FLUSH+RELOAD, a cache side-channel attack technique that exploits a weakness in the Intel X86 processors to monitor access to memory lines in shared pages and recovers 96.7% of the bits of the secret key by observing a single signature or decryption round.

Abstract: Sharing memory pages between non-trusting processes is a common method of reducing the memory footprint of multi-tenanted systems. In this paper we demonstrate that, due to a weakness in the Intel X86 processors, page sharing exposes processes to information leaks. We present FLUSH+RELOAD, a cache side-channel attack technique that exploits this weakness to monitor access to memory lines in shared pages. Unlike previous cache side-channel attacks, FLUSH+RELOAD targets the Last-Level Cache (i.e. L3 on processors with three cache levels). Consequently, the attack program and the victim do not need to share the execution core. We demonstrate the efficacy of the FLUSH+RELOAD attack by using it to extract the private encryption keys from a victim program running GnuPG 1.4.13. We tested the attack both between two unrelated processes in a single operating system and between processes running in separate virtual machines. On average, the attack is able to recover 96.7% of the bits of the secret key by observing a single signature or decryption round.

1,001 citations

Metrics

Papers: 14,612
Citations: 330,306

No. of papers in the topic in previous years
Year	Papers
2023	78
2022	210
2021	46
2020	62
2019	70
2018	103
