Topic

Smart Cache

About: Smart Cache is a research topic. Over the lifetime, 7680 publications have been published within this topic receiving 180618 citations.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Proceedings Article•DOI•

Squirrel: a decentralized peer-to-peer web cache

[...]

Sitaram Iyer¹, Antony Rowstron², Peter Druschel¹•Institutions (2)

Rice University¹, Microsoft²

21 Jul 2002

TL;DR: This paper proposes and evaluates decentralized web caching algorithms for Squirrel, and discovers that it exhibits performance comparable to a centralized web cache in terms of hit ratio, bandwidth usage and latency.

...read moreread less

Abstract: This paper presents a decentralized, peer-to-peer web cache called Squirrel. The key idea is to enable web browsers on desktop machines to share their local caches, to form an efficient and scalable web cache, without the need for dedicated hardware and the associated administrative cost. We propose and evaluate decentralized web caching algorithms for Squirrel, and discover that it exhibits performance comparable to a centralized web cache in terms of hit ratio, bandwidth usage and latency. It also achieves the benefits of decentralization, such as being scalable, self-organizing and resilient to node failures, while imposing low overhead on the participating nodes.

...read moreread less

429 citations

Proceedings Article•

Weaving Relations for Cache Performance

[...]

Anastassia Ailamaki¹, David J. DeWitt², Mark D. Hill², Marios Skounakis²•Institutions (2)

Carnegie Mellon University¹, University of Wisconsin-Madison²

11 Sep 2001

TL;DR: This paper proposes a new data organization model called PAX (Partition Attributes Across), that significantly improves cache performance by grouping together all values of each attribute within each page, and demonstrates that in-page data placement is the key to high cache performance.

...read moreread less

Abstract: Relational database systems have traditionally optimzed for I/O performance and organized records sequentially on disk pages using the N-ary Storage Model (NSM) (a.k.a., slotted pages). Recent research, however, indicates that cache utilization and performance is becoming increasingly important on modern platforms. In this paper, we first demonstrate that in-page data placement is the key to high cache performance and that NSM exhibits low cache utilization on modern platforms. Next, we propose a new data organization model called PAX (Partition Attributes Across), that significantly improves cache performance by grouping together all values of each attribute within each page. Because PAX only affects layout inside the pages, it incurs no storage penalty and does not affect I/O behavior. According to our experimental results, when compared to NSM (a) PAX exhibits superior cache and memory bandwidth utilization, saving at least 75% of NSM’s stall time due to data cache accesses, (b) range selection queries and updates on memoryresident relations execute 17-25% faster, and (c) TPC-H queries involving I/O execute 11-48% faster.

...read moreread less

428 citations

Proceedings Article•DOI•

Cache-Conscious Wavefront Scheduling

[...]

Timothy G. Rogers¹, Mike O'Connor², Tor M. Aamodt¹•Institutions (2)

University of British Columbia¹, Advanced Micro Devices²

01 Dec 2012

TL;DR: This paper proposes Cache-Conscious Wave front Scheduling (CCWS), an adaptive hardware mechanism that makes use of a novel intra-wave front locality detector to capture locality that is lost by other schedulers due to excessive contention for cache capacity.

...read moreread less

Abstract: This paper studies the effects of hardware thread scheduling on cache management in GPUs. We propose Cache-Conscious Wave front Scheduling (CCWS), an adaptive hardware mechanism that makes use of a novel intra-wave front locality detector to capture locality that is lost by other schedulers due to excessive contention for cache capacity. In contrast to improvements in the replacement policy that can better tolerate difficult access patterns, CCWS shapes the access pattern to avoid thrashing the shared L1. We show that CCWS can outperform any replacement scheme by evaluating against the Belady-optimal policy. Our evaluation demonstrates that cache efficiency and preservation of intra-wave front locality become more important as GPU computing expands beyond use in high performance computing. At an estimated cost of 0.17% total chip area, CCWS reduces the number of threads actively issued on a core when appropriate. This leads to an average 25% fewer L1 data cache misses which results in a harmonic mean 24% performance improvement over previously proposed scheduling policies across a diverse selection of cache-sensitive workloads.

...read moreread less

408 citations

Journal Article•DOI•

Dynamic Partitioning of Shared Cache Memory

[...]

G.E. Suh¹, Larry Rudolph¹, Srinivas Devadas¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Apr 2004-The Journal of Supercomputing

TL;DR: The results show that smart cache management and scheduling is essential to achieve high performance with shared cache memory and can improve the total IPC significantly over the standard least recently used (LRU) replacement policy.

...read moreread less

Abstract: This paper proposes dynamic cache partitioning amongst simultaneously executing processes/threads. We present a general partitioning scheme that can be applied to set-associative caches. Since memory reference characteristics of processes/threads can change over time, our method collects the cache miss characteristics of processes/threads at run-time. Also, the workload is determined at run-time by the operating system scheduler. Our scheme combines the information, and partitions the cache amongst the executing processes/threads. Partition sizes are varied dynamically to reduce the total number of misses. The partitioning scheme has been evaluated using a processor simulator modeling a two-processor CMP system. The results show that the scheme can improve the total IPC significantly over the standard least recently used (LRU) replacement policy. In a certain case, partitioning doubles the total IPC over standard LRU. Our results show that smart cache management and scheduling is essential to achieve high performance with shared cache memory.

...read moreread less

402 citations

Proceedings Article•DOI•

Cache template attacks: automating attacks on inclusive last-level caches

[...]

Daniel Gruss¹, Raphael Spreitzer¹, Stefan Mangard¹•Institutions (1)

Graz University of Technology¹

12 Aug 2015

TL;DR: An automated attack on the T-table-based AES implementation of OpenSSL that is as efficient as state-of-the-art manual cache attacks and can reduce the entropy per character from log2(26) = 4.7 to 1.4 bits on Linux systems is performed.

...read moreread less

Abstract: Recent work on cache attacks has shown that CPU caches represent a powerful source of information leakage. However, existing attacks require manual identification of vulnerabilities, i.e., data accesses or instruction execution depending on secret information. In this paper, we present Cache Template Attacks. This generic attack technique allows us to profile and exploit cache-based information leakage of any program automatically, without prior knowledge of specific software versions or even specific system information. Cache Template Attacks can be executed online on a remote system without any prior offline computations or measurements. Cache Template Attacks consist of two phases. In the profiling phase, we determine dependencies between the processing of secret information, e.g., specific key inputs or private keys of cryptographic primitives, and specific cache accesses. In the exploitation phase, we derive the secret values based on observed cache accesses. We illustrate the power of the presented approach in several attacks, but also in a useful application for developers. Among the presented attacks is the application of Cache Template Attacks to infer keystrokes and--even more severe--the identification of specific keys on Linux and Windows user interfaces. More specifically, for lowercase only passwords, we can reduce the entropy per character from log2(26) = 4.7 to 1.4 bits on Linux systems. Furthermore, we perform an automated attack on the T-table-based AES implementation of OpenSSL that is as efficient as state-of-the-art manual cache attacks.

...read moreread less

387 citations

Collapse

Network Information

Performance

Metrics

7,847

Papers

186,437

Citations

No. of papers in the topic in previous years
Year	Papers
2023	50
2022	114
2021	5
2020	1
2019	8
2018	18

Smart Cache

Papers published on a yearly basis

Papers

Trending Questions (10)

Network Information

Related Topics (5)

Performance

Metrics