Proceedings ArticleDOI
L1 data cache decomposition for energy efficiency
Michael C. Huang, Jose Renau, Seung-Moon Yoo, Josep Torrellas +3 more
- pp. 10-15
TLDR
In this paper, a new L1 data cache structure that combines a Specialized Stack Cache (SSC) and a Pseudo Set-Associative Cache (PSAC) is proposed.
Abstract
The L1 data cache is a time-critical module and, at the same time, a major consumer of energy. To reduce its energy-delay product, we apply two principles of low-power design: specialize part of the cache structure and break the cache down into smaller caches. To this end, we propose a new L1 data cache structure that combines a Specialized Stack Cache (SSC) and a Pseudo Set-Associative Cache (PSAC). Individually, our SSC and PSAC designs have a lower energy-delay product than previously-proposed related designs. In addition, their combined operation is very effective. Relative to a conventional 2-way 32 KB data cache, a design containing a 4-way 32 KB PSAC and a 512 B SSC reduces the energy-delay product of several applications by an average of 44%.
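The energy saving of a pseudo set-associative cache comes from probing one way at a time instead of all ways in parallel. The sketch below illustrates that idea under assumed details: the class name, the hash that picks the primary way, the fill policy, and the probe counter (used as a rough energy proxy) are all hypothetical, not the paper's exact design.

```python
class PSAC:
    """Illustrative pseudo set-associative cache: probe one primary way
    first; probe the remaining ways only if the primary way misses."""

    def __init__(self, num_sets, num_ways):
        self.num_sets = num_sets
        self.num_ways = num_ways
        # tags[set][way] holds the stored tag, or None if the way is invalid
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.probes = 0  # count of way probes, a rough energy proxy

    def lookup(self, addr):
        set_idx = addr % self.num_sets
        tag = addr // self.num_sets
        primary = tag % self.num_ways  # assumed simple hash picks the primary way
        # Probe the primary way first; fall back to the others in order.
        order = [primary] + [w for w in range(self.num_ways) if w != primary]
        for way in order:
            self.probes += 1
            if self.tags[set_idx][way] == tag:
                return True   # hit
        self.tags[set_idx][primary] = tag  # miss: fill the primary way
        return False
```

In this sketch a hit in the primary way costs one probe, whereas a conventional set-associative lookup activates all `num_ways` ways in parallel on every access.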
Citations
Proceedings ArticleDOI
A highly configurable cache architecture for embedded systems
TL;DR: This work introduces a novel cache architecture for embedded microprocessor platforms that can be configured by software as direct-mapped, two-way, or four-way set-associative, using a technique the authors call way concatenation, with very little size or performance overhead.
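The way-concatenation idea can be sketched as an index mapping: with fewer logical ways, extra index bits select among groups of concatenated physical ways, so total capacity is unchanged. The sketch below is illustrative only; the function name, parameters, and bit layout are assumptions, not the paper's circuit-level scheme.

```python
PHYS_WAYS = 4  # physical ways in the cache
NUM_SETS = 8   # physical sets per way

def map_block(addr, logical_ways):
    """Return (physical_set, candidate_physical_ways) for an address
    when the cache is configured with the given logical associativity."""
    assert PHYS_WAYS % logical_ways == 0
    concat = PHYS_WAYS // logical_ways   # physical ways fused per logical set
    logical_sets = NUM_SETS * concat     # capacity stays constant
    s = addr % logical_sets
    phys_set = s % NUM_SETS
    group = s // NUM_SETS                # extra index bits pick the way group
    ways = [group * logical_ways + w for w in range(logical_ways)]
    return phys_set, ways
```

Configured four-way, every address is a candidate for all four ways of one set; configured direct-mapped, each address maps to exactly one (set, way) pair.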
Book
Computer Architecture Techniques for Power-Efficiency
TL;DR: This book aims to document some of the most important architectural techniques that were invented, proposed, and applied to reduce both dynamic power and static power dissipation in processors and memory hierarchies by focusing on their common characteristics.
Proceedings ArticleDOI
Energy efficient Frequent Value data Cache design
Jun Yang, Rajiv Gupta +1 more
TL;DR: This paper proposes the design of the Frequent Value Cache (FVC), a cache in which frequent values are stored in a compact encoded form requiring only a few bits, while all other values are stored unencoded using the full 32 bits.
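The frequent-value encoding step can be sketched as a small lookup table: values found in the table are stored as a short index, everything else as a full 32-bit word. The table contents, function names, and index width below are hypothetical, chosen only to illustrate the encode/decode round trip.

```python
# Assumed frequent-value table; a real design would populate this
# from profiling. Four entries means a 2-bit encoded index.
FREQUENT = [0, 1, 0xFFFFFFFF, 0x80000000]

def encode(value):
    """Return (is_encoded, payload): a short table index for frequent
    values, or the full 32-bit word otherwise."""
    if value in FREQUENT:
        return True, FREQUENT.index(value)
    return False, value

def decode(is_encoded, payload):
    """Invert encode(): expand a table index back to the full value."""
    return FREQUENT[payload] if is_encoded else payload
```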
Journal ArticleDOI
IATAC: a smart predictor to turn-off L2 cache lines
TL;DR: This paper introduces IATAC (inter-access time per access count), a new hardware technique to reduce cache leakage for L2 caches that outperforms all previous state-of-the-art techniques.
Journal ArticleDOI
Improving memory encryption performance in secure processors
Jun Yang, Lan Gao, Youtao Zhang +2 more
TL;DR: In this article, a pseudo-one-time pad encryption scheme was proposed to produce the instructions and data ciphertext in parallel with memory accesses, minimizing the trade-off between storage size and performance penalty.
References
Journal ArticleDOI
CACTI: an enhanced cache access and cycle time model
TL;DR: In this paper, an analytical model for the access and cycle times of on-chip direct-mapped and set-associative caches is presented, where the inputs to the model are the cache size, block size, and associativity, as well as array organization and process parameters.
Journal ArticleDOI
The Mips R10000 superscalar microprocessor
TL;DR: The Mips R10000 is a dynamic, superscalar microprocessor that implements the 64-bit MIPS IV instruction set architecture; it fetches and decodes four instructions per cycle and dynamically issues them to five fully-pipelined, low-latency execution units.
Proceedings ArticleDOI
Way-predicting set-associative cache for high performance and low energy consumption
TL;DR: In this paper, a new approach using way prediction for achieving high performance and low energy consumption of set-associative caches is proposed, where only a single cache way is accessed, instead of accessing all the ways in a set.
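The way-prediction scheme described above can be sketched with a per-set most-recently-used predictor: a lookup activates only the predicted way, and the remaining ways are probed only when the prediction misses. This is a hedged sketch under assumed details; the MRU predictor, fill policy, and counters are illustrative, not the paper's exact mechanism.

```python
class WayPredictedCache:
    """Illustrative way-predicting set-associative cache: each set
    remembers its most-recently-used way as the prediction."""

    def __init__(self, num_sets, num_ways):
        self.num_sets, self.num_ways = num_sets, num_ways
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.mru = [0] * num_sets       # predicted way per set
        self.first_probe_hits = 0       # hits served by a single way probe

    def lookup(self, addr):
        s, tag = addr % self.num_sets, addr // self.num_sets
        pred = self.mru[s]
        if self.tags[s][pred] == tag:   # prediction correct:
            self.first_probe_hits += 1  # only one way was activated
            return True
        for w in range(self.num_ways):  # fall back: probe the other ways
            if w != pred and self.tags[s][w] == tag:
                self.mru[s] = w         # retrain the predictor
                return True
        self.tags[s][pred] = tag        # miss: fill the predicted way
        self.mru[s] = pred
        return False
```

When prediction succeeds, only one way's tag and data arrays are read, which is where the energy saving comes from.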
Proceedings ArticleDOI
Column-associative caches: a technique for reducing the miss rate of direct-mapped caches
Anant Agarwal, Stephen D. Pudar +1 more
TL;DR: This paper describes the design of column-associative caches, which minimize the conflicts that arise in direct-mapped accesses by allowing conflicting addresses to dynamically choose alternate hashing functions, so that most of the conflicting data can reside in the cache.
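The alternate-hashing idea can be sketched as a two-probe direct-mapped cache: a miss in the primary location is retried at a second location derived from the first (here by flipping the high index bit, one common choice). The class name, the rehash function, and the swap-on-second-hit policy below are illustrative assumptions, not the paper's exact design.

```python
class ColumnAssociativeCache:
    """Illustrative column-associative cache: direct-mapped, with a
    second probe at a rehashed index on a first-probe miss."""

    def __init__(self, num_sets):   # num_sets must be a power of two
        self.num_sets = num_sets
        self.lines = [None] * num_sets  # store full addresses for simplicity

    def _rehash(self, idx):
        return idx ^ (self.num_sets >> 1)  # flip the high index bit

    def lookup(self, addr):
        idx = addr % self.num_sets
        if self.lines[idx] == addr:
            return True                    # first-probe hit
        alt = self._rehash(idx)
        if self.lines[alt] == addr:
            # Second-probe hit: swap so the next access hits on probe one.
            self.lines[idx], self.lines[alt] = self.lines[alt], self.lines[idx]
            return True
        self.lines[alt] = self.lines[idx]  # displace primary to alternate
        self.lines[idx] = addr             # fill primary location
        return False
```

Two addresses that conflict in a plain direct-mapped cache can thus coexist, one in each of the paired locations.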
Journal ArticleDOI
Cache performance of operating system and multiprogramming workloads
TL;DR: A program tracing technique called ATUM (Address Tracing Using Microcode) is developed that captures realistic traces of multitasking workloads, including the operating system; analysis of these traces shows that both operating system and multiprogramming activity significantly degrade cache performance, with an even greater proportional impact on large caches.