Proceedings ArticleDOI

L1 data cache decomposition for energy efficiency

TLDR
In this paper, a new L1 data cache structure that combines a Specialized Stack Cache (SSC) and a Pseudo Set-Associative Cache (PSAC) is proposed.
Abstract
The L1 data cache is a time-critical module and, at the same time, a major consumer of energy. To reduce its energy-delay product, we apply two principles of low-power design: specialize part of the cache structure and break the cache down into smaller caches. To this end, we propose a new L1 data cache structure that combines a Specialized Stack Cache (SSC) and a Pseudo Set-Associative Cache (PSAC). Individually, our SSC and PSAC designs have a lower energy-delay product than previously proposed related designs. In addition, their combined operation is very effective. Relative to a conventional 2-way 32 KB data cache, a design containing a 4-way 32 KB PSAC and a 512 B SSC reduces the energy-delay product of several applications by an average of 44%.
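The energy saving of a pseudo set-associative cache comes from probing one way at a time instead of reading all ways in parallel. A minimal simulation sketch of that idea (a simplified sequential-probe lookup, not the paper's exact PSAC design; all names and the fill policy are hypothetical):

```python
# Sketch of a pseudo set-associative lookup: ways are probed one at a
# time, so a hit in the first probed way reads only one way's worth of
# data instead of all ways in parallel.
class PseudoSetAssociativeCache:
    def __init__(self, num_sets, num_ways):
        self.num_sets = num_sets
        self.num_ways = num_ways
        # tags[set][way] holds the stored tag, or None if invalid
        self.tags = [[None] * num_ways for _ in range(num_sets)]

    def access(self, addr):
        """Probe ways sequentially; return (hit, ways_probed)."""
        idx = addr % self.num_sets
        tag = addr // self.num_sets
        for way in range(self.num_ways):
            if self.tags[idx][way] == tag:
                return True, way + 1          # hit after (way + 1) probes
        # Miss: fill the first invalid way (no real replacement policy here)
        for way in range(self.num_ways):
            if self.tags[idx][way] is None:
                self.tags[idx][way] = tag
                break
        else:
            self.tags[idx][0] = tag           # trivial eviction
        return False, self.num_ways

cache = PseudoSetAssociativeCache(num_sets=8, num_ways=4)
cache.access(0x40)                 # cold miss: all ways probed, block filled
hit, probes = cache.access(0x40)   # repeat access hits on the first probe
```

The energy trade-off is visible in `ways_probed`: a first-probe hit costs one way access, while a miss costs as many probes as a conventional parallel read, plus extra latency.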



Citations
Proceedings ArticleDOI

A highly configurable cache architecture for embedded systems

TL;DR: This work introduces a novel cache architecture intended for embedded microprocessor platforms that can be configured by software as direct-mapped, two-way, or four-way set-associative, using a technique the authors call way concatenation, with very little size or performance overhead.
Book

Computer Architecture Techniques for Power-Efficiency

TL;DR: This book aims to document some of the most important architectural techniques that were invented, proposed, and applied to reduce both dynamic power and static power dissipation in processors and memory hierarchies by focusing on their common characteristics.
Proceedings ArticleDOI

Energy efficient Frequent Value data Cache design

TL;DR: This paper proposes the design of the Frequent Value Cache (FVC), a cache in which storing a frequent value requires only a few bits, as frequent values are stored in encoded form while all other values are stored in unencoded form using 32 bits.
Journal ArticleDOI

IATAC: a smart predictor to turn-off L2 cache lines

TL;DR: This paper introduces IATAC (inter-access time per access count), a new hardware technique to reduce cache leakage for L2 caches that outperforms all previous state-of-the-art techniques.
Journal ArticleDOI

Improving memory encryption performance in secure processors

TL;DR: In this article, a pseudo-one-time pad encryption scheme was proposed to produce the instructions and data ciphertext in parallel with memory accesses, minimizing the trade-off between storage size and performance penalty.
References
Journal ArticleDOI

CACTI: an enhanced cache access and cycle time model

TL;DR: In this paper, an analytical model for the access and cycle times of on-chip direct-mapped and set-associative caches is presented, where the inputs to the model are the cache size, block size, and associativity, as well as array organization and process parameters.
Journal ArticleDOI

The Mips R10000 superscalar microprocessor

K.C. Yeager
- 01 Apr 1996 - 
TL;DR: The Mips R10000 is a dynamic, superscalar microprocessor that implements the 64-bit Mips 4 instruction set architecture that fetches and decodes four instructions per cycle and dynamically issues them to five fully-pipelined, low-latency execution units.
Proceedings ArticleDOI

Way-predicting set-associative cache for high performance and low energy consumption

TL;DR: In this paper, a new approach using way prediction for achieving high performance and low energy consumption of set-associative caches is proposed, where only a single cache way is accessed, instead of accessing all the ways in a set.
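A hedged sketch of the way-prediction idea (an MRU-based predictor is one common choice; this simplification and all names are assumptions, not the paper's exact design):

```python
# Sketch of way prediction: each set remembers a predicted (MRU) way.
# On access, only that way is probed first; the remaining ways are
# checked only on a misprediction, costing an extra cycle but still
# reading fewer ways than a conventional parallel lookup on average.
class WayPredictedCache:
    def __init__(self, num_sets, num_ways):
        self.num_sets, self.num_ways = num_sets, num_ways
        self.tags = [[None] * num_ways for _ in range(num_sets)]
        self.mru = [0] * num_sets            # predicted way per set

    def access(self, addr):
        idx, tag = addr % self.num_sets, addr // self.num_sets
        predicted = self.mru[idx]
        if self.tags[idx][predicted] == tag:
            return "hit", 1                  # first-probe hit: one way read
        for way in range(self.num_ways):     # mispredict: probe other ways
            if way != predicted and self.tags[idx][way] == tag:
                self.mru[idx] = way          # train the predictor
                return "hit", 2              # extra cycle, but still a hit
        self.tags[idx][predicted] = tag      # trivial fill on miss
        return "miss", 2
```

A repeat access to the same block hits in the predicted way and touches only one way's tag and data arrays.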
Proceedings ArticleDOI

Column-associative caches: a technique for reducing the miss rate of direct-mapped caches

TL;DR: This paper describes the design of column-associative caches, which minimize the conflicts that arise in direct-mapped accesses by allowing conflicting addresses to dynamically choose alternate hashing functions, so that most of the conflicting data can reside in the cache.
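The alternate-hashing idea can be sketched with two probes into one direct-mapped array (a simplified model with an assumed bit-flip rehash and fill policy, not the paper's full design):

```python
# Sketch of a column-associative lookup: a direct-mapped array is
# probed with a primary index, then with an alternate index (here:
# the top index bit flipped) on a first-probe miss, so conflicting
# addresses can coexist in the two slots.
def make_cache(num_sets):
    return [None] * num_sets                 # each entry holds a block address

def probe(cache, addr):
    n = len(cache)
    first = addr % n                          # primary hash: low index bits
    alt = first ^ (n >> 1)                    # alternate hash: flip top bit
    if cache[first] == addr:
        return "first-hit"
    if cache[alt] == addr:
        # Swap so the hot block moves to its primary slot next time.
        cache[first], cache[alt] = cache[alt], cache[first]
        return "rehash-hit"
    slot = first if cache[first] is None else alt   # simple fill policy
    cache[slot] = addr
    return "miss"

c = make_cache(8)
probe(c, 3)     # cold miss, fills the primary slot
probe(c, 11)    # conflicts with 3 (same primary index), fills the alternate slot
probe(c, 11)    # rehash hit: found via the alternate hash, then promoted
```

After the rehash hit, the swap places address 11 in its primary slot, so the next access to it is a first-probe hit.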
Journal ArticleDOI

Cache performance of operating system and multiprogramming workloads

TL;DR: A program tracing technique called ATUM (Address Tracing Using Microcode) is developed that captures realistic traces of multitasking workloads including the operating system that shows that both the operating System and multiprogramming activity significantly degrade cache performance, with an even greater proportional impact on large caches.