scispace - formally typeset
Open Access

CACTI 6.0: A Tool to Model Large Caches

TLDR
This report details the analytical model assumed for the newly added modules along with their validation analysis of CACTI 6.0, a significantly enhanced version of the tool that primarily focuses on interconnect design for large caches.
Abstract
© CACTI 6.0: A Tool to Model Large Caches Naveen Muralimanohar, Rajeev Balasubramonian, Norman P. Jouppi HP Laboratories HPL-2009-85 No keywords available. Future processors will likely have large on-chip caches with a possibility of dedicating an entire die for on-chip storage in a 3D stacked design. With the ever growing disparity between transistor and wire delay, the properties of such large caches will primarily depend on the characteristics of the interconnection networks that connect various sub-modules of a cache. CACTI 6.0 is a significantly enhanced version of the tool that primarily focuses on interconnect design for large caches. In addition to strengthening the existing analytical model of the tool for dominant cache components, CACTI 6.0 includes two major extensions over earlier versions: first, the ability to model Non-Uniform Cache Access (NUCA), and second, the ability to model different types of wires, such as RC based wires with different power, delay, and area characteristics and differential low-swing buses. This report details the analytical model assumed for the newly added modules along with their validation analysis. External Posting Date: April 21, 2009 [Fulltext] Approved for External Publication Internal Posting Date: April 21, 2009 [Fulltext] Published in International Symposium on Microarchitecture, Chicago, Dec 2007. Copyright International Symposium on Microarchitecture, 2007. CACTI 6.0: A Tool to Model Large Caches Naveen Muralimanohar, Rajeev Balasubramonian, Norman P. Jouppi † School of Computing, University of Utah ‡ Hewlett-Packard Laboratories Abstract Future processors will likely have large on-chip caches with a possibility of dedicating an entire die for on-chip storage in a 3D stacked design. With the ever growing disparity between transistor and wire delay, the properties of such large caches will primarily depend on the characteristics of the interconnection networks that connect various sub-modules of a cache. CACTI 6.0 is a significantly enhanced version of the tool that primarily focuses on interconnect design for large caches. In addition to strengthening the existing analytical model of the tool for dominant cache components, CACTI 6.0 includes two major extensions over earlier versions: first, the ability to model Non-Uniform Cache Access (NUCA), and second, the ability to model different types of wires, such as RC based wires with different power, delay, and area characteristics and differential low-swing buses. This report details the analytical model assumed for the newly added modules along with their validation analysis.Future processors will likely have large on-chip caches with a possibility of dedicating an entire die for on-chip storage in a 3D stacked design. With the ever growing disparity between transistor and wire delay, the properties of such large caches will primarily depend on the characteristics of the interconnection networks that connect various sub-modules of a cache. CACTI 6.0 is a significantly enhanced version of the tool that primarily focuses on interconnect design for large caches. In addition to strengthening the existing analytical model of the tool for dominant cache components, CACTI 6.0 includes two major extensions over earlier versions: first, the ability to model Non-Uniform Cache Access (NUCA), and second, the ability to model different types of wires, such as RC based wires with different power, delay, and area characteristics and differential low-swing buses. This report details the analytical model assumed for the newly added modules along with their validation analysis.

read more

Citations
More filters
Journal ArticleDOI

EIE: efficient inference engine on compressed deep neural network

TL;DR: In this paper, the authors proposed an energy efficient inference engine (EIE) that performs inference on a compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing.
Proceedings ArticleDOI

Relaxing non-volatility for fast and energy-efficient STT-RAM caches

TL;DR: It is found that a pure STT-RAM cache hierarchy provides the best energy efficiency, though a hybrid design of SRAM-based L1 caches with reduced-retention STt-RAM L2 and L3 caches eliminates performance loss while still reducing the energy-delay product by more than 70%.
Proceedings ArticleDOI

PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture

TL;DR: In this article, the authors propose a new PIM architecture that does not change the existing sequential programming models and automatically decides whether to execute PIM operations in memory or processors depending on the locality of data.
Proceedings ArticleDOI

DRISA: a DRAM-based Reconfigurable In-Situ Accelerator

TL;DR: DRISA, a DRAM-based Reconfigurable In-Situ Accelerator architecture, is proposed to provide both powerful computing capability and large memory capacity/bandwidth to address the memory wall problem in traditional von Neumann architecture.
Posted Content

EIE: Efficient Inference Engine on Compressed Deep Neural Network

TL;DR: An energy efficient inference engine (EIE) that performs inference on this compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing and is 189x and 13x faster when compared to CPU and GPU implementations of the same DNN without compression.
References
More filters
Book

Principles and Practices of Interconnection Networks

TL;DR: This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
Journal ArticleDOI

The future of wires

TL;DR: Wires that shorten in length as technologies scale have delays that either track gate delays or grow slowly relative to gate delays, which is good news since these "local" wires dominate chip wiring.
Proceedings ArticleDOI

An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches

TL;DR: This paper proposes physical designs for these Non-Uniform Cache Architectures (NUCAs) and extends these physical designs with logical policies that allow important data to migrate toward the processor within the same level of the cache.
Proceedings ArticleDOI

Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0

TL;DR: This work implements two major extensions to the CACTI cache modeling tool that focus on interconnect design for a large cache, and adopts state-of-the-art design space exploration strategies for non-uniform cache access (NUCA).
Related Papers (5)