CACTI 6.0: A Tool to Model Large Caches
Naveen Muralimanohar†, Rajeev Balasubramonian†, Norman P. Jouppi‡
†School of Computing, University of Utah    ‡Hewlett-Packard Laboratories
HP Laboratories report HPL-2009-85. External and internal posting date: April 21, 2009; approved for external publication.
Published in the International Symposium on Microarchitecture, Chicago, December 2007. Copyright International Symposium on Microarchitecture, 2007.

Abstract
Future processors will likely have large on-chip caches, with the possibility of dedicating an entire die to on-chip storage in a 3D stacked design. With the ever-growing disparity between transistor and wire delay, the properties of such large caches will primarily depend on the characteristics of the interconnection networks that connect the various sub-modules of a cache. CACTI 6.0 is a significantly enhanced version of the tool that primarily focuses on interconnect design for large caches. In addition to strengthening the tool's existing analytical model of the dominant cache components, CACTI 6.0 includes two major extensions over earlier versions: first, the ability to model Non-Uniform Cache Access (NUCA) caches, and second, the ability to model different types of wires, such as RC-based wires with different power, delay, and area characteristics, and differential low-swing buses. This report details the analytical model assumed for the newly added modules, along with their validation analysis.
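The two extensions above rest on familiar building blocks: a repeated RC wire model and a bank-grid latency model. The snippet below is a minimal sketch of that flavor of calculation written for this summary; it is not CACTI 6.0's actual code or equations, and every numeric parameter (R_PER_MM, C_PER_MM, R_DRV, C_GATE, C_PAR, the 4x4 bank grid, the bank pitch and bank access time) is an illustrative assumption rather than a value taken from the report.

```python
# Hypothetical sketch (assumed values, not CACTI's code): estimate the delay of a
# repeated RC wire with a simple Elmore segment model, then use that per-mm figure
# to show how access latency becomes non-uniform across the banks of a NUCA grid.

# Illustrative 65 nm-class parameters (assumptions; CACTI derives its own wire
# parasitics and device parameters from technology projections).
R_PER_MM = 400.0        # wire resistance per mm (ohm)
C_PER_MM = 200.0e-15    # wire capacitance per mm (F)
R_DRV    = 10.0e3       # output resistance of a unit-size repeater (ohm)
C_GATE   = 0.2e-15      # gate capacitance of a unit-size repeater (F)
C_PAR    = 0.2e-15      # drain parasitic capacitance of a unit-size repeater (F)

def repeated_wire_delay_per_mm(size, spacing_mm):
    """Elmore delay (s/mm) of a wire split into identical repeater segments.

    A repeater of the given size has output resistance R_DRV/size and presents
    size*(C_GATE + C_PAR) of capacitance to the previous segment.
    """
    r_drv = R_DRV / size
    c_load = C_PER_MM * spacing_mm + size * (C_GATE + C_PAR)
    # Driver charging the segment's wire capacitance plus the next stage.
    t_driver = 0.69 * r_drv * c_load
    # Distributed wire RC plus wire resistance driving the far-end gate.
    t_wire = (0.38 * R_PER_MM * spacing_mm * C_PER_MM * spacing_mm
              + 0.69 * R_PER_MM * spacing_mm * size * C_GATE)
    return (t_driver + t_wire) / spacing_mm

def best_repeater_config(sizes=range(10, 200, 10),
                         spacings=(0.25, 0.5, 1.0, 2.0)):
    """Brute-force the (delay, size, spacing) triple with the lowest delay."""
    return min((repeated_wire_delay_per_mm(s, h), s, h)
               for s in sizes for h in spacings)

delay_per_mm, size, spacing = best_repeater_config()
print(f"~{delay_per_mm * 1e12:.0f} ps/mm with size-{size} repeaters "
      f"every {spacing} mm")

# NUCA flavor: latency grows with the Manhattan distance (in bank hops) from the
# cache controller, assumed to sit at bank (0, 0) of a 4x4 grid. Router and
# arbitration delays are ignored in this sketch.
BANK_PITCH_MM = 4.0     # assumed bank dimension along the routing path
BANK_ACCESS_S = 0.5e-9  # assumed fixed latency inside a bank
for y in range(4):
    row = [BANK_ACCESS_S + delay_per_mm * BANK_PITCH_MM * (x + y)
           for x in range(4)]
    print("  ".join(f"{t * 1e9:4.2f} ns" for t in row))
```

The tool itself goes much further, deriving wire parasitics from technology projections, modeling low-swing differential signaling, and accounting for network components in its NUCA design-space exploration; the sketch only illustrates why per-mm wire delay and bank placement dominate the latency of a large cache.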
Citations
Journal Article
EIE: efficient inference engine on compressed deep neural network
TL;DR: In this paper, the authors proposed an energy efficient inference engine (EIE) that performs inference on a compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing.
Proceedings Article
Relaxing non-volatility for fast and energy-efficient STT-RAM caches
TL;DR: It is found that a pure STT-RAM cache hierarchy provides the best energy efficiency, though a hybrid design of SRAM-based L1 caches with reduced-retention STT-RAM L2 and L3 caches eliminates performance loss while still reducing the energy-delay product by more than 70%.
Proceedings Article
PIM-enabled instructions: a low-overhead, locality-aware processing-in-memory architecture
TL;DR: In this article, the authors propose a new PIM architecture that does not change the existing sequential programming models and automatically decides whether to execute PIM operations in memory or processors depending on the locality of data.
Proceedings Article
DRISA: a DRAM-based Reconfigurable In-Situ Accelerator
TL;DR: DRISA, a DRAM-based Reconfigurable In-Situ Accelerator architecture, is proposed to provide both powerful computing capability and large memory capacity/bandwidth to address the memory wall problem in traditional von Neumann architecture.
Posted Content
EIE: Efficient Inference Engine on Compressed Deep Neural Network
TL;DR: An energy-efficient inference engine (EIE) is proposed that performs inference on a compressed network model and accelerates the resulting sparse matrix-vector multiplication with weight sharing; it is 189x and 13x faster than CPU and GPU implementations of the same DNN without compression.
References
Book
Principles and Practices of Interconnection Networks
William J. Dally, Brian Towles
TL;DR: This book offers a detailed and comprehensive presentation of the basic principles of interconnection network design, clearly illustrating them with numerous examples, chapter exercises, and case studies, allowing a designer to see all the steps of the process from abstract design to concrete implementation.
Journal Article
The future of wires
R. Ho, Ken Mai, Mark Horowitz
TL;DR: Wires that shorten in length as technologies scale have delays that either track gate delays or grow slowly relative to gate delays, which is good news since these "local" wires dominate chip wiring.
Proceedings Article
An adaptive, non-uniform cache structure for wire-delay dominated on-chip caches
TL;DR: This paper proposes physical designs for these Non-Uniform Cache Architectures (NUCAs) and extends these physical designs with logical policies that allow important data to migrate toward the processor within the same level of the cache.
Proceedings Article
Optimizing NUCA Organizations and Wiring Alternatives for Large Caches with CACTI 6.0
TL;DR: This work implements two major extensions to the CACTI cache modeling tool that focus on interconnect design for a large cache, and adopts state-of-the-art design space exploration strategies for non-uniform cache access (NUCA).