Write-variation aware alternatives to replace SRAM buffers with non-volatile buffers in on-chip interconnects

doi:10.1049/IET-CDT.2019.0039

Journal Article•DOI•

01 Nov 2019-IET Computers and Digital Techniques (The Institution of Engineering and Technology)-Vol. 13, Iss: 6, pp 481-492

TL;DR: VC allocation policies are presented that reduce the leakage power consumption of NoC buffers by using non-volatile spin transfer torque random access memory (STT-RAM)-based buffers and improve lifetime by 3.2 times and 1093 times, respectively.

Abstract: With the advancement in CMOS technology and multiple processors on the chip, communication across these cores is managed by a network-on-chip (NoC). Power and performance of these NoC interconnects have become a significant factor. The authors aim to reduce the leakage power consumption of NoC buffers by the use of non-volatile spin transfer torque random access memory (STT-RAM)-based buffers. STT-RAM technology has the advantages of high density and low leakage but suffers from low endurance. This low endurance affects the lifetime of the router as a whole because of unwanted write variations governed by virtual channel (VC) allocation policies. Here, various VC allocation policies that help distribute the writes uniformly across the buffers are proposed. Iso-capacity and iso-area-based alternatives to replace SRAM buffers with STT-RAM buffers are also presented. Pure STT-RAM buffers, however, impact the network latency. To mitigate this, a hybrid variant of the proposed policies, which uses alternative VCs made of SRAM technology in the case of heavy network traffic, is proposed. Experimental evaluation using full-system simulation shows that the proposed policies reduce the write variation by 99% and improve lifetime by 3.2 times and 1093 times, respectively. A 55.5% gain in the energy-delay product is also obtained.
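
The idea of spreading writes uniformly across VC buffers can be illustrated with a minimal sketch. It is not the authors' policy: the function names, the "least-written free VC" rule, the one-packet-at-a-time traffic model, and the use of the coefficient of variation as the write-variation metric are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def allocate_first_free(busy, writes):
    """Baseline (assumed): pick the lowest-indexed free VC."""
    for vc, is_busy in enumerate(busy):
        if not is_busy:
            return vc
    return None


def allocate_least_written(busy, writes):
    """Wear-aware (assumed): pick the free VC with the fewest writes so far."""
    free = [vc for vc, is_busy in enumerate(busy) if not is_busy]
    return min(free, key=lambda vc: writes[vc]) if free else None


def write_variation(writes):
    """Coefficient of variation of per-VC write counts (0.0 means uniform)."""
    avg = mean(writes)
    return pstdev(writes) / avg if avg else 0.0


def simulate(allocator, num_vcs=4, packets=1000, flits_per_packet=5):
    """Light-traffic model: each packet drains before the next one arrives,
    so every VC is free at allocation time."""
    writes = [0] * num_vcs
    busy = [False] * num_vcs
    for _ in range(packets):
        vc = allocator(busy, writes)
        writes[vc] += flits_per_packet  # one buffer write per flit
    return writes


if __name__ == "__main__":
    for policy in (allocate_first_free, allocate_least_written):
        counts = simulate(policy)
        print(policy.__name__, counts, round(write_variation(counts), 3))
```

Under this toy traffic model the baseline sends every write to one VC, while the wear-aware rule spreads the writes evenly. Since the most-written buffer wears out first, lowering the peak write count is what extends router lifetime.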

Citations

Proceedings Article•DOI•

DidaSel: dirty data based selection of VC for effective utilization of NVM buffers in on-chip interconnects

[...]

Khushboo Rani¹, Sukarn Agarwal¹, Hemangee K. Kapoor¹•Institutions (1)

Indian Institute of Technology Guwahati¹

10 Aug 2020

TL;DR: A write reduction technique based on the dirty flits present in write-back data packets is proposed; it significantly decreases total and dynamic network power consumption and shows remarkable improvement in the lifetime.

Abstract: In a multi-core system, communication across cores is managed by an on-chip interconnect called Network-on-Chip (NoC). The utilization of NoC results in limitations such as high communication delay and high network power consumption. The buffers of the NoC router consume a considerable amount of leakage power. This paper attempts to reduce leakage power consumption by using Non-Volatile Memory (NVM) technology-based buffers. NVM technology has the advantage of higher density and low leakage but suffers from costly write operations and weaker write endurance. These characteristics affect the total network power consumption, network latency, and lifetime of the router as a whole. In this paper, we propose a write reduction technique, which is based on dirty flits present in write-back data packets. The method also suggests a dirty flit based Virtual Channel (VC) allocation technique that distributes writes in NVM technology-based VCs to improve the lifetime of NVM buffers. The experimental evaluation on the full system simulator shows that the proposed policy obtains a 53% reduction in write-back flits, which results in 27% fewer total network flits on average. All of this results in a significant decrease in total and dynamic network power consumption. The policy also shows remarkable improvement in the lifetime.
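
One way to picture a dirty-flit-based write reduction is sketched below. The abstract does not give the mechanism, so this reading is an assumption: a write-back packet carries a per-flit dirty mask and only the head flit plus the dirty data flits are injected. The names `build_writeback_packet` and `flits_saved` are hypothetical.

```python
def build_writeback_packet(head, data_flits, dirty_mask):
    """Return the flits actually injected: the head flit plus (index, payload)
    pairs for dirty data flits. Clean flits are dropped (assumed behaviour)."""
    if len(data_flits) != len(dirty_mask):
        raise ValueError("dirty_mask must have one entry per data flit")
    dirty = [(i, flit)
             for i, (flit, is_dirty) in enumerate(zip(data_flits, dirty_mask))
             if is_dirty]
    return [head] + dirty


def flits_saved(data_flits, dirty_mask):
    """Number of flits (and hence buffer writes per hop) avoided for one packet."""
    return len(data_flits) - sum(1 for is_dirty in dirty_mask if is_dirty)


if __name__ == "__main__":
    data = [f"word{i}" for i in range(8)]       # 8 data flits in a cache block
    mask = [True, False, False, True, False, False, False, False]
    packet = build_writeback_packet("HEAD", data, mask)
    print(len(packet), "flits sent instead of", 1 + len(data))
    print(flits_saved(data, mask), "buffer writes avoided per hop")
```

Each dropped flit avoids one buffer write at every router on the path, which is where both the power and the endurance benefit would come from in this reading.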

2 citations

Cites background from "Write-variation aware alternatives ..."

...[16, 17] presented wear-leveling techniques to remove unwanted write variation in NVM based buffers to improve the lifetime of NoC routers....
[...]

Journal Article•DOI•

Investigating Frequency Scaling, Nonvolatile, and Hybrid Memory Technologies for On-Chip Routers to Support the Era of Dark Silicon

[...]

Khushboo Rani¹, Hemangee K. Kapoor¹•Institutions (1)

Indian Institute of Technology Guwahati¹

01 Apr 2021-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This article proposes keeping the routers always powered ON to maintain constant connectivity and investigates approaches that combine SRAM and nonvolatile spin-transfer torque random access memory-based VCs in the routers, which yield significant energy savings while maintaining connectivity.

Abstract: In the era of dark silicon, several components on the chip [i.e., cores, memory, and network on chip (NoC)] need to be powered off or run in low-power mode. This is mainly due to the increased leakage power consumption at smaller technology nodes. Besides the power consumed by cores and caches, the power and performance of the interconnects are a significant factor, as the communication network consumes a considerable share of the power budget. In particular, the buffers used at every port of the NoC router consume considerable dynamic as well as static power. To support dark silicon and save energy, a popular approach is to power off the routers and wake them up when needed. However, this affects the packet latency, and the traffic through the nodes must be observed to decide when to turn the routers ON or OFF. In this article, we propose to keep the routers always powered ON to maintain constant connectivity and investigate various approaches. One proposal is to frequency scale the routers connected to powered-OFF nodes, and the other proposals are to use a combination of SRAM and nonvolatile spin-transfer torque random access memory-based VCs in the routers. By managing which VCs are active at a given time, we achieve energy savings. The proposals are evaluated by varying the percentage of dark nodes on the chip. The experimental results show that all proposals yield significant energy savings while maintaining connectivity.

1 citation

Cites background from "Write-variation aware alternatives ..."

...Rani and Kapoor [44], [45] presented wear-leveling techniques to remove unwanted write variation in NVM-based buffers to improve the lifetime of NoC routers....
[...]

References

Journal Article•DOI•

The gem5 simulator

[...]

Nathan Binkert¹, Bradford M. Beckmann², Gabriel Black³, Steven K. Reinhardt², Ali G. Saidi, Arkaprava Basu⁴, Joel Hestness⁵, Derek R. Hower⁴, Tushar Krishna⁶, Somayeh Sardashti⁴, Rathijit Sen⁴, Korey Sewell⁷, Muhammad Shoaib⁴, Nilay Vaish⁴, Mark D. Hill⁴, David A. Wood⁴•Institutions (7)

Hewlett-Packard¹, Advanced Micro Devices², Google³, University of Wisconsin-Madison⁴, University of Texas at Austin⁵, Massachusetts Institute of Technology⁶, University of Michigan⁷

31 Aug 2011-ACM SIGARCH Computer Architecture News

TL;DR: The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, make gem5 a valuable full-system simulation tool.

Abstract: The gem5 simulation infrastructure is the merger of the best aspects of the M5 [4] and GEMS [9] simulators. M5 provides a highly configurable simulation framework, multiple ISAs, and diverse CPU models. GEMS complements these features with a detailed and flexible memory system, including support for multiple cache coherence protocols and interconnect models. Currently, gem5 supports most commercial ISAs (ARM, ALPHA, MIPS, Power, SPARC, and x86), including booting Linux on three of them (ARM, ALPHA, and x86). The project is the result of the combined efforts of many academic and industrial institutions, including AMD, ARM, HP, MIPS, Princeton, MIT, and the Universities of Michigan, Texas, and Wisconsin. Over the past ten years, M5 and GEMS have been used in hundreds of publications and have been downloaded tens of thousands of times. The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, make gem5 a valuable full-system simulation tool.

4,039 citations

Journal Article•DOI•

SPEC CPU2006 benchmark descriptions

[...]

John L. Henning¹•Institutions (1)

Sun Microsystems¹

01 Sep 2006-ACM SIGARCH Computer Architecture News

TL;DR: On August 24, 2006, the Standard Performance Evaluation Corporation (SPEC) announced CPU2006, which replaces CPU2000, and the SPEC CPU benchmarks are widely used in both industry and academia.

Abstract: On August 24, 2006, the Standard Performance Evaluation Corporation (SPEC) announced CPU2006 [2], which replaces CPU2000. The SPEC CPU benchmarks are widely used in both industry and academia [3].

1,864 citations

Journal Article•DOI•

Dark Silicon and the End of Multicore Scaling

[...]

Hadi Esmaeilzadeh¹, Emily Blem², R. St. Amant³, Karthikeyan Sankaralingam², Doug Burger⁴•Institutions (4)

University of Washington¹, University of Wisconsin-Madison², University of Texas at Austin³, Microsoft⁴

01 May 2012-IEEE Micro

TL;DR: A comprehensive study that projects the speedup potential of future multicores and examines the underutilization of integration capacity (dark silicon) is timely and crucial.

Abstract: A key question for the microprocessor research and design community is whether scaling multicores will provide the performance and value needed to scale down many more technology generations. To provide a quantitative answer to this question, a comprehensive study that projects the speedup potential of future multicores and examines the underutilization of integration capacity (dark silicon) is timely and crucial.

1,556 citations

Journal Article•DOI•

NVSim: A Circuit-Level Performance, Energy, and Area Model for Emerging Nonvolatile Memory

[...]

Xiangyu Dong¹, Cong Xu², Yuan Xie², Norman P. Jouppi³•Institutions (3)

Qualcomm¹, Pennsylvania State University², Hewlett-Packard³

01 Jul 2012-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: NVSim is developed, a circuit-level model for NVM performance, energy, and area estimation, which supports various NVM technologies, including STT-RAM, PCRAM, ReRAM, and legacy NAND Flash and is expected to help boost architecture-level NVM-related studies.

Abstract: Various new nonvolatile memory (NVM) technologies have emerged recently. Among all the investigated new NVM candidate technologies, spin-torque-transfer memory (STT-RAM, or MRAM), phase-change random-access memory (PCRAM), and resistive random-access memory (ReRAM) are regarded as the most promising candidates. As the ultimate goal of this NVM research is to deploy them into multiple levels in the memory hierarchy, it is necessary to explore the wide NVM design space and find the proper implementation at different memory hierarchy levels from highly latency-optimized caches to highly density-optimized secondary storage. While abundant tools are available as SRAM/DRAM design assistants, similar tools for NVM designs are currently missing. Thus, in this paper, we develop NVSim, a circuit-level model for NVM performance, energy, and area estimation, which supports various NVM technologies, including STT-RAM, PCRAM, ReRAM, and legacy NAND Flash. NVSim is successfully validated against industrial NVM prototypes, and it is expected to help boost architecture-level NVM-related studies.

1,100 citations

Journal Article•DOI•

The Raw microprocessor: a computational fabric for software circuits and general-purpose programs

[...]

Michael Taylor¹, Jung Hun Kim¹, Jason Miller¹, David Wentzlaff¹, Fae Ghodrat¹, Ben Greenwald¹, Henry Hoffman¹, Paul Johnson¹, Jae-Wook Lee¹, Woo Sik Lee¹, A. Ma¹, Arvind Saraf¹, M. Seneski¹, Nathan Shnidman¹, Volker Strumpen¹, Matthew I. Frank¹, Saman Amarasinghe¹, Anant Agarwal¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Mar 2002-IEEE Micro

TL;DR: The Raw microprocessor research prototype uses a scalable instruction set architecture to attack the emerging wire-delay problem by providing a parallel, software interface to the gate, wire and pin resources of the chip.

Abstract: Wire delay is emerging as the natural limiter to microprocessor scalability. A new architectural approach could solve this problem, as well as deliver unprecedented performance, energy efficiency and cost effectiveness. The Raw microprocessor research prototype uses a scalable instruction set architecture to attack the emerging wire-delay problem by providing a parallel, software interface to the gate, wire and pin resources of the chip. An architecture that has direct, first-class analogs to all of these physical resources will ultimately let programmers achieve the maximum amount of performance and energy efficiency in the face of wire delay.

1,087 citations