Proceedings ArticleDOI
PIMA-logic: a novel processing-in-memory architecture for highly flexible and energy-efficient logic computation
Shaahin Angizi, Zhezhi He, Deliang Fan +2 more
- pp 162
TLDR
This paper proposes PIMA-Logic, a novel Processing-in-Memory Architecture for highly flexible and efficient Logic computation that exploits a hardware-friendly approach to implement Boolean logic functions between operands either located in the same row or the same column within entire memory arrays.
Abstract:
In this paper, we propose PIMA-Logic, a novel Processing-in-Memory Architecture for highly flexible and efficient Logic computation. Instead of integrating complex logic units in cost-sensitive memory, PIMA-Logic exploits a hardware-friendly approach to implement Boolean logic functions between operands located either in the same row or in the same column within entire memory arrays. Furthermore, it can efficiently process more complex logic functions between multiple operands to further reduce latency and power-hungry data movement. The proposed architecture is built on a Spin Orbit Torque Magnetic Random Access Memory (SOT-MRAM) array and can work simultaneously as a non-volatile memory and as reconfigurable in-memory logic. The device-to-architecture co-simulation results show that PIMA-Logic can achieve up to 56% and 31.6% improvements in overall energy and delay, respectively, on combinational logic benchmarks compared to the recent Pinatubo architecture. We further implement an in-memory data encryption engine based on PIMA-Logic as a case study. For the AES application, it shows 77.2% and 21% lower energy consumption compared to the CMOS-ASIC and the recent RIMPA implementations, respectively.
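The row-wise and column-wise operation described in the abstract can be illustrated with a purely functional sketch. This is not the authors' SOT-MRAM circuit or simulator: it only models the logical result of combining operands that share a column (whole rows combined bit by bit) or share a row (whole columns combined bit by bit). The names `OPS`, `row_logic`, and `col_logic` are hypothetical and chosen for illustration, and the memory array is assumed to be a plain list of bit lists.

```python
from functools import reduce
from typing import Callable, Dict, List

Array = List[List[int]]  # assumed model: mem[row][col] holds one bit (0 or 1)

# Two-input Boolean functions; multi-operand logic is built by folding them.
OPS: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}

def row_logic(mem: Array, op: str, rows: List[int]) -> List[int]:
    """Combine the selected rows bit by bit (operands sit in the same column)."""
    f = OPS[op]
    width = len(mem[0])
    return [reduce(f, (mem[r][c] for r in rows)) for c in range(width)]

def col_logic(mem: Array, op: str, cols: List[int]) -> List[int]:
    """Combine the selected columns bit by bit (operands sit in the same row)."""
    f = OPS[op]
    return [reduce(f, (row[c] for c in cols)) for row in mem]

if __name__ == "__main__":
    mem = [
        [1, 0, 1, 1],
        [1, 1, 0, 1],
        [0, 1, 1, 1],
    ]
    print(row_logic(mem, "AND", [0, 1]))     # two-operand, across rows
    print(row_logic(mem, "XOR", [0, 1, 2]))  # three-operand, across rows
    print(col_logic(mem, "OR", [0, 1]))      # two-operand, across columns
```

In this model a multi-operand function is simply a fold over the selected rows or columns; how the hardware evaluates it in a single step is outside the scope of the sketch.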
Citations
Posted Content
A Modern Primer on Processing in Memory.
TL;DR: This chapter discusses recent research that aims to practically enable computation close to data, an approach called processing-in-memory (PIM).
Journal ArticleDOI
Processing-in-memory: A workload-driven perspective
TL;DR: This article describes the work on systematically identifying opportunities for PIM in real applications and quantifies potential gains for popular emerging applications (e.g., machine learning, data analytics, genome analysis) and describes challenges that remain for the widespread adoption of PIM.
Journal ArticleDOI
MRIMA: An MRAM-Based In-Memory Accelerator
TL;DR: This paper presents practical case studies to demonstrate MRIMA’s acceleration for binary-weight and low bit-width convolutional neural networks (CNNs) as well as data encryption, and shows ~77% and 21% lower energy consumption compared to CMOS-ASIC and recent domain-wall-based design, respectively.
Proceedings ArticleDOI
DUAL: Acceleration of Clustering Algorithms using Digital-based Processing In-Memory
TL;DR: DUAL is proposed, a Digital-based Unsupervised learning AcceLeration, which supports a wide range of popular algorithms on conventional crossbar memory and provides a comparable quality to existing clustering algorithms while using a binary representation and a simplified distance metric.
Journal ArticleDOI
PXNOR-BNN: In/With Spin-Orbit Torque MRAM Preset-XNOR Operation-Based Binary Neural Networks
TL;DR: An NVM-based CIM architecture employing a Preset-XNOR operation in/with the spin–orbit torque magnetic random access memory (SOT-MRAM) to accelerate the computation of BNNs (PXNOR-BNN) is proposed.
References
Journal ArticleDOI
The gem5 simulator
Nathan Binkert, Bradford M. Beckmann, Gabriel Black, Steven K. Reinhardt, Ali G. Saidi, Arkaprava Basu, Joel Hestness, Derek R. Hower, Tushar Krishna, Somayeh Sardashti, Rathijit Sen, Korey Sewell, Muhammad Shoaib, Nilay Vaish, Mark D. Hill, Darien Wood +15 more
TL;DR: The high level of collaboration on the gem5 project, combined with the previous success of the component parts and a liberal BSD-like license, make gem5 a valuable full-system simulation tool.
Proceedings ArticleDOI
McPAT: an integrated power, area, and timing modeling framework for multicore and manycore architectures
TL;DR: Combining power, area, and timing results of McPAT with performance simulation of PARSEC benchmarks at the 22nm technology node for both common in-order and out-of-order manycore designs shows that when die cost is not taken into account, clustering 8 cores together gives the best energy-delay product, whereas when cost is taken into account, configuring clusters with 4 cores gives the best EDA2P and EDAP.
Proceedings ArticleDOI
Architecting phase change memory as a scalable DRAM alternative
TL;DR: This work proposes, crafted from a fundamental understanding of PCM technology parameters, area-neutral architectural enhancements that address these limitations and make PCM competitive with DRAM.
Journal ArticleDOI
ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars
Ali Shafiee, Anirban Nag, Naveen Muralimanohar, Rajeev Balasubramonian, John Paul Strachan, Miao Hu, R. Stanley Williams, Vivek Srikumar +7 more
TL;DR: This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner.
Journal ArticleDOI
Spin transfer torque devices utilizing the giant spin Hall effect of tungsten
TL;DR: Using spin-torque-induced ferromagnetic resonance with a β-W/CoFeB bilayer microstrip, the spin Hall angle was determined to be $|\theta_{\mathrm{SH}}^{\beta\text{-W}}| = 0.30 \pm 0.02$.