A 65nm 1Mb nonvolatile computing-in-memory ReRAM macro with sub-16ns multiply-and-accumulate for binary DNN AI edge processors

doi:10.1109/ISSCC.2018.8310400

Proceedings Article

- pp 494-496

Abstract:

Many artificial intelligence (AI) edge devices use nonvolatile memory (NVM) to store the weights of a neural network (trained off-line on an AI server), and require low-energy and fast I/O accesses. The deep neural networks (DNNs) used by AI processors [1,2] commonly require p layers of a convolutional neural network (CNN) and q layers of a fully-connected network (FCN). Current DNN processors that use a conventional (von Neumann) memory structure are limited by high access latencies, I/O energy consumption, and hardware costs. Large working data sets result in heavy accesses across the memory hierarchy; moreover, large amounts of intermediate data are generated by the large number of multiply-and-accumulate (MAC) operations in both the CNN and FCN layers. Even when binary-based DNNs [3] are used, the required CNN and FCN operations result in a major memory I/O bottleneck for AI edge devices.
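To make the MAC operation mentioned in the abstract concrete, the following is a minimal software sketch of a binary MAC, assuming the common convention that inputs and weights take values +1 or -1. It is not the macro's circuit-level method, which the abstract does not describe. The function names (`binary_mac`, `pack`, `binary_mac_xnor`) are hypothetical and chosen only for illustration. The second form shows the well-known equivalence between a +1/-1 dot product and an XNOR followed by a population count.

```python
def binary_mac(inputs, weights):
    """Reference binary MAC: dot product of two equal-length +1/-1 vectors."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


def pack(values):
    """Pack a +1/-1 vector into an int; bit i is 1 when element i is +1."""
    word = 0
    for i, v in enumerate(values):
        if v == 1:
            word |= 1 << i
    return word


def binary_mac_xnor(x_word, w_word, n):
    """Same result as binary_mac, computed on packed n-bit words.

    XNOR marks the positions where input and weight agree (product +1);
    every other position contributes -1, so the sum is 2 * matches - n.
    """
    mask = (1 << n) - 1
    agree = ~(x_word ^ w_word) & mask
    matches = bin(agree).count("1")
    return 2 * matches - n


if __name__ == "__main__":
    x = [1, -1, 1, 1]
    w = [1, 1, -1, 1]
    print(binary_mac(x, w))
    print(binary_mac_xnor(pack(x), pack(w), len(x)))
```

Both calls should print the same value. In a conventional memory hierarchy, every weight in `w` has to be read out before this arithmetic can start, which is the I/O traffic the abstract identifies as the bottleneck.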

Citations

Journal Article

SLIM: Simultaneous Logic-in-Memory Computing Exploiting Bilayer Analog OxRAM Devices.

Sandeep Kaur Kingra, +5 more

- 13 Feb 2020 -

Scientific Reports

TL;DR: This paper proposes a novel ‘Simultaneous Logic in-Memory’ (SLIM) methodology that is complementary to existing LIM approaches in the literature, and demonstrates novel SLIM bitcells comprising non-filamentary bilayer analog OxRAM devices with NMOS transistors.

Journal Article

Neuro-inspired computing chips

Wenqiang Zhang, +8 more

TL;DR: The development of neuro-inspired computing chips and their key benchmarking metrics are reviewed; a co-design tool chain is provided, and a roadmap for future large-scale chips and a future electronic design automation tool chain are proposed.

Journal Article

Reinforcement learning with analogue memristor arrays

Zhongrui Wang, +18 more

TL;DR: An experimental demonstration of reinforcement learning on a three-layer 1-transistor 1-memristor (1T1R) network using a modified learning algorithm tailored for the authors' hybrid analogue–digital platform, which has the potential to achieve a significant boost in speed and energy efficiency.

Journal Article

Three-dimensional memristor circuits as complex neural networks

Peng Lin, +13 more

TL;DR: A three-dimensional circuit composed of eight layers of monolithically integrated memristive devices is built and used to implement complex neural networks, demonstrating accurate MNIST classification and effective edge detection in videos.

References

Journal Article

PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory

Ping Chi, +7 more

TL;DR: This work proposes a novel PIM architecture, called PRIME, to accelerate NN applications in ReRAM-based main memory; it distinguishes itself from prior work on NN acceleration with significant performance improvement and energy saving.

Proceedings Article

14.2 DNPU: An 8.1TOPS/W reconfigurable CNN-RNN processor for general-purpose deep neural networks

Dongjoo Shin, +3 more

TL;DR: A highly reconfigurable CNN-RNN processor with high energy-efficiency is desirable to support general-purpose deep neural networks (DNNs).

Proceedings Article

14.4 A scalable speech recognizer with deep-neural-network acoustic models and voice-activated power gating

Michael Price, +2 more

TL;DR: IC designs for ASR and VAD are described that improve on the accuracy, programmability, and scalability of previous work.

Proceedings Article

A 462GOPs/J RRAM-based nonvolatile intelligent processor for energy harvesting IoE system featuring nonvolatile logics and processing-in-memory

Fang Su, +13 more

TL;DR: This work presents the first nonvolatile processor capable of general as well as neural network computing in addition to the first integrated chip using RRAM-based PIM.

Proceedings Article

An offset-tolerant current-sampling-based sense amplifier for Sub-100nA-cell-current nonvolatile memory

Meng-Fan Chang, +12 more

TL;DR: This study proposes a new offset-tolerant current-sampling-based SA (CSB-SA) that achieves 7× faster read speed than previous SAs when sensing a small cell current (I_CELL), and achieves a 26ns macro random access time for reading sub-200nA I_CELL.

IEEE Journal of Solid-state Circuits

Citations

24.1 A 1Mb Multibit ReRAM Computing-In-Memory Macro with 14.6ns Parallel MAC Computing Time for CNN Based AI Edge Processors

Related Papers (5)

PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory

ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars

Gradient-based learning applied to document recognition

Deep Residual Learning for Image Recognition

Eyeriss: An Energy-Efficient Reconfigurable Accelerator for Deep Convolutional Neural Networks