Ultra-Efficient Processing In-Memory for Data Intensive Applications

doi:10.1145/3061639.3062337

Proceedings Article, Open Access, pp 6

Abstract:

Recent years have witnessed rapid growth in the Internet of Things (IoT). This network of billions of devices generates and exchanges huge amounts of data. Limited cache capacity and memory bandwidth make transferring and processing such data on traditional CPUs and GPUs highly inefficient, in terms of both energy consumption and delay. However, many IoT applications are statistical at heart and can tolerate some inaccuracy in their computation. This allows designers to reduce processing complexity by approximating results to a desired accuracy. In this paper, we propose an ultra-efficient approximate processing in-memory architecture, called APIM, which exploits the analog characteristics of non-volatile memories to support addition and multiplication inside the crossbar memory, while storing the data. The proposed design eliminates the overhead of transferring data to the processor by virtually bringing the processor inside the memory. APIM dynamically configures the precision of computation for each application to tune the level of accuracy at runtime. Our experimental evaluation running six general OpenCL applications shows that the proposed design achieves up to 20× performance improvement and a 480× improvement in energy-delay product while ensuring acceptable quality of service. In exact mode, it achieves 28× energy savings and a 4.8× speedup compared to state-of-the-art GPU cores.
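The abstract does not say how APIM adjusts precision inside the crossbar, so the following is only a generic software illustration of trading accuracy for reduced work, not the paper's mechanism. It assumes a simple scheme in which the low-order bits of each operand are zeroed before an integer multiplication; the function names `approx_multiply` and `relative_error` and the parameter `dropped_bits` are hypothetical.

```python
def approx_multiply(a: int, b: int, dropped_bits: int) -> int:
    """Multiply two non-negative integers after zeroing their low bits.

    Assumption for illustration only: precision is reduced by clearing the
    `dropped_bits` least significant bits of each operand.
    """
    if a < 0 or b < 0 or dropped_bits < 0:
        raise ValueError("operands and dropped_bits must be non-negative")
    mask = ~((1 << dropped_bits) - 1)  # ones everywhere except the low bits
    return (a & mask) * (b & mask)


def relative_error(a: int, b: int, dropped_bits: int) -> float:
    """Relative error of the approximate product against the exact one."""
    exact = a * b
    if exact == 0:
        return 0.0
    return abs(exact - approx_multiply(a, b, dropped_bits)) / exact


if __name__ == "__main__":
    a, b = 50_000, 40_000
    # Sweep the precision knob and report how the error changes.
    for k in (0, 4, 8, 12):
        print(f"dropped_bits={k:2d}  "
              f"product={approx_multiply(a, b, k)}  "
              f"relative_error={relative_error(a, b, k):.4%}")
```

With `dropped_bits=0` the result is exact; raising it can only keep or increase the error for non-negative operands, which is the kind of knob a runtime could turn per application.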

Citations

Proceedings Article

FloatPIM: in-memory acceleration of deep neural network training with high precision

Mohsen Imani, +3 more

TL;DR: FloatPIM is proposed, a fully-digital scalable PIM architecture that accelerates CNN in both training and testing phases and natively supports floating-point representation, thus enabling accurate CNN training.

Proceedings Article

FELIX: fast and energy-efficient logic in memory

Saransh Gupta, +2 more

TL;DR: This paper proposes an in-memory implementation of fast and energy-efficient logic (FELIX) which combines the functionality of PIM with memories and is the first PIM logic to enable single-cycle NOR, NOT, NAND, minority, and OR operations directly in crossbar memory.

Journal Article

SIMPLER MAGIC: Synthesis and Mapping of In-Memory Logic Executed in a Single Row to Improve Throughput

Rotem Ben-Hur, +6 more

- 01 Oct 2020 -

IEEE Transactions on Computer-Aided Desi...

TL;DR: A novel automatic framework for efficient implementation of arbitrary combinational logic functions within a memristive memory using synthesis and in-memory mapping of logic execution in a single row (SIMPLER), a tool that optimizes the execution of in-memory logic operations in terms of throughput and area.

Proceedings Article

Efficient Algorithms for In-Memory Fixed Point Multiplication Using MAGIC

Ameer Haj-Ali, +3 more

TL;DR: The algorithms proposed in this paper not only improve the latency as compared to previously proposed algorithms by 1.8× on average, but their significantly better area efficiency now makes it possible to perform numerous fixed point multiplications simultaneously within memristive memory arrays.

Journal Article

SearcHD: A Memory-Centric Hyperdimensional Computing With Stochastic Training

Mohsen Imani, +6 more

- 01 Oct 2020 -

IEEE Transactions on Computer-Aided Desi...

TL;DR: SearcHD is proposed, a fully binarized HD computing algorithm with a fully binary training which generates multiple binary hypervectors for each class and uses the analog characteristic of nonvolatile memories to perform all encoding, training, and inference computations in memory.

References

Journal Article

Internet of Things (IoT): A vision, architectural elements, and future directions

Jayavardhana Gubbi, +3 more

- 01 Sep 2013 -

Future Generation Computer Systems

TL;DR: In this article, the authors present a Cloud-centric vision for worldwide implementation of the Internet of Things (IoT) and a Cloud implementation using Aneka, which is based on the interaction of private and public Clouds. They conclude their IoT vision by expanding on the need for convergence of WSN, the Internet, and distributed computing, directed at the technological research community.

Journal Article

‘Memristive’ switches enable ‘stateful’ logic operations via material implication

Julien Borghetti, +6 more

- 08 Apr 2010 -

Nature

TL;DR: Bipolar voltage-actuated switches, a family of nonlinear dynamical memory devices, can execute material implication (IMP), which is a fundamental Boolean logic operation on two variables p and q such that p IMP q is equivalent to (NOT p) OR q.

Proceedings Article

Approximate computing: An emerging paradigm for energy-efficient design

Jie Han, +1 more

TL;DR: This paper reviews recent progress in the area, including design of approximate arithmetic blocks, pertinent error and quality measures, and algorithm-level techniques for approximate computing.

Journal Article

Low-Power Digital Signal Processing Using Approximate Adders

Vaibhav Kumar Gupta, +3 more

- 01 Jan 2013 -

IEEE Transactions on Computer-Aided Desi...

TL;DR: This paper proposes logic complexity reduction at the transistor level as an alternative approach to take advantage of the relaxation of numerical accuracy, and demonstrates the utility of these approximate adders in two digital signal processing architectures with specific quality constraints.

Journal Article

MAGIC—Memristor-Aided Logic

Shahar Kvatinsky, +7 more

- 11 Sep 2014 -

IEEE Transactions on Circuits and System...

TL;DR: In this brief, a memristor-only logic family, i.e., memristor-aided logic (MAGIC), is presented. In each MAGIC logic gate, memristors serve as inputs with previously stored data, and an additional memristor serves as an output.

Related Papers (5)

MAGIC—Memristor-Aided Logic

Shahar Kvatinsky, +7 more

- 11 Sep 2014 -

IEEE Transactions on Circuits and System...

VTEAM: A General Model for Voltage-Controlled Memristors

Shahar Kvatinsky, +3 more

- 20 May 2015 -

IEEE Transactions on Circuits and System...

ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars

PRIME: a novel processing-in-memory architecture for neural network computation in ReRAM-based main memory

A scalable processing-in-memory accelerator for parallel graph processing