scispace - formally typeset
Author

John Reuben

Bio: John Reuben is an academic researcher from the University of Erlangen-Nuremberg. The author has contributed to research in topics: Clock gating & Clock skew. The author has an h-index of 8 and has co-authored 26 publications receiving 183 citations. Previous affiliations of John Reuben include Technion – Israel Institute of Technology & VIT University.

Papers
Proceedings ArticleDOI
01 Sep 2017
TL;DR: This paper proposes metrics to compare memristive logic families using analytic expressions for performance, energy efficiency, and area, provides guidelines for a holistic comparison of logic families, and sets the stage for the evolution of new logic families.
Abstract: Memristors have extended their influence beyond memory to logic and in-memory computing. Memristive logic design, the methodology of designing logic circuits using memristors, is an emerging concept whose growth is fueled by the quest for energy efficient computing systems. As a result, many memristive logic families have evolved with different attributes, and a mature comparison among them is needed to judge their merit. This paper presents a framework for comparing logic families by classifying them on the basis of fundamental properties such as statefulness, proximity (from the memory array), and flexibility of computation. We propose metrics to compare memristive logic families using analytic expressions for performance (latency), energy efficiency, and area. Then, we provide guidelines for a holistic comparison of logic families and set the stage for the evolution of new logic families.

76 citations

Journal ArticleDOI
TL;DR: In this article, a physics-based compact model was chosen due to its flexibility, and the proposed algorithm was used to exactly fit the model to different RRAMs, which differed greatly in their material composition and switching behavior.
Abstract: Modeling of resistive RAMs (RRAMs) is a herculean task due to its non-linearity. While the exigent need for a model has motivated research groups to formulate realistic models, the diversity in RRAMs’ characteristics has created a gap between model developers and model users. This paper bridges the gap by proposing an algorithm by which the parameters of a model are tuned to specific RRAMs. To this end, a physics-based compact model was chosen due to its flexibility, and the proposed algorithm was used to exactly fit the model to different RRAMs, which differed greatly in their material composition and switching behavior. Furthermore, the model was extended to simulate multiple low resistance states (LRS), which is a vital focus of research to increase memory density in RRAMs. The ability of the model to simulate the switching from a high resistance state to multiple LRS was verified by measurements on 1T-1R cells.
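The core idea of the paper — tuning a compact model's parameters until it reproduces measurements from a specific RRAM device — can be sketched as follows. This is a hedged illustration only: the toy hyperbolic-sine I-V model and brute-force grid search below stand in for the Stanford-PKU model and the paper's actual fitting algorithm, and all parameter values are invented.

```python
import math

def iv_model(v, i0, v0):
    # Toy sinh-type conduction model, a common RRAM abstraction;
    # NOT the physics-based Stanford-PKU model used in the paper.
    return i0 * math.sinh(v / v0)

def fit(voltages, currents, i0_grid, v0_grid):
    # Brute-force least-squares search over a parameter grid. The
    # paper's algorithm is more sophisticated, but the goal is the
    # same: tune model parameters to a specific device's data.
    best, best_err = None, float("inf")
    for i0 in i0_grid:
        for v0 in v0_grid:
            err = sum((iv_model(v, i0, v0) - i) ** 2
                      for v, i in zip(voltages, currents))
            if err < best_err:
                best, best_err = (i0, v0), err
    return best

# Synthetic "measurements" generated with i0 = 1e-6 A, v0 = 0.25 V.
vs = [0.05 * k for k in range(1, 11)]
meas = [iv_model(v, 1e-6, 0.25) for v in vs]
i0_grid = [5e-7, 1e-6, 2e-6, 5e-6]
v0_grid = [0.1, 0.15, 0.2, 0.25, 0.3]
print(fit(vs, meas, i0_grid, v0_grid))  # → (1e-06, 0.25)
```

Against real measurements the residual never reaches zero, so the fit quality (not just the argmin) must be inspected before using the tuned model in circuit simulation.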

37 citations

Journal ArticleDOI
TL;DR: An algorithm to fit a certain amount of variability to an existing physics-based analytical model (Stanford-PKU model) and the extent of variability exhibited by the device is fitted to the model in a manner agnostic to the cause of variability.
Abstract: Intrinsic variability observed in resistive-switching devices (cycle-to-cycle and device-to-device) is widely recognised as a major hurdle for widespread adoption of Resistive RAM technology. While physics-based models have been developed to accurately reproduce the resistive-switching behaviour, reproducing the observed variability behaviour of a specific RRAM has not been studied. Without properly fitted variability in the model, the simulation error introduced at the device level propagates through circuit-level to system-level simulations in an unpredictable manner. In this work, we propose an algorithm to fit a certain amount of variability to an existing physics-based analytical model (the Stanford-PKU model). The extent of variability exhibited by the device is fitted to the model in a manner agnostic to the cause of variability. Further, the model is modified to better reproduce the variations observed in a device. The model, fitted with variability, can well reproduce cycle-to-cycle as well as device-to-device variations. The significance of integrating variability into RRAM models is underscored using a sensing example.
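A cause-agnostic variability fit of the kind described can be sketched in a few lines: estimate the spread of a measured quantity (here, cycle-to-cycle LRS resistance, assumed lognormal) and then draw per-cycle samples from the fitted distribution. This is an illustrative sketch, not the paper's algorithm; the measured values below are hypothetical, and in a full flow each sample would perturb the corresponding model parameter.

```python
import math
import random

def fit_variability(resistances):
    # Fit a lognormal distribution to measured cycle-to-cycle LRS
    # resistances -- agnostic to the physical cause of the spread.
    logs = [math.log(r) for r in resistances]
    mu = sum(logs) / len(logs)
    var = sum((x - mu) ** 2 for x in logs) / len(logs)
    return mu, math.sqrt(var)

def sample_lrs(mu, sigma, rng):
    # Draw one per-cycle LRS value from the fitted distribution.
    return math.exp(rng.gauss(mu, sigma))

measured = [9.5e3, 10.2e3, 11.1e3, 9.8e3, 10.6e3]  # hypothetical ohms
mu, sigma = fit_variability(measured)
rng = random.Random(0)
cycle_lrs = [sample_lrs(mu, sigma, rng) for _ in range(3)]
```

Propagating such sampled states through circuit simulation is exactly how device-level variability reaches the sensing example mentioned in the abstract.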

29 citations

Book ChapterDOI
01 Jan 2020
TL;DR: This chapter presents mMPU memristive memory processing unit, which relies on a Memristor-Aided loGIC (MAGIC), a technique to compute logical functions using memristors within the memory array, and therefore directly tackles the von Neumann bottleneck.
Abstract: Data transfer between processing and memory units in modern computing systems is their main performance and energy-efficiency bottleneck, commonly known as the von Neumann bottleneck. Prior research has attempted to alleviate the problem by moving the computing units closer to the memory, but with limited success, since data transfer is still required. In this chapter, we present the memristive Memory Processing Unit (mMPU), which relies on a memristive memory to perform computation using the memory cells, and therefore directly tackles the von Neumann bottleneck. In mMPU, the operation is controlled by a modified controller and peripheral circuit without changing the structure of the memory cells and arrays. As the basic logic element, we present Memristor-Aided loGIC (MAGIC), a technique to compute logical functions using memristors within the memory array. We further show how to extend basic MAGIC primitives to execute any arbitrary Boolean function and demonstrate the microarchitecture of the memory. This process is required to enable data computing using MAGIC. Finally, we show how to build the computing system using mMPU, which performs computation using MAGIC to enable a real processing-in-memory machine.
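The MAGIC NOR primitive can be captured in a behavioral sketch: two input memristors in parallel drive one output memristor that is initialized to logic 1 (low resistance), and the voltage divider formed across the output cell determines whether it switches to logic 0. The resistance and threshold values below are illustrative, not taken from the chapter.

```python
RON, ROFF = 1e3, 1e6   # low / high resistance states (ohms), illustrative
V0, VTH = 1.0, 0.3     # applied voltage and RESET threshold (V), illustrative

def magic_nor(a, b):
    # Behavioral sketch of a MAGIC NOR gate. The two input memristors
    # sit in parallel; the output memristor starts at RON (logic 1).
    r_in = lambda bit: RON if bit else ROFF
    r_par = 1.0 / (1.0 / r_in(a) + 1.0 / r_in(b))
    r_out = RON                            # output initialized to logic 1
    v_out = V0 * r_out / (r_par + r_out)   # drop across the output cell
    # If either input is logic 1, most of V0 falls on the output cell,
    # exceeding the threshold and switching it to logic 0.
    return 0 if v_out > VTH else 1
```

Since NOR is functionally complete, cascading this in-array operation is what lets MAGIC realize arbitrary Boolean functions, as the abstract states.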

23 citations

Journal ArticleDOI
TL;DR: In this review, memristive logic families that can implement the MAJORITY gate and NOT are favored for in-memory computing; one-bit full adders implemented in a memory array using different logic primitives are compared, and the efficiency of the majority-based implementation is underscored.
Abstract: As we approach the end of Moore’s law, many alternative devices are being explored to satisfy the performance requirements of modern integrated circuits. At the same time, the movement of data between processing and memory units in contemporary computing systems (‘von Neumann bottleneck’ or ‘memory wall’) necessitates a paradigm shift in the way data is processed. Emerging resistance switching memories (memristors) show promising signs to overcome the ‘memory wall’ by enabling computation in the memory array. Majority logic is a type of Boolean logic which has been found to be an efficient logic primitive due to its expressive power. In this review, the efficiency of majority logic is analyzed from the perspective of in-memory computing. Recently reported methods to implement majority gates in a Resistive RAM array are reviewed and compared. Conventional CMOS implementation accommodates heterogeneity of logic gates (NAND, NOR, XOR), while in-memory implementation usually accommodates homogeneity of gates (only IMPLY or only NAND or only MAJORITY). In view of this, memristive logic families which can implement MAJORITY gate and NOT (to make it functionally complete) are to be favored for in-memory computing. One-bit full adders implemented in memory array using different logic primitives are compared and the efficiency of majority-based implementation is underscored. To investigate if the efficiency of majority-based implementation extends to n-bit adders, eight-bit adders implemented in memory array using different logic primitives are compared. Parallel-prefix adders implemented in majority logic can reduce latency of in-memory adders by 50–70% when compared to IMPLY, NAND, NOR and other similar logic primitives.
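The expressive power of majority logic that the review highlights can be seen in a full adder built from MAJORITY and NOT alone. The identity below (carry = MAJ(a, b, cin); sum = MAJ(¬carry, MAJ(a, b, ¬cin), cin)) is a known majority-logic construction, shown here as a sketch rather than any specific circuit from the review.

```python
def maj(a, b, c):
    # Three-input majority gate: output is 1 iff at least two inputs are 1.
    return int(a + b + c >= 2)

def full_adder_maj(a, b, cin):
    # One-bit full adder from MAJORITY and NOT only.
    cout = maj(a, b, cin)
    s = maj(1 - cout, maj(a, b, 1 - cin), cin)
    return s, cout

# Exhaustive check against ordinary binary addition.
for a in (0, 1):
    for b in (0, 1):
        for cin in (0, 1):
            s, cout = full_adder_maj(a, b, cin)
            assert 2 * cout + s == a + b + cin
```

Because the carry is a single majority gate, chaining or prefix-combining such adders is where the 50–70% latency reduction cited for parallel-prefix majority adders comes from.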

22 citations


Cited by
Journal ArticleDOI
01 Jun 2018
TL;DR: This Review Article examines the development of in-memory computing using resistive switching devices, where the two-terminal structure of the devices, theirresistive switching properties, and direct data processing in the memory can enable area- and energy-efficient computation.
Abstract: Modern computers are based on the von Neumann architecture in which computation and storage are physically separated: data are fetched from the memory unit, shuttled to the processing unit (where computation takes place) and then shuttled back to the memory unit to be stored. The rate at which data can be transferred between the processing unit and the memory unit represents a fundamental limitation of modern computers, known as the memory wall. In-memory computing is an approach that attempts to address this issue by designing systems that compute within the memory, thus eliminating the energy-intensive and time-consuming data movement that plagues current designs. Here we review the development of in-memory computing using resistive switching devices, where the two-terminal structure of the devices, their resistive switching properties, and direct data processing in the memory can enable area- and energy-efficient computation. We examine the different digital, analogue, and stochastic computing schemes that have been proposed, and explore the microscopic physical mechanisms involved. Finally, we discuss the challenges in-memory computing faces, including the required scaling characteristics, in delivering next-generation computing.

1,193 citations

Journal ArticleDOI
TL;DR: In this article, spin-transfer torque compute-in-memory (STT-CiM) was proposed for in-memory computing with spin-transfer torque magnetic RAM; it enables multiple wordlines within an array to be simultaneously activated, so that functions of the values stored in multiple rows can be sensed directly with a single access.
Abstract: In-memory computing is a promising approach to addressing the processor-memory data transfer bottleneck in computing systems. We propose spin-transfer torque compute-in-memory (STT-CiM), a design for in-memory computing with spin-transfer torque magnetic RAM (STT-MRAM). The unique properties of spintronic memory allow multiple wordlines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access. We propose modifications to STT-MRAM peripheral circuits that leverage this principle to perform logic, arithmetic, and complex vector operations. We address the challenge of reliable in-memory computing under process variations by extending error-correction code schemes to detect and correct errors that occur during CiM operations. We also address the question of how STT-CiM should be integrated within a general-purpose computing system. To this end, we propose architectural enhancements to processor instruction sets and on-chip buses that enable STT-CiM to be utilized as a scratchpad memory. Finally, we present data mapping techniques to increase the effectiveness of STT-CiM. We evaluate STT-CiM using a device-to-architecture modeling framework, and integrate cycle-accurate models of STT-CiM with a commercial processor and on-chip bus (Nios II and Avalon from Intel). Our system-level evaluation shows that STT-CiM provides the system-level performance improvements of 3.93 times on average (up to 10.4 times), and concurrently reduces memory system energy by 3.83 times on average (up to 12.4 times).
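The multi-row sensing principle at the heart of STT-CiM can be illustrated behaviorally: with two wordlines enabled, the bitline current is the sum of the two cell currents, which falls into one of three bands, so different sense-amplifier references yield AND, OR, and XOR. The conductance and reference values below are illustrative assumptions, not the paper's device parameters.

```python
G_P, G_AP = 1 / 2e3, 1 / 4e3  # cell conductances: parallel state = logic 1
VREAD = 0.1                    # read voltage (V), illustrative

def sense(bit_a, bit_b):
    # Two wordlines enabled at once: bitline current is the sum of the
    # two cell currents, producing three distinguishable levels.
    g = (G_P if bit_a else G_AP) + (G_P if bit_b else G_AP)
    return VREAD * g

# Reference currents placed between the three current levels.
ref_or = (sense(0, 0) + sense(0, 1)) / 2   # separates (0,0) from one-hot
ref_and = (sense(0, 1) + sense(1, 1)) / 2  # separates one-hot from (1,1)

def cim_ops(a, b):
    # One read, three logic results via different sense thresholds.
    i = sense(a, b)
    out_or = int(i > ref_or)
    out_and = int(i > ref_and)
    return out_and, out_or, out_or ^ out_and   # AND, OR, XOR
```

Process variations shrink the margins between these current bands, which is why the paper extends error-correction schemes to cover compute-in-memory reads.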

205 citations


Posted Content
TL;DR: This work addresses the challenge of reliable in-memory computing under process variations by extending error-correction code schemes to detect and correct errors that occur during CiM operations and proposes architectural enhancements to processor instruction sets and on-chip buses that enable STT-CiM to be utilized as a scratchpad memory.
Abstract: In-memory computing is a promising approach to addressing the processor-memory data transfer bottleneck in computing systems. We propose Spin-Transfer Torque Compute-in-Memory (STT-CiM), a design for in-memory computing with Spin-Transfer Torque Magnetic RAM (STT-MRAM). The unique properties of spintronic memory allow multiple wordlines within an array to be simultaneously enabled, opening up the possibility of directly sensing functions of the values stored in multiple rows using a single access. We propose modifications to STT-MRAM peripheral circuits that leverage this principle to perform logic, arithmetic, and complex vector operations. We address the challenge of reliable in-memory computing under process variations by extending ECC schemes to detect and correct errors that occur during CiM operations. We also address the question of how STT-CiM should be integrated within a general-purpose computing system. To this end, we propose architectural enhancements to processor instruction sets and on-chip buses that enable STT-CiM to be utilized as a scratchpad memory. Finally, we present data mapping techniques to increase the effectiveness of STT-CiM. We evaluate STT-CiM using a device-to-architecture modeling framework, and integrate cycle-accurate models of STT-CiM with a commercial processor and on-chip bus (Nios II and Avalon from Intel). Our system-level evaluation shows that STT-CiM provides system-level performance improvements of 3.93x on average (up to 10.4x), and concurrently reduces memory system energy by 3.83x on average (up to 12.4x).

115 citations

Journal ArticleDOI
03 Nov 2021-ACS Nano
TL;DR: In this article, the Baseline funding program of the King Abdullah University of Science and Technology (KAUST) has been used to support the work of the authors of this paper.
Abstract: This work has been supported by the generous Baseline funding program of the King Abdullah University of Science and Technology (KAUST).

78 citations