Showing papers on "Adder" published in 2021

Open Access

Journal Article•DOI•

Monolithic optical microlithography of high-density elastic circuits.

Yu-Qing Zheng¹, Yuxin Liu¹, Donglai Zhong¹, Shayla Nikzad¹, Shuhan Liu¹, Zhiao Yu¹, Deyu Liu¹, Hung-Chin Wu¹, Chenxin Zhu¹, Jinxing Li¹, Helen Tran¹, Jeffrey B.-H. Tok¹, Zhenan Bao¹•Institutions (1)

Stanford University¹

02 Jul 2021-Science

TL;DR: This article proposes a monolithic optical microlithographic process that directly micropatterns a set of elastic electronic materials by sequential ultraviolet light-triggered solubility modulation.

Abstract: Polymeric electronic materials have enabled soft and stretchable electronics. However, the lack of a universal micro/nanofabrication method for skin-like and elastic circuits results in low device density and limited parallel signal recording and processing ability relative to silicon-based devices. We present a monolithic optical microlithographic process that directly micropatterns a set of elastic electronic materials by sequential ultraviolet light-triggered solubility modulation. We fabricated transistors with channel lengths of 2 micrometers at a density of 42,000 transistors per square centimeter. We fabricated elastic circuits including an XOR gate and a half adder, both of which are essential components for an arithmetic logic unit. Our process offers a route to realize wafer-level fabrication of complex, high-density, and multilayered elastic circuits with performance rivaling that of their rigid counterparts.

102 citations

Journal Article•DOI•

An Improved Logarithmic Multiplier for Energy-Efficient Neural Computing

Mohammad Saeed Ansari¹, Bruce F. Cockburn¹, Jie Han¹•Institutions (1)

University of Alberta¹

01 Apr 2021-IEEE Transactions on Computers

TL;DR: This article proposes an improved logarithmic multiplier (ILM) that, unlike existing designs, rounds both inputs to their nearest powers of two by using a proposed nearest-one detector (NOD) circuit.

Abstract: Multiplication is the most resource-hungry operation in neural networks (NNs). Logarithmic multipliers (LMs) simplify multiplication to shift and addition operations and thus reduce the energy consumption. Since implementing the logarithm in a compact circuit often introduces approximation, some accuracy loss is inevitable in LMs. However, this inaccuracy accords with the inherent error tolerance of NNs and their associated applications. This article proposes an improved logarithmic multiplier (ILM) that, unlike existing designs, rounds both inputs to their nearest powers of two by using a proposed nearest-one detector (NOD) circuit. Considering that the output of the NOD uses a one-hot representation, some entries in the truth table of a conventional adder cannot occur. Hence, a compact adder is designed for the reduced truth table. The 8×8 ILM achieves up to 17.48 percent saving in power consumption compared to a recent LM in the literature while being almost 8 percent more accurate. Moreover, the evaluation of the ILM for two benchmark NN workloads shows up to 21.85 percent reduction in energy consumption compared to the NNs implemented with other LMs. Interestingly, using the ILM increases the classification accuracy of the considered NNs by up to 1.4 percent compared to a NN implementation that uses exact multipliers.
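
The rounding idea in the abstract above can be sketched in software. The following is a minimal illustration, not the paper's circuit: it assumes each operand is written as a nearest power of two plus a signed residue, and that the product of the two residues is dropped so only shifts and additions remain. The function names are hypothetical.

```python
def nearest_power_of_two_exponent(x: int) -> int:
    """Return k such that 2**k is the power of two nearest to x (ties round up)."""
    if x <= 0:
        raise ValueError("x must be a positive integer")
    k = x.bit_length() - 1  # position of the leading one
    # If the bit just below the leading one is set, x >= 1.5 * 2**k,
    # so 2**(k + 1) is at least as close as 2**k.
    if k > 0 and (x >> (k - 1)) & 1:
        k += 1
    return k


def approx_log_multiply(a: int, b: int) -> int:
    """Approximate a * b using only power-of-two scalings and additions."""
    if a == 0 or b == 0:
        return 0
    ka = nearest_power_of_two_exponent(a)
    kb = nearest_power_of_two_exponent(b)
    qa = a - (1 << ka)  # signed residue of a (may be negative)
    qb = b - (1 << kb)  # signed residue of b (may be negative)
    # Exact product is 2**(ka+kb) + qa*2**kb + qb*2**ka + qa*qb.
    # The last term is dropped; the scalings below are shifts in hardware.
    return (1 << (ka + kb)) + qa * (1 << kb) + qb * (1 << ka)
```

Under these assumptions the error is exactly the dropped term `qa * qb`, so operands that are already powers of two multiply exactly.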

53 citations

Journal Article•DOI•

Design and Analysis of Majority Logic-Based Approximate Adders and Multipliers

Weiqiang Liu¹, Tingting Zhang¹, Emma McLarnon², Máire O'Neill², Paolo Montuschi³, Fabrizio Lombardi⁴•Institutions (4)

Nanjing University of Aeronautics and Astronautics¹, Queen's University Belfast², Polytechnic University of Turin³, Northeastern University⁴

01 Jul 2021-IEEE Transactions on Emerging Topics in Computing

TL;DR: In this article, the authors proposed designs of approximate adders and multipliers based on majority logic (ML), which utilize approximate compressors and a reduction circuitry with so-called complement bits.

Abstract: As a new paradigm for nanoscale technologies, approximate computing deals with error tolerance in the computational process to improve performance and reduce power consumption. Majority logic (ML) is applicable to many emerging nanotechnologies; its basic building block (the 3-input majority voter, MV) has been extensively used for digital circuit design. In this paper, designs of approximate adders and multipliers based on ML are proposed; the proposed multipliers utilize approximate compressors and a reduction circuitry with so-called complement bits. An influence factor is defined and analyzed to assess the importance of different complement bits depending on the size of the multiplier; a scheme for selection of the complement bits is also presented. The proposed designs are evaluated using hardware metrics (such as delay and gate complexity) as well as error metrics. Compared with other ML-based designs found in the technical literature, the proposed designs are found to offer superior performance. Case studies of error-resilient applications are also presented to show the validity of the proposed designs.
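
For context on the 3-input majority voter mentioned above, the sketch below shows the textbook exact full adder built only from majority voters and inverters. It is a baseline for illustration and does not reproduce the approximate designs proposed in the paper; the function names are hypothetical.

```python
def majority(a: int, b: int, c: int) -> int:
    """3-input majority voter: output is 1 when at least two inputs are 1."""
    return (a & b) | (a & c) | (b & c)


def majority_full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Exact 1-bit full adder using three majority voters and two inverters."""
    carry = majority(a, b, cin)
    not_carry = 1 - carry
    not_cin = 1 - cin
    # Sum = MAJ(NOT carry, MAJ(a, b, NOT cin), cin)
    total = majority(not_carry, majority(a, b, not_cin), cin)
    return total, carry
```

An approximate variant would typically trade some of these voters for a cheaper, occasionally wrong output, which is where error metrics come in.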

44 citations

Journal Article•DOI•

High-Performance Accurate and Approximate Multipliers for FPGA-based Hardware Accelerators

Salim Ullah¹, Semeen Rehman², Muhammad Shafique³, Akash Kumar¹•Institutions (3)

Dresden University of Technology¹, Vienna University of Technology², New York University Abu Dhabi³

02 Feb 2021-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: This work presents generic area-optimized, low-latency accurate and approximate softcore multiplier architectures that exploit the underlying architectural features of FPGAs, i.e., lookup table (LUT) structures and fast-carry chains, to reduce the overall critical path delay (CPD) and resource utilization of multipliers.

Abstract: Multiplication is one of the widely used arithmetic operations in a variety of applications, such as image/video processing and machine learning. FPGA vendors provide high-performance multipliers in the form of DSP blocks. These multipliers are not only limited in number and have fixed locations on FPGAs but can also create additional routing delays and may prove inefficient for smaller bit-width multiplications. Therefore, FPGA vendors additionally provide optimized soft IP cores for multiplication. However, in this work, we advocate that these soft multiplier IP cores for FPGAs still need better designs to provide high-performance and resource efficiency. Towards this, we present generic area-optimized, low-latency accurate and approximate softcore multiplier architectures, which exploit the underlying architectural features of FPGAs, i.e., look-up table (LUT) structures and fast carry chains, to reduce the overall critical path delay and resource utilization of multipliers. Compared to Xilinx multiplier LogiCORE IP, our proposed unsigned and signed accurate architecture provides up to 25% and 53% reduction in LUT utilization, respectively, for different sizes of multipliers. Moreover, with our unsigned approximate multiplier architectures, a reduction of up to 51% in the critical path delay can be achieved with an insignificant loss in output accuracy when compared with the LogiCORE IP. For illustration, we have deployed the proposed multiplier architecture in accelerators used in image and video applications, and evaluated them for area and performance gains. Our library of accurate and approximate multipliers is open-source and available online at https://cfaed.tu-dresden.de/pd-downloads to fuel further research and development in this area, facilitate reproducible research, and thereby enabling a new research direction for the FPGA community.

43 citations

Journal Article•DOI•

Carbon nanotube field effect transistor (CNTFET) and resistive random access memory (RRAM) based ternary combinational logic circuits

Furqan Zahoor, Fawnizu Azmadi Hussin, Farooq Ahmad Khanday, M. R. Ahmad, Illani Mohd Nawi, Chia Yee Ooi, Fakhrul Zaman Rokhani

04 Jan 2021-Electronics

TL;DR: A design approach for ternary combinational logic circuits while using CNTFETs and RRAM is presented and the proposed designs show a significant reduction in the transistor count, decreased cell area, and lower power consumption.

Abstract: The capability of multiple valued logic (MVL) circuits to achieve higher storage density when compared to that of existing binary circuits is highly impressive. Recently, MVL circuits have attracted significant attention for the design of digital systems. Carbon nanotube field effect transistors (CNTFETs) have shown great promise for design of MVL based circuits, due to the fact that the scalable threshold voltage of CNTFETs can be utilized easily for the multiple voltage designs. In addition, resistive random access memory (RRAM) is also a feasible option for the design of MVL circuits, owing to its multilevel cell capability that enables the storage of multiple resistance states within a single cell. In this manuscript, a design approach for ternary combinational logic circuits using CNTFETs and RRAM is presented. The designs of ternary half adder, ternary half subtractor, ternary full adder, and ternary full subtractor are evaluated using Synopsys HSPICE simulation software with standard 32 nm CNTFET technology under different operating conditions, including different supply voltages, output load variation, and different operating temperatures. Finally, the proposed designs are compared with the state-of-the-art ternary designs. Based on the obtained simulation results, the proposed designs show a significant reduction in the transistor count, decreased cell area, and lower power consumption. In addition, due to the participation of RRAM, the proposed designs have advantages in terms of non-volatility.
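
As a behavioral reference for the ternary half adder named above, the sketch below gives its truth function only. It assumes unbalanced ternary digits 0, 1 and 2, which the abstract does not state explicitly, and it says nothing about the CNTFET/RRAM circuit itself; the function name is hypothetical.

```python
def ternary_half_adder(a: int, b: int) -> tuple[int, int]:
    """Add two ternary digits (0, 1, 2) and return (sum digit, carry digit)."""
    if a not in (0, 1, 2) or b not in (0, 1, 2):
        raise ValueError("inputs must be ternary digits 0, 1 or 2")
    total = a + b
    return total % 3, total // 3  # carry is 0 or 1 because total <= 4
```

For example, adding the digits 2 and 2 gives a sum digit of 1 with a carry of 1, i.e. 11 in base 3.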

35 citations

Proceedings Article•DOI•

AdderSR: Towards Energy Efficient Image Super-Resolution

Dehua Song¹, Yunhe Wang¹, Hanting Chen¹, Chang Xu², Chunjing Xu¹, Dacheng Tao²•Institutions (2)

Huawei¹, University of Sydney²

20 Jun 2021

TL;DR: This paper proposes using adder neural networks (AdderNets), which compute output features with additions and thus avoid the heavy energy consumption of conventional multiplications, for image super-resolution.

Abstract: This paper studies the single image super-resolution problem using adder neural networks (AdderNets). Compared with convolutional neural networks, AdderNets utilize additions to calculate the output features and thus avoid the massive energy consumption of conventional multiplications. However, it is very hard to directly inherit the existing success of AdderNets on large-scale image classification to the image super-resolution task due to the different calculation paradigm. Specifically, the adder operation cannot easily learn the identity mapping, which is essential for image processing tasks. In addition, the functionality of high-pass filters cannot be ensured by AdderNets. To this end, we thoroughly analyze the relationship between an adder operation and the identity mapping and insert shortcuts to enhance the performance of SR models using adder networks. Then, we develop a learnable power activation for adjusting the feature distribution and refining details. Experiments conducted on several benchmark models and datasets demonstrate that our image super-resolution models using AdderNets can achieve comparable performance and visual quality to that of their CNN baselines with an about 2.5× reduction on the energy consumption. The codes are available at: https://github.com/huawei-noah/AdderNet.
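
To make the contrast with convolution concrete, here is a minimal 1-D sketch. It assumes the adder operation is the negative L1 distance between an input window and a filter, which is the usual way AdderNets are described but is not spelled out in this abstract. The function names are hypothetical and the sketch is unrelated to the linked repository's code.

```python
def conv_response(window: list[float], kernel: list[float]) -> float:
    """Conventional filter response: one multiplication per weight."""
    return sum(x * w for x, w in zip(window, kernel))


def adder_response(window: list[float], kernel: list[float]) -> float:
    """Adder-style response: negative L1 distance, no multiplications."""
    return -sum(abs(x - w) for x, w in zip(window, kernel))


def adder_conv1d(signal: list[float], kernel: list[float]) -> list[float]:
    """Slide the adder filter over a 1-D signal (stride 1, no padding)."""
    k = len(kernel)
    return [adder_response(signal[i:i + k], kernel)
            for i in range(len(signal) - k + 1)]
```

The response is never positive and reaches its maximum of 0 only when the window equals the filter, so the output cannot simply reproduce the input. This gives some intuition for why the identity mapping is hard to learn.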

34 citations

Posted Content•DOI•

Wafer-scale functional circuits based on two dimensional semiconductors with fabrication optimized by machine learning.

Xinyu Chen¹, Yufeng Xie¹, Yaochen Sheng¹, Hongwei Tang¹, Zeming Wang¹, Yu Wang¹, Yin Wang¹, Fuyou Liao¹, Jingyi Ma¹, Xiaojiao Guo¹, Ling Tong¹, Hanqi Liu¹, Hao Liu¹, Tianxiang Wu¹, Jiaxin Cao¹, Sitong Bu¹, Hui Shen¹, Fuyu Bai¹, Daming Huang¹, Jianan Deng¹, Antoine Riaud¹, Zihan Xu, Chenjian Wu², Shiwei Xing², Ye Lu¹, Shunli Ma¹, Zhengzong Sun¹, Zhongyin Xue³, Zengfeng Di³, Xiao Gong⁴, David Wei Zhang¹, Peng Zhou¹, Jing Wan¹, Wenzhong Bao¹•Institutions (4)

Fudan University¹, Soochow University (Suzhou)², Chinese Academy of Sciences³, National University of Singapore⁴

12 Oct 2021-Nature Communications

TL;DR: The wafer-scale fabrication processes are guided by ML combined with grid searching to co-optimize device performance, including mobility, threshold voltage and subthreshold swing, and experimentally validate the application potential of ML-assisted fabrication optimization for beyond-silicon electronic materials.

Abstract: Triggered by the pioneering research on graphene, the family of two-dimensional layered materials (2DLMs) has been investigated for more than a decade, and appealing functionalities have been demonstrated. However, there are still challenges inhibiting high-quality growth and circuit-level integration, and results from previous studies are still far from complying with industrial standards. Here, we overcome these challenges by utilizing machine-learning (ML) algorithms to evaluate key process parameters that impact the electrical characteristics of MoS2 top-gated field-effect transistors (FETs). The wafer-scale fabrication processes are then guided by ML combined with grid searching to co-optimize device performance, including mobility, threshold voltage and subthreshold swing. A 62-level SPICE modeling was implemented for MoS2 FETs and further used to construct functional digital, analog, and photodetection circuits. Finally, we present wafer-scale test FET arrays and a 4-bit full adder employing industry-standard design flows and processes. Taken together, these results experimentally validate the application potential of ML-assisted fabrication optimization for beyond-silicon electronic materials. Here, the authors demonstrate the application of machine learning to optimize the device fabrication process for wafer-scale 2D semiconductors, and eventually fabricate digital, analog, and optoelectrical circuits.

30 citations

Proceedings Article•DOI•

Mix and Match: A Novel FPGA-Centric Deep Neural Network Quantization Framework

Sung-En Chang¹, Yanyu Li¹, Mengshu Sun¹, Runbin Shi², Hayden K.-H. So², Xuehai Qian³, Yanzhi Wang¹, Xue Lin¹•Institutions (3)

Northeastern University¹, University of Hong Kong², University of Southern California³

01 Feb 2021

TL;DR: This paper proposes an FPGA-centric mixed scheme quantization (MSQ) with an ensemble of the proposed sum-of-power-of-2 (SP2) and the fixed-point schemes.

Abstract: Deep Neural Networks (DNNs) have achieved extraordinary performance in various application domains. To support diverse DNN models, efficient implementations of DNN inference on edge-computing platforms, e.g., ASICs, FPGAs, and embedded systems, are extensively investigated. Due to the huge model size and computation amount, model compression is a critical step to deploy DNN models on edge devices. This paper focuses on weight quantization, a hardware-friendly model compression approach that is complementary to weight pruning. Unlike existing methods that use the same quantization scheme for all weights, we propose the first solution that applies different quantization schemes for different rows of the weight matrix. It is motivated by (1) the observation that the weight distributions in different rows are not the same; and (2) the potential of achieving better utilization of heterogeneous FPGA hardware resources. To achieve that, we first propose a hardware-friendly quantization scheme named sum-of-power-of-2 (SP2) suitable for Gaussian-like weight distribution, in which the multiplication arithmetic can be replaced with logic shifter and adder, thereby enabling highly efficient implementations with the FPGA LUT resources. In contrast, the existing fixed-point quantization is suitable for Uniform-like weight distribution and can be implemented efficiently by DSP. Then to fully explore the resources, we propose an FPGA-centric mixed scheme quantization (MSQ) with an ensemble of the proposed SP2 and the fixed-point schemes. Combining the two schemes can maintain, or even increase accuracy due to better matching with weight distributions. For the FPGA implementations, we develop a parameterized architecture with heterogeneous Generalized Matrix Multiplication (GEMM) cores: one using LUTs for computations with SP2 quantized weights and the other utilizing DSPs for fixed-point quantized weights. Given the partition ratio among the two schemes based on resource characterization, the MSQ quantization training algorithm derives an optimally quantized model for the FPGA implementation. We evaluate our FPGA-centric quantization framework across multiple application domains. With optimal SP2/fixed-point ratios on two FPGA devices, i.e., Zynq XC7Z020 and XC7Z045, we achieve performance improvement of 2.1× to 4.1× compared to solely exploiting DSPs for all multiplication operations. In addition, the CNN implementations with the proposed MSQ scheme can achieve higher accuracy and comparable hardware utilization efficiency compared to the state-of-the-art designs.
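
The claim that SP2 weights turn a multiplication into a shifter and an adder can be illustrated as follows. This is a simplified sketch under stated assumptions: weights are positive integers of the form 2^p + 2^q with non-negative exponents, whereas the paper's actual SP2 level set and bit allocation are not given in the abstract. Both function names are hypothetical.

```python
def quantize_to_sp2(weight: int, max_exp: int) -> tuple[int, int]:
    """Pick exponents (p, q) so that 2**p + 2**q is nearest to weight."""
    candidates = [(p, q) for p in range(max_exp + 1) for q in range(max_exp + 1)]
    return min(candidates, key=lambda pq: abs(weight - ((1 << pq[0]) + (1 << pq[1]))))


def sp2_multiply(x: int, p: int, q: int) -> int:
    """Compute x * (2**p + 2**q) with two shifts and one addition."""
    return (x << p) + (x << q)
```

For instance, a weight of 10 is exactly 2^3 + 2^1, so `sp2_multiply(5, 3, 1)` evaluates 5 × 10 as 40 + 10 without a multiplier.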

30 citations

Journal Article•DOI•

A high-speed and scalable XOR-XNOR-based hybrid full adder design

Mehedi Hasan¹, Mehedi Hasan², Md. Shahbaz Hussain³, Mainul Hossain⁴, Mohd. Hasan³, Hasan U. Zaman⁵, Hasan U. Zaman¹, Sharnali Islam⁴ - Show less +4 more•Institutions (5)

North South University¹, University of Science and Technology Chittagong², Aligarh Muslim University³, University of Dhaka⁴, Samsung⁵

01 Jul 2021-Computers & Electrical Engineering

TL;DR: The proposed hybrid FA based on the XOR-XNOR module can be a reliable and superior alternative to existing FAs and showed superior performance in the 32-bit operation.

29 citations

Journal Article•DOI•

A review on all-optical logic adder: Heading towards next-generation processor

Kamanashis Goswami¹, Haraprasad Mondal¹, Mrinal K. Sen¹•Institutions (1)

Indian Institutes of Technology¹

15 Mar 2021-Optics Communications

TL;DR: This review discusses insights leading to the design of more efficient PhC-based all-optical adders for next-generation ultra-fast optical processors.

25 citations

Journal Article•DOI•

Accelerated Addition in Resistive RAM Array Using Parallel-Friendly Majority Gates

John Reuben¹, Stefan Pechmann²•Institutions (2)

University of Erlangen-Nuremberg¹, University of Bayreuth²

01 Apr 2021-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: A method to implement a majority gate in a transistor-accessed ReRAM array during the READ operation, which forms a functionally complete Boolean logic, capable of implementing any digital logic.

Abstract: To overcome the “von Neumann bottleneck,” methods to compute in memory are being researched in many emerging memory technologies, including resistive RAMs (ReRAMs). Majority logic is efficient for synthesizing arithmetic circuits when compared to NAND/NOR/IMPLY logic. In this work, we propose a method to implement a majority gate in a transistor-accessed ReRAM array during the READ operation. Together with NOT gate, which is also implemented in memory, the proposed gate forms a functionally complete Boolean logic, capable of implementing any digital logic. Computing is simplified to a sequence of READ and WRITE operations and does not require any major modifications to the peripheral circuitry of the array. While many methods have been proposed recently to implement the Boolean logic in memory, the latency of in-memory adders implemented as a sequence of such Boolean operations is exorbitant ($O(n)$). Parallel-prefix (PP) adders use prefix computation to accelerate addition in conventional CMOS-based adders. By exploiting the parallel-friendly nature of the proposed majority gate and the regular structure of the memory array, it is demonstrated how PP adders can be implemented in memory in $O(\log n)$ latency. The proposed in-memory addition technique incurs a latency of $4\cdot\log(n)+6$ for $n$-bit addition and is energy-efficient due to the absence of sneak currents in 1Transistor–1Resistor configuration.
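
The logarithmic depth of parallel-prefix addition can be seen in a small software model. The sketch below uses the Kogge-Stone topology, chosen here only because it is a well-known parallel-prefix scheme. The abstract does not say which topology the paper maps to memory, and this model uses ordinary AND/OR/XOR rather than in-memory majority gates. The function name is hypothetical.

```python
def kogge_stone_add(a: int, b: int, n: int) -> tuple[int, int]:
    """Add two n-bit unsigned integers; return (n-bit sum, prefix levels used)."""
    mask = (1 << n) - 1
    a &= mask
    b &= mask
    generate = a & b        # bit i produces a carry on its own
    propagate = a ^ b       # bit i passes an incoming carry along
    half_sum = propagate    # kept for the final sum computation
    distance, levels = 1, 0
    while distance < n:
        # Combine each group with the group 'distance' positions below it.
        generate |= propagate & (generate << distance)
        propagate &= propagate << distance
        distance *= 2
        levels += 1
    carries = (generate << 1) & mask  # carry into bit i comes from bits below i
    return (half_sum ^ carries) & mask, levels
```

Each loop iteration doubles the span covered by every generate/propagate pair, so an 8-bit addition needs 3 prefix levels and a 32-bit addition needs 5, compared with a carry that ripples through all n positions.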

Book Chapter•DOI•

An Efficient Design of 8 * 8 Wallace Tree Multiplier Using 2 and 3-Bit Adders

M. Sakthimohan¹, J. Deny¹•Institutions (1)

Kalasalingam University¹

01 Jan 2021

TL;DR: In this paper, a Wallace tree 8 * 8 multiplier architecture is proposed, and it produces optimized area and delay, where 2-bit and 3-bit adders are utilized in the 8-bit multiplier.

Abstract: In VLSI, hardware architecture requires the multiplier unit as one of the important parts for arithmetic operation. A multiplier is a major component in many hardware architectures, so various experts are focusing their research on multiplier design to achieve compact area, low delay, and low power. Numerous case studies have been done for many architectures, in which increased speed and low area are achieved through a reduction of partial products. One of the finest methods is the Wallace tree multiplier (WTM). In this research article, a Wallace tree 8 * 8 multiplier architecture is proposed, and it produces optimized area and delay. Our work targets the structuring and execution of a Wallace tree 8 * 8 multiplier using VHDL. By limiting the quantity of partial products, 2-bit and 3-bit adders are utilized in the 8-bit multiplier. In this work, the 8 * 8 Wallace tree multiplier is inspected and reproduced in the XILINX Integrated Software Environment tool. In this 8-bit Wallace tree multiplier circuit, our primary objectives are to diminish the area of the multiplier circuit and to speed up the multiplier.
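
The partial-product reduction that a Wallace tree performs can be modeled at word level. The sketch below is an assumption-laden illustration in Python rather than the paper's VHDL: it reduces rows three at a time with carry-save (3:2) addition until two rows remain, then does one ordinary addition. It does not model the paper's specific arrangement of 2-bit and 3-bit adders, and the function names are hypothetical.

```python
def carry_save_add(x: int, y: int, z: int) -> tuple[int, int]:
    """Compress three rows into a sum row and a carry row (x + y + z is preserved)."""
    sum_row = x ^ y ^ z
    carry_row = ((x & y) | (x & z) | (y & z)) << 1
    return sum_row, carry_row


def wallace_multiply(a: int, b: int, n: int = 8) -> int:
    """Multiply a by an n-bit unsigned b via Wallace-style row reduction."""
    # One partial-product row per bit of b.
    rows = [(a << i) if (b >> i) & 1 else 0 for i in range(n)]
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3   # rows that form complete triples
        reduced = []
        for i in range(0, full, 3):
            reduced.extend(carry_save_add(rows[i], rows[i + 1], rows[i + 2]))
        reduced.extend(rows[full:])        # leftover rows pass through unchanged
        rows = reduced
    return sum(rows)                       # final carry-propagate addition
```

For eight rows the count shrinks 8, 6, 4, 3, 2, so only the last step needs a carry-propagating adder.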

Posted Content•DOI•

A novel design of all-optical full-adder using nonlinear X-shaped photonic crystal resonators

Saleh Naghizade¹, Hamed Saghaei¹•Institutions (1)

Islamic Azad University¹

12 Feb 2021-Optical and Quantum Electronics

TL;DR: In this article, the authors proposed a new all-optical full-adder design based on nonlinear X-shaped photonic crystal (PhC) resonators for high-speed data processing systems.

Abstract: This paper proposes a new all-optical full-adder design based on nonlinear X-shaped photonic crystal (PhC) resonators. The PhC-based full-adder consists of three input ports, two X-shaped PhC resonators (X-PCRs), and two output ports. Dielectric rods made of silicon and nonlinear rods composed of doped glass are used to design the X-PCRs. The well-known plane wave expansion and finite-difference time-domain methods are applied to study and analyze the photonic band structure and light propagation inside the PhC, respectively. Our numerical results demonstrate that when the incoming light intensity increases, the nonlinear Kerr effect appears and manages the direction of light propagation inside the structure. The maximum time delay and footprint of the proposed full-adder are about 2.5 ps and 663 μm², making it an appropriate adder for high-speed data processing systems.

Journal Article•DOI•

Design and analysis of novel QCA full adder-subtractor

Marshal Raj¹, Lakshminarayanan Gopalakrishnan¹, Seok-Bum Ko²•Institutions (2)

National Institute of Technology, Tiruchirappalli¹, University of Saskatchewan²

03 Jul 2021-International Journal of Electronics Letters

TL;DR: Quantum-dot Cellular Automata is an evolving post-CMOS paradigm that can be used for designing nanoscale circuits and digital circuits are implemented in QCA using majority logic.

Abstract: Quantum-dot Cellular Automata is an evolving post-CMOS paradigm that can be used for designing nanoscale circuits. Digital circuits are implemented in QCA using majority logic. Adder and subtractor...

Journal Article•DOI•

Comprehensive study of 1-Bit full adder cells: review, performance comparison and scalability analysis

Mehedi Hasan¹, Mehedi Hasan², Abdul Hasib Siddique³, Abdal Hoque Mondol⁴, Mainul Hossain⁵, Hasan U. Zaman⁶, Hasan U. Zaman², Sharnali Islam⁵ - Show less +4 more•Institutions (6)

University of Science and Technology Chittagong¹, North South University², Khalifa University³, United International University⁴, University of Dhaka⁵, Samsung⁶

01 Jun 2021

TL;DR: The main finding of this research is that the single-bit performance parameters of FA cells should not be considered as the main basis for performance comparison and any FA cell should be analyzed in a multi-bit structure to determine its practical effectiveness.

Abstract: Full Adder (FA) circuits are integral components in the design of Arithmetic Logic Units (ALUs) of modern computing systems. Recently, there has been massive research interest in this area due to the growing need for low-power and high-performance computing systems. Researchers have proposed a variety of FA cells with diverse design techniques, each having its pros and cons. As a result, a systematic method for performance comparison of FA cells using a common simulation platform has become necessary. In this work, we present an extensive study of FA cells. We have compared the performance of thirty-three (33) existing 1-bit FA cells. The drive powers of these FA cells have been compared by applying a variety of load conditions. In addition, the 1-bit FA cells have been extended to 32-bit structures to test their scalability and to investigate their performance in wide-word structures. We have determined that twenty-one (21) of the thirty-three (33) FA cells cannot operate in a 32-bit structure, even though some of them exhibit excellent performance as a 1-bit cell. The main finding of this research is that the single-bit performance parameters of FA cells should not be considered as the main basis for performance comparison. Any FA cell should be analyzed in a multi-bit structure to determine its practical effectiveness.

Proceedings Article•DOI•

PrefixRL: Optimization of Parallel Prefix Circuits using Deep Reinforcement Learning

Rajarshi Roy¹, Jonathan Raiman¹, Neel Kant¹, Ilyas Elkin¹, Robert M. Kirby¹, Michael Siu¹, Stuart F. Oberman¹, Saad Godil¹, Bryan Catanzaro¹ - Show less +5 more•Institutions (1)

Nvidia¹

05 Dec 2021

TL;DR: In this paper, a grid-based state-action representation and an RL environment for constructing legal prefix circuits are designed, and RL agents trained on this environment produce prefix adder circuits that Pareto-dominate existing baselines with up to 16.0% and 30.2% lower area for the same delay in the 32b and 64b settings, respectively.

Abstract: In this work, we present a reinforcement learning (RL) based approach to designing parallel prefix circuits such as adders or priority encoders that are fundamental to high-performance digital design. Unlike prior methods, our approach designs solutions tabula rasa purely through learning with synthesis in the loop. We design a grid-based state-action representation and an RL environment for constructing legal prefix circuits. Deep Convolutional RL agents trained on this environment produce prefix adder circuits that Pareto-dominate existing baselines with up to 16.0% and 30.2% lower area for the same delay in the 32b and 64b settings respectively. We observe that agents trained with open-source synthesis tools and cell library can design adder circuits that achieve lower area and delay than commercial tool adders in an industrial cell library.

Journal Article•DOI•

Memristor-based Hopfield network circuit for recognition and sequencing application

Junwei Sun¹, Xiao Xiao¹, Qinfei Yang¹, Peng Liu¹, Yanfeng Wang¹ - Show less +1 more•Institutions (1)

Zhengzhou University of Light Industry¹

01 May 2021-AEU - International Journal of Electronics and Communications

TL;DR: A memristor neural network circuit is designed, which can recognize and sequence four characters simultaneously, which may provide a reference for the development of new brain-like system.

Abstract: The Hopfield neural network has been widely used in image recognition because of its associative memory behavior. In this paper, a memristor neural network circuit is designed, which can recognize and sequence four characters simultaneously. It mainly includes three modules, namely a character recognition module, a signal processing module and a sequence module. The character recognition module consists of four individual character recognition units, corresponding to the recognition of four character images (W, H, A, T). The character recognition module includes a calculation submodule and an iteration submodule. After the operation of the calculation submodule and the iteration submodule, the four character images disturbed by noise can be identified simultaneously. The signal processing module is used to simplify the output signals of the character recognition module by four adder units. The sequence module ensures that the stable state eventually converges to the word (WHAT). The synapse weight circuit given in this paper can obtain different weights, so as to realize the function of associative memory. The iterative process circuit of the Hopfield neural network is also designed to further demonstrate the iterative process. The neural network circuit composed of memristors may be smaller, which may provide a reference for the development of new brain-like systems.
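
The associative-memory behavior and iterative recall mentioned above can be sketched in software. The example assumes the textbook Hebbian weight rule and synchronous sign updates, which the abstract does not confirm for the memristor circuit. The function names are hypothetical, and any patterns used are hypothetical stand-ins for the character images.

```python
def train_hebbian(patterns: list[list[int]]) -> list[list[int]]:
    """Build a symmetric weight matrix from +1/-1 patterns (zero diagonal)."""
    n = len(patterns[0])
    weights = [[0] * n for _ in range(n)]
    for pattern in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    weights[i][j] += pattern[i] * pattern[j]
    return weights


def recall(weights: list[list[int]], state: list[int], max_steps: int = 10) -> list[int]:
    """Iterate synchronous updates until the state stops changing."""
    current = list(state)
    for _ in range(max_steps):
        updated = [
            1 if sum(w * s for w, s in zip(row, current)) >= 0 else -1
            for row in weights
        ]
        if updated == current:
            break
        current = updated
    return current
```

With two orthogonal 8-element patterns stored, a copy of the first pattern with one element flipped settles back to that pattern after a single update.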

Journal Article•DOI•

Logic Device Based on Skyrmion Annihilation

Moojune Song¹, Min Gyu Park¹, San Ko¹, Sung Kyu Jang¹, Minkyu Je¹, Kab-Jin Kim¹•Institutions (1)

KAIST¹

10 Feb 2021-IEEE Transactions on Electron Devices

TL;DR: In this paper, a skyrmion-based logic device is presented, which takes advantage of skyrmion annihilation (SA) and increases the efficiency of logic operation.

Abstract: Skyrmion-based devices are an attractive candidate for nonvolatile memory and low-power computation. In a real device, however, skyrmions easily annihilate at device edges, which hampers device applications. Here, we present a novel skyrmion-based logic device, which takes advantage of the skyrmion annihilation (SA) and increases the efficiency of logic operation. An SA half adder (HA) is implemented in a ferromagnet/heavy metal nanotrack by introducing a geometric notch that annihilates the skyrmion. In addition, a full adder and an $n$-bit SA ripple-carry adder are demonstrated by directly cascading the SA HAs. The prototype of a 32-bit SA ripple-carry adder consumes energy as low as 0.62 pJ per operation, which is only 18% of the previously proposed skyrmion adder. Our SA logic gate can, therefore, be a promising candidate for the beyond-CMOS logic device.
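
The cascading described above follows the standard construction of a full adder from two half adders and of a ripple-carry adder from a chain of full adders. The sketch below is a purely logical model under that textbook assumption. In particular, combining the two half-adder carries with OR is the conventional choice, and the abstract does not describe how the skyrmion device does it. The function names are hypothetical.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(a: int, b: int, cin: int) -> tuple[int, int]:
    """Full adder built from two cascaded half adders."""
    partial, carry1 = half_adder(a, b)
    total, carry2 = half_adder(partial, cin)
    return total, carry1 | carry2  # the two carries are never both 1


def ripple_carry_add(x: int, y: int, n: int) -> tuple[int, int]:
    """Add two n-bit unsigned integers; return (n-bit sum, carry out)."""
    carry, result = 0, 0
    for i in range(n):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry
```

For example, `ripple_carry_add(13, 7, 4)` returns `(4, 1)`, i.e. the 4-bit sum 0100 with a carry out, which together encode 20.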

Journal Article•DOI•

Z-PIM: A Sparsity-Aware Processing-in-Memory Architecture With Fully Variable Weight Bit-Precision for Energy-Efficient Deep Neural Networks

Ji-Hoon Kim¹, Juhyoung Lee¹, Jinsu Lee¹, Jaehoon Heo¹, Joo-Young Kim¹•Institutions (1)

KAIST¹

27 Jan 2021-IEEE Journal of Solid-State Circuits

TL;DR: Z-PIM adopts bit-serial arithmetic, which performs a multiplication bit by bit over multiple cycles, to reduce the complexity of the operation in a single cycle and to provide flexibility in bit-precision.

Abstract: We present an energy-efficient processing-in-memory (PIM) architecture named Z-PIM that supports both sparsity handling and fully variable bit-precision in weight data for energy-efficient deep neural networks. Z-PIM adopts bit-serial arithmetic, which performs a multiplication bit by bit over multiple cycles, to reduce the complexity of the operation in a single cycle and to provide flexibility in bit-precision. To this end, it employs a zero-skipping convolution SRAM, which performs in-memory AND operations based on custom 8T-SRAM cells and channel-wise accumulations, and a diagonal accumulation SRAM that performs bit- and spatial-wise accumulation on the channel-wise accumulation results using diagonal logic and adders to produce the final convolution outputs. We propose a hierarchical bitline structure for energy-efficient weight bit pre-charging and computational readout by reducing the parasitic capacitances of the bitlines. Its charge reuse scheme reduces the switching rate by 95.42% for the convolution layers of the VGG-16 model. In addition, Z-PIM’s channel-wise data mapping enables sparsity handling by skip-reading the input channels with zero weight. Its read-operation pipelining, enabled by read-sequence scheduling, improves the throughput by 66.1%. The Z-PIM chip is fabricated in a 65-nm CMOS process on a 7.568-mm² die, and it consumes 5.294 mW on average at 1.0 V and 200 MHz. It achieves 0.31–49.12-TOPS/W energy efficiency for convolution operations as the weight sparsity and bit-precision vary from 0.1 to 0.9 and 1 to 16 bit, respectively. For the figure of merit considering input bit-width, weight bit-width, and energy efficiency, Z-PIM shows more than 2.1 times improvement over the state-of-the-art PIM implementations.
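As a rough software analogy of the bit-serial, zero-skipping multiply-accumulate described above, the sketch below processes one weight bit per pass: each pass ANDs the inputs with the selected weight bit, sums across channels, and shifts the partial sum by the bit position. It assumes unsigned weights and does not model the SRAM arrays, bitlines, or diagonal accumulation; `bit_serial_dot` is a hypothetical name.

```python
def bit_serial_dot(inputs, weights, weight_bits):
    """Dot product computed one weight bit at a time (unsigned weights assumed).

    Channels whose weight is zero are skipped entirely, mirroring the idea of
    zero-skipping; the remaining channels contribute x only when the current
    weight bit is 1 (a per-bit AND).
    """
    acc = 0
    for b in range(weight_bits):          # one pass ("cycle") per weight bit
        partial = 0
        for x, w in zip(inputs, weights):
            if w == 0:
                continue                  # zero-skipping: nothing to read
            if (w >> b) & 1:              # AND of the input with weight bit b
                partial += x              # channel-wise accumulation
        acc += partial << b               # weight the partial sum by 2**b
    return acc


inputs = [3, 5, 2]
weights = [4, 0, 7]
print(bit_serial_dot(inputs, weights, weight_bits=3))  # 26 == 3*4 + 5*0 + 2*7
```

Because the number of passes equals `weight_bits`, lower precision directly means fewer passes, which is the flexibility the bit-serial scheme trades against single-cycle speed.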

Journal Article•DOI•

Ultracompact all-optical full adders using an interference effect based on 2D photonic crystal nanoring resonators

Masoud Mohammadi¹, Vahid Fallahi¹, Mahmood Seifouri¹•Institutions (1)

Shahid Rajaee Teacher Training University¹

01 Feb 2021-Journal of Computational Electronics

TL;DR: In this paper, a 2D-PC with a hexagonal nanoring resonator (NRR), a coupling rod, and several waveguides was designed and simulated, where the mechanism of the interference effect in the PC was used to simplify and minimize the structure.

Abstract: Given the special place of hybrid logic circuits such as all-optical full adders in next-generation digital systems, a new kind of these structures using two-dimensional (2D) photonic crystals (2D-PC) is designed and simulated herein. The proposed structure is made of a hexagonal nanoring resonator (NRR), a coupling rod, and several waveguides. In this all-optical full adder, the mechanism of the interference effect in the PCs is used to simplify and minimize the structure. To make the structure flexible, the radius of the dielectric rod in the whole structure and the NRR are considered based on a lattice constant of 0.2a and 0.04a, respectively. The structure is operated at a wavelength of 1550 nm, considering the value of the power entering the waveguides and that exiting the Carry and Sum ports. To analyze the all-optical full adder, the plane-wave expansion method and finite-difference time-domain method are applied respectively to calculate the bandgap diagram and obtain the transmission and propagation of the optical field. In the proposed structure, the contrast ratio at the Carry and Sum ports is investigated in a unique and novel way, yielding values of 10.68 and 9.03 dB, respectively. In addition, the maximum and minimum response time for the Carry and Sum are obtained as 1.6 and 0.75 ps, respectively. The total footprint of the structure is about 183 µm². Due to its ultracompact size, low power consumption, fast response time, and simple structure, this all-optical full adder is suitable for use in low-power optical integrated circuits.

Journal Article•DOI•

Hybrid Partial Product-based High-Performance Approximate Recursive Multipliers

Haroon Waris¹, Chenghua Wang¹, Weiqiang Liu¹, Jie Han², Fabrizio Lombardi•Institutions (2)

Nanjing University of Aeronautics and Astronautics¹, University of Alberta²

01 Jan 2021-IEEE Transactions on Emerging Topics in Computing

TL;DR: In this brief, hybrid partial product-based building blocks are proposed by considering the probability distribution of the input operands and an efficient hardware implementation of approximate 4×4 multipliers is achieved, while maintaining the required accuracy.

Abstract: In this brief, hybrid partial product-based building blocks are proposed by considering the probability distribution of the input operands. An efficient hardware implementation of approximate 4×4 multipliers is achieved while maintaining a required accuracy. Moreover, high-performance approximate NOR-based half adder (NxHA) and full adder (NxFA) cells are proposed for use in a 4×4 multiplier. Three different strategies (Ax8-1/2/3) are further proposed and analyzed for utilizing the 4×4 multipliers when designing larger multipliers. Ax8-2 provides the best trade-off among the designs with a moderate MRED. A reduction of 30% and 17% in the MRED is achieved compared to previous best energy-optimized and MRED-optimized designs. Among the designs with higher MREDs, Ax8-3 exhibits the smallest MRED and PDP. Moreover, it shows an improvement of 7% to 28% in delay compared to existing approximate recursive designs. As a case study, image multiplication is evaluated; a high peak signal-to-noise ratio (PSNR) with a value close to 50 dB is obtained for the proposed multiplier designs.

Journal Article•DOI•

SIXOR: Single-Cycle In-Memristor XOR

Nima TaheriNejad¹•Institutions (1)

Vienna University of Technology¹

01 May 2021-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: In this article, the authors proposed a stateful crossbar-compatible XOR operation that requires only one cycle for its completion, which is two times faster than the current minimum required time for performing XOR (which is two cycles) using other atomic operations in comparable memristive stateful logic families.

Abstract: With the fast approach of the end of silicon scaling and existing problems, such as the von Neumann bottleneck, alternative computing paradigms are in demand. In-memory computation (IMC) is one of the most promising solutions, and memristive technology is one of the best platforms for that purpose. Many logic families have been proposed to enable memristive IMC, among which the stateful logic family stands out due to its minimal power consumption and simplicity. In this work, to complement existing works, we propose the first stateful crossbar-compatible XOR atomic logic operation that requires only one cycle for its completion, which is two times faster than the current minimum required time for performing XOR (which is two cycles) using other atomic operations in comparable memristive stateful logic families. We show that, in an example case of an adder, by taking advantage of the proposed single-cycle in-memristor XOR (SIXOR), up to $4.5\times$ speedup can be achieved compared to other SoA stateful adders. The gained speedup scales up in more complex systems and calculations that use XOR.

Journal Article•DOI•

High Performance Error Tolerant Adders for Image Processing Applications

R. Jothin¹, C. Vasanthanayaki²•Institutions (2)

KGiSL Institute of Technology¹, Government College of Technology, Coimbatore²

04 Mar 2021-IETE Journal of Research

TL;DR: To achieve high performance, Multiplexer Based Approximate Full Adders (MBAFA) are proposed in the inaccurate part of the HPETA design, which exhibits high speed, area efficiency, low power consumption, a lower Area-Delay Product (ADP), and a 56.32% lower Power-Delay Product (PDP) than the existing conventional CSLA, SAET-CSLA, ETCSLA, HSETA, and HSSSA designs.

Abstract: In this paper, we proposed High Performance Error Tolerant Adders (HPETA) which have an efficient design and quality metrics for inexact computing applications. To achieve high performance, Multipl...

Journal Article•DOI•

Performance analysis of optimized plasmonic half-adder circuit using Mach-Zehnder interferometer for high-speed switching applications

Sandip Swarnakar, Amrutha Guddati, Siva Koti Reddy, Ramanand Harijan, Santosh Kumar¹, Santosh Kumar²•Institutions (2)

Liaocheng University¹, DIT University²

01 May 2021-Microelectronics Journal

TL;DR: The basic structure for a half-adder circuit is proposed by inducing nonlinear Kerr-material to the Mach-Zehnder interferometers (MZIs) by using MIM plasmonic waveguide-based MZIs in the footprint of 85 μm.

Journal Article•DOI•

Design analysis and applications of all-optical multifunctional logic using a semiconductor optical amplifier-based polarization rotation switch

Ashif Raja¹, Kousik Mukherjee¹, Jitendra Nath Roy¹•Institutions (1)

Kazi Nazrul University¹

01 Feb 2021-Journal of Computational Electronics

TL;DR: In this article, a new semiconductor optical amplifier (SOA)-based module for multi-valued logic units using the cross-polarization modulation effect is proposed and analyzed.

Abstract: In this communication, a new semiconductor optical amplifier (SOA)-based module for multi-valued logic units using the cross-polarization modulation effect is proposed and analyzed. The design is simple and compact, consisting of only three SOAs and a few passive optical elements. SOAs have very low switching power (< 1 mW), and are very small (< 1 mm) and integrable into modern optical integrated circuits. Being multifunctional, the design is versatile; it can function as a demultiplexer, comparator, half adder, half subtractor, and as basic (OR, AND), universal (NOR, NAND), XOR, and XNOR logic gates. This design follows a tree architecture, operates at very high speed (~100 Gbit/s), and provides a good Q factor (30 dB or more). The corresponding bit error rate (BER) is very low (~10⁻²⁴). In this work, a relative eye opening as large as 90.4% is calculated. The variations in Q and BER with noise and control power are also investigated.

Journal Article•DOI•

Multioperative reversible gate design with implementation of 1-bit full adder and subtractor along with energy dissipation analysis

Sadat Riyaz¹, Syed Farah Naz¹, Vijay Kumar Sharma¹•Institutions (1)

Shri Mata Vaishno Devi University¹

01 Apr 2021-International Journal of Circuit Theory and Applications

Journal Article•DOI•

A combined three and five inputs majority gate-based high performance coplanar full adder in quantum-dot cellular automata

Fahimeh Danehdaran, Shaahin Angizi¹, Milad Bagherian Khosroshahy², Keivan Navi², Nader Bagherzadeh³•Institutions (3)

University of Central Florida¹, Shahid Beheshti University², University of California, Irvine³

01 Jun 2021-International Journal of Information Technology

TL;DR: An alternative approach for the streamlined physical design of quantum-dot cellular automata (QCA) full-adder circuits in which the placement of input cells and wire crossing congestion are substantially reduced.

Abstract: Nowadays, arithmetic computing is an important subject in computer architectures, in which the one-bit full-adder gate plays a significant role. Thus, efficient design of such a full-adder component can be beneficial to the overall efficiency of the entire system. In this paper, a novel method for the design and simulation of a combined majority gate toward realization of the one-bit full-adder gate is proposed. We inspect an alternative approach for the streamlined physical design of quantum-dot cellular automata (QCA) full-adder circuits in which the placement of input cells and wire crossing congestion are substantially reduced. The proposed method has outstanding characteristics such as low complexity, reduced area consumption, simplified physical design, and an ultra-high-speed one-bit full adder. Based on simulation results, the proposed design provides a 33.33% reduction in area and a 20.00% improvement in complexity, as well as a 10.49% reduction in power consumption at 1 Ek.
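The abstract does not spell out the gate-level equations, so the sketch below uses a commonly cited majority-gate formulation of a full adder as an assumption: the carry is a three-input majority of the inputs, and the sum is a five-input majority of the inputs plus two copies of the inverted carry. It models only the Boolean behavior, not the QCA cell layout; `maj` and `maj_full_adder` are hypothetical names.

```python
from itertools import product


def maj(*bits):
    """Majority vote over an odd number of bits."""
    return int(sum(bits) > len(bits) // 2)


def maj_full_adder(a, b, cin):
    """Full adder from one 3-input and one 5-input majority gate.

    cout = MAJ3(a, b, cin)
    sum  = MAJ5(a, b, cin, NOT cout, NOT cout)
    """
    cout = maj(a, b, cin)
    ncout = 1 - cout
    s = maj(a, b, cin, ncout, ncout)
    return s, cout


# Exhaustive check against ordinary integer addition.
for a, b, cin in product((0, 1), repeat=3):
    s, cout = maj_full_adder(a, b, cin)
    assert s + 2 * cout == a + b + cin
```

The exhaustive loop covers all eight input combinations, which is enough to confirm that the two majority gates and one inverter reproduce the full-adder truth table.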

Journal Article•DOI•

LAHAF: Low-power, area-efficient, and high-performance approximate full adder based on static CMOS

Seyed Erfan Fatemieh¹, Samira Shirinabadi Farahani¹, Mohammad Reza Reshadinezhad¹•Institutions (1)

University of Isfahan¹

01 Jun 2021-Sustainable Computing: Informatics and Systems

TL;DR: A low power, area-efficient full adder cell designed with approximate outputs that is applicable in image processing as an error-resilient application and the final outputs of approximation are acceptable in this application due to image quality metrics.

Proceedings Article•DOI•

Novel Approximate Multiplier Designs for Edge Detection Application

Yashaswi Mannepalli¹, Viraj Bharadwaj Korede¹, Madhav Rao¹•Institutions (1)

International Institute of Information Technology, Bangalore¹

22 Jun 2021

TL;DR: In this paper, the authors propose a reliable and efficient approximate multiplier design that uses an optimized lower part constant OR adder (OLOCA) design and a hardware optimized approximate adder with normal error distribution (HOAANED) separately as two variants.

Abstract: Approximate computing in general has garnered much needed attention in the design community owing to high power saving benefits, and at the same time quick generation of results. Approximate computing as a design technique continues to offer design advantages which is recently ceased by the ever decreasing technology scaling. Approximate computing is mostly applied to arithmetic designs, which has resulted in significant research interest. The paper proposes a reliable and efficient approximate multiplier design that uses an optimized lower part constant OR adder (OLOCA) design and a hardware optimized approximate adder with normal error distribution (HOAANED) separately as two variants. The two approximate multipliers derived from the OLOCA adder and the HOAANED adder were found to be highly power and footprint efficient, and in addition offer a performance improvement over other approximate multipliers. The error characteristics of the proposed multiplier designs were evaluated and compared with the existing approximate multiplier designs. The proposed multiplier designs, along with the existing ones, were synthesized using 45 nm CMOS technology and the results were analyzed. The proposed approximate multipliers were further explored for the Canny edge detection application, and results for different standard images were found to be highly acceptable, showing 99.9% of the outcome similar to the exact multiplier design.
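The abstract names OLOCA and HOAANED without giving their equations, so the sketch below is not either of those designs. It illustrates the general lower-part OR adder idea that such adders build on, under stated assumptions: the lower `k` bits are approximated by a bitwise OR, the upper bits are added exactly, and the AND of the two most significant lower-part bits is used as an approximate carry into the upper part. The name `loa_add` and its parameters are hypothetical.

```python
def loa_add(x, y, n_bits=8, k=4):
    """Generic lower-part OR adder sketch for unsigned n_bits operands.

    Lower k bits: bitwise OR instead of addition (no carry chain).
    Upper bits:   exact addition, with an approximate carry-in taken from
                  the AND of bit k-1 of both operands.
    """
    low_mask = (1 << k) - 1
    low = (x | y) & low_mask
    carry_in = ((x >> (k - 1)) & (y >> (k - 1))) & 1
    high = (x >> k) + (y >> k) + carry_in
    return ((high << k) | low) & ((1 << (n_bits + 1)) - 1)


for x, y in [(3, 4), (1, 1), (8, 8), (16, 32)]:
    print(x, y, "exact:", x + y, "approx:", loa_add(x, y))
```

The result is exact whenever the lower parts share no set bits (for example 3 + 4), and deviates otherwise (1 + 1 yields 1), which is the accuracy that is traded for removing the carry chain in the lower part.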

Journal Article•DOI•

A high-performance full swing 1-bit hybrid full adder cell

Shahbaz Hussain¹, Mehedi Hasan², Mehedi Hasan³, Gazal Agrawal¹, Mohd. Hasan¹•Institutions (3)

Aligarh Muslim University¹, North South University², University of Science and Technology Chittagong³

04 Sep 2021-IET Circuits, Devices & Systems
