Showing papers on "Adder published in 2017"


Proceedings ArticleDOI
27 Mar 2017
TL;DR: The EvoApprox8b library provides Verilog, Matlab and C models of all approximate circuits and the error is given for seven different error metrics.
Abstract: Approximate circuits and approximate circuit design methodologies have attracted significant attention from researchers as well as industry in recent years. In order to accelerate the approximate circuit and system design process and to support fair benchmarking of circuit approximation methods, we propose a library of approximate adders and multipliers called EvoApprox8b. This library contains 430 non-dominated 8-bit approximate adders created from 13 conventional adders and 471 non-dominated 8-bit approximate multipliers created from 6 conventional multipliers. These implementations were evolved by multi-objective Cartesian genetic programming. The EvoApprox8b library provides Verilog, Matlab and C models of all approximate circuits. In addition to standard circuit parameters, the error is given for seven different error metrics. The EvoApprox8b library is available at: www.fit.vutbr.cz/research/groups/ehw/approxlib

241 citations
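
As a rough illustration of what error metrics for an 8-bit approximate adder look like, the sketch below exhaustively compares a hypothetical approximate adder against exact addition. The adder used here (it simply ignores the k low-order bits of each operand) is an assumption chosen for brevity and is not a circuit from the EvoApprox8b library; the function names and the three metrics shown are likewise illustrative and do not reproduce the library's seven metrics.

```python
from itertools import product


def truncated_add(a, b, k=2):
    """Hypothetical approximate adder: ignore the k low bits of each operand."""
    mask = ~((1 << k) - 1)
    return (a & mask) + (b & mask)


def error_metrics(approx, bits=8):
    """Compare approx(a, b) with a + b over every pair of unsigned inputs."""
    errors = [abs(approx(a, b) - (a + b))
              for a, b in product(range(1 << bits), repeat=2)]
    n = len(errors)
    return {
        "error_probability": sum(e != 0 for e in errors) / n,
        "mean_absolute_error": sum(errors) / n,
        "worst_case_error": max(errors),
    }


metrics = error_metrics(truncated_add)
```

Exhaustive evaluation is feasible here because an 8-bit adder has only 65,536 input pairs.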


Proceedings ArticleDOI
25 Jun 2017
TL;DR: An uncoordinated Gaussian multiple access channel with a relatively large number of active users within each block is considered, and a low complexity coding scheme is proposed, which is based on a combination of compute-and-forward and coding for a binary adder channel.
Abstract: We consider an uncoordinated Gaussian multiple access channel with a relatively large number of active users within each block. A low complexity coding scheme is proposed, which is based on a combination of compute-and-forward and coding for a binary adder channel. For a wide regime of parameters of practical interest, the energy-per-bit required by each user in the proposed scheme is significantly smaller than that required by popular solutions such as slotted-ALOHA and treating interference as noise.

216 citations


Journal ArticleDOI
TL;DR: This work presents a robust, general, scalable system, called 'Boolean logic and arithmetic through DNA excision' (BLADE), to engineer genetic circuits with multiple inputs and outputs in mammalian cells with minimal optimization.
Abstract: Engineered genetic circuits for mammalian cells often require extensive fine-tuning to perform as intended. We present a robust, general, scalable system, called 'Boolean logic and arithmetic through DNA excision' (BLADE), to engineer genetic circuits with multiple inputs and outputs in mammalian cells with minimal optimization. The reliability of BLADE arises from its reliance on recombinases under the control of a single promoter, which integrates circuit signals on a single transcriptional layer. We used BLADE to build 113 circuits in human embryonic kidney and Jurkat T cells and devised a quantitative, vector-proximity metric to evaluate their performance. Of 113 circuits analyzed, 109 functioned (96.5%) as intended without optimization. The circuits, which are available through Addgene, include a 3-input, two-output full adder; a 6-input, one-output Boolean logic look-up table; circuits with small-molecule-inducible control; and circuits that incorporate CRISPR-Cas9 to regulate endogenous genes. BLADE enables execution of sophisticated cellular computation in mammalian cells, with applications in cell and tissue engineering.

209 citations


Journal ArticleDOI
TL;DR: A review and classification are presented for the current designs of approximate arithmetic circuits, including adders, multipliers, and dividers; improvements in delay, power, and area are reported for the detection of differences in images by using approximate dividers.
Abstract: Often the most important arithmetic modules in a processor, adders, multipliers, and dividers determine the performance and energy efficiency of many computing tasks. The demand for higher speed and power efficiency, as well as the error resilience of many applications (e.g., multimedia, recognition, and data analytics), has driven the development of approximate arithmetic design. In this article, a review and classification are presented for the current designs of approximate arithmetic circuits including adders, multipliers, and dividers. A comprehensive and comparative evaluation of their error and circuit characteristics is performed for understanding the features of various designs. By using approximate multipliers and adders, the circuit for an image processing application consumes as little as 47% of the power and 36% of the power-delay product of an accurate design while achieving similar image processing quality. Improvements in delay, power, and area are obtained for the detection of differences in images by using approximate dividers.

197 citations


Journal ArticleDOI
18 Oct 2017-ACS Nano
TL;DR: The memlogic (memory logic) is proposed and demonstrated as a nonvolatile switch of logic operations integrated with memory function in a single light-gated memristor, able to achieve optical and electrical mixed basic Boolean logic of reconfigurable "AND", "OR", and "NOT" operations.
Abstract: Memristive devices are able to store and process information, which offers several key advantages over the transistor-based architectures. However, most of the two-terminal memristive devices have fixed functions once made and cannot be reconfigured for other situations. Here, we propose and demonstrate a memristive device “memlogic” (memory logic) as a nonvolatile switch of logic operations integrated with memory function in a single light-gated memristor. Based on nonvolatile light-modulated memristive switching behavior, a single memlogic cell is able to achieve optical and electrical mixed basic Boolean logic of reconfigurable “AND”, “OR”, and “NOT” operations. Furthermore, the single memlogic cell is also capable of functioning as an optical adder and digital-to-analog converter. All the memlogic outputs are memristive for in situ data storage due to the nonvolatile resistive switching and persistent photoconductivity effects. Thus, as a memdevice, the memlogic has potential for not only simplifying ...

152 citations


Journal ArticleDOI
TL;DR: A generic methodology for analytical modeling of probability of occurrence of error and the Probability Mass Function of error value in a selected class of approximate adders is presented, which can serve as performance metrics for the comparative analysis of various adders and their configurations.
Abstract: Approximate adders are being widely advocated as a means to achieve performance gain in error-resilient applications. In this paper, a generic methodology for analytical modeling of the probability of occurrence of error and the Probability Mass Function (PMF) of the error value in a selected class of approximate adders is presented, which can serve as performance metrics for the comparative analysis of various adders and their configurations. The proposed model is applicable to approximate adders that consist of sub-adder units of uniform as well as non-uniform lengths. Using a systematic methodology, we derive closed-form expressions for the probability of error for a number of state-of-the-art high-performance approximate adders. The probabilistic analysis is carried out for arbitrary input distributions. It can be used to study the dependence of error statistics in an adder’s output on its configuration and input distribution. Moreover, it is shown that by building upon the proposed error model, we can estimate the probability of error in circuits with multiple approximate adders. We also demonstrate that, using the proposed analysis, the comparative performance of different approximate adders can be correctly predicted in practical applications of image processing.

88 citations
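
To make the idea of an analytically modeled error probability concrete, the sketch below uses a deliberately simple, hypothetical adder built from two sub-adders in which the carry out of the low k bits is dropped. For independent, uniformly distributed inputs, an error occurs exactly when the low parts generate a carry, which gives the closed form (2^k - 1) / 2^(k+1). This special case and all names are assumptions made for illustration; they are not the expressions derived in the paper, which covers arbitrary input distributions.

```python
from fractions import Fraction


def segmented_add(a, b, k=4):
    """Hypothetical two-block adder: the carry out of the low k bits is dropped."""
    low_mask = (1 << k) - 1
    low = (a & low_mask) + (b & low_mask)
    high = (a >> k) + (b >> k)
    return (high << k) | (low & low_mask)


def error_probability(bits=8, k=4):
    """Exhaustive error probability for uniform, independent inputs."""
    size = 1 << bits
    wrong = sum(segmented_add(a, b, k) != a + b
                for a in range(size) for b in range(size))
    return Fraction(wrong, size * size)


def closed_form(k):
    """P(carry out of a k-bit addition) for uniform, independent inputs."""
    return Fraction((1 << k) - 1, 1 << (k + 1))
```

Comparing `error_probability(8, k)` with `closed_form(k)` for a few values of k is a quick way to check that the enumeration and the formula agree.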


Journal ArticleDOI
TL;DR: A novel 3-input XOR gate structure based on half distance and cell interaction is proposed for quantum-dot cellular automata (QCA), a technology that promises extra low-power, extremely dense and high-speed structures at a nano scale; simulation results indicate the efficiency and robustness of the proposed designs.
Abstract: Quantum-dot cellular automata (QCA), which is a candidate technology to replace CMOS technology, promises extra low-power, extremely dense and high-speed structures at a nano scale. In this paper, a novel 3-input XOR gate structure is proposed based on half distance and cell interaction. Accordingly, a low-complexity and high-speed QCA one-bit full adder is designed by employing the proposed 3-input QCA XOR gate. Then a new 4-bit QCA Ripple Carry Adder (RCA) is proposed based on the proposed 3-input QCA XOR gate. The proposed designs are simulated using both the coherence and bistable simulation engines of QCADesigner version 2.0.3. Our simulation results indicate the efficiency and robustness of the proposed designs. The simulation results show a 50% area improvement for the proposed 3-input XOR gate; 76% and 50% improvements in cell count and latency, respectively, for the proposed robust QCA full adder; and 58% and 52% improvements in latency and cost, respectively, for the 4-bit QCA RCA compared to previous designs.

82 citations
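
The logic-level structure described in this abstract (a one-bit full adder built around a 3-input XOR, chained into a 4-bit ripple carry adder) can be sketched behaviorally as follows. The use of a 3-input majority function for the carry is an assumption based on common QCA practice and is not stated in the abstract; the sketch models Boolean behavior only, not QCA cell layout or clocking.

```python
def xor3(a, b, c):
    """3-input XOR: produces the sum bit of a full adder."""
    return a ^ b ^ c


def maj3(a, b, c):
    """3-input majority: assumed here to produce the carry bit."""
    return (a & b) | (a & c) | (b & c)


def full_adder(a, b, cin):
    return xor3(a, b, cin), maj3(a, b, cin)


def ripple_carry_add(a, b, width=4):
    """Add two width-bit unsigned integers one bit at a time."""
    carry = 0
    result = 0
    for i in range(width):
        s, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        result |= s << i
    return result, carry
```

For example, `ripple_carry_add(9, 8)` yields a 4-bit result of 1 with a carry-out of 1, since 9 + 8 = 17 = 16 + 1.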


Journal ArticleDOI
TL;DR: A new low-power gate design, i.e., memristors-as-drivers gates, is proposed, which overcomes each of these issues by combining sense circuitry with the IMPLY operation.
Abstract: Memristors have recently begun to be explored in arithmetic applications. However, all prior designs for memristor-based gates have had shortcomings in terms of scalability, applicability, completeness, and performance. In this brief, a new low-power gate design, i.e., memristors-as-drivers gates, is proposed, which overcomes each of these issues by combining sense circuitry with the IMPLY operation. By sensing the values of the input memristors as the driver for the output memristor, the delay is reduced to a single step for any Boolean operation, including XOR. The area is reduced to at most three memristors for each gate, and each gate consumes only 30 fJ. An $N$-bit ripple carry adder implementation is proposed, which uses these gates to achieve a total delay of $N+1$ with an area of $8N$ memristors and their drivers. The individual bits of the proposed adder can also be pipelined, reducing the latency to four steps per addition.

81 citations


Proceedings ArticleDOI
27 Mar 2017
TL;DR: This paper proposes a stochastic-binary hybrid design which splits the computation between the stochastic and binary domains for near-sensor NN applications, and shows that retraining the binary portion of the NN computation can compensate for precision losses introduced by shorter stochastic bit-streams.
Abstract: Recent advances in neural networks (NNs) exhibit unprecedented success at transforming large, unstructured data streams into compact higher-level semantic information for tasks such as handwriting recognition, image classification, and speech recognition. Ideally, systems would employ near-sensor computation to execute these tasks at sensor endpoints to maximize data reduction and minimize data movement. However, near-sensor computing presents its own set of challenges such as operating power constraints, energy budgets, and communication bandwidth capacities. In this paper, we propose a stochastic-binary hybrid design which splits the computation between the stochastic and binary domains for near-sensor NN applications. In addition, our design uses a new stochastic adder and multiplier that are significantly more accurate than existing adders and multipliers. We also show that retraining the binary portion of the NN computation can compensate for precision losses introduced by shorter stochastic bit-streams, allowing faster run times at minimal accuracy losses. Our evaluation shows that our hybrid stochastic-binary design can achieve 9.8x energy efficiency savings, and application-level accuracies within 0.05% compared to conventional all-binary designs.

80 citations


Journal ArticleDOI
TL;DR: In this paper, the authors proposed and designed an all optical full adder based on photonic crystal, which used four nonlinear resonant cavities inside a two-dimensional photonic lattice.

79 citations


Journal ArticleDOI
TL;DR: In this article, the authors use simulations based on the stochastic Landau-Lifshitz-Gilbert (sLLG) equation to demonstrate that similar impressive functions can be performed using unstable nanomagnets with energy barriers as low as a fraction of a kT.
Abstract: It has recently been shown that a suitably interconnected network of tunable telegraphic noise generators or “p-bits” can be used to perform even precise arithmetic functions like a 32-bit adder. In this letter, we use simulations based on the stochastic Landau–Lifshitz–Gilbert (sLLG) equation to demonstrate that similar impressive functions can be performed using unstable nanomagnets with energy barriers as low as a fraction of a kT. This is surprising because the magnetization of low-barrier nanomagnets is not telegraphic with discrete values of $\pm 1$. Rather, it fluctuates randomly among all values between $-1$ and $+1$, and the output magnets are read with a thresholding device that translates all positive values to one and all negative values to zero. We present sLLG-based simulations demonstrating the operation of a 32-bit adder, with a network of several hundred nanomagnets, exhibiting a remarkably precise correlation: The input magnets {A} and {B} as well as the output magnets {S} all fluctuate randomly and yet the quantity $A+B-S$ is sharply peaked around zero! If we fix {A} and {B}, the sum magnets {S} rapidly converge to a unique state with $S=A+B$ so that the system acts as an adder. But unlike standard adders, the operation is invertible. If we fix {S} and {B}, the remaining magnets {A} converge to the difference $A=S-B$. These examples emphasize a new direction for the field of nanomagnetics away from stable high-barrier magnets toward stochastic low-barrier magnets that not only operate with lower currents, but are also more promising for continued downscaling.

Proceedings ArticleDOI
18 Jun 2017
TL;DR: In this article, the authors propose an approach to map floating-point based DNNs to 8-bit dynamic fixed-point networks with integer power-of-two weights with no change in network architecture.
Abstract: While Deep Neural Networks (DNNs) push the state-of-the-art in many machine learning applications, they often require millions of expensive floating-point operations for each input classification. This computation overhead limits the applicability of DNNs to low-power, embedded platforms and incurs high cost in data centers. This motivates recent interests in designing low-power, low-latency DNNs based on fixed-point, ternary, or even binary data precision. While recent works in this area offer promising results, they often lead to large accuracy drops when compared to the floating-point networks. We propose a novel approach to map floating-point based DNNs to 8-bit dynamic fixed-point networks with integer power-of-two weights with no change in network architecture. Our dynamic fixed-point DNNs allow different radix points between layers. During inference, power-of-two weights allow multiplications to be replaced with arithmetic shifts, while the 8-bit fixed-point representation simplifies both the buffer and adder design. In addition, we propose a hardware accelerator design to achieve low-power, low-latency inference with insignificant degradation in accuracy. Using our custom accelerator design with the CIFAR-10 and ImageNet datasets, we show that our method achieves significant power and energy savings while increasing the classification accuracy.
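
The claim that power-of-two weights let multiplications be replaced with arithmetic shifts can be illustrated with a small sketch. The quantization rule below (rounding the base-2 logarithm of the weight magnitude to the nearest integer) and both function names are assumptions for illustration; the abstract does not specify how the authors map weights.

```python
import math


def quantize_to_power_of_two(w):
    """Return (sign, exponent) so that sign * 2**exponent approximates w.

    Assumed rule: round log2(|w|) to the nearest integer.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    exponent = round(math.log2(abs(w)))
    return sign, exponent


def shift_multiply(x, sign, exponent):
    """Multiply integer x by sign * 2**exponent with a shift and a sign flip."""
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return sign * shifted


sign, exponent = quantize_to_power_of_two(0.26)   # approximated as 2**-2
product = shift_multiply(100, sign, exponent)     # 100 >> 2
```

Here the weight 0.26 is approximated by 0.25, so the shifted product is 25 where the exact product would be 26. Python's `>>` on a negative integer is an arithmetic (sign-preserving) shift, which matches the behavior the abstract refers to.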

Journal ArticleDOI
TL;DR: In this article, the authors proposed a novel design for realizing an all-optical 1-bit full adder based on photonic crystals, which was realized by cascading two optical 1-bit half adders.
Abstract: In this paper we propose a novel design for realizing an all-optical 1-bit full adder based on photonic crystals. The proposed structure was realized by cascading two optical 1-bit half adders. The final structure consists of eight optical waveguides and two nonlinear resonant rings, created inside a rod-type two-dimensional photonic crystal with a square lattice. The structure has “X”, “Y” and “Z” as input ports and “SUM” and “CARRY” as output ports. The performance and functionality of the proposed structure were validated by means of the finite-difference time-domain method.
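
At the logic level, building a full adder by cascading two half adders can be sketched as below, using the port names X, Y, Z, SUM and CARRY from the abstract. Combining the two half-adder carries with an OR is the textbook construction and is assumed here; the abstract does not describe how the optical structure merges them.

```python
def half_adder(a, b):
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def full_adder(x, y, z):
    """Full adder from two cascaded half adders; returns (SUM, CARRY)."""
    partial_sum, carry_1 = half_adder(x, y)
    total, carry_2 = half_adder(partial_sum, z)
    # The two carries can never both be 1, so OR merges them safely.
    return total, carry_1 | carry_2
```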

Journal ArticleDOI
TL;DR: In this paper, a CAD tool is built and integrated into a standard digital flow to offer a wide range of cost-accuracy tradeoffs for any conventional design, including area, power, and delay savings.
Abstract: Energy efficiency is a critical concern for many systems, ranging from Internet of things objects and mobile devices to high-performance computers. Moreover, after 40 years of prosperity, Moore’s law is starting to show its economic and technical limits. Noticing that many circuits are over-engineered and that many applications are error-resilient or require less precision than offered by the existing hardware, approximate computing has emerged as a potential solution to pursue improvements of digital circuits. In this regard, a technique to systematically trade off accuracy in exchange for area, power, and delay savings in digital circuits is proposed: gate-level pruning (GLP). A CAD tool is built and integrated into a standard digital flow to offer a wide range of cost-accuracy tradeoffs for any conventional design. The methodology is first demonstrated on adders, achieving up to 78% energy-delay-area reduction for 10% mean relative error. It is then detailed how this methodology can be applied to a more complex system composed of a multitude of arithmetic blocks and memory: the discrete cosine transform (DCT), which is a key building block for image and video processing applications. Even though arithmetic circuits represent less than 4% of the entire DCT area, it is shown that the GLP technique can lead to 21% energy-delay-area savings over the entire system for a reasonable image quality loss of 24 dB. This significant saving is achieved thanks to the pruned arithmetic circuits, which set some nodes at constant values, enabling the synthesis tool to further simplify the circuit and memory.

Journal ArticleDOI
TL;DR: This paper presents a row-based design methodology covering cell placement, clock tree synthesis, and routing steps for large SFQ circuits, with which the overall chip area can be reduced by 27% compared with the results of a conventional CMOS placement accompanied by an H-tree clock network.
Abstract: This paper presents a row-based design methodology covering cell placement, clock tree synthesis, and routing steps for large SFQ circuits. The proposed placement tool initiates by running a state-of-the-art CMOS placer, which places fixed-height but variable-width cells in rows on the chip. Cells in each row are then grouped together such that each group contains at most $k$ cells with the same logic level. Next, for clock routing, this paper proposes HL-tree, which adopts an H-tree with passive transmission line connections to distribute the clock to groups, and within each group, a linear path composed of splitters and Josephson transmission lines (JTLs) provides the clock to cells. Increasing $k$ reduces the chip area, but also may incur a performance loss. To evaluate the effectiveness of the proposed approach, place-and-route results of a 32-bit Kogge–Stone adder for different values of $k$ are reported. By using this new design methodology, the overall chip area can be reduced by 27% compared with the results of a conventional CMOS placement accompanied by an H-tree clock network.

Journal ArticleDOI
TL;DR: This paper uses individual microcontrollers to emulate p-bits, and presents results for a 4-bit ripple carry adder with 48 p-bits and a 4-bit multiplier with 46 p-bits working in inverted mode as a factorizer, a first step towards implementing p-bits with nano devices, like stochastic Magnetic Tunnel Junctions.
Abstract: The common feature of nearly all logic and memory devices is that they make use of stable units to represent 0’s and 1’s. A completely different paradigm is based on three-terminal stochastic units which could be called “p-bits”, where the output is a random telegraphic signal continuously fluctuating between 0 and 1 with a tunable mean. p-bits can be interconnected to receive weighted contributions from others in a network, and these weighted contributions can be chosen to not only solve problems of optimization and inference but also to implement precise Boolean functions in an inverted mode. This inverted operation of Boolean gates is particularly striking: They provide inputs consistent with a given output along with unique outputs to a given set of inputs. The existing demonstrations of accurate invertible logic are intriguing, but will these striking properties observed in computer simulations carry over to hardware implementations? This paper uses individual microcontrollers to emulate p-bits, and we present results for a 4-bit ripple carry adder with 48 p-bits and a 4-bit multiplier with 46 p-bits working in inverted mode as a factorizer. Our results constitute a first step towards implementing p-bits with nano devices, like stochastic Magnetic Tunnel Junctions.

Journal ArticleDOI
TL;DR: Key insights are provided into the role of noise mechanisms in size homeostasis, and an inextricable link between timer-based models of size control and heavy-tailed cell-size distributions is suggested.

Journal ArticleDOI
TL;DR: In this article, an alternative for the circuit realization of analog fractional-order differentiators and integrators without using ladder networks is obtained by a mathematical manipulation of a rational function, similar to that reported for the synthesis of variable-state filters.
Abstract: In this work, we propose an alternative for the circuit realization of analog fractional-order differentiators and integrators without using ladder networks. This alternative is obtained by a mathematical manipulation of a rational function, similar to that reported for the synthesis of variable-state filters. The advantage of the proposed implementation is that it requires only simple analog design blocks, such as integrators (of integer order), differential amplifiers and two-input adder amplifiers. Most importantly, contrary to other reported solutions, the proposed realization can be fulfilled using commercially available resistors and capacitors, with a reduced number of calculations, and without negative impedance converters or inductors. In addition, the orders of the fractional derivative and integral can be modified just by varying the gain of the differential amplifiers and adders. To validate the proposed implementation, and as an example of application, we present simulations (HSPICE, MATLAB) and experimental results of a first-order plus dead time plant controlled by fractional-order PI and PID controllers. The experimental results were obtained from a realization using field-programmable analog arrays. A comparative analysis highlights that the proposed implementation has advantages over a Cauer-network-based realization in terms of the number of active and passive elements, the number of passive elements with values that are not commercially available, and design complexity.

Journal ArticleDOI
TL;DR: An efficient and flexible dual-field ECC processor which can support arbitrary elliptic curve standards and algorithms using the hardware–software approach is presented.
Abstract: Elliptic curve cryptography (ECC) has been widely used for digital signatures to ensure security in communication. It is important for the ECC processor to support a variety of ECC standards to be compatible with different security applications. Thus, a flexible processor which can support different standards and algorithms is desired. In this paper, an efficient and flexible dual-field ECC processor using the hardware–software approach is presented. The proposed processor can support arbitrary elliptic curves. An elaborate modular arithmetic logic unit is designed. It can perform basic modular arithmetic operations and achieve high efficiency. Based on our designed instruction set, the processor can be programmed to perform various point operations based on different algorithms. To demonstrate the flexibility of our processor, a point multiplication algorithm with power analysis resistance is adopted. Our design is implemented on a field-programmable gate array platform and also as an application-specific integrated circuit. When implemented in a 55 nm CMOS process, the processor takes between 0.60 ms (163-bit ECC) and 6.75 ms (571-bit ECC) to finish one point multiplication. Compared to other related works, the merits of our ECC processor are its high hardware efficiency and flexibility.

Journal ArticleDOI
TL;DR: In this paper, the authors exploit different adder compressor structures in the SAD hardware architecture and synthesize a SAD architecture using an 8-2 compressor composed of 4-2 compressors, with a Kogge-Stone adder in the recombination line.
Abstract: Sum of absolute differences (SAD) calculation is one of the most time-consuming operations of video encoders compatible with the high efficiency video coding standard. SAD hardware architectures employ an adder tree to accumulate the coefficients from the absolute difference between two video blocks. This paper exploits different adder compressor structures in the SAD hardware architecture. The architectures were synthesized to 45-nm CMOS standard cells. Synthesis results show that the SAD architecture using an 8–2 compressor composed of 4–2 compressors and a Kogge–Stone adder in the recombination line reduces power dissipation by 25.5% on average when compared with the SAD architecture using conventional adders from a state-of-the-art synthesis tool. Our throughput analysis shows that the designed SAD units are capable of encoding full HD ($1920\times 1080$) videos in real time at 30 frames/s.
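
Two ingredients named in this abstract can be sketched behaviorally: the SAD computation itself and a 4-2 compressor. The compressor below uses the textbook construction from two cascaded full adders, which is an assumption; the abstract does not give the authors' gate-level structure, and the function names are illustrative.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized 2-D blocks."""
    return sum(abs(p - q)
               for row_a, row_b in zip(block_a, block_b)
               for p, q in zip(row_a, row_b))


def full_adder(a, b, c):
    """Return (sum, carry) for three input bits."""
    return a ^ b ^ c, (a & b) | (a & c) | (b & c)


def compressor_4_2(x1, x2, x3, x4, cin):
    """4-2 compressor built from two full adders (textbook form).

    Satisfies x1 + x2 + x3 + x4 + cin == total + 2 * (carry + cout).
    """
    s1, cout = full_adder(x1, x2, x3)
    total, carry = full_adder(s1, x4, cin)
    return total, carry, cout
```

Note that `cout` depends only on `x1`, `x2` and `x3`, not on `cin`, so a row of such compressors does not form a long carry chain.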

Posted Content
TL;DR: In this article, the authors propose a stochastic-binary hybrid neural network architecture for near-sensor NN applications, which splits the computation between the stochastic and binary domains.
Abstract: Recent advances in neural networks (NNs) exhibit unprecedented success at transforming large, unstructured data streams into compact higher-level semantic information for tasks such as handwriting recognition, image classification, and speech recognition. Ideally, systems would employ near-sensor computation to execute these tasks at sensor endpoints to maximize data reduction and minimize data movement. However, near-sensor computing presents its own set of challenges such as operating power constraints, energy budgets, and communication bandwidth capacities. In this paper, we propose a stochastic-binary hybrid design which splits the computation between the stochastic and binary domains for near-sensor NN applications. In addition, our design uses a new stochastic adder and multiplier that are significantly more accurate than existing adders and multipliers. We also show that retraining the binary portion of the NN computation can compensate for precision losses introduced by shorter stochastic bit-streams, allowing faster run times at minimal accuracy losses. Our evaluation shows that our hybrid stochastic-binary design can achieve 9.8x energy efficiency savings, and application-level accuracies within 0.05% compared to conventional all-binary designs.

Journal ArticleDOI
TL;DR: A body-coupled communication (BCC) transceiver (TRX) is presented that mitigates all the practical impairments of the body channel at once, alleviating the impacts of variable ground effect and variable skin-electrode contact impedance, which have been the two major issues in BCC.
Abstract: This paper presents a body-coupled communication (BCC) transceiver (TRX) that mitigates all the practical impairments of the body channel at once. The proposed pseudo orthogonal frequency-division multiplexing (P-OFDM) TRX combines baseband BPSK–OFDM with frequency-shift keying (FSK) to alleviate the impacts of variable ground effect and variable skin-electrode contact impedance, which have been the two major issues in BCC. It can tolerate up to 20 dB of channel gain variation with a measured bit error rate improvement of >70% compared to FSK modulation alone. The RC relaxed contact impedance monitor continuously monitors and compensates the variable skin-electrode contact impedance at both the transmitter (TX) and receiver (RX). The proposed power-gated 8-point inverse fast Fourier transform/fast Fourier transform with no floating-point multipliers (FPMs) reduces the gate count and power by 54% and 30%, respectively, compared to conventional FPMs. Additionally, the simple floating-point adder (FPA) reduces the gate count and energy consumption by 34% and 20%, respectively, compared to conventional FPAs. A high-input-impedance, glitch-free FSK demodulation RX with a variable threshold limiter and all-digital cycle correction is also proposed to support a scalable data rate (200 Kbps–2 Mbps). The 0.54 mm² TRX in 65-nm CMOS consumes 1.1 mW.

Journal ArticleDOI
TL;DR: This paper designs and simulates a full adder/subtractor with a minimum number of cells and minimal complexity in three layers, based on quantum-dot cellular automata technology.
Abstract: Nowadays, quantum-dot cellular automata (QCA) is one of the paramount modern technologies for designing logical structures at the nano-scale. This technology is used at the molecular level and is based on QCA cells. High-speed data transfer and low power consumption are the advantages of this technology. In this paper, we design and simulate a full adder/subtractor with a minimum number of cells and minimal complexity in three layers. The QCADesigner software has been used to simulate the proposed design.

Journal ArticleDOI
TL;DR: Magnetic tunnel junction (MTJ) devices are leveraged to develop a novel full adder (FA) based on 3- and 5-input majority gates, with the spin Hall effect (SHE) used to change the MTJ states, resulting in low-energy switching behavior.
Abstract: Magnetic tunnel junction (MTJ)-based devices have been studied extensively as a promising candidate to implement hybrid energy-efficient computing circuits due to their nonvolatility, high integration density, and CMOS compatibility. In this paper, MTJs are leveraged to develop a novel full adder (FA) based on 3- and 5-input majority gates. The spin Hall effect (SHE) is utilized for changing the MTJ states, resulting in low-energy switching behavior. SHE-MTJ devices are modeled in Verilog-A using precise physical equations. A SPICE circuit simulator is used to validate the functionality of the 1-bit SHE-based FA. The simulation results show 76% and 32% improvement over a previous voltage-mode MTJ-based FA in terms of energy consumption and device count, respectively. The concatenability of our proposed 1-bit SHE-FA is investigated by developing a 4-bit SHE-FA. Finally, the delay and power consumption of an $n$-bit SHE-based adder have been formulated to provide a basis for developing an energy-efficient SHE-based $n$-bit arithmetic logic unit.
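
A full adder built from 3- and 5-input majority gates can be sketched at the logic level as follows. The wiring used here (carry from a 3-input majority; sum from a 5-input majority that receives the inverted carry twice) is a well-known majority-logic construction and is assumed for illustration; the abstract does not spell out the exact wiring of the SHE-MTJ circuit.

```python
def majority(*bits):
    """Majority vote over an odd number of input bits."""
    return int(sum(bits) > len(bits) // 2)


def majority_full_adder(a, b, cin):
    """Full adder using one 3-input and one 5-input majority gate."""
    carry = majority(a, b, cin)
    not_carry = 1 - carry
    # Feeding the inverted carry twice makes the 5-input vote equal a XOR b XOR cin.
    total = majority(a, b, cin, not_carry, not_carry)
    return total, carry
```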

Journal ArticleDOI
Craig Gidney
TL;DR: The number of T gates needed for an n-bit adder is reduced from 8n + O(1) to 4n + O(1) via a temporary logical-AND construction; also presented are an n-bit controlled adder circuit with T-count of 8n + O(1), a temporary adder that can be computed for the same cost as the normal adder but whose result can be kept until it is later uncomputed without using T gates, and some other constructions whose T-count is improved by the temporary logical-AND.
Abstract: We improve the number of T gates needed to perform an n-bit adder from 8n + O(1) to 4n + O(1). We do so via a "temporary logical-AND" construction which uses four T gates to store the logical-AND of two qubits into an ancilla and zero T gates to later erase the ancilla. This construction is equivalent to one by Jones, except that our framing makes it clear that the technique is far more widely applicable than previously realized. Temporary logical-ANDs can be applied to integer arithmetic, modular arithmetic, rotation synthesis, the quantum Fourier transform, Shor's algorithm, Grover oracles, and many other circuits. Because T gates dominate the cost of quantum computation based on the surface code, and temporary logical-ANDs are widely applicable, this represents a significant reduction in projected costs of quantum computation. In addition to our n-bit adder, we present an n-bit controlled adder circuit with T-count of 8n + O(1), a temporary adder that can be computed for the same cost as the normal adder but whose result can be kept until it is later uncomputed without using T gates, and discuss some other constructions whose T-count is improved by the temporary logical-AND.

Journal ArticleDOI
TL;DR: This paper presents a new design of three-valued logic gates based on carbon nanotube transistors, and a comparison with existing circuits indicates that the proposed design outperforms them in terms of power and delay.

Journal ArticleDOI
TL;DR: Two new designs to implement a ternary half adder using carbon nanotube field-effect transistors (CNFETs) show delay and power advantages of up to 40% and 39% with a lower transistor count, so use of these half adders in complex arithmetic circuits will be advantageous.
Abstract: Ternary logic is a promising alternative to conventional binary logic in VLSI design, as it provides the advantages of reduced interconnects, higher operating speeds, and smaller chip area. This paper presents a pair of circuits for implementing a ternary half adder using carbon nanotube field-effect transistors. The proposed designs combine futuristic ternary and conventional binary logic design approaches. One of the proposed circuits, a ternary-to-binary decoder, simplifies further circuit implementation and provides excellent delay and power advantages in data path circuits such as adders. These circuits have been extensively simulated using HSPICE to obtain power, delay, and power-delay product. The circuit performances are compared with alternative designs reported in recent literature. One of the proposed ternary adders demonstrates power and power-delay product improvements of up to 63% and 66%, respectively, with a lower transistor count. So, the use of these half adders in complex arithmetic circuits will be advantageous.
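As a reference for what a ternary half adder computes, the following behavioural sketch treats each input as a trit in {0, 1, 2} and returns the sum trit and the carry trit. It models only the logic function, not the CNFET circuits or the decoder described above, and the name `ternary_half_adder` is hypothetical.

```python
def ternary_half_adder(a, b):
    """Return (sum, carry) for two trits, so that a + b == sum + 3 * carry."""
    if a not in (0, 1, 2) or b not in (0, 1, 2):
        raise ValueError("inputs must be trits in {0, 1, 2}")
    total = a + b
    return total % 3, total // 3   # carry can only be 0 or 1 for two trits


# Print the full 3 x 3 truth table.
for a in range(3):
    for b in range(3):
        s, c = ternary_half_adder(a, b)
        print(f"a={a} b={b} -> sum={s} carry={c}")
```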

Journal ArticleDOI
TL;DR: The comparison of simulation results of all the filters shows that the FIR filter with the WT multiplier is the best optimised filter.
Abstract: This study presents the design and implementation of a low-power, high-speed 16th-order FIR filter. To optimise filter area, delay and power, different multiplication techniques, namely the Vedic multiplier, the add and shift method and the Wallace tree (WT) multiplier, are used to multiply the filter coefficients with the filter input. Various adders, including the ripple carry adder, Kogge Stone adder, Brent Kung adder, Ladner Fischer adder and Han Carlson adder, are analysed for optimum performance before being used, along with a barrel shifter, in the multiplication techniques. Secondly, filter area and delay are optimised by using the add and shift method for multiplication, although this increases the power dissipation of the filter. To reduce the complexity of the filter, coefficients are represented in canonical signed digit representation, as it is more efficient than traditional binary representation. The finite impulse-response (FIR) filter is designed in MATLAB using the equiripple method, and the same filter is synthesised on a Xilinx Spartan 3E XC3S500E target field-programmable gate array device using Very High Speed Integrated Circuit Hardware Description Language (VHDL). Subsequently, the total on-chip power is calculated in Vivado 2014.4. The comparison of simulation results of all the filters shows that the FIR filter with the WT multiplier is the best optimised filter.
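Canonical signed digit (CSD) representation writes a coefficient with digits in {-1, 0, +1} such that no two adjacent digits are nonzero, which reduces the number of add or subtract steps in an add and shift multiplier. The sketch below shows one standard way to derive the digits for a non-negative integer coefficient and to multiply with them. It is an illustration under that assumption, not the VHDL implementation from the study, and the names `to_csd` and `csd_multiply` are hypothetical.

```python
def to_csd(value):
    """Return the CSD digits of a non-negative integer, least significant first."""
    if value < 0:
        raise ValueError("this sketch handles non-negative coefficients only")
    digits = []
    while value != 0:
        if value & 1:
            digit = 2 - (value & 3)   # +1 if value % 4 == 1, -1 if value % 4 == 3
            value -= digit            # remainder is now divisible by 4 or more
        else:
            digit = 0
        digits.append(digit)
        value >>= 1
    return digits


def csd_multiply(x, coefficient):
    """Multiply x by a coefficient using only shifts, adds and subtracts."""
    acc = 0
    for shift, digit in enumerate(to_csd(coefficient)):
        if digit == 1:
            acc += x << shift
        elif digit == -1:
            acc -= x << shift
    return acc
```

For example, the coefficient 7 is 111 in binary (three nonzero digits) but 8 - 1 in CSD (two nonzero digits), so one addition is saved.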

Journal ArticleDOI
TL;DR: Two QCA full adder architectures are presented and evaluated: a new and efficient 1-bit QCA full adder architecture and a 4-bit QCA ripple carry adder (RCA) architecture that outperform most results so far in the literature.
Abstract: Quantum-dot cellular automata (QCA) is a new and promising computation paradigm, which can be a viable replacement for the complementary metal–oxide–semiconductor technology at nano-scale level. This technology provides a possible solution for improving the computation in various computational applications. Two QCA full adder architectures are presented and evaluated: a new and efficient 1-bit QCA full adder architecture and a 4-bit QCA ripple carry adder (RCA) architecture. The proposed architectures are simulated using QCADesigner tool version 2.0.1. These architectures are implemented with the coplanar crossover approach. The simulation results show that the proposed 1-bit QCA full adder and 4-bit QCA RCA architectures utilise 33 and 175 QCA cells, respectively. Our simulation results show that the proposed architectures outperform most results so far in the literature.

Journal ArticleDOI
01 Aug 2017-Optik
TL;DR: A novel reversible full adder-subtractor circuit based on QCA is proposed which improves the cell count, area and total energy dissipation by almost 45%, 50% and 48%, respectively, as compared to the existing QCA-based single-layer and multilayer reversible full adders.