
Showing papers on "Adder published in 2022"



Proceedings ArticleDOI
17 Feb 2022
TL;DR: MATCHA accelerates TFHE gates using approximate multiplication-less integer FFTs and IFFTs, and uses a pipelined datapath to improve energy efficiency.
Abstract: Fully Homomorphic Encryption over the Torus (TFHE) allows arbitrary computations to happen directly on ciphertexts using homomorphic logic gates. However, each TFHE gate on state-of-the-art hardware platforms such as GPUs and FPGAs is extremely slow (> 0.2ms). Moreover, even the latest FPGA-based TFHE accelerator cannot achieve high energy efficiency, since it frequently invokes expensive double-precision floating point FFT and IFFT kernels. In this paper, we propose a fast and energy-efficient accelerator, MATCHA, to process TFHE gates. MATCHA supports aggressive bootstrapping key unrolling to accelerate TFHE gates without decryption errors by approximate multiplication-less integer FFTs and IFFTs, and a pipelined datapath. Compared to prior accelerators, MATCHA improves the TFHE gate processing throughput by 2.3x, and the throughput per Watt by 6.3x.

19 citations


Journal ArticleDOI
TL;DR: In this article, the three logic processing levels based on complementary photonic crystal logic devices are presented and demonstrated through photonic integrated circuit modeling, with logic circuits including AND, OR, NAND, NOR, XOR, FAN-OUT, HALF ADDER, and FULL ADDER.
Abstract: This paper presents and demonstrates the three logic processing levels based on complementary photonic crystal logic devices through photonic integrated circuit modeling. We accomplished a set of logic circuits including AND, OR, NAND, NOR, XOR, FAN-OUT, HALF ADDER, and FULL ADDER based on photonic crystal slab platforms. Furthermore, we achieved efficient all-optical logic circuits with contrast ratios as high as 5.5 dB, demonstrated in our simulation results, guaranteeing well-defined output power values for logic representations; a clock-rate up to 2 GHz; and an operating wavelength at λ ≈ 1550 nm. Thus, we can now switch up for high computing abstraction levels to build photonic integrated circuits rather than isolated gates or devices.

16 citations



Journal ArticleDOI
TL;DR: In this paper, the authors compared the delay and energy performance metrics of fin-shaped FET (FinFET) and negative capacitance FinFET based devices and circuits designed on the same technology node.

16 citations


Journal ArticleDOI
01 Jul 2022
TL;DR: In this article, the authors proposed a multifunctional all-optical nanostructure providing ultra-fast and ultra-compact logic gates with a maximum time delay of approximately 280 fs, an area of 104 μm2, and a bit rate of 3.57 Tb/s.
Abstract: All-optical multifunctional structures are at the beginning of the growth and development path to achieve an all-optical integrated circuit (AOIC). The main challenge of designing multifunctional all-optical nanostructures is to maintain the overall function of the structure compared to single-function nanostructures. Simultaneous application of logic gates of AND, XOR, half-adder, 1-bit comparator, reversible Feynman logic gate along with desirable and suitable function are some of the proposed nanostructure properties. The propagation modes in the structure are extracted by the plane wave expansion (PWE) method. The overall operation of the nanostructure, simulation, and numerical analysis of the proposed nanostructure for the submitted applications are performed using the numerical method finite-difference time-domain (FDTD). In the proposed multifunctional nanostructure, ultra-fast and ultra-compact logic gates with a maximum time delay of approximately 280 fs, an area of 104 μm2, and a bit rate of 3.57 Tb/s are provided. In addition to the proposed ultra-compact logic gate, another advantage of the proposed multifunctional nanostructure is achieving the appropriate contrast ratio.

14 citations
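
The logic functions named in the abstract above can be written down at the Boolean level. The following is a minimal sketch of the truth functions only, not of the optical structure; the function names are illustrative. The last helper rests on an assumption that the abstract does not state: that the quoted bit rate is the reciprocal of the maximum time delay.

```python
def half_adder(a: int, b: int) -> tuple[int, int]:
    """Return (sum, carry) for two input bits."""
    return a ^ b, a & b


def comparator_1bit(a: int, b: int) -> tuple[int, int, int]:
    """Return the flags (a < b, a == b, a > b) as bits."""
    return int(a < b), int(a == b), int(a > b)


def feynman_gate(a: int, b: int) -> tuple[int, int]:
    """Reversible Feynman (controlled-NOT) gate: P = A, Q = A XOR B."""
    return a, a ^ b


def bit_rate_from_delay(delay_s: float) -> float:
    """Assumption: bit rate in bit/s is taken as 1 / (maximum delay in s)."""
    return 1.0 / delay_s


if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, half_adder(a, b), comparator_1bit(a, b), feynman_gate(a, b))
    # With the 280 fs delay quoted above, this prints a value in Tb/s.
    print(f"{bit_rate_from_delay(280e-15) / 1e12:.2f} Tb/s")
```

Under that assumption, a 280 fs delay gives about 3.57 Tb/s, which agrees with the figure quoted in the abstract.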


Proceedings ArticleDOI
06 Apr 2022
TL;DR: It is proved that the space and time complexity of verifying a structurally complex multiplier using a word-level verification method is always exponential and the polynomial formal verification of the complex multiplier becomes possible if the correctness of each stage is verified using the proper verification method.
Abstract: With the growing demands for highly area-efficient, delay-optimized, and low-power designs, the complexity of digital circuits is increasing as well. Especially, a wide variety of arithmetic circuits, including different types of adders, multipliers, and dividers have been proposed to meet the demands in applications such as cryptography and Artificial Intelligence (AI). Some of these arithmetic circuits have highly parallel architectures and contain millions of gates; as a result, they are extremely error-prone. In the last 30 years, several formal verification methods have been proposed to verify arithmetic circuits. These methods report very good results when it comes to the verification of adders and structurally simple multipliers. Moreover, their space and time complexities are polynomial, i.e., they are scalable. However, when it comes to the verification of structurally complex multipliers, the story is different. In this paper, we investigate the space and time complexity of verifying a structurally complex multiplier using a word-level verification method. We prove that the space and time complexity is always exponential. Then, we introduce a new verification strategy that takes advantage of several verification engines. We show that the polynomial formal verification of the complex multiplier becomes possible if the correctness of each stage is verified using the proper verification method. Our verification strategy can be applied to other complex digital circuits.

13 citations


Journal ArticleDOI
TL;DR: In this paper, a new type of domain wall device is presented with multistates driven by nonvolatile spin-orbit torque (SOT) and Dzyaloshinskii-Moriya interaction, enabling time- and energy-efficient IMC with a full adder (FA) implementation based on magnetic tunnel junctions.
Abstract: Emerging in‐memory computing (IMC) technology promises to tackle the memory wall bottleneck in modern systems. Promoted as a promising building block, nonvolatile spin–orbit torque (SOT) memory devices with sub‐ns and sub‐pJ processing capabilities are thereby extensively pursued. Herein, a new type of domain wall device is experimentally presented with multistates driven by nonvolatile SOT and Dzyaloshinskii–Moriya interaction, enabling time and energy‐efficient IMC with a full adder (FA) implementation based on magnetic tunnel junctions. Complementary micromagnetic and device–circuit cosimulation results show that the write/read latency of the proposed FA can be shortened to 1.25 ns/0.22 ns with an averaged writing energy of 8.41 fJ bit−1, and the overall dynamic power is 26.25 μW, which is 4.43–51.96 times lower than state‐of‐the‐art alternatives. Moreover, the developed architecture can perform all 16 Boolean logic functions, warranting an extensive arithmetic operation. The experimental, micromagnetic, and circuit‐level simulation results show great potential in both fundamental research and new trajectories in technology development for nonvolatile in‐memory computing applications.

12 citations


Journal ArticleDOI
TL;DR: This paper presents a series of multi-stage hybrid memristor-CMOS ternary combinational logic stages that are optimized for reducing silicon area occupation, and shows an improvement in data density that also accounts for intermediate voltage buffering to alleviate the memristive loading problem.
Abstract: This paper presents a series of multi-stage hybrid memristor-CMOS ternary combinational logic stages that are optimized for reducing silicon area occupation. Prior demonstrations of memristive logic are typically constrained to single-stage logic due to the variety of challenges that affect device performance. Noise accumulation across subsequent stages can be amortized by integrating ternary logic gates, thus enabling higher density data transmission, where more complex computation can take place within a smaller number of stages when compared to single-bit computation. We present the design of a ternary half adder, a ternary full adder, a ternary multiplier, and a ternary magnitude comparator. These designs are simulated in SPICE using the broadly accessible Knowm memristor model, and we perform experimental validation of individual stages using an in-house fabricated Si-doped HfOx memristor which exhibits low cycle-to-cycle variation, and thus contributes to robust long-term performance. We ultimately show an improvement in data density in each logic block of between $5.2\times$ and $17.3\times$, which also accounts for intermediate voltage buffering to alleviate the memristive loading problem.

12 citations
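
As a purely arithmetic illustration of the ternary half adder and full adder mentioned above, the sketch below assumes unbalanced ternary digits (trits in {0, 1, 2}); the abstract does not say which ternary encoding the circuits use, and the function names are hypothetical. It models digit-level behaviour only, not the memristor-CMOS stages.

```python
def ternary_half_adder(a: int, b: int) -> tuple[int, int]:
    """Add two trits; return (sum_trit, carry_trit)."""
    total = a + b
    return total % 3, total // 3


def ternary_full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Add two trits plus an incoming carry; return (sum_trit, carry_trit)."""
    total = a + b + carry_in
    return total % 3, total // 3


def add_ternary(x_trits: list[int], y_trits: list[int]) -> list[int]:
    """Add two equal-length little-endian trit lists, chaining full adders."""
    result, carry = [], 0
    for a, b in zip(x_trits, y_trits):
        sum_trit, carry = ternary_full_adder(a, b, carry)
        result.append(sum_trit)
    result.append(carry)  # final carry becomes the most significant trit
    return result


def trits_to_int(trits: list[int]) -> int:
    """Convert a little-endian trit list to an integer."""
    return sum(t * 3**i for i, t in enumerate(trits))


if __name__ == "__main__":
    x, y = [2, 1], [2, 2]  # 5 and 8 in little-endian ternary
    total = add_ternary(x, y)
    print(total, trits_to_int(total))
```

Each trit carries log2(3), roughly 1.585, bits of information, which is the general reason a ternary datapath needs fewer digits than a binary one for the same numeric range.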


Journal ArticleDOI
TL;DR: Four approximate multipliers based on ML are proposed to offer various accuracy requirements for different applications and show superior performance in power and area for emerging nanotechnologies.
Abstract: Approximate computing at the nanoscale provides sufficiently accurate and often adaptive results to improve hardware efficiency for error-tolerant applications. Differently from conventional Boolean logic-based designs, many emerging nanotechnologies extensively assemble circuits using the voter-based majority logic (ML). In this letter, we investigate designs of approximate radix-4 Booth multipliers based on ML. Initially, we propose two new radix-4 Booth partial product (PP) generation methods by exploiting the characteristics of ML. Based on these methods, approximate PP generators are designed to produce single-sided or double-sided errors. The PPs are then reduced by using the features of errors to construct approximate multipliers. Specifically, complementary strategies guided by an analysis of error effects are developed to compensate for the accuracy loss and to reduce the hardware overhead during the PP reduction. The reduced PPs are then compressed by using full adders. Four approximate multipliers are proposed to offer various accuracy requirements for different applications. These designs show superior performance in power and area for emerging nanotechnologies. As case studies, image processing, a multiple-layer perceptron and a multi-task convolutional neural network are presented to show the validity and advantages of the proposed designs.

11 citations
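
For readers unfamiliar with radix-4 Booth partial product generation, the following sketch shows the standard exact recoding in ordinary integer arithmetic. It is not the majority-logic or approximate design from the abstract; the function names and the even-width restriction are choices made for this illustration.

```python
def booth_radix4_digits(multiplier: int, bits: int) -> list[int]:
    """Recode a `bits`-wide two's-complement multiplier into radix-4 Booth
    digits in {-2, -1, 0, 1, 2}, least significant digit first."""
    if bits % 2:
        raise ValueError("bits must be even for radix-4 recoding")
    y = multiplier & ((1 << bits) - 1)  # two's-complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, bits, 2):
        b0 = (y >> i) & 1
        b1 = (y >> (i + 1)) & 1
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, bits: int) -> int:
    """Multiply by summing one partial product per Booth digit."""
    digits = booth_radix4_digits(multiplier, bits)
    partial_products = [d * multiplicand * 4**i for i, d in enumerate(digits)]
    return sum(partial_products)


if __name__ == "__main__":
    print(booth_radix4_digits(7, 4))
    print(booth_multiply(5, -3, 4))
```

The recoding produces bits/2 partial products instead of bits, which is why the reduction stage that follows (compression with full adders in the abstract) has fewer rows to handle.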


Journal ArticleDOI
TL;DR: A generic design methodology for implementing FPGA-based application-specific approximate arithmetic operators is presented; for the benchmark applications, it yields more non-dominated approximate multipliers with better hypervolume contribution than state-of-the-art designs.
Abstract: Approximate arithmetic operators, such as adders and multipliers, are increasingly used to satisfy the energy and performance requirements of resource-constrained embedded systems. However, most of the available approximate operators have an application-agnostic design methodology, and the efficacy of these operators can only be evaluated by employing them in the applications. Furthermore, the various available libraries of approximate operators do not share any standard approximation-induction policy to design new operators according to an application’s accuracy and performance constraints. These limitations also hinder the utilization of machine learning models to explore and determine approximate operators according to an application’s requirements. In this work, we present a generic design methodology for implementing FPGA-based application-specific approximate arithmetic operators. Our proposed technique utilizes lookup tables and carry-chains of FPGAs to implement approximate operators according to the input configurations. For instance, for an \( \text{M}\times \text{N} \) accurate multiplier utilizing K lookup tables, our methodology utilizes K-bit configurations to design \( 2^K \) approximate multipliers. We then utilize various machine learning models to evaluate and select configurations satisfying application accuracy and performance constraints. We have evaluated our proposed methodology for three benchmark applications, i.e., biomedical signal processing, image processing, and ANNs. We report more non-dominated approximate multipliers with better hypervolume contribution than state-of-the-art designs for these benchmark applications with the proposed design methodology.

Journal ArticleDOI
TL;DR: An approximate carry select adder (CSLA) with reverse carry propagation (RCSLA) is presented in this article, where three types of implementations of the reverse carry propagate full adder (RCPFA) were designed based on the design parameters.
Abstract: An approximate carry select adder (CSLA) with reverse carry propagation (RCSLA) is presented in this work. The RCSLA is designed with a reverse carry propagate full adder (RCPFA). In the RCPFA structure, the carry signal propagates in the reverse direction, that is, from the MSB part to the LSB part, so that the carry input has greater importance than the output carry. Three types of RCPFA implementations were designed based on the design parameters. This method was applied to RCA and CSLA to design other types of approximate adders. The designs and simulations were done in the CADENCE software tool with 45 nm CMOS technology. The design parameters of the three CSLA implementations with RCPFA are compared with those of existing CSLA adders.

Journal ArticleDOI
TL;DR: This work develops a compact-yet-efficient architecture using cooperative strand displacement reactions (cSDRs) to construct a DNA full adder, providing the potential for application-specific circuit customization for scalable digital computing with minimal reactions.
Abstract: DNA logic circuits are based on DNA molecular programming that implements specific algorithms using dynamic reaction networks. Particularly, DNA adder circuits are key building blocks for performing digital computation. Nevertheless, existing circuit architectures are limited in scalability for implementing multi-bit adders due to the number of required gates and strands. Here, we develop a compact-yet-efficient architecture using cooperative strand displacement reactions (cSDRs) to construct a DNA full adder. By exploiting a parity-check algorithm, double-logic XOR-AND gates are constructed with a single set of double-stranded molecules. A one-bit full adder is implemented with three gates containing 13 strands, with up to 90% reduction in strand complexity compared to conventional circuit designs. Using this architecture and a transmitter on magnetic beads, we demonstrate a DNA implementation of a 6-bit adder on a scale comparable to that of a classic electronic full adder chip, providing the potential for application-specific circuit customization for scalable digital computing with minimal reactions.

Proceedings ArticleDOI
12 Jun 2022
TL;DR: This paper details proposed solutions to the new challenges of SRAM-based digital compute-in-memory, including scalable architectures that support various neural network topologies, and presents measurement results for an SRAM-based 64x64 CIM manufactured in a 12nm CMOS process.
Abstract: Recently, SRAM-based digital compute-in-memory (D-CIM) [1] has demonstrated excellent energy/area efficiency. With full precision of 4b/8b integer multiply-accumulate operations, it has better programmability, hardware reuse, and scalability; in addition, it can effectively leverage technology scaling for better PPA. Nonetheless, several new challenges remain, including huge peak currents resulting from highly parallel operation, long delays in adder trees, and scalable architectures that support various neural network topologies. In this paper, we detail proposed solutions to address the new challenges and present measurement results for an SRAM-based 64x64 CIM manufactured in a 12nm CMOS process.



Journal ArticleDOI
TL;DR: In this paper, a low-power 1-bit full-adder (FA) cell is proposed based on the transmission gate (TG) to attain a special module for generating full-swing carry output, which benefits from the high driving capability for both Sum and Carry outputs when embedded in multistage structures like ripple-carry adders (RCAs), compressors, and multipliers.
Abstract: In this letter, a low-power 1-bit full-adder (FA) cell is proposed based on the transmission gate (TG) to attain a special module for generating full-swing Carry output. The cell benefits from the high driving capability for both Sum and Carry outputs when embedded in multistage structures like ripple-carry adders (RCAs), compressors, and multipliers. The proposed TG-based FA has a total die area of 60.02 $\mu\text{m}^{2}$, while the average power, delay, and power-delay-product (PDP) are 10.829 $\mu\text{W}$, 3.1954 ns, and 34.603 fJ, respectively. The results introduce the FA cell as an efficient gate for integrated circuits (ICs).
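
The three figures quoted above are related by the usual definition of the power-delay product, PDP = average power × delay. The short sketch below checks that arithmetic with the abstract's own numbers; the function name is illustrative.

```python
def power_delay_product(avg_power_w: float, delay_s: float) -> float:
    """Power-delay product in joules: average power (W) times delay (s)."""
    return avg_power_w * delay_s


# Values quoted in the abstract: 10.829 uW average power, 3.1954 ns delay.
pdp_joules = power_delay_product(10.829e-6, 3.1954e-9)

if __name__ == "__main__":
    # Convert joules to femtojoules for comparison with the quoted PDP.
    print(f"PDP = {pdp_joules * 1e15:.3f} fJ")
```

The product comes to about 34.603 fJ, matching the PDP reported in the abstract.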

Journal ArticleDOI
16 Feb 2022-IT
TL;DR: This article extends existing compilation techniques for the Programmable Logic in-Memory (PLiM) computer architecture by adapting state-of-the-art approximate computing techniques for arithmetic circuits, generating approximate circuits with Cartesian Genetic Programming and evaluating them with a Symbolic Computer Algebra-based technique with respect to error metrics.
Abstract: With ReRAM being a non-volatile memory technology, which features low power consumption, high scalability and allows for in-memory computing, it is a promising candidate for future computer architectures. Approximate computing is a design paradigm, which aims at reducing the complexity of hardware by trading off accuracy for area and/or delay. In this article, we introduce approximate computing techniques to in-memory computing. We extend existing compilation techniques for the Programmable Logic in-Memory (PLiM) computer architecture, by adapting state-of-the-art approximate computing techniques for arithmetic circuits. We use Cartesian Genetic Programming for the generation of approximate circuits and evaluate them using a Symbolic Computer Algebra-based technique with respect to error-metrics. In our experiments, we show that we can outperform state-of-the-art handcrafted approximate adder designs.

Journal ArticleDOI
TL;DR: In this article, the authors focus on the techniques and methods used to approximate circuits and on the use of formal methods to solve challenges faced by traditional methods; a major challenge for these methods is the ability to automatically synthesize approximate circuits without relying on the skill of designers.


Journal ArticleDOI
TL;DR: In this article, the authors propose three fault-tolerant carry lookahead adders that improve the cost in terms of quantum gates and qubits with respect to the rest of the quantum circuits available in the literature.
Abstract: Adders are one of the most interesting circuits in quantum computing due to their use in major algorithms that benefit from the special characteristics of this type of computation. Among these algorithms, Shor’s algorithm stands out, which allows decomposing numbers in a time exponentially lower than the time needed to do it with classical computation. In this work, we propose three fault-tolerant carry lookahead adders that improve the cost in terms of quantum gates and qubits with respect to the rest of the quantum circuits available in the literature. Their optimal implementation in a real quantum computer is also presented. Finally, the work ends with a rigorous comparison where the advantages and disadvantages of the proposed circuits against the rest of the circuits of the state of the art are exposed. Moreover, the information obtained from such a comparison is summarized in tables that allow a quick consultation to interested researchers.
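
As background on the carry lookahead principle, the sketch below is a classical (non-quantum) illustration: every carry is computed directly from the generate and propagate bits and the input carry, rather than rippling through earlier stages. It does not reproduce the fault-tolerant quantum circuits of the paper, and all names are illustrative.

```python
def lookahead_carry(g: list[int], p: list[int], c0: int, i: int) -> int:
    """Carry into bit i, computed from generate/propagate bits and c0 only:
    c_i = (c0 AND p_0..p_{i-1}) OR any (g_j AND p_{j+1}..p_{i-1}) for j < i."""
    terms = [c0 and all(p[:i])]
    for j in range(i):
        terms.append(g[j] and all(p[j + 1:i]))
    return int(any(terms))


def carry_lookahead_add(a: int, b: int, width: int, c0: int = 0) -> tuple[int, int]:
    """Return (sum modulo 2**width, carry_out) for two unsigned integers."""
    g = [(a >> i) & (b >> i) & 1 for i in range(width)]    # generate bits
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(width)]  # propagate bits
    total = 0
    for i in range(width):
        total |= (p[i] ^ lookahead_carry(g, p, c0, i)) << i
    return total, lookahead_carry(g, p, c0, width)


if __name__ == "__main__":
    print(carry_lookahead_add(11, 6, 4))
```

In hardware the carry terms are evaluated in parallel, which is what shortens the critical path compared with a ripple-carry chain; the Python loop only mirrors the logic equations.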

Journal ArticleDOI
TL;DR: In this paper, a tunable optoelectronic full-adder using two photonic crystal ring resonators (PCRRs) is proposed, each PCRR consisting of a matrix of silicon rods surrounded by silica rods coated with graphene nanoshells (GNSs).
Abstract: This paper reports a novel design of a tunable optoelectronic full-adder using two photonic crystal ring resonators (PCRRs). Every PCRR consists of a matrix of silicon rods surrounded by silica rods coated with graphene nanoshells (GNSs). The proposed full-adder is formed by three input ports, two PCRRs, and two output ports for ‘SUM’ and ‘CARRY’. The plane wave expansion technique is used to study the photonic band structure of the fundamental photonic crystal (PhC) microstructure, and the finite-difference time-domain method is also employed in the final design for solving Maxwell's equations to analyze the light propagation inside the structure. We can tune the PhC resonant mode for our desired application by setting the chemical potential of GNSs with an appropriate gate voltage. The numerical results reveal that when the chemical potential of GNSs changes, the switching mechanism occurs and manages the coupling and propagation direction of the input beam inside the structure. We systematically study the effects of physical parameters on the transmission, reflection, and absorption spectra. Our numerical results also demonstrate that the maximum delay is about 0.8 ps. The 663 μm2 area of the proposed full-adder based on two-dimensional materials makes it a building block of every photonic integrated circuit used for data processing systems.
• A fast and compact all-optical full-adder using photonic crystal ring resonators (PCRRs).
• A maximum steady-state time of about 0.8 ps and the structure's total size of 663 μm2.
• There is no need to increase the input intensity for the appearance of the nonlinear effect.
• The ON-OFF contrast ratios for Sum and Carry are 16 dB and 14 dB, respectively.
• The full-adder is applicable in optical integrated circuits for high-speed signal processing.

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a reversible block called the RF-adder block, which is a domain coupling nano-technology that has drawn significant attention for less power consumption, area, and design overhead.

Journal ArticleDOI
TL;DR: In this paper, hybrid partial product-based building blocks are proposed by considering the probability distribution of the input operands, and an efficient hardware implementation of approximate 4×4 multipliers is achieved.
Abstract: Approximate recursive multipliers exhibit low-power operation because they are designed using smaller power-efficient approximate multiplier blocks. These building blocks can be configured by varying the approximation levels for a wide range of larger multiplier sizes. However, most of the building blocks proposed for recursive multipliers are either slightly inaccurate or hardware-efficient with limited accuracy. In this brief, hybrid partial product-based building blocks are proposed by considering the probability distribution of the input operands. An efficient hardware implementation of approximate 4×4 multipliers is achieved, while maintaining the required accuracy. Moreover, high-performance approximate NOR-based half adder (NxHA) and full adder (NxFA) cells are proposed for use in a 4×4 multiplier. Three different strategies (Ax8-1/2/3) are further proposed and analyzed for utilizing the 4×4 multipliers when designing larger multipliers. Ax8-2 provides the best trade-off among the designs with a moderate MRED. A reduction of 30 and 17 percent in the MRED is achieved compared to previous best energy-optimized and MRED-optimized designs. Among the designs with higher MREDs, Ax8-3 exhibits the smallest MRED and PDP. Moreover, it shows an improvement of 7 to 28 percent in delay compared to existing approximate recursive designs. As a case study, image multiplication is evaluated; a high peak signal-to-noise ratio (PSNR) with a value close to 50dB is obtained for the proposed multiplier designs.

Journal ArticleDOI
01 May 2022-Optik
TL;DR: In this article, an efficient one-bit ALU in QCA is suggested based on a new formulation that can perform eight mathematical operations and four logical operations, and the simulation results using QCADesigner 2.0.3 software showed that the suggested circuits work well.


Journal ArticleDOI
TL;DR: In this paper, the authors introduced polarization-switching (PS) and charge-trapping (CT) effects in a single Fe FET and fabricated a multi-field-effect transistor with bipolar-like characteristics based on advanced 10 nm node fin field-effect transistors (PS-CT FinFET) with 9 nm thick Hf0.5Zr0.5O2 films.
Abstract: Nonvolatile logic devices are crucial for the development of logic-in-memory (LiM) technology to build the next-generation non-von Neumann computing architecture. Ferroelectric field-effect transistors (Fe FET) are one of the most promising candidates for LiMs because of high compatibility with mainstream silicon-based complementary metal-oxide semiconductor processes, nonvolatile memory, and low power consumption. However, because of the unipolar characteristics of a Fe FET, a nonlinear XOR or XNOR logic gate function is difficult to realize with a single device. In addition, because single Fe polarization switch modulation is available in the devices, a reconfigurable logic gate usually needs multiple devices to construct and realize fewer logic functions. Here, we introduced polarization-switching (PS) and charge-trapping (CT) effects in a single Fe FET and fabricated a multi-field-effect transistor with bipolar-like characteristics based on advanced 10 nm node fin field-effect transistors (PS-CT FinFET) with 9 nm thick Hf0.5Zr0.5O2 films. The special hybrid effects of charge-trapping and polarization-switching enabled eight Boolean logic functions with a single PS-CT FinFET and 16 Boolean logic functions with two complementary PS-CT FinFETs were obtained with three operations. Furthermore, reconfigurable full 1 bit adder and subtractor functions were demonstrated by connecting only two n-type and two p-type PS-CT FinFET devices, indicating that the technology was promising for LiM applications.

Journal ArticleDOI
TL;DR: The main purpose of this paper is to design a new full adder circuit structure based on ternary quantum‐dot cellular automata technology with physical proofs, and the results show a significant improvement in circuit parameters.
Abstract: Downsizing computational modules can be effective in increasing computational speed, reducing energy consumption, and reducing the occupied area of the chip, but limitations associated with downsizing complementary metal–oxide–semiconductor (CMOS) technology prompted researchers to look for alternative methods for transistor and circuit fabrication. A common issue in many of these methods is the move toward the production of components of logical circuits in nano dimensions. Quantum‐dot cellular automata (QCA) technology is inherently fast and is based on Coulomb interactions, and this property can make the design of computational circuits much easier. The main purpose of this paper is to design a new full adder circuit structure based on ternary quantum‐dot cellular automata technology with physical proofs. Accordingly, the proposed block diagram structure of the full adder circuit based on ternary QCA (TQCA) is presented, which includes 11 cells; the area is 0.0004 μm2, and the circuit cost is 0.044. For validation, the proposed structure was simulated with TQCAsim software. The results show a significant improvement in circuit parameters.

Journal ArticleDOI
TL;DR: In this article, the authors designed and implemented two new full adder circuits in QCA technology and then implemented ripple carry adder (RCA) circuits, which showed excellent performance in terms of QCA evaluation parameters, especially in cost and cost function.
Abstract: Due to the development of integrated circuits and the lack of responsiveness to existing technology, researchers are looking for an alternative technology. Quantum-dot cellular automata (QCA) technology is one of the promising alternatives due to its higher switch speed, lower power dissipation, and higher device density. One of the most important and widely used circuits in digital logic calculations is the full adder (FA) circuit, which actually creates the problem of finding its optimal design and increasing performance. In this paper, we designed and implemented two new FA circuits in QCA technology and then implemented ripple carry adder (RCA) circuits. The proposed FAs and RCAs showed excellent performance in terms of QCA evaluation parameters, especially in cost and cost function, compared to the other reported designs. The proposed adders’ approach was 46.43% more efficient than the best-known design, and the reason for this superiority was due to the coplanar form, without crossovers and inverter gates in the designs.
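
At the Boolean level, the relationship between a full adder (FA) and a ripple carry adder (RCA) described above can be sketched as follows. The carry is written as a three-input majority function, the standard logic primitive in QCA design; this is a functional illustration only and does not represent the cell layouts of the proposed circuits. The function names are illustrative.

```python
def majority(a: int, b: int, c: int) -> int:
    """Three-input majority vote: 1 when at least two inputs are 1."""
    return (a & b) | (b & c) | (a & c)


def full_adder(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """Return (sum, carry_out); the carry is the majority of the inputs."""
    return a ^ b ^ carry_in, majority(a, b, carry_in)


def ripple_carry_add(a: int, b: int, width: int, carry_in: int = 0) -> tuple[int, int]:
    """Chain `width` full adders; each carry_out feeds the next stage."""
    carry = carry_in
    total = 0
    for i in range(width):
        sum_bit, carry = full_adder((a >> i) & 1, (b >> i) & 1, carry)
        total |= sum_bit << i
    return total, carry


if __name__ == "__main__":
    print(ripple_carry_add(9, 7, 4))
```

Because each stage waits for the carry of the previous one, the cost and delay of an RCA scale with the quality of the single FA cell, which is why FA optimization dominates this line of work.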

Journal ArticleDOI
TL;DR: This work constructed a process design kit (PDK) for path-finding to analyze the circuit layout in a 3nm technology node based on gate-all-around FET (GAA-FET) and provided a guide for determining the BEOL load and developing an improved wiring process.
Abstract: With the continuous development of front-end-of-line (FEOL) technology, the development of interconnection processes at nanoscale process nodes is becoming important. We conducted a post-layout circuit simulation to consider the effect of parasitic R and C components of middle-of-line (MOL) and back-end-of-line (BEOL) on the circuit performance. We constructed a process design kit (PDK) for path-finding to analyze the circuit layout in a 3nm technology node based on gate-all-around FET (GAA-FET). It consists of the SPICE model library that satisfies the 3nm power performance area (PPA) target, and the layout versus schematic (LVS) and parasitic extraction (PEX) models, which check whether the layout and schematic match and extract the RC values in the FEOL, MOL, and BEOL areas. Subsequently, the effect of the interconnection on complex logic circuits (RO, full adder) was confirmed using the PDK. Quantifying the effects of FEOL, MOL, and BEOL on the circuit shows that circuit degradation due to the RC of MOL and BEOL accounts for more than 60%. Furthermore, we introduced the air spacer process as a way to improve the circuit performance by reducing CMOL owing to the reduction in the dielectric constant of the spacer. When an air spacer is introduced, based on a 9-stage FO1 INV RO with k = 7 at VDD = 0.7V, under iso-speed conditions the active power decreases by 30% and 35% when k is 3.3 and 1.65, respectively. Under iso-power conditions, the frequency increases by 9% and 11% when k is 3.3 and 1.65, respectively. Based on a full adder with k = 7 at VDD = 0.7V, under iso-speed conditions the active power decreases by 47% and 58% when k is 3.3 and 1.65, respectively. Under iso-power conditions, the delay decreases by 14% and 20% when k is 3.3 and 1.65, respectively. PDP decreases by 22% and 32% when k is 3.3 and 1.65, respectively. EDP decreases by 31% and 44% when k is 3.3 and 1.65, respectively. In conclusion, this work provides a guide for determining the BEOL load and developing an improved wiring process.