Journal ArticleDOI

Faster and energy-efficient signed multipliers

01 Jan 2013 - VLSI Design (Hindawi) - Vol. 2013, pp 13
TL;DR: Faster and energy-efficient column compression multiplication with very small area overheads is demonstrated by combining two techniques: partitioning the partial products into two parts for independent parallel column compression, and accelerating the final addition with newly proposed hybrid adder structures.
Abstract: We demonstrate faster and energy-efficient column compression multiplication with very small area overheads by using a combination of two techniques: partition of the partial products into two parts for independent parallel column compression, and acceleration of the final addition using new hybrid adder structures proposed here. Based on the proposed techniques, 8-b, 16-b, 32-b, and 64-b Wallace (W), Dadda (D), and HPM (H) reduction tree based Baugh-Wooley multipliers are developed and compared with the regular W, D, and H based Baugh-Wooley multipliers. The performance of the proposed multipliers is analyzed by evaluating delay, area, and power in 65 nm process technologies, including interconnect and layout, using industry-standard design and layout tools. The results show that the 64-b proposed multipliers are as much as 29%, 27%, and 21% faster than the regular W, D, and H based Baugh-Wooley multipliers, respectively, with a maximum power overhead of only 2.4%. In addition, the power-delay products (energy consumption) of the proposed 16-b, 32-b, and 64-b multipliers are significantly lower than those of the regular Baugh-Wooley multiplier. Applicability of the proposed techniques to Booth-encoded multipliers is also discussed.


Citations
Journal ArticleDOI
TL;DR: This paper introduces the circuit model of a SWSFET and the design of a unipolar inverter where only one kind of charge carrier contributes to the current flow and simulates two input unipolar logic gates such as NAND and NOR.
Abstract: The spatial wave-function switched field-effect transistor (SWSFET) has two or three low bandgap quantum well channels that can conduct carrier flow from source to drain of the SWSFET. Because of this property, SWSFETs are useful to implement different multivalued logic with reduced device count. In this paper, we introduce the circuit model of a SWSFET and the design of a unipolar inverter where only one kind of charge carrier contributes to the current flow. We also simulate two input unipolar logic gates such as NAND and NOR and demonstrate their universal property to implement other unipolar logic gates. We also simulate NOR gate and full adder circuits based on unipolar logic gates.

14 citations

Journal ArticleDOI
TL;DR: In this study, fully MUX based compressors, utilizing the CMOS transmission gate logic have been proposed to optimize the overall Power-Delay-Product (PDP) and multipliers designed using the proposed compressor blocks show improved results.
Abstract: Modern column compression multipliers offer a highly efficient solution to the binary multiplication problem and are well suited to VLSI implementation. Existing analyses concentrate on the compressor circuits, particularly those built around Multiplexer (MUX) designs. Conventionally, compressors are decomposed into XOR gates and MUXes. In this study, fully MUX based compressors utilizing CMOS transmission gate logic are proposed to optimize the overall Power-Delay-Product (PDP). The proposed compressors are also used in the design and comparative analysis of 4×4-bit and 8×8-bit Wallace and Dadda multipliers operating in the sub-threshold regime. The multipliers based on the proposed compressor designs have been simulated using 45 nm CMOS technology at supply voltages ranging from 0.3 to 0.5 V. The results show an average 89% improvement in the PDP of the proposed compressor blocks compared with existing published results in the sub-threshold regime. The multipliers designed using the proposed compressor blocks also show improved results.

13 citations

Proceedings ArticleDOI
16 Mar 2018
TL;DR: An efficient implementation of a high-speed, low-power multiplier using shift-and-add methods, together with implementations of the Braun and Wallace multipliers using Cadence (Encounter) RTL Compiler, with simulation that includes a test circuit for each building block of the multiplier.
Abstract: This paper concerns power-efficient multipliers, an important part of VLSI system design, where high speed and low power consumption are key requirements. It proposes an efficient implementation of a high-speed, low-power multiplier using shift-and-add methods and presents implementations of the Braun multiplier and the Wallace multiplier using Cadence (Encounter) RTL Compiler, with simulation that includes a test circuit for each block of the multiplier. The Braun and Wallace multipliers are simulated by creating a schematic circuit for each building block (AND gate, OR gate, NOT gate, EXOR gate, half adder, and full adder), and each block is tested using its own test circuit. These test circuits are simulated and synthesized using the Cadence tool. Symbols for the building blocks are generated and instantiated to construct the Braun and Wallace multipliers. The multipliers are then compared with respect to the number of transistors used, which indicates the area occupied and power consumed.

4 citations

Journal ArticleDOI
TL;DR: Based on an exhaustive examination of the Booth multiplication scheme, recent implementations of approximate computing-based and modified two’s complementor-based multiplication algorithms are found to outperform other multiplication schemes.
Abstract: The Booth multiplication scheme plays a major role in designing signed multipliers: it encodes the multiplier and thereby decreases the number of intermediate products. Both radix-4 and radix-8 Booth encoding schemes are widely used because they are simple and fast, respectively. The multiplier is a basic and important part of the arithmetic unit in many high-performance applications such as digital signal processing (DSP), digital image processing (DIP), and high-performance central processing unit (CPU) operation. In the past decade, numerous Booth multiplier circuits have been implemented using different application specific integrated circuit (ASIC) technologies, such as Taiwan Semiconductor Manufacturing Company (TSMC) 45 nm and 65 nm complementary metal oxide semiconductor (CMOS) processes, and some implementations have been proposed on field programmable gate arrays (FPGAs). This work analyses very large-scale integration (VLSI) characteristics such as area utilization, power consumption, and speed of operation for different implementations of the Booth multiplication scheme. Based on an exhaustive examination of the Booth multiplication scheme, it is noted that recent implementations of approximate computing-based and modified two’s complementor-based multiplication algorithms outperform other multiplication schemes. Further, the ST Microelectronics (STM) 28 nm and TSMC 45 nm CMOS processes beat the other implementation schemes by providing lower area and power as well as higher multiplication speed, respectively.

1 citation

Proceedings ArticleDOI
23 Jan 2023
TL;DR: In this paper, an 8-bit hybrid Wallace tree multiplier is proposed, which includes a single-bit Hybrid Full Adder (HFA) based on the power-efficient Gate Diffusion Input (GDI) technique to create a Hybrid Array Wallace Tree (HAWT) with fewer transistors and Max-resulting voltage swing.
Abstract: The multiplier is the building block of all arithmetic circuits, and it is widely employed in CPUs and other types of digital processing units as well as in certain kinds of integrated circuits. How quickly a CPU operates and how much energy it uses are indicators of its effectiveness. The multiplier circuit uses several adders, which increases the hardware complexity, slows down processing, and consumes a lot of energy. The power efficiency and speed of the multiplier module must therefore be drastically increased. In this study, an 8-bit hybrid Wallace tree multiplier is proposed. The central concept is to include a single-bit Hybrid Full Adder (HFA) based on the power-efficient Gate Diffusion Input (GDI) technique to create a Hybrid Array Wallace Tree (HAWT) with fewer transistors and Max-resulting voltage swing. Simulated findings demonstrate significant gains over the state of the art for the suggested design, implemented in Tanner EDA at 250 nm technology.
References
Journal ArticleDOI
TL;DR: A design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step, using straightforward diode-transistor logic.
Abstract: It is suggested that the economics of present large-scale scientific computers could benefit from a greater investment in hardware to mechanize multiplication and division than is now common. As a move in this direction, a design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step. Using straightforward diode-transistor logic, it appears presently possible to obtain products in under 1 μsec, and quotients in 3 μsec. A rapid square-root process is also outlined. Approximate component counts are given for the proposed design, and it is found that the cost of the unit would be about 10 per cent of the cost of a modern large-scale computer.

1,750 citations

Book
09 Sep 1999
TL;DR: An indispensable resource for instruction, professional development, and research, Computer Arithmetic: Algorithms and Hardware Designs, Second Edition combines broad coverage of the underlying theories of computer arithmetic with numerous examples of practical designs, worked-out examples, and a large collection of meaningful problems.
Abstract: Ideal for graduate and senior undergraduate courses in computer arithmetic and advanced digital design, Computer Arithmetic: Algorithms and Hardware Designs, Second Edition, provides a balanced, comprehensive treatment of computer arithmetic. It covers topics in arithmetic unit design and circuit implementation that complement the architectural and algorithmic speedup techniques used in high-performance computer architecture and parallel processing. Using a unified and consistent framework, the text begins with number representation and proceeds through basic arithmetic operations, floating-point arithmetic, and function evaluation methods. Later chapters cover broad design and implementation topics, including techniques for high-throughput, low-power, fault-tolerant, and reconfigurable arithmetic. An appendix provides a historical view of the field and speculates on its future. An indispensable resource for instruction, professional development, and research, Computer Arithmetic: Algorithms and Hardware Designs, Second Edition, combines broad coverage of the underlying theories of computer arithmetic with numerous examples of practical designs, worked-out examples, and a large collection of meaningful problems. This second edition includes a new chapter on reconfigurable arithmetic, in order to address the fact that arithmetic functions are increasingly being implemented on field-programmable gate arrays (FPGAs) and FPGA-like configurable devices.
Updated and thoroughly revised, the book offers new and expanded coverage of saturating adders and multipliers, truncated multipliers, fused multiply-add units, overlapped quotient digit selection, bipartite and multipartite tables, reversible logic, dot notation, modular arithmetic, Montgomery modular reduction, division by constants, IEEE floating-point standard formats, and interval arithmetic.
Features:
  • Divided into 28 lecture-size chapters
  • Emphasizes both the underlying theories of computer arithmetic and actual hardware designs
  • Carefully links computer arithmetic to other subfields of computer engineering
  • Includes 717 end-of-chapter problems ranging in complexity from simple exercises to mini-projects
  • Incorporates many examples of practical designs
  • Uses consistent standardized notation throughout
  • Instructor's manual includes solutions to text problems
  • An author-maintained website http://www.ece.ucsb.edu/~parhami/text_comp_arit.htm contains instructor resources, including complete lecture slides

1,517 citations


"Faster and energy-efficient signed ..." refers background or methods in this paper

  • ...The variable size of adder blocks always leads to faster performance than a fixed-size block adder [2, 17]; we, therefore, break down the ripple of gates in the BEC into variable-size groups according to the log2 n method....

    [...]

  • ...In recent trends, the column compression multipliers are popular for high-speed computations due to their higher speeds [1, 2]....

    [...]

Journal ArticleDOI
C.R. Baugh1, Bruce A. Wooley
TL;DR: An algorithm for high-speed, two's complement, m-bit by n-bit parallel array multiplication is described, which is converted to an equivalent parallel array addition problem in which each partial product bit is the AND of a multiplier bit and a multiplicand bit.
Abstract: An algorithm for high-speed, two's complement, m-bit by n-bit parallel array multiplication is described. The two's complement multiplication is converted to an equivalent parallel array addition problem in which each partial product bit is the AND of a multiplier bit and a multiplicand bit, and the signs of all the partial product bits are positive.
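
The conversion described in this abstract can be illustrated with a small sketch. It uses the commonly taught "modified" Baugh-Wooley arrangement (complemented AND terms in the sign row and sign column, plus two constant 1s), which is an assumption here and not necessarily the exact array of the original paper. The function name `baugh_wooley_multiply` and its parameters are hypothetical.

```python
def baugh_wooley_multiply(a: int, b: int, n: int = 8) -> int:
    """Multiply two n-bit two's complement integers by summing an array of
    non-negative partial-product bits (modified Baugh-Wooley form, assumed)."""
    abits = [(a >> i) & 1 for i in range(n)]  # two's complement bits of a
    bbits = [(b >> j) & 1 for j in range(n)]  # two's complement bits of b

    total = 0
    for i in range(n):
        for j in range(n):
            bit = abits[i] & bbits[j]
            # Terms involving exactly one sign bit carry negative weight;
            # complementing them keeps every array entry positive.
            if (i == n - 1) != (j == n - 1):
                bit ^= 1
            total += bit << (i + j)

    # Constant correction bits at columns n and 2n-1.
    total += (1 << n) + (1 << (2 * n - 1))

    # Keep 2n bits and reinterpret the result as a signed value.
    total &= (1 << (2 * n)) - 1
    if total >> (2 * n - 1):
        total -= 1 << (2 * n)
    return total


if __name__ == "__main__":
    print(baugh_wooley_multiply(-3, 5, n=4))
    print(baugh_wooley_multiply(-128, 127, n=8))
```

Every entry added to `total` is 0 or 1 with positive weight, so the whole product reduces to an unsigned column addition, which is what makes the array suitable for Wallace, Dadda, or HPM style column compression.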

663 citations


"Faster and energy-efficient signed ..." refers background or methods in this paper

  • ...Figure 3: Reduction of the partial products of part0 based on the HPM reduction approach....

    [...]

  • ...Once each part of the partial products has been reduced to height of one bit column, we get the final partial products as follows: p0[10] p0[9] p0[8] p7 p6 p5 p4 p3 p2 p1 p0; p1[15] p1[14] p1[13] p1[12] p1[11] p1[10] p1[9] p1[8]....

    [...]

  • ...” Depending on the Cout of RCA (c[10]), the mux provides the final p[15:11] without having to ripple the carry through p1[15:11]....

    [...]
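
The excerpts above describe an 8-b multiplier whose partial products are split into part0 and part1, with a final addition that avoids rippling a carry through p1[15:11]. The following is a minimal behavioural sketch of that idea, not the authors' circuit. It assumes a modified Baugh-Wooley array, a partition boundary at column 8 (read off the bit indices in the excerpts), plain integer sums in place of the compression trees, and a "+1" in place of a binary-to-excess-1 converter. All function names are hypothetical.

```python
def partial_products(a: int, b: int, n: int = 8):
    """Yield (column, bit) pairs of an assumed modified Baugh-Wooley array."""
    for i in range(n):
        for j in range(n):
            bit = ((a >> i) & 1) & ((b >> j) & 1)
            if (i == n - 1) != (j == n - 1):
                bit ^= 1  # complemented sign-row / sign-column terms
            yield i + j, bit
    yield n, 1           # constant 1 at column n
    yield 2 * n - 1, 1   # constant 1 at column 2n-1


def partitioned_multiply(a: int, b: int) -> int:
    """8x8 signed multiply with two independently reduced column partitions."""
    n = 8
    part0 = 0  # columns 0..7, reduced independently
    part1 = 0  # columns 8..15, reduced independently
    for col, bit in partial_products(a, b, n):
        if col < n:
            part0 += bit << col
        else:
            part1 += bit << col
    part1 &= 0xFFFF  # the product is kept to 16 bits

    low = part0 & 0xFF           # p[7:0] comes straight from part0
    p0_hi = part0 >> 8           # p0[10:8]; part0 never exceeds 11 bits
    p1_mid = (part1 >> 8) & 0x7  # p1[10:8]
    p1_hi = part1 >> 11          # p1[15:11]

    # Short ripple-carry addition on the three overlapping bits.
    rca = p0_hi + p1_mid
    mid = rca & 0x7
    c10 = rca >> 3

    # Precompute p1[15:11] + 1 and let the carry act as a mux select,
    # so the carry never ripples through the upper five bits.
    p1_hi_plus1 = (p1_hi + 1) & 0x1F
    high = p1_hi_plus1 if c10 else p1_hi

    product = (high << 11) | (mid << 8) | low
    return product - (1 << 16) if product & 0x8000 else product


if __name__ == "__main__":
    print(partitioned_multiply(-100, 37))
```

Because the two partitions share no columns, their reductions are independent and can run in parallel. Only the three bits where part0 overflows into part1 need a true carry-propagate addition.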

Journal ArticleDOI
TL;DR: The author develops an adder tree to sum this set when t= 1 the maximum number of regions intersections of n t-flats and shows that a tree will be dependent on both t and n.
Abstract: will be less than Cnt+1(t+1)! but may be space into which the latter may be divided multiplier into twenty 2-bit segments. He (and usually will be) more than (t+2)!. by a maximum possible number of mutual then develops an adder tree to sum this set When t= 1 the maximum number of regions intersections of n t-flats. In general, q will be of twenty entries. He then shows that a tree will be dependent on both t and n. It is first shown of nineteen adders (I believe twenty are

430 citations


"Faster and energy-efficient signed ..." refers background in this paper

  • ...The first column compression multiplier was introduced by Wallace in 1964 [3]....

    [...]

Journal ArticleDOI
TL;DR: This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA; SQRT CSLA architectures based on this modification are developed and compared with the regular SQRT CSLA architecture.
Abstract: Carry Select Adder (CSLA) is one of the fastest adders used in many data-processing processors to perform fast arithmetic functions. From the structure of the CSLA, it is clear that there is scope for reducing the area and power consumption in the CSLA. This work uses a simple and efficient gate-level modification to significantly reduce the area and power of the CSLA. Based on this modification, 8-, 16-, 32-, and 64-b square-root CSLA (SQRT CSLA) architectures have been developed and compared with the regular SQRT CSLA architecture. The proposed design has reduced area and power as compared with the regular SQRT CSLA with only a slight increase in the delay. This work evaluates the performance of the proposed designs in terms of delay, area, power, and their products by hand with logical effort and through custom design and layout in 0.18-μm CMOS process technology. The results analysis shows that the proposed CSLA structure is better than the regular SQRT CSLA.
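
For orientation, a regular carry select adder can be modelled behaviourally as below. This sketch shows only the baseline structure (each block computes both candidate results and a multiplexer picks one); it does not reproduce the gate-level modification proposed in the paper. The block widths (2, 2, 3, 4, 5) are an assumed square-root style grouping for 16 bits, and the function names are hypothetical.

```python
def ripple_add(x: int, y: int, cin: int, width: int):
    """Add two width-bit values plus carry-in; return (sum, carry_out)."""
    total = x + y + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(x: int, y: int, block_widths=(2, 2, 3, 4, 5)):
    """Behavioural model of a regular carry select adder.

    Each block computes its sum for carry-in 0 and carry-in 1; the real
    carry from the previous block then selects one of the two results.
    """
    result = 0
    carry = 0  # carry into the least significant block
    shift = 0
    for width in block_widths:
        mask = (1 << width) - 1
        xb = (x >> shift) & mask
        yb = (y >> shift) & mask

        # Both candidates are available before the incoming carry arrives.
        sum0, cout0 = ripple_add(xb, yb, 0, width)
        sum1, cout1 = ripple_add(xb, yb, 1, width)

        # Multiplexer stage: the incoming carry is only a select signal.
        block_sum, carry = (sum1, cout1) if carry else (sum0, cout0)

        result |= block_sum << shift
        shift += width
    return result, carry


if __name__ == "__main__":
    print(carry_select_add(12345, 54321))
```

In hardware the first block needs only a single ripple-carry adder, since its carry-in is known; the model computes both candidates there purely to keep the loop uniform.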

377 citations