# Design of high performance 64 bit MAC unit

VIT University

^{1}20 Mar 2013-pp 782-786

TL;DR: A design of high performance 64 bit Multiplier-and-Accumulator (MAC) is implemented in this paper and performs important operation in many of the digital signal processing (DSP) applications.

Abstract: A design of high performance 64 bit Multiplier-and-Accumulator (MAC) is implemented in this paper. MAC unit performs important operation in many of the digital signal processing (DSP) applications. The multiplier is designed using modified Wallace multiplier and the adder is done with carry save adder. The total design is coded with verilog-HDL and the synthesis is done using Cadence RTL complier using typical libraries of TSMC 0.18um technology. The total MAC unit operates at 217 MHz. The total power dissipation is 177.732 mW.

##### Citations

More filters

••

TL;DR: Synthesis results show that the proposed multiplier has the lowest area as compared to other tree-based multipliers, without compromising on the speed of the original Wallace multiplier.

Abstract: Multiplication is one of the most commonly used operations in the arithmetic Multipliers based on Wallace reduction tree provide an area-efficient strategy for high speed multiplication A number of modifications are proposed in the literature to optimize the area of the Wallace multiplier This paper proposed a reduced-area Wallace multiplier without compromising on the speed of the original Wallace multiplier Designs are synthesized using Synopsys Design Compiler in 90 nm process technology Synthesis results show that the proposed multiplier has the lowest area as compared to other tree-based multipliers The speed of the proposed and reference multipliers is almost the same

26 citations

••

20 Mar 2014TL;DR: MAC unit model is designed by incorporating the various multipliers such as Array Multiplier, Ripple Carry Array Multipler with Row Bypassing Technique, Wallace Tree Multipliers and DADDA MultiplIER in the multiplier module and the performance of MAC unit models is analyzed in terms of area, delay and power.

Abstract: In recent years, Multiply-Accumulate (MAC) unit is developing for various high performance applications. MAC unit is a fundamental block in the computing devices, especially Digital Signal Processor (DSP). MAC unit performs multiplication and accumulation process. Basic MAC unit consists of multiplier, adder, and accumulator. In the existing MAC unit model, multiplier is designed using modified Radix-2 booth multiplier. In this paper, MAC unit model is designed by incorporating the various multipliers such as Array Multiplier, Ripple Carry Array Multiplier with Row Bypassing Technique, Wallace Tree Multiplier and DADDA Multiplier in the multiplier module and the performance of MAC unit models is analyzed in terms of area, delay and power. The performance analysis of MAC unit models is done by designing the models in Verilog HDL. Then, MAC unit models are simulated and synthesized in Xilinx ISE 13.2 for Virtex-6 family 40nm technology.

18 citations

••

15 Jun 2017TL;DR: A Vedic multiplication algorithm is designed by using Vedic mathematics formula Urdhava Tiryakbhyam method means vertically and cross wise, which gets less time delay compared to other algorithms.

Abstract: Multiplier is main building block of all processor, which improves the speed of Digital Signal Processor (DSP). In special application in which we need to reduce the time delay. In proposed method, we design a Vedic multiplication algorithm by using Vedic mathematics formula Urdhava Tiryakbhyam method means vertically and cross wise. Vedic mathematics is mainly based on 16 Sutras and was rediscovered in early 20th century. In ancient time in India, people used this Sutra for decimal number multiplications effectively. The same basic concept of Vedic mathematics is applied to multiplication of binary number to make usable in the digital hardware system. The speed of the computation process is increased and the processing time is reduced due to decrease of combinational path delay compared to the existing multipliers. In our proposed multiplication algorithm, we get less time delay compared to other algorithms.[1]

17 citations

••

TL;DR: The proposed circuit has 90% improvement in terms of power over complementary metal–oxide–semiconductor (CMOS) circuits and will give rise to a new thread of research in the field of real-time signal and image treatment.

Abstract: Quantum dot cellular automata (QCA) is a hopeful technology in the field of nanotechnology that seems to suite well with signal-processing needs. It is concerned with great interest because of its benefits such as ultra-low power consumption, small size and can operate at one Terahertz. The multiply accumulator (MAC) unit is considered as one of the essential operations in digital signal processing (DSP). In the real-time DSP systems, several applications like speech processing, video coding, and digital filtering etc. require MAC operations. However, the power dissipation and area are the most significant aspects in these systems. Here, the authors design low power MAC unit based on QCA technology. QCADesigner version 2.0.3 is used to validate the accuracy of the proposed circuit. The reliability of this unit is taken at different temperatures. The power dissipation is estimated using QCAPro tool. The total power consumed by this unit is 2.183 μW. The proposed circuit has 90% improvement in terms of power over complementary metal–oxide–semiconductor (CMOS) circuits. Since the works in the field of QCA logic signal processing has started to progress, the suggested contribution will give rise to a new thread of research in the field of real-time signal and image treatment.

14 citations

##### References

More filters

••

TL;DR: A design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step, using straightforward diode-transistor logic.

Abstract: It is suggested that the economics of present large-scale scientific computers could benefit from a greater investment in hardware to mechanize multiplication and division than is now common. As a move in this direction, a design is developed for a multiplier which generates the product of two numbers using purely combinational logic, i.e., in one gating step. Using straightforward diode-transistor logic, it appears presently possible to obtain products in under 1, ?sec, and quotients in 3 ?sec. A rapid square-root process is also outlined. Approximate component counts are given for the proposed design, and it is found that the cost of the unit would be about 10 per cent of the cost of a modern large-scale computer.

1,750 citations

••

29 Feb 2012

TL;DR: The pertinent choice for selecting the adder topology with the tradeoff between delay, power consumption and area is presented and ripple carry adder is presented.

Abstract: Adders form an almost obligatory component of every contemporary integrated circuit. The prerequisite of the adder is that it is primarily fast and secondarily efficient in terms of power consumption and chip area. This paper presents the pertinent choice for selecting the adder topology with the tradeoff between delay, power consumption and area. The adder topology used in this work are ripple carry adder, carry lookahead adder, carry skip adder, carry select adder, carry increment adder, carry save adder and carry bypass adder. The module functionality and performance issues like area, power dissipation and propagation delay are analyzed at 0.12 µm 6metal layer CMOS technology using microwind tool.

128 citations

••

TL;DR: A modification to the Wallace reduction is presented that ensures that the delay is the same as for the conventional Wallace reduction, producing implementations with 80 percent fewer half adders than standard Wallace multipliers, with a very slight increase in the number of full adders.

Abstract: Wallace high-speed multipliers use full adders and half adders in their reduction phase. Half adders do not reduce the number of partial product bits. Therefore, minimizing the number of half adders used in a multiplier reduction will reduce the complexity. A modification to the Wallace reduction is presented that ensures that the delay is the same as for the conventional Wallace reduction. The modified reduction method greatly reduces the number of half adders; producing implementations with 80 percent fewer half adders than standard Wallace multipliers, with a very slight increase in the number of full adders.

128 citations

••

24 Dec 2003TL;DR: In this paper, a detailed analysis for several sizes of Wallace and Dadda multipliers is presented, and it is shown that despite the presence of a larger carry propagating adder, their design yields a slightly faster multiplier.

Abstract: The two well-known fast multipliers are those presented by Wallace and Dadda. Both consist of three stages. In the first stage, the partial product matrix is formed. In the second stage, this partial product matrix is reduced to a height of two. In the final stage, these two rows are combined using a carry propagating adder. In the Wallace method, the partial products are reduced as soon as possible. In contrast, Dadda's method does the minimum reduction necessary at each level to perform the reduction in the same number of levels as required by a Wallace multiplier. It is generally assumed that, for a given size, the Wallace multiplier and the Dadda multiplier exhibit similar delay. This is because each uses the same number of pseudo adder levels to perform the partial product reduction. Although the Wallace multiplier uses a slightly smaller carry propagating adder, usually this provides no significant speed advantage. A closer examination of the delays within these two multipliers reveals this assumption to be incorrect. This paper presents a detailed analysis for several sizes of Wallace and Dadda multipliers. These results indicate that despite the presence of the larger carry propagating adder, Dadda's design yields a slightly faster multiplier.

120 citations