# A binary high speed floating point multiplier

VIT University

01 Mar 2017, pp. 316-321

TL;DR: The multiplier developed handles both overflow and underflow cases, and its speed of operation is higher than that of a carry-save multiplier.

Abstract: Objective: To implement an algorithm for improving the speed of floating-point multiplication. Methods/Statistical analysis: A recursive Dadda algorithm is used for implementing the floating-point multiplier. IEEE 754 single-precision binary floating-point representation is used for representing floating-point numbers. For mantissa multiplication, the carry-save multiplier is replaced by a Dadda multiplier to improve speed. The multiplier is implemented in Verilog HDL and targeted to a Xilinx Virtex-5 FPGA. Improvements: The speed of operation is increased compared with the carry-save multiplier, and the multiplier handles both overflow and underflow cases.
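The pipeline the abstract describes (sign handling, exponent addition, mantissa multiplication, normalisation, and overflow/underflow checks) can be sketched in software. The following is a minimal round-to-zero illustration of IEEE 754 single-precision multiplication, not the paper's Dadda-based hardware; the function names and the flush-to-zero underflow handling are illustrative assumptions.

```python
import struct

BIAS = 127  # IEEE 754 single-precision exponent bias

def fields(x: float):
    """Split an IEEE 754 single-precision value into (sign, exponent, mantissa)."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def fp32_mul(a: float, b: float) -> float:
    # Sketch only: assumes normalised, finite, nonzero inputs
    # (no zeros, subnormals, or NaNs).
    sa, ea, ma = fields(a)
    sb, eb, mb = fields(b)
    sign = sa ^ sb                          # sign: XOR of operand signs
    # prepend the implicit leading 1 of normalised numbers
    prod = (ma | 1 << 23) * (mb | 1 << 23)  # 24x24 -> 48-bit mantissa product
    exp = ea + eb - BIAS                    # biased exponent of the result
    if prod & (1 << 47):                    # product in [2, 4): normalise
        prod >>= 1
        exp += 1
    if exp >= 255:                          # overflow -> signed infinity
        return float("inf") if sign == 0 else float("-inf")
    if exp <= 0:                            # underflow -> flush to zero (sketch)
        return 0.0
    mant = (prod >> 23) & 0x7FFFFF          # keep 23 bits (round-to-zero)
    bits = (sign << 31) | (exp << 23) | mant
    return struct.unpack(">f", struct.pack(">I", bits))[0]
```

For example, `fp32_mul(3.0, 2.0)` reproduces `6.0`, while an exponent at or above 255 is reported as infinity (overflow) and one at or below 0 is flushed to zero (underflow).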

##### Citations



01 Aug 2018

TL;DR: A floating-point multiplier in round-to-zero mode is investigated, and a truncated Wallace tree is proposed for mantissa multiplication to reduce the number of full and half adders when truncation of binary bits is employed.

Abstract: Hardware implementation of digital signal processing algorithms such as filters largely requires multipliers. To address the dynamic range of the data to be processed, floating-point representation is preferred over fixed point. But the floating-point multiplier poses challenges to the designer due to its significant delay and area. Here, a floating-point multiplier in round-to-zero mode is investigated and a truncated Wallace tree is proposed for mantissa multiplication. Comparison reveals that the number of full adders is reduced by 30% and the number of half adders by 39.7% when truncation of binary bits is employed. Using Verilog descriptions and the Xilinx Vivado design suite, the existing and proposed structures were implemented targeting an Artix-7 FPGA.
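The truncation idea can be illustrated at bit level: every partial-product bit that falls in a discarded low-order column is a bit the reduction tree never has to add, which is where the reported full/half-adder savings come from. This is a behavioural round-to-zero sketch, not the cited design; the function name and parameters are illustrative.

```python
def truncated_mul(a: int, b: int, width: int, drop: int) -> int:
    """Column-truncated multiply of two `width`-bit integers: partial-product
    bits in the lowest `drop` columns are discarded before summation
    (round-to-zero).  In hardware, dropping those columns removes the
    full/half adders that would otherwise reduce them."""
    total = 0
    for i in range(width):          # one partial product per multiplier bit
        if (b >> i) & 1:
            for j in range(width):  # one bit per column of that partial product
                if (a >> j) & 1 and i + j >= drop:
                    total += 1 << (i + j)
    return total
```

For 4-bit operands, `truncated_mul(15, 15, 4, 0)` gives the exact product 225, while dropping the four lowest columns (`drop=4`) gives 176, i.e. the exact product minus the discarded low-order weight.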

2 citations

### Cites methods from "A binary high speed floating point ..."

...Another variant is Dadda tree [4] similar to Wallace but utilizes less number of adders as shown in [18]....



01 Nov 2019

TL;DR: This paper studies the performance characteristics of three different floating-point multiplier schemes, namely binary array multiplier and scaled versions of Vedic multiplier and Wallace tree multiplier using a simple ripple-carry adder design for the addition of intermediate products resulting from mantissa multiplication.

Abstract: Floating-point representation is flexible and extremely scalable compared to fixed-point representation due to its high dynamic range and accuracy in modeling fractional numbers, which are the prerequisites of many fields of computation such as signal and graphics processing, and astronomical and subatomic physics calculations. The IEEE 754 format defines the standard for single-precision (32-bit) floating-point numbers and splits a 32-bit number into three parts, namely the sign, exponent, and mantissa/significand. The multiplier design influences the overall performance, area, and latency of a floating-point multiplier. This paper studies the performance characteristics of three different floating-point multiplier schemes, namely binary array multiplier and scaled versions of Vedic multiplier and Wallace tree multiplier using a simple ripple-carry adder design for the addition of intermediate products resulting from mantissa multiplication. The designs coded in Verilog HDL are simulated using Xilinx ISim. RTL blocks are synthesized using Xilinx ISE 14.7 with implementations targeted on a Spartan6 XC6SLX45 FPGA device. The floating-point multipliers are analyzed and compared based on performance characteristics for efficiency, such as area, latency, static and dynamic power consumption, and power delay product. Notably, the designed Wallace tree multiplier exhibits a marked 31.13% latency reduction over the conventional array multiplier with a concomitant increase of 53.54% in the device area.
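The ripple-carry adder mentioned for summing the intermediate products is easy to model bit by bit; its defining property is that the carry chain makes worst-case delay linear in operand width, which is why it trades latency for area. A behavioural sketch (names illustrative, not taken from the cited designs):

```python
def full_adder(a: int, b: int, cin: int):
    """Single full adder: two operand bits plus carry-in -> (sum, carry-out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(x: int, y: int, width: int) -> int:
    """Chain `width` full adders; each carry ripples into the next stage,
    so the worst-case delay grows linearly with the operand width."""
    carry, result = 0, 0
    for i in range(width):
        s, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= s << i
    return result & ((1 << width) - 1)  # wrap modulo 2**width, as in hardware
```

For example, `ripple_carry_add(100, 55, 8)` yields 155, and sums wrap modulo 2^width just as a fixed-width hardware adder would.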

2 citations

### Cites background or methods from "A binary high speed floating point ..."

...Various implementations of floating-point multipliers such as the Array multiplier [3], Vedic multiplier [5], Braun multiplier [9], Karatsuba multiplier [13], and Wallace tree multiplier [8,9] have been designed, synthesized and implemented in target FPGA devices successfully....


...This paper analyses and reports a comparative study on the performance characteristics of various schemes that have been proposed and implemented for single-precision binary floatingpoint multipliers [1,3]....



TL;DR: In this paper, a single-precision floating-point multiplier (FPM) structure is proposed to explore possible reductions in the delay and area of the FPM.


01 Jan 2020

TL;DR: In this paper, a math coprocessor for the AMIR CPU that can perform addition, subtraction, multiplication and division on IEEE-754 single precision floating-point numbers is presented.

Abstract: Math coprocessors are vital components in modern computing for improving the overall performance of a system. The AMIR CPU is a homegrown soft-core 32-bit CPU that can handle only integer numbers, making it inadequate for high-performance real-time systems. The aim of this project is to design and develop a math coprocessor for the AMIR CPU that can perform addition, subtraction, multiplication and division on IEEE 754 single-precision floating-point numbers. The design of the math coprocessor is devised and improved based on past works on IEEE 754 floating-point operations and math coprocessor implementations. The architecture of the proposed math coprocessor consists of a control unit with instruction decode, a floating-point computation unit and a register file; the architecture is a serial controller with a pipelined data path. The coprocessor retrieves an instruction from the instruction register, decodes it, retrieves operands from the CPU register, performs the computation, then stores the results into an internal register pending retrieval by the AMIR CPU. The proposed math coprocessor achieves at least 99.999% accuracy for all four arithmetic operations with a maximum frequency of 63.8 MHz, while utilizing less than 30% of the available resources on an Intel Cyclone IV EP4CE10E22C8 FPGA. The design is not without flaws, as it has problems with instruction queueing due to the absence of an instruction buffer. Nevertheless, with further improvements and features, the proposed math coprocessor has the potential to enable the AMIR CPU to be used in a wide range of applications.

##### References



19 Apr 1995

TL;DR: Using higher-level languages, like VHDL, facilitates the development of custom operators without significantly impacting operator performance or area; properties of working arithmetic operator units used in real-time applications, including area consumption and speed, are discussed.

Abstract: Many algorithms rely on floating point arithmetic for the dynamic range of representations and require millions of calculations per second. Such computationally intensive algorithms are candidates for acceleration using custom computing machines (CCMs) being tailored for the application. Unfortunately, floating point operators require excessive area (or time) for conventional implementations. Instead, custom formats, derived for individual applications, are feasible on CCMs, and can be implemented on a fraction of a single FPGA. Using higher-level languages, like VHDL, facilitates the development of custom operators without significantly impacting operator performance or area. Properties, including area consumption and speed of working arithmetic operator units used in real-time applications, are discussed.

248 citations


TL;DR: An assessment of the strengths and weaknesses of using FPGA's for floating-point arithmetic.

Abstract: We present empirical results describing the implementation of an IEEE Standard 754 compliant floating-point adder/multiplier using field programmable gate arrays. The use of FPGAs permits fast and accurate quantitative evaluation of a variety of circuit design tradeoffs for addition and multiplication. FPGAs also permit accurate assessments of the area and time costs associated with various features of the IEEE floating-point standard, including rounding and gradual underflow. These costs are analyzed, along with the effects of architectural correlation, a phenomenon that occurs when the cost of combining architectural features exceeds the sum of the separate implementations. We conclude with an assessment of the strengths and weaknesses of using FPGAs for floating-point arithmetic.

93 citations


03 Nov 2002

TL;DR: A group of IEEE 754-style floating-point units targeted at the Xilinx Virtex-II FPGA is presented, taking advantage of special features of the technology to produce optimised components.

Abstract: The paper presents a group of IEEE 754-style floating-point units targeted at the Xilinx Virtex-II FPGA. Special features of the technology are taken advantage of to produce optimised components. Pipelined designs are given that show the latency of ~100 MHz single-precision components. Non-pipelined reference designs are included for future comparison purposes.

74 citations


VIT University

TL;DR: Simulation and small-signal analysis of a dc-dc boost converter with closed-loop control and complete state-space analysis are presented to obtain output-voltage-to-duty-ratio transfer functions for both the ideal and non-ideal boost converter.

Abstract: This paper presents simulation and small-signal analysis of a dc-dc boost converter with closed-loop control. A small-signal model of the boost converter is used to analyze small deviations around the steady-state operating point, which helps in modeling the closed-loop converter parameters. Complete state-space analysis is done to obtain output-voltage-to-duty-ratio transfer functions for both the ideal and non-ideal boost converter. A PI controller is designed using root-locus plots for both cases. The converter model is designed and simulated for both cases with closed-loop voltage-mode control under load disturbance using MATLAB. Results are observed and compared for the ideal and non-ideal cases.
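For orientation, the ideal boost converter's steady-state conversion ratio, and the right-half-plane zero that makes its voltage-mode control non-trivial, are standard textbook results (stated here for context, not taken from the cited paper's own derivation):

$$\frac{V_o}{V_{in}} = \frac{1}{1-D}, \qquad \omega_z = \frac{(1-D)^2 R}{L}$$

where $D$ is the duty ratio, $R$ the load resistance, and $L$ the inductance.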

29 citations


02 Jul 2008

TL;DR: Three existing floating-point multiplication rounding algorithms - the Even-Seidel, Quach and Yu-Zyner algorithms - are modified for interval arithmetic, and a new rounding scheme is proposed that has the best performance/area ratio.

Abstract: Floating-point multipliers with two differently rounded results for the same operation can be used to increase the performance of interval multiplication. The present paper pursues this idea by investigating three existing floating-point multiplication rounding algorithms for such multipliers - the Even-Seidel, Quach and Yu-Zyner algorithms. These three rounding schemes are modified for interval arithmetic; furthermore, a new rounding scheme is proposed. The estimates rendered by our analysis show that the proposed scheme has the best performance/area ratio.
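The motivation here — one multiplication rounded down and one rounded up to bound an interval — can be emulated in software by widening the endpoint products outward by one ULP. A rough sketch (not any of the cited algorithms; `math.nextafter` merely stands in for directed-rounding hardware):

```python
import math

def interval_mul(lo1: float, hi1: float, lo2: float, hi2: float):
    """Interval product [lo1,hi1] x [lo2,hi2]: take the min/max of the four
    endpoint products, then widen each bound one ULP outward with nextafter
    to emulate a round-down result for the lower bound and a round-up result
    for the upper bound."""
    prods = [lo1 * lo2, lo1 * hi2, hi1 * lo2, hi1 * hi2]
    return (math.nextafter(min(prods), -math.inf),
            math.nextafter(max(prods), math.inf))
```

For example, `interval_mul(1.0, 2.0, 3.0, 4.0)` returns an interval slightly wider than [3, 8], guaranteeing the true product set is enclosed.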

5 citations