Proceedings ArticleDOI

Run-time reconfigurable multi-precision floating point multiplier design for high speed, low-power applications

TL;DR: This paper presents a run-time-reconfigurable floating point multiplier, implemented on FPGA with a custom floating point format, that supports 6 modes of operation depending on the accuracy or application requirement.
Abstract: Floating point multiplication is a crucial operation in many application domains, such as image processing and signal processing. Different applications have different requirements: some need high precision, others need low power consumption or low latency. The IEEE-754 format is not flexible enough for these varying requirements, and its design is complex. Optimal run-time reconfigurable hardware implementations may therefore need custom floating-point formats that do not necessarily follow the IEEE-specified sizes. In this paper, we present a run-time-reconfigurable floating point multiplier implemented on FPGA with a custom floating point format for different applications. The multiplier has 6 modes of operation, chosen according to the accuracy or application requirement. Using an optimal design with custom IPs (Intellectual Properties), a better implementation is obtained by truncating the inputs before multiplication. The unsigned binary multiplier is implemented with a combination of the Karatsuba algorithm and the Urdhva-Tiryagbhyam algorithm (Vedic Mathematics), which further increases the efficiency of the multiplier.
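The abstract names the two multiplication algorithms but does not say how they are combined. One plausible arrangement, given here as a Python sketch and not as the authors' hardware design, is to let Karatsuba split wide operands and to hand the small sub-products to a vertically-and-crosswise (Urdhva-Tiryagbhyam) multiplier. The function names and the 8-bit base-case width are assumptions made for illustration.

```python
def urdhva_multiply(a, b, width):
    """Vertically-and-crosswise product of two unsigned `width`-bit integers."""
    a_bits = [(a >> i) & 1 for i in range(width)]
    b_bits = [(b >> i) & 1 for i in range(width)]
    result = 0
    carry = 0
    # Column k adds every cross product a_i * b_j with i + j == k.
    for k in range(2 * width - 1):
        column = carry
        for i in range(max(0, k - width + 1), min(k, width - 1) + 1):
            column += a_bits[i] & b_bits[k - i]
        result |= (column & 1) << k   # low bit of the column sum is product bit k
        carry = column >> 1           # the rest carries into the next column
    # The final carry supplies the most significant product bit.
    result |= carry << (2 * width - 1)
    return result


def karatsuba_multiply(a, b, width, base_width=8):
    """Unsigned multiply: Karatsuba splitting with an Urdhva base case.

    base_width is a hypothetical tuning parameter; it must be at least 3
    so that the recursion on the (one bit wider) middle term terminates.
    """
    if base_width < 3:
        raise ValueError("base_width must be at least 3")
    if width <= base_width:
        return urdhva_multiply(a, b, width)
    half = width // 2
    hi_width = width - half
    mask = (1 << half) - 1
    a_lo, a_hi = a & mask, a >> half
    b_lo, b_hi = b & mask, b >> half
    low = karatsuba_multiply(a_lo, b_lo, half, base_width)
    high = karatsuba_multiply(a_hi, b_hi, hi_width, base_width)
    # The operand sums can be one bit wider than the high halves.
    mid = karatsuba_multiply(a_lo + a_hi, b_lo + b_hi, hi_width + 1, base_width)
    cross = mid - low - high          # equals a_lo*b_hi + a_hi*b_lo
    return (high << (2 * half)) + (cross << half) + low
```

Karatsuba replaces four half-width multiplications with three, at the cost of extra additions; the base case is where a compact array-style multiplier would sit in hardware. Calling `karatsuba_multiply(0xABCDEF, 0x123456, 24)` should agree with Python's built-in `0xABCDEF * 0x123456`.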
Citations
Proceedings ArticleDOI
20 May 2016
TL;DR: The design aims to reduce area and combinational path delay, and so increase operating speed, by using parallelism in the multiplier that performs mantissa multiplication; its delay is compared with that reported in recent research papers.
Abstract: Many fields of science, engineering, finance, mathematical optimization methods, Artificial Neural Networks, and signal and image processing algorithms require operations on and manipulation of real numbers. Floating-point operations are the most widely adopted approach for working with real numbers. The speed of the floating-point arithmetic unit is a crucial performance parameter that affects the operation of the whole system. For this reason, a 32 bit floating point arithmetic unit is designed for applications that demand high speed. The intent of this design is to reduce the area and combinational path delay, and so increase the speed of operation, which is attained by parallelism in the multiplier used for mantissa multiplication. The floating-point multiplier uses a Booth recoded multiplier, which reduces the number of partial products and in turn speeds up multiplication. The proposed module is implemented on a Spartan 6 FPGA. The performance of the floating point arithmetic unit is compared with recent research papers in terms of delay, and the comparison shows a 59% improvement in the critical path delay of the floating point multiplier and a 50% improvement for the floating point adder. The results show that the proposed arithmetic unit substantially improves the speed and area of the design.
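The abstract states that Booth recoding reduces the number of partial products but does not give the radix. As an illustration only, the following Python sketch uses radix-4 Booth recoding, which is an assumption here, to show how a `width`-bit signed multiplier is turned into about `width / 2` digits in the set {-2, -1, 0, 1, 2}. All function names are hypothetical.

```python
def _bit(value, index):
    """Two's-complement bit of a Python int; the bit below position 0 is 0."""
    return (value >> index) & 1 if index >= 0 else 0


def booth_radix4_digits(multiplier, width):
    """Recode a signed `width`-bit multiplier into radix-4 Booth digits.

    Digits are returned least significant first, each in {-2, -1, 0, 1, 2}.
    The multiplier must lie in [-2**(width-1), 2**(width-1)).
    """
    digits = []
    for i in range(0, width, 2):
        # Each digit looks at the overlapping bit triplet (b[i+1], b[i], b[i-1]).
        digit = _bit(multiplier, i - 1) + _bit(multiplier, i) - 2 * _bit(multiplier, i + 1)
        digits.append(digit)
    return digits


def booth_multiply(multiplicand, multiplier, width):
    """Signed multiply that sums one partial product per Booth digit."""
    product = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, width)):
        # digit * multiplicand needs only a shift and/or a negation in hardware.
        product += (digit * multiplicand) << (2 * k)
    return product
```

For an 8-bit multiplier this yields 4 partial products instead of 8, and each one is 0, ±1 or ±2 times the multiplicand, so no true multiplication is needed to form it.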

10 citations

Proceedings ArticleDOI
01 May 2016
TL;DR: A high speed MIPS based 32 bit RISC processor with a single precision floating point unit for DSP applications is proposed, and the results indicate that the proposed design is optimized in speed as well as in area.
Abstract: With the advance of technology, digital signal processing applications are flourishing in space, medical and many commercial areas. The RISC processor is the heart of many high speed embedded and digital signal processing applications. Floating point representation has an advantage over fixed point numbers because it offers a wide dynamic range of values. Hence, in this paper, a high speed MIPS based 32 bit RISC processor with a single precision floating point unit for DSP applications is proposed. The design focuses on improving the performance of the floating point arithmetic unit so that the performance of the entire RISC processor is improved. The proposed processor is capable of executing arithmetic, logical, floating point, data transfer, memory, shifting and rotating instructions. Complex multiplication is frequently used in DSP applications, and thus a special instruction for complex multiplication is incorporated. Multiplication consumes most of the time, power and area of any operation; on that account, the number of multipliers is reduced from four to two compared with the conventional complex multiplication method. The design is coded in Verilog HDL, simulated on Xilinx ISE 13.1 and synthesized on Spartan 6. The results indicate that the proposed design is optimized in speed as well as in area.

6 citations

Journal ArticleDOI
TL;DR: An efficient multimode floating point arithmetic unit for the IEEE 754 floating point number system is designed and analysed; it gives a better implementation in terms of hardware area, and the number of LUTs used in the FPGA is reduced.
Abstract: This paper presents the design and analysis of a multimode single precision floating point arithmetic unit using the Verilog hardware description language on FPGA. The multimode floating point arithmetic unit has addition, subtraction, multiplication and division operations. The device used is the ZedBoard Zynq Evaluation and Development Kit (xc7z020clg484-1), on which the proposed design will be physically verified. We design and analyse an efficient multimode floating point arithmetic unit for the IEEE 754 floating point number system, which gives a better implementation in terms of hardware area. The four separate units for the four arithmetic operations are merged by combining the addition and subtraction units into one and the multiplication and division units into one, together with efficient optimization. This combination reduces the number of LUTs used in the FPGA, and thus the total hardware area required. The LUT reduction is 14% and the area reduction is 19%.

1 citation


Cites background from "Run-time reconfigurable multi-preci..."

  • ...Multiplication of mantissas which uses 24x24 bit integer multiplier is the most critical part in floating point multiplication [10]....

    [...]
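To make the role of the 24x24 bit mantissa multiplier concrete, here is a minimal Python sketch of single-precision multiplication on raw bit patterns. It is a simplified model under stated assumptions, not any of the cited designs: it accepts only normal (non-zero, finite) operands, truncates the product instead of rounding to nearest, and raises an error when the result exponent leaves the normal range. The function names are hypothetical.

```python
import struct


def float_to_bits(x):
    """Bit pattern of x as an IEEE 754 single-precision value."""
    return struct.unpack(">I", struct.pack(">f", x))[0]


def bits_to_float(bits):
    """Inverse of float_to_bits."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def fp32_multiply(x_bits, y_bits):
    """Multiply two normal single-precision numbers given as 32-bit patterns."""
    sign = ((x_bits >> 31) ^ (y_bits >> 31)) & 1      # sign is a single XOR
    exp_x = (x_bits >> 23) & 0xFF
    exp_y = (y_bits >> 23) & 0xFF
    if not (0 < exp_x < 255 and 0 < exp_y < 255):
        raise ValueError("sketch handles normal operands only")
    # Restore the hidden leading 1 to obtain 24-bit mantissas.
    man_x = (x_bits & 0x7FFFFF) | 0x800000
    man_y = (y_bits & 0x7FFFFF) | 0x800000
    product = man_x * man_y                           # the 24x24 -> 48 bit multiply
    exponent = exp_x + exp_y - 127                    # add exponents, remove one bias
    if product & (1 << 47):                           # significand product in [2, 4)
        mantissa = product >> 24
        exponent += 1
    else:                                             # significand product in [1, 2)
        mantissa = product >> 23
    if not 0 < exponent < 255:
        raise OverflowError("result exponent outside the normal range")
    return (sign << 31) | (exponent << 23) | (mantissa & 0x7FFFFF)
```

The sign and exponent paths are a handful of gates and a small adder, while `man_x * man_y` is a full 48-bit product, which is why the mantissa multiplier dominates area and delay. As a usage example, `bits_to_float(fp32_multiply(float_to_bits(1.5), float_to_bits(2.0)))` evaluates to `3.0`.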

Journal Article
TL;DR: This paper surveys related work on the design of Vedic ALUs from 46 different IEEE papers and provides some common problem statements and solutions that can be used to design the best ALU, which has not been done to date.
Abstract: Digital India is playing an important role nowadays. The Indian Vedas and Upanishads are also important and have a great history. Vedic mathematics, in particular, was, is and will be an important basic logic for mathematical applications in various fields of science and technology. VLSI is an ever growing field which involves analog, digital and mixed mode designs. In VLSI research, an ALU is a very important system whose design depends on user specifications, for applications such as microprocessors, microcontrollers, signal processing and various other fields. This paper concentrates on the related work on the design of Vedic ALUs, to date, from 46 different IEEE papers and provides some common problem statements and solutions that can be used to design the best ALU, which has not been done to date. The importance of the reconfigurable feature is also explained in this paper.
References
StandardDOI
01 Jan 2008

1,354 citations


"Run-time reconfigurable multi-preci..." refers background or methods in this paper

  • ...This floating point multiplier can have 6 modes of operations depending on the accuracy or application requirement....

    [...]

  • ...So, to obtain the a simple XOR gate as the sign...

    [...]

Journal Article
TL;DR: A reduced-bit multiplication algorithm based on the ancient Vedic multiplication formulae, Urdhva tiryakbhyam and Nikhilam, is proposed and is further optimized by use of some general arithmetic operations such as expansion and bit-shifting to take advantage of bit-reduction in multiplication.
Abstract: A reduced-bit multiplication algorithm based on the ancient Vedic multiplication formulae is proposed in this paper. Both the Vedic multiplication formulae, Urdhva tiryakbhyam and Nikhilam, are first discussed in detail. Urdhva tiryakbhyam, being a general multiplication formula, is equally applicable to all cases of multiplication. It is applied to the digital arithmetic and is shown to yield a multiplier architecture which is very similar to the popular array multiplier. Due to its structure, it leads to a high carry propagation delay in case of multiplication of large numbers. Nikhilam Sutra, on the other hand, is more efficient in the multiplication of large numbers as it reduces the multiplication of two large numbers to that of two smaller numbers. The framework of the proposed algorithm is taken from this Sutra and is further optimized by use of some general arithmetic operations such as expansion and bit-shifting to take advantage of bit-reduction in multiplication. We illustrate the proposed algorithm by reducing a general 4×4-bit multiplication to a single 2×2-bit multiplication operation.
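The reduction that the Nikhilam Sutra performs can be sketched in a few lines of Python. This shows only the basic identity with a power-of-two base, not the optimized reduced-bit algorithm proposed in the paper; the function name and the `base_bits` parameter are assumptions for illustration.

```python
def nikhilam_multiply(a, b, base_bits):
    """Multiply a and b through their deficits from the base 2**base_bits.

    Uses the identity  a*b = (a - (B - b)) * B + (B - a) * (B - b),
    so the only true multiplication is between the two deficits.
    """
    base = 1 << base_bits
    deficit_a = base - a
    deficit_b = base - b
    left = a - deficit_b                  # equals b - deficit_a as well
    # Multiplying by the base is a left shift when the base is a power of two.
    return (left << base_bits) + deficit_a * deficit_b
```

When both operands are close to the base, the deficits are small: for 14 × 13 with base 16 the deficits are 2 and 3, so the computation is (14 - 3) × 16 + 2 × 3 = 176 + 6 = 182. The identity holds for any integers, but the saving disappears when the operands are far from the base.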

105 citations

Proceedings ArticleDOI
01 Dec 2012
TL;DR: An improved version of the tree-based Wallace tree multiplier architecture is proposed, using a Booth recoder and compressor adders; the proposed architecture is around 67 percent faster than the existing Wallace-tree multiplier.
Abstract: A Wallace tree multiplier using a Booth recoder is proposed in this paper. It is an improved version of the tree-based Wallace tree multiplier architecture. This paper aims at a further reduction of the latency and area of the Wallace tree multiplier. This is accomplished by the use of the Booth algorithm and compressor adders. The coding is done in Verilog HDL and synthesized for a Xilinx Virtex 6 FPGA device. The results show that the proposed architecture is around 67 percent faster than the existing Wallace-tree multiplier, 53 percent faster than the Vedic multiplier, 22 percent faster than the radix-8 Booth multiplier, and 18 percent faster than the radix-16 Booth multiplier. In terms of area, too, the proposed multiplier is much more efficient.

50 citations

Journal ArticleDOI
TL;DR: A simple digital multiplier architecture based on the Urdhva Tiryakbhyam (Vertically and Cross wise) Sutra of Vedic Mathematics is presented and an improved technique for low power and high speed multiplier of two binary numbers (16 bit each) is developed.
Abstract: High-speed parallel multipliers are one of the key components in RISCs (Reduced Instruction Set Computers), DSPs (Digital Signal Processors), graphics accelerators and so on. The array multiplier, Booth multiplier and Wallace tree multiplier are some of the standard approaches used to implement binary multipliers that are suitable for VLSI implementation. A simple digital multiplier architecture (henceforth referred to as the Vedic Multiplier, in short VM) based on the Urdhva Tiryakbhyam (Vertically and Crosswise) Sutra of Vedic Mathematics is presented. An improved technique for a low power and high speed multiplier of two binary numbers (16 bit each) is developed. An algorithm is proposed and implemented in 16 nm CMOS technology. The designed 16x16 bit multiplier dissipates a power of 0.17 mW. The propagation delay time of the proposed architecture is 27.15 ns. These results improve on the power dissipations and delays reported in the literature for Vedic and Booth multipliers.

32 citations

26 Mar 2017
TL;DR: In this article, a Vedic multiplier using the Urdhva Tiryagbhyam sutra in Xilinx ISE is proposed; the design takes less time for operation than currently available multipliers.
Abstract: Today's technology has raised the demand for fast, real time signal processing operations. Multiplication is one of the most important arithmetic operations. In this paper, we propose the design of a Vedic multiplier using the Urdhva Tiryagbhyam sutra in Xilinx ISE. This design takes less time for operation than currently available multipliers. It covers a wide area of image processing and digital signal processing much more efficiently, with an increase in speed leading to a higher performance rating.

24 citations