Proceedings ArticleDOI

An Iterative Logarithmic Multiplier with Improved Precision

TL;DR: The authors present a method that combines Mitchell's approximation with a hardware truncation scheme, yielding an iterative multiplier with improved precision and area; the truncation approach and a fractional predictor further reduce the multiplier's overall hardware requirement.
Abstract: Recent studies have demonstrated the potential for achieving greater area and power savings with approximate computation in error-tolerant applications involving signal and image processing. Multiplication is a major mathematical operation in these applications which, when performed in a logarithmic number system, yields faster and more energy-efficient designs. In this paper, the authors present a method which combines Mitchell's approximation and a hardware truncation scheme in a novel way, resulting in an iterative multiplier with improved precision and area. Further, the proposed truncation approach and fractional predictor significantly reduce the overall hardware requirement of the multiplier. Experimental results prove the superiority of the proposed multiplier over previous designs.
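The abstract does not spell out the iteration itself, so the following is only a minimal sketch of a generic iterative logarithmic multiplier, not the authors' design. It assumes a simplified Mitchell-style basic block (no carry correction) and models neither the truncation scheme nor the fractional predictor. The names `basic_block` and `iterative_log_multiply` are hypothetical. The sketch relies on the identity a*b = 2^(k1+k2) + r1*2^k2 + r2*2^k1 + r1*r2, where k is the leading-one position and r the residue below it; each extra pass approximates the leftover term r1*r2 in the same way.

```python
def basic_block(a: int, b: int) -> tuple[int, int, int]:
    """One approximation pass using only shifts and adds.

    Returns (approx, r1, r2), where the exact product equals approx + r1 * r2.
    Both operands must be positive.
    """
    k1 = a.bit_length() - 1          # leading-one position of a
    k2 = b.bit_length() - 1          # leading-one position of b
    r1 = a - (1 << k1)               # residue of a below its leading one
    r2 = b - (1 << k2)               # residue of b below its leading one
    approx = (1 << (k1 + k2)) + (r1 << k2) + (r2 << k1)
    return approx, r1, r2


def iterative_log_multiply(a: int, b: int, iterations: int = 2) -> int:
    """Approximate a * b; each extra iteration adds an estimate of the residual product."""
    if a < 0 or b < 0 or iterations < 1:
        raise ValueError("operands must be non-negative and iterations >= 1")
    total = 0
    for _ in range(iterations):
        if a == 0 or b == 0:
            break                    # residual product is zero, so the sum is exact
        approx, a, b = basic_block(a, b)
        total += approx
    return total


print(iterative_log_multiply(3, 3, iterations=1))  # 8 (exact product is 9)
print(iterative_log_multiply(3, 3, iterations=2))  # 9
```

In this simplified model the result never exceeds the exact product, and the error shrinks or stays the same with every added iteration, which matches the general behaviour the excerpts on this page attribute to iterative multipliers.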
Citations
Journal ArticleDOI
TL;DR: The designs of both non-iterative and iterative approximate logarithmic multipliers (ALMs) are studied to further reduce power consumption and improve performance and it is found that the proposed approximate LMs with an appropriate number of inexact bits achieve higher accuracy and lower power consumption than conventional LMs using exact units.
Abstract: In this paper, the designs of both non-iterative and iterative approximate logarithmic multipliers (ALMs) are studied to further reduce power consumption and improve performance. Non-iterative ALMs, that use three inexact mantissa adders, are presented. The proposed iterative ALMs (IALMs) use a set-one adder in both mantissa adders during an iteration; they also use lower-part-or adders and approximate mirror adders for the final addition. Error analysis and simulation results are also provided; it is found that the proposed approximate LMs with an appropriate number of inexact bits achieve higher accuracy and lower power consumption than conventional LMs using exact units. Compared with conventional LMs with exact units, the normalized mean error distance of 16-bit approximate LMs is decreased by up to 18% and the power-delay product has a reduction of up to 37%. The proposed approximate LMs are also compared with previous approximate multipliers; it is found that the proposed approximate LMs are best suitable for applications allowing larger errors, but requiring lower energy consumption. Approximate Booth multipliers fit applications with less stringent power requirements, but also requiring smaller errors. Case studies for error-tolerant computing applications are provided.
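To make the "lower-part-or adder" mentioned above concrete, here is a small behavioural sketch. It assumes the commonly described form of this adder: the low bits are combined with a bitwise OR instead of an addition, and the carry into the exact upper adder is the AND of the two operands' most significant low-part bits. The function name `lower_part_or_add` and the parameter `lower_bits` are hypothetical, and the variant used in the cited paper may differ in detail.

```python
def lower_part_or_add(a: int, b: int, lower_bits: int = 4) -> int:
    """Approximate a + b for non-negative integers with an inexact lower part."""
    if a < 0 or b < 0 or lower_bits < 1:
        raise ValueError("operands must be non-negative and lower_bits >= 1")
    mask = (1 << lower_bits) - 1
    # Inexact lower part: bitwise OR replaces the low-order addition.
    low = (a | b) & mask
    # Carry into the upper part: AND of the top bits of the two lower parts.
    carry = ((a >> (lower_bits - 1)) & (b >> (lower_bits - 1))) & 1
    # Exact upper part.
    high = (a >> lower_bits) + (b >> lower_bits) + carry
    return (high << lower_bits) | low


print(lower_part_or_add(10, 5))   # 15, same as the exact sum
print(lower_part_or_add(8, 8))    # 24, exact sum is 16
```

The second call shows the kind of error such an adder trades for a shorter carry chain: the OR keeps bit 3 set even though the carry has already been passed to the upper part.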

109 citations

Proceedings ArticleDOI
02 Nov 2020
TL;DR: This work proposes an optimally approximated and unbiased floating-point approximate multiplier with runtime configurability that improves energy efficiency up to 122× for machine learning on CIFAR-10, with almost negligible accuracy loss.
Abstract: Approximate computing is a promising alternative to improve energy efficiency for IoT devices on the edge. This work proposes an optimally approximated and unbiased floating-point approximate multiplier with runtime configurability. We provide a theoretically sound formulation that turns multiplication approximation to an optimization problem. With the formulation and findings, a multilevel architecture is proposed to easily incorporate runtime configurability and module execution parallelism. Finally, an optimization scheme is applied to improve the area, making it linearly dependent on the precision, instead of quadratically or exponentially as in prior work. In addition to the optimal approximation and configurability, the proposed design has an efficient circuit implementation that uses inversion, shift and addition instead of complex arithmetic operations. When compared to the prior state-of-the-art approximate floating-point multiplier, ApproxLP [30], the proposed design outperforms in all aspects including accuracy, area, and delay. By replacing the regular full-precision multiplier in GPU, the proposed design can improve the energy efficiency for various edge computing tasks. Even with Level 1 approximation, the proposed design improves energy efficiency up to 122× for machine learning on CIFAR-10, with almost negligible accuracy loss.

16 citations


Cites methods from "An Iterative Logarithmic Multiplier..."

  • ...To approximate from a higher design level, [35] proposed a pipelined log-based approximation using the classical Mitchell multiplier with an iterative procedure to improve accuracy....

    [...]

Proceedings ArticleDOI
10 May 2017
TL;DR: The designs of both non-iterative and iterative approximate LMs (IALMs) are studied to further reduce power consumption and improve performance, and it is found that the proposed approximate LMs with an appropriate number of inexact bits achieve even higher accuracy and lower power consumption than conventional LMs using exact units.
Abstract: Lower power has been a main challenge for IC design. Approximate computing provides a new approach for low-power design. The logarithmic multiplier (LM) is by nature a kind of approximate multiplier. In this paper, the designs of both non-iterative and iterative approximate LMs (IALMs) are studied to further reduce power consumption and improve performance. Non-iterative approximate LMs (ALMs) that use three inexact mantissa adders are presented. The proposed IALMs use a set-one adder in both mantissa adders during the iteration, and they also use lower-part-or adders and approximate mirror adders for the final addition. Error analysis and simulation results are also provided. It is found that the proposed approximate LMs with an appropriate number of inexact bits achieve even higher accuracy and lower power consumption than conventional LMs using exact units. Specifically, compared with conventional LMs with exact units, the normalized mean error distance (NMED) of 16-bit approximate LMs is decreased by up to 18% and the power-delay product (PDP) is reduced by up to 37%. The proposed approximate LMs are also compared with previous approximate Booth multipliers. It is found that approximate LMs are more suitable for applications that tolerate larger errors but require lower power consumption, while approximate Booth multipliers fit applications that allow higher power consumption but require smaller errors.

13 citations

Journal ArticleDOI
HyunJin Kim
TL;DR: This paper presents a low‐cost two‐stage approximate multiplier for bfloat16 (brain floating‐point) data processing and applies it to the convolutional neural network (CNN) inferences, which shows small accuracy drops with well‐known pre‐trained models for the ImageNet database.

6 citations

Journal ArticleDOI
TL;DR: This work proposes a piecewise-linearly-approximated and unbiased floating-point approximate multiplier with run-time configurability to improve energy efficiency for IoT devices on the edge.
Abstract: Approximate computing is a promising alternative to improve energy efficiency for IoT devices on the edge. This work proposes a piecewise-linearly-approximated and unbiased floating-point approximate multiplier with run-time configurability. We provide a theoretically sound formulation that turns multiplication approximation to an optimization problem. With the formulation and findings, a multi-level architecture is proposed to easily incorporate run-time configurability and module execution parallelism. Finally, the proposed multiplier is further optimized to reduce the circuit implementation complexity, making the multiplier linearly dependent on the precision requirement, instead of quadratically or exponentially as in prior work. When compared to the prior state-of-the-art approximate floating-point multiplier, ApproxLP (M. Imani et al., “ApproxLP: Approximate multiplication with linearization and iterative error control,” in Proc. ACM/IEEE Des. Autom. Conf., 2019, pp. 1–6), the proposed multiplier outperforms in all aspects including accuracy, area, and delay. By replacing a full-precision floating-point multiplier in GPU, the proposed design can improve the energy efficiency for various edge computing tasks. Even with Level 1 approximation, the proposed multiplier improves energy efficiency up to 20× for machine learning on CIFAR-10, with almost negligible accuracy loss.

5 citations

References
Journal ArticleDOI
John N. Mitchell
TL;DR: A method of computer multiplication and division is proposed which uses binary logarithms and an error analysis is given and a means of reducing the error for the multiply operation is shown.
Abstract: A method of computer multiplication and division is proposed which uses binary logarithms. The logarithm of a binary number may be determined approximately from the number itself by simple shifting and counting. A simple add or subtract and shift operation is all that is required to multiply or divide. Since the logarithms used are approximate there can be errors in the result. An error analysis is given and a means of reducing the error for the multiply operation is shown.
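The shift-and-count idea in this abstract can be sketched in a few lines. The following is a software illustration only, not Mitchell's circuit: it writes each operand as 2^k (1 + x) with 0 <= x < 1, adds the approximate logarithms k + x, and takes the approximate antilogarithm. The function name `mitchell_multiply` is hypothetical, and the fractions are kept as integers scaled by 2^(k1+k2) so no floating point is needed.

```python
def mitchell_multiply(a: int, b: int) -> int:
    """Approximate a * b for non-negative integers using Mitchell's method."""
    if a < 0 or b < 0:
        raise ValueError("operands must be non-negative")
    if a == 0 or b == 0:
        return 0
    k1 = a.bit_length() - 1            # characteristic: leading-one position
    k2 = b.bit_length() - 1
    r1 = a - (1 << k1)                 # a = 2**k1 + r1, so x1 = r1 / 2**k1
    r2 = b - (1 << k2)
    # (x1 + x2) scaled by 2**(k1 + k2), computed with shifts only.
    frac_sum = (r1 << k2) + (r2 << k1)
    one = 1 << (k1 + k2)
    if frac_sum < one:
        # x1 + x2 < 1: product ~ 2**(k1+k2) * (1 + x1 + x2)
        return one + frac_sum
    # x1 + x2 >= 1: product ~ 2**(k1+k2+1) * (x1 + x2)
    return frac_sum << 1


print(mitchell_multiply(5, 6))   # 28 (exact product is 30)
print(mitchell_multiply(3, 3))   # 8  (exact product is 9)
```

The approximation never exceeds the exact product, and its relative error is at most 1/9 (about 11.1%), reached when both fractions equal 0.5 as in the `3 * 3` case.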

488 citations

Journal ArticleDOI
TL;DR: The main result in this paper establishes the energy savings derived by using probabilistic AND as well as NOT gates constructed from an idealized switch that produces a probabilistic bit (PBIT).
Abstract: The main result in this paper establishes the energy savings derived by using probabilistic AND as well as NOT gates constructed from an idealized switch that produces a probabilistic bit (PBIT). A probabilistic switch produces the desired value as an output that is 0 or 1 with probability p, represented as a PBIT, and, hence, can produce the wrong output value with a probability of (1-p). In contrast with a probabilistic switch, a conventional deterministic switch produces a BIT whose value is always correct. Our switch-based gate constructions are a particular case of a systematic methodology developed for building energy-aware networks for computing, using PBITS. Interesting examples of such networks include AND, OR, and NOT gates (or, as functions, Boolean conjunction, disjunction, and negation, respectively). To quantify the energy savings, novel measures of "technology independent" energy complexity are also introduced - these measures parallel conventional machine-independent notions of computational complexity such as the algorithm's running time and space. Networks of switches can be related to Turing machines and to Boolean circuits, both of which are widely known and well-understood models of computation. Our gate and network constructions lend substance to the following thesis (established for the first time by K.V. Palem): the mathematical technique referred to as randomization yielding probabilistic algorithms results in energy savings through a physical interpretation based on statistical thermodynamics and, hence, can serve as a basis for energy-aware computing. While the estimates of the energy saved through PBIT-based probabilistic computing switches and networks developed rely on the constructs and thermodynamic models due to Boltzmann, Gibbs, and Planck, this work has also led to the innovation of probabilistic CMOS-based devices and computing frameworks. Thus, for completeness, the relationship between the physical models on which this work is based and the electrical domain of CMOS-based switching is discussed.

194 citations


Additional excerpts

  • ...On the other hand, iterative multipliers tend to improve the precision of the result with each successive iteration....


  • ...Index Terms: Logarithmic Number System, Mitchell Algorithm, Iterative Multiplier, Logarithmic Shifter. I. INTRODUCTION. Many real-time applications involving signal and image processing possess an inherent quality of error resilience [1]....


Journal ArticleDOI
TL;DR: Three unique algorithms are developed and implemented with low-power and fast circuits that reduce the maximum percent errors that result from binary-to-binary logarithm conversion to 0.9299 percent, 0.4314 percent, and 0.1538 percent.
Abstract: We present a unique 32-bit binary-to-binary logarithm converter including its CMOS VLSI implementation. The converter is implemented using combinational logic only and it calculates a logarithm approximation in a single clock cycle. Unlike other complex logarithm correcting algorithms, three unique algorithms are developed and implemented with low-power and fast circuits that reduce the maximum percent errors that result from binary-to-binary logarithm conversion to 0.9299 percent, 0.4314 percent, and 0.1538 percent. Fast 4-, 16-, and 32-bit leading-one detector circuits are designed to obtain the leading-one position of an input binary word. A 32-word × 5-bit MOS ROM is used to provide 5-bit integers based on the corresponding leading-one position. Both converter area and speed have been considered in the design approach, resulting in the use of a very efficient 32-bit logarithmic shifter in the 32-bit logarithmic converter. The converter is implemented using 0.6 μm CMOS technology, and it requires 1,600λ × 2,800λ of chip area. Simulations of the CMOS design for the 32-bit logarithmic converter, operating at V_DD equal to 5 volts, run at 55 MHz, and the converter consumes 20 milliwatts.

165 citations


Additional excerpts

  • ...Detailed error analysis and hardware synthesis results of various iterative logarithmic multipliers are carried out in Section V and conclusions are drawn in Section VI....


Journal ArticleDOI

154 citations


Additional excerpts

  • ...Logarithmic multipliers can be classified broadly into two categories, iterative and non-iterative, as mentioned above....


  • ...Detailed error analysis and hardware synthesis results of various iterative logarithmic multipliers are carried out in Section V and conclusions are drawn in Section VI....


  • ...As is well known, multiplication is a common operation in these applications which can be simplified by applying logarithmic transformation as was established by Mitchell [2]....


Journal ArticleDOI
TL;DR: An approximation to the computation of the base two logarithm of a binary number, realized with binary circuitry, is described and a method for the reduction of the resulting approximation error by a factor six is given.
Abstract: An approximation to the computation of the base two logarithm of a binary number, realized with binary circuitry, is described. It is known that the logarithm can be obtained approximately from the binary number itself by simple counting and shifting. A method for the reduction of the resulting approximation error by a factor six is given. The same principle can be used for further reduction of the error. The realization involving not only counting and shifting but also binary decision making and addition is described. Technical data about the performance of the constructed computer are given.
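The uncorrected "counting and shifting" logarithm that this abstract starts from can be illustrated as follows. This is a sketch of the basic approximation log2(N) ≈ k + x for N = 2^k (1 + x) only; the error-reduction method the abstract describes is not reproduced here. The function name `approx_log2` and the fixed-point parameter `frac_bits` are assumptions made for the example.

```python
import math


def approx_log2(n: int, frac_bits: int = 16) -> int:
    """Approximate log2(n) as a fixed-point value with frac_bits fractional bits."""
    if n <= 0:
        raise ValueError("n must be positive")
    k = n.bit_length() - 1               # counting: position of the leading one
    residue = n - (1 << k)               # bits below the leading one
    # Shifting: align the residue so it becomes the fractional part x.
    if k >= frac_bits:
        frac = residue >> (k - frac_bits)
    else:
        frac = residue << (frac_bits - k)
    return (k << frac_bits) | frac


value = approx_log2(12) / (1 << 16)
print(value)            # 3.5
print(math.log2(12))    # about 3.585
```

Because log2(1 + x) >= x on [0, 1), this approximation never overestimates, and the gap stays below roughly 0.0861, which is the error that correction schemes such as the one described above aim to shrink.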

142 citations


Additional excerpts

  • ...Detailed error analysis and hardware synthesis results of various iterative logarithmic multipliers are carried out in Section V and conclusions are drawn in Section VI....

