
Design and Analysis of Approximate Compressors for Multiplication

A. Momeni, J. Han, Member, P. Montuschi, Senior Member, and F. Lombardi, Fellow
Abstract—Inexact (or approximate) computing is an attractive
paradigm for digital processing at nanometric scales. Inexact
computing is particularly interesting for computer arithmetic
designs. This paper deals with the analysis and design of two new
approximate 4-2 compressors for utilization in a multiplier.
These designs rely on different features of compression, such that
imprecision in computation (as measured by the error rate and
the so-called normalized error distance) can be traded off against
circuit-based figures of merit of a design (number of transistors,
delay and power consumption). Four different schemes for
utilizing the proposed approximate compressors are proposed
and analyzed for a Dadda multiplier. Extensive simulation results
are provided and an application of the approximate multipliers
to image processing is presented. The results show that the
proposed designs accomplish significant reductions in power
dissipation, delay and transistor count compared to an exact
design; moreover, two of the proposed multiplier designs provide
excellent capabilities for image multiplication with respect to
average normalized error distance and peak signal-to-noise ratio
(more than 50 dB for the considered image examples).
Index Terms—Compressor, Dadda Multiplier, Inexact
Computing, Approximate Circuits
I. INTRODUCTION
Most computer arithmetic applications are
implemented using digital logic circuits, thus
operating with a high degree of reliability and
precision. However, many applications such as in multimedia
and image processing can tolerate errors and imprecision in
computation and still produce meaningful and useful results.
Accurate and precise models and algorithms are not always
suitable or efficient for use in these applications. The
paradigm of inexact computation relies on relaxing fully
precise and completely deterministic building modules when,
for example, designing energy-efficient systems. This allows
imprecise computation to redirect the existing design process
of digital circuits and systems by taking advantage of a
decrease in complexity and cost, with a potential
increase in performance and power efficiency. Approximate
(or inexact) computing relies on using this property to design
simplified, yet approximate circuits operating at higher
performance and/or lower power consumption compared with
precise (exact) logic circuits [1].
___________________________________________
A. Momeni and F. Lombardi are with the Department of Electrical and
Computer Engineering, Northeastern University, Boston, MA 02115, USA;
{lombardi@ece.neu.edu, momeni.a@husky.neu.edu}. J. Han is with the
Department of Electrical and Computer Engineering, University of Alberta,
Edmonton, Canada; {jhan8@ualberta.ca}. P. Montuschi is with the
Department of Control and Computer Engineering, Politecnico di Torino,
Turin, Italy; {paolo.montuschi@polito.it}.
Addition and multiplication are widely used operations in
computer arithmetic; for addition full-adder cells have been
extensively analyzed for approximate computing [2-4]. [1] has
compared these adders and proposed several new metrics for
evaluating approximate and probabilistic adders with respect
to unified figures of merit for design assessment for inexact
computing applications. For each input to a circuit, the error
distance (ED) is defined as the arithmetic distance between an
erroneous output and the correct one [1]. The mean error
distance (MED) and normalized error distance (NED) are
proposed by considering the averaging effect of multiple
inputs and the normalization of multiple-bit adders. The NED
is nearly invariant with the size of an implementation and is
therefore useful in the reliability assessment of a specific
design. The tradeoff between precision and power has also
been quantitatively evaluated in [1].
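The metrics above can be illustrated with a minimal Python sketch. It assumes that all inputs are equally likely and that the NED is obtained by dividing the MED by the largest possible output value; the exact normalization is defined in [1] and is not restated here, so that choice is an assumption. The toy adder `approx_add` is hypothetical and only serves to produce some errors.

```python
from itertools import product


def error_distance(approx: int, exact: int) -> int:
    """ED: arithmetic distance between an erroneous output and the correct one."""
    return abs(approx - exact)


def mean_error_distance(approx_fn, exact_fn, inputs) -> float:
    """MED: ED averaged over the given inputs (assumed equally likely)."""
    inputs = list(inputs)
    total = sum(error_distance(approx_fn(*x), exact_fn(*x)) for x in inputs)
    return total / len(inputs)


def normalized_error_distance(med: float, max_error: int) -> float:
    """NED: MED divided by a normalizing constant (assumption: the maximum output value)."""
    return med / max_error


def exact_add(a: int, b: int) -> int:
    return a + b


def approx_add(a: int, b: int) -> int:
    """Hypothetical inexact 2-bit adder that ignores the least significant bit of b."""
    return a + (b & 0b10)


pairs = list(product(range(4), repeat=2))           # all 2-bit operand pairs
med = mean_error_distance(approx_add, exact_add, pairs)
ned = normalized_error_distance(med, max_error=6)   # 6 = 3 + 3, largest exact sum
print(f"MED = {med}, NED = {ned:.4f}")
```

Because the normalizing constant grows with the operand width, a metric of this kind can be compared across implementations of different sizes, which is the property the text attributes to the NED.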
However, the design of approximate multipliers has
received less attention. Multiplication can be thought of as the
repeated sum of partial products; however, the straightforward
application of approximate adders when designing an
approximate multiplier is not viable, because it would be very
inefficient in terms of precision, hardware complexity and
other performance metrics. Several approximate multipliers
have been proposed in the literature [4] [5] [6] [7]. Most of
these designs use a truncated multiplication method; they
estimate the least significant columns of the partial products as
a constant. In [4], an imprecise array multiplier is used for
neural network applications by omitting some of the least
significant bits in the partial products (and thus removing
some adders in the array). A truncated multiplier with a
correction constant is proposed in [5]. For an n×n multiplier,
this design calculates the sum of the n+k most significant
columns of the partial products and truncates the other n-k
columns. The n+k bit result is then rounded to n bits. The
reduction error (i.e. the error generated by truncating the n-k
least significant bits) and rounding error (i.e. the error
generated by rounding the result to n bits) are found in the
next step. The correction constant (n+k bits) is selected to be
as close as possible to the estimated value of the sum of these
errors to reduce the error distance.
A truncated multiplier with constant correction has the
maximum error if the partial products in the n-k least
significant columns are all ones or all zeros. A variable
correction truncated multiplier has been proposed in [6]. This
method changes the correction term based on column n-k-1. If
all partial products in column n-k-1 are one, then the correction
term is increased. Similarly, if all partial products in this
column are zero, the correction term is decreased.
In [7], a simplified (and thus inaccurate) 2x2 multiplier
block is proposed for building larger multiplier arrays. In the
design of a fast multiplier, compressors have been widely used
[8-10] to speed up the partial product reduction tree and
decrease power dissipation. Optimized designs of 4-2 exact
compressors have been proposed in [8, 11 - 16]. [17] [18] have
also considered compression for approximate multiplication.
In [17], an approximate signed multiplier has been proposed
for use in arithmetic data value speculation (AVDS);
multiplication is performed using the Baugh-Wooley
algorithm. However, no new design is proposed for the
compressors for the inexact computation. Designs of
approximate compressors have been proposed in [18];
however, these designs do not target multiplication. It should
be noted that the approach of [7] improves over [17] [18] by
utilizing a simplified multiplier block that is amenable to
approximate multiplication.
Initially in this paper, two novel approximate 4-2
compressors are proposed and analyzed. It is shown that these
simplified compressors have better delay and power
consumption than the optimized (exact) 4-2 compressor
designs found in the technical literature [8]. These
approximate compressors are then used in the reduction
module of a Dadda multiplier; four different schemes are
proposed for inexact multiplication. Extensive simulation
results are provided at circuit-level for figures of merit, such
as delay, transistor count, power dissipation, error rate and
normalized error distance under CMOS feature sizes of 32, 22
and 16 nm. The application of these multipliers to image
processing is then presented. The results of two examples of
multiplication of two images are reported; these results show
that the third and fourth approximate multipliers yield an
output product image that has a very high quality and
resemblance to the image generated by an exact multiplier, i.e.
excellent values for the average NED and the Peak Signal-to-
Noise Ratio (PSNR) are found (for the PSNR more than
50 dB). The analysis and simulation results show that the
proposed approximate designs for both the compressor and the
multiplier are viable candidates for inexact computing.
This paper is organized as follows. Section 2 is a review of
existing schemes for (exact) compressors. The two new
designs of an approximate 4-2 compressor are presented in
Section 3. Multiplication and four different approximate
multipliers are proposed in Section 4. Simulation results for
the approximate compressors and multipliers are provided in
Section 5. The application of the proposed approximate
multipliers to image processing is presented in Section 6.
Section 7 concludes the manuscript.
II. EXACT COMPRESSORS
The main goal of either multi-operand carry-save addition
or parallel multiplication is to reduce n numbers to two
numbers; therefore, n-2 compressors (or n-2 counters) have
been widely used in computer arithmetic. An n-2 compressor
(Figure 1) is usually a slice of a circuit that reduces n numbers
to two numbers when properly replicated. In slice i of the
circuit, the n-2 compressor receives n bits in position i and one
or more carry bits from the positions to the right, such as i – 1
or i – 2. It produces two output bits in positions i and i + 1 and
one or more carry bits into the higher positions, such as i + 1
or i + 2. For the correct operation of the circuit shown in
Figure 1, the following inequality must be satisfied
$$n + \psi_1 + \psi_2 + \psi_3 + \cdots \le 3 + 2\psi_1 + 4\psi_2 + 8\psi_3 + \cdots \qquad (1)$$
Figure 1. Schematic diagram of n-2 compressors in a multi-operand addition
circuit [13]
where ψ_j denotes the number of carry bits from slice i to
slice i + j.
A widely used structure for compression is the 4-2
compressor; a 4-2 compressor (Figure 2) can be implemented
with a carry bit between adjacent slices (ψ_1 = 1). The carry bit
from the position to the right is denoted as c_in, while the carry
bit into the higher position is denoted as c_out. The two output
bits in positions i and i + 1 are also referred to as the sum and
carry, respectively.
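As a quick worked check of the carry-count condition for this structure, the 4-2 case can be evaluated by hand. The block below assumes that the inequality has the standard form for n-2 compressors, with ψ_j the number of carry bits sent from slice i to slice i + j; here n = 4, ψ_1 = 1 and all other ψ_j are zero.

```latex
n + \psi_1 + \psi_2 + \cdots \le 3 + 2\psi_1 + 4\psi_2 + \cdots
\;\Longrightarrow\;
4 + 1 \le 3 + 2 \cdot 1
\;\Longrightarrow\;
5 \le 5
```

The condition holds with equality: the five input bits of a slice (four operand bits plus c_in) can sum to at most 5, and this is exactly the largest value representable by one output of weight 1 (sum) and two outputs of weight 2 (carry and c_out).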
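Figure 2. 4-2 compressor
The following equations give the outputs of the 4-2
compressor, while Table I shows its truth table.
$$\mathit{sum} = x_1 \oplus x_2 \oplus x_3 \oplus x_4 \oplus c_{in} \qquad (2)$$
$$c_{out} = (x_1 \oplus x_2)\,x_3 + \overline{(x_1 \oplus x_2)}\,x_1 \qquad (3)$$
$$\mathit{carry} = (x_1 \oplus x_2 \oplus x_3 \oplus x_4)\,c_{in} + \overline{(x_1 \oplus x_2 \oplus x_3 \oplus x_4)}\,x_4 \qquad (4)$$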
The common implementation of a 4-2 compressor is
accomplished by utilizing two full-adder (FA) cells (Figure 3)
[8]. Different designs have been proposed in the literature for
4-2 compressor [8, 11-16].
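The two descriptions of the exact compressor can be cross-checked with a short Python sketch. It assumes one common arrangement of the two full-adder cells (the first adds x1, x2 and x3 and produces c_out; the second adds the intermediate sum, x4 and c_in), since Figure 3 is not reproduced here; the function names are illustrative only. The Boolean form used in `compressor_eq` is the one that reproduces Table I.

```python
from itertools import product


def full_adder(a: int, b: int, c: int):
    """Exact full adder; returns (carry, sum)."""
    s = a ^ b ^ c
    carry = (a & b) | (a & c) | (b & c)
    return carry, s


def compressor_two_fa(x1, x2, x3, x4, cin):
    """4-2 compressor built from two cascaded full adders (assumed wiring)."""
    cout, s1 = full_adder(x1, x2, x3)
    carry, s = full_adder(s1, x4, cin)
    return cout, carry, s


def compressor_eq(x1, x2, x3, x4, cin):
    """4-2 compressor written as Boolean expressions for sum, c_out and carry."""
    p = x1 ^ x2
    q = p ^ x3 ^ x4
    s = q ^ cin
    cout = (p & x3) | ((1 - p) & x1)
    carry = (q & cin) | ((1 - q) & x4)
    return cout, carry, s


for bits in product((0, 1), repeat=5):
    cout, carry, s = compressor_eq(*bits)
    # Both formulations agree on every input pattern.
    assert (cout, carry, s) == compressor_two_fa(*bits)
    # The outputs preserve the arithmetic sum of the five input bits.
    assert sum(bits) == s + 2 * (carry + cout)
```

The second assertion states the defining property of an exact compressor: sum has weight 1, while carry and c_out both have weight 2.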
Figure 4 shows the optimized design of an exact 4-2
compressor based on the so-called XOR-XNOR gates [8]; a
XOR-XNOR gate simultaneously generates the XOR and
XNOR output signals. The design of [8] consists of three
XOR-XNOR (denoted by XOR*) gates, one XOR and two 2-1
MUXes. The critical path of this design has a delay of 3Δ,
where Δ is the unitary delay through any gate in the design.
Figure 3. Implementation of 4-2 Compressor
TABLE I
TRUTH TABLE OF 4-2 COMPRESSOR
c_in  X4  X3  X2  X1  c_out  carry  sum
0     0   0   0   0   0      0      0
0     0   0   0   1   0      0      1
0     0   0   1   0   0      0      1
0     0   0   1   1   1      0      0
0     0   1   0   0   0      0      1
0     0   1   0   1   1      0      0
0     0   1   1   0   1      0      0
0     0   1   1   1   1      0      1
0     1   0   0   0   0      0      1
0     1   0   0   1   0      1      0
0     1   0   1   0   0      1      0
0     1   0   1   1   1      0      1
0     1   1   0   0   0      1      0
0     1   1   0   1   1      0      1
0     1   1   1   0   1      0      1
0     1   1   1   1   1      1      0
1     0   0   0   0   0      0      1
1     0   0   0   1   0      1      0
1     0   0   1   0   0      1      0
1     0   0   1   1   1      0      1
1     0   1   0   0   0      1      0
1     0   1   0   1   1      0      1
1     0   1   1   0   1      0      1
1     0   1   1   1   1      1      0
1     1   0   0   0   0      1      0
1     1   0   0   1   0      1      1
1     1   0   1   0   0      1      1
1     1   0   1   1   1      1      0
1     1   1   0   0   0      1      1
1     1   1   0   1   1      1      0
1     1   1   1   0   1      1      0
1     1   1   1   1   1      1      1
III. PROPOSED APPROXIMATE COMPRESSORS
In this section, two designs of an approximate compressor
are proposed. Intuitively, to design an approximate 4-2
compressor, it is possible to substitute the exact full-adder
cells in Figure 3 by an approximate full-adder cell (such as the
first design proposed in [2]). However, this is not very
efficient, because it produces at least 17 incorrect results out
of 32 possible outputs, i.e. the error rate of this inexact
compressor is more than 53% (where the error rate is given
by the ratio of the number of erroneous outputs over the total
number of outputs). Two different designs are proposed next
to reduce the error rate; these designs offer significant
performance improvement compared to an exact compressor
with respect to delay, number of transistors and power
consumption.
Figure 4. Optimized 4-2 compressor of [8]
A. Design 1
As shown in Table I, the carry output in an exact
compressor has the same value as the input c_in in 24 out of 32
states. Therefore, an approximate design must consider this
feature. In Design 1, the carry is simplified to c_in by changing
the value of the other 8 outputs.
$$\mathit{carry}' = c_{in} \qquad (5)$$
Since the carry output has the higher weight of a binary bit,
an erroneous value of this signal will produce a difference
value of two in the output. For example, if the input pattern is
“01001” (row 10 of Table II), the correct output is “010”, which
is equal to 2. By simplifying the carry output to c_in, the
approximate compressor will generate the “000” pattern at the
output (i.e. a value of 0). This substantial difference may not
be acceptable; however, it can be compensated or reduced by
simplifying the c_out and sum signals. In particular, the
simplification of sum to a value of 0 (second half of Table II)
reduces the difference between the approximate and the exact
outputs as well as the complexity of its design. Also, the
presence of some errors in the sum signal will result in a
reduction of the delay of producing the approximate sum and
of the overall delay of the design (because it is on the critical
path).
$$\mathit{sum}' = \overline{c_{in}}\,\left(\overline{(x_1 \oplus x_2)} + \overline{(x_3 \oplus x_4)}\right) \qquad (6)$$
In the last step, changing the value of c_out in some
states may reduce the error distance introduced by the approximate
carry and sum, and also allows further simplification of the
proposed design.
$$c_{out}' = \overline{\overline{(x_1 + x_2)} + \overline{(x_3 + x_4)}} \qquad (7)$$
Although the above-mentioned simplifications of carry and
sum increase the error rate in the proposed approximate
compressor, its design complexity and therefore the power
consumption are considerably decreased. This can be realized
by comparing (2)-(4) and (5)-(7). Table II shows the truth table
of the first proposed approximate compressor. It also shows
the difference between the inexact output of the proposed
approximate compressor and the output of the exact
compressor. As shown in Table II, the proposed design has 12
incorrect outputs out of 32 outputs (thus yielding an error rate
of 37.5%). This is less than the error rate using the best
approximate full-adder cell of [2].
TABLE II
TRUTH TABLE OF THE FIRST APPROXIMATE 4-2 COMPRESSOR
c_in  X4  X3  X2  X1  c_out'  carry'  sum'  Difference
0     0   0   0   0   0       0       1     1
0     0   0   0   1   0       0       1     0
0     0   0   1   0   0       0       1     0
0     0   0   1   1   0       0       1     -1
0     0   1   0   0   0       0       1     0
0     0   1   0   1   1       0       0     0
0     0   1   1   0   1       0       0     0
0     0   1   1   1   1       0       1     0
0     1   0   0   0   0       0       1     0
0     1   0   0   1   1       0       0     0
0     1   0   1   0   1       0       0     0
0     1   0   1   1   1       0       1     0
0     1   1   0   0   0       0       1     -1
0     1   1   0   1   1       0       1     0
0     1   1   1   0   1       0       1     0
0     1   1   1   1   1       0       1     -1
1     0   0   0   0   0       1       0     1
1     0   0   0   1   0       1       0     0
1     0   0   1   0   0       1       0     0
1     0   0   1   1   0       1       0     -1
1     0   1   0   0   0       1       0     0
1     0   1   0   1   1       1       0     1
1     0   1   1   0   1       1       0     1
1     0   1   1   1   1       1       0     0
1     1   0   0   0   0       1       0     0
1     1   0   0   1   1       1       0     1
1     1   0   1   0   1       1       0     1
1     1   0   1   1   1       1       0     0
1     1   1   0   0   0       1       0     -1
1     1   1   0   1   1       1       0     0
1     1   1   1   0   1       1       0     0
1     1   1   1   1   1       1       0     -1
(5)-(7) are the logic expressions for the outputs of the first
design of the approximate 4-2 compressor proposed in this
manuscript.
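The counts quoted for Design 1 can be reproduced with a small Python sketch. It assumes that the approximate c_out takes the form (x1 + x2)(x3 + x4), which is the expression that reproduces the c_out' column of Table II, and that the approximate sum is forced to 0 whenever c_in = 1; the function and variable names are illustrative only.

```python
from itertools import product


def exact(x1, x2, x3, x4, cin):
    """Exact 4-2 compressor; returns (c_out, carry, sum)."""
    p = x1 ^ x2
    q = p ^ x3 ^ x4
    cout = (p & x3) | ((1 - p) & x1)
    carry = (q & cin) | ((1 - q) & x4)
    return cout, carry, q ^ cin


def design1(x1, x2, x3, x4, cin):
    """Approximate compressor, Design 1; returns (c_out', carry', sum')."""
    carry = cin                                           # carry' follows c_in
    s = (1 - cin) & ((1 - (x1 ^ x2)) | (1 - (x3 ^ x4)))   # sum' is 0 when c_in = 1
    cout = (x1 | x2) & (x3 | x4)                          # assumed form of c_out'
    return cout, carry, s


def value(cout, carry, s):
    """Decimal value of the outputs: c_out and carry have weight 2, sum has weight 1."""
    return 2 * (cout + carry) + s


rows = list(product((0, 1), repeat=5))  # ordered as (c_in, x4, x3, x2, x1), like the tables

# Number of states in which the exact carry already equals c_in.
same_carry = sum(exact(x1, x2, x3, x4, cin)[1] == cin for cin, x4, x3, x2, x1 in rows)

# Signed difference (approximate minus exact) for every row.
diffs = [
    value(*design1(x1, x2, x3, x4, cin)) - value(*exact(x1, x2, x3, x4, cin))
    for cin, x4, x3, x2, x1 in rows
]
errors = sum(d != 0 for d in diffs)
print(same_carry, errors, errors / len(rows))  # 24 12 0.375
```

Every nonzero difference has magnitude 1, so the error distance of a single Design 1 compressor never exceeds one unit of the sum weight.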
The gate level structure of the first proposed design (Figure
6) shows that the critical path of this compressor still has a
delay of 3Δ, the same as for the exact compressor of
Figure 4. However, the propagation delay through the gates of
this design is lower than the one for the exact compressor. For
example, the propagation delay in the XOR* gate that
generates both the XOR and XNOR signals in [8] is higher
than the delay through an XNOR gate of the proposed design.
Therefore, the critical path delay in the proposed design is
lower than in the exact design; moreover, the total number
of gates in the proposed design is significantly less than that in
the optimized exact compressor of [8].
B. Design 2
A second design of an approximate compressor is proposed
to further increase performance as well as to reduce the error
rate. Since the carry and c_out outputs have the same weight,
the proposed equations for the approximate carry and c_out in
the previous part can be interchanged. In this new design,
carry uses the right hand side of (7) and c_out is always equal to
c_in; since c_in is zero in the first stage, c_out and c_in will be zero in
all stages. So, c_in and c_out can be ignored in the hardware
design. Figure 7 shows the block diagram of this approximate
4-2 compressor and the expressions below describe its outputs.
$$\mathit{sum}' = \overline{(x_1 \oplus x_2)} + \overline{(x_3 \oplus x_4)} \qquad (8)$$
$$\mathit{carry}' = \overline{\overline{(x_1 + x_2)} + \overline{(x_3 + x_4)}} \qquad (9)$$
Figure 6. Gate level implementation of Design 1
Figure 7. Approximate 4-2 compressor, Design 2
Note that (9) is the same as (7) and (8) is the same as (6) for
c_in = 0. Figure 8 shows the gate level implementation of the
second proposed design. The delay of the critical path of this
approximate design is 2Δ, so it is 1Δ less than the previous
designs; moreover, a further reduction in the number of gates
is accomplished.
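The relation between the two designs noted above can be checked exhaustively. The Python sketch below assumes the same Boolean forms as Tables II and III imply (approximate sum as the OR of two XNORs, and the second output as (x1 + x2)(x3 + x4)); the function names are illustrative.

```python
from itertools import product


def design1(x1, x2, x3, x4, cin):
    """Design 1; returns (c_out', carry', sum')."""
    carry = cin
    s = (1 - cin) & ((1 - (x1 ^ x2)) | (1 - (x3 ^ x4)))
    cout = (x1 | x2) & (x3 | x4)
    return cout, carry, s


def design2(x1, x2, x3, x4):
    """Design 2 (no c_in, no c_out); returns (carry', sum')."""
    s = (1 - (x1 ^ x2)) | (1 - (x3 ^ x4))
    carry = (x1 | x2) & (x3 | x4)
    return carry, s


# With c_in = 0, Design 1 produces carry' = 0, and its (c_out', sum') pair
# coincides with the (carry', sum') pair of Design 2 for every input pattern.
matches = all(
    design1(x1, x2, x3, x4, 0) == (design2(x1, x2, x3, x4)[0], 0, design2(x1, x2, x3, x4)[1])
    for x1, x2, x3, x4 in product((0, 1), repeat=4)
)
print(matches)  # True
```

Since c_out and carry carry the same weight, moving the expression from one output to the other leaves the decimal value of the result unchanged.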
Figure 8. Gate level implementation of Design 2
Table III shows the truth table of the second approximate
design for a 4-2 compressor; this Table also shows the
difference between the exact decimal value of the addition of
the inputs and the decimal value of the outputs produced by
the approximate compressor. For example when all inputs are
1, the decimal value of the addition of the inputs is 4.
However, the approximate compressor produces a 1 for both the
carry and the sum. The decimal value of the outputs in this case is
3; Table III shows that the difference is -1.
TABLE III
TRUTH TABLE OF SECOND PROPOSED 4-2 COMPRESSOR
X4  X3  X2  X1  carry'  sum'  Difference
0   0   0   0   0       1     1
0   0   0   1   0       1     0
0   0   1   0   0       1     0
0   0   1   1   0       1     -1
0   1   0   0   0       1     0
0   1   0   1   1       0     0
0   1   1   0   1       0     0
0   1   1   1   1       1     0
1   0   0   0   0       1     0
1   0   0   1   1       0     0
1   0   1   0   1       0     0
1   0   1   1   1       1     0
1   1   0   0   0       1     -1
1   1   0   1   1       1     0
1   1   1   0   1       1     0
1   1   1   1   1       1     -1
This design has therefore 4 incorrect outputs out of 16
outputs, so its error rate is now reduced to 25%. This is a very
positive feature, because it shows that on a probabilistic basis,
the imprecision of the proposed design is smaller than the
other available schemes.
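The error rate of Design 2 can be recomputed from its two output expressions. The sketch below is a minimal Python enumeration under the assumption that the outputs are carry' = (x1 + x2)(x3 + x4) and sum' = XNOR(x1, x2) OR XNOR(x3, x4), which match Table III row by row; the names are illustrative.

```python
from itertools import product


def design2(x1, x2, x3, x4):
    """Approximate 4-2 compressor, Design 2; returns (carry', sum')."""
    s = (1 - (x1 ^ x2)) | (1 - (x3 ^ x4))
    carry = (x1 | x2) & (x3 | x4)
    return carry, s


rows = []
for x4, x3, x2, x1 in product((0, 1), repeat=4):   # same row order as Table III
    carry, s = design2(x1, x2, x3, x4)
    exact_value = x1 + x2 + x3 + x4                # exact sum of the four inputs
    diff = (2 * carry + s) - exact_value           # approximate minus exact
    rows.append((x4, x3, x2, x1, carry, s, diff))

wrong = [r for r in rows if r[-1] != 0]
error_rate = len(wrong) / len(rows)
print(len(wrong), error_rate)  # 4 0.25
```

The four erroneous patterns are 0000 (difference +1) and 0011, 1100, 1111 (difference -1), so the errors partly cancel when input patterns are mixed.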
IV. MULTIPLICATION
In this section, the impact of using the proposed
compressors for multiplication is investigated. A fast (exact)
multiplier is usually composed of three parts (or modules) [8]:
• Partial product generation.
• A Carry Save Adder (CSA) tree to reduce the partial
products’ matrix to an addition of only two operands.
• A Carry Propagation Adder (CPA) for the final
computation of the binary result.
In the design of a multiplier, the second module plays a
pivotal role in terms of delay, power consumption and circuit
complexity. Compressors have been widely used [9, 10] to
speed up the CSA tree and decrease its power dissipation, so
to achieve fast and low-power operation. The use of
approximate compressors in the CSA tree of a multiplier
results in an approximate multiplier.
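The three-part structure described above can be sketched at word level in Python. This is an illustration only: it uses exact 3-to-2 carry-save rows (bitwise full adders) for the reduction rather than the 4-2 compressors and the Dadda arrangement considered in this paper, and all function names are hypothetical.

```python
def partial_products(a: int, b: int, n: int = 8):
    """Part 1: AND-gate partial products, one shifted row per bit of b."""
    return [(a << i) if (b >> i) & 1 else 0 for i in range(n)]


def carry_save_3to2(x: int, y: int, z: int):
    """One row of full adders applied bitwise: three operands become two."""
    s = x ^ y ^ z
    c = ((x & y) | (x & z) | (y & z)) << 1
    return s, c


def csa_tree(rows):
    """Part 2: reduce the partial-product rows until only two operands remain."""
    rows = list(rows)
    while len(rows) > 2:
        s, c = carry_save_3to2(rows[0], rows[1], rows[2])
        rows = rows[3:] + [s, c]
    return rows


def multiply(a: int, b: int, n: int = 8) -> int:
    """Exact n x n unsigned multiplication using the three modules."""
    two_operands = csa_tree(partial_products(a, b, n))
    return sum(two_operands)  # Part 3: carry-propagate addition


print(multiply(13, 11))  # 143
```

Replacing the exact reduction cells in the second module with inexact ones is what turns a structure of this kind into an approximate multiplier; the first and third modules stay exact.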
An 8×8 unsigned Dadda tree multiplier is considered to
assess the impact of using the proposed compressors in
approximate multipliers. The proposed multiplier uses in the
first part AND gates to generate all partial products. In the
second part, the approximate compressors proposed in the
previous section are utilized in the CSA tree to reduce the
partial products. The last part is an exact CPA to compute the
final binary result. Figure 9(a) shows the reduction circuitry of
an exact multiplier for n=8. In this figure, the reduction part
uses half-adders, full-adders and 4-2 compressors; each partial
product bit is represented by a dot. In the first stage, 2 half-
adders, 2 full-adders and 8 compressors are utilized to reduce
the partial products into at most four rows. In the second or
final stage, 1 half-adder, 1 full-adder and 10 compressors are
used to compute the two final rows of partial products.
Therefore, two stages of reduction and 3 half-adders, 3 full-
adders and 18 compressors are needed in the reduction
circuitry of an 8×8 Dadda multiplier.
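The figures quoted for the 8×8 reduction can be tallied with a short Python sketch. The column heights follow from the AND-gate partial-product array; the stage count uses a simplified model in which each stage of 4-2 compression at most halves the number of rows, which is an assumption made here for illustration and not a description of the exact cell placement in Figure 9. The per-stage cell counts are copied from the text.

```python
def column_heights(n: int = 8):
    """Number of partial-product bits (dots) in each column of an n x n array."""
    return [min(j, 2 * n - 2 - j) + 1 for j in range(2 * n - 1)]


def stages_with_4_2(rows: int) -> int:
    """Reduction stages needed if each stage at most halves the row count (assumed model)."""
    stages = 0
    while rows > 2:
        rows = (rows + 1) // 2
        stages += 1
    return stages


# Cell counts per stage as stated in the text: (half-adders, full-adders, compressors).
stage_cells = [(2, 2, 8), (1, 1, 10)]
totals = tuple(sum(stage[k] for stage in stage_cells) for k in range(3))

heights = column_heights(8)
print(heights)               # [1, 2, 3, 4, 5, 6, 7, 8, 7, 6, 5, 4, 3, 2, 1]
print(stages_with_4_2(8))    # 2
print(totals)                # (3, 3, 18)
```

The tallest column holds 8 dots, so two halving stages (8 to 4 rows, then 4 to 2 rows) are enough, in agreement with the two reduction stages described above.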
In this paper, four cases are considered for designing an
approximate multiplier.
Figure 9. Reduction circuitry of an 8×8 Dadda multiplier, (a) using Design
1 compressors, (b) using Design 2 compressors
• In the first case (Multiplier 1), Design 1 is used for all 4-2
compressors in Figure 9(a).
• In the second case (Multiplier 2), Design 2 is used for the
4-2 compressors. Since Design 2 does not have c_in and
c_out, the reduction circuitry of this multiplier requires a
lower number of compressors (Figure 9(b)). Multiplier 2
uses 6 half-adders, 1 full-adder and 17 compressors.
• In the third case (Multiplier 3), Design 1 is used for the
compressors in the n-1 least significant columns. The other
n most significant columns in the reduction circuitry use
exact 4-2 compressors.
References
Book

Computer Arithmetic: Algorithms and Hardware Designs

TL;DR: An indispensable resource for instruction, professional development, and research, Computer Arithmetic: Algorithms and Hardware Designs, Second Edition combines broad coverage of the underlying theories of computer arithmetic with numerous examples of practical designs, worked-out examples, and a large collection of meaningful problems.
Book

Digital arithmetic

TL;DR: Digital Arithmetic, two of the field's leading experts, deliver a unified treatment of digital arithmetic, tying underlying theory to design practice in a technology-independent manner, to develop sound solutions, avoid known mistakes, and repeat successful design decisions.
Journal ArticleDOI

Bio-Inspired Imprecise Computational Blocks for Efficient VLSI Implementation of Soft-Computing Applications

TL;DR: It is shown that these proposed Bio-inspired Imprecise Computational blocks (BICs) can be exploited to efficiently implement a three-layer face recognition neural network and the hardware defuzzification block of a fuzzy processor.
Journal ArticleDOI

New Metrics for the Reliability of Approximate and Probabilistic Adders

TL;DR: New metrics are proposed for evaluating the reliability as well as the power efficiency of approximate and probabilistic adders and it is shown that the MED is an effective metric for measuring the implementation accuracy of a multiple-bit adder and that the NED is a nearly invariant metric independent of the size of an adder.
Proceedings ArticleDOI

IMPACT: imprecise adders for low-power approximate computing

TL;DR: This paper proposes logic complexity reduction as an alternative approach to take advantage of the relaxation of numerical accuracy, and demonstrates this concept by proposing various imprecise or approximate Full Adder cells with reduced complexity at the transistor level, and utilizing them to design approximate multi-bit adders.

Current and future research addresses the tradeoffs of the different figures of merit in the proposed designs to establish conditions by which combined metrics can be attained. Moreover, physical designs of the approximate multipliers are being pursued to further confirm the analysis presented in this paper. In conclusion, this paper has shown that by an appropriate design of an approximate compressor, multipliers can be designed for inexact computing; these multipliers offer significant advantages in terms of both circuit-level and error figures of merit. Although not discussed and beyond the scope of this manuscript, the proposed designs may also be useful in other arithmetic circuits for applications in which inexact computing can be used.