Showing papers on "Carry-lookahead adder" published in 1996

Journal Article•DOI•

Area-time-power tradeoffs in parallel adders

C. Nagendra¹, Mary Jane Irwin, Robert Michael Owens²•Institutions (2)

Advanced Micro Devices¹, Pennsylvania State University²

01 Dec 1996-IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing

TL;DR: A uniform static CMOS layout methodology is adopted in which short circuit power minimization is the optimization criterion, and a large adder design space is formulated from which an architect can choose an adder with the desired characteristics.

Abstract: In this paper, several classes of parallel, synchronous adders are surveyed based on their power, delay and area characteristics. The adders studied include the linear time ripple carry and Manchester carry chain adders, the square-root time carry skip and carry select adders, the logarithmic time carry lookahead adder and its variations, and the constant time signed-digit and carry-save adders. Most of the research in the last few decades has concentrated on reducing the delay of addition. With the rising popularity of portable computers, however, the emphasis is on both high speed and low power operation. In this paper we adopt a uniform static CMOS layout methodology whereby short circuit power minimization is used as the optimization criterion. The relative merits of the different adders are evaluated by performing a detailed transistor-level simulation of the adders using HSPICE. Among the two's complement adders, a variation of the carry lookahead adder, called ELM, was found to have the best power-delay product. Based on the results of our experiments, a large adder design space is formulated from which an architect can choose an adder with the desired characteristics.

221 citations
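
For readers unfamiliar with the carry lookahead adder that the survey above takes as its logarithmic-time reference point, the following is a minimal bit-level sketch of a 4-bit lookahead block. It uses the textbook generate/propagate equations and is not taken from the paper; the function name `cla4` is hypothetical, and the code models logic only, not the power, delay or area that the paper measures.

```python
def cla4(a, b, cin=0):
    """Add two 4-bit values; every carry is a direct function of g, p and cin."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]    # generate: both bits are 1
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]  # propagate: exactly one bit is 1
    c0 = cin & 1
    # Lookahead equations: no carry waits for the previous carry to be computed.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = sum((p[i] ^ carries[i]) << i for i in range(4))  # sum bit = p XOR carry-in
    return total, c4
```

A ripple carry adder would instead compute each carry from the previous one, which is why its delay grows linearly with word length.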

Proceedings Article•DOI•

Clock-delayed domino for adder and combinational logic design

Gin Yee¹, Carl Sechen¹•Institutions (1)

University of Washington¹

07 Oct 1996

TL;DR: An innovative dynamic logic family, clock-delayed (CD) domino, was developed to provide gates with either inverting or non-inverting outputs, and the high speed and layout compactness of dynamic logic.

Abstract: An innovative dynamic logic family, clock-delayed (CD) domino, was developed to provide gates with either inverting or non-inverting outputs, and the high speed and layout compactness of dynamic logic. The characteristics of CD domino are demonstrated in two carry lookahead adder designs and three MCNC combinational logic benchmark circuits. The CD domino designs are compared to designs using static CMOS and standard domino logic. A circuit design tool was developed to automate the design of CD domino circuits. Simulations show a 32-bit CD domino adder comprised of four 8-bit full adders to be 30% faster than a 32-bit standard domino adder, and a 32-bit CD domino adder comprised of a single 32-bit full adder to be 45% faster. In the combinational logic benchmark circuits, complex inverting and non-inverting gates were used to implement C1355, C3540, and b9. The CD domino circuits were 22%, 43% and 34% faster than their static CMOS counterparts of C1355, C3540 and b9, respectively.

62 citations

Proceedings Article•DOI•

A 3.5 ns, 64 bit, carry-lookahead adder

D. Dozza¹, M. Gaddoni¹, G. Baccarani¹•Institutions (1)

University of Bologna¹

12 May 1996

TL;DR: The adder has a novel array structure that is a variant of the architecture suggested by Brent and Kung. Unlike that architecture, it does not require the back propagation of signals needed for the intermediate carry bits; hence only log₂ n logic levels are employed to generate all the carry signals.

Abstract: A 3.5 ns, 64 bit, carry-lookahead adder has been designed in full-custom domino logic and manufactured in a standard 1 μm CMOS technology featuring two metal levels. The adder has a novel array structure which represents a variant of the architecture suggested by Brent and Kung. As opposed to the latter, however, it does not require the back propagation of the signals which is necessary for the intermediate carry bits; hence only log₂ n logic levels are employed for the generation of all the carry signals. Furthermore, the structure is highly regular and modular and can be assembled with n log₂ n identical cells with a fan-out of 2. Therefore, a compact circuit is achieved with excellent performance. The occupied area is 3370 × 482 μm² with a worst-case 650 mW power dissipation at 100 MHz.

15 citations
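
To make the "log₂ n logic levels" claim in the abstract above concrete, here is a minimal sketch of a generic parallel-prefix carry computation in which every bit position obtains its carry after log₂ n combine levels, with no backward pass. It is a textbook recurrence written under the assumption of a zero carry-in; the abstract does not describe the authors' cell design or wiring, so this should not be read as their array. The names `prefix_carries` and `prefix_add` are hypothetical.

```python
def prefix_carries(a, b, n=64):
    """Return (carry-out of every bit position, number of combine levels used)."""
    G = [(a >> i) & (b >> i) & 1 for i in range(n)]    # span generates a carry
    P = [((a >> i) ^ (b >> i)) & 1 for i in range(n)]  # span propagates a carry
    d, levels = 1, 0
    while d < n:
        # Each position i >= d merges its span with the span ending at i - d.
        # The right-hand side is built from the previous level's G and P.
        G, P = ([G[i] | (P[i] & G[i - d]) if i >= d else G[i] for i in range(n)],
                [P[i] & P[i - d] if i >= d else P[i] for i in range(n)])
        d *= 2
        levels += 1
    return G, levels


def prefix_add(a, b, n=64):
    """n-bit sum (carry-in assumed 0) and the number of prefix levels."""
    carries, levels = prefix_carries(a, b, n)
    carry_in = 0
    for i in range(1, n):
        carry_in |= carries[i - 1] << i   # carry into bit i is the carry out of bit i-1
    return ((a ^ b) ^ carry_in) & ((1 << n) - 1), levels
```

For n = 64 the loop runs six times, matching log₂ 64 = 6.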

Patent•

Calculating the average of four integer numbers rounded away from zero in a single instruction cycle

Roney S. Wong¹•Institutions (1)

Samsung¹

22 Apr 1996

TL;DR: The n-bit average of four signed or unsigned n-bit integer operands (A, B, C and D), rounded away from zero as prescribed in the MPEG standard, is calculated in one instruction cycle by appending two bits to the left side of each operand, summing the extended operands to provide an n+2 bit sum, removing the two least significant bits of the n+2 bit sum, and incrementing the resulting n-bit sum as appropriate.

Abstract: The n-bit average of four signed or unsigned n-bit integer operands (A, B, C and D) rounded away from zero as prescribed in the MPEG standard is calculated in one instruction cycle by appending two bits to a left side of each of the operands to provide four n+2 bit extended operands, summing the extended operands to provide an n+2 bit sum, removing the two least significant bits of the n+2 bit sum to provide an n-bit sum, and incrementing the n-bit sum as appropriate. An append circuit (302) appends two bits to the left sides of the operands, and the extended operands are coupled to an adder circuit (306) that includes adder logic (308) and an n-bit carry lookahead adder (310). The adder logic (308) provides the two least significant bits of the sum of the extended operands, along with n partial sum bits and n partial carry bits to the adder (310). The adder (310) provides a sum output, representing the n most significant bits of the sum of the extended operands, and a sum-plus-one output representing the sum output incremented by one. A multiplexer (314) under control of a control circuit (312) selects one of the sum and sum-plus-one outputs as the n-bit average based on inspection of the two least significant bits and the most significant bit of the sum of the extended operands, and a mode signal indicative of whether the operands are signed or unsigned values.

15 citations
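
The truncate-then-increment idea in the patent abstract above can be sketched in software. The sketch assumes "rounded away from zero" means round to nearest with halves going away from zero; the increment rule below is derived from that assumption and is not copied from the patent. Python integers stand in for the n+2 bit sum, and the names `avg4` and `avg4_reference` are hypothetical.

```python
def avg4(a, b, c, d):
    """Average of four integers: truncate the sum, then conditionally add one."""
    s = a + b + c + d        # stands in for the (n+2)-bit sum of the extended operands
    truncated = s >> 2       # drop the two least significant bits (floors for negatives)
    low2 = s & 3             # the two discarded bits
    if s < 0:                # sign (most significant) bit of the sum is set
        increment = low2 == 3      # only a fraction of .75 above the floor rounds up
    else:
        increment = low2 >= 2      # fractions .5 and .75 round up
    return truncated + int(increment)


def avg4_reference(a, b, c, d):
    """Round |sum|/4 to nearest with halves up, then restore the sign."""
    s = a + b + c + d
    magnitude = (abs(s) + 2) // 4
    return magnitude if s >= 0 else -magnitude
```

The selection depends only on the sign of the sum and its two discarded bits, the same signals the abstract says the control circuit inspects (together with a signed/unsigned mode signal that this sketch does not model).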

Patent•

Fast alignment unit for multiply-add floating point unit

Christopher Hans Olson¹, Martin S. Schmookler¹•Institutions (1)

IBM¹

08 Oct 1996

TL;DR: The shift amount generator includes a multiple input adder utilizing carry save adder and carry lookahead adder techniques to minimize delay, and separate decoders for each multiplexer or group of multiplexers.

Abstract: A floating point arithmetic unit performs a multiply-add function B+(A*C) in which an alignment shifter is responsive to an input signal representative of the B mantissa. The shifter includes a sequential stack of multiplexers, typically three (3), for shifting the B mantissa to align it with the A*C product, and a complementer contained between two of the multiplexers to invert the signals when B is a negative number. A shift amount generator responsive to the A, B and C exponents produces control signals for the multiplexers. The shift amount generator includes a multiple input adder utilizing carry save adder and carry lookahead adder techniques to minimize delay, and separate decoders for each multiplexer or group of multiplexers. The generator also includes a Leading Zeros Anticipator (LZA) circuit for the most significant bits to limit shift amount signals that are within the shifting range of the shifter, which reduces the delay attributed to the carry lookahead adder. The multiplexers are arranged in a sequence such that the control signals for the first multiplexers are dependent only on the least significant bits and thus can be generated earliest, and therefore the delay of these multiplexers and the delay of the complementer is in parallel with the delay for producing the control signals to the last multiplexers.

14 citations

Journal Article•DOI•

Efficient arithmetic using self-timing

R. Ramachandran¹, Shih-Lien Lu•Institutions (1)

LSI Corporation¹

01 Dec 1996-IEEE Transactions on Very Large Scale Integration Systems

TL;DR: The work presented here examines the implementation of the most basic element in any datapath, an adder. The proposed carry elimination adder (CEA) uses self-timing at both the algorithmic and implementation levels and presents a minimal hardware high speed addition mechanism.

Abstract: Recent advances in VLSI technology have facilitated high levels of integration and the implementation of faster circuits on a chip. Most of the improvements in the performance of digital systems have been brought about by such faster technologies. However, these improvements in technology have brought along with them a host of other constraints. In the faster deep submicron technologies, the wire delays constitute a significant portion of the overall delay of the system and hence some of the advantages of faster technologies are lost. The high level of integration necessitates clock distribution schemes which minimize skew across the die. These result in area penalties and adversely affect the level of integration possible at the chip level. Hence, changes in the basic architecture of computing elements of a system which, when implemented in silicon, introduce reduced interconnect delays and simpler clock distribution networks will result in more effective performance improvements. The work presented here examines the implementation of the most basic element in any datapath: an adder. The adder, a carry elimination adder (CEA), uses self-timing at both the algorithmic and implementation levels and presents a minimal hardware high speed addition mechanism. The adder exploits the nature of the input operands dynamically, which results in its average case convergence time approaching that of the ubiquitous carry lookahead adder (CLA) and the hardware complexity of a carry ripple adder (CRA). Use of self-timing results in the elimination of a global clock and hence clock-skew.

8 citations
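
The abstract above does not spell out the CEA circuit, so the following is only an illustration of what operand-dependent completion can look like: a loop that keeps a partial sum and a carry word and stops as soon as no carries remain. This is a well-known iterative addition scheme and is assumed here purely for illustration; it is not claimed to be the authors' design. The name `iterative_add` and the default width are hypothetical.

```python
def iterative_add(a, b, n=32):
    """Add two n-bit values; return (sum mod 2**n, iterations until carries vanish)."""
    mask = (1 << n) - 1
    partial, carry = a & mask, b & mask
    steps = 0
    while carry:                       # "done" is detected from the data, not a clock
        # XOR adds without carries; AND shifted left yields the carries still pending.
        partial, carry = partial ^ carry, ((partial & carry) << 1) & mask
        steps += 1
    return partial, steps
```

Operands with short carry chains finish in few iterations while long chains take more, which is the kind of data dependence a self-timed design can exploit.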

Journal Article•DOI•

NONRESTORING RADIX-2^k SQUARE ROOTING ALGORITHM

A. E. Bashagha¹, M.K. Ibrahim¹•Institutions (1)

De Montfort University¹

01 Jun 1996-Journal of Circuits, Systems, and Computers

TL;DR: This paper presents a new high radix square rooting algorithm where a number of square root bits (one digit) are generated in one step, which offers a higher speed than that of the conventional bit parallel binary one.

Abstract: This paper presents a new high radix square rooting algorithm where a number of square root bits (one digit) are generated in one step. Therefore, the proposed algorithm offers a higher speed than that of the conventional bit parallel binary one. This algorithm can be considered as a generalisation of the conventional bit parallel binary algorithm, and therefore it can be implemented using the existing simple binary elements. The proposed algorithm makes use only of the odd values of the square root to generate the possible values of the radicand and therefore, it requires less area than the conventional restoring high radix algorithm which uses all the values of the square root. This algorithm is general for any radix. Any adder can be used in the basic cell: it can be a carry ripple adder or a carry lookahead adder. As an example of a radix-2^k square root architecture, a 9-bit radix-2^3 architecture is presented in this paper.

3 citations
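
To show what "a number of square root bits (one digit) generated in one step" means in the abstract above, here is a minimal sketch of a generic digit-by-digit integer square root that produces k result bits per iteration. It selects each digit by trying every candidate, which is closer to the conventional high radix scheme the abstract compares against; the paper's nonrestoring selection using only odd values is not reproduced. The function name `isqrt_radix` and its defaults (k = 3 with three digits, giving a 9-bit root) are assumptions chosen to echo the 9-bit radix-2^3 example.

```python
def isqrt_radix(x, k=3, digits=3):
    """Integer square root of x (x < 2**(2*k*digits)), k result bits per step."""
    root, rem = 0, 0
    chunk_mask = (1 << (2 * k)) - 1
    for step in reversed(range(digits)):
        # Bring down the next 2k bits of the radicand, most significant first.
        rem = (rem << (2 * k)) | ((x >> (2 * k * step)) & chunk_mask)
        base = root << k
        # Pick the largest digit d with (2*base + d) * d <= rem.
        d = 0
        for cand in range((1 << k) - 1, -1, -1):
            if (2 * base + cand) * cand <= rem:
                d = cand
                break
        rem -= (2 * base + d) * d      # keeps the invariant: prefix = root**2 + rem
        root = base + d
    return root
```

With k = 1 this reduces to the familiar bit-at-a-time binary square root, which is the sense in which a radix-2^k method generalises the binary one.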

Proceedings Article•DOI•

Optimization of spanning tree carry lookahead adders

J. Blackburn, L. Arndt, E.E. Swartzlander

03 Nov 1996

TL;DR: This paper examines the optimization of the 64-bit spanning tree carry lookahead adder by sizing the transistors in the different Manchester carry chain blocks and by adjusting the block widths within the carry tree to reduce the critical delay paths of the carry signals.

Abstract: This paper examines the optimization of the 64-bit spanning tree carry lookahead adder by sizing the transistors in the different Manchester carry chain blocks and by adjusting the block widths within the carry tree to reduce the critical delay paths of the carry signals. Previous spanning tree designs are re-simulated using HSPICE, with parameters for a 0.35 μm CMOS process, to compare against the circuits designed for this paper. After analyzing many different configurations using the 16-bit carry select boundary, two circuits employing an 8-bit carry select boundary are designed and simulated.

2 citations

Proceedings Article•DOI•

High performance GaAs pseudo dynamic class of logic

J.F. Lopez, R. Sarmiento, A. Nunez, K. Eshraghian

23 Sep 1996

TL;DR: Pseudo Dynamic Latched Logic (PDLL) is introduced. This class of logic takes the benefits of both static and dynamic structures by using permanently refreshing circuitry, which allows functionality even at low frequencies and high temperatures.

Abstract: In this paper Pseudo Dynamic Latched Logic (PDLL) is introduced. This class of logic takes benefits of both static and dynamic structures, by using a permanently refreshing circuitry which allows functionality even at low frequencies and high temperatures. Moreover, because of its dynamic structure, complex gates are possible with a subsequent delay-area-power reduction. PDLL performance is demonstrated by implementing a 4-bit carry lookahead adder fully operative in a range of 6 to 100 °C. The adder operates at 0.8 GHz with an associated power dissipation of only 5.2 mW.

1 citation