# A Dynamic Four-Bit Carry Lookahead Adder Circuit for Complementary Gallium Arsenide (CGaAs) Fabrication Processes

Khaled Ali Shehata, Douglas J. Fouts and Sherif Michael, Department of Electrical and Computer Engineering, Naval Postgraduate School, Monterey, CA 93943

Abstract- The design and implementation of three 4-bit carry lookahead adders using Complementary Gallium Arsenide (CGaAs) HIGFETs are presented, including static, pipelined static and dynamic versions. The designs are compared for speed, power consumption and layout area. The dynamic implementation operates at the highest speed (1.2 GHz) and has the lowest power consumption (0.01 $\mu$W/MHz/gate).

## I. INTRODUCTION

All of the various arithmetic operations (add, subtract, multiply and divide) can be implemented by appropriate combinations of the add function. Thus, addition is the universal data operation for a computer Arithmetic Logic Unit (ALU) and the speed of a digital arithmetic processor depends on the speed of the adders used. The carry lookahead adder is used to speed up carry propagation in the addition operation. The carries entering all bit positions of a parallel adder are generated simultaneously using additional logic circuits. This results in a constant addition time independent of the adder length. However, for long words, carry lookahead is usually performed in 4-bit groups to reduce implementation costs.
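The lookahead idea can be illustrated with the standard textbook generate/propagate equations. The sketch below is a behavioral model only; the function name `cla4` is hypothetical, and the code is not meant to represent the gate-level circuits designed in this paper.

```python
def cla4(a, b, c_in=0):
    """Add two 4-bit integers with carry-lookahead equations.

    Returns (sum, carry_out), where sum is a 4-bit integer.
    """
    # Per-bit generate (g) and propagate (p) signals.
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]

    # Every carry is written directly in terms of g, p and c_in,
    # so no carry has to wait for the previous one to ripple through.
    c = [c_in, 0, 0, 0, 0]
    c[1] = g[0] | (p[0] & c_in)
    c[2] = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c[3] = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
            | (p[2] & p[1] & p[0] & c_in))
    c[4] = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
            | (p[3] & p[2] & p[1] & g[0])
            | (p[3] & p[2] & p[1] & p[0] & c_in))

    # Sum bit i is the propagate signal XOR the carry into bit i.
    total = 0
    for i in range(4):
        total |= (p[i] ^ c[i]) << i
    return total, c[4]


print(cla4(0b1011, 0b0110, 1))  # (2, 1): 11 + 6 + 1 = 18 = 16 + 2
```

The width of the carry expressions grows with each bit position, which is one reason lookahead is usually limited to 4-bit groups for long words.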

Gallium Arsenide (GaAs) has several electrical advantages over Silicon (Si). The high-frequency performance of GaAs digital ICs is due to the high electron mobility of GaAs. The intrinsic resistivity of GaAs is very high, falling into the semi-insulating range, which increases radiation immunity. GaAs transistors are used for digital ICs when the application requires very high speed. CGaAs technology uses both N-channel and P-channel transistors, which dramatically decreases the power consumption of the circuit. However, the P-channel transistors are very slow due to the low hole mobility of GaAs.

This paper presents the design, implementation and evaluation of a dynamic 4-Bit Carry Lookahead Adder (4-B CLA) for fabrication with the Motorola complementary gallium arsenide (CGaAs) [1, 2] fabrication processes. The circuit is then compared for speed, power consumption and layout area against static and pipelined static CGaAs circuits that perform the same logic function.

## II. TWO-PHASE DYNAMIC FET LOGIC (TPDL) ARCHITECTURE

Dynamic circuits are non-ratioed logic. Therefore, the transistor sizes can be minimized (in most cases) to reduce layout area and power dissipation without affecting the noise margin or the speed [4]. The dynamic design implemented in this paper uses Two-Phase Dynamic FET Logic (TPDL) [3]. CGaAs TPDL circuits use only the fast N-channel transistors to evaluate the function. The slow P-channel transistors are used only for precharging the output nodes [3]. Also, there is no direct current path from the supply voltage to ground at any time, which eliminates static and short-circuit power consumption in this logic family. Therefore, the TPDL design is significantly faster than the other designs.

The TPDL circuits consist of two main stages, a $\phi_1$ stage and a $\phi_2$ stage, as shown in Figure 1. The clock phases $\phi_1$ and $\phi_2$ are non-overlapping at the logic-low level. Each stage consists of pass gates, a clocked precharge PFET, a clocked discharge NFET and an N-transistor logic block. The outputs of $\phi_1$ stages are connected to the inputs of $\phi_2$ stages and vice versa. During $\phi_1$ high and $\phi_2$ low, the first stage is evaluated, the second stage is precharged and the output of the first stage is stored on the second-stage inputs. During $\phi_2$ high and $\phi_1$ low, the first stage is precharged, the second stage is evaluated and the output of the second stage is stored on the first-stage inputs. When both $\phi_1$ and $\phi_2$ are simultaneously high, both stages are evaluated and their outputs are isolated from the next stages by the off pass gates, so no data is corrupted. Because pass gates precede each evaluating logic block, TPDL designs are self-latching and well suited to pipelined architectures. TPDL systems can be pipelined to reach the maximum operating frequency without additional storage elements (pipeline registers).
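The clock-phase behavior described above can be summarized as a small lookup. This is a minimal behavioral sketch, not a circuit model; the function name `tpdl_stage_actions` and the dictionary keys are hypothetical. The case where both clocks are low is treated as invalid here, on the assumption that the non-overlap condition excludes it.

```python
def tpdl_stage_actions(phi1, phi2):
    """Return what each TPDL stage does for a given pair of clock levels."""
    if phi1 and not phi2:
        return {
            "stage1": "evaluate",
            "stage2": "precharge",
            "transfer": "stage1 output stored on stage2 inputs",
        }
    if phi2 and not phi1:
        return {
            "stage1": "precharge",
            "stage2": "evaluate",
            "transfer": "stage2 output stored on stage1 inputs",
        }
    if phi1 and phi2:
        # Both stages evaluate; the off pass gates isolate their outputs.
        return {
            "stage1": "evaluate",
            "stage2": "evaluate",
            "transfer": "none (pass gates off)",
        }
    # Assumption: the clocks never sit low at the same time.
    raise ValueError("phi1 and phi2 both low is outside the described operation")


for clocks in [(1, 0), (1, 1), (0, 1)]:
    print(clocks, tpdl_stage_actions(*clocks))
```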



Figure 1: CGaAs TPDL Basic Circuit Topology

## III. CGaAs 4-Bit CLA CIRCUIT DESIGN

The design of the 4-Bit CLA circuit is explained in detail in this section. The circuits are designed and simulated using HSPICE, then implemented using Cadence tools. Each block of the circuit is designed and simulated separately and then optimized for layout area and maximum operating frequency. Finally, all blocks are integrated to form the 4-Bit CLA. Propagation delay determines the maximum operating frequency of the circuit. The CGaAs 4-Bit CLA was simulated with a 1.75 volt power supply. Each output of the circuit was loaded by two inverters (fan-out of two). For the static and pipelined static circuits, the loads were static inverters, while for the TPDL circuit, the loads were TPDL inverters.

In the static design, the sum and carry-out signals follow different propagation paths and therefore have different propagation delays. The maximum frequency is limited by the longest signal path (longest propagation delay). The critical propagation delay, measured from the change in carry in from logic low to logic high to the change in the sum output S<sub>3</sub> from logic high to logic low, is 1.9 ns. Each half-period of the applied input signal must be at least as long as the longest propagation delay of the circuit to prevent race conditions. This limits the maximum frequency of the input signal to about 260 MHz ($1/(2 \times 1.9\ \text{ns})$). The 4-Bit CLA circuit consumes an average power of 26 mW at the maximum operating frequency and uses 236 transistors. The sum and carry-out signals do not arrive at the circuit output simultaneously. Thus, the circuit requires a register at the output to hold the information and apply it to the next stage simultaneously. This adds circuitry and increases the layout area, the transistor count and the power consumption of the circuit. The maximum operating frequency of the circuit also decreases because of the added delay through the register.
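The frequency limit quoted for the static adder follows from requiring each half-period of the input to cover the critical delay. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def max_input_frequency_hz(critical_delay_s):
    """Highest input frequency whose half-period still covers the critical delay."""
    return 1.0 / (2.0 * critical_delay_s)


f_max = max_input_frequency_hz(1.9e-9)  # 1.9 ns critical delay from the text
print(f"{f_max / 1e6:.0f} MHz")  # 263 MHz, quoted as roughly 260 MHz
```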

The pipelined architecture solves the above problems. The three-stage pipelined adder designed here increases the maximum frequency of operation, but it also increases the transistor count, the power consumption and the layout area. The pipeline registers ensure that all sum and carry-out signals are delivered to the output terminals simultaneously. The pipelined adder circuit uses 450 transistors. The maximum frequency of operation is limited by the longest stage delay. The circuit works properly up to 550 MHz (more than double that of the static design) and consumes 77.4 mW at that frequency.

The TPDL 4-B CLA circuit described here uses the architecture shown in Figure 1 and has a maximum operating frequency of 1.2 GHz. It consumes 61 mW at the maximum operating frequency when powered from a 1.75 V power supply. The fill time of the TPDL 4-Bit CLA circuit is 3 clock periods.

## IV. COMPARISON BETWEEN CGaAs STATIC, PIPELINED STATIC AND TPDL 4-Bit CLA

This section compares the three designs (static, pipelined static and TPDL) for speed, power consumption and layout area. Table 1 lists the maximum operating frequency of each design, the power consumption at that frequency and the number of transistors used in each circuit. The CGaAs TPDL CLA has the highest operating frequency of all the studied CGaAs CLA logic designs. Its maximum frequency is more than double that of the pipelined static adder and more than four times that of the static adder. Its power consumption at the maximum frequency is less than the power consumed by the pipelined adder at less than half that frequency.

For the comparison to be fair, it is important to compare the power consumption of all circuits at the same frequency. The average power consumption of the static, pipelined static and TPDL adders at 0.26 GHz is 26 mW, 42.74 mW and 23.82 mW, respectively. At 550 MHz, the pipelined static adder consumes 77.4 mW while the TPDL adder consumes 43.66 mW. The static adder does not work at all at this frequency. Figure 2 shows the power consumption of the three adder designs and the frequency ranges of their operation. The figure shows that power consumption increases with frequency for both the static adder and the TPDL adder. However, power increases faster for the static circuit than for the TPDL circuit. The power consumption of the static adder increases linearly with frequency. The rate of power consumption increase for the TPDL adder decreases as the frequency increases and approximates a logarithmic function. At any frequency, the power consumption of the TPDL adder is about half of that for the pipelined static adder. The delay-power product of both the static and the TPDL adders is plotted in Figure 3. The delay-power product decreases as the power supply voltage decreases because the leakage current decreases. The figure shows a large difference in delay-power product between the TPDL and the static designs.
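The matched-frequency figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the power values stated in the text; the dictionary layout and function name are hypothetical.

```python
# Average power in mW at a common operating frequency (values from the text).
POWER_MW = {
    260: {"static": 26.0, "pipelined": 42.74, "tpdl": 23.82},
    550: {"pipelined": 77.4, "tpdl": 43.66},  # static adder does not run here
}


def tpdl_to_pipelined_ratio(freq_mhz):
    """TPDL power divided by pipelined static power at the same frequency."""
    row = POWER_MW[freq_mhz]
    return row["tpdl"] / row["pipelined"]


for freq in sorted(POWER_MW):
    print(f"{freq} MHz: TPDL/pipelined = {tpdl_to_pipelined_ratio(freq):.2f}")
```

At both frequencies the ratio comes out near 0.56, in line with the statement that the TPDL adder consumes about half the power of the pipelined static adder.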

TABLE 1: Comparison of CGaAs 4-Bit CLA Designs

| Logic Family     | F<sub>max</sub> [GHz] | P [mW] | Transistor Count |
|------------------|-----------------------|--------|------------------|
| Static           | 0.26                  | 26     | 236              |
| Pipelined Static | 0.55                  | 77.4   | 516              |
| TPDL             | 1.22                  | 61.79  | 450              |



Figure 2: Power Consumption of 4-Bit CLAs

## V. LOADING AND POWER SUPPLY EFFECTS ON THE CIRCUIT PERFORMANCE

Loading effects on the performance of the designed CLA circuits have also been studied. The three designs (static, pipelined static and TPDL) of the CLA have been simulated in HSPICE with a 1.75 volt power supply. The output load was varied to measure the maximum operating frequency of each circuit when driving different loads. The number of loads was varied from one to ten and the maximum operating frequency of each adder was recorded for each load. Figure 4 shows the maximum frequency of operation for the three adders driving different loads.



Figure 4: Loading Effects on CGaAs 4-Bit CLAs

For the static adder, the limiting parameter for the maximum frequency of operation is the propagation delay through the entire adder. Increasing the load will increase the output capacitance of the adder which increases the charging and discharging times of the output nodes. Therefore, the maximum frequency of the circuit decreases linearly with increasing output load from one to ten.

For the pipelined static adder, the limiting parameter for the maximum operating frequency is the longest stage propagation delay. Fortunately, the longest delay of the three stages is for the middle stage. Increasing the load will only limit the maximum frequency of the last stage. Therefore, increasing the load from one to six will not affect the maximum frequency of the adder. As the load increases to seven, the propagation delay through the last stage becomes longer than for the middle stage and the last stage delay becomes the critical delay, which limits the maximum frequency of operation. Beyond a fan-out of seven, the maximum frequency decreases linearly with increasing load.
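The plateau followed by a linear drop can be reproduced with a toy model in which the critical delay is the larger of a fixed middle-stage delay and a load-dependent last-stage delay. All delay values below are hypothetical, chosen only so that the crossover falls at a fan-out of seven as described; they are not simulation results from this work.

```python
# Hypothetical delays in ns, for illustration only.
MIDDLE_STAGE_DELAY_NS = 1.8      # assumed fixed, independent of output load
LAST_STAGE_BASE_NS = 1.0         # assumed unloaded last-stage delay
LAST_STAGE_PER_LOAD_NS = 0.125   # assumed extra delay per unit of fan-out


def critical_stage(fan_out):
    """Return (name, delay_ns) of the slowest pipeline stage for a fan-out."""
    last_delay = LAST_STAGE_BASE_NS + LAST_STAGE_PER_LOAD_NS * fan_out
    if last_delay > MIDDLE_STAGE_DELAY_NS:
        return "last", last_delay
    return "middle", MIDDLE_STAGE_DELAY_NS


for fan_out in range(1, 11):
    name, delay = critical_stage(fan_out)
    print(f"fan-out {fan_out:2d}: critical stage = {name:6s} delay = {delay:.3f} ns")
```

Because the maximum frequency is set by the slowest stage, it stays flat while the middle stage dominates and falls once the last stage takes over.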

For the TPDL adder, the load capacitance is separated from the output by a transmission gate. Thus, increasing the load capacitance does not increase the output capacitance of the TPDL circuit, which is another advantage of TPDL designs. The limiting factor for the maximum operating frequency is instead charge redistribution, a problem common to all dynamic circuit designs.

The power supply and input signal levels have also been varied to study their effects on the maximum operating frequency and the power consumption of the different logic designs of the 4-Bit CLA. The highest power supply voltage used is limited by the source-drain leakage current, while the highest input voltage level is limited by the gate leakage current of the transistors. The power supply and the peak-to-peak input voltage are varied from 1.75 volts to 1.00 volt in 0.25 volt steps. The maximum frequency of operation for each circuit, and its power consumption at that frequency for each power supply voltage, are listed in Table 2. The TPDL adder can function properly up to 292 MHz at a power supply of 1.00 volt. The power consumption is 2.1 mW, which is less than one-tenth of the power consumed by the static adder for proper functioning at the same frequency.

TABLE 2: Performances of CGaAs 4-Bit CLA Designs

| Logic Family     | Parameter              | 1.75 V | 1.5 V | 1.25 V | 1.0 V |
|------------------|------------------------|--------|-------|--------|-------|
| Static           | F<sub>max</sub> [GHz]  | 0.262  | 0.217 | 0.151  | 0.091 |
|                  | P<sub>av</sub> [mW]    | 26.0   | 12.0  | 4.82   | 1.56  |
| Pipelined Static | F<sub>max</sub> [GHz]  | 0.55   | 0.413 | 0.262  | 0.135 |
|                  | P<sub>av</sub> [mW]    | 77.4   | 34.1  | 12.0   | 3.25  |
| TPDL             | F<sub>max</sub> [GHz]  | 1.22   | 1.09  | 0.758  | 0.292 |
|                  | P<sub>av</sub> [mW]    | 61.79  | 30.07 | 12.39  | 2.1   |
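One way to read Table 2 is to divide each average power by the corresponding maximum frequency, which gives the energy drawn per clock cycle at that operating point (mW divided by GHz equals pJ). The sketch below does this with the tabulated values; the names are hypothetical, and this per-cycle figure is a derived quantity, not necessarily the delay-power product plotted by the authors.

```python
SUPPLY_V = [1.75, 1.5, 1.25, 1.0]

# Values copied from Table 2.
TABLE_2 = {
    "static": {"fmax_ghz": [0.262, 0.217, 0.151, 0.091],
               "p_mw": [26.0, 12.0, 4.82, 1.56]},
    "pipelined": {"fmax_ghz": [0.55, 0.413, 0.262, 0.135],
                  "p_mw": [77.4, 34.1, 12.0, 3.25]},
    "tpdl": {"fmax_ghz": [1.22, 1.09, 0.758, 0.292],
             "p_mw": [61.79, 30.07, 12.39, 2.1]},
}


def energy_per_cycle_pj(design):
    """Energy per clock cycle in pJ at F_max for each supply voltage."""
    row = TABLE_2[design]
    return [p / f for p, f in zip(row["p_mw"], row["fmax_ghz"])]


for design in TABLE_2:
    energies = energy_per_cycle_pj(design)
    cells = ", ".join(f"{v} V: {e:.1f} pJ" for v, e in zip(SUPPLY_V, energies))
    print(f"{design:9s} {cells}")
```

With these numbers the TPDL adder has the lowest energy per cycle at every supply voltage listed.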

## VI. CONCLUSIONS

Two-Phase Dynamic FET Logic (TPDL) is the best-performing dynamic logic family reported to date in CGaAs technology. The TPDL circuit has the best performance among the studied logic families because it is the fastest and has the lowest delay-power product (0.01 $\mu$W/gate/MHz) at all operating frequencies. The use of the TPDL architecture increases the throughput of the CGaAs 4-Bit CLA circuit. These results make TPDL CGaAs an excellent candidate for the next generation of high-speed, high-density and low-power ICs such as DSP chips and digital communication ICs.

## References

- [1] D. E. Grider et al., "0.7 Micron Gate Length Complementary Al<sub>0.75</sub>Ga<sub>0.25</sub>As/In<sub>0.25</sub>Ga<sub>0.75</sub>As/GaAs HIGFET Technology for High Speed/Low Power Digital Circuits," IEEE IEDM, pp. 331-334, 1992.
- [2] Bruce Bernhardt, M. LaMacchia, J. Abrokwath, J. Hallmark, R. Lucero, B. Mathes, B. Crawforth, D. Foster, K. Clauss, S. Emmert, T. Lien, E. Lopez, V. Mazzotta and B. Oh, "Complementary GaAs: A High Speed BiCMOS Alternative," IEEE GaAs IC Symposium, San Diego, Cal., Invited Paper, pp. 18-21, October 1995.
- [3] Khaled Ali Shehata, "Low-Power, High-Speed Dynamic Logic Families for Complementary Gallium Arsenide (CGaAs) Fabrication Processes," Ph.D. dissertation, Naval Postgraduate School, Monterey, California, September 1996.
- [4] Kevin R. Nary and Stephen I. Long, "A 1 mW 500 MHz 4-Bit Adder Using Two-Phase Dynamic FET Logic Gates," IEEE GaAs IC Symposium, pp. 97-100, 1992.