# Clock-Gating and Its Application to Low Power Design of Sequential Circuits

Qing Wu, Massoud Pedram, and Xunwei Wu

Abstract: This paper models the clock behavior in a sequential circuit by a quaternary variable and uses this representation to propose and analyze two clock-gating techniques. It then uses the covering relationship between the triggering transition of the clock and the active cycles of various flip flops to generate a derived clock for each flip flop in the circuit. A technique for clock gating is also presented, which generates a derived clock synchronous with the master clock.
Design examples using gated clocks are provided next. Experimental results show that these designs have ideal logic functionality with lower power dissipation compared to traditional designs.

*Index Terms*: Clock gating, CMOS, logic, low power, sequential circuit, synthesis.

## I. INTRODUCTION

The sequential circuits in a system are considered major contributors to the power dissipation since one input of sequential circuits is the clock, which is the only signal that switches all the time. In addition, the clock signal tends to be highly loaded. To distribute the clock and control the clock skew, one needs to construct a clock network (often a clock tree) with clock buffers. All of this adds to the capacitance of the clock net. Recent studies indicate that the clock signals in digital computers consume a large percentage (15–45%) of the system power [1]. Thus, the circuit power can be greatly reduced by reducing the clock power dissipation.

Manuscript received September 7, 1997; revised January 29, 1999. This work was supported in part by DARPA under Contract F33615-95-C-1627 and in part by the NNSF of China under Grant 69773034. This paper was recommended by Associate Editor M. Glessner.

- Q. Wu and M. Pedram are with the Department of Electrical Engineering-Systems, University of Southern California, Los Angeles, CA 90089 USA.
- X. Wu is with the Institute of Circuits and Systems, Ningbo University, Ningbo, Zhejiang 315211, China.

Publisher Item Identifier S 1057-7122(00)02319-9.

Most efforts for clock power reduction have focused on issues such as reduced voltage swings, buffer insertion, and clock routing [2]. In many cases, switching of the clock causes a great deal of unnecessary gate activity. For that reason, circuits are being developed with controllable clocks. This means that from the master clock other clocks are derived which, based on certain conditions, can be slowed down or stopped completely with respect to the master clock.
Obviously, this scheme results in power savings due to the following factors.

1) The load on the master clock is reduced and the number of required buffers in the clock tree is decreased. Therefore, the power dissipation of the clock tree can be reduced.
2) The flip flop receiving the derived clock is not triggered in idle cycles and the corresponding dynamic power dissipation is thus saved.
3) The excitation function of the flip flop triggered by the derived clock may be simplified since it has a don't care condition in the cycles when the flip flop is not triggered by the derived clock.

The clock-gating problem has been studied in [3]–[5]. In [3], the authors presented a technique for saving power in the clock tree by stopping the clock fed into idle modules. However, a number of engineering issues related to the design of the clock tree were not addressed and, hence, the proposed approach has not been adopted in practice. In [4], a precomputation-based technique is used to generate a signal to control the load enable pin of the flip flops in the data path. The control signal is derived by investigating the relationship between the latched input and the primary outputs of the combinational blocks in the data path. The technique is useful only if the outputs of the block can be precomputed (predicted) for certain input assignments. In [5], the authors use a latch to gate the clock in control-dominated circuits. The problem is that the additional latch receives the clock's triggering signal, which results in extra power dissipation in the latch itself. Besides, this scheme results in the derived clock having a considerable skew with respect to the master clock.

This paper investigates various issues in deriving a gated clock from a master clock. In Section II, a quaternary variable is used to model the clock behavior and to discuss its triggering action on flip flops. Based on this analysis, two clock-gating schemes are proposed.
In Section III, we use the covering relation between the clock and the transition behaviors of the triggered flip flops to derive conditions for gating the master clock. Two common sequential circuits, i.e., an 8421 BCD code up counter and an excess-three code up counter, are then described to illustrate the procedure for finding a derived clock. In Section IV, a new technique for clock gating is presented which generates a clock synchronous with the master clock. This eliminates the additional skew between the master clock and the derived clock. Thus, the designed sequential circuit is a synchronous one. Finally, we present circuit simulation results to demonstrate the quality of the derived clock and its ability to reduce power dissipation in the circuit.

## II. DESCRIPTION FOR CLOCK BEHAVIOR AND CLOCK-GATING

In a synchronous system, a flip flop is triggered by a certain directional transition of a clock signal. If a flip flop is clocked by a signal other than the master clock, that signal must offer the same directional transition to trigger the flip flop and must be in step with the master clock.

For the clock signal $clk$ in a circuit, if we denote its logic values before and after a transition as $clk(t)$ and $clk^+(t)$, respectively, four combinations can be used to express different behaviors of the clock as shown in Table I, where a special quaternary variable $\widetilde{clk}$ denotes the corresponding behavior. The four values are $(0, \alpha, \beta, 1)$, where $\alpha, \beta$ represent two kinds of transition behaviors and 0, 1 represent two kinds of holding behaviors. (Note that although they have the same forms as signal values 0 and 1, their meanings are different.) In addition, we can also define a literal operation to identify the behavior of a clock

$$clk^{b} = \begin{cases} 1, & \text{if } \widetilde{clk} = b \\ 0, & \text{if } \widetilde{clk} \neq b \end{cases} \tag{1}$$

where $b \in \{0, \alpha, \beta, 1\}$.
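As a minimal sketch of Table I and the literal operation (1), the following Python fragment classifies a transition and evaluates a literal. The helper names `behavior` and `literal`, and the strings `"alpha"` and `"beta"`, are illustrative choices made here and are not notation from the paper.

```python
# Illustrative sketch of Table I and the literal operation (1).
# `behavior` and `literal` are hypothetical helper names.

BEHAVIOR = {
    (0, 0): "0",      # 0-holding
    (0, 1): "alpha",  # rising (alpha) transition
    (1, 0): "beta",   # falling (beta) transition
    (1, 1): "1",      # 1-holding
}


def behavior(before: int, after: int) -> str:
    """Return the quaternary behavior for the values before and after a transition."""
    return BEHAVIOR[(before, after)]


def literal(before: int, after: int, b: str) -> int:
    """Literal of (1): 1 if the behavior equals b, otherwise 0."""
    return 1 if behavior(before, after) == b else 0


# Exactly one of the four literals is 1 for any pair of signal values.
for before in (0, 1):
    for after in (0, 1):
        values = [literal(before, after, b) for b in ("0", "alpha", "beta", "1")]
        assert sum(values) == 1
```

Because each literal evaluates to 0 or 1, it can be combined with ordinary Boolean signals, which is what the gating expressions in this section rely on.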
Thus, the rising transition $clk^{\alpha}$ and the falling transition $clk^{\beta}$ of a clock are binary variables and can serve as arguments of Boolean operations. For example, from Table I we have

$$clk^{0} = \overline{clk} \cdot \overline{clk^{+}}, \quad clk^{\alpha} = \overline{clk} \cdot clk^{+}, \quad clk^{\beta} = clk \cdot \overline{clk^{+}}, \quad \text{and} \quad clk^{1} = clk \cdot clk^{+}.$$

Assume that there are $n$ flip flops in a sequential circuit and that their outputs and clock inputs are denoted by $Q_i$ and $clk_i$, $i=0,1,\cdots,n-1$, respectively. For a synchronous sequential circuit we have $clk_i=clk$, namely, all flip flops are triggered by the same master clock signal $clk$. However, if a flip flop $Q_i$ is to be disconnected from the master clock during some (idle) cycles, then we have to use a derived clock for $Q_i$. Notice that this derived clock should be in step with the master clock for the circuits to remain synchronous. Generally, we consider that the derived clock is obtained from the master clock $clk$ and the outputs of other flip flops $Q_0, \cdots, Q_{i-1}, Q_{i+1}, \cdots, Q_{n-1}$ (which make transitions following the triggering transition of their respective clocks). Since both AND gating and OR gating can be used for controlling the master clock, we have the following two clock-gating forms:

$$clk_i = g_i + p_i \cdot clk \tag{2}$$

$$clk_i = g_i \cdot (p_i + clk) \tag{3}$$

where $g_i$ and $p_i$ are functions of the flip-flop outputs $Q_0, \cdots, Q_{i-1}, Q_{i+1}, \cdots, Q_{n-1}$. Furthermore, we assume that the delay on the $p_i$'s and $g_i$'s is shorter than the clock pulse width.

Consider a flip flop triggered by the falling clock transition as an example (i.e., a negative edge-triggered flip flop). The timing relationships of $clk$, $p_i$, $p_i \cdot clk$, and $p_i + clk$ are shown in Fig. 1.
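The difference between AND gating and OR gating for a negative edge-triggered flip flop can be sketched on sampled waveforms. The sample values below are invented for illustration and are not read from Fig. 1: `p` changes only after a falling edge of `clk`, glitches while `clk` is 0, and is stable again before `clk` rises. The helper `falling_edges` is a hypothetical name.

```python
# Sketch: falling edges produced by AND gating (p*clk) versus OR gating (p+clk).
# Waveform samples are made up for illustration; they only respect the stated
# assumption that p settles while clk = 0.

def falling_edges(wave):
    """Return the sample indices at which the waveform goes from 1 to 0."""
    return [i for i in range(1, len(wave)) if wave[i - 1] == 1 and wave[i] == 0]


clk = [1, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1]
p   = [1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1]  # glitches only where clk == 0

and_gated = [a & b for a, b in zip(p, clk)]  # form (2) with g_i = 0
or_gated  = [a | b for a, b in zip(p, clk)]  # form (3) with g_i = 1

print("clk falling edges:  ", falling_edges(clk))
print("p*clk falling edges:", falling_edges(and_gated))
print("p+clk falling edges:", falling_edges(or_gated))
```

In this sketch the AND-gated clock falls only together with the master clock (and only when `p` was 1 at that moment), whereas the OR-gated clock also falls at every glitch of `p` during the low phase of `clk`.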
Note that $p_i$, which exhibits a delay with respect to the falling transition of the clock, may have glitches (represented by vertical grid lines) and reaches its final stable value in the zone where $clk = 0$. We can see that $p_i + clk$ cannot prevent the glitches and may even lead to an extra glitch. Therefore, only (2) is suitable for negative edge-triggered flip flops while (3) is not. Note that $g_i$ in (2) must be glitch-free when $clk = 0$. The above discussion shows that the falling transition of $clk_i$ in (2) occurs in the following two cases.

- 1) When $g_i = 0$ and $p_i = 1$, a falling transition of $clk$ leads to a falling transition of the derived clock $clk_i$. Therefore, $p_i$ may be named the transition propagate term.
- 2) When $g_i = 1$ and $g_i$ makes a falling transition, the derived clock $clk_i$ makes a falling transition since $clk$ and, hence, $p_i \cdot clk$ are 0 at that time instant. Therefore, $g_i$ may be named the transition generate term.

From this analysis, we obtain

$$clk_i^{\beta} = g_i^{\beta} + \overline{g_i} \cdot p_i \cdot clk^{\beta}. \tag{4}$$

Similarly, we can show that the derived clock signal in (3) is suitable for flip flops triggered by the rising transition of the clock. Here, $g_i$ in (3) must be glitch-free when $clk = 1$. The rising transition of $clk_i$ can be expressed as

$$clk_i^{\alpha} = g_i^{\alpha} + g_i \cdot \overline{p_i} \cdot clk^{\alpha}. \tag{5}$$

TABLE I
QUATERNARY REPRESENTATION FOR BEHAVIORS OF A SIGNAL

| $\widetilde{clk}$ | $clk(t)$ | $clk^+(t)$ | Behavior |
|-------------------|----------|------------|----------------------|
| 0 | 0 | 0 | 0-holding |
| $\alpha$ | 0 | 1 | $\alpha$-transition |
| $\beta$ | 1 | 0 | $\beta$-transition |
| 1 | 1 | 1 | 1-holding |

Fig. 1. Timing relationship of $clk$, $p_i(g_i)$, $p_i \cdot clk$, and $p_i + clk$. (Waveform annotation in the figure: extra glitch.)

TABLE II
NEXT STATES AND STATE BEHAVIORS OF AN 8421 BCD CODE UP COUNTER

| $Q_3$ | $Q_2$ | $Q_1$ | $Q_0$ | $Q_3^+$ | $Q_2^+$ | $Q_1^+$ | $Q_0^+$ | $\tilde{Q}_3$ | $\tilde{Q}_2$ | $\tilde{Q}_1$ | $\tilde{Q}_0$ |
|-------|-------|-------|-------|---------|---------|---------|---------|---------------|---------------|---------------|---------------|
| 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | α |
| 0 | 0 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | α | β |
| 0 | 0 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 0 | 1 | α |
| 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | α | β | β |
| 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | α |
| 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | α | β |
| 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | α |
| 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | α | β | β | β |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | α |
| 1 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | β | 0 | 0 | β |

It should be pointed out that the attached circuitry needed for generating the derived clock should be simple to avoid excessive power dissipation due to this overhead circuitry. Therefore, $g_i$ and $p_i$ in (2) and (3) should be relatively simple functions. In particular, we require $g_i$ to be simple to avoid dangerous glitches. Note that if $g_i = 0$, $p_i = 1$ in (2) or $g_i = 1$, $p_i = 0$ in (3), we return to the condition of applying the master clock $clk$ in a synchronous sequential circuit.

## III. DESIGN OF SEQUENTIAL CIRCUITS BASED ON DERIVED CLOCK

Assume that the derived clock for the flip flop $Q_i$ is $clk_i$. Falling transitions of $clk_i$ have to cover all cycles when the flip flop $Q_i$ makes transitions $Q_i^{\alpha}$ and $Q_i^{\beta}$. The covering relation can be expressed as

$$clk_i^{\beta} \ge (Q_i^{\alpha} + Q_i^{\beta}). \tag{6}$$

Since AND and OR operations on Boolean variables can be interpreted as minimum and maximum operations on these variables, i.e., $x \cdot y = \min(x, y)$ and $x + y = \max(x, y)$, we can obtain the following equations from (6):

$$clk_i^{\beta} \cdot (Q_i^{\alpha} + Q_i^{\beta}) = (Q_i^{\alpha} + Q_i^{\beta}) \tag{7}$$

$$clk_i^{\beta} + (Q_i^{\alpha} + Q_i^{\beta}) = clk_i^{\beta}. \tag{8}$$

Fig. 2. (a) Next state Karnaugh maps, (b) behavior Karnaugh maps, and (c) simplified next state Karnaugh maps.

Therefore, we should first obtain $(Q_i^{\alpha} + Q_i^{\beta})$ and then generate the derived clock $clk_i$ for flip flop $Q_i$. We will show the procedure by using design examples.

Example 1: Design of an 8421 BCD Code Up Counter. The next states and state behaviors of an 8421 BCD code up counter are shown in Table II, where the behavior of each flip flop $(Q_i \to Q_i^+)$ is denoted by $\tilde{Q}_i$. From Table II, the corresponding next state Karnaugh maps and behavior Karnaugh maps may be obtained, as shown in Fig. 2(a) and (b). In these maps, an empty box represents the don't care condition.
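The behavior columns of Table II follow mechanically from the present state and next state of the counter. The sketch below regenerates them for the ten BCD states; the function name `behaviors` and the strings `"alpha"` and `"beta"` are hypothetical choices made for this illustration.

```python
# Sketch: derive the behavior columns of Table II from the BCD count sequence.

SYMBOL = {(0, 0): "0", (0, 1): "alpha", (1, 0): "beta", (1, 1): "1"}


def behaviors(state: int, next_state: int, width: int = 4):
    """Return the behaviors of Q3..Q0 (most significant bit first)."""
    return [
        SYMBOL[((state >> i) & 1, (next_state >> i) & 1)]
        for i in reversed(range(width))
    ]


# An 8421 BCD up counter steps 0, 1, ..., 9 and then wraps back to 0.
bcd_rows = {s: behaviors(s, (s + 1) % 10) for s in range(10)}

for s, row in bcd_rows.items():
    print(format(s, "04b"), "->", format((s + 1) % 10, "04b"), row)
```

Each printed row corresponds to one line of Table II, with the list ordered as the behaviors of $Q_3$, $Q_2$, $Q_1$, $Q_0$.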
The two transition functions for each flip flop can be derived from their corresponding behavior Karnaugh maps as below

$$Q_3^{\alpha} = Q_2 \cdot Q_1 \cdot Q_0 \cdot clk^{\beta}, \quad Q_3^{\beta} = Q_3 \cdot Q_0 \cdot clk^{\beta} \tag{9}$$

$$Q_2^{\alpha} = \overline{Q_2} \cdot Q_1 \cdot Q_0 \cdot clk^{\beta}, \quad Q_2^{\beta} = Q_2 \cdot Q_1 \cdot Q_0 \cdot clk^{\beta} \tag{10}$$

$$Q_1^{\alpha} = \overline{Q_3} \cdot \overline{Q_1} \cdot Q_0 \cdot clk^{\beta}, \quad Q_1^{\beta} = \overline{Q_3} \cdot Q_1 \cdot Q_0 \cdot clk^{\beta} \tag{11}$$

$$Q_0^{\alpha} = \overline{Q_0} \cdot clk^{\beta}, \quad Q_0^{\beta} = Q_0 \cdot clk^{\beta}. \tag{12}$$

Therefore, we have

$$Q_3^{\alpha} + Q_3^{\beta} = (Q_3 + Q_2 \cdot Q_1) \cdot Q_0 \cdot clk^{\beta} \tag{13}$$

$$Q_2^{\alpha} + Q_2^{\beta} = Q_1 \cdot Q_0 \cdot clk^{\beta} \tag{14}$$

$$Q_1^{\alpha} + Q_1^{\beta} = \overline{Q_3} \cdot Q_0 \cdot clk^{\beta} \tag{15}$$

$$Q_0^{\alpha} + Q_0^{\beta} = clk^{\beta}. \tag{16}$$

From (13)–(15) we find that $Q_0 \cdot clk^{\beta} \geq (Q_i^{\alpha} + Q_i^{\beta})$, $(i = 1, 2, 3)$. From (12), we see that $Q_0^{\beta} = Q_0 \cdot clk^{\beta}$ can serve as the needed falling transition trigger for flip flops $Q_1$, $Q_2$, and $Q_3$, namely $clk_1^{\beta} = clk_2^{\beta} = clk_3^{\beta} = Q_0^{\beta}$. Comparing these with (4), we get $g_i = Q_0$, $p_i = 0$, and $clk_i = Q_0$ $(i = 1, 2, 3)$. As for $clk_0$, (16) indicates that the clock for $Q_0$ is none other than the master clock $clk$.

Since we only need to take care of the excitation input when the flip flop receives a triggering falling clock transition (i.e., the $\beta$ entries in the $\tilde{Q}_0$ map), we do not care what the excitation inputs are in other conditions. Therefore, the next state Karnaugh maps for flip flops $Q_1$, $Q_2$, and $Q_3$ in Fig. 2(a) can be simplified to those shown in Fig. 2(c). From Fig. 2(a) and (c) we obtain the corresponding synchronous and asynchronous designs, respectively, as shown in Fig. 3.
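The covering relation used in Example 1 can be checked by brute force over the ten BCD states. The following sketch confirms that a falling transition of $Q_0$ occurs in every cycle in which $Q_1$, $Q_2$, or $Q_3$ changes, and counts how often the derived clock actually triggers. Variable names such as `trigger_count` are illustrative.

```python
# Sketch: check that Q0's falling transition covers every change of Q1..Q3
# in the 8421 BCD up counter, i.e., relation (6) with clk_i^beta = Q0^beta.

def bit(x: int, i: int) -> int:
    """Return bit i of x."""
    return (x >> i) & 1


trigger_count = 0
for state in range(10):
    next_state = (state + 1) % 10
    # Q0^beta: Q0 goes from 1 to 0 in this cycle.
    q0_fall = int(bit(state, 0) == 1 and bit(next_state, 0) == 0)
    trigger_count += q0_fall
    for i in (1, 2, 3):
        changes = bit(state, i) ^ bit(next_state, i)  # Q_i^alpha + Q_i^beta
        assert q0_fall >= changes, (state, i)

print("cycles with a derived-clock trigger:", trigger_count, "of 10")
```

Under these assumptions the derived clock fires in 5 of the 10 cycles, so $Q_1$, $Q_2$, and $Q_3$ are left untriggered in the other half.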
(We say asynchronous, because now not all flip flops are triggered at the same time.) Obviously, the corresponding combinational circuits are simpler. Furthermore, since the three flip flops $Q_3$, $Q_2$, and $Q_1$ have no dynamic power dissipation in the half of the cycles in which they receive no clock trigger, and because the simpler combinational circuits have lower node capacitance, the asynchronous design saves power.

Fig. 3. Circuit realizations of BCD code up counter: (a) Synchronous design. (b) Asynchronous design.

TABLE III
THE NEXT STATES AND STATE BEHAVIORS OF AN EXCESS-THREE CODE UP COUNTER

| $Q_3$ | $Q_2$ | $Q_1$ | $Q_0$ | $Q_3^+$ | $Q_2^+$ | $Q_1^+$ | $Q_0^+$ | $\tilde{Q}_3$ | $\tilde{Q}_2$ | $\tilde{Q}_1$ | $\tilde{Q}_0$ |
|-------|-------|-------|-------|---------|---------|---------|---------|---------------|---------------|---------------|---------------|
| 0 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | α | β | β |
| 0 | 1 | 0 | 0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | α |
| 0 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 0 | 1 | α | β |
| 0 | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 | 1 | α |
| 0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | α | β | β | β |
| 1 | 0 | 0 | 0 | 1 | 0 | 0 | 1 | 1 | 0 | 0 | α |
| 1 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 0 | α | β |
| 1 | 0 | 1 | 0 | 1 | 0 | 1 | 1 | 1 | 0 | 1 | α |
| 1 | 0 | 1 | 1 | 1 | 1 | 0 | 0 | 1 | α | β | β |
| 1 | 1 | 0 | 0 | 0 | 0 | 1 | 1 | β | β | α | α |

Example 2: Design of an Excess-Three Code Up Counter. The next states and state behaviors of an excess-three code up counter are shown in Table III.
Transition functions for each flip flop can be derived as below

$$Q_3^{\alpha} = Q_2 \cdot Q_1 \cdot Q_0 \cdot clk^{\beta}, \quad Q_3^{\beta} = Q_3 \cdot Q_2 \cdot \overline{Q_1} \cdot clk^{\beta} \tag{17}$$

$$Q_2^{\alpha} = \overline{Q_2} \cdot Q_1 \cdot Q_0 \cdot clk^{\beta}, \quad Q_2^{\beta} = (Q_3 \cdot Q_2 \cdot \overline{Q_1} + Q_2 \cdot Q_1 \cdot Q_0) \cdot clk^{\beta} \tag{18}$$

$$Q_1^{\alpha} = (Q_3 \cdot Q_2 \cdot \overline{Q_0} + \overline{Q_1} \cdot Q_0) \cdot clk^{\beta}, \quad Q_1^{\beta} = Q_1 \cdot Q_0 \cdot clk^{\beta} \tag{19}$$

$$Q_0^{\alpha} = \overline{Q_0} \cdot clk^{\beta}, \quad Q_0^{\beta} = Q_0 \cdot clk^{\beta}. \tag{20}$$

Therefore, we have

$$Q_3^{\alpha} + Q_3^{\beta} = (Q_3 \cdot Q_2 \cdot \overline{Q_1} + Q_2 \cdot Q_1 \cdot Q_0) \cdot clk^{\beta} = Q_2^{\beta} \tag{21}$$

$$Q_{2}^{\alpha} + Q_{2}^{\beta} = (Q_{3} \cdot Q_{2} \cdot \overline{Q_{1}} + Q_{1} \cdot Q_{0}) \cdot clk^{\beta} = Q_{1}^{\beta} + \overline{Q_{1}} \cdot (Q_{3} \cdot Q_{2}) \cdot clk^{\beta} \tag{22}$$

$$Q_1^{\alpha} + Q_1^{\beta} = (Q_3 \cdot Q_2 \cdot \overline{Q_0} + Q_0) \cdot clk^{\beta} = Q_0^{\beta} + \overline{Q_0} \cdot (Q_3 \cdot Q_2) \cdot clk^{\beta} \tag{23}$$

$$Q_0^{\alpha} + Q_0^{\beta} = clk^{\beta}. \tag{24}$$

Based on (2) and (4), (22) and (23) can be reexpressed as

$$Q_2^{\alpha} + Q_2^{\beta} = [Q_1 + (Q_3 \cdot Q_2) \cdot clk]^{\beta} \tag{25}$$

$$Q_1^{\alpha} + Q_1^{\beta} = [Q_0 + (Q_3 \cdot Q_2) \cdot clk]^{\beta}. \tag{26}$$

Obviously, if we take $clk_3 = Q_2$, $clk_2 = [Q_1 + (Q_3 \cdot Q_2) \cdot clk]$, $clk_1 = [Q_0 + (Q_3 \cdot Q_2) \cdot clk]$, and $clk_0 = clk$, each flip flop is triggered exactly in the cycles in which it must change state, so the covering relation allows the excitation functions of all four flip flops to be set to $D_i = \overline{Q_i}$ $(i = 0, 1, 2, 3)$.
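The derived clocks of Example 2 can be exercised with a small behavioral simulation. The sketch below is an idealized model assumed for illustration: it applies (4) once per falling edge of the master clock, evaluates the propagate term $Q_3 \cdot Q_2$ with the values held before the edge, and ignores gate delays. The function name `step` is hypothetical.

```python
# Behavioral sketch of the excess-three counter with derived clocks
# clk0 = clk, clk1 = Q0 + Q3*Q2*clk, clk2 = Q1 + Q3*Q2*clk, clk3 = Q2,
# and toggle excitation D_i = not Q_i. Gate delays are not modeled.

def step(state: int) -> int:
    """Advance the counter by one falling edge of the master clock."""
    q = [(state >> i) & 1 for i in range(4)]   # q[0]..q[3] before the edge
    p = q[3] & q[2]                            # propagate term Q3*Q2
    new = q[:]

    new[0] = 1 - q[0]                          # clk0 = clk: Q0 toggles every cycle
    fall0 = q[0] & (1 - new[0])                # Q0^beta

    trig1 = fall0 | ((1 - q[0]) & p)           # (4) with g = Q0, p = Q3*Q2
    if trig1:
        new[1] = 1 - q[1]
    fall1 = q[1] & (1 - new[1])                # Q1^beta

    trig2 = fall1 | ((1 - q[1]) & p)           # (4) with g = Q1, p = Q3*Q2
    if trig2:
        new[2] = 1 - q[2]
    fall2 = q[2] & (1 - new[2])                # Q2^beta

    if fall2:                                  # clk3 = Q2
        new[3] = 1 - q[3]

    return sum(b << i for i, b in enumerate(new))


sequence = [0b0011]
for _ in range(10):
    sequence.append(step(sequence[-1]))
print([format(s, "04b") for s in sequence])
```

Starting from 0011, the model steps through the ten excess-three codes of Table III and returns to 0011, which is consistent with the toggle excitation $D_i = \overline{Q_i}$ being sufficient under these derived clocks.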
On the other hand, if we use the master clock for triggering all four flip flops, we obtain the following complicated excitation functions:

$$\begin{aligned} D_3 &= Q_2 \cdot Q_1 \cdot Q_0 + Q_3 \cdot \overline{Q_2}, \\ D_2 &= \overline{Q_2} \cdot Q_1 \cdot Q_0 + \overline{Q_3} \cdot \overline{Q_1} + \overline{Q_3} \cdot \overline{Q_0}, \\ D_1 &= Q_3 \cdot Q_2 + \overline{Q_1} \cdot Q_0 + Q_1 \cdot \overline{Q_0}, \\ D_0 &= \overline{Q_0}. \end{aligned}$$

Since the above $D_3$, $D_2$, and $D_1$ have complicated forms, their corresponding synchronous circuit realization will have a complicated combinational circuit with more node capacitance and, hence, higher power dissipation. On the other hand, the corresponding asynchronous circuit realization with $D_i = \overline{Q_i}$ is much simpler. There is power saving since the four flip flops are isolated from the triggering clock in the idle cycles.

Fig. 4. BCD code up counter by gating clock. (a) Asynchronous design. (b) Synchronous design.

Fig. 5. Power dissipation diagram.

## IV. SYNCHRONOUS DERIVED CLOCK AND ITS APPLICATION

In Example 1 of Section III, we take $clk_i^{\beta} = Q_0^{\beta}$, $(i = 1, 2, 3)$. From (12) we can also write $clk_i^{\beta}$ as $clk_i^{\beta} = Q_0 \cdot clk^{\beta}$, $(i = 1, 2, 3)$. Comparing this with (4), we have $g_i = 0$, $p_i = Q_0$, and $clk_i = Q_0 \cdot clk$. According to this form of the derived clock we get another asynchronous design, as shown in Fig. 4(a). At first glance, the circuit has one AND gate more than the design in Fig. 3(b). Furthermore, it appears that the derived clock $clk_{1-3}$ may have an increased phase delay. However, the timing relation shown in Fig. 1 indicates that the transition delay of $clk_{1-3}$ is independent of the delay of the $Q_0$ output.
The delay between $clk$ and $clk_{1-3}$ is only $2t_g$ ($t_g$ is the average delay of a gate), which is less than the delay of the flip-flop output.

Based on the above discussion, we can rewrite $clk_i = Q_0 \cdot clk$ in NOR form as $clk_i^* = \overline{\overline{Q_0} + \overline{clk}}$. Furthermore, we take $\overline{clk}$ from the previous stage of the clock tree. Thus, we obtain a new design, as shown in Fig. 4(b). If the delays of the inverter and the NOR gate are roughly the same, the falling transitions of $clk$ and $clk_{1-3}^*$ in the circuit will occur simultaneously. This design is synchronous in the sense that all flip flops are triggered in synchrony with the global clock.

We simulated the new design in Fig. 4(b) with SPICE 3f3 using a 2-$\mu$m CMOS technology; the simulation confirmed that the new design has ideal logic operation. We also measured the power dissipation of the two synchronous designs in Figs. 3(a) and 4(b). The power dissipation diagrams are shown in Fig. 5 and show that the new design reduces the power dissipation by 22%.

## V. CONCLUSION

The behavioral description of a clock is the basis for analyzing its triggering action on flip flops. Based on it, two types of clock gating were introduced to form a derived clock. We showed that the procedure for designing a derived clock could be systematized so as to isolate the triggered flip flop from the master clock in its idle cycles. The achieved power saving can be significant. However, the additional clock skew may lower the maximum operating frequency. Based on analyzing the timing relation in clock gating, we then presented a new technique for generating the derived clock, which is synchronous with the master clock. Circuit simulation confirmed the quality of the new derived clock and its capability to reduce power dissipation. More work is needed to develop a systematic design procedure and an algorithm for realizing the proposed design principles for clock gating in large sequential circuits.
The engineering issues mentioned in [3] must also be resolved for practical application, which would open the path for widespread adoption of the clock-gating technique in low-power design of custom ICs.

## REFERENCES

- [1] M. Pedram, "Power minimization in IC design: Principles and applications," *ACM Trans. Design Automation*, vol. 1, no. 1, pp. 3–56, Jan. 1996.
- [2] G. Friedman, "Clock distribution design in VLSI circuits: An overview," in *Proc. IEEE ISCAS*, San Jose, CA, May 1994, pp. 1475–1478.
- [3] E. Tellez, A. Farrah, and M. Sarrafzadeh, "Activity-driven clock design for low power circuits," in *Proc. IEEE ICCAD*, San Jose, CA, Nov. 1995, pp. 62–65.
- [4] M. Alidina, J. Monteiro, et al., "Precomputation-based sequential logic optimization for low power," *IEEE Trans. VLSI Syst.*, vol. 2, pp. 426–436, Dec. 1994.
- [5] L. Benini and G. De Micheli, "Symbolic techniques of clock-gating logic for power optimization of control-oriented synchronous networks," in *Proc. European Design Test Conf.*, Paris, France, 1997, pp. 514–520.

# A Complete Operational Amplifier Noise Model: Analysis and Measurement of Correlation Coefficient

Jiansheng Xu, Yisong Dai, and Derek Abbott

Abstract: In contrast to the general operational amplifier (op amp) noise model widely used, we propose a more complete and applicable noise model, which considers the correlation between the equivalent input voltage noise source $e_n$ and current noise source $i_n$. Based on the superposition theorem and equivalent circuit noise theory, our formulae for the equivalent input noise spectral density of an op amp are applied to both the inverting and noninverting input terminals. By measurement, we demonstrate that the new expressions are significantly more accurate. In addition, details of the measurement method for our noise model parameters are given.
A commercial operational amplifier (Burr–Brown OPA37A) is measured by means of a low-frequency noise power spectrum measuring system and the measured results of its noise model parameters, including the spectral correlation coefficient (SCC), are finally given.

*Index Terms*: Noise models, operational amplifiers, spectral correlation coefficient.

## I. INTRODUCTION

Recently, integrated operational amplifiers (op amps) have been used in more and more practical applications. With the continual improvement of their noise characteristics, they have been commonly found in the design of preamplifier circuits. For this reason, the calculation of the circuit noise of an op amp and its low-noise design are receiving more attention than ever.

Manuscript received June 1, 1998; revised May 20, 1999. This work was supported in part by the China Natural Science Foundation under Contract 69672023. This paper was recommended by Associate Editor K. Halonen.

- J. Xu and Y. Dai are with the School of Information Science and Engineering, Jilin University of Technology, Changchun, China 130025.
- D. Abbott is with the Centre for Biomedical Engineering (CBME), Electrical and Electronic Engineering Department, the University of Adelaide, Adelaide, SA5005, Australia.

Publisher Item Identifier S 1057-7122(00)02323-0.

At present, the noise models [1]–[3] of the overwhelming majority of op amps are illustrated as in Fig. 1(a) and (b). The commonly accepted two-port noise model is in Fig. 1(a). The op amp is considered noiseless and the equivalent voltage noise source $e_n$ and current noise source $i_n$ are referred back to the input terminals. Fig. 1(b) is commonly adopted when the positive terminal is grounded. To simplify calculation, in some models only $e_n$ is adopted and $i_n$ is neglected [4], [5]. The advantage of these equivalent circuits is simplicity and convenience.
However, in the area of small-signal detection, the noise specifications that must be met in the calculation and design of a low-noise circuit are becoming more demanding. The shortcoming of Fig. 1(a) and (b) is obvious: the correlation between voltage noise source $e_n$ and current noise source $i_n$ is not considered, giving rise to inaccuracy.

At present, methods for measuring $e_n$ and $i_n$ [6], [7] use a small value of source resistance to measure the equivalent input voltage noise $e_n$ and a very large source resistance to measure the equivalent input current noise $i_n$. Because the correlation is not considered, this measuring method yields only an approximate solution, not an accurate one. In fact, it can be calculated that neglecting the correlation term can lead to, at most, a 40% measurement error [7].

To solve this problem, a more complete op amp noise model is presented in this paper, based on Fig. 1(c), which considers the correlation between $e_n$ and $i_n$ for each input terminal; the formula for the equivalent input noise power spectral density for the inverting and noninverting input terminals can then be derived. With different source resistors, the noise model parameters of an op amp have been measured by means of a low-frequency noise measuring system and the noise model parameters, including the spectral correlation coefficient, are presented.

## II. A COMPLETE NOISE MODEL AND ITS EQUIVALENT INPUT NOISE POWER SPECTRUM

In order to improve the precision of the noise model, we first extend Fig. 1(a) and (b) by using one equivalent voltage noise source and one equivalent current noise source at each op amp input terminal. Second, it should be pointed out that the correlation between $e_n$ and $i_n$ at each input terminal should be considered for completeness.
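The order of magnitude of the error caused by neglecting the correlation can be illustrated with a simplified two-source model. The sketch below assumes a purely resistive source $R_s$, ignores its thermal noise, and uses the textbook sum of two correlated sources, $e_n^2 + i_n^2 R_s^2 + 2\gamma_1 e_n i_n R_s$, where $\gamma_1$ denotes the real part of the correlation coefficient. This is an assumption made for illustration, not the expression derived in this paper, and `rms_ratio` is a hypothetical helper name.

```python
import math

# Sketch: ratio of the rms equivalent input noise with correlation to the rms
# noise computed with the correlation term dropped.
# x = i_n * R_s / e_n is the normalized source resistance (an assumed variable).

def rms_ratio(x: float, gamma1: float) -> float:
    """sqrt((1 + x^2 + 2*gamma1*x) / (1 + x^2)) for the simplified model."""
    return math.sqrt(1.0 + 2.0 * gamma1 * x / (1.0 + x * x))


# Scan x over four decades for fully correlated sources (gamma1 = 1).
xs = [10 ** (k / 20) for k in range(-40, 41)]
worst_x = max(xs, key=lambda x: rms_ratio(x, 1.0))
worst_error = rms_ratio(worst_x, 1.0) - 1.0

print(f"worst case at x = {worst_x:.2f}, relative rms error = {worst_error:.1%}")
```

In this simplified model the discrepancy peaks when $i_n R_s = e_n$ and the sources are fully correlated, where the rms noise differs by a factor of $\sqrt{2}$ (about 41%), which is of the same order as the figure quoted from [7]. Including the thermal noise of the source resistance would reduce the ratio.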
Let $\gamma = \gamma_1 + j\gamma_2$ be the spectral correlation coefficient (SCC) between $e_{n1}$ and $i_{n1}$, given by $\gamma = S_{ei}(f)/\sqrt{S_e(f)S_i(f)}$, in which $S_e(f)$ and $S_i(f)$ are the power spectral densities of the voltage noise $e_{n1}$ and current noise $i_{n1}$, respectively, and $S_{ei}(f)$ is the cross-spectral density [8] between $e_{n1}$ and $i_{n1}$. Also let $\gamma' = \gamma'_1 + j \gamma'_2$ be the SCC between $e_{n2}$ and $i_{n2}$, in which $i_{n1}$ and $i_{n2}$ are the current noises at the two input terminals of the op amp; these are assumed to be uncorrelated with each other. Fig. 1(c) is a complete op amp noise model including eight parameters, i.e., $e_{n1}$, $i_{n1}$, $\gamma = \gamma_1 + j\gamma_2$, $e_{n2}$, $i_{n2}$, and $\gamma' = \gamma'_1 + j\gamma'_2$, each of which varies with frequency.

These parameters cannot practically be calculated from the internal noise sources of an op amp, because there are so many such sources that it is very difficult to calculate them separately and accurately. However, they can be calculated by measuring the equivalent input noise power spectrum with different source resistors. The relation between the eight parameters and the equivalent input noise power spectrum can be derived as follows.

Let $Z_1=R_1+jX_1$, $Z_2=R_2+jX_2$, and $Z_f=R_f+jX_f$, and let $e_1^2=4\kappa TR_1$, $e_2^2=4\kappa TR_2$, and $i_f^2=4\kappa T/R_f$, where $e_1^2$ and $e_2^2$ are the thermal noise voltage spectra of resistances $R_1$ and $R_2$, and $i_f^2$ is the current noise spectrum of resistance $R_f$. The equivalent noise circuit of Fig. 2(a) can be drawn as in Fig. 2(b). According to the superposition theorem, the gain of each noise source can be calculated first and then the total output noise can be obtained by addition of each noise source power. Multiplication by the noise bandwidth finally gives the output noise power