## Worst-Case Noise Area Prediction of On-Chip Power Distribution Network

Xiang Zhang \*, Jingwei Lu\*\*, Yang Liu+, and Chung-Kuan Cheng\*\*

\*ECE Dept., University of California, San Diego, CA, USA, Email: xiz110@ucsd.edu

+Institute of Electronic CAD, Xidian University, Xi'an, China, Email: liuyang@mail.xidian.edu.cn

\*\*CSE Dept., University of California, San Diego, CA, USA, Email: jlu@cs.ucsd.edu, ckcheng@ucsd.edu

Abstract—

We propose a prediction of the worst-case noise area of the supply voltage on the power distribution network (PDN). Previous works focus on the worst-peak droop to sign off the PDN. In this work, we (1) study the behavior of circuit delay under the worst-area noise, (2) study the worst-case noise area of a lumped PDN model, (3) develop an algorithm to generate the worst-case current for general PDN cases, and (4) predict the longest delay of a datapath due to power integrity. Experimental results show that the worst-area noise induces a larger delay than the worst-peak noise.

## I. INTRODUCTION

The aggressive advances in process technology increase the current demand and tighten the design rules. The resulting supply voltage variation increases transistor delay [1], clock jitter [2] and many other negative effects, which degrade the overall performance [3]. As a result, PDN analysis has become an important research topic [4]. PDN noise comes from the DC resistance and loop inductance of the power/ground lines, which result in IR drop and inductive noise  $(L\frac{di}{dt})$  at the load [5].

Figure 1 shows a typical PDN that consists of a voltage regulator module (VRM), PCB/package loop parasitics and an on-die power grid with decoupling capacitors. A successful PDN design requires the power/ground loops to present acceptable impedances at all frequencies of interest.



Fig. 1. A typical circuit diagram characterizing the impedance of PDN.

Many previous works focused on the worst voltage drop in time-domain [6], [7], [8] and frequency-domain [9], [10], [11] PDN analysis. Kouroussis et al. [12] proposed a vectorless approach for PDN integrity verification. This was later extended by Ferzli et al. [13] to a geometric approach for early estimation. Smith et al. [4] developed a method to systematically characterize the PDN noise. Ketkar et al. [14] studied a micro-architecture-based framework for PDN analysis. Chiprout [15] discussed pre-silicon stimulus and post-silicon activity generation to excite the worst-case voltage drop. Abdul Ghani et al. [16] verified the PDN using node and branch dominance. Swaminathan et al. [17] used power transmission lines to reduce the PDN noise.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from Permissions@acm.org.

SLIP '14, June 01 - 02 2014, San Francisco, CA, USA Copyright 2014 ACM 978-1-4503-3053-4/14/06 \$15.00. http://dx.doi.org/10.1145/2633948.2633950

Traditional PDN analysis concentrates on limiting the peak voltage drop. By applying a constant supply voltage, equal to the nominal voltage minus the peak voltage drop, to slow-slow (ss) corner transistors, designers can determine the maximum drop that the critical path can tolerate while still closing timing. However, this leads to over-design, as the peak drop of the supply noise may last only a very short time in real applications. Figure 2 shows two periodic supply voltage noise patterns applied to a datapath. The nominal delay of the circuit under  $V_{dd}=1V$  is  $D_0$<sup>1</sup>. The dash curve has a peak voltage drop of 0.25V and a noise area of 0.025T, which induces a signal delay of  $1.11D_0$. The dot curve has a peak voltage drop of 0.2V and a noise area of 0.066T, which induces a signal delay of  $1.23D_0$. Owing to its larger noise area, the dot curve induces roughly 11% larger delay ($1.23D_0$ versus $1.11D_0$), despite its 20% smaller peak noise.



Fig. 2. A datapath of inverter chain under two supply patterns. The dot curve induces larger delay despite smaller peak noise. (period  $T = T_1 - T_0$ )

In this paper, we focus on the prediction of the worst-area noise of a PDN under a certain time window and the worst-case load current profile which generates the worst-area noise. We then predict the maximum circuit delay under such a voltage noise profile. The importance of noise area estimation in PDN analysis has been proposed and discussed by Intel [1] and by Hashimoto's group at the device level [19]. However, to the best of our knowledge, none of the prior works provides a quantitative analysis of the impact of noise area on performance. Moreover, there is no prediction of the worst-case noise area. Our major contributions are as follows.

<sup>1</sup> $D_0 \approx 100 ps$  according to our HSPICE simulation with the 45nm PTM HP model [18].

- We discuss the impact of the voltage noise area on the circuit performance and compare it with that of the peak voltage noise.
- We study the closed-form expression of the worst noise area of an RLC tank case.
- We develop an algorithm to generate the worst-case current stimulus for general PDN systems in O(n) time<sup>2</sup>.
- We investigate the circuit delay under a complete PDN path and design experiments to validate our methods.

The remainder of the paper is organized as follows. In Section II, we formulate our problem. In Section III, we study the upper bound of the voltage noise area for an RLC tank case. As real PDN systems consist of uncertain frequency components, we develop an algorithm to handle the general cases in Section IV. It creates the current stimulus that maximizes the voltage noise area in linear time. The experimental results are discussed in Section V. Finally, we conclude the paper in Section VI.

## II. PROBLEM FORMULATION

We formulate the problem as maximizing the voltage noise area by designing the current waveform. A general PDN system, as shown in Figure 1, is characterized by the impulse response at the load node, *i.e.* h(t) (Figure 3(a)). Based on h(t) and a window size T, we design the current stimulus such that the voltage response has the maximum noise integral (area) over all possible intervals of length T in the time domain.

Current stimuli  $i_k(t)$  at node k are caused by circuit activities. We lump all the on-die loads into a single load current i(t) for our analysis. As only part of the transistors are active at any time, the magnitude of i(t) varies within a range. The range is application dependent and can be approximated through system-level simulation or post-silicon measurement. The assumptions of current constraints and zero transition time are used in many previous works [12], [13]. We follow the assumption of zero transition time and bound the total current demand by  $i(t) \in [0,1]$  in the rest of the paper.

The voltage noise v(i,t) of the PDN system is the convolution of i(t) and h(t) as Eq. 1.

$$v(i,t) = \int_0^{+\infty} h(\tau)i(t-\tau)d\tau \text{ s.t. } i(t) \in [0,1], \ t \ge 0$$
 (1)

Note that we can scale v(i,t) accordingly once the upper bound of  $i_k(t)$  is obtained.
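A minimal numerical sketch of Eq. 1 may help fix the notation. It assumes a uniformly sampled impulse response and approximates the integral by a rectangular sum; the function name `voltage_noise`, the spacing `dt` and the exponential test response are illustrative assumptions, not part of the original work.

```python
import numpy as np

def voltage_noise(h, i, dt):
    """Discrete form of Eq. 1: v[n] = dt * sum_k h[k] * i[n - k]."""
    h = np.asarray(h, dtype=float)
    i = np.asarray(i, dtype=float)
    if np.any(i < 0.0) or np.any(i > 1.0):
        raise ValueError("load current must stay within [0, 1]")
    # Full linear convolution, truncated to the length of the current vector.
    return dt * np.convolve(h, i)[: len(i)]

# Hypothetical impulse response (decaying exponential), for illustration only.
dt = 0.1
t = np.arange(0.0, 20.0, dt)
h = np.exp(-t)
i = np.ones_like(t)            # unit step load current
v = voltage_noise(h, i, dt)    # approaches the integral of h as t grows
```

For a unit step current the result is the sampled step response, which is the quantity $V_s(t)$ used later in this section.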

The window size T is a constant, which refers to one clock cycle or another critical time period, in order to correlate with overall system performance. We slide the window along the time axis of v(i,t). The area of noise at each time t is defined as A(i,t), which is the integral of v(i,t) over [t-T,t].

$$A(i,t) = \int_{t-T}^{t} v(i,t)dt = \int_{t-T}^{t} \int_{0}^{+\infty} h(t-\tau)i(\tau)d\tau dt$$
 (2)

<sup>2</sup>Here n refers to the vector length of the discretized impulse response of the PDN system. Full worst-case voltage waveform requires additional convolution of system impulse response and worst-case current, for which the total time complexity is O(nlog(n)).

The maximum voltage noise area of A(i,t) under window size T is defined as  $A_w$ . Current stimuli and time causing  $A_w$  are defined as  $i_w(t)$  and  $t_w$ , respectively. Similarly, we define the worst-case voltage response as  $v_w(t)$ , on which  $A_w$  is obtained at  $t_w$ .

$$A_w = \max_{i,t} A(i,t) = A(i_w, t_w) = \int_{t_w - T}^{t_w} v_w(t)dt$$
 (3)
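For a fixed current waveform, the sliding-window area of Eq. 2 and the inner maximization over $t$ in Eq. 3 can be sketched with a running sum. This is only an illustration under the assumption of uniform sampling, with the window expressed as a number of samples; `window_area` and the sample values are hypothetical.

```python
import numpy as np

def window_area(v, dt, window):
    """Area of v over the last `window` samples ending at each index (Eq. 2)."""
    v = np.asarray(v, dtype=float)
    # Running integral with a leading zero: running[n] = dt * sum(v[:n]).
    running = np.concatenate(([0.0], np.cumsum(v) * dt))
    end = np.arange(1, len(v) + 1)
    start = np.maximum(end - window, 0)   # windows are clipped at t = 0
    return running[end] - running[start]

# Hypothetical voltage noise samples, for illustration only.
v = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
area = window_area(v, dt=1.0, window=3)
n_w = int(np.argmax(area))                # index where the window area peaks
```

The outer maximization over the current $i$ in Eq. 3 is what the rest of the paper addresses; this sketch only locates the worst window for one given waveform.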

We can develop an algorithm to solve the above problem in linear time, based on the simplifications as below.



Fig. 3. An example of PDN system with (a) the impulse response h(t), (b) the step response  $V_s(t)$ , (c) the ramp response  $R_s(t)$  (integral of  $V_s(t)$ ) and (d) the noise area function  $A_s(t)$.

- Binary-Valued Worst Current: We set  $i_w(t)$  as a binary-valued function  $(0 \lor 1)$ .
- Current Decomposition: For each load current,  $i_w(t)$  can be decomposed into a series of step inputs  $s(t-t_k)$  with constant amplitude  $(\pm 1)$  and monotonically increased phase delay. Here s(t) is a step input and  $t_k$  is the phase delay of the  $k^{th}$  step input. Without loss of generality, suppose that  $\{t_0, t_1, \ldots\}$  is in ascending order.

$$i_w(t) = \sum_{k=0}^{\infty} (-1)^k s(t - t_k) = \sum_{k=0}^{\infty} (-1)^k s_k(t)$$
 (4)

To generate  $i_w(t)$ , we need to calculate the phase delay  $(t_k)$  of every step input  $(s_k)$ .

- Voltage Area Response of a Single Step Input  $A_s(t)$ : Figure 3(b) shows an example of the voltage response  $V_s(t)$  with a single input  $s_k(t)$ . We observe that the integral within window size T on the step response can be formulated in terms of a ramp response  $R_s(t) = \int_0^t V_s(t) dt$ , as shown in Figure 3(c). We substitute Eq. 4 into Eq. 2 and define  $A_{s_k}(t) = A(s_k(t), t)$  as follows.

$$A_{s_k}(t) = \int_{t-T}^t \int_0^{+\infty} h(t-\tau)(-1)^k s(\tau - t_k) d\tau dt$$

$$= \int_{t-T}^t (-1)^k V_s(t - t_k) dt$$

$$= (-1)^k (R_s(t - t_k) - R_s(t - T - t_k))$$
(5)

From Eq. 5, we can derive  $A_s(t)$  by setting  $t_k=0$  thus  $A_{s_k}(t)=A_s(t-t_k)$ , which is illustrated in Figure 3(d). It corresponds to the definite integral of  $V_s(t)$  in [t-T,t], as shown by the shaded area of Figure 3(b). Based on the definition of  $A_s(t)$ , the optimum phase delay sequence  $\{t_0,t_1,\ldots\}$ , and the optimum window location  $t_w$ , we can obtain the worst-case noise area  $A_w$  as follows.

$$A_w = \sum_{k=0} A_{s_k}(t_w) = \sum_{k=0} A_s(t_w - t_k)$$
 (6)
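The chain from $h(t)$ to $V_s(t)$, $R_s(t)$ and $A_s(t)$ in Eq. 5 (with $t_k = 0$) can be sketched numerically as two running sums and one shifted difference. The sketch assumes uniform sampling, a window given in samples, and $R_s(t) = 0$ for $t < 0$; the function name is hypothetical.

```python
import numpy as np

def step_area_response(h, dt, window):
    """Return (V_s, R_s, A_s) sampled on the grid of h.

    V_s is the step response, R_s the ramp response (running integral of V_s)
    and A_s[n] = R_s[n] - R_s[n - window], with R_s taken as 0 before t = 0.
    """
    h = np.asarray(h, dtype=float)
    v_s = np.cumsum(h) * dt
    r_s = np.cumsum(v_s) * dt
    # R_s delayed by `window` samples, zero-filled at the start.
    shifted = np.concatenate((np.zeros(window), r_s))[: len(r_s)]
    return v_s, r_s, r_s - shifted

# A discrete unit impulse as the simplest possible test response.
v_s, r_s, a_s = step_area_response([1.0, 0.0, 0.0, 0.0, 0.0], dt=1.0, window=2)
```

For this trivial input the area function rises for the first `window` samples and then stays flat, which is the behavior expected when the step response is constant.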

Based on all the above definitions and simplifications, we formulate our problem as a linear-constrained linear optimization, which is concisely defined as below.

- Input: h(t) and window size T.
- Output:  $\{t_0, t_1, \ldots\}$  and  $t_w$ ;  $i_w(t)$  is then calculated by Eq. 4.
- Objective: maximize  $A(i_w, t_w) = A_w$ .
- Constraint:  $i_w(t) \in [0, 1], \forall t \in [0, +\infty).$

## III. WORST NOISE AREA PREDICTION OF RLC TANK: ANALYTICAL SOLUTION

A typical PDN is a complex circuit that can be approximated by cascaded RLC tank models [9], [20]. We study the worst-case voltage noise area of an RLC tank model. We derive the closed-form expressions of the noise area from the ramp response of the model. The relations among the noise area, the quality factor, the decap C and its ESR  $R_2$  are studied.

Let A(s), H(s) and I(s) denote the Laplace transform of A(i,t), h(t) and i(t), respectively. Eq. 2 can be written as

$$A(i,t) = \int_{t}^{t+T} v(i,t)dt \qquad \xrightarrow{\text{Laplace}} A(s) = \frac{H(s)I(s)}{s}$$
 (7)

Fig. 4. A standard RLC tank model

Figure 4 shows a standard RLC tank.  $R_1$  and L are used to model the parasitic resistance and inductance of the PDN interconnects. C and  $R_2$  represent a decap with  $ESR_c$ .

The impedance profile of Figure 4 can be written as

$$Z(s) = \frac{s^2 L C R_2 + s(R_1 R_2 C + L) + R_1}{s^2 L C + s(R_1 + R_2) C + 1}$$
(8)

The quality factor, Q, and the resonant frequency,  $\omega_0$ , are

$$Q = \frac{1}{R_1 + R_2} \sqrt{\frac{L}{C}} , \quad \omega_0 = \frac{1}{\sqrt{LC}}$$
 (9)

For a normal PDN design with a limited cost budget,  $Q \geq 0.5$  and the RLC tank is underdamped. In the case of Q < 0.5, the PDN is over-designed with excessive decoupling capacitors, which is outside the scope of this paper.

To derive the expressions for the worst-case noise area, we first study the step and ramp response of the model.

**Lemma 1.** The step response of an underdamped RLC tank is

$$V_s(t) = R_1 + 2e^{-\alpha t} [A\cos(\beta t) - B\sin(\beta t)]$$
 (10)

where  $\alpha = \frac{\omega_0}{2Q}$ ,  $\beta = \sqrt{\omega_0^2 - (\frac{\omega_0}{2Q})^2}$ ,  $A = \frac{1}{2}(R_2 - R_1)$ ,  $B = R_2 \frac{\frac{1}{2Q}(1 + \frac{Q_2}{Q_1}) - (Q_2 + \frac{1}{Q_1})}{2\sqrt{1 - \frac{1}{4Q^2}}}$ ,  $Q_1 = \frac{1}{R_1}\sqrt{\frac{L}{C}}$ ,  $Q_2 = \frac{1}{R_2}\sqrt{\frac{L}{C}}$ .
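As a quick consistency check between Eq. 8 and Eq. 10 (a worked limit added for illustration, using the standard initial- and final-value theorems on the step response $Z(s)/s$), both endpoints of $V_s(t)$ follow directly:

$$V_s(0^+) = \lim_{s \to \infty} Z(s) = R_2 = R_1 + 2A, \qquad \lim_{t \to \infty} V_s(t) = \lim_{s \to 0} Z(s) = R_1$$

The first result matches Eq. 10 at $t = 0$ because $2A = R_2 - R_1$, and the second matches its limit once the exponential term has decayed.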

**Lemma 2.** The ramp response of an underdamped RLC tank is

$$R_{s}(t) = \int_{0}^{t} V_{s}(t)dt = R_{1}t + \frac{1}{\beta} [K_{1}\cos(\beta t) + K_{2}\sin(\beta t)]e^{-\alpha t}$$
 (11)

where  $K_{1} = \frac{R_{1}(Q^{2}Q_{2}^{2} - Q^{2} + 2QQ_{2} - Q_{2}^{2})}{QQ_{2}(Q - Q_{2})} \sqrt{1 - \frac{1}{4Q^{2}}}$  and  $K_{2} = -\frac{R_{1}(4Q^{3}Q_{2} - 3Q^{2}Q_{2}^{2} + Q^{2} - 2QQ_{2} + Q_{2}^{2})}{2Q^{2}Q_{2}(Q - Q_{2})}$ .

The ramp response  $R_s$  is derived from the integral of  $V_s$ . Based on  $R_s$ , the results lead to the following theorem.

**Theorem 1.** Given a window size T, the worst-case voltage noise area  $A_w$  of an underdamped RLC tank is,

$$A_w = \sum_{k=0}^{n} A_{s_k}(t_w) = \sum_{k=0}^{n} A_s(t_w - t_k)$$
 (12)

where  $t_w$  is set to a relatively large value at which  $h(t) \approx 0$ , and  $t_k$  is the time (phase delay) at which the local peaks/valleys of  $A_s$  occur, obtained by setting the derivative of  $A_s$  to zero.  $A_s$  can be expressed as follows

$$A_s(t) = \begin{cases} R_s(t) - R_s(t - T) & : t > T, \\ R_s(t) & : t \le T. \end{cases}$$
 (13)

Since  $A_s(t)$  is a piecewise-defined function upon the region of t (Eq. (13)), we can derive the results of  $t_k$  from the following two cases, (1) t > T and (2)  $t \le T$ .

(1) For t > T, local peaks/valleys  $t_k$  are

$$t_k = \begin{cases} \frac{1}{\beta} \left( arctan\left(\frac{A-X}{B-Y}\right) + k\pi \right) & : \frac{A-X}{B-Y} \ge 0\\ \frac{1}{\beta} \left( arctan\left(\frac{A-X}{B-Y}\right) + (k+1)\pi \right) & : \frac{A-X}{B-Y} < 0 \end{cases}$$
(14)

where k = 0, 1, ..., n,  $t_k > T$ ,  $X = e^{\alpha T}(A\cos(\beta T) + B\sin(\beta T))$ ,  $Y = e^{\alpha T}(A\sin(\beta T) + B\cos(\beta T))$ .

(2) For  $t \leq T$ , local peaks and valleys  $t_k$  occur at  $R_s'(t) = V_s(t) = 0$ , which are the solutions of a transcendental equation,

$$R_1 + 2e^{-\alpha t} [A\cos(\beta t) - B\sin(\beta t)] = 0.$$
 (15)

Because  $\alpha>0$ , only a limited number of  $t_k$  occur for  $t\leq T$ . Plugging the results of Eq. (14) and (15) back into Eq. (12),  $A_w$  can be derived.

## IV. WORST NOISE AREA PREDICTION FOR GENERAL PDN CASES: ALGORITHMIC SOLUTION

We propose an algorithm to find the worst-case noise area for a general PDN profile extracted from commercial tools. The pseudo-code of our method is presented in Algorithm 1. We use Figure 3 to illustrate each intermediate signal during the optimization. From the load current assumption in Section II, we can decompose i(t) into n step inputs with constant amplitude  $\pm 1.0$ . To calculate  $i_w(t)$  we only need to determine the phase delay of each step input. Given an arbitrary impulse response h(t) and window size T, our algorithm outputs  $t_w$  and all  $t_k$  such that  $A_w$  is achieved.

```
Algorithm 1 [i_w, t_w, A_w] = GetWorstCase(h, T)
```

- 1: **INPUT:** Impulse response h (length n), window size T
- 2: **OUTPUT:** Worst-case current wave  $i_w$ , window coordinate  $t_w$ , noise area  $A_w$
- 3: Set  $V_s$  as the step response of h,  $A_s[k]$  as the definite integral of  $V_s$  in [k, k+T)
- 4: Set  $A_s(t_{pv})$  as peaks and valleys of  $A_s$ ,  $|t_{pv}|=2m-1$
- 5: Set  $A_w = \sum_{i=0}^{m-1} A_s(t_{pv_{2i}}) - \sum_{i=0}^{m-2} A_s(t_{pv_{2i+1}})$
- 6: Set  $t_{cur}=0$  and  $t_w=x_0=t_{pv_{2m-2}}$
- 7: **for all**  $x \in t_{pv}$  in reverse order **do**
- 8: Set  $t_{new} = t_{cur} + (x_0 - x)$
- 9: **if** x is a peak **then**
- 10: Set  $i_w[t_{cur}:t_{new}]=1$
- 11: **else**
- 12: Set  $i_w[t_{cur}:t_{new}] = 0$
- 13: **end if**
- 14: Set  $x_0 = x$  and  $t_{cur} = t_{new}$
- 15: **end for**
- 16: **return**  $[i_w, t_w, A_w]$
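The following Python sketch illustrates the idea behind Algorithm 1 (the authors' Matlab implementation is not shown in the paper, so this is not their code). Rather than listing peaks and valleys explicitly, it marks the uphill samples of $A_s$, i.e. the stretches climbing from a valley to the next peak, and switches the current on over them after time reversal about $t_w$. Under the rectangular discretization assumed here, the sum of those climbs plays the role of the peak-minus-valley sum on line 5. The function name, the `dt` argument and the sample-based window are assumptions of this sketch.

```python
import numpy as np

def worst_case_current(h, dt, window):
    """Sketch of a worst-case current search in the spirit of Algorithm 1.

    h      : discretized impulse response, assumed to have decayed by its end
    dt     : sample spacing
    window : window size T expressed in samples (assumption of this sketch)
    Returns (i_w, n_w, a_w): binary current, window end index, noise area.
    """
    h = np.asarray(h, dtype=float)
    v_s = np.cumsum(h) * dt                      # step response V_s
    r_s = np.cumsum(v_s) * dt                    # ramp response R_s
    shifted = np.concatenate((np.zeros(window), r_s))[: len(r_s)]
    a_s = r_s - shifted                          # noise area function A_s
    # Per-sample increment of A_s; positive values mark uphill samples.
    rise = np.diff(a_s, prepend=0.0)
    uphill = rise > 0.0
    a_w = float(rise[uphill].sum())              # total of all uphill climbs
    n_w = len(h) - 1                             # window ends at the last sample
    # Time-reverse about n_w: a sample x steps before n_w sees A_s at x.
    i_w = uphill[::-1].astype(float)
    return i_w, n_w, a_w

# Hypothetical damped oscillation as impulse response, for illustration only.
dt = 0.5e-9
t = np.arange(0.0, 400e-9, dt)
h = np.exp(-4.4e7 * t) * np.cos(3.45e8 * t)
i_w, n_w, a_w = worst_case_current(h, dt, window=34)   # 34 samples = 17 ns
```

Every step is a single pass over the vector, so the sketch runs in time linear in the length of `h`. Choosing the end of the vector as the window position relies on the impulse response having settled there.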

**Design of Algorithm:** The algorithm can be described as follows. Firstly, we convolve h(t) (Figure 3(a)) with the step input s(t) to obtain the step response  $V_s(t)$  (Figure 3(b)), then calculate the noise area function  $A_s(t)$  (Figure 3(d)). To approach  $i_w(t)$ , we need to maximize (minimize) the contributions of all positive (negative) step inputs, which are no larger (smaller) than the sum of all peaks (valleys) of  $A_s(t)$ . Secondly, we extract all the peaks and valleys of  $A_s(t)$  into  $A_s(t_{pv})$ . The leftmost and rightmost elements of  $A_s$  are also added to  $A_s(t_{pv})$  if they are peaks. As every negative step input is sandwiched between two positive step inputs, each valley in  $A_s(t_{pv})$  is sandwiched between two peaks. Suppose that m peaks and thus m-1 valleys are extracted; then  $|t_{pv}| = 2m-1$ . Using  $t_{pv_j}$  to denote the  $j^{th}$  element of  $t_{pv}$ ,  $A_w$  is calculated at line 5 as

$$A_w = \sum_{i=0}^{m-1} A_s(t_{pv_{2i}}) - \sum_{i=0}^{m-2} A_s(t_{pv_{2i+1}})$$
 (16)

Thirdly,  $t_w$  is set to the time of the last peak  $t_{pv_{2m-1}}$  to make enough space for all step inputs to be correctly shifted. We calculate the phase delay  $t_k$  for each step input  $s_k(t)$ , and construct  $i_w(t)$  as the superposition of them. Specifically,  $t_k$  is determined by the parity of k as below.

- **k** is even: Let  $x = m - \frac{k}{2}$ , shift the  $k^{th}$  step input  $s_k(t)$  by aligning the  $x^{th}$  peak of  $s_k(t)$  to  $t_w$ . We have  $t_k = t_{pv_{2x}}$ .
- **k** is odd: Let  $x = m - \frac{k+1}{2}$ , shift the  $k^{th}$  step input  $s_k(t)$  by aligning the  $x^{th}$  valley of  $s_k(t)$  to  $t_w$ . We have  $t_k = t_{pv_{2x}}$ .

Figure 5(a) demonstrates the method by which we determine the phase delay of each step input. Notice that  $s_k(t)$  is actually aligned to the  $t_w$  axis at  $t_{pv_{2m-1-k}}$ . Figure 5(b) shows how we construct  $i_w(t)$ .

**Proof of Optimality:** Given arbitrary (h(t), T), our algorithm always outputs  $i_w(t)$  and  $t_w$ , with maximum noise area  $A_w$ .

**Theorem 2.** Our algorithm is optimal in maximizing  $A_w$ .

The proof of Theorem 2 can be found in Section S1.



Fig. 5. The generation of  $t_k$  and  $i_w(t)$  in terms of peak-to-valley distances.

**Analysis of Complexity:** The overall complexity of our method is O(n), as there are only finite operations included in Algorithm 1, while all of them are no more complex than linear. Here n is the length of the vector of the discretized PDN impulse response h(t). The value of n represents a trade-off between accuracy and efficiency of the optimization.
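Footnote 2 notes that the full worst-case voltage waveform needs one extra convolution costing O(n log(n)). A common way to reach that bound is FFT-based convolution; the sketch below shows one such realization with NumPy, as an assumption about how that step could be done rather than the implementation used in the paper.

```python
import numpy as np

def fft_convolve(h, i, dt):
    """Linear convolution of h and i via zero-padded real FFTs, O(n log n)."""
    h = np.asarray(h, dtype=float)
    i = np.asarray(i, dtype=float)
    size = len(h) + len(i) - 1
    nfft = 1 << (size - 1).bit_length()      # next power of two >= size
    spectrum = np.fft.rfft(h, nfft) * np.fft.rfft(i, nfft)
    # Zero padding to at least `size` avoids circular wrap-around.
    return dt * np.fft.irfft(spectrum, nfft)[:size]

# Small hypothetical vectors, for illustration only.
h_demo = np.array([1.0, -0.5, 0.25, -0.125])
i_demo = np.array([1.0, 1.0, 0.0, 1.0, 0.0])
v_demo = fft_convolve(h_demo, i_demo, dt=1.0)
```

Up to floating-point rounding, the result agrees with the direct sum computed by `np.convolve`.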

The proposed worst-case current prediction yields both the worst-case peak noise and the worst-case noise area for general PDN cases.

## V. EXPERIMENTAL RESULTS

We implement our algorithm in Matlab R2013a. The circuit performance is simulated by HSPICE D-2013.03-SP1. Our test datapath is extracted from an ISCAS85 benchmark circuit with a 0.13um standard SPICE model. All the experiments, including both the optimization and the simulation, are executed on a Windows 7 machine with an Intel i7 3.4GHz quad-core CPU and 16GB memory. We design our experiments as follows.

- We study the relation of the circuit delay and the supply voltage noise area.
- We analyze the delay of a datapath under the worst-peak and the worst-area noise for a standard RLC tank model.
- We compare the results of the worst-peak and worst-area noise prediction between RLC tank analytical solutions and algorithmic solutions for complete PDN paths with cascaded RLC tanks.
- We measure the delay of a datapath under the worst-area noise of a complete PDN path extracted from commercial software tools.

### A. Circuit Delay vs Supply Noise Area

The relation between the delay of a datapath and the supply noise area is investigated in this subsection. The test datapath is a customized circuit modified from C432 of the ISCAS85 benchmark. The delay between one input port and one output port is measured under various supply noise areas, as shown in Fig. 6. The supply voltage fluctuates from 0.76V to 1.2V. A negative voltage area means that the noise is dominated by droop, while a positive area means that it is dominated by overshoot. The end-to-end delay under a constant 1V supply is normalized to 1. Results show that the delay increases quadratically as the voltage droop area increases.



Fig. 6. Normalized delay of a datapath under different supply voltage noise area. (The delay under constant  $V_{dd}=1V$  is normalized to 1.)

### B. Critical Path Delay under Worst-Area and Worst-Peak Supply Noises of an RLC Tank

We create an RLC tank model as shown in Figure 4, where  $R_1 = 10m\Omega$, $L = 0.25nH$, $C = 33nF$ and $R_2 = 12m\Omega$.  The nominal voltage and window size T are set to 1V and 17ns, respectively. The simulation time step is set to 0.5ns. Using Algorithm 1, we generate the worst area/peak load current, the worst area/peak voltage response and the voltage noise area responses, as shown in Fig. 7. The worst peak noise is obtained by setting the window size to the minimum time step, i.e., T = 0.5ns. The worst-case times  $t_w$  of both the worst-area and the worst-peak case are aligned to 500us in Fig. 7. The load current beyond 500us is set to 1. Fig. 7(a) confirms that the worst-peak load current is a constant square waveform with a frequency of  $\beta$ , while the worst-area load current is a piecewise-defined function. The segment before 499.983us is a constant square waveform with a frequency of  $\beta$ . The segment between 499.983us and 500us is determined by the solution of Eq. 15. Fig. 7(b) shows the voltage response waveforms for the worst-peak and the worst-area noise. Fig. 7(c) compares the voltage noise area of the worst-peak and worst-area responses under the same target window size T = 17ns.
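As a small worked example, the tank parameters quoted above can be plugged into Eq. 9 and the definitions of $\alpha$ and $\beta$ from Lemma 1. The script below is an illustrative check added here, not part of the original experiments; it gives a quality factor of about 4, so the tank is underdamped, and an oscillation period of roughly 18ns, which is comparable to the window T = 17ns.

```python
import math

# Component values quoted in the text for the RLC tank of Figure 4.
R1, R2 = 10e-3, 12e-3          # ohm
L, C = 0.25e-9, 33e-9          # henry, farad

Q = math.sqrt(L / C) / (R1 + R2)           # quality factor, Eq. 9
omega0 = 1.0 / math.sqrt(L * C)            # resonant frequency in rad/s, Eq. 9
alpha = omega0 / (2.0 * Q)                 # damping rate, Lemma 1
beta = math.sqrt(omega0**2 - alpha**2)     # damped angular frequency, Lemma 1
period_ns = 2.0 * math.pi / beta * 1e9     # oscillation period in ns

print(f"Q = {Q:.2f}, f0 = {omega0 / (2.0 * math.pi) / 1e6:.1f} MHz, "
      f"period = {period_ns:.1f} ns")
```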

We apply the waveforms between 499.9us and 500.1us from Fig. 7(b) as the supply voltages for the datapath used in the previous subsection. The delay of the datapath under constant 1V is 16.2ns. For the delay measurement, we send an input pulse every 100ps and record the delay at the output port, as shown in Fig. 8. (Exp. 1 means that the input pulse starts at 499us. Exp. 1000 means that the input pulse starts at 500us.) Simulation results show that the maximum delay under the worst-area supply noise is 17ns, while the maximum delay under the worst-peak supply noise is 16.9ns. Our results confirm that the worst-area noise causes a longer circuit delay than the worst-peak noise.



Fig. 7. Load current, voltage noise and voltage area of the worst-case peak and area of a standard RLC tank model, T=17ns, (Nominal voltage 1V is superimposed in Fig.(b) and (c))



Fig. 8. The delay of the datapath under the worst-area and worst-peak noise of a standard RLC tank model (T=17ns)

### C. Worst-Area and Worst-Peak Noise of Multi-Stage Cascaded RLC Tanks

We use multi-stage cascaded RLC tanks to model a complete PDN path. We study three multi-stage cascaded RLC tank PDN cases to compare the results from Theorem 1 and Algorithm 1. The circuit diagram of the three cases is shown in Fig. 9 and the parameters are listed in Table I.



Fig. 9. Circuit diagram of a cascaded RLC Tank PDN

The multi-stage cascaded RLC tank can be decomposed into multiple single RLC tank circuits in different frequency regions. As an example, Case I in Table I is decomposed into three RLC tanks in Fig. 10.

Each tank contributes a portion of the worst-peak and the worst-area noise. By applying Theorem 1 and Claim 5 in [20], we calculate the noise contribution of each tank and estimate the global noise peak and area as shown in Table II. The RLC tank decomposition method provides a quick prediction of the worst area and peak noise directly from the impedance profile. However, it tends to overestimate the voltage peak noise and voltage noise area due to the cancellation effect between neighbouring tanks. We observe a relatively large estimation error for Case II, which is because the impedance peaks of its first two tanks are close to each other. On average, the prediction error of the RLC tank prediction method is 7.75% for the worst-peak noise and 12.3% for the worst-area noise.

TABLE I THE R, L, C PARAMETERS FOR THREE CASCADED RLC TANK CASES

| Cases   | I   | II   | III  |
|---------|-----|------|------|
| R1 (mΩ) | 5   | 38   | 5    |
| R2 (mΩ) | 0.1 | 8    | 0.5  |
| R3 (mΩ) | 3   | 2    | 5    |
| R4 (mΩ) | 0.3 | 1.7  | 0.8  |
| R5 (mΩ) | 5   | 10   | 10   |
| R6 (mΩ) | 10  | 4.6  | 5    |
| C1 (µF) | 32  | 35   | 30   |
| C2 (µF) | 1.5 | 35.8 | 1.0  |
| C3 (nF) | 12  | 26.1 | 30   |
| L1 (nH) | 40  | 530  | 16.7 |
| L2 (nH) | 1.0 | 95   | 1.0  |
| L3 (pH) | 50  | 157  | 100  |

Fig. 10. Three standard RLC tanks to model a cascaded tank in Case I of Table I
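The arithmetic behind Table II can be reproduced in a few lines. The sketch below copies the table values and checks two things: that the "Total Est. Results" column equals the plain sum of the three tank terms and the two valley corrections (this reading is inferred from the numbers, it is not stated explicitly in the text), and that the quoted average errors follow from the per-case error column.

```python
# Rows copied from Table II: three tank terms, two valley corrections,
# and the reported total estimate.
rows = {
    "Case I V_peak":   ([0.1592, 0.1263, 0.1742, -0.008, -0.005], 0.4467),
    "Case I A_w":      ([1.592, 1.263, 0.1366, -0.08, -0.05], 2.8616),
    "Case II V_peak":  ([0.1614, 0.0838, 0.2406, -0.023, -0.012], 0.4508),
    "Case II A_w":     ([1.614, 0.838, 0.7206, -0.23, -0.12], 2.8226),
    "Case III V_peak": ([0.0678, 0.1047, 0.1397, -0.016, -0.011], 0.2852),
    "Case III A_w":    ([0.678, 1.047, 0.300, -0.16, -0.11], 1.755),
}
# Sum of contributions per row, rounded to the printed precision.
sums = {name: round(sum(terms), 4) for name, (terms, _) in rows.items()}

# Per-case errors in percent, copied from the last column of Table II.
peak_err = [7.23, 11.31, 4.70]
area_err = [11.3, 19.45, 6.17]
avg_peak_err = sum(peak_err) / len(peak_err)
avg_area_err = sum(area_err) / len(area_err)

print(f"average error: peak {avg_peak_err:.2f}%, area {avg_area_err:.2f}%")
```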



Fig. 11. The impedance profile of a complete PDN path

### D. Critical Path Delay under Worst Noise Area Fluctuation: a Test Case

We study the worst-area noise (T=12.5ns) of a complete PDN path and the maximum datapath delay under the worst-area noise from an industrial design. The board model is extracted from Cadence Allegro Sigrity Power SI 16.6 and the package model is extracted from Ansoft Q3D 12.0. A fine on-die power grid model is used to simulate the die. The impedance profile of the complete PDN is shown in Fig. 11.

Plugging the impedance profile and T into Algorithm 1, we obtain the worst-peak and worst-area voltage responses shown in Fig. 12. Because the voltage droop of the complete PDN path is slightly high under our maximum current assumption (1(A)), we increase the nominal voltage to 1.15(V). Simulation results show that the worst-peak noise is 1.15-0.7779=0.3721(V) and the worst noise area  $A_w$  is 1.15(V)\*12.5(ns)-12.21(V\*ns)=2.165(V\*ns).

The datapath extracted from C432 of ISCAS85 is slightly modified for the new window size by removing some circuitry. The results of the delay measurement are shown in Fig. 13. We observe 0.22ns (1.8%) extra delay for the worst-area noise in this complete PDN path case. The comparison of the worst-area and worst-peak noise of this case is listed in Table III.



Fig. 13. The delay under worst-area and worst-peak supply noise for a complete PDN path  $\left(T=12.5ns\right)$ 

TABLE III COMPARISON OF THE WORST-PEAK AND THE WORST-AREA NOISE FOR A COMPLETE PDN PATH (T=12.5ns)

|                         | Worst-Peak | Worst-Area |
|-------------------------|------------|------------|
| Max Voltage Area (V*ns) | 1.695      | 2.165      |
| Delay of Datapath (ns)  | 12.33      | 12.55      |

## VI. CONCLUSIONS

In this paper, we predict the worst-case voltage noise area and measure its impact on the circuit performance. We propose an analytical solution for RLC tank cases and an algorithm to generate the worst-case current for general PDN cases. Our study shows that circuit delay is better correlated with the worst-area noise than with the worst-peak noise. In our empirical validation on a complete PDN path, the former introduces 1.8% more propagation delay than the latter.

## S1. PROOF OF OPTIMALITY ON THE PHASE DELAY OF THE WORST-CASE CURRENT

The worst-case current  $i_w(t)$  is a binary-valued function switching between 0 and 1. Based on this assumption, we prove that our algorithm could generate the optimum phase delay  $t_k$  for every step input  $s_k(t)$ , such that the superposition equals  $i_w(t)$ , as Theorem 2 shows. Fig. 5 shows that our algorithm determines  $t_k$  by the peak-to-valley distances in  $A_s(t)$ . Thus our target is to prove the correctness of Eq. (17), which is equivalent to the optimality of our algorithm as Theorem 2 shows.

$$A_w = \sum_{i=0}^{m-1} A_s(t_{p_i}) - \sum_{i=0}^{m-2} A_s(t_{v_i})$$
 (17)

where  $t_{p_i}$ ($t_{v_i}$)  represents the  $i^{th}$  element of the peaks (valleys). We prove the optimality by sequentially introducing the following lemmas. In the rest of the section, we assume  $i_w(t)$  is decomposed into N step inputs.

TABLE II ESTIMATED AND ALGORITHM 1 RESULTS OF THE WORST-PEAK AND WORST-AREA NOISE FOR THE THREE CASCADED RLC TANK CASES

| Cases                  | Tank 1 | Tank 2 | Tank 3 | Valley of Tank 1,2 | Valley of Tank 2,3 | Total Est. Results | Alg. 1 Results | err (%) |
|------------------------|--------|--------|--------|--------------------|--------------------|--------------------|----------------|---------|
| Case I $V_{peak}$ (V)   | 0.1592 | 0.1263 | 0.1742 | -0.008             | -0.005             | 0.4467             | 0.4151         | 7.23%   |
| Case I $A_w$ (V*ns)     | 1.592  | 1.263  | 0.1366 | -0.08              | -0.05              | 2.8616             | 2.571          | 11.3%   |
| Case II $V_{peak}$ (V)  | 0.1614 | 0.0838 | 0.2406 | -0.023             | -0.012             | 0.4508             | 0.4050         | 11.31%  |
| Case II $A_w$ (V*ns)    | 1.614  | 0.838  | 0.7206 | -0.23              | -0.12              | 2.8226             | 2.363          | 19.45%  |
| Case III $V_{peak}$ (V) | 0.0678 | 0.1047 | 0.1397 | -0.016             | -0.011             | 0.2852             | 0.2724         | 4.70%   |
| Case III $A_w$ (V*ns)   | 0.678  | 1.047  | 0.300  | -0.16              | -0.11              | 1.755              | 1.653          | 6.17%   |

Fig. 12. The worst-peak and worst-area current, voltage response and voltage area response (T = 12.5ns) of a complete PDN path. (d-f) shows the expanded view of (a-c) at the peak droop point.

**Lemma 3.** 
$$\exists \{x_0, x_1, \ldots\}, \text{ s.t. } A_w = \sum_{k=0}^{N-1} (-1)^k A_s(x_k)$$

*Proof:* Based on Eq. 4, we can have  $i_w(t)$  decomposed into N step inputs with constant amplitude  $\pm 1$ . Positive step inputs alternate with negative step inputs. Without loss of generality, suppose that the first step input is positive, and we have  $i_w(t) = \sum_{k=0}^{N-1} (-1)^k s(t-t_k)$ . Let  $x_k = t_w - t_k$  and we have the lemma proved.

Lemma 3 shows that the worst-case noise  $A_w$  equals the sum of a set of functional values sampled on  $A_s(t)$ , each with an alternating sign of  $\pm 1$ . Let  $X = \{x_0, x_1, \ldots, x_{N-1}\}$ . As  $A_w = \max_{i,t} A(i,t)$ , we need to maximize the positive components in  $A_s(x_k)$  while minimizing the negative components, which leads to Lemma 4.

**Lemma 4.**  $A_s(x_0)$  and  $A_s(x_{N-1})$  must be positive.

*Proof*: We prove this by contradiction. Suppose that  $A_s(x_0)$  is negative. We can simply remove  $x_0$  from X and thus reduce |X| to N-1. Meanwhile,  $A_w$  will be increased by  $|A_s(x_0)|$ , which contradicts the fact that  $A_w$  is maximum. As a result,  $A_s(x_0)$  is positive. The proof that  $A_s(x_{N-1})$  is positive can be obtained in a similar way.

Lemma 4 shows the boundary conditions for  $A_s$  on X. We divide  $A_s(t)$  into a series of *uphill* and *downhill* regions.

**Definition 1.** An uphill region (downhill region) corresponds to an interval on  $A_s(t)$  with monotonically increasing (decreasing) functional values.

As Figure 14 shows, each uphill region is sandwiched by two downhill regions, and vice versa. Suppose that there are  $m_p$  peaks and  $m_v$  valleys in  $A_s(t)$ , so that there are  $m=m_p+m_v$  local extreme points in total. The two end points of an uphill (downhill) region are a valley and a peak (a peak and a valley), respectively. As a result, there are m-1 regions on  $A_s(t)$  in total. For the  $j^{th}$  region  $r_j$ , we have  $r_j=[t_{pv_j},t_{pv_{j+1}}]$ .



Fig. 14. Downhill region $r_{j-1}$ is sandwiched by peak $pv_{j-1}$ and valley $pv_{j}$; uphill region $r_{j}$ is sandwiched by valley $pv_{j}$ and peak $pv_{j+1}$, etc.
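As an illustration of Definition 1, the following sketch locates the local extreme points of a sampled curve and splits it into uphill and downhill regions. The helper names `local_extrema` and `split_regions` are hypothetical, and the sketch assumes that neighboring samples of $A_s$ are distinct (no plateaus) and that only interior extrema are reported.

```python
from typing import List, Sequence, Tuple


def local_extrema(values: Sequence[float]) -> List[Tuple[int, str]]:
    """Return (index, kind) for every interior strict local peak or valley."""
    extrema = []
    for i in range(1, len(values) - 1):
        if values[i] > values[i - 1] and values[i] > values[i + 1]:
            extrema.append((i, "peak"))
        elif values[i] < values[i - 1] and values[i] < values[i + 1]:
            extrema.append((i, "valley"))
    return extrema


def split_regions(extrema: Sequence[Tuple[int, str]]) -> List[Tuple[int, int, str]]:
    """Pair consecutive extrema into regions r_j = [pv_j, pv_{j+1}].

    A region that starts at a valley is uphill, otherwise downhill.
    With m extrema this yields m - 1 regions.
    """
    regions = []
    for (start, kind), (end, _) in zip(extrema, extrema[1:]):
        regions.append((start, end, "uphill" if kind == "valley" else "downhill"))
    return regions


if __name__ == "__main__":
    samples = [0, 3, 1, 4, 2]  # toy samples of A_s, not measured data
    ext = local_extrema(samples)
    print(ext)
    print(split_regions(ext))
```

Because peaks and valleys alternate on such a curve, each reported region is monotonic between its two end points.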

**Lemma 5.** 
$$\forall j \in [0, m-1], \exists k \in [0, N-1], \text{ s.t. } t_{pv_j} = x_k.$$

*Proof*: We prove this by contradiction. Suppose that there is no $x_k$ in $X$ which equals $t_{pv_j}$, the location of the $j^{th}$ extreme point $pv_j$. Without loss of generality, let us make the following assumptions.

- Suppose that  $pv_j$  is a valley, which is sandwiched by two regions  $r_{j-1}$  and  $r_j$ , as Figure 14 shows.
- Suppose that $x_k$ is the sampling point which is the closest to $t_{pv_j}$, and $x_k > t_{pv_j}$. Thus we have $t_{pv_j} \in (x_{k-1}, x_k)$.
- Suppose that $x_{k-1}$ corresponds to a negative step input $s_{k-1}(t)$, while $x_k$ corresponds to a positive step input $s_k(t)$.

We divide all possible local sampling cases in the two neighboring regions of $pv_j$, $r_{j-1}$ and $r_j$, into two categories.

- If $x_{k-1} \in r_{j-1}$, we can shift $x_{k-1}$ rightwards to $t_{pv_j}$ and thus increase $A_w$ by $A_s(x_{k-1}) - A_s(t_{pv_j})$, which contradicts the fact that $A_w$ is maximum.
- If $x_{k-1} \notin r_{j-1}$, there must be no sampling point at $pv_{j-1}$. We can increase $A_w$ by adding one positive point at $pv_{j-1}$ and one negative point at $pv_j$, without changing the sign of any previous sampling point. This also contradicts the fact that $A_w$ is maximum.

This completes the proof under the above assumptions. As our proof and assumptions are general, the proofs for the other conditions (e.g., $pv_j$ is a peak, $x_k$ corresponds to a positive step input $s_k(t)$, etc.) can be obtained in a similar way and are omitted here.

We define $X_j$ to be the cluster of sampling points located in $r_j$. The two boundary points, $t_{pv_j}$ and $t_{pv_{j+1}}$, are also included in $X_j$. Suppose that $r_j$ is an uphill region; we define the noise area contribution of $r_j$ to $A_w$ as $A_w^j = \sum_{k=t_{pv_j}}^{t_{pv_{j+1}}} A_s(x_k)$.

**Lemma 6.**  $A_w$  is maximum only if  $A_w^j$  is maximum,  $\forall j \in [0, m-1]$ .

*Proof:* The proof is straightforward. As both $t_{pv_{j}}$ and $t_{pv_{j+1}}$ are included in $X_{j}$ according to Lemma 5, we can only select or deselect the internal sampling points of $r_{j}$, which is independent of the other regions. As a result, $X_{j}$ is an optimal substructure of $X$, which proves Lemma 6.

Based on Lemma 6, we only need to conduct local maximization of  $A_w^j$  on each  $X_j$ , and a global maximization of  $A_w$  is achieved, as Eq. (18) shows.

$$A_w = \sum_{j=0}^{m-1} A_w^j - \sum_{j=1}^{m-2} A_s(t_{pv_j})$$
 (18)



Fig. 15. A set  $X_j$  of n' local sampling points  $\{x'_0,\dots,x'_{n'-1}\}$  within region  $r_j$ .

**Lemma 7.** $A_w^j$ is maximum when $X_j = \{t_{pv_{j}}, t_{pv_{j+1}}\}.$

*Proof:* We illustrate our proof in Fig. 15. Assume that there are $n'$ sampling points in $X_j$, where $X_j = \{x'_0, x'_1, \dots, x'_{n'-1}\}$ in ascending order. From Lemma 5 we know that $x'_0 = t_{pv_{j}}$ and $x'_{n'-1} = t_{pv_{j+1}}$. Therefore, $n' = |X_j|$ is an even number, as $X_j$ starts from a negative sampling point and ends at a positive point.

$$\begin{aligned}
A_w^j &= \sum_{k=0}^{n'-1} (-1)^{k+1} A_s(x_k') \\
&= \sum_{k=1}^{\frac{n'}{2}-1} \left( A_s(x_{2k-1}') - A_s(x_{2k}') \right) + A_s(t_{pv_{j+1}}) - A_s(t_{pv_j}) \\
&\leq A_s(t_{pv_{j+1}}) - A_s(t_{pv_j})
\end{aligned}$$
 (19)

The last step of Eq. (19) holds because $r_j$ is an uphill region with monotonically increasing functional values, so that $A_s(x'_{k_1}) \leq A_s(x'_{k_2})$, $\forall\, 0 \leq k_1 < k_2 \leq (n'-1)$, and every paired term $A_s(x'_{2k-1}) - A_s(x'_{2k})$ is non-positive. From Eq. (19) we have $A_w^j \leq A_s(t_{pv_{j+1}}) - A_s(t_{pv_j})$, with equality when $X_j$ contains no internal sampling points, which proves the lemma.
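The bound of Lemma 7 can be checked by brute force on a small example. The sketch below is not part of the proof: the function names are hypothetical, and the input is a toy list of monotonically increasing samples standing in for $A_s$ on one uphill region, with both end points included.

```python
from itertools import combinations
from typing import Sequence


def region_contribution(points: Sequence[float]) -> float:
    """Evaluate sum_k (-1)^(k+1) * A_s(x'_k), i.e. -, +, -, + ... as in Eq. (19)."""
    return sum((-1) ** (k + 1) * a for k, a in enumerate(points))


def best_region_contribution(values: Sequence[float]) -> float:
    """Maximize the contribution over all choices of internal sampling points.

    values: increasing samples across an uphill region; values[0] is the
    valley and values[-1] is the peak, both always selected (Lemma 5).
    An even number of internal points keeps the total count n' even.
    """
    valley, peak = values[0], values[-1]
    interior = values[1:-1]
    best = region_contribution([valley, peak])
    for r in range(2, len(interior) + 1, 2):
        for combo in combinations(interior, r):  # keeps ascending order
            best = max(best, region_contribution([valley, *combo, peak]))
    return best


if __name__ == "__main__":
    uphill = [1, 2, 3, 5, 8]  # toy increasing samples
    # No choice of internal points beats the two end points alone.
    print(best_region_contribution(uphill), uphill[-1] - uphill[0])
```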

Based on the lemmas proved above, we finally obtain the following equation, which proves Eq. (17) and thus Theorem 2, and shows that our algorithm is optimum.

$$A_w = \sum_{j=0}^{m_p - 1} A_s(t_{p_j}) - \sum_{j=0}^{m_v - 1} A_s(t_{v_j}) = \sum_{k=0}^{N-1} (-1)^k A_s(x_k)$$
 (20)
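Eq. (20) can be illustrated on a toy sampled curve by comparing the sum of peaks minus the sum of valleys against an exhaustive search over all alternating-sign selections. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names are hypothetical, the samples are non-negative with distinct neighbors, an end point counts as a peak when it lies above its neighbor, and end-point valleys are skipped in line with Lemma 4.

```python
from itertools import combinations
from typing import Sequence


def peaks_minus_valleys(a: Sequence[float]) -> float:
    """Sum of samples at local peaks minus sum at interior valleys (Eq. (20))."""
    n = len(a)
    total = 0
    for i in range(n):
        left_lower = i == 0 or a[i - 1] < a[i]
        right_lower = i == n - 1 or a[i + 1] < a[i]
        left_higher = i > 0 and a[i - 1] > a[i]
        right_higher = i < n - 1 and a[i + 1] > a[i]
        if left_lower and right_lower:        # peak (end points allowed)
            total += a[i]
        elif left_higher and right_higher:    # interior valley
            total -= a[i]
    return total


def brute_force_worst_area(a: Sequence[float]) -> float:
    """Maximize sum_k (-1)^k * a[x_k] over every ordered subset of samples."""
    best = 0  # the empty selection corresponds to no switching
    for r in range(1, len(a) + 1):
        for combo in combinations(a, r):
            best = max(best, sum((-1) ** k * v for k, v in enumerate(combo)))
    return best


if __name__ == "__main__":
    samples = [0, 3, 1, 4, 2]  # toy samples of A_s, not measured data
    print(peaks_minus_valleys(samples), brute_force_worst_area(samples))
```

The exhaustive search grows exponentially with the number of samples, so it is only useful as a cross-check on short sequences; the extrema-based sum needs a single pass.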

#### ACKNOWLEDGEMENT

The authors would like to acknowledge the support of NSF CCF-1017864.

#### REFERENCES

- [1] M. Saint-Laurent *et al.*, "Impact of Power-Supply Noise on Timing in High-Frequency Microprocessors," *IEEE ADVP*, vol. 27, no. 1, pp. 135–144, 2004.
- [2] T. Pialis and K. Phang, "Analysis of Timing Jitter in Ring Oscillators Due to Power Supply Noise," in *ISCAS*, 2003, pp. 685–688.
- [3] Y.-M. Jiang and K.-T. Cheng, "Analysis of Performance Impact Caused by Power Supply Noise in Deep Submicron Devices," in *DAC*, 1999, pp. 760–765.
- [4] L. D. Smith *et al.*, "System Power Distribution Network Theory and Performance with Various Noise Current Stimuli Including Impacts on Chip Level Timing," in *CICC*, 2009, pp. 621–628.
- [5] M. Popovich, A. Mezhiba, and E. Friedman, *Power Distribution Networks with On-Chip Decoupling Capacitors*. Springer, 2008.
- [6] T.-H. Ding and Y.-S. Li, "Efficient Method for Modeling of SSN Using Time-Domain Impedance Function and Noise Suppression Analysis," *IEEE CPMT*, vol. 2, no. 3, pp. 510–520, Mar. 2012.
- [7] P. Du *et al.*, "Worst-Case Noise Prediction with Non-Zero Current Transition Times for Early Power Distribution System Verification," in *ISQED*, 2010, pp. 624–631.
- [8] J. Kim *et al.*, "Closed-Form Expressions for the Maximum Transient Noise Voltage Caused by an IC Switching Current on a Power Distribution Network," *IEEE EMC*, vol. 54, no. 5, pp. 1112–1124, 2012.
- [9] A. Waizman and C.-Y. Chung, "Resonant Free Power Network Design Using Extended Adaptive Voltage Positioning (EAVP) Methodology," *IEEE ADVP*, vol. 24, no. 3, pp. 236–244, 2001.
- [10] W. Kim, "Estimation of Simultaneous Switching Noise From Frequency-Domain Impedance Response of Resonant Power Distribution Networks," *IEEE CPMT*, vol. 1, no. 9, pp. 1359–1367, 2011.
- [11] I. Novak *et al.*, "Distributed Matched Bypassing for Board-Level Power Distribution Networks," *IEEE ADVP*, vol. 25, no. 2, pp. 230–243, 2002.
- [12] D. Kouroussis and F. N. Najm, "A Static Pattern-Independent Technique for Power Grid Voltage Integrity Verification," in *DAC*, 2003, pp. 99–104.
- [13] I. A. Ferzli, F. N. Najm, and L. Kruse, "A Geometric Approach for Early Power Grid Verification Using Current Constraints," in *DAC*, 2007, pp. 40–47.
- [14] M. Ketkar and E. Chiprout, "A Microarchitecture-Based Framework for Pre- and Post-Silicon Power Delivery Analysis," in *MICRO*, 2009, pp. 179–188.
- [15] E. Chiprout, "On-Die Power Grids: The Missing Link," in *DAC*, 2010, pp. 940–945.
- [16] N. H. A. Ghani and F. N. Najm, "Power Grid Verification Using Node and Branch Dominance," in *DAC*, 2011, pp. 682–687.
- [17] S. Huh, M. Swaminathan, and D. Keezer, "Low-Noise Power Delivery Network Design Using Power Transmission Line for Mixed-Signal Testing," in *IEEE IMS3TW*, 2011, pp. 53–57.
- [18] W. Zhao and Y. Cao, "Predictive Technology Model for Nano-CMOS Design Exploration," *JETC*, vol. 3, no. 1, 2007.
- [19] Y. Ogasahara *et al.*, "Validation of a Full-Chip Simulation Model for Supply Noise and Delay Dependence on Average Voltage Drop With On-Chip Delay Measurement," *IEEE CAS2*, vol. 54, no. 10, pp. 868–872, 2007.
- [20] X. Zhang, Y. Liu, and C.-K. Cheng, "Worst-Case Noise Prediction Using Power Network Impedance Profile," in *SLIP*, 2013.