

Open access • Journal Article • DOI:10.1109/TVLSI.2012.2218838

# Slew-Rate Monitoring Circuit for On-Chip Process Variation Detection

Amlan Ghosh, Rahul M. Rao, Jae-Joon Kim, Ching-Te Chuang ...+1 more authors Institutions: Advanced Micro Devices, IBM, National Chiao Tung University, University of Utah Published on: 01 Sep 2013 - IEEE Transactions on Very Large Scale Integration Systems (IEEE) Topics: Slew rate, Process corners, Process variation, Integrated circuit and CMOS




# Slew-Rate Monitoring Circuit for On-Chip Process Variation Detection

Amlan Ghosh, *Member, IEEE*, Rahul M. Rao, *Member, IEEE*, Jae-Joon Kim, *Member, IEEE*, Ching-Te Chuang, *Fellow, IEEE*, and Richard B. Brown, *Senior Member, IEEE* 

Abstract—The need for efficient and accurate detection schemes to assess the impact of process variations on the parametric yield of integrated circuits has increased in the nanometer design era. In this paper, the difference of rise and fall slew is presented as another process-variation metric along with the delay in determining the relative mismatch between the drive strengths of nMOS and pMOS devices. The importance of considering both of these metrics is illustrated, and a new slew-rate monitoring circuit is presented for measuring the difference of rise and fall slew of a signal on the critical path of a circuit. Sensitivity analysis with multiple pulses as input has also been investigated. Bias generator circuits that track nMOS and pMOS threshold voltages have been incorporated, which makes the design less susceptible to process variation. Design considerations, simulation results, and characteristics of the slew-rate monitor circuitry in a 65-nm IBM CMOS process are presented, and a sensitivity of 50 MHz/50 ps for single pulse input is achieved. The measurement sensitivity of a fabricated slew-rate monitor in a 65-nm IBM CMOS technology is 0.11 V/µs, with 1089 pF as the output load of the slew-rate monitor.

*Index Terms*—Process variation compensation, process variation detection, slew, slew-rate monitor.

#### I. INTRODUCTION

MANUFACTURING variations due to systematic inter-die and random intra-die variations can cause significant discrepancies between the designed and the manufactured products in nanometer technologies. Process tolerances do not scale proportionally with the design dimensions, causing the relative impact of variations to increase with every new technology generation. Random local variations caused by random dopant fluctuation and microscopic effects, such as line edge roughness and surface roughness, further aggravate the problem. Precise detection and compensation schemes to mitigate variations and optimize the post-fabrication operating characteristics to meet the target frequency and power consumption have become indispensable for yield enhancement and improvement.

R. B. Brown is with the University of Utah, Salt Lake City, UT 84102 USA (e-mail: brown@utah.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Several compensation schemes have been reported recently. A typical scheme consists of a sensor block to determine the extent of variation followed by a compensation circuit that alters the operating characteristics of the design appropriately. Hence, the efficiency of the compensation scheme depends on the accuracy of the detection method. A correction scheme that senses variation in critical path delay and generates a suitable bidirectional body bias was presented in [1]–[4]. In [5], the authors used a combination of power and delay monitoring blocks to adjust the supply and threshold voltage ( $V_{TH}$ ) of devices in various modes of operation.

Most of these schemes are primarily based on monitoring the delay of the critical path of the circuit. However, purely delay-based compensation schemes can result in suboptimal design under certain scenarios, wherein  $V_{TH}$  mismatches between nMOS and pMOS devices result in nearly identical delay but inferior power characteristics as compared with nominal design. Power monitors determine the total switching plus leakage power of the system and hence may also fail to identify the effects of such mismatches.

This paper is organized as follows. Section II discusses the deficiency of using delay as the only metric to detect and characterize process variations. Section III presents the use of signal slew as an additional metric in combination with delay (and power) to determine suitable post-fabrication corrections to be applied. An analytical framework is also presented to substantiate the use of slew as a metric. Design details and sensitivity analysis of a novel slew-rate monitoring circuit are presented in Sections IV and V, respectively. Section VI shows the simulation results and characteristics of the slew-rate monitor circuitry in a 65-nm IBM PD/SOI CMOS process [6]. Section VII shows the measurement results in IBM 65-nm CMOS process. Section VIII concludes this paper.

#### II. LIMITATION OF DELAY METRIC AS DEVICE MISMATCH DETECTION

The  $V_{TH}$  of pMOS and nMOS devices and their difference affects the circuit characteristics. A balanced  $V_{TH}$  between the devices enables lower circuit operating voltage and power while providing a symmetric static noise margin. Any mismatch in  $V_{TH}$  between these two types of devices causes degradation of the operating margin and also in performance and power of the circuit [7].

Manuscript received October 9, 2011; revised May 27, 2012; accepted July 21, 2012. Date of publication November 16, 2012; date of current version August 2, 2013.

A. Ghosh is with Advanced Micro Devices, Austin, TX 78730 USA (e-mail: hosh.amlan@gmail.com).

R. M. Rao and J.-J. Kim are with the IBM T. J. Watson Research Center, Yorktown Heights, NY 10598 USA (e-mail: raorahul@us.ibm.com; jjkim2@us.ibm.com).

C.-T. Chuang is with National Chiao Tung University, Hsinchu 300, Taiwan (e-mail: ChingTe.Chuang@gmail.com).

A simple 33-stage ring oscillator is used as a representative vehicle for a replica of the critical path. Normalized delay of the ring oscillator is shown in Fig. 1 for a typical V<sub>TH</sub> variation of -50 to 50 mV for both device types [1], [6]. In this graph, the x-axis represents shifts in nMOS  $V_{TH}$  and the y-axis represents shifts in pMOS V<sub>TH</sub>, and the normalized ring oscillator delay is shown with color. In Fig. 1, point A corresponds to a condition with fast (and leaky) nMOS devices and slow pMOS devices, whereas point B represents slow nMOS devices with fast pMOS devices. It can be observed that delay values at A and B are nearly identical to point C, which corresponds to the nominal (i.e., intended) operating point of the circuit in the absence of any threshold voltage variation. Similarly, the region inside the dotted oval in Fig. 1 represents different V<sub>TH</sub> values that exhibit a delay very close to nominal. Thus, it may not be possible to detect the mismatch between the two types of devices with delay as the only metric, and hence, a purely delay-based compensation scheme would not generate any adjustments to bias and/or supply voltage in such scenarios. However, with the V<sub>TH</sub> of one type of device being lower than the nominal value, its leakage current would be significantly higher, resulting in a substantial increase in power consumption of the circuit. In addition, noise margins of the circuit are also degraded, rendering it more susceptible to noise failures.

This can also be illustrated using a simple analytical delay model of the ring oscillator [8]–[11]. Propagation delay  $(t_{pd})$  of a single stage of a CMOS ring oscillator can be expressed as

$$t_{pd} = \frac{C_L V_{dd}}{n} \left( \frac{1}{I_{\text{dsatn}}} + \frac{1}{I_{\text{dsatp}}} \right) \tag{1}$$

with *n* being a constant. $I_{dsatn}$ and $I_{dsatp}$ are the drain currents in saturation of the nMOS and pMOS devices of the inverter in the ring oscillator, and $C_L$ is the load capacitance. This analysis assumes a simple square-law relationship for the device currents, as shown in

$$I_{\rm dsat} = \beta_{\rm eff} \left( V_{GS} - V_{TH} \right)^2. \tag{2}$$

Substituting this into (1), and assuming that $\beta_{\text{neff}} = \beta_{\text{peff}}$, we get

$$t_{pd} = K \left( V_{\rm GSn}^{-2} + V_{\rm GSp}^{-2} + \frac{2V_{\rm THn}}{V_{\rm GSn}^3} + \frac{2V_{\rm THp}}{V_{\rm GSp}^3} \right) \tag{3}$$

where *K* is the constant term.
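The step from (1) and (2) to (3) can be reconstructed as a first-order expansion. The sketch below assumes $V_{TH} \ll V_{GS}$ and identifies $K$ with $C_L V_{dd}/(n\beta_{\rm eff})$; neither assumption is stated explicitly in the text, so this is one plausible reading rather than the authors' derivation.

$$\frac{1}{I_{\rm dsat}} = \frac{1}{\beta_{\rm eff} V_{GS}^{2}} \left( 1 - \frac{V_{TH}}{V_{GS}} \right)^{-2} \approx \frac{1}{\beta_{\rm eff}} \left( V_{GS}^{-2} + \frac{2V_{TH}}{V_{GS}^{3}} \right)$$

Applying this to both the nMOS and pMOS terms of (1) and collecting the common factor $C_L V_{dd}/(n\beta_{\rm eff})$ yields the four terms of (3).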

Now, one can consider a variation of $\Delta V_{THn}$ and $\Delta V_{THp}$ in the threshold voltages of nMOS and pMOS devices. The corresponding change in $t_{pd}$, the propagation delay of a single stage, can be computed as a function of threshold voltages and gate-source voltages ($V_{GS}$) of both types of devices as

$$t_{\rm pdnom} + \Delta t_{pd} = k_0 \Delta V_{\rm THn} - k_1 \Delta V_{\rm THp} + k_2. \tag{4}$$

Here, $k_2$ is representative of $t_{pdnom}$, the nominal propagation delay of a single stage of the ring oscillator. A similar approximation can be derived using a more accurate alpha-power model. It can be seen that the propagation delay is a function of the variation in both of the device types. Hence, it is difficult to decouple their mismatch from delay variations when nMOS and pMOS $V_{TH}$ vary in opposite directions.



Fig. 1. Normalized delay with typical nMOS and pMOS threshold voltage variation.



Fig. 2. Voltage transfer characteristics of a CMOS inverter.

For instance, from Fig. 1, the delay variation can be expressed as a function of nMOS and pMOS  $V_{TH}$  as

$$\frac{\Delta t_d}{t_{\rm pdnom}} = 1.74 \times \Delta V_{\rm THn} - 1.52 \times \Delta V_{\rm THp}. \tag{5}$$

This indicates that the delay of this ring oscillator is identical to the nominal delay  $t_{pdnom}$ , for all cases where  $\Delta V_{THn}/\Delta V_{THp} = 0.877$ .
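The blind spot implied by (5) can be checked numerically. The following is a minimal sketch that assumes the threshold shifts are expressed in volts (the text does not state the units of the fitted coefficients); the function and variable names are illustrative. With the rounded coefficients printed in (5), the zero-delay-change ratio evaluates to about 0.874, close to the 0.877 quoted above.

```python
# Coefficients of the fitted delay model (5); inputs are assumed to be in volts.
K_N = 1.74  # sensitivity of normalized delay to the nMOS threshold shift
K_P = 1.52  # sensitivity of normalized delay to the pMOS threshold shift


def normalized_delay_shift(dvth_n, dvth_p):
    """Fractional delay change predicted by (5)."""
    return K_N * dvth_n - K_P * dvth_p


# Ratio dVTHn/dVTHp along which (5) predicts no delay change at all.
blind_ratio = K_P / K_N

if __name__ == "__main__":
    dvth_p = 0.050                      # hypothetical 50 mV pMOS shift
    dvth_n = blind_ratio * dvth_p       # matching nMOS shift on the blind line
    print(f"blind ratio = {blind_ratio:.3f}")
    print(f"delay shift = {normalized_delay_shift(dvth_n, dvth_p):.2e}")
```

Any pair of shifts on that line leaves the delay at its nominal value, which is why a delay-only monitor cannot see this kind of mismatch.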

#### A. Effect of Mismatch in Device Variation on CMOS Noise Margin

A further investigation was done on the impact of  $V_{TH}$  variation on noise margin. Fig. 2 shows the basic voltage transfer characteristics of a CMOS inverter [11]. Points A and C characterize the unity slope points, while  $V_{IH}$ ,  $V_{IL}$ ,  $V_{OH}$ , and  $V_{OL}$  are defined based on points A and C on the voltage transfer characteristics [10].



Fig. 3. Normalized noise margin with typical nMOS and pMOS $V_{TH}$ variation. (a) $NM_H$. (b) $NM_L$.

Noise margins  $NM_L$  and  $NM_H$  are defined as

$$NM_L = V_{IL} - V_{OL} \tag{6}$$

$$NM_H = V_{OH} - V_{IH}. \tag{7}$$

Noise margins in the presence of variation are plotted in Fig. 3(a) and (b). As can be seen, $NM_H$ varies from -20% to 15% from slow nMOS and fast pMOS corner (SF) to fast nMOS and slow pMOS corner (FS). Similarly, $NM_L$ varies from -18% to 15% from FS corner to SF corner. If nMOS and pMOS threshold voltages are not measured separately, one might find the optimal operating point for compensating delay, but that might compromise the noise margin, or vice versa. The effect of $V_{TH}$ variation on the noise margin can be illustrated by a simple analytical model, as in (8) and (9), from the input–output voltage characteristics corresponding to points A and C in the voltage transfer curves of Fig. 2 [11]–[14]

$$NM_L = \frac{1}{4}(V_{DD} + 3V_{\text{THn}} + V_{\text{THp}}) \tag{8}$$

$$NM_{H} = \frac{1}{4}(V_{\text{THn}} + 3V_{\text{THp}} + V_{DD}). \tag{9}$$

A differential of the above equations gives the change in noise margins as functions of the change in threshold voltages

$$\Delta NM_L = \frac{1}{4} (3\Delta V_{\text{THn}} + \Delta V_{\text{THp}}) \tag{10}$$

$$\Delta NM_H = \frac{1}{4} (\Delta V_{\text{THn}} + 3\Delta V_{\text{THp}}). \tag{11}$$

These equations show analytically that noise margin changes linearly with the nMOS and pMOS  $V_{TH}$  variations. These analytical solutions match the simulation data shown in Fig. 3(a) and (b). Noise margins are compromised when the  $V_{TH}$  variations are in opposite directions. Thus, while the delay information is valuable, it is not sufficient to detect all combinations of variations in device parameters.
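To make (10) and (11) concrete, the sketch below evaluates both expressions for same-direction and opposite-direction threshold shifts. It follows the sign convention of the equations exactly as written; the 50 mV magnitudes are taken from the variation range used in the figures, and the function name is illustrative.

```python
def noise_margin_shift(dvth_n, dvth_p):
    """Return (dNM_L, dNM_H) from (10) and (11); all values in volts."""
    d_nm_l = 0.25 * (3.0 * dvth_n + dvth_p)
    d_nm_h = 0.25 * (dvth_n + 3.0 * dvth_p)
    return d_nm_l, d_nm_h


# Both thresholds shift the same way: both margins move together.
same_direction = noise_margin_shift(0.050, 0.050)

# Thresholds shift in opposite ways: the margins move apart,
# so one of them is degraded.
opposite_direction = noise_margin_shift(0.050, -0.050)

if __name__ == "__main__":
    print("same direction     (dNM_L, dNM_H):", same_direction)
    print("opposite direction (dNM_L, dNM_H):", opposite_direction)
```

In the opposite-direction case the two margins change by equal and opposite amounts, which is the situation a delay-only monitor would miss.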

Delay-based compensation schemes perform suitably when the performance characteristics of the nMOS and pMOS devices vary in a similar fashion, i.e., both the devices become either slower or faster. But, they would fail to find the optimal operating conditions when  $V_{TH}$  variation for nMOS and pMOS are in opposite directions, as illustrated in Fig. 1.

#### III. WHY SLEW?

If the ring oscillator (or critical path) has been designed to have equal rise and fall times, then any node can be used to identify the mismatch in the two types of devices. The rise time at the output of a CMOS gate is determined by the pull-up network, whereas the pull-down network controls the fall time. The difference between the fall and rise slew indicates the mismatch between the strength of the nMOS and pMOS devices. The normalized difference in fall and rise slew of a 33-stage ring oscillator by shifting both nMOS and pMOS  $V_{TH}$  from -50 to 50 mV from their nominal value is plotted in Fig. 4. As can be seen, point A represents the condition with fast nMOS devices and slow pMOS devices. This results in a fast fall time and slow rise time, which causes an identifiable negative change in the difference between the fall and rise slew. Similarly, in the presence of fast pMOS and slow nMOS devices, represented by point B, there exists an identifiable positive change in the slew difference. Thus, the slew difference can be used to suitably characterize the mismatch between the two types of devices.

It should also be noted that, in scenarios where the strength of both of the device types is affected in a similar fashion (i.e., both devices are either slow or fast), the impact on slew difference is small. However, in such scenarios, a delay-based monitor can be used to determine the extent of variation in both types of devices.

Rise and fall times can also be modeled as functions of  $V_{TH}$  of the two types of devices. For a given set of variations in nMOS and pMOS  $V_{TH}$ , the corresponding change in the difference of fall and rise time can be represented as

$$\Delta t_{F-R} = k_3 \left( \frac{2\Delta V_{\text{THp}}}{V_{\text{GSp}}^3} - \frac{2\Delta V_{\text{THn}}}{V_{\text{GSn}}^3} \right).$$
(12)

Fig. 4. Normalized difference of fall and rise slew with typical nMOS and pMOS threshold voltage variation.

As an illustration, from Fig. 4, the variation in normalized difference of rise and fall time of this oscillator can be expressed as a function of nMOS and pMOS $V_{TH}$ as

$$\frac{\Delta t_{R-F}}{t_{R-Fnom}} = 6.78 \times \Delta V_{THn} - 3.74 \times \Delta V_{THp}. \tag{13}$$
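Read together, the fitted models (5) and (13) form a two-by-two linear system in the two threshold shifts. The sketch below is not the authors' procedure; it only illustrates, under the assumption that both fits hold simultaneously and that shifts are in volts, how a delay reading and a slew-difference reading could be inverted to recover $\Delta V_{THn}$ and $\Delta V_{THp}$ separately. All names are illustrative.

```python
# Fitted coefficients quoted in (5) and (13); threshold shifts assumed in volts.
A = (
    (1.74, -1.52),  # normalized delay change, from (5)
    (6.78, -3.74),  # normalized rise/fall slew-difference change, from (13)
)


def forward(dvth_n, dvth_p):
    """Predict (delay change, slew-difference change) from threshold shifts."""
    delay = A[0][0] * dvth_n + A[0][1] * dvth_p
    slew = A[1][0] * dvth_n + A[1][1] * dvth_p
    return delay, slew


def solve_threshold_shifts(delay_change, slew_change):
    """Invert the 2x2 system with Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    dvth_n = (delay_change * A[1][1] - A[0][1] * slew_change) / det
    dvth_p = (A[0][0] * slew_change - A[1][0] * delay_change) / det
    return dvth_n, dvth_p


if __name__ == "__main__":
    # A mismatch that leaves the delay unchanged still shows up in the slew term.
    dvth_p = 0.050
    dvth_n = (1.52 / 1.74) * dvth_p
    delay, slew = forward(dvth_n, dvth_p)
    print(f"delay change = {delay:.2e}, slew-difference change = {slew:.3f}")
    print("recovered shifts:", solve_threshold_shifts(delay, slew))
```

Because the determinant of the coefficient matrix is nonzero, the two metrics carry independent information, which is the point of adding slew to delay.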

#### IV. SLEW-RATE MONITOR DESIGN

In this section, the design issues and challenges of a slew-rate monitor are described. Measuring the slew-rate of a signal from the critical path, especially in a multigigahertz design, requires very high-speed and precise dynamic apparatus with sensitivity in the picosecond range [15]. It is important to note that the logic gates in the critical path of a microprocessor are often designed with minimum size devices, so the slew of a signal in the critical path will be comparable to that of signals in the slew-rate monitor. To precisely measure a time (such as slew) requires circuitry that is faster than the time being measured. The slew-rate monitor is designed to be as fast as the process technology allows, but it still has gate delays and slews that are comparable to those coming from the circuit it is measuring. To overcome this problem, extra capacitive load is added at the output node of the replica critical path such that the input slew lies within the range of the slew-rate monitor. The replica critical path still uses minimum-size transistors so that it captures the characteristics of the active circuit. Fortunately, slew scales predictably with capacitive load, so that this approach provides reliable information on pMOS and nMOS characteristics.

A basic block diagram of the slew-rate monitor that measures the difference of rise and fall slew is shown in Fig. 5.

Fig. 5. Block diagram of the slew-rate monitor.

The signal under test (SUT) drives two comparators, A and B. Comparator-A, primarily composed of thick-oxide long-channel nMOS devices, compares the SUT level to a reference voltage equal to 80% of the supply voltage (Vdd). Similarly, comparator-B, primarily composed of thick-oxide long-channel pMOS devices, compares the SUT level to a reference voltage equal to 20% of the supply voltage. The reference voltages of 20% and 80% were chosen to provide sufficient noise margin against supply noise on the input signal. This slew-rate monitor topology is applicable for any set of two reference voltages. In a low-noise system, 10%–90% reference voltages could be used in order to improve the output sensitivity.

The output of one comparator switches before the output of the other comparator, depending on the slew of the signal and the direction of transition. Hence, the comparators generate two pulses of different width. These two comparator output signals (U and V in Fig. 5) are fed to a CMOS XOR gate built from minimum-size devices, which generates two pulses at its output.

The width of the first pulse (R) is a direct representation of the rise slew of the input, whereas the width of the second pulse (F) represents the fall slew. The control logic, using only the SUT as an input, controls the select lines of the pass-gate based 2-to-1 multiplexer (MUX) to separate the R and F pulses into PR and PF, which are used to drive a charge pump whose output voltage represents the difference in rise and fall time. The output pulse PR charges the capacitor C to increase the voltage from initial voltage level a to level b, while PF discharges it back to level c. Thus, the final output level (c) is proportional to the difference of rise and fall slew of the SUT. This output voltage c controls the frequency of a variable capacitance voltage controlled oscillator (VCO). The output frequency of the VCO can be directly monitored, or it can be easily converted to a digital value representative of the device mismatch, by using it to clock an n-bit counter for a fixed time.
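The comparator-plus-XOR front end can be illustrated with an idealized behavioral model. The sketch below is an assumption-laden simplification, not the circuit itself: the SUT is a trapezoid with linear edges, the comparators are ideal and delay-free, time is sampled in 1-ps steps, and all names and the 100 ps / 200 ps edge times are hypothetical. With 20% and 80% references, each XOR pulse lasts 60% of the corresponding full-swing edge time.

```python
def sut_level(t_ps, rise_ps, high_ps, fall_ps, vdd=1.0):
    """Idealized SUT: linear rise, flat top, linear fall (times in ps)."""
    if t_ps < rise_ps:
        return vdd * t_ps / rise_ps
    if t_ps < rise_ps + high_ps:
        return vdd
    t_fall = t_ps - rise_ps - high_ps
    return max(0.0, vdd * (1.0 - t_fall / fall_ps))


def xor_pulse_widths(rise_ps, fall_ps, high_ps=500, vdd=1.0):
    """Widths (ps) of the pulses at the XOR output, using 1-ps sampling."""
    widths, run = [], 0
    for k in range(rise_ps + high_ps + fall_ps):
        # Sample in the middle of each 1-ps step to avoid threshold ties.
        v = sut_level(k + 0.5, rise_ps, high_ps, fall_ps, vdd)
        u = v > 0.8 * vdd  # ideal comparator-A decision (80% reference)
        w = v > 0.2 * vdd  # ideal comparator-B decision (20% reference)
        if u != w:         # XOR of the two comparator outputs
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


if __name__ == "__main__":
    r_width, f_width = xor_pulse_widths(rise_ps=100, fall_ps=200)
    print("R pulse:", r_width, "ps  F pulse:", f_width, "ps")
    print("R - F  :", r_width - f_width, "ps")
```

The first pulse tracks the rising edge and the second the falling edge, so their difference is the quantity the charge pump integrates.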

A discharge path for the capacitor voltage is provided through the pair of series-connected nMOS devices indicated as reset circuit. The reset signal is asserted to initialize the output voltage at half of the supply rail (500 mV), and deasserted just prior to the application of the input to the slew monitoring system. This pre-charging mechanism provides the needed dynamic range, allowing the sign of the difference between rise and fall time to be positive or negative.

#### A. Comparator

The sensitivity of the slew-rate monitor is a function of the performance and voltage offset of the comparators. Two CMOS differential latched comparators with additional diode-connected transistors are designed to achieve the required high-speed and accuracy while operating with chosen reference voltages [16], [17]. Fig. 6 shows the implementation details for comparator-A, designed primarily using nMOS devices, with a reference voltage of 80% of Vdd. In the input pre-amplifier stage, N3 and N4 are diode-connected [16]. When the input voltage is less than the reference voltage, an additional current flows through N4 (in the other case, the extra current flows through N3). This additional current increases the voltage gain of the preamplifier and enhances the speed of the comparator.

Fig. 6. Schematic block diagram of comparator-A.

Fig. 7. Transfer characteristics of the comparators.

A regenerative latch is incorporated as the decision element to achieve a high gain [14]. It uses positive feedback from the cross-gate connection of N15 and N8. To understand its operation, assume  $i_{P5}$ , the current flowing through device P5, is much larger than  $i_{P4}$ , the current through P4. In that case, N15 and N7 are on, and N14 and N8 are off. If  $i_{P4}$  is increased until it is greater than  $i_{P5}$ , the drain-source voltage of N15 will be large enough to switch N8 on. N8 will draw current from N7, which will reduce the drain-source voltage of N7. This in turn will switch N15 off. This regenerative process accelerates the comparison.

N9 is used to shift the level of the output to enable a rail-to-rail output voltage. The voltage drop across it is maintained at nearly $V_{THn}$ by suitably sizing the device. The drive strength of a single latch may not be sufficient in scenarios where a significant amount of load capacitance is to be driven in a relatively short time. Hence, an inverting buffer stage comprising P6 and N13 is included as the output stage.

A complementary comparator-B using pMOS at the input stage and output latch has been designed with an intended reference voltage of 20% of Vdd. A rising-edge transition of 5 picoseconds slew was applied in simulation as an input to the comparators to determine their dynamic characteristics. The DC transfer characteristics are shown in Fig. 7. Comparators A and B exhibit a DC gain of 231 and 186, respectively.

Fig. 8. Schematic block diagram of integrator.

#### B. Integrator

A simple charge pump circuit shown in Fig. 8 is used to integrate the PR and PF signals. A doubly-balanced current mirror consisting of appropriately sized P4, N1, and N4 allows identical current flow through N2 and P2. P3 and N3 act as switches [17]. The inverted output of pulse PR switches on P3. Thus, for a time equivalent to the pulse width of PR, P3 is on, and current flows from the supply to the output node to charge the capacitance C, which is implemented as a thick-oxide decoupling capacitor. Similarly, pulse PF switches N3 on. When N3 is on, current flows from output capacitance C to ground through the N3-N2 stack, thereby discharging the output node to an extent proportional to the pulse width of PF. The integrating time constant is dependent upon the capacitance (C) and the integrating current, and hence can be tuned as required, depending upon the difference of input signal pulse widths.

In this design, a reasonably large W/L ratio was chosen for devices P2 and N2 to obtain a small time-constant while ensuring that the integrator output does not saturate to the supply rail. Thus, the final output voltage represents the difference in pulse width of PR and PF signals. The slew-rate monitor output voltage depends on the charging capacitor (C) and the charging/discharging current ($I_c$). If PR and PF are the widths of the pulses corresponding to rising and falling edges of the input signal, the output voltage ($V_{\text{out}}$) is

$$V_{\text{out}} = \frac{I_c}{C} \times (PR - PF) - 0.5. \tag{14}$$
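The pulse-dependent term of (14) is simple to evaluate. The following sketch computes only that term, that is, the change in capacitor voltage caused by one PR/PF pair; the current and capacitance values are hypothetical placeholders rather than the values used in the design, and the function name is illustrative.

```python
def integrator_step(pr_s, pf_s, i_c, cap):
    """Change in capacitor voltage (V) for pulse widths pr_s, pf_s in seconds.

    Implements the (I_c / C) * (PR - PF) term of (14) for an ideal charge pump
    with equal charging and discharging currents.
    """
    return (i_c / cap) * (pr_s - pf_s)


I_C = 40e-6    # hypothetical charging/discharging current, in amperes
CAP = 100e-15  # hypothetical integration capacitance, in farads

if __name__ == "__main__":
    dv = integrator_step(pr_s=120e-12, pf_s=70e-12, i_c=I_C, cap=CAP)
    print(f"voltage change for a 50 ps width difference: {dv * 1e3:.1f} mV")
```

A larger $I_c/C$ ratio gives more volts per picosecond of width difference, at the cost of reaching the supply rail sooner.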

#### C. VCO

A single stage of the VCO is shown in Fig. 9. While the pass-gate is OFF, the VCO acts as a simple inverter-based ring oscillator. While the pass-gate is partially turned on, the G2 inverter provides additional output current. G2, with the N1 channel resistance in series, shows an almost linear relationship with increasing control voltage. The integrator output voltage controls the current through the nMOS device that follows gate G2. The VCO output is useful for testing purposes. The stage delay is changed as a function of the input slew, which is reflected in the output frequency of the VCO.

Fig. 9. Schematic block diagram of one stage of the VCO.

Fig. 10. Block diagram of the control logic circuit.

The output frequency exhibits a fairly linear relationship with the control voltage over an input range of 0.3-1 V, with a sensitivity of ~1.69 MHz/mV.
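The linear VCO characteristic and the counter-based readout mentioned earlier can be sketched as follows. The 1.69 MHz/mV sensitivity and the 0.3-1 V range come from the text; the frequency at the bottom of the range and the counter gate time are hypothetical, and the model ignores any nonlinearity near the range limits.

```python
KVCO_HZ_PER_V = 1.69e9   # 1.69 MHz/mV from the text, expressed in Hz/V
V_MIN, V_MAX = 0.3, 1.0  # control-voltage range over which the VCO is linear


def vco_frequency(v_ctrl, f_at_vmin_hz):
    """Linear VCO model; f_at_vmin_hz is a hypothetical frequency at V_MIN."""
    if not V_MIN <= v_ctrl <= V_MAX:
        raise ValueError("control voltage outside the linear range")
    return f_at_vmin_hz + KVCO_HZ_PER_V * (v_ctrl - V_MIN)


def counter_value(freq_hz, gate_time_s):
    """Whole cycles counted while the counter is enabled for gate_time_s."""
    return int(freq_hz * gate_time_s)


if __name__ == "__main__":
    f_base = 1.0e9  # hypothetical frequency at V_MIN
    for v in (0.50, 0.52):
        f = vco_frequency(v, f_base)
        print(f"v_ctrl = {v:.2f} V -> f = {f / 1e6:.1f} MHz, "
              f"count in 1 us = {counter_value(f, 1e-6)}")
```

A longer gate time trades measurement latency for finer digital resolution of the same frequency shift.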

#### D. Control Logic and MUX Circuitry

The control logic shown in Fig. 10 generates the select line inputs that control the multiplexer in the slew-rate monitor to separate pulses R and F (from the output of the XOR gate) so that they can drive the charge pump. The SUT is the only input to the logic block; it is XOR-ed with a delayed version of itself to generate two pulses at node A as shown in Fig. 10. These two pulses are used to clock the D-flip-flop and to generate CR and CF, which control the MUX to select R or F.
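The edge-detection idea behind the control logic can be illustrated with a sampled-signal model. This is a behavioral sketch only: the flip-flop is modeled as a simple toggle on each detected pulse, which is an assumption about Fig. 10 rather than something stated in the text, and the signal lengths and delay are arbitrary.

```python
def edge_pulses(sut, delay):
    """XOR the sampled SUT with a copy delayed by `delay` samples (delay >= 1).

    Each transition of the SUT produces one pulse that is `delay` samples wide.
    The delayed copy is padded with the first sample, i.e. the SUT is assumed
    to have been steady before the record starts.
    """
    delayed = [sut[0]] * delay + sut[:-delay]
    return [a ^ b for a, b in zip(sut, delayed)]


def select_lines(pulses):
    """Toggle a state bit on the leading edge of every pulse (assumed model).

    Returns (cr, cf): cr is high from the rising-edge pulse until the
    falling-edge pulse, and cf is its complement.
    """
    cr, state, prev = [], 0, 0
    for p in pulses:
        if p and not prev:
            state ^= 1
        cr.append(state)
        prev = p
    cf = [1 - x for x in cr]
    return cr, cf


if __name__ == "__main__":
    sut = [0] * 5 + [1] * 10 + [0] * 5
    pulses = edge_pulses(sut, delay=2)
    cr, cf = select_lines(pulses)
    print("SUT   :", sut)
    print("pulses:", pulses)
    print("CR    :", cr)
    print("CF    :", cf)
```

In this model CR is asserted around the rising edge and CF around the falling edge, which is what the MUX needs in order to route R and F to the correct charge-pump input.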

#### E. Process Variation Immune Bias Generator

The slew-rate monitor circuit was implemented using thick-oxide devices to reduce the effects of process variation. However, variations in input stage nMOS and pMOS transistor threshold voltages impact the circuit operating point.

Fig. 11. Schematic diagram of the process variation tolerant bias generator.

If the bias voltages are kept constant, overdrive voltages change due to the variation in threshold voltage. This motivates the design of a biasing circuit, which senses the $V_{TH}$ variation due to process variation and generates an output voltage accordingly [18]. Consider the nMOS differential input stage of comparator-A shown in Fig. 6. W/L of the N1-N2 pair and diode-connected N3-N4 pair and W/L of P2 and P3 are identical. Ignoring the contribution of the diode-connected pair, the low-frequency small signal voltage gain can be approximated as in [19]

$$\begin{aligned} A_{v} = \frac{g_{mN1}}{g_{dsP3} + g_{dsN1}} &= \frac{2}{(k_{1}\lambda_{n} + k_{2}\lambda_{p})} \frac{K}{\sqrt{I_{D}}} \\ &= \frac{2}{(k_{1}\lambda_{n} + k_{2}\lambda_{p})} \frac{K}{(V_{\text{GSN5}} - V_{\text{THn5}})} \end{aligned}$$

where K,  $k_1$ , and  $k_2$  represent constants consisting of parameters of N1 and P3. V<sub>GSN5</sub> and V<sub>THn5</sub> are the gate to source voltage and V<sub>TH</sub> of N5. It can be seen that the input stage gain is a strong function of V<sub>GSN5</sub> and V<sub>THn5</sub>, assuming all other parameters are constant.

In the presence of process variation, V<sub>THn5</sub> can be different than the nominal value, resulting in deviation in overdrive voltage (V<sub>GSN5</sub>-V<sub>THn5</sub>) if the gate to source voltage is kept constant. The objective is to make this overdrive voltage,  $V_{GSN5}$ - $V_{THn5}$ , which will be called  $V_{BB}$ , independent of  $V_{TH}$ variation. It can be expressed as

$$V_{BB} = V_{\rm GSN5} - V_{\rm THn5} \tag{15}$$

$$V_{\rm GSN5} = V_{BB} + V_{\rm THn5}. \tag{16}$$

This indicates that gate-to-source voltage should be the sum of the intended overdrive voltage and $V_{TH}$ of the device, thereby ensuring that the $V_{TH}$ variations will not change the bias current. A circuit which generates a bias voltage that is the sum of an input bias voltage and $V_{TH}$ is shown in Fig. 11 [18]. In this circuit, $(W/L)_{BN1}$ is kept much higher than $(W/L)_{BN2}$, ensuring that $V_X \approx V_{THBN1}$. (W/L) of BN4 and BP1 are kept larger than that of BN5, BP2, and BN3. This ensures that voltage drop across BN4 at any current will be almost $V_{THBN4}$. Hence, voltage at the bias node is

$$V_{\rm BIAS} = V_{\rm BIASIN} + V_{\rm THn4}. \tag{17}$$

This bias voltage is applied at the gate of BN5 in the comparator circuit (Fig. 6).
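The benefit of a threshold-tracking bias follows directly from (15)-(17). The sketch below compares the overdrive voltage under a fixed gate bias with the overdrive under an ideal tracking bias; it assumes the bias generator adds exactly the actual threshold voltage of the biased device, and all numeric values and names are hypothetical.

```python
def overdrive_fixed_bias(v_gs_nominal, vth_nominal, dvth):
    """Overdrive when the gate bias stays at its nominal value."""
    return v_gs_nominal - (vth_nominal + dvth)


def overdrive_tracking_bias(v_bb, vth_nominal, dvth):
    """Overdrive when the bias is V_BB plus the actual threshold voltage."""
    vth_actual = vth_nominal + dvth
    v_gs = v_bb + vth_actual  # ideal tracking bias, as in (16)
    return v_gs - vth_actual


if __name__ == "__main__":
    V_BB = 0.30      # hypothetical intended overdrive, in volts
    VTH_NOM = 0.45   # hypothetical nominal threshold voltage, in volts
    for dvth in (-0.05, 0.0, 0.05):
        fixed = overdrive_fixed_bias(V_BB + VTH_NOM, VTH_NOM, dvth)
        tracked = overdrive_tracking_bias(V_BB, VTH_NOM, dvth)
        print(f"dVTH = {dvth:+.2f} V: fixed = {fixed:.3f} V, "
              f"tracking = {tracked:.3f} V")
```

Since the input-stage gain expression depends on the overdrive of N5, holding the overdrive constant is what keeps the comparator gain stable across threshold shifts.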

Fig. 12 shows the output characteristic of the biasing circuit in the presence of variation in the nMOS V<sub>TH</sub>.

Here, the intended overdrive voltage for BN5 at nominal conditions with no process variation has been chosen as 0.75 V.



Fig. 12. Output characteristics of biasing circuit for various nMOS threshold voltages.

The output shows a linear relationship with the variation in nMOS  $V_{TH}$ .

#### V. SENSITIVITY ANALYSIS WITH MULTIPLE INPUT PULSES

It is difficult to measure the difference of rise and fall slew for very sharp rise and fall edges using the slew-rate monitoring scheme described above, especially when the difference between the rise and fall slews is nonzero but only in the picosecond range. Assuming all pulses in the pulse train are identical, multiple input pulses can be used to enhance the sensitivity of the slew-rate monitor, as shown in Fig. 13. A<sub>1</sub> is a train of pulses from the replica critical path. It consists of N pulses, $P_1$, $P_2$ to $P_N$. The number of pulses, N, can be adjusted, depending on the required output sensitivity. The slew-rate monitor generates pulses corresponding to each rise and fall slew of $P_1$, $P_2$ through $P_N$. $R_1$, $R_2$, $R_3$, through $R_N$ are the pulses corresponding to the rise slew, and similarly, $F_1$, $F_2$, $F_3$, through $F_N$ are generated corresponding to the fall slew. The pulse widths of the $R_i$ and $F_i$ pulses depend on the rise and fall slew of $P_1$, $P_2$ to $P_N$. The $R_i$ and $F_i$ pulses are separated into the $C_1$ and $C_2$ pulse trains. It can be observed that $PR_1$, $PR_2$, $PR_3$, to $PR_N$ in $C_1$ never overlap with $PF_1$, $PF_2$, $PF_3$, to $PF_N$ in $C_2$ during the measurement period. This eliminates the possibility of any short-circuit path in the charge pump. The pulse train in $C_1$ charges the capacitor C from the initial level and the pulse train in $C_2$ discharges it. By integrating the charging and discharging over multiple cycles, the difference in pulse widths and in final capacitor voltage is multiplied by the number of cycles. Thus, the final output level is proportional to the difference of the rise and fall slews of the SUT.
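The multiplication of a small width difference by the number of pulses can be sketched with an ideal charge-pump loop. The 0.5 V reset level follows the reset description in Section IV; the current, capacitance, pulse widths, and the clamping of the output to the supply rails are assumptions made for illustration.

```python
def integrate_pulse_train(n_pulses, pr_s, pf_s, i_c, cap, v_reset=0.5, vdd=1.0):
    """Capacitor voltage after n identical PR/PF pairs (ideal charge pump).

    PR and PF never overlap, so each cycle is modeled as a charge step
    followed by a discharge step. The voltage is clamped to [0, vdd].
    """
    v = v_reset
    for _ in range(n_pulses):
        v = min(vdd, v + i_c * pr_s / cap)  # PR_i charges the capacitor
        v = max(0.0, v - i_c * pf_s / cap)  # PF_i discharges it
    return v


if __name__ == "__main__":
    I_C, CAP = 40e-6, 100e-15    # hypothetical current (A) and capacitance (F)
    PR, PF = 52e-12, 50e-12      # hypothetical widths with a 2 ps difference
    for n in (1, 10, 50):
        v = integrate_pulse_train(n, PR, PF, I_C, CAP)
        print(f"N = {n:3d}: output moves {1e3 * (v - 0.5):.1f} mV from reset")
```

As long as the output stays away from the rails, the excursion from the reset level grows linearly with N, which is the basis for trading measurement time against sensitivity.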

#### VI. SIMULATION RESULTS

#### A. Simulation Result of Slew-Rate Monitor Detecting Difference of Rise and Fall Slew

The slew-rate monitor circuit described in Section IV was designed in an IBM 65-nm CMOS process. Fig. 14 shows simulated results for the output voltage of the difference of rise and fall slew-rate monitor and the normalized output frequency of the VCO with single pulse input signal slew. The output voltage exhibits a sensitivity of 0.4 mV/ps.

Fig. 13. Basic mechanism of slew-rate monitor using multiple pulses as input.

Fig. 14. Normalized output frequency of VCO and output voltage of slew-rate monitor with the difference in the rise and fall slew of the input signal.

The output sensitivity of the oscillator with respect to the input signal slew is 1 MHz/ps. As can be seen in Fig. 14, the slew-rate monitor output is nearly linear while input slew difference is longer than 50 ps. Therefore, capacitance must be added to the replica critical path node so that the slew rate difference is no faster than 50 ps in order to accurately measure process parameters.

In any process monitored with circuits fabricated in the same technology, the slew rate will need to be slowed by adding an appropriate amount of capacitance to match the precision of the slew-rate monitor. To simulate the effect of mismatch between nMOS and pMOS $V_{TH}$ variation using slew, a 33-stage inverter chain with FO4 loading at each node was used as a replica critical path. A signal from a node in the ring is fed to the input of the slew-rate monitor. The inverter chain was simulated for different scenarios of $V_{TH}$ mismatch between the two types of devices. When a falling edge transition is applied at the input, the output of the slew-rate monitor responds to the nMOS $V_{TH}$ shift, while with the rising edge at the input, it captures the variation in pMOS $V_{TH}$. The normalized output frequency difference between the rise and fall input transitions of the slew monitor system is shown in Fig. 15 for various mismatches in the $V_{TH}$ of the nMOS and pMOS transistors. In this experiment, the $V_{TH}$s of all nMOS and all pMOS were varied together. The slew-rate monitor output frequency can detect the nMOS and pMOS variation accurately with sensitivity of 0.95 MHz/mV.

Fig. 15. Normalized output of the integrator with nMOS and pMOS $V_{TH}$ variation.

Fig. 16. Sensitivity of slew-rate monitor with number of input pulses.

#### B. Sensitivity of the Slew-Rate Monitor With Multiple Pulses

The output sensitivity is a strong function of the number of pulses, N, and the integrating constant of the charge pump (ratio of charging/discharging current to the value of the output capacitor). Simulations were run to investigate the use of multiple pulses to enhance the sensitivity of the system as described in Section V. The sensitivity of the slew-rate monitor with the number of pulses is shown in Fig. 16. As can be seen, the sensitivity increases almost linearly with the number of pulses. From this data, one can begin to optimize the circuit performance and sensitivity.
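The near-linear dependence on the number of pulses can be sketched with an idealized integrator model, in which each pulse adds the same voltage step to the output. The Python snippet below is a minimal sketch under that assumption. It uses the simulated single-pulse value of 0.4 mV/ps as the integrating constant; the 1000-mV supply limit and all names are hypothetical and are not taken from the design.

```python
SINGLE_PULSE_SENS_MV_PER_PS = 0.4  # simulated single-pulse sensitivity from the text
VDD_MV = 1000.0                    # hypothetical supply limit (assumption)


def multi_pulse_sensitivity(n_pulses: int) -> float:
    """Idealized sensitivity (mV/ps) after integrating n_pulses identical pulses."""
    if n_pulses < 1:
        raise ValueError("n_pulses must be at least 1")
    return n_pulses * SINGLE_PULSE_SENS_MV_PER_PS


def integrator_output_mv(n_pulses: int, slew_diff_ps: float) -> float:
    """Output change (mV) for a slew difference, with magnitude clipped at VDD_MV."""
    ideal = multi_pulse_sensitivity(n_pulses) * slew_diff_ps
    # A real charge pump cannot drive its output beyond the supply rail.
    return max(-VDD_MV, min(VDD_MV, ideal))
```

In this model, four pulses give 1.6 mV/ps, while 50 pulses at a 100-ps slew difference would already clip at the assumed supply. This illustrates why the number of pulses and the integrating constant have to be chosen together for the expected slew range.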

#### C. Characterization Using Process-Immune Bias Generator

Fig. 17. Transfer characteristics of the comparator-A in the presence of nMOS $V_{TH}$ variation. (a) Without process-immune bias generator. (b) With process-immune bias generator.

Process variation-immune bias generator circuits for nMOS and pMOS biasing were incorporated into the comparator and integrator circuits. Inverted transfer characteristics of comparator-A without and with the process-immune bias generator circuit under various nMOS $V_{TH}$ variations are shown in Fig. 17(a) and (b), respectively. As seen in the figure, the width of the transition window fluctuates by almost 0.2 V across process corners without the process-immune bias generator. With the bias generator circuit, the fluctuation in output transition decreases to 0.03 V. Monte Carlo simulations were also performed to validate the benefit of using the process-immune bias generator.

#### VII. MEASUREMENT RESULTS

A circuit to measure the difference between rise and fall slew rates has been designed and implemented in an IBM 65-nm five-metal CMOS process [20]. It has a die area of $45\ \mu\text{m} \times 35\ \mu\text{m}$. Fig. 18 is a micrograph of the die with the slew-rate monitor circled. This slew-rate monitor design is capable of measuring the slew of any rise or fall edge of a signal. A single rising edge input was generated using an Agilent 33220A 20-MHz arbitrary waveform generator.

Fig. 18. Chip micrograph and the physical layout of the slew-rate monitoring circuit.

Fig. 19. Measured output of the slew-rate monitor circuit.

Fig. 20. Measured sensitivity of the slew-rate monitoring circuit.

The arbitrary waveform generation function was used in burst mode with a manual triggering option that allowed control of the start time and number of repetitions of the waveform. The fastest achievable slew in the arbitrary waveform generation mode is 1 $\mu$s. The slew-rate monitor was designed to measure slew in the 100-ps to 1-ns range. If the slew-rate monitor were used to measure slower slew rates, the output of the charge pump would saturate at Vdd. To demonstrate the slew-rate monitor on signals generated by the available waveform generator (in the 1–12 $\mu$s range), a 1089-pF capacitor was added between the output probe tip and ground. The measured output voltage is plotted in Fig. 19. The slew-rate monitor shows a sensitivity of 0.11 V/$\mu$s with the extra output load of 1089 pF. The external capacitor value was varied to investigate the sensitivity of the slew-rate monitor to capacitance value. The measured value is shown in Fig. 20.
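A back-of-the-envelope check relates the measured sensitivity to the external capacitor if the charge pump is treated as an ideal integrator, for which the sensitivity equals $I/C$. The Python sketch below is only an estimate under that assumption. It neglects on-chip capacitance, probe loading, and current-source nonidealities; the resulting current is not a value reported in this paper, and the function names are hypothetical.

```python
MEASURED_SENS_V_PER_US = 0.11  # measured sensitivity reported in the text
C_EXT_PF = 1089.0              # external capacitor used in the measurement


def implied_current_ua(sens_v_per_us: float, cap_pf: float) -> float:
    """Ideal-integrator estimate I = C * dV/dt, returned in microamperes."""
    # pF * V/us = 1e-12 F * 1e6 V/s = 1e-6 A, so the product is already in uA.
    return cap_pf * sens_v_per_us


def predicted_sensitivity_v_per_us(current_ua: float, cap_pf: float) -> float:
    """Sensitivity I / C in V/us for another load capacitance, same ideal model."""
    # uA / pF = 1e-6 A / 1e-12 F = 1e6 V/s = 1 V/us.
    return current_ua / cap_pf
```

With the reported numbers, `implied_current_ua(0.11, 1089.0)` evaluates to about 120 µA, and doubling the capacitor would halve the predicted sensitivity. This inverse scaling with capacitance is the behavior of the ideal model only; the measured dependence is the one shown in Fig. 20.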

#### VIII. CONCLUSION

In this paper, we illustrated the deficiency of purely delay-based detection schemes in modeling process parameter variations because they have no way of determining mismatch between nMOS and pMOS devices. The effect of $V_{TH}$ mismatch in nMOS and pMOS devices on circuit delay, slew, noise margin, and static and dynamic power was illustrated using analytical and simulation results. A purely delay-based compensation scheme would not generate any adjustments to bias and/or supply voltage if the parameter variations in the two types of devices are equal in effect and opposite in direction. However, with the $V_{TH}$ of one type of device being lower than the nominal value, its leakage current would be significantly higher, resulting in a substantial increase in power consumption of the circuit. In addition, noise margins of the circuit would also be degraded, rendering it more susceptible to noise failures. Thus, while delay is necessary, it is not a sufficient metric to detect all combinations of variations in device parameters. Delay-based compensation schemes perform suitably when the performance characteristics of both nMOS and pMOS devices vary in a similar fashion, i.e., both of the devices become either slower or faster. However, they fail to find the optimum operating conditions when the $V_{TH}$ variation for nMOS and pMOS is in opposite directions, or when the characteristics of only one of the device types drift away from the target parameters.
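The blind spot of a delay-only scheme can be illustrated with a toy linearized model. The Python sketch below assumes that path delay responds to the sum of the nMOS and pMOS threshold shifts, while the rise and fall slew difference responds to their difference. The unit coefficients, the sign convention (a positive shift makes the device slower), and all function names are hypothetical and chosen only for illustration; they are not fitted to the circuit in this paper.

```python
def delay_shift(dvtn_mv: float, dvtp_mv: float, k_delay: float = 1.0) -> float:
    """Toy model: path delay tracks the sum of the two threshold shifts."""
    return k_delay * (dvtn_mv + dvtp_mv)


def slew_diff_shift(dvtn_mv: float, dvtp_mv: float, k_slew: float = 1.0) -> float:
    """Toy model: rise slew follows pMOS, fall slew follows nMOS."""
    return k_slew * (dvtp_mv - dvtn_mv)


def recover_shifts(d_delay: float, d_slew: float,
                   k_delay: float = 1.0, k_slew: float = 1.0):
    """Solve the two linear equations for (dvtn_mv, dvtp_mv)."""
    total = d_delay / k_delay   # dvtn + dvtp
    diff = d_slew / k_slew      # dvtp - dvtn
    return (total - diff) / 2.0, (total + diff) / 2.0


# Opposite shifts: nMOS 20 mV slower, pMOS 20 mV faster.
print(delay_shift(20.0, -20.0))      # delay sees no net change
print(slew_diff_shift(20.0, -20.0))  # slew difference exposes the mismatch
print(recover_shifts(0.0, -40.0))    # both shifts are recovered
```

In this model the delay reading is unchanged for equal and opposite shifts, so a delay-only monitor would request no compensation, whereas combining the two readings separates the nMOS and pMOS contributions.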

As a solution to this problem, we developed the use of slew as an additional metric in combination with the delay, to precisely detect the variation and mismatch of $V_{THn}$ and $V_{THp}$. This information is used to determine the optimal supply and body-bias compensation to make the circuit meet the performance target at the minimum power, thereby enhancing the parametric yield of the design [19], [21]. A novel slew-rate monitoring circuit was designed in a 65-nm IBM CMOS technology. This circuit can be used to measure the slew of a signal by setting the reference voltages to the desired points. Circuit design parameters were optimized based on the expected range of slew from the replica critical path period. For different ranges of input slew, the integrator architecture can be modified by employing multiple current sources with different drive strengths, and by selecting the appropriate current source based on the input slew range.

A slew-rate monitor designed in an IBM 65-nm CMOS technology exhibited, in simulation, a sensitivity of 50 MHz/50 ps for an input slew range from 25 to 250 ps. This circuit is capable of detecting a $V_{TH}$ mismatch of nMOS and pMOS on the order of millivolts, with a sensitivity of 0.95 MHz/mV. To demonstrate the slew-rate monitor with test signals that could be generated in our laboratory, we added a 1089-pF capacitor to the charge pump. The slew-rate monitor generated 0.11 V/$\mu$s with an input signal difference between rise and fall slew of 1–12 $\mu$s.

A process-immune bias generator circuit was designed and integrated at the nMOS and pMOS biasing nodes. The output voltage of this bias generator changes linearly with the change in $V_{TH}$. Therefore, the drain current of the current source in the critical components of the slew-rate monitor, such as the comparators and integrators, becomes resilient to process variation. Both Monte Carlo analysis and worst-case corner analysis showed that the comparator offset voltage due to parameter variation was reduced from 110 to 20 mV through the use of this process-immune bias generator.

#### REFERENCES

- [1] J. Tschanz, J. T. Kao, S. G. Narendra, R. Nair, D. A. Antoniadis, A. P. Chandrakasan, and V. De, "Adaptive body bias for reducing impacts of die-to-die and within-die parameter variations on microprocessor frequency and leakage," *IEEE J. Solid-State Circuits*, vol. 37, no. 11, pp. 1396–1402, Nov. 2002.
- [2] E. S. Fetzer, "Using adaptive circuits to mitigate process variations in a microprocessor design," *IEEE Des. Test Comput.*, vol. 23, no. 6, pp. 476–483, Jun. 2006.
- [3] B. Zhou and A. Khouas, "Measurement of delay mismatch due to process variations by means of modified ring oscillators," in *Proc. Int. Symp. Circuits Syst.*, vol. 5. 2005, pp. 5246–5249.
- [4] A. Bassi, A. Veggetti, L. Croce, and A. Bogliolo, "Measuring the effects of process variations on circuit performance by means of digitally-controllable ring oscillators," in *Proc. Int. Conf. Microelectron. Test Struct.*, 2003, pp. 214–217.
- [5] M. Nomura, Y. Ikenaga, K. Takeda, Y. Nakazawa, Y. Aimoto, and Y. Hagihara, "Delay and power monitoring schemes for minimizing power consumption by means of supply and threshold voltage control in active and standby modes," *IEEE J. Solid-State Circuits*, vol. 41, no. 4, pp. 805–814, Apr. 2006.
- [6] IBM 10SF CMOS Process. (2010) [Online]. Available: http://www.mosis.com/ibm/10sf/
- [7] R. Rao, K. Agarwal, A. Devgan, K. Nowka, D. Sylvester, and R. Brown, "Parametric yield analysis and constrained-based supply voltage optimization," in *Proc. 6th Int. Symp. Qual. Electron. Design*, Mar. 2005, pp. 284–290.
- [8] A. A. Hamoui and N. C. Rumin, "An analytical model for current, delay, and power analysis of submicron CMOS logic circuits," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 47, no. 10, pp. 999–1007, Oct. 2000.
- [9] C. Tae-Yong, W.-I. Cho, and D.-W. Kim, "A simple CMOS delay model for wide applications," in *Proc. Asia Pacific Conf. Circuits Syst.*, Nov. 1996, pp. 77–80.
- [10] L. Bisdounis, S. Nikolaidis, and O. Koufopavlou, "Analytical transient response and propagation delay evaluation of the CMOS inverter for short-channel devices," *IEEE J. Solid-State Circuits*, vol. 33, no. 2, pp. 302–306, Feb. 1998.
- [11] J. M. Rabaey, A. Chandrakasan, and B. Nikolić, *Digital Integrated Circuits*, 2nd ed. Englewood Cliffs, NJ: Prentice-Hall, 2003.
- [12] J. M. Zurada, Y. S. Joo, and S. V. Bell, "Dynamic noise margins of MOS logic gates," in *Proc. IEEE Int. Symp. Circuits Syst.*, vol. 2. May 1989, pp. 1153–1156.
- [13] J. R. Hauser, "Noise margin criteria for digital logic circuits," *IEEE Trans. Educ.*, vol. 36, no. 4, pp. 363–368, Nov. 1993.
- [14] J. Lohstroh, E. Seevinck, and J. de Groot, "Worst-case static noise margin criteria for logic circuits and their mathematical equivalence," *IEEE J. Solid-State Circuits*, vol. 18, no. 6, pp. 803–807, Dec. 1983.
- [15] M. Shur, *Physics of Semiconductor Devices*. Englewood Cliffs, NJ: Prentice-Hall, 1990.
- [16] S. Park, E. W. Greeneich, and T. A. DeMassa, "Low-power transistor-string and new rail-to-rail comparator in A/D converter," in *Proc. 42nd Midwest Symp. Circuits Syst.*, vol. 1. Aug. 1999, pp. 194–197.
- [17] R. J. Baker, H. W. Li, and D. E. Boyce, CMOS Circuit Design, Layout, and Simulation. Piscataway, NJ: IEEE Press, 1996, ch. 26.
- [18] J. S. Chen and K. T. Kornegay, "Design of a process variation tolerant CMOS opamp in 6H-SiC technology for high-temperature operation," *IEEE Trans. Circuits Syst. I, Fundam. Theory Appl.*, vol. 45, no. 11, pp. 1159–1171, Nov. 1998.
- [19] A. Ghosh, R. M. Rao, C.-T. Chuang, and R. B. Brown, "A centralized supply voltage and local body bias-based compensation approach to mitigate within-die process variation," in *Proc. Int. Symp. Low Power Electron. Design*, 2009, pp. 45–50.
- [20] A. Ghosh, R. M. Rao, J.-J. Kim, C.-T. Chuang, and R. B. Brown, "On-chip process variation detection using slew-rate monitoring circuit," in *Proc. 21st Int. Conf. VLSI Design*, 2008, pp. 143–149.
- [21] A. Ghosh, R. M. Rao, C.-T. Chuang, and R. B. Brown, "On-chip process variation detection and compensation using delay and slew-rate monitoring circuits," in *Proc. Int. Symp. Qual. Electron. Design*, 2008, pp. 815–820.

Amlan Ghosh (M'04) received the B.S. and M.S. degrees in electrical engineering from the Indian Institute of Technology Kharagpur, Kharagpur, India, in 2002 and 2004, respectively, and the Ph.D. degree in low power circuit design from the Department of Electrical and Computer Engineering, University of Utah, Salt Lake City, in 2010.

He researched low-power SRAM designs with IBM's Austin Research Center as a Post-Doctoral Researcher from 2010 to 2011. He is currently with Advanced Micro Devices, Austin, TX. His current research interests include memory circuit design for exploratory devices, 3-D VLSI integration, robust memory design, and on-chip monitoring circuit design for reliability and process variations.

Rahul M. Rao (M'04–SM'12) received the Ph.D. degree in low-leakage SOI CMOS circuits from the University of Michigan, Ann Arbor, in 2004.

He is currently a Senior Engineer with IBM, Bangalore, India, where he is the Technical Lead for IBM's P series Microprocessor Design Team. From 2004 to 2012, he was a Research Staff Member with IBM T. J. Watson Research Center, Yorktown Heights, NY, where he was involved in design and power reduction of IBM's POWER processors as a member of the Advanced RISC Design Department. His research interests include low-power design, reliability, and variability characterization of compensation systems, high-performance memory, and 3-D design.

Dr. Rao has been on the Technical Program Committee of ISLPED since 2010. He is the Guest Editor and is on the review committees of several IEEE journals. He is on the Technical Advisory Board of IBM for the SRC.

**Jae-Joon Kim** (M'04) received the B.S. and M.S. degrees in electronics engineering from Seoul National University, Seoul, Korea, in 1994 and 1998, respectively, and the Ph.D. degree in low power SRAM circuit design from the School of Electrical and Computer Engineering, Purdue University, West Lafayette, IN, in 2004.

He has been with IBM T. J. Watson Research Center, Yorktown Heights, NY, since 2004, where he is currently a Research Staff Member and is involved in POWER6 and POWER7 microprocessor design. His current research interests include circuit design for exploratory devices, 3-D VLSI integration, robust memory design, and on-chip monitoring circuit design for reliability and process variations.

Ching-Te Chuang (S'78–M'82–SM'91–F'94) received the B.S.E.E. degree from National Taiwan University, Taipei, Taiwan, and the Ph.D. degree in electrical engineering from the University of California, Berkeley, in 1975 and 1982, respectively.

He was with IBM T. J. Watson Research Center, Yorktown Heights, NY, from 1982 to 2008, where he held technical and management positions and was involved in research on bipolar devices, circuits, and technologies, BiCMOS logic and memory, CMOS microprocessor, and SRAM design. He joined National Chiao-Tung University, Hsinchu, Taiwan, as a Chair Professor with the Department of Electronics Engineering in 2008. He has authored or co-authored over 320 papers. He holds 47 U.S. patents with another 20 pending.

**Richard B. Brown** (S'74–M'76–SM'91) received the B.S. and M.S. degrees in electrical engineering from Brigham Young University, Provo, UT, in 1976, and the Ph.D. degree from the University of Utah, Salt Lake City, in 1985, through research and development of one of the first "smart sensors," an array of liquid chemical sensors with integrated electronics.

He has five years of industrial experience. He joined the University of Michigan, Ann Arbor, where he developed a VLSI program and conducted research on circuits (high-speed, low-power, high-temperature, and radiation hard), microprocessors (high-performance, low-power, and mixed-signal), sensors (for ions, heavy metals, and neurotransmitters), and brain-machine interfaces. He held an Arthur F. Thurnau Endowed Professorship with the University of Michigan. In 2004, he joined as the eleventh Dean of the College of Engineering, University of Utah, and e-SENS (chemical sensors). He has graduated 29 Ph.D. students. He has authored more than 225 peer-reviewed publications and holds 17 patents.