# A Low-Phase Noise, Anti-Harmonic Programmable DLL Frequency Multiplier With Period Error Compensation for Spur Reduction

Qingjin Du, Jingcheng Zhuang, and Tad Kwasniewski

*Abstract*: A low phase noise, delay-locked loop-based programmable frequency multiplier, with the multiplication ratio from 13 to 20 and output frequency range from 900 MHz to 2.9 GHz, is reported in this brief. A new switching control scheme is employed in the circuit to enable locking to frequencies either above or below the start-up frequency without initialization. To reduce the spurious output power level, a low-bandwidth auxiliary loop [period error compensation loop (PECL)] is employed to compensate for the output period error caused by the phase realignment errors. This frequency multiplier is implemented in TSMC 0.18-µm CMOS technology and measured with a synthesized frequency source. A significant reduction of the output spurs from -23 to -46.5 dB at 1.216 GHz is achieved by enabling the PECL. The measured cycle-to-cycle timing jitter at 2.16 GHz is 1.6 ps (rms) and 12.9 ps (pk-pk), and the phase noise is -110 dBc/Hz at 100-kHz offset with a power consumption of 19.8 mW at a 1.8-V power supply.

*Index Terms*—Delay-locked loop (DLL), frequency multiplier, in-lock error, phase noise, phase-locked loop (PLL), spurious power level.

# I. INTRODUCTION

THE frequency multiplier is an important building block in communication systems, and its performance parameters, including the phase noise and spurious power level, are critical. The phase-locked loop (PLL) architecture has been dominant in the field of frequency synthesis for many years. However, compared with its delay-locked loop (DLL) counterpart, it suffers from phase noise accumulation in the voltage-controlled oscillator (VCO) induced by power supply and substrate noise. In recent years, DLL-based clock synthesis has been explored [1]–[7] because of its superior phase noise performance, although relatively high output spurs limit its wide use. To date, two typical implementations of DLL-based frequency multiplication have been reported. One is based on a voltage-controlled delay line (VCDL) and an edge combiner [1], [2], [6], [7], and the other is based on a VCO-like VCDL [3], [4]. Two main sources of spurious output of the first approach are the mismatches of

The authors are with the Department of Electronics, Carleton University, Ottawa, ON K1S 5B2, Canada (e-mails: qidu@doe.carleton.ca; jzhuang@doe.carleton.ca; tak@doe.carleton.ca).

Digital Object Identifier 10.1109/TCSII.2006.883103


Fig. 1. Proposed architecture of the DLL frequency multiplier with period error compensation loop.

the VCDL/edge combiner and the loop in-lock error. The mismatches are minimized in the second approach because the reference signal always circulates in the same delay stage, but the phase realignment errors introduced when the clean reference edges are injected into the VCDL may even enlarge the total period error, which cannot be effectively minimized by optimizing the charge pump (CP) and the phase and frequency detector (PFD). This brief presents a multiplying DLL frequency multiplier with a period error compensation loop (PECL) to achieve both low phase noise and low spurious power level. To prevent the loop from locking to harmonic reference edges, a switching control logic circuit and a resettable dynamic frequency divider are proposed.

# II. PROPOSED DLL FREQUENCY MULTIPLIER WITH A PECL

The block diagram of the proposed frequency multiplier is shown in Fig. 1. It comprises a PFD, a CP, a MUX, a divide-by-N divider, a switching logic (SL) circuit, an error detector (ED), a VCO-like VCDL, and a loop filter ( $C_1$ ). The reference edges are injected into the VCDL by setting the MUX to position "1," and the edges propagate in the VCDL, or circulate in the ring consisting of three inverters and a MUX, until the MUX is switched to "0." Compared with a conventional VCDL, which physically consists of multiple delay stages, the VCO-like VCDL eliminates the stage mismatch and the need for an edge-combining circuit. In addition, it is easily programmable. By comparing the phase of the VCDL output ( $f_{rcl}$ ), which is the delayed reference edge after circulating in the ring

Manuscript received January 11, 2005; revised April 28, 2006. This work was supported by the Government of Ontario, Carleton University, Ottawa, ON, Canada, Altera, Ottawa, ON, Canada, Communications and Information Technology Ontario (CITO), and the Natural Science and Engineering Research Council (NSERC). This paper was recommended by Associate Editor A. I. Karsiliyan.



Fig. 2. Timing of the frequency multiplier when the initial frequency is (a) higher and (b) lower than the desired frequency.

for N cycles, with that of the next reference edge, the delay of the VCDL is tuned so that the total delay  $(N \cdot T_{out})$  equals the reference period  $(T_{ref})$ . In other words, the output frequency is N times the reference frequency.
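The lock condition $N \cdot T_{out} = T_{ref}$ can be checked numerically with a minimal sketch. The function name is hypothetical. The pairing of the 64-MHz reference with N = 19 is inferred from the 1.216-GHz output reported in the measurement section, and the 108-MHz reference is inferred from the 2.16-GHz output with N = 20; neither pairing is stated explicitly in this brief.

```python
def output_frequency_hz(f_ref_hz: float, n: int) -> float:
    """Lock condition N * T_out = T_ref, i.e. f_out = N * f_ref."""
    return n * f_ref_hz


def reference_frequency_hz(f_out_hz: float, n: int) -> float:
    """Reference frequency needed for a given output frequency and ratio N."""
    return f_out_hz / n


# 64-MHz reference, N = 19 (inferred from the reported 1.216-GHz output)
f_out = output_frequency_hz(64e6, 19)
print(f"f_out = {f_out / 1e9:.3f} GHz")

# 2.16-GHz output with N = 20 implies this reference frequency (inferred)
f_ref = reference_frequency_hz(2.16e9, 20)
print(f"f_ref = {f_ref / 1e6:.1f} MHz")
```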

However, employing a VCO-like VCDL may cause the phase detector to compare the phase of  $f_{rcl}$  with that of a harmonic reference edge, so the loop may fail to lock if the initial frequency is lower than the desired one [3]. This design employs a novel switching scheme to solve this problem. Fig. 2 shows two cases where the initial frequency is higher and lower than the desired frequency, with N = 6 as an example. After a reference edge is injected into the VCDL, the frequency divider is reset and the MUX<sub>sw</sub> goes low so that the reference edge can circulate in the VCDL. After five output cycles, the MUX<sub>sw</sub> goes high and the VCDL is connected to the reference signal until a new reference rising edge is injected. The PD<sub>sw</sub> goes high at the fifth output rising edge so that the PFD captures the last output rising edge. The PFD is based on a conventional 3-state PFD controlled by  $PD_{sw}$ , which only gates the comparison path of  $f_{rcl}$  and resets the PFD when its falling edge arrives and a new reference edge is injected. This switching scheme ensures that the phase comparison is always made between the last output rising edge and the reference rising edge immediately following the previously injected one, as illustrated by the dotted arrows. Consequently, the DLL can always acquire lock without initializing the VCDL control signal and avoids harmonic locking.
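The effect of the switching scheme can be illustrated with a small behavioral sketch. It is a simplification, not the circuit: time is normalized, the function names are hypothetical, and the "nearest edge" detector is only an assumed stand-in for a comparison that can land on a harmonic reference edge.

```python
def phase_error(n: int, t_out: float, t_ref: float) -> float:
    """Timing error with the switching scheme described above.

    The reference edge is injected at t = 0, the N-th output rising edge
    arrives at n * t_out, and it is always compared with the NEXT
    reference edge at t_ref, never with a later (harmonic) edge.
    """
    return n * t_out - t_ref


def nearest_edge_error(n: int, t_out: float, t_ref: float) -> float:
    """Assumed failure mode: compare with whichever reference edge is closest."""
    t_last = n * t_out
    return t_last - round(t_last / t_ref) * t_ref


def correction(err: float) -> str:
    """Direction in which the loop should move the output frequency."""
    if err > 0:
        return "speed up"   # last edge is late: output frequency too low
    if err < 0:
        return "slow down"  # last edge is early: output frequency too high
    return "locked"


N, T_REF = 6, 6.0  # N = 6 as in the Fig. 2 example; normalized time units

fast_start = correction(phase_error(N, 0.8, T_REF))   # initial frequency too high
slow_start = correction(phase_error(N, 1.9, T_REF))   # initial frequency too low
harmonic = correction(nearest_edge_error(N, 1.9, T_REF))

print(fast_start, "|", slow_start, "|", harmonic)
```

With a start-up period of 1.9 units, the N-th edge lands near the second reference edge, so the nearest-edge comparison pushes the loop the wrong way, whereas the comparison against the next reference edge still requests a higher frequency.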

Due to the mismatches in the feedback path, including the PFD/CP combination, and the realignment error introduced when a new reference edge is injected into the VCDL every N output cycles [5], the Nth cycle always has a period different from those of the previous N - 1 cycles even when the loop is in lock, and a large output spur results. As illustrated in Fig. 3, there are N output cycles within one reference period, and each corresponding period is plotted at the bottom of the figure. One period among the N periods may always be larger or smaller than the other N - 1 periods, as shown in Fig. 3(a) and (b), respectively. This static error is referred to as the period error in the remainder of this brief. To minimize the error, the output  $f_{out}$  is applied to an ED, which extracts this error and compensates for it by tuning the charge pump current. The ED compares the period of the Nth output cycle  $T_{out}(n)$  with the average period  $(T_{av})$ . The difference between  $T_{out}(n)$  and  $T_{av}$  is amplified and its polarity is determined. The comparison



Fig. 3. Period errors when the loop is in lock with positive period error (a) and negative period error (b).



Fig. 4. Frequency divider and the SL circuit.

offset is self-cancelled in the ED so that it can provide more precise error information than the PFD, where the offset of the phase comparison cannot be self-cancelled. In addition, the error information provided by the ED corresponds to the total period error appearing at the output of the frequency multiplier, while the error provided by the PFD corresponds to only part of the period error. In this design, besides the loop governed by the PFD, which dynamically compensates for the period error, a low-bandwidth PECL, governed by the ED, is used to compensate for the static period error and significantly reduce the output spurious level.

# III. CMOS IMPLEMENTATION

# A. VCO-Like VCDL

The high output frequency is generated in the VCO-like VCDL, which consists of three inverting delay cells and a MUX controlled by the SL circuit. Since the frequency multiplier output  $f_{rclb}$  or  $f_{out}$  is derived from a fixed node (the output of the second delay cell), the stage mismatch among the three delay cells does not result in spurious output even with a MUX placed at the input of the first stage. Current-starved inverters are employed in the delay cells, and two CMOS pass gates are used to form the 2-to-1 MUX.

# B. Frequency Divider and SL Circuit

The frequency divider and the SL circuit, shown in Fig. 4, coordinate the frequency multiplier operation by providing three signals:  $MUX_{sw}$ ,  $PD_{sw}$ , and  $ED_{sw}$ . The programmable frequency divider counts the high-frequency output  $f_{rcl}$  to achieve a programmable multiplication ratio. After a reference edge is injected into the VCDL, the frequency divider is reset by a reset signal from the SL and starts to count from zero again. In this design, the frequency divider comprises three stages of divide-by-2,3 dividers connected in series and three NOR gates



Fig. 5. (a) Divide-by-N divider. (b) Divide-by-2,3 divider.



Fig. 6. Schematic of the SL circuit.

as shown in Fig. 5(a). By programming  $P_0$ ,  $P_1$  and  $P_2$ , multiple division ratios can be obtained and the ratio is calculated as

$$N = \sum_{i=0}^{2} 2^{i} \cdot P_{i} + 2^{3}. \tag{1}$$

From the equation, the frequency division ratios are integers from 8 to 15. Due to the delay from the VCO output, the divider and the SL, to the  $MUX_{sw}$ , the actual division ratios of the frequency multiplier are from 13 to 20.
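As a quick check of (1), the sketch below enumerates all settings of $P_0$, $P_1$, and $P_2$. The constant offset of five output cycles is not stated in the text; it is inferred here from the two quoted ranges (8 to 15 and 13 to 20), and the names are hypothetical.

```python
from itertools import product


def divider_ratio(p0: int, p1: int, p2: int) -> int:
    """Division ratio from (1): the sum of 2^i * P_i for i = 0..2, plus 2^3."""
    return 2**3 + sum(2**i * p for i, p in enumerate((p0, p1, p2)))


# Assumption: the loop delay adds a fixed number of output cycles, inferred
# from the quoted ranges 8..15 (divider) and 13..20 (actual ratios).
LOOP_DELAY_CYCLES = 5

divider_ratios = sorted(divider_ratio(*bits) for bits in product((0, 1), repeat=3))
actual_ratios = [n + LOOP_DELAY_CYCLES for n in divider_ratios]

print(divider_ratios)
print(actual_ratios)
```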

Fig. 5(b) shows the schematic of the divide-by-2,3 divider. It consists of two D flip-flops (DFFs) and two logic gates. In normal-mode operation, when mod = 1, the output Q of the shaded DFF is always 0 and the second DFF is in a toggle configuration, working as a divide-by-2 divider. When mod = 0, the output of the second DFF remains high for an extra clock cycle before returning to 0 and toggling again, resulting in division by 3. Extended true-single-phase-clock (E-TSPC) logic is employed in the divider to reduce the transistor count and power consumption and to achieve high-speed operation.
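The divide-by-2,3 behavior described above can be reproduced with a cycle-level model. The gate equations below are an assumption chosen to match the described behavior (first flip-flop held at 0 when mod = 1, output held high for one extra cycle when mod = 0); they are not taken from the schematic in Fig. 5(b).

```python
def divide_by_2_or_3(mod: int, n_clocks: int) -> list[int]:
    """Return the output of the second flip-flop after each input clock edge."""
    q1, q2 = 0, 0  # q1: first (mode) flip-flop, q2: output flip-flop
    wave = []
    for _ in range(n_clocks):
        # Assumed logic: q1 pulses for one cycle only when mod = 0 and q2 is high.
        d1 = int((not mod) and q2 and (not q1))
        # q2 toggles, except that it is held high while d1 is asserted.
        d2 = int((not q2) or d1)
        q1, q2 = d1, d2
        wave.append(q2)
    return wave


def count_rising_edges(wave: list[int], initial: int = 0) -> int:
    """Count 0 -> 1 transitions, starting from the given initial level."""
    edges, prev = 0, initial
    for level in wave:
        if prev == 0 and level == 1:
            edges += 1
        prev = level
    return edges


div2 = divide_by_2_or_3(mod=1, n_clocks=12)
div3 = divide_by_2_or_3(mod=0, n_clocks=12)
print(div2, count_rising_edges(div2))  # 12 input clocks, divide by 2
print(div3, count_rising_edges(div3))  # 12 input clocks, divide by 3
```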

The SL circuit consists of seven resettable FFs and two delay elements as shown in Fig. 6. One of the four inputs,  $d_{out}$ , is from the frequency divider, and its second rising edge after the divider is reset indicates that the count to N has been reached. DFF1 and DFF2 form a count-to-2 circuit, and the output Q of DFF2 (Q2) goes high at the second rising edge of  $d_{out}$ . The operation after Q2 goes high is illustrated in Fig. 7. Signal PD<sub>sw</sub> is obtained by sampling Q2 at the rising edge of  $f_{rcl}$ . The MUX<sub>sw</sub> is generated by sampling the PD<sub>sw</sub> at the rising edge of  $f_{rcl}$  after the rising edge of  $f_{ref}$ . The reset signal appears at the first rising edge of  $f_{rcl}$  after the rising edge of  $f_{ref}$ . The ED<sub>sw</sub> is obtained by sampling the reset signal at the rising edge of  $f_{rcl}$ , and it is a positive pulse with a width equal to one output period.

# C. Phase Detector and Charge Pump

The schematic of the phase detector is shown in Fig. 8(a). It is based on a dual FF PFD controlled by  $PD_{sw}$  generated from



Fig. 7. Operating principle of the SL circuit.



Fig. 8. Schematic of PFD (a) and CP (b).

the SL circuit. The signal *DN* goes high at the rising edge of  $f_{\rm rcl}$  only if  $\rm PD_{sw}$  is high, while the *UP* signal always goes high at the reference rising edge. As in a conventional PFD, *UP* and *DN* are reset once both of them are high. At the falling edge of  $\rm PD_{sw}$ , the PFD is reset so that a correct comparison is guaranteed. This reset signal is a narrow pulse generated by DFF 3 with its output connected to its reset input.

In the charge pump shown in Fig. 8(b), the *UP* and *DN* signals control a charge current and a discharge current, respectively. Normally, the two currents are matched by the current mirrors so that zero net charge is delivered when *UP* and *DN* have equal width. Any current mismatch in this approach results in an in-lock error. In this design, a dynamic compensation technique is used to reduce the in-lock error as well as other injection errors, and a current tuning capability is implemented in the charge pump. Since only the relative error between the charge current and the discharge current matters, only the charge current is finely tuned by the PECL while the discharge current is fixed.
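The link between current mismatch and in-lock error can be made concrete with a first-order charge-balance sketch. It assumes ideal rectangular current pulses and that both *UP* and *DN* stay high for a common reset overlap `t_rst`; the function name and all numeric values are hypothetical, not taken from the design.

```python
def in_lock_error(i_up: float, i_dn: float, t_rst: float) -> float:
    """Static UP-minus-DN pulse-width difference needed for zero net charge.

    Charge balance in lock: i_up * t_up = i_dn * t_dn, where the narrower
    pulse has the reset-overlap width t_rst.
    """
    if i_dn >= i_up:
        # UP must be wider: i_up * (t_rst + dt) = i_dn * t_rst
        return t_rst * (i_dn / i_up - 1.0)
    # DN must be wider: i_up * t_rst = i_dn * (t_rst + |dt|)
    return -t_rst * (i_up / i_dn - 1.0)


T_RST = 100e-12   # hypothetical reset overlap, 100 ps
I_DN = 100e-6     # fixed discharge current (hypothetical value)

err_mismatched = in_lock_error(95e-6, I_DN, T_RST)  # 5% low charge current
err_tuned = in_lock_error(100e-6, I_DN, T_RST)      # charge current tuned to match

print(f"mismatched: {err_mismatched * 1e12:.2f} ps, tuned: {err_tuned * 1e12:.2f} ps")
```

In this simplified picture, tuning only the charge current is sufficient because the error depends on the ratio of the two currents, which is consistent with fixing the discharge current and adjusting the charge current alone.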

# D. Error Detector

The ED is the main part of the PECL. It compares the average period of the signal  $f_{\rm rcl}$  with the period of each *N*th cycle, which may differ from that of the other cycles. The mathematical model of the ED is shown in Fig. 9. In this model, the average period value, which is determined by an internal feedback loop, is subtracted from every VCO period; the differences are amplified and their polarities are examined to produce the error output whenever  $ED_{\rm sw}$  is high.
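The model described above (subtract the average, amplify, take the polarity when $ED_{\rm sw}$ is high) can be sketched in a few lines. This is a behavioral illustration only: the average is computed directly rather than by an internal feedback loop, $ED_{\rm sw}$ is assumed to be high during every N-th period, and the names and the gain value are hypothetical.

```python
def error_detector(periods: list[float], n: int, gain: float = 100.0) -> list[int]:
    """Return the polarity (+1, -1 or 0) of the period error for each N-th cycle."""
    t_av = sum(periods) / len(periods)  # stands in for the internal averaging loop
    polarities = []
    for k, t_out in enumerate(periods, start=1):
        ed_sw = (k % n == 0)            # assumed: ED_sw is high for the N-th cycle
        if not ed_sw:
            continue
        amplified = gain * (t_out - t_av)
        polarities.append((amplified > 0) - (amplified < 0))
    return polarities


N = 6
long_last = ([1.0] * (N - 1) + [1.05]) * 2    # positive period error, as in Fig. 3(a)
short_last = ([1.0] * (N - 1) + [0.95]) * 2   # negative period error, as in Fig. 3(b)

print(error_detector(long_last, N))
print(error_detector(short_last, N))
```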

The schematic of the ED is shown in Fig. 10 and its operation, for the case of positive period error, is illustrated in



Fig. 9. Mathematical model of the ED.



Fig. 10. CMOS implementation of the ED.



Fig. 11. Operation of the ED.

Fig. 11. At each falling edge of the input signal  $f_{rclb}$ , the pulse generator (PG) generates a narrow pulse  $(P_a)$ , and its width is then increased by an amount corresponding to the average period duration. The resulting pulse train  $(P_p)$  is inverted and integrated by a high-gain integrator whose output  $(V_{int})$  is sampled and then reset before the next pulse arrives. The sampled results ( $V_{sam}$ ), corresponding to the VCO periods, drive the internal feedback so that their average value  $(V_{ave})$  is dynamically maintained at a level determined by  $V_{\text{bias}}$ . When signal  $ED_{sw}$  is high, a comparator is switched on to compare the sample  $(V_{\text{sam}})$  with the average  $(V_{\text{ave}})$  and generate the error output signal. The internal feedback loop compares  $V_{\text{ave}}$  with a predefined value  $(V_{\text{bias}})$  and adjusts the amount by which the width of signal  $P_a$  is increased accordingly. With the internal loop, a large integrator gain can be used to achieve high period error resolution. The bias voltage ( $V_{\text{bias}}$ ), which determines  $V_{\text{ave}}$ , is set to approximately  $V_{DD}/2$  to ensure that the integrator works in its high-gain region.

The output of the ED, after being low-pass filtered, finely tunes the charging current in the charge pump to compensate for the period error. The functionality of the ED can be observed from signals  $V_{cntrl}$  and  $E_{cntrl}$  during the locking process in the post-layout simulation result shown in Fig. 12. The shaded region expanded on the right shows the details of how  $E_{cntrl}$  changes  $V_{cntrl}$  after the loop is in lock. When the main loop is in lock,  $V_{cntrl}$  is stable, and the ED detects in-lock errors based on the output  $f_{out}$ . Accordingly, the period compensation loop tunes



Fig. 12. Signals  $V_{cntrl}$  and  $E_{cntrl}$  from post-layout simulation.



Fig. 13. Measured spurious power level at 1.216 GHz with PECL enabled and disabled.

 $E_{\rm cntrl}$  to increase or decrease the charging current of the charge pump, making a small change in  $V_{\rm cntrl}$ .  $V_{\rm cntrl}$  is decreased by about 2 mV as shown in the figure, and the large current spikes are also reduced, resulting in an edge jitter reduction from 28 to 3.2 ps (peak to peak). All of these contribute significantly to the output spur reduction.
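The action of the compensation loop can be pictured with a toy discrete-time model: the polarity of the period error is integrated in small steps, and the integrated value cancels the static error. The linear relation between the control value and the period error, the step size, and the initial error are all assumptions made for illustration; they are not taken from the design or from Fig. 12.

```python
def run_pecl(initial_error_ps: float, step_ps: float = 0.05,
             n_updates: int = 1000) -> list[float]:
    """Sign-driven integrating loop: each update moves the error by one small step."""
    e_cntrl = 0          # integrated polarity, stands in for the filtered E_cntrl
    history = []
    for _ in range(n_updates):
        # Assumed linear model: each control step removes step_ps of period error.
        error_ps = initial_error_ps - step_ps * e_cntrl
        history.append(error_ps)
        if error_ps > 0:
            e_cntrl += 1
        elif error_ps < 0:
            e_cntrl -= 1
    return history


history = run_pecl(initial_error_ps=10.0)
print(f"start: {history[0]:.2f} ps, end: {abs(history[-1]):.2f} ps")
```

A small step corresponds to a low loop bandwidth: the residual error is bounded by one step, at the cost of a longer settling time.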

# IV. MEASUREMENT RESULTS

The circuit is implemented in 0.18- $\mu$ m CMOS technology and tested with a synthesized reference signal from an RF signal generator. Fig. 13 presents the measured output spurious power level with a carrier frequency of 1.216 GHz. With the compensation loop disabled and enabled, the spurious tones at the reference frequency offset of 64 MHz are -23.17 and -46.5 dBc, respectively, so a spur reduction of about 23 dB is obtained.
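The spur figures above can be restated as ratios with a short calculation. Only the two measured spur levels come from the text; the conversion to a linear amplitude ratio is standard decibel arithmetic.

```python
spur_disabled_dbc = -23.17   # measured, PECL disabled
spur_enabled_dbc = -46.5     # measured, PECL enabled

reduction_db = spur_disabled_dbc - spur_enabled_dbc
amplitude_ratio = 10 ** (reduction_db / 20)   # spur amplitude shrinks by this factor
power_ratio = 10 ** (reduction_db / 10)       # spur power shrinks by this factor

print(f"reduction: {reduction_db:.2f} dB")
print(f"amplitude ratio: {amplitude_ratio:.1f}x, power ratio: {power_ratio:.0f}x")
```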



Fig. 14 shows the jitter histogram with the in-lock error reduction technique enabled. The cycle-to-cycle rms edge jitter is 1.6 ps and the peak-to-peak value is 12.9 ps at the 2.16-GHz output frequency with the division ratio of N = 20. The phase noise of the output is shown in Fig. 15. The measured phase noise is -110 dBc/Hz at 100-kHz offset, while the reference phase noise is -135 dBc/Hz at 100 kHz. This output phase noise is mainly limited by the reference signal due to the nature of the DLL. The measured phase noise does not change much when the compensation loop is enabled. Similar results are achieved for the other frequencies tested. Supplied by a 1.8-V dc source, the circuit consumes less than 20 mW, and the output frequency range is from 900 MHz to 2.9 GHz. The measured performance of the circuit is compared with the state of the art in Table I. Fig. 16 shows the die microphotograph.
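The statement that the output phase noise is limited by the reference can be checked against the usual rule that ideal multiplication by N raises phase noise by $20\log_{10}N$ dB. The sketch below applies that rule to the reported numbers; the rule is a textbook relation rather than something derived in this brief, and the function name is hypothetical.

```python
import math


def multiplied_phase_noise_dbc(ref_dbc_hz: float, n: int) -> float:
    """Reference phase noise referred to the output of an ideal xN multiplier."""
    return ref_dbc_hz + 20.0 * math.log10(n)


floor_dbc = multiplied_phase_noise_dbc(-135.0, 20)   # reference noise at 100 kHz, N = 20
measured_dbc = -110.0                                # measured output at 100-kHz offset

t_out_ps = 1e12 / 2.16e9                             # output period at 2.16 GHz
rms_jitter_fraction = 1.6 / t_out_ps                 # 1.6 ps rms relative to one period

print(f"ideal floor: {floor_dbc:.1f} dBc/Hz, measured: {measured_dbc:.1f} dBc/Hz")
print(f"rms jitter: {100 * rms_jitter_fraction:.2f}% of the output period")
```

The measured value lies within roughly 1 dB of the multiplied reference noise, which is consistent with the reference being the dominant contributor at this offset.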

# V. CONCLUSION

A programmable DLL-based frequency multiplier with a period error compensation loop is presented. The proposed in-lock error reduction technique can effectively reduce the output spur by more than 20 dB, and an rms timing jitter of 1.6 ps at 2.16 GHz is achieved. By implementing the proposed SL circuit and

TABLE I MEASURED PERFORMANCE COMPARISONS

| Items                                     | Ref. [3]                     | Ref. [7]            | This work                            |
|-------------------------------------------|------------------------------|---------------------|--------------------------------------|
| Supply voltage                            | 1.8 V                        | 3.3 V               | 1.8 V                                |
| Division ratios                           | 4, 6, 8, 10                  | 1-8                 | 13-20                                |
| Output frequency                          | 200 MHz - 2 GHz              | 120 MHz - 1.8 GHz   | 900 MHz - 2.9 GHz                    |
| Phase noise<br>(dBc/Hz)                   | N/A                          | N/A                 | -116 at 2.16 GHz,<br>1-MHz offset    |
| Timing jitter<br>(ps, rms)<br>(ps, pk-pk) | 1.62 at 2 GHz<br>13.1 at 2 GHz | 1.8<br>13.2       | 1.61 at 2.16 GHz<br>12.9 at 2.16 GHz |
| Spurious tones                            | -37 dB<br>at 2 GHz           | N/A                 | -46.5 dB<br>at 1.2 GHz               |
| Loop capacitor                            | N/A                          | N/A                 | C<sub>1</sub>: 2 pF                  |
| Power (mW)                                | 12 at 2 GHz                  | 86 at 1.6 GHz       | 19.8 at 2 GHz                        |
| Active area                               | $0.05 \text{ mm}^2$          | $0.07 \text{ mm}^2$ | $0.07 \text{ mm}^2$                  |
| Technology                                | 0.18 µm                      | 0.35 µm             | 0.18 µm                              |



Fig. 16. Die microphotograph.

the resettable divider, the circuit can lock to frequencies either above or below the start-up frequency without initialization and avoids harmonic locking.

# ACKNOWLEDGMENT

The authors would like to thank the Canadian Microelectronics Corporation (CMC) for the chip fabrication.

# REFERENCES

- [1] G. Chien and P. R. Gray, "A 900-MHz local oscillator using a DLL-based frequency multiplier technique for PCS applications," *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, vol. XLIII, pp. 202–203, Feb. 2000.
- [2] D. J. Foley, "CMOS DLL-based 2-V 3.2-ps jitter 1-GHz clock synthesizer and temperature-compensated tunable oscillator," *IEEE J. Solid-State Circuits*, vol. 36, no. 3, pp. 417–423, Mar. 2001.
- [3] R. Farjad-Rad *et al.*, "A low-power multiplying DLL for low-jitter multigigahertz clock generation," *IEEE J. Solid-State Circuits*, vol. 37, no. 12, pp. 1804–1811, Dec. 2002.
- [4] G.-Y. Wei *et al.*, "A 500-MHz MP/DLL clock generator for a 5 Gb/s backplane transceiver in 0.25-µm CMOS," *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, Feb. 2003.
- [5] H.-T. Ng, R. Farjad-Rad, M.-J. E. Lee, W. J. Dally, T. Greer, J. Poulton, J. H. Edmondson, R. Rathi, and R. Senthinathan, "A second-order semidigital clock recovery circuit based on injection locking," *IEEE J. Solid-State Circuits*, vol. 38, no. 12, pp. 2101–2110, Dec. 2003.
- [6] C.-C. Wang *et al.*, "A 1.2-GHz programmable DLL-based frequency multiplier for wireless applications," *IEEE Trans. Very Large Scale Integr. (VLSI) Syst.*, vol. 12, no. 12, pp. 1377–1381, Dec. 2004.
- [7] J.-H. Kim *et al.*, "A CMOS DLL-based 120-MHz to 1.8-GHz clock generator for dynamic frequency scaling," *Proc. IEEE Int. Solid-State Circuits Conf. (ISSCC)*, pp. 516–517, 2005.