# **Sub-Sampling PLL Techniques**

Xiang Gao, Eric Klumperink\*, Bram Nauta\*

Marvell, Santa Clara, CA; \*University of Twente, Enschede, Netherlands

Abstract: In a classical PLL, the phase detector (PD) and charge pump (CP) noise is multiplied by $N^2$ when referred to the VCO output, due to the divide-by-N in the feedback path. It often dominates the in-band phase noise and limits the achievable PLL jitter-power Figure-Of-Merit (FOM). A sub-sampling PLL uses a PD that sub-samples the high frequency VCO output with the reference clock. The PD and CP noise in this PLL is shown to be not multiplied by $N^2$, and greatly attenuated by the high phase detection gain, leading to lower in-band phase noise and better PLL FOM. This article reviews the development of the PLL FOM, the sub-sampling PLL techniques and their applications in recent PLL architectures.

Index Terms: Clock generation, clock multiplier, frequency multiplication, frequency synthesizer, phase locked loop, low jitter, low phase noise, low power, sub-sampling phase detector, sub-sampling PLL, PLL FOM.

#### I. INTRODUCTION

Timing generation is an indispensable function in electronic systems, and the phase-locked-loop (PLL) is a ubiquitous component in modern ICs due to its versatility. It can for instance be used for clock generation, frequency synthesis, frequency modulation and demodulation, clock and data recovery, synchronization and spread spectrum signal generation.

Of the many known PLL architectures, the one shown in Fig. 1(a), which we call the "classical PLL" architecture, is perhaps the most widely used. It consists of a voltage controlled oscillator (VCO) locked to a reference clock *Ref* by a feedback loop with the following "loop components": a phase detector (PD), a charge pump (CP), a loop filter (LF) and a frequency divider with ratio N (÷N). The PLL's jitter performance for a given power can be evaluated with the PLL Figure-Of-Merit (FOM) [1].
In the classical PLL, the PD and CP noise is multiplied by $N^2$ and often dominates the in-band phase noise, thus limiting the achievable PLL FOM. The sub-sampling PLL (SSPLL) proposed in [2] uses a PD that sub-samples the high frequency VCO output with the reference clock. The PD and CP noise is shown to be not multiplied by $N^2$, and greatly attenuated by the high phase detection gain, leading to lower in-band phase noise and better PLL FOM.

This article reviews the development of the sub-sampling PLL techniques and their applications in recent PLL architectures [3-21]. Section II discusses the classical charge pump PLL. Section III reviews the development of the PLL FOM and Section IV the SSPLL architecture. Power and spur reduction techniques for the SSPLL are discussed in Section V. Finally, Section VI draws conclusions and discusses the recent application of sub-sampling PLL techniques.

#### II. CLASSICAL CHARGE PUMP PLL

Fig. 1. Classical PLL: (a) architecture, (b) phase domain model, (c) phase noise spectrum (1/f noise neglected).

A linear, phase domain model for the classical PLL and its noise is shown in Fig. 1(b), where $K_d$ is the PD/CP detection gain, $F_{LF}(s)$ the loop filter trans-impedance transfer function and $K_{VCO}$ the VCO tuning gain. 1/f noise has been neglected to simplify the analysis so that we can focus on the fundamental limitations due to thermal noise. Defining a CP feedback gain $\beta_{CP}$ as the gain from the PLL output phase to the CP output current, the closed loop CP noise transfer function can be calculated as:

$$H_{CP}(s) = \frac{1}{\beta_{CP}} \cdot \frac{\beta_{CP} \cdot F_{LF}(s) \cdot K_{VCO} / s}{1 + \beta_{CP} \cdot F_{LF}(s) \cdot K_{VCO} / s} = \frac{1}{\beta_{CP}} \cdot \frac{G(s)}{1 + G(s)} \tag{1}$$

where $G(s)$ is the PLL open loop transfer function.
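As a numerical illustration of (1), the sketch below evaluates $H_{CP}(s)$ on the $j\omega$ axis. The series-RC loop filter and every component value are assumptions chosen only for illustration; they are not taken from any design discussed in this article. Under these assumptions, $|H_{CP}| \cdot \beta_{CP}$ stays close to 1 well inside the loop bandwidth, where $|G| \gg 1$, and drops far below 1 well outside it.

```python
import math

# Hypothetical values, for illustration only (not from any cited design).
BETA_CP = 1e-4                  # CP feedback gain beta_CP [A/rad]
K_VCO = 2 * math.pi * 50e6      # VCO tuning gain [rad/s/V]
R_LF = 100.0                    # loop-filter resistor [ohm]
C_LF = 10e-9                    # loop-filter capacitor [F]


def h_cp(f_hz):
    """Closed-loop CP noise transfer H_CP(j*2*pi*f) from Eq. (1)."""
    s = 1j * 2 * math.pi * f_hz
    f_lf = R_LF + 1.0 / (s * C_LF)      # assumed series-RC trans-impedance
    g = BETA_CP * f_lf * K_VCO / s      # open-loop transfer function G(s)
    return (1.0 / BETA_CP) * g / (1.0 + g)


def normalized_gain(f_hz):
    """|H_CP| scaled by beta_CP, i.e. |G / (1 + G)|."""
    return abs(h_cp(f_hz)) * BETA_CP


if __name__ == "__main__":
    for f in (1e3, 1e4, 1e5, 1e6, 1e7, 1e8):
        print(f"{f:>12.0f} Hz  |H_CP|*beta_CP = {normalized_gain(f):.4f}")
```

With the assumed values the loop crossover sits at roughly 500 kHz, so the 1 kHz point is deep in-band and the 100 MHz point is far out-of-band.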
The in-band phase noise due to the CP can be approximated as:

$$\mathcal{L}_{\text{in-band,CP}} \approx \frac{S_{iCP,n}}{2\beta_{CP}^2} \tag{2}$$

with $S_{iCP,n}$ the power spectral density of the CP current noise.

Fig. 2. 3-state PFD/CP: (a) schematic, (b) timing diagram, (c) characteristic.

Equation (2) indicates that the CP noise is suppressed by $(\beta_{CP})^2$ when transferred to the PLL output. A larger $\beta_{CP}$ is desired for more CP noise suppression. In a classical PLL design, the 3-state PFD/CP in Fig. 2 is often used. The VCO output is first divided down so that the divider output *Div* has a frequency similar to that of *Ref*. The timing of *Div* and *Ref* is then compared and the CP outputs a current pulse with a width equal to the timing error. The CP feedback gain for a given CP bias current $I_{CP}$ is:

$$\beta_{CP,PFD} = \frac{\Delta \overline{i_{CP}}}{\Delta \phi_{VCO}} = \frac{I_{CP}}{2\pi} \cdot \frac{1}{N}. \tag{3}$$

We can see that $\beta_{CP,PFD}$ is reduced by N, which is why the CP noise is multiplied by $N^2$ in a classical PLL. The reduction of $\beta_{CP,PFD}$ by N can be understood in the time domain. The VCO timing error $\Delta t$ is directly transferred to the divider output with a gain of 1. The PFD/CP detects this error and outputs a current pulse with width $\Delta t$ over one *Ref* period $T_{ref}$. If we increase N for the same $f_{VCO}$, $f_{ref}$ becomes lower and $T_{ref}$ larger. Consequently, the mean CP output current $I_{CP} \cdot \Delta t/T_{ref}$ corresponding to the same $\Delta t$ becomes smaller, resulting in a lower $\beta_{CP,PFD}$.

#### III. PLL FOM

A benchmarking FOM that takes into account fundamental tradeoffs between key design parameters can be useful in comparing the relative merits of different designs and stimulating the development of power efficient high performance circuits. Two widely used examples are the ADC FOM and the VCO FOM.
The VCO FOM is defined as [22]:

$$FOM_{VCO} = 10\log\left[\mathcal{L}_{VCO}(f_m) \cdot \left(\frac{f_m}{f_{VCO}}\right)^2 \cdot \frac{P_{VCO}}{1mW}\right] \tag{4}$$

where $\mathcal{L}_{VCO}(f_m)$ is the phase noise at an offset frequency $f_m$, $f_{VCO}$ is the operation frequency, and $P_{VCO}$ the power consumption. The unit of $FOM_{VCO}$ is dBc/Hz. A more negative $FOM_{VCO}$ corresponds to a better VCO design.

When transferred to the PLL output, the VCO noise is high-pass filtered and dominates the out-of-band phase noise. Another important part of the PLL phase noise is the noise from loop components such as the reference clock buffer, PD, CP and divider. The loop noise is low-pass filtered and dominates in-band. Assuming all the fundamentally required power for a timing circuit is dynamic (related to the switching events) and neglecting 1/f noise, [1] showed that the in-band loop noise at the PLL output $\mathcal{L}_{loop,in-band}$ is inversely proportional to the consumed power $P_{loop}$ and proportional to the square of the output frequency $f_{out}$:

$$\mathcal{L}_{loop,in-band} \propto N^2 \cdot f_{ref} \cdot \frac{f_{ref}}{P_{loop}} = \frac{f_{out}^2}{P_{loop}} \tag{5}$$

which is independent of $f_{ref}$, because using a larger $f_{ref}$ reduces the loop phase noise but also increases the dynamic power consumed by the loop. A FOM for PLL loop design is thus proposed in [1] as:

$$FOM_{loop} = 10 \log \left[ \mathcal{L}_{loop,in-band} \cdot \left( \frac{1Hz}{f_{out}} \right)^2 \cdot \frac{P_{loop}}{1mW} \right]. \tag{6}$$

With (4) and (6), we now have separate FOMs for the loop and the VCO, while we would like a combined PLL FOM. In many PLL applications, minimizing the total integrated phase noise or jitter is the design target, where a tradeoff exists between the VCO and loop noise filtering. It can be shown that an optimum loop bandwidth $f_{c,opt}$ exists [1, 23], approximately where the spectral densities of the VCO noise and the loop noise intersect, as shown in Fig. 1(c). In that case the VCO and the loop contribute equal jitter.
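For readers who want to evaluate (4) and (6) directly, a minimal sketch follows. It works in dB, so each multiplication in the definitions becomes an addition. The function names and the example numbers are hypothetical and are not measurements of any cited design.

```python
import math


def fom_vco_db(l_vco_dbc_hz, f_offset_hz, f_vco_hz, p_vco_w):
    """Eq. (4): VCO FOM from phase noise L(f_m) in dBc/Hz and power in W."""
    return (l_vco_dbc_hz
            + 20 * math.log10(f_offset_hz / f_vco_hz)
            + 10 * math.log10(p_vco_w / 1e-3))


def fom_loop_db(l_inband_dbc_hz, f_out_hz, p_loop_w):
    """Eq. (6): loop FOM from in-band phase noise in dBc/Hz and power in W."""
    return (l_inband_dbc_hz
            + 20 * math.log10(1.0 / f_out_hz)
            + 10 * math.log10(p_loop_w / 1e-3))


if __name__ == "__main__":
    # Hypothetical VCO: -120 dBc/Hz at 1 MHz offset, 2.2 GHz, 1.8 mW.
    print(f"FOM_VCO  = {fom_vco_db(-120.0, 1e6, 2.2e9, 1.8e-3):.1f}")
    # Hypothetical loop: -125 dBc/Hz in-band, 2 GHz output, 1 mW.
    print(f"FOM_loop = {fom_loop_db(-125.0, 2e9, 1e-3):.1f}")
```

Because (5) scales as $f_{out}^2/P_{loop}$, doubling `p_loop_w` while lowering the in-band noise by 3 dB leaves `fom_loop_db` essentially unchanged, which is the normalization the loop FOM is meant to provide.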
For a given PLL power budget $P_{PLL}$ , it would be best to spend it equally between the loop and VCO [1]: $P_{loop} = P_{VCO} = P_{PLL}/2$ . Under such optimized conditions, the minimum achievable jitter can be calculated as [1]: $$\sigma_{t,PLL,\min}^{2} = \frac{1}{(1Hz)^{2}} \cdot \frac{1mW}{P_{PLL}} \cdot \frac{1}{\pi} \cdot 10^{\frac{FOM_{loop} + FOM_{VCO}}{20}}$$ (7) which says that for a given quality of VCO and loop design (constant $FOM_{VCO}$ and $FOM_{loop}$ ), the minimum achievable PLL jitter variance is inversely proportional to the consumed power. In [1], we thus proposed a PLL benchmarking FOM as: $$FOM_{PLL} = 10 \log[(\frac{\sigma_{t,PLL}}{1s})^2 \cdot \frac{P_{PLL}}{1mW}]. \tag{8}$$ A smaller $FOM_{PLL}$ corresponds to a better PLL design. Comparing (7) and (8), we get: $$FOM_{PLL} = \frac{FOM_{loop} + FOM_{VCO}}{2} + 10\log\frac{1}{\pi}.$$ (9) Therefore, the design quality of the loop and VCO is equally important in improving PLL FOM. This is intuitive since the loop and the VCO have equal contribution to both power and jitter in an optimized PLL design. Fig. 3 shows the FOM for PLL designs with integrated LC oscillator in recent years' literature. Note that there is no $f_{ref}$ and $f_{VCO}$ involved in the PLL FOM definition because the derivations in [1] assumed the optimum case: all the loop power is dynamic (scale with $f_{ref}$ ) and is fundamentally needed to meet the noise performance. In reality, some power will not scale with $f_{ref}$ , e.g. the CP DC bias current. Also, noise performance is not always defining the required power. Some power may be needed to achieve sufficient speed, e.g. in a high-frequency divider. Given the same PLL design, the FOM in practice would still improve with a higher $f_{ref}$ and a lower $f_{VCO}$ , as the former helps to reduce the percentage of static power in the total power and the latter helps to reduce the high-speed related power, approaching closer to the optimum case. 
In other words, it will be harder in practice to achieve the same FOM with a lower $f_{ref}$ and a higher $f_{VCO}$.

Fig. 3. FOM of recent PLL designs with integrated LC oscillator.

#### IV. SUB-SAMPLING PLL

Fig. 4 shows the conceptual and timing diagram of a sub-sampling PD/CP (SSPD/CP) [2]. The high frequency VCO output, a sine wave with amplitude $A_{VCO}$ and DC voltage $V_{DC,VCO}$, is sub-sampled by a low frequency reference clock *Ref*. When the VCO and *Ref* are phase aligned and their frequency ratio N is an integer, the sampled voltage $V_{sam}$ has a constant value equal to $V_{DC,VCO}$.

Fig. 4. Sub-sampling PD/CP: (a) conceptual diagram, (b) timing diagram, (c) characteristic.

When there is phase error, $V_{sam}$ will deviate from $V_{DC,VCO}$. The timing error $\Delta t$ between VCO and *Ref* is thus translated to a voltage difference between $V_{sam}$ and $V_{DC,VCO}$. A transconductor $g_m$ now converts the voltage $V_{sam} - V_{DC,VCO}$ into current, so that a traditional current driven loop filter can still be used. This $g_m$ can be implemented in a time-continuous way, different from a duty-cycled CP. Here we still call it "CP" to simplify the notation, and implementing the $g_m$ as a real CP is actually useful for gain control, as we will see later. In contrast to a traditional CP, the output current is not proportional to $\Delta t/T_{ref}$, but rather amplitude controlled by the difference of $V_{sam}$ and $V_{DC,VCO}$.

The SSPD/CP transfer characteristic has the same shape as the waveform to be sampled, see Fig. 4(c). The ideal locking point is the zero crossing, where the SSPD/CP gain can be calculated as:

$$\beta_{CP,SS} = \frac{\Delta \overline{i_{CP}}}{\Delta \phi_{VCO}} = \frac{SR_{sam}}{2\pi f_{VCO}} \cdot g_m \tag{10}$$

with $SR_{sam}$ the slew rate of the waveform to be sampled at the zero crossing locking point.
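To make (10) concrete, the following minimal sketch models the sampled waveform as an ideal sine wave, estimates its slew rate at the zero crossing by a finite difference, and converts that into $\beta_{CP,SS}$. The amplitude, frequency and $g_m$ values, as well as the function names, are hypothetical and serve only as an illustration.

```python
import math


def v_sam_offset(dt_s, a_vco_v, f_vco_hz):
    """V_sam - V_DC,VCO when an ideal sine wave is sampled dt_s after
    its rising zero crossing (this is the SSPD characteristic)."""
    return a_vco_v * math.sin(2 * math.pi * f_vco_hz * dt_s)


def slew_rate_at_zero_crossing(a_vco_v, f_vco_hz, dt_s=1e-15):
    """Finite-difference estimate of SR_sam at the locking point [V/s]."""
    return v_sam_offset(dt_s, a_vco_v, f_vco_hz) / dt_s


def beta_cp_ss(sr_sam_v_per_s, f_vco_hz, gm_s):
    """Eq. (10): SSPD/CP gain in A/rad."""
    return sr_sam_v_per_s / (2 * math.pi * f_vco_hz) * gm_s


if __name__ == "__main__":
    A_VCO, F_VCO, GM = 0.5, 2.2e9, 1e-3     # hypothetical: 0.5 V, 2.2 GHz, 1 mS
    sr = slew_rate_at_zero_crossing(A_VCO, F_VCO)
    print(f"SR_sam     = {sr:.3e} V/s")
    print(f"beta_CP,SS = {beta_cp_ss(sr, F_VCO, GM):.3e} A/rad")
```

Note that neither function takes N or $f_{ref}$ as an argument: in this model the gain is set by the slope of the sampled waveform alone.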
In the case of an LC VCO, $SR_{sam}=A_{VCO}\cdot 2\pi f_{VCO}$, usually a well defined value since $f_{VCO}$ is known and VCO amplitude calibration over corners is often performed in practice. We can thus re-write (10) as:

$$\beta_{CP,SS} = A_{VCO} \cdot g_m = A_{VCO} \cdot \frac{2I_{CP}}{V_{gs,eff}} \tag{11}$$

where $g_m$ is assumed to be implemented with a single square-law MOS transistor in saturation and $V_{gs,eff}$ is the effective gate-source voltage of the transistor. There is no N in (11), which means the CP noise in an SSPLL is not multiplied by $N^2$ when transferred to the output. This is because the phase detection gain of the SSPD is determined by $SR_{sam}$. If we fix $f_{VCO}$, a change in $f_{ref}$ or N only affects how frequently sampling happens, but it does not affect $SR_{sam}$ or the detection gain.

Fig. 5. PLL's analogy to a loop-back transceiver: (a) SSPLL with a direct conversion receiver, (b) classical PLL with a superheterodyne receiver.

Viewed from a different angle, the N factor difference between the SSPLL and the classical PLL can be understood if we look at the PLL as a simple loop-back transceiver. The 'signal' is now the VCO phase noise and the function of the PLL loop is to receive this signal, process it and apply it to the VCO to cancel/suppress the VCO phase noise. In an SSPLL, as shown in Fig. 5(a), the SSPD with an LO (*Ref* clock) acts as a down converter that directly aliases the VCO phase noise back to around DC. The loop filter acts as a baseband circuit that processes the received signal and applies it to the VCO input. In other words, the SSPLL is analogous to a direct conversion receiver. The SSPD down-converter has no loss but a gain of 1. Therefore, there is no amplification of the SSPD/CP noise. The aliasing of high frequency VCO noise to low frequency is not a problem because the VCO noise has a steep roll-off. In a classical PLL, the receiver chain consists of a divider and a PFD/CP as shown in Fig. 5(b).
The divider first down-converts the signal to an intermediate frequency $f_{VCO}/N$. The PFD/CP together with the *Ref* clock acts as a second down converter and converts the signal to around DC. In other words, the classical PLL is analogous to a superheterodyne receiver. The divider acts like a very lossy down-converter with a gain of 1/N, and thus a noise figure of $20\log N$ even if it adds no noise itself. As a result, any noise in the receiver chain after the divider, like the PFD/CP noise, is amplified by $N^2$ when referred to the PLL output.

The architecture of an SSPLL utilizing the SSPD/CP is shown in Fig. 6(a). It works without a divider as long as the ratio $f_{VCO}/f_{ref}$ is an integer. A linear phase-domain model for the SSPLL is shown in Fig. 6(b). There is no classical divide-by-N in the feedback path but instead a virtual frequency multiplier "$\times N$" in the reference clock path. This is because the sub-sampling process aliases the VCO back with the closest $N \cdot f_{ref}$, so it works as if the VCO is sampled by a signal with a frequency N times higher than *Ref*. Note that although there is no $N^2$ factor for the SSPD/CP noise, the reference clock noise in an SSPLL is still multiplied by $N^2$ when transferred to the output, just as in a classical PLL. This is because the SSPD still relies on the timing of the *Ref* sampling edge to define the sampling point, and a given timing error still corresponds to N times more phase error if we refer it to $f_{VCO}$ instead of $f_{ref}$.

It is instructive to compare the $\beta_{CP}$ of the classical PFD/CP PLL and the SSPLL using (3) and (11):

$$\frac{\beta_{CP,SS}}{\beta_{CP,PFD}} = N \cdot 4\pi \cdot \frac{A_{VCO}}{V_{gs,eff}}. \tag{12}$$

The advantage of the SSPLL in terms of detection gain can be more than just a factor of N, as $4\pi \gg 1$ and usually $A_{VCO} > V_{gs,eff}$. Thus, the SSPLL has a much larger $\beta_{CP}$ than the classical PLL and much more suppression of the CP noise.

Fig. 6.
SSPLL (a) architecture, (b) phase domain model.

Fig. 7. Sub-sampling PD/CP with pulse width gain control.

Thanks to the high $\beta_{CP}$, the CP noise in an SSPLL is greatly suppressed and contributes negligibly to the total loop noise. In such a case, having an "unnecessarily high" $\beta_{CP}$ does not further improve the loop noise but does require a large filter capacitor to stabilize the loop. Fig. 7 shows the SSPD/CP with gain control [2]. Differential VCO outputs and differential sampling are used so that the differential zero-crossing can naturally be the locking point. Two switches and a block called "Pulser" are added to the CP. The Pulser generates a pulse *Pul*, non-overlapping with *Ref*, and switches $I_{UP}$ and $I_{DN}$ on and off simultaneously for a fraction of time $\tau_{pul}$ in each $T_{ref}$. The mean CP output current and thus $\beta_{CP,SS}$ is reduced. Note that switching on the CP only for a fraction of the time also reduces CP noise. The overall $\beta_{CP,SS}$ and the in-band phase noise due to the CP can be calculated as [2]:

$$\beta_{CP,SS} = 2A_{VCO} \cdot g_m \cdot \frac{\tau_{pul}}{T_{ref}} \tag{13}$$

$$\mathcal{L}_{\text{in-band,CP,SS}} = \frac{kT\gamma}{A_{VCO}^2 \cdot g_m} \cdot \frac{T_{ref}}{\tau_{pul}}. \tag{14}$$

By a careful choice of $\tau_{pul}/T_{ref}$, the value of $\beta_{CP,SS}$ will not be "unnecessarily high" but just high enough to keep the CP a negligible contributor to the total loop noise. Note that there are other ways of reducing $\beta_{CP,SS}$ as well, e.g. reducing the sampled voltage slope, or reducing $g_m$ by proper sizing/biasing or by adding degeneration. However, the advantage of the Pulser is that it also functions as the slave track-and-hold for the VCO sampling, so the SSPD can be a simple switch-and-cap as shown in Fig. 7. Also, by tuning the pulse width, $\beta_{CP}$ can be tuned over a wide range without affecting the SSPD/CP operating points.

The in-band phase noise due to the simple switch-and-cap SSPD in Fig.
7 can be derived as [4]:

$$\mathcal{L}_{in-band,SSPD} = 10 \log \frac{kT}{2C_{sam} \cdot A_{VCO}^2 \cdot f_{ref}}. \tag{15}$$

Assuming $f_{ref}$ = 40 MHz and $A_{VCO}$ = 0.5 V, a small $C_{sam}$ of 5 fF is enough to keep the SSPD noise at a very low level of -134 dBc/Hz at the VCO output. Interestingly, the noise level is *independent* of $f_{VCO}$, as the scaling of phase noise and the scaling of SSPD gain (defined by the VCO slope) with $f_{VCO}$ cancel out.

Fig. 8 shows the overall architecture of the SSPLL proposed in [2]. The core loop consists of an SSPD/CP with a Pulser, a passive loop filter and a VCO. Since the sub-sampling process cannot distinguish between $N \cdot f_{ref}$ and other harmonics of $f_{ref}$, the SSPLL may lock to any integer N as long as $N \cdot f_{ref}$ fits into the VCO tuning range. A classical PLL with divider is thus used as a frequency-locked-loop (FLL) to ensure correct locking. The key for the FLL is to dominate the loop control when the phase/frequency error is large but avoid adding noise once the phase is locked. This can be achieved e.g. by intentionally adding a large dead zone (DZ) to the FLL PFD/CP (see [2] for the implementation), so that it injects no current once the phase error is small and falls within the DZ. However, the work in [3] shows that Fig. 8 would also work with no DZ, because around the locking point $\beta_{CP,SS}$ can be much larger than $\beta_{CP,PFD}$ anyway. The overall characteristic of the combined SSPD/CP and PFD/CP is shown in Fig. 9. With no DZ, the PFD/CP in the FLL will inject noise even in the locked condition, but its contribution can be small as it is attenuated by $(\beta_{CP,SS}+\beta_{CP,PFD})^2$. After locking, the FLL can be disabled to save power, or it can remain on to constantly monitor the phase/frequency error and improve the SSPLL's robustness against disturbances [3].

Fig. 8. Overall SSPLL architecture.

Fig. 9. Combined SSPD/CP and PFD/CP characteristic.

#### V. POWER AND SPUR REDUCTION TECHNIQUES FOR SUB-SAMPLING PLL

In an SSPLL the PD and CP noise contributions are low and thus their power can be scaled down progressively. The VCO sampling buffer and *Ref* buffer for the SSPD could easily become the dominant source of loop noise as well as power. In [2], they respectively account for 30% and 60% of the total loop power.

In most applications, *Ref* is derived from a low frequency 10s-of-MHz sine wave crystal oscillator (XO). To properly sample the high frequency VCO, *Ref* needs to have a sharp edge. A *Ref* buffer converting the sine XO output into a square wave *Ref* is thus needed. When a CMOS inverter is used as a sine-to-square buffer in the 10s-of-MHz frequency range, a majority of the power, as much as 90% [2], could be wasted by the "short-circuit" current due to simultaneous conduction of the NMOS and PMOS during switching.

Fig. 10. Schematic and timing diagram of a low power sine-to-square *Ref* buffer.

Fig. 11. Buffer-less direct VCO sampling with dummy sampler.

Fig. 10 shows a *Ref* buffer design [4] that can largely eliminate the short-circuit current and drastically reduce the buffer power. The core is an inverter with an NMOS N1 and a PMOS P1. N1 is directly connected to XO while a timing control circuit (TCC) is inserted between P1 and XO. The TCC generates a narrow pulse $V_{GP}$ from the XO and controls the gate of P1. The delays $\Delta t_1$ and $\Delta t_2$ are set such that the time when $V_{GP}$ is low (P1 conducts) and the time when XO is high (N1 conducts) are non-overlapping. Since $f_{ref}$ is low, this timing plan can be easily met. In this way, N1 and P1 will not conduct simultaneously, thereby eliminating the short-circuit current. Moreover, for low noise sampling, only the *Ref* sampling edge (rising edge in this example) needs to be clean, while the other edge's noise is not relevant. Therefore, N1 can be sized big to maintain low noise, while P1 and the TCC can be sized small to save power.
This buffer thus greatly reduces power while maintaining the critical edge's noise performance. It also offers the flexibility of tuning the *Ref* duty cycle without impacting the critical edge.

One straightforward way of reducing the VCO sampling buffer power is to do buffer-less direct VCO sampling as shown in Fig. 11. Then the sampling buffer power is simply eliminated. However, the concern is the disturbance of the sampling process to the VCO operation, causing reference spurs. Different spur mechanisms can be identified, namely charge injection, charge sharing and VCO load/frequency modulation when the sampling switch is turned on/off. The load modulation can be alleviated by adding a complementary switched dummy sampler as shown in Fig. 11, so that the VCO is always loaded by one $C_{sam}$. The dummy sampler also helps to cancel the charge injection from the sampling switches to the VCO.

The other spur mechanism, the VCO-$C_{sam}$ charge sharing, needs more effort to deal with. Charge sharing happens when $V_{sam,on}$, the voltage on $C_{sam}$, and $V_{VCO,on}$, the voltage on the VCO tank capacitor, are not equal at the switch-on moment. As shown in Fig. 12, the *Ref* rising edge defines the switch-off moment, where holding starts and the voltage is sampled, and the *Ref* falling edge defines the switch-on moment, where tracking starts. After the PLL is locked, the sampling edge is phase locked to a VCO zero-crossing, so $V_{sam,on}$ is well-defined and equal to $V_{DC,VCO}$. In contrast, $V_{VCO,on}$ depends on the position of the *Ref* tracking edge, which could be anywhere on the VCO waveform. Clearly, it is desirable to phase lock the tracking edge to VCO zero-crossings as well. Then we will have $V_{VCO,on} = V_{sam,on} = V_{DC,VCO}$ and hence no VCO-$C_{sam}$ charge sharing, as shown in Fig. 12(a).

To this end, [5] proposed a low spur SSPLL architecture as in Fig. 13. On top of the SSPLL in Fig.
8, a sub-sampling DLL (SSDLL) is added which reuses the dummy sampler with $\overline{\text{Ref}}$ as its sampling clock, and reuses the low power buffer in Fig. 10 to independently tune the Ref tracking edge. Once the loops settle, both the Ref sampling and tracking edges are aligned with the VCO zero-crossings and the condition for no VCO- $C_{sam}$ charge sharing is achieved. Since the SSDLL tuning only affects the timing of Ref tracking edge which is the non-critical edge for the SSPLL, it will neither disturb the SSPLL operation nor add noise to the SSPLL output. With the aforementioned techniques, the SSPLL design in [4] achieved -125 dBc/Hz in-band phase noise at 200 kHz offset, with $f_{ref}$ =55MHz and $f_{VCO}$ =2.2GHz. The loop only consumes 0.7mW and the reference spur is kept below -56dBc. The total integrated PLL jitter is 0.16ps and total power is 2.5mW. Its -252dB FOM is still one of the best as shown in Fig. 3. Fig. 12. Conceptual illustration of (a) the case of minimum charge sharing, (b) the case of maximum charge sharing. Fig. 13. Low spur SSPLL architecture For a particular design, the achievable spur level with buffer-less VCO sampling will be dependent on the matching between the sampler and its dummy and also the ratio between $C_{sam}$ and VCO tank capacitor. When lower reference spur is needed, one can always add isolation buffers between the VCO and the samplers [5]. Note that the loop filter renders no filtering for the spur due to sampling so there is no tradeoff between low (sampling caused) spur and PLL bandwidth. The spur due to CP ripple of course still goes through the loop filter. It is a very important source of spurs in a classical PLL and often caused by UP/DN current mismatch in the 3-state PFD/CP. However, for a SSPD/CP, the current mismatch does not add to CP ripple but merely shifts the locking point away from the VCO zero crossing as shown in Fig. 14(a). 
The SSPLL can thus achieve low CP ripple without requiring good CP matching. The SSPLL in [5] demonstrated that <-80 dBc reference spur is achievable with a large loop-bandwidth-to-reference-frequency ratio of 1/20 and a simple CP design as shown in Fig. 14(b).

Fig. 14. (a) Behavior of SSPD/CP with UP/DN mismatch, (b) simple low ripple CP design.

#### VI. DISCUSSION AND RECENT SSPLL DEVELOPMENT

One key feature of the SSPD is that it can realize very high phase detection gain with low power. This is achieved by converting timing/phase error into voltage error by sampling a high slew rate voltage slope, directly from a high frequency VCO or from a VCO followed by a slope generator [6]. The waveform to be sampled is not necessarily a sine wave, and the sub-sampling technique can also be applied to e.g. ring oscillators [7-8]. In a more advanced process, steeper voltage slopes will be available and the simple switch-cap SSPD will be able to sample faster, thus benefiting from technology scaling. SSPLLs with the SSPD sampling at 10s-of-GHz have been demonstrated in [9] and [10].

In an SSPLL, the SSPD/CP noise is greatly suppressed by the high detection gain, so the SSPD/CP design can be simple and very power efficient. The FLL and divider can be turned off in the locked state to save power. Even the VCO sampling buffer power can be eliminated by doing direct VCO sampling. The only remaining loop component is the simple *Ref* buffer, which is needed in all PLLs. The SSPLL can thus approach "PLL Utopia", where only the *Ref* path contributes to loop noise and power. Another type of PLL that can approximate this state is the injection locked PLL (ILPLL). Indeed, as shown in Fig. 3, the current PLL FOM record in literature is -252dB, held by both the SSPLL in [4] and the ILPLL in [24], to the best of the authors' knowledge. Note that [4] used a 55MHz sine-wave XO as reference clock, while [24] used a 400MHz input, which requires either a special reference source or a clock multiplier before it.
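The -252dB figure for [4] follows directly from (8) and the jitter and power reported in Section V (0.16 ps rms jitter at 2.5 mW total power). The short sketch below reproduces that arithmetic and also inverts (8) to give the jitter implied by a given FOM and power budget. The function names are hypothetical.

```python
import math


def fom_pll_db(sigma_t_s, p_pll_w):
    """Eq. (8): PLL FOM from rms jitter in seconds and total power in W."""
    return 20 * math.log10(sigma_t_s / 1.0) + 10 * math.log10(p_pll_w / 1e-3)


def jitter_for_fom(fom_db, p_pll_w):
    """Invert Eq. (8): rms jitter [s] implied by a FOM at a given power."""
    variance = 10 ** (fom_db / 10) * (1e-3 / p_pll_w)
    return math.sqrt(variance)


if __name__ == "__main__":
    fom = fom_pll_db(0.16e-12, 2.5e-3)      # values reported for [4]
    print(f"FOM_PLL = {fom:.1f} dB")
    # At constant FOM, doubling the power halves the jitter variance.
    print(f"jitter at 5 mW, same FOM: {jitter_for_fom(fom, 5e-3) * 1e12:.3f} ps")
```

The second call illustrates the statement below (7) that, for a fixed design quality, the jitter variance is inversely proportional to the consumed power.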
Compared with the 'open-loop' ILPLL, the SSPLL is more like a traditional PLL with a phase detector and a feedback loop. It can have more robust bandwidth/loop-dynamic control and lower reference spur. However, the ILPLL can achieve a larger filtering bandwidth for VCO noise and can be beneficial e.g. when a ring oscillator is used. Lastly, [11-12] showed that these two techniques can nicely work together with a SSPLL assisting an ILPLL in injection timing control. The SSPD converts the timing error between the VCO and Ref into a sampled voltage error with a very high gain. One can think that if we digitize the sampled voltage with an ADC, it will effectively become a sub-sampling time-to-digital converter (SSTDC) [13]. Simple calculation shows that for a 2 GHz LC VCO with $A_{VCO}$ =0.5V and a 10-bit 1-V full scale ADC, the SSTDC resolution would be about 0.16 ps, nearly two orders of magnitude finer than the gate delay. A digital SSPLL with SSTDC has been recently demonstrated by [14]. The ADC can also be just 1-bit, like the digital SSPLL design in [15]. So far the SSPLL has been shown to work well with integer-N, where the steady-state sampling point is always around the zero crossing. It is interesting to investigate whether it can work with fractional-N since frac-N PLLs are more versatile. In such case, the sampling point can take any value on the waveform even in locked state. The linearity of the SSPD/CP is then a serious concern. If the sampling point is at the peak of a sine-wave, the gain would even be zero. One way to handle this issue as shown in [16] is to first convert the sine-wave into a more linear waveform like a sawtooth and then use an SSTDC to digitize the entire waveform. Digital background calibration can then be applied to linearize it. An alternative is adding a delta-sigma modulated digital-to-time-converter (DTC) to the reference clock path [17-19]. 
This effectively adds a frac-N multiplier before the SSPD/CP so that the sampling point would still be around zero crossings, as if it is still in int-N mode. Frac-N SSPLLs using these concepts have achieved good FOM among frac-N PLLs, as shown in Fig. 3. The development of very linear DTCs [19], [25] should help to further improve the performance.

#### REFERENCES

- [1] X. Gao, E. Klumperink, P. J. F. Geraedts and B. Nauta, "Jitter Analysis and a Benchmarking Figure-of-Merit for Phase-Locked Loops," *IEEE Trans. Circuits Syst. II*, vol. 56, no. 2, pp. 117-121, Feb. 2009.
- [2] X. Gao, E. Klumperink, M. Bohsali and B. Nauta, "A Low Noise Sub-Sampling PLL in Which Divider Noise is Eliminated and PD/CP Noise is not Multiplied by N<sup>2</sup>," *IEEE J. Solid-State Circuits*, vol. 44, no. 12, pp. 3253-3263, Dec. 2009.
- [3] C.-W. Hsu, K. Tripurari, S.-A. Yu and P. R. Kinget, "A Sub-Sampling-Assisted Phase-Frequency Detector for Low-Noise PLLs With Robust Operation Under Supply Interference," *IEEE Trans. Circuits Syst. I*, vol. 62, no. 1, pp. 90-99, Jan. 2015.
- [4] X. Gao, E. Klumperink, G. Socci, M. Bohsali and B. Nauta, "A 2.2GHz Sub-Sampling PLL with 0.16ps<sub>rms</sub> Jitter and -125dBc/Hz In-band Phase Noise at 700μW Loop-Components Power," *IEEE Symposium on VLSI Circuits*, pp. 139-140, Jun. 2010.
- [5] X. Gao, E. Klumperink, G. Socci, M. Bohsali and B. Nauta, "Spur Reduction Techniques for Phase-Locked Loops Exploiting a Sub-Sampling Phase Detector," *IEEE J. Solid-State Circuits*, vol. 45, no. 9, pp. 1809-1821, Sept. 2010.
- [6] D. Cai, et al., "A Dividerless PLL With Low Power and Low Reference Spur by Aperture-Phase Detector and Phase-to-Analog Converter," *IEEE Trans. Circuits Syst. I*, vol. 60, no. 1, pp. 37-50, Jan. 2013.
- [7] S. D. Vamvakos, et al., "A 8.125–15.625 Gb/s SerDes using a sub-sampling ring-oscillator phase-locked loop," *IEEE Custom Integrated Circuits Conference (CICC)*, paper 10.1, Sept. 2014.
- [8] K. Sogo, A. Toya and T. Kikkawa, "A ring-VCO-based sub-sampling PLL CMOS circuit with -119 dBc/Hz phase noise and 0.73 ps jitter," *IEEE European Solid State Circuits Conference (ESSCIRC)*, pp. 253-256, Sept. 2012.
- [9] X. Yi, C. C. Boon, J. Sun, N. Huang and W. M. Lim, "A low phase noise 24/77 GHz dual-band sub-sampling PLL for automotive radar applications in 65 nm CMOS technology," *IEEE Asian Solid-State Circuits Conference (A-SSCC)*, pp. 417-420, Nov. 2013.
- [10] T. Siriburanon, et al., "A 60-GHz sub-sampling frequency synthesizer using sub-harmonic injection-locked quadrature oscillators," *IEEE Radio Frequency Integrated Circuits Symposium (RFIC)*, pp. 105-108, Jun. 2014.
- [11] C.-F. Liang and K.-J. Hsiao, "An Injection-Locked Ring PLL with Self-Aligned Injection Window," *IEEE Int. Solid-State Circuits Conference (ISSCC)*, pp. 90-92, Feb. 2011.
- [12] I.-T. Lee, et al., "A divider-less sub-harmonically injection-locked PLL with self-adjusted injection timing," *IEEE Int. Solid-State Circuits Conference (ISSCC)*, pp. 414-415, Feb. 2013.
- [13] X. Gao, "Low Jitter Low Power Phase Locked Loops Using Sub-Sampling," PhD thesis, University of Twente, ISBN 978-90-365-3022-4, 2010.
- [14] T. Siriburanon, et al., "A 2.2GHz -242dB-FOM 4.2mW ADC-PLL using digital sub-sampling architecture," *IEEE Solid-State Circuits Conference (ISSCC)*, paper 25.2, Feb. 2015.
- [15] Z. Ru, P. Geraedts, E. Klumperink and B. Nauta, "A 12GHz 210fs 6mW Digital PLL with Sub-sampling Binary Phase Detector and Voltage-Time Modulated DCO," *IEEE Symp. VLSI Circuits*, pp. 194-195, June 2013.
- [16] Z.-Z. Chen, et al., "Sub-sampling all-digital fractional-N frequency synthesizer with -111dBc/Hz in-band phase noise and an FOM of -242dB," *IEEE Solid-State Circuits Conference (ISSCC)*, paper 14.9, Feb. 2015.
- [17] K. Raczkowski, N. Markulic, B. Hershberg, J. Van Driessche, and J. Craninckx, "A 9.2–12.7 GHz wideband fractional-N subsampling PLL in 28nm CMOS with 280fs RMS jitter," *IEEE Radio Frequency Integrated Circuits Symposium (RFIC)*, pp. 89-92, Jun. 2014.
- [18] W.-S. Chang, P.-C. Huang and T.-C. Lee, "A Fractional-N Divider-Less Phase-Locked Loop With a Subsampling Phase Detector," *IEEE J. Solid-State Circuits*, vol. 49, no. 12, pp. 2964-2975, Dec. 2014.
- [19] N. Markulic, K. Raczkowski, P. Wambacq and J. Craninckx, "A 10-bit, 550-fs step Digital-to-Time Converter in 28nm CMOS," *IEEE European Solid State Circuits Conference (ESSCIRC)*, pp. 79-82, Sept. 2014.
- [20] S. Ikeda, S.-Y. Lee, H. Ito, N. Ishihara and K. Masu, "A 0.52-V 5.7-GHz low noise sub-sampling PLL with dynamic threshold MOSFET," *IEEE Asian Solid-State Circuits Conference (A-SSCC)*, pp. 365-368, Nov. 2014.
- [21] J. Liang, Z. Zhou, J. Han and D. G. Elliott, "A 6.0–13.5 GHz Alias-Locked Loop Frequency Synthesizer in 130 nm CMOS," *IEEE Trans. Circuits Syst. I*, vol. 60, no. 1, pp. 108-115, Jan. 2013.
- [22] P. Kinget, "Integrated GHz voltage controlled oscillators," *Analog Circuit Design: (X)DSL and Other Communication Systems; RF MOST Models; Integrated Filters and Oscillators*, Kluwer, 1999, pp. 353-381.
- [23] C. S. Vaucher, *Architectures for RF Frequency Synthesizers*. Boston, MA: Kluwer, 2002.
- [24] I.-T. Lee, K.-H. Zeng and S.-I. Liu, "A 4.8-GHz Dividerless Subharmonically Injection-Locked All-Digital PLL With a FOM of 252.5 dB," *IEEE Trans. Circuits Syst. II*, vol. 60, no. 9, pp. 547-551, Sept. 2013.
- [25] Z. Ru, C. Palattella, P. Geraedts, E. Klumperink and B. Nauta, "A High-Linearity Digital-to-Time Converter Technique: Constant-Slope Charging," to appear in *IEEE J. Solid-State Circuits*, 2015.