# A Time-Domain 147fs<sub>rms</sub> 2.5-MHz Bandwidth Two-Step Flash-MASH 1-1-1 Time-to-Digital Converter With Third-Order Noise-Shaping and Mismatch Correction

Ying Wu, *Student Member, IEEE*, Ping Lu, *Senior Member, IEEE*, and Robert Bogdan Staszewski, *Fellow, IEEE*

Abstract-A 50 MS/s two-step flash-MASH 1-1-1 time-to-digital converter (TDC) employing a two-channel time-interleaved time-domain register with an implicit adder/subtractor realizes an error-feedback topology. Such an error-feedback unit of a 1<sup>st</sup>-order noise-shaping TDC can be cascaded in a multi-stage noise-shaping (MASH) configuration to achieve higher-order noise shaping and, thereby, high resolution. This paper also discusses different noise sources as well as linearity and noise tradeoffs in noise-shaping TDCs, and then demonstrates a histogram testing technique to correct the mismatch of the 1<sup>st</sup>-stage flash TDC. An on/off-chip delay modulation (DM) measurement technique is presented to characterize the TDC linearity and noise performance. Fabricated in 40-nm CMOS technology, the proposed TDC consumes 1.32 mW from a 1.1 V supply. At frequencies below 2.5 MHz, the TDC error integrates to 147 fs<sub>rms</sub>, which is equal to an equivalent flash resolution of 1.6 ps.

*Index Terms*— Two-step TDC, noise shaping, time-domain register, time-domain adder/subtractor, time-domain signal processing, error-feedback, delta-sigma, MASH, histogram testing, mismatch correction, TDC measurement technique, time-to-digital converter.

# I. INTRODUCTION

THE relentless scaling of CMOS process technology over the past decades has enabled integration of a large number of transistors on a single chip while significantly improving the circuit performance. Especially in nanometer-scale CMOS, exploiting time-domain resolution is becoming more and more popular over voltage-domain resolution due to the high speed of transistors and the reduced supply voltage. A time-to-digital converter (TDC) quantizes the time-domain signal,

Manuscript received October 20, 2019; revised January 24, 2020 and February 19, 2020; accepted March 13, 2020. Date of publication April 21, 2020; date of current version July 31, 2020. This work was supported in part by the RF Department of HiSilicon, Shanghai, in part by the European Research Council under Consolidator Grant 307624 TDRFSP, and in part by the Science Foundation Ireland under Grant 14/RP/I292. This article was recommended by Associate Editor L. Hernandez. (*Corresponding author: Ying Wu*.)

Ying Wu was with the Department of Microelectronics, Delft University of Technology, 2628CD Delft, The Netherlands. He is now with Intel, Austria (e-mail: wy1023wy@gmail.com).

Ping Lu was with Lund University, 22184 Lund, Sweden. She is now with Microsoft Corporation, Raleigh, NC 98052 USA.

Robert Bogdan Staszewski was with the Delft University of Technology, 2628CD Delft, The Netherlands. He is now with the University College Dublin, Dublin 4, Ireland (e-mail: r.b.staszewski@tudelft.nl)

Color versions of one or more of the figures in this article are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/TCSI.2020.2983581

represented, for example, by a time interval between two signal transitions, into a digital signal, and has become widely used for jitter measurements [1], time-of-flight range finders [2], space science instruments [3], all-digital phase-locked loops (ADPLL) [4], and so on. In these applications, it is crucial to have high-performance TDCs achieving high resolution, wide dynamic range, and large signal bandwidth.

Various types of analog-to-digital converter (ADC) architectures have been suggested for exploitation in TDCs in order to satisfy the aforementioned requirements. Just like ADCs, TDCs can also be classified into Nyquist-rate and oversampling types. For Nyquist-rate TDCs, such as flash [5], ring oscillator (RO) based [6], vernier [7]-[10], coarse-fine two-step [11], [12], interpolation [13], [14], pipeline [15], pulse shrinking delay-line [16], [17] and successive approximation (SA) [18], [19], the mismatch of the delay cells and the nonlinearity of devices degrade the resolution and dynamic range. Therefore, randomization and calibration techniques are often used to optimize their linearity [20]-[25]. Oversampling TDCs, such as gated-ring-oscillator (GRO) [26], vernier-GRO [27] and relaxation oscillator $\Delta\Sigma$-TDC [28], achieve noise shaping by gating the oscillator and preserving the sampled phase during the off-state. Because the noise-shaping property depends on maintaining the off-state phase, these TDCs are susceptible to leakage and charge injection, which worsen with nanometer CMOS technology scaling. The switched-ring oscillator (SRO) TDCs [29], [30], in contrast, significantly suppress leakage and charge redistribution by not gating the oscillator but by switching it between two frequencies. However, they still suffer from the nonideal voltage-to-frequency transfer characteristic. Since the quantization step of SRO TDCs is equivalent to the stage delay of the oscillator, a high sampling frequency or high oversampling ratio (OSR) is also needed to achieve fine time resolution, which inevitably increases power consumption.

It is noted that the aforementioned noise-shaping TDCs are based on the RO structure. In addition, the conventional  $\Delta\Sigma$ -ADC approach can be applied to the TDCs by using a time-to-voltage (T-to-V) converter as its preceding stage, which converts a time interval of input to a corresponding voltage level [31], [32]. As shown in Fig. 1(a), the  $\Delta\Sigma$ -TDC includes an analog-intensive charge-pump (CP) circuitry and a noise-shaping  $\Delta\Sigma$ -ADC. However, the nonlinear

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/



Fig. 1. Architecture of (a) ADC based TDC; (b) time-domain TDC.

T-to-V conversion is inevitable due to mismatches of up/down current sources and the non-ideality of switching. Furthermore, that analog-intensive TDC topology is also sensitive to PVT variations and, as the technology scales down, it becomes less attractive due to the reduced supply voltage.

An alternative implementation of a noise-shaping TDC applies signal processing functions, such as adder/subtractor, register/unit-delay and accumulator, to the time domain, as well as some digital signal processing algorithms and 'tricks' that have digital or analog equivalents [15], [33]–[35], as shown in Fig. 1(b). However, directly processing the signal in the time regime is still not feasible with current technologies; therefore, the time information has to be first converted into an intermediate voltage or charge for local processing, and then restored by a V-to-T conversion. The unavoidable nonlinearity of the T-to-V conversion function, illustrated in Fig. 1(b), can be reversed by the following V-to-T transformation if their transfer functions are invertible: $t = f^{-1}(v) = f^{-1}\{f(t)\}$. This is advantageous in avoiding errors and distortion in time-domain signal processing.

In this paper, we leverage a time-domain signal processing technique to realize a 3<sup>rd</sup>-order noise-shaping TDC [31], [35]. It achieves 147 fs resolution within 2.5 MHz bandwidth when a raw quantization step of 20 ps is used. The impact of various noise sources in the noise-shaping TDC is analyzed, as well as tradeoffs between linearity and noise performance. We also demonstrate a measurement technique to characterize the noise-shaping TDC linearity and noise performance without using expensive laboratory low-jitter signal sources. Furthermore, a mismatch correction technique is proposed to improve the two-step TDC linearity.

The remainder of this paper is organized as follows. First, the time-domain signal processing techniques are explained in Section II. The architecture and circuit implementation of the two-step 3<sup>rd</sup>-order noise-shaping TDC are revealed in Section III. Section IV investigates the noise and linearity with regard to the conversion of voltage noise to jitter in the time domain. Next, the TDC mismatch correction by means of histogram testing is discussed. Finally, the TDC measurement techniques and the experimental results are presented in Section V.



Fig. 2. Voltage transfer trajectory of switched-RC circuit.

## II. TIME-DOMAIN SIGNAL PROCESSING TECHNIQUE

Time-domain signal processing is a type of signal processing conducted on time intervals or widths of pulses by means of fundamental operations, such as delay/shift, addition/subtraction and multiplication. The proposed concept is to convert the input time interval into an intermediate voltage/charge, process it, and afterwards, retrieve the processed time interval from the final voltage/charge. For ease of understanding, the scheme and voltage-transfer trajectory of a switched-RC circuit are graphically shown in Fig. 2, where only the high-to-low transition is considered. When the fully charged capacitor *C* is switched from its original  $V_{DD}$  voltage to the ground via the resistor *R*, the transient voltage response can be expressed as:

$$V_C(t) = V_{DD}(e^{-t/RC}) \tag{1}$$

and the inverse function of  $V_C(t)$  can be obtained as:

$$t = -RC \times \ln(V_C / V_{DD}) \tag{2}$$

The  $t_{VTH}$  in Fig. 2 is defined as the discharging time when the transient voltage  $V_C(t)$  crosses the threshold voltage  $V_{TH}$ .

The concept of time-domain signal processing can be described by the above switched-RC circuit exhibiting the voltage transfer trajectory in Fig. 2. Initially, the capacitor *C* is pre-charged to $V_{DD}$. When the switch is then connected to the resistor, the capacitor is discharged through it until the switch turns to *hold* at time $t_1$. Then, the top plate of the capacitor is left floating while holding the voltage $V_C(t_1)$. Up to this point, the scheme of Fig. 2 operates similarly to a sample-and-hold (S&H) in a traditional ADC [36]. However, the sampled voltage $V_C$ is not considered here as an output but only as an intermediate quantity. The capacitor resumes discharging from the held voltage when the switch is connected to *R* again at time $t_{AWK}$. As a result, the trajectory in Fig. 2 is distinctly separated into four regions: pre-charge ($\varphi_1$), pre-discharge ($\varphi_2$), hold ($\varphi_3$), and residual discharge ($\varphi_4$). Assuming no leakage on the capacitor, the merging of the voltage transfer trajectories of the two separated discharging regions ($\varphi_2$ and $\varphi_4$) is equivalent to the continuous one (without the holding), as shown in Fig. 2. This intuitive conclusion can be obtained from the time-invariant property of equation (2), that is, regardless of the holding region, the discharging time depends only on voltage ($V_{DD}$ and $\tau = RC$ are assumed constant). Therefore, the discharging time is expressed as



Fig. 3. Concept of time-domain (a) subtractor, (b) adder.

Given that  $t_{VTH}$  and  $t_{AWK}$  are constant, the above linear function can be used to realize various *linear* algorithms. The simple switched-RC circuit thus acts here as a basic component. It can achieve different arithmetic operations in time-domain.
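The hold-and-resume property can be illustrated numerically. The following is a minimal sketch based on equations (1) and (2); the function name `threshold_crossing_time` and the component values ($\tau$ = 400 ps, $V_{TH} = V_{DD}/2$, $t_{AWK}$ = 10 ns) are assumptions chosen for illustration, not values from this work. It suggests that the threshold-crossing instant is linear in $t_1$, namely $t_{AWK} + t_{VTH} - t_1$, which is consistent with (4).

```python
import math

def threshold_crossing_time(t1, t_awk, tau, vdd=1.1, vth=0.55):
    """Instant at which V_C crosses V_TH when the discharge is held at t1
    and resumed at t_awk (no leakage during the hold)."""
    if not 0.0 <= t1 <= t_awk:
        raise ValueError("expected 0 <= t1 <= t_awk")
    v_hold = vdd * math.exp(-t1 / tau)        # eq. (1) sampled at t1
    if v_hold <= vth:
        raise ValueError("V_TH was crossed before the hold phase")
    t_rest = -tau * math.log(vth / v_hold)    # eq. (2) restarted from v_hold
    return t_awk + t_rest

TAU = 400e-12      # assumed RC time constant
T_AWK = 10e-9      # assumed wake-up instant
T_VTH = -TAU * math.log(0.55 / 1.1)   # continuous discharge time to V_TH

for t1 in (50e-12, 100e-12, 150e-12):
    t_cross = threshold_crossing_time(t1, T_AWK, TAU)
    # Difference to the linear prediction t_AWK + t_VTH - t1 (in seconds)
    print(t1, t_cross - (T_AWK + T_VTH - t1))
```

The exponential T-to-V nonlinearity cancels in the V-to-T step, so only the linear dependence on $t_1$ remains.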

## A. Concept of the Time-Domain Subtractor and Adder

The time-domain subtraction in Fig. 3(a) is simply realized by a couple of switched-RC circuits. The time difference,  $\Delta t$ , is obtained by separately applying  $t_1$  and  $t_2$  as  $t_1$  in (3), and then subtracting the resulting  $t_{VTH2}(t_2)$  from  $t_{VTH1}(t_1)$  as follows

$$\Delta t = t_{VTH1} - t_{VTH2} = t_2 - t_1 \tag{4}$$

Theoretically, the time-domain addition can be obtained directly from the subtraction operation by, for example, reversing the polarity of the  $t_1$  input in (4). However, in order to ensure the discharging phase, that is  $t_{1,2} \ge 0$ , a time offset is inserted to each input as:  $t_1 = t_{off} - t'_1$  and  $t_2 = t_{off} + t'_2$ . As shown in Fig. 3(b), the cross-connected input of  $t'_1$  makes the time interval its inverse element. As a result, the equation (4) can be rewritten as

$$\Delta t = t_{VTH1} - t_{VTH2} = t_2 - t_1 = t'_2 + t'_1 \tag{5}$$

The time events  $t_{VTH1}$  and  $t_{VTH2}$ , whose difference is the output of the adder, can be detected in practice via the output voltage comparison with the threshold voltage  $V_{TH}$ .
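As a sketch of equations (4) and (5), the code below models each switched-RC branch by its threshold-crossing instant and forms the difference of two branches. The helper names (`t_cross`, `time_subtract`, `time_add`) and all numeric values, including the offset `t_off` = 100 ps, are illustrative assumptions.

```python
import math

TAU, VDD, VTH, T_AWK = 400e-12, 1.1, 0.55, 10e-9   # assumed values

def t_cross(t_in):
    """Threshold-crossing instant of one switched-RC branch held at t_in."""
    v_hold = VDD * math.exp(-t_in / TAU)
    return T_AWK - TAU * math.log(VTH / v_hold)

def time_subtract(t1, t2):
    """Eq. (4): t_VTH1 - t_VTH2 = t2 - t1."""
    return t_cross(t1) - t_cross(t2)

def time_add(t1p, t2p, t_off=100e-12):
    """Eq. (5): offset inputs t1 = t_off - t1', t2 = t_off + t2' give t2' + t1'."""
    return time_subtract(t_off - t1p, t_off + t2p)

print(time_subtract(30e-12, 80e-12))   # close to 50 ps
print(time_add(20e-12, 35e-12))        # close to 55 ps
```

The offset keeps both branch inputs non-negative, so each capacitor always sees a valid discharging phase.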

## B. Concept of the Time-Domain Multiplier

As indicated above, the time-domain  $\times 2$  multiplication can be conceptually obtained from equation (5) if  $t'_1 = t'_2$ , that is, the input time intervals are equal, and the sum of inputs can be written as

$$\Delta t = t_2' + t_1' = 2 \times t_{1,2}' \tag{6}$$



Fig. 4. Time-domain 2<sup>n</sup> multiplier (a), accumulator (b).

As shown in Fig. 4(a), by cascading *n* adders, a time-domain $2^n$ multiplication function can be expressed as

$$\Delta t = \underbrace{2 \times 2 \times \dots 2 \times 2}_{n} \times t'_{1,2} = 2^{n} \times t'_{1,2} \tag{7}$$
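Equation (7) can be sketched behaviorally as a chain of ideal adders, each fed with the same interval on both inputs. The functions `time_add` and `time_multiply_pow2` are hypothetical names for this illustration and ignore all circuit non-idealities.

```python
def time_add(t1p, t2p):
    """Ideal time-domain adder of eq. (5)."""
    return t1p + t2p

def time_multiply_pow2(t_in, n):
    """Cascade n adders, each doubling its input, to realize eq. (7)."""
    out = t_in
    for _ in range(n):
        out = time_add(out, out)   # eq. (6): equal inputs give a x2 stage
    return out

print(time_multiply_pow2(5e-12, 3))   # 5 ps scaled by 2**3
```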

## C. Concept of the Time-Domain Accumulator

The delay, which is introduced by the time holding operation, can be viewed as a time-domain register since the result of addition/subtraction is temporarily stored as voltage. Hence, a time-domain accumulator can be built with two cascaded adders with each output feeding into an input of the other [33]. Fig. 4(b) shows that one of the adders works as a unity-gain register by setting one of the inputs to zero (zero time difference). It is noted that the adder and register are synchronized by $t_{AWK}$; thus, a sequence of input time differences can be accumulated as follows:

$$\Delta t[n] = \sum_{k=0}^{n-1} t'_{1,2}[k] \tag{8}$$
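A behavioral sketch of the accumulator follows, assuming an ideal adder and a register that contributes one sample of delay, so that the output at index $n$ is the sum of the inputs up to index $n-1$. The function name `accumulate` and the integer picosecond inputs are illustrative assumptions.

```python
def accumulate(intervals):
    """Running sum with one-sample register delay:
    out[n] = intervals[0] + ... + intervals[n-1], with out[0] = 0."""
    state = 0          # value held in the time-domain register
    outputs = []
    for t in intervals:
        outputs.append(state)   # register output from the previous cycle
        state = state + t       # adder updates the stored sum
    return outputs

print(accumulate([1, 2, 3, 4]))   # input time differences in ps
```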

## III. PROPOSED TWO-STEP FLASH-MASH 1-1-1 TDC

In this section, the proposed two-step TDC is first linearly modeled in the discrete-time domain. With a MASH 1-1-1 architecture it exhibits high-order quantization noise shaping. Next, the aforementioned time-domain signal processing technique is applied to implement the noise-shaping TDC at circuit level.

## A. Structure of Two-Step Flash-MASH 1-1-1 TDC

The TDC can be modeled in z-domain since the time-domain input is naturally in the discrete format. Fig. 5 shows two 1st-order noise-shaping modulators: delta-sigma ( $\Delta\Sigma$ ) and error-feedback (EF). In each structure, the quantizer is simply a single or multi-bit TDC and the digital-to-time converter (DTC) of the same number of bits converts the digitized TDC output back to the time-domain. Rather than the DTC output itself as in the  $\Delta\Sigma$  loop, it is the quantization error that is fed back to the input in the EF configuration. Even with the unit-delay instead of the accumulator, the signal and quantization noise transfer functions



Fig. 5. Model of (a)  $\Delta \Sigma$ , (b) error-feedback TDC.

of the EF configuration are algebraically equivalent to those of $\Delta\Sigma$. By inspection of the EF scheme:

$$Y(z) = X(z) \cdot z^{-1} + E(z) \cdot (1 - z^{-1}) \tag{9}$$

where X(z) and Y(z) are the input and output of the EF TDC, respectively, in z-domain. E(z) denotes the quantization error. Observe that the signal is just delayed by one clock period while the noise is high-pass filtered. In spite of the equivalent transfer functions, the EF is preferred for the MASH 1-1-1 configuration due to the simple implementation of the unit delay and the DTC with an implicit time-domain subtractor at the circuit level. By contrast, in the $\Delta\Sigma$ loop, the quantization error needs to be extracted by additional circuitry.
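Equation (9) can be checked with a short behavioral model. The sketch below assumes a 3-level mid-tread quantizer with step $2 t_d$ and expresses all times in units of $t_d$; `quantize` and `ef_modulator` are hypothetical names, and the loop is one possible discrete-time reading of the EF structure, not the circuit itself.

```python
def quantize(u, step=2.0):
    """3-level mid-tread quantizer: outputs -step, 0 or +step."""
    if u > step / 2:
        return step
    if u < -step / 2:
        return -step
    return 0.0

def ef_modulator(x):
    """Error-feedback loop: the quantizer sees the delayed (input - error)."""
    y, e = [], []
    x_prev = e_prev = 0.0
    for xn in x:
        u = x_prev - e_prev      # unit delay applied to x[n] - e[n]
        yn = quantize(u)
        en = yn - u              # quantization error of this sample
        y.append(yn)
        e.append(en)
        x_prev, e_prev = xn, en
    return y, e

x = [0.6] * 2000                 # DC input of 0.6 t_d
y, e = ef_modulator(x)
print(sum(y) / len(y))           # average output tracks the DC input
```

In the time domain, (9) reads $y[n] = x[n-1] + e[n] - e[n-1]$, so the first-order difference of the error averages out while the signal passes with one sample of delay.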

The block diagram of the realized two-step noise-shaping TDC is shown in Fig. 6. It is made up of two conversion stages: flash (coarse) and MASH 1-1-1 (fine). The multi-bit flash converter coarsely estimates the input time difference yielding the MSB part, $Y_0$, as well as the LSB part's primary $V_{III}$ and two intermediate outputs $V_{II,I}$. The implicit DTC<sub>0</sub> then converts the digital $Y_0$ back to the time-domain signal, which is then subtracted from the input to give the coarse quantization error or residue. Next, the residue is finely quantized by the 2<sup>nd</sup> stage MASH 1-1-1 noise-shaping converter, which is configured by cascading three 1<sup>st</sup>-order EF modulators, which are free from stability concerns [37]. Assuming ideal DTCs in the flash and EF structures as well as the quantization errors being of equal power, $E_0^2 = E_1^2 = E_2^2 = E_3^2$, the output of each stage ($Y_{1-3}$) in MASH can be digitally post-processed to cancel the quantization errors and to achieve higher-order noise shaping. It is noted that the final primary output of the overall TDC, $V_{III}$, is a combination of the delayed flash output, $Y_0$, and MASH outputs, $Y_{1-3}$, and is expressed as follows:

$$V_n = U(z) \cdot z^{-n} + E_n(z) \cdot (1 - z^{-1})^n, \quad n = 1, 2, 3 \tag{10}$$

where n denotes the order of noise-shaping. The output spectrum of MASH 1-1-1 TDC is clearly discernible in Fig. 6. However, the noise floor remaining at the low-frequency offsets cannot be shaped due to the fact that it is the device noise, such as flicker and thermal noise. From another point

of view, the proposed TDC has a potential of high resolution, since, in most cases, the device noise is much lower than the quantization noise [26]–[28].
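The error cancellation behind (10) can be illustrated with a behavioral model. The sketch below assumes the textbook MASH arrangement in which each later stage quantizes the negated error of the previous one and the outputs are combined as $V_2 = z^{-1}V_1 + (1-z^{-1})Y_2$ and $V_3 = z^{-1}V_2 + (1-z^{-1})^2 Y_3$. The digital combiner is not spelled out in the text, so `mash111`, `ef_stage`, the sign convention and the sine test input are assumptions; times are in units of $t_d$ and the flash stage is omitted.

```python
import math

def quantize(u, step=2.0):
    """3-level mid-tread quantizer: outputs -step, 0 or +step."""
    return step if u > step / 2 else (-step if u < -step / 2 else 0.0)

def ef_stage(x):
    """1st-order error-feedback stage; returns output and quantization error."""
    y, e, x_prev, e_prev = [], [], 0.0, 0.0
    for xn in x:
        u = x_prev - e_prev
        yn = quantize(u)
        y.append(yn)
        e.append(yn - u)
        x_prev, e_prev = xn, yn - u
    return y, e

def delay(seq, k=1):
    """z^-k with zero initial conditions."""
    return [0.0] * k + list(seq[:len(seq) - k])

def diff(seq):
    """(1 - z^-1) applied to seq."""
    return [a - b for a, b in zip(seq, delay(seq))]

def mash111(x):
    y1, e1 = ef_stage(x)
    y2, e2 = ef_stage([-v for v in e1])   # stage 2 quantizes -e1
    y3, e3 = ef_stage([-v for v in e2])   # stage 3 quantizes -e2
    v2 = [a + b for a, b in zip(delay(y1), diff(y2))]
    v3 = [a + b for a, b in zip(delay(v2), diff(diff(y3)))]
    return v3, e3

x = [0.5 * math.sin(2 * math.pi * n / 200) for n in range(1000)]
v3, e3 = mash111(x)
# Residual after removing the delayed input is the 3rd-order shaped error e3.
print(max(abs(v3[n] - x[n - 3]) for n in range(3, len(x))))
```

Only the last-stage error survives, shaped by $(1-z^{-1})^3$, while $E_1$ and $E_2$ cancel exactly in this ideal model.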

Due to the nature of oversampling, noise-shaping modulators tend to be much slower than their Nyquist-rate counterparts. Hence, the applications of noise-shaping modulators are usually restricted to those requiring high linearity, albeit at narrow bandwidths. One approach to broadening the bandwidth is to increase the sampling rate. This requires a small time constant $\tau = RC$, as seen from equations (1) and (2), to speed up the processing of the time-domain signal. However, as shown in Fig. 2, a small *time* constant $\tau$ leads to a reduction of the input range due to a shrunk T-to-V conversion. In this work, the sampling rate is extended via time-interleaved unit-delays, which is based on multi-rate signal processing theory [38]. As shown in Fig. 6, the unit-delay is replaced by two interconnected modulators working in an interleaved fashion and running at the same clock (but opposite phases). The effective sampling rate is thus double the clock rate of each modulator. In other words, one can achieve the required sampling rate not by *directly* increasing the oversampling but by increasing the number of modulators.

## B. Circuit Implementation of Flash TDC

The block diagram of the 1<sup>st</sup> stage flash TDC is shown in Fig. 7. A 4-binary-bit TDC with 15  $2 \times t_d$  delay-time buffers ( $t_d$  is equivalent to the inverter delay) quantizes the input time difference while the multiplexer selects which one of the residues should be passed forward to the 2<sup>nd</sup> stage. Note that the flash TDC here acts also as a DTC; it was shown as two *separate* signal-processing components in Fig. 6. Figure 7(a) shows a buffer-based TDC in a single-ended arrangement. It illustrates a problem caused by the lagged processing of the DFFs array. That is, the detection of residue is slower than its generation [43]. To avoid this dysfunction, an extra path with  $N \times t_d$  delays is added in Fig. 7(b) to compensate for the required time of residue detection. Meanwhile, the DFF outputs need to be shifted by the number of added delays, N, to select the correct time residue.

The structure of the delay chain is further shown in Fig. 7(c). It consists of a transmission gate, two inverter strings and realigning latches (i.e. cross-coupled inverters). The input signal, start, is split into two complementary signals which are 'forced' to align their opposite edges by the latches. This dual-chain scheme conveniently provides both the $2 \times t_d$ and $1 \times t_d$ rising-to-rising edge buffer delays. Consequently, an appropriate offset, $-t_d$, can be added to the residue by selecting the $(N-1)^{th}$ rising edge at the stop path while the DFF array is still shifted by N (even number). This eases the possible stability issue caused by the out-of-range residues, which might exceed the input range of the noise-shaping TDC. The introduced offset can be easily compensated at the TDC digitized output. In this work, the $2 \times t_d$ delay is chosen in the flash TDC and the $t_d$ delay is used in the internal quantizer of the noise-shaping TDC, which will be discussed in the next section.
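The coarse quantization and the residue offset can be sketched behaviorally as follows. The model assumes an ideal, mismatch-free delay line with $t_d$ = 10 ps (so that $2 \times t_d$ = 20 ps) and inputs inside the flash range; `flash_tdc` is a hypothetical helper, and the compensation delay path and DFF shifting of Fig. 7 are not modeled.

```python
T_D = 10.0   # ps, assumed so that the flash step 2*t_d equals 20 ps

def flash_tdc(dt_ps, n_buffers=15, t_d=T_D):
    """Return (coarse code, residue in ps) for an input time difference.
    The -t_d offset centres the residue around zero for in-range inputs."""
    step = 2 * t_d
    code = int(dt_ps // step)
    code = max(0, min(n_buffers, code))      # 16 levels from 15 buffers
    residue = dt_ps - code * step - t_d      # offset of -t_d added here
    return code, residue

code, residue = flash_tdc(57.0)
print(code, residue)
# The input is recovered digitally by undoing the known offset:
print(code * 2 * T_D + T_D + residue)
```

Without the offset the residue would span $[0, 2t_d)$; with it the residue spans $[-t_d, t_d)$, which sits well inside the $\pm 3 t_d$ range of the following noise-shaping stage.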



Fig. 6. Block diagram of two-step flash-MASH 1-1-1 TDC.



Fig. 7. (a) Conventional flash TDC; (b) Signal flow of detection and residue generation path, conceptual solution. (c) Modified flash TDC.

## C. Circuit Implementation of 1<sup>st</sup>-Order Noise-Shaping TDC

Fig. 8(a) shows the details of the 1<sup>st</sup>-order noise-shaping TDC in the chosen EF configuration. It consists of a time-interleaved (TI) adder/subtractor, a 1.5-bit internal TDC and a 1.5-bit DTC. The TI adder/subtractor is realized using two parallel identical units. Each of them consists of two-way tristate inverters, capacitors, and switching/controlling



Fig. 8. Architecture of error-feedback TDC with time-interleaved time-domain adder (a); transfer function of 1.5-bit TDC and DTC (b).

circuitry. For the TI operation, the noise-shaping TDC runs at its full speed $f_S$, which is equal to the input sampling rate, while the de-multiplexer delivers the input samples to the two-way time-registered adder/subtractors, whose individual operation frequency is reduced to half of $f_S$. Then, the multiplexer sequentially selects the output of each channel to obtain $f_S$, i.e., the full-rate output. A 1.5-bit internal TDC with resolution of $2 \times t_d$ (20 ps) is used as a 2-step mid-tread quantizer. Fig. 8(b) illustrates the transfer function and the quantization error, $Q_{err}$, of the internal TDC. It is shown that as long as $\Delta T$ is limited between $-3 \times t_d$ and $3 \times t_d$, the error $Q_{err}$ is between $-t_d$ and $t_d$. The delay cell of $t_d$ is implemented with the same scheme as in Fig. 7(c) to match the quantization step of the flash TDC.
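The bound on $Q_{err}$ can be checked with a small model of the 1.5-bit quantizer. The sketch assumes decision thresholds at $\pm t_d$ and output levels $-2t_d$, 0 and $+2t_d$ with $t_d$ = 10 ps; `tdc_1p5bit` is a hypothetical name for this illustration.

```python
def tdc_1p5bit(delta_t, t_d=10.0):
    """3-level mid-tread quantizer with step 2*t_d (times in ps)."""
    if delta_t > t_d:
        return 2 * t_d
    if delta_t < -t_d:
        return -2 * t_d
    return 0.0

# Sweep the allowed input range of -3*t_d .. +3*t_d in 0.5 ps steps.
worst = 0.0
for i in range(-60, 61):
    dt = i * 0.5
    q_err = tdc_1p5bit(dt) - dt
    worst = max(worst, abs(q_err))
print(worst)   # largest |Q_err| over the sweep, in ps
```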



Fig. 9. Timing of time-interleaved operation.

The timing generation of the TI adder/subtractor is explained in Fig. 9. It produces quadrature-switching controlling clocks, *G* and *AWK*, at one half the input frequency. The inputs $IN_{1,2}$ and $HLD_{1,2}$ are gated by the OR gates of the de-interleaving de-multiplexer via the clock *G/GB*. The gated inputs $IN_{1,2\_gated}$ and $HLD_{1,2\_gated}$ are thereby sequentially sent to each TI adder/subtractor according to the complementary phase of *G* and *GB*. Meanwhile, the clock *AWK/AWKB*, which is in quadrature with *G/GB*, triggers and synchronizes the outputs of the TI adder/subtractor. The interleaving multiplexer takes in values from these outputs in sequence to realize the perfect recombination; i.e., the output is a delayed scaled addition/subtraction of the inputs [39].

As explained in Section II, the time-domain signal processing can be conceptually realized by a pair of switched-RC circuits through the pre-charge, pre-discharge, hold and residue discharge operations. In fact, $\tau = RC$ in Fig. 2 is not necessarily a constant, as long as the trajectory of the RC circuit in (2) is still considered time-invariant; for example, R can be a voltage-dependent resistor. Therefore, as shown in Fig. 10, a tristate inverter, whose on-resistance varies with the instantaneous voltage, can be used here to realize the equivalent switch and resistor functions. A triggered control circuit, consisting of an edge-sensitive flip-flop and a multiplexer, turns the switch on and off under different phases of operation. Furthermore, $V_{TH}$ is determined by the threshold voltage of the output inverter. The discharging voltage crossing down $V_{TH}$ triggers the inverter and generates a rising-edge transition.

To describe the operation of TI adder/subtractor, a 4-phase flow diagram is introduced with periodic sequence in Fig. 10: 1) Pre-charge: Just before this operation, the lower NMOS switch was set on, causing the tristate inverter to simply reset the capacitor. After *IN* settles at '0', the loading capacitor is pre-charged to  $V_{DD}$ . 2) Pre-discharge: The voltage on capacitor starts to be pulled down from  $V_{DD}$  when *IN* transitions from '0' to '1'. Note that an appropriate *RC* constant is required to ensure that the discharging voltage will not go below  $V_{TH}$ , which could result in the mistaken triggering of output inverter. 3) Hold: The discharging process is suspended when the *HLD* clock is triggered to turn off the switch. The time difference between *IN* and *HLD* is converted into the temporary voltage



Fig. 10. (a) Phases of time-domain adder/subtractor operation and (b) the waveform.

held on the capacitor. Due to the high input resistance of the loading inverter, the stored charge is preserved. Observe that even though *IN* is '1' and the bottom NMOS is on, there is no path for the current to flow to ground. 4) Residue-discharge: the capacitor resumes discharging the residue voltage when *AWK* turns the switch on and the output is triggered when the discharging voltage goes down to  $V_{TH}$ .

## IV. NOISE AND LINEARITY ANALYSIS

The noise of TDC comes mainly from two sources: the jitter introduced by the delay cells of the 1<sup>st</sup> stage flash TDC, and quantization noise as well as device noise of the 2nd stage noise-shaping TDC. As revealed in Fig. 6, the quantization error of flash TDC is not visible at the output since it is finely quantized by the noise-shaping TDC. However, the accumulated jitter, which is the sum of jitter contributions from many individual delay stages in the flash TDC, will appear at the output without any filtering. If these jitter errors are independent, then the standard deviation of their sum increases as the square root of the number of delays being summed [40]. For the noise-shaping TDC, the quantization error E is pushed to higher frequencies; however, any noise of the time-domain adder/subtractor, DTC and input buffers, as shown in Fig. 11, will directly appear at the TDC's output, without any benefits of noise shaping.
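The square-root accumulation of independent stage jitter can be illustrated with a short Monte Carlo sketch. The per-stage jitter of 50 fs and the 15-stage chain are illustrative assumptions only, as are the helper names.

```python
import math
import random

def accumulated_jitter(sigma_stage, n_stages):
    """RMS jitter of n independent, identically distributed delay stages."""
    return sigma_stage * math.sqrt(n_stages)

def monte_carlo_jitter(sigma_stage, n_stages, trials=20000, seed=1):
    """Estimate the RMS of the summed stage jitter by random sampling."""
    rng = random.Random(seed)
    totals = [sum(rng.gauss(0.0, sigma_stage) for _ in range(n_stages))
              for _ in range(trials)]
    mean = sum(totals) / trials
    return math.sqrt(sum((t - mean) ** 2 for t in totals) / (trials - 1))

SIGMA_STAGE = 50e-15   # assumed per-stage jitter, not a value from this work
analytic = accumulated_jitter(SIGMA_STAGE, 15)
simulated = monte_carlo_jitter(SIGMA_STAGE, 15)
print(analytic, simulated)
```

Because this accumulated jitter is not noise-shaped, it sets a floor that oversampling in the second stage cannot remove.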

## A. Jitter in Time-Domain Adder/Subtractor

In fact, the in-band noise of TDC is mainly contributed by the tristate inverter of the time-domain subtractor and depends



Fig. 11. Noise sources in the noise-shaping TDC.



Fig. 12. (a) Noise models of tristate inverter within different phases; (b) voltage noise to jitter conversion.

on the charge/discharge current and the loading capacitance. To simplify the analysis and make the results easier to interpret, we will make the following assumptions regarding the tristate inverter:

- (i) all noise sources are white and uncorrelated.
- (ii) the tristate inverter is switched instantaneously.
- (iii) the propagation delay of the tristate inverter is much shorter than the sampling period, $T_S$.
- (iv) when the voltage noise crosses down the threshold it securely triggers the inverter [41].

For noise analysis, the tristate inverter can be modeled as shown in Fig. 12(a). At the very beginning of phase $\varphi_1$, when the capacitor voltage is still low, thermal noise in the PMOS transistor, $M_P$, is represented by a noise current source $i_{n,P}^2$. The capacitor integrates this noise into voltage over a time interval when $M_P$ works in the saturation region. Shortly afterwards, the resulting voltage noise as well as the deposited noise at the end of phase $\varphi_4$ decays exponentially with the $R_P C$ constant after $M_P$ enters the triode region ($R_P$ is the channel resistance of $M_P$ in the triode region). Generally, as long as (iii) holds, the aforementioned noise contributions vanish when the capacitor is charged to $V_{DD}$ [40]. Prior to the $\varphi_2$ switching event, the channel resistance of the $M_P$ pull-up deposits initial noise on the capacitor. For simplicity, $R_P$ is considered as constant. As $R_P C \ll T_S$, the noise on the capacitor is

$$\begin{aligned} \overline{V_{n,c(\varphi_1)}^2} &\approx \int_0^\infty i_{n,P}^2 \cdot \left|\frac{R_P}{1+j\omega R_P C}\right|^2 df \\ &= 4KTg_{ds0,P} \cdot \frac{R_P^2}{4R_P C} = \frac{KTg_{ds0,P}R_P}{C} \end{aligned} \tag{11}$$

where *K* is Boltzmann's constant and *T* is the absolute temperature. The integration interval is conveniently chosen for ease of arithmetic simplification. The $g_{ds0,p}$ is the channel conductance of $M_P$ at zero drain-source voltage [42]. Otherwise, for general $V_{DS}$, $g_{ds,p} = 1/R_p$.
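The closed form used in (11) relies on the integral of the squared magnitude of the single-pole response $R_P/(1+j\omega R_P C)$ over frequency being $R_P/(4C)$. The sketch below checks this numerically with a trapezoidal rule on a logarithmic grid and then evaluates the resulting rms voltage for $g_{ds0,P} R_P \approx 1$. The resistance of 500 Ω and the temperature of 300 K are assumptions; C = 1.1 pF is taken from the simulation setup quoted in this section.

```python
import math

K_B = 1.380649e-23   # Boltzmann's constant in J/K

def integral_h2(r, c, n=20000, span=1e4):
    """Integrate |R / (1 + j*2*pi*f*R*C)|^2 over f from 0 to span * f_c."""
    f_c = 1.0 / (2.0 * math.pi * r * c)
    h2 = lambda f: r * r / (1.0 + (f / f_c) ** 2)
    f_prev = f_c / span
    total = h2(0.0) * f_prev               # nearly flat part below f_c/span
    ratio = (span * span) ** (1.0 / n)     # log-spaced grid up to span*f_c
    for _ in range(n):
        f = f_prev * ratio
        total += 0.5 * (h2(f_prev) + h2(f)) * (f - f_prev)
        f_prev = f
    return total

R_P, C = 500.0, 1.1e-12                    # R_P is an assumed value
numeric = integral_h2(R_P, C)
closed_form = R_P / (4.0 * C)
print(numeric / closed_form)               # close to 1

v_rms = math.sqrt(K_B * 300.0 / C)         # eq. (11) with g_ds0,P * R_P = 1
print(v_rms)                               # initial noise on C, in volts
```

With $g_{ds0,P} R_P \approx 1$, (11) reduces to the familiar $KT/C$ sampling noise, which is independent of the assumed resistance.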

From phase  $\varphi_2$  to the threshold toggling point,  $t_1$ , in phase  $\varphi_4$  (except during  $\varphi_3$ ), the NMOS  $M_{n1}$  and  $M_{n2}$  transistors work respectively in saturation and in triode with resistance of  $R_N$ . As shown in Fig. 12(b), the discharging of capacitor is stopped during the hold phase  $\varphi_3$ , hence, the voltage as well as noise are kept on the capacitor. As a result, the integrated voltage noise can be derived as with the 'always switch-on' tristate inverter since no new noise is introduced in phase  $\varphi_3$ . Thus, the noise analysis can be simplified as in [43] and the power spectral density (PSD) of current noise is

$$S_{i_n}(f) = \underbrace{4KT\gamma g_{ds0,n} \cdot \left(\frac{R_N}{1/g_{m,n} + R_N}\right)^2}_{M_{n1}\ \text{current noise contribution}} + \underbrace{4KT g_{ds0,n} \cdot \left(\frac{1/g_{m,n}}{1/g_{m,n} + R_N}\right)^2}_{M_{n2}\ \text{current noise contribution}} \tag{12}$$

where the parameter  $\gamma$  is the noise coefficient. Given that  $M_{n1}$  and  $M_{n2}$  are the same and  $1/g_{m,n} \approx R_N$ , the equation can be simplified as

$$S_{i_n}(f) = KTg_{ds0,n}(1+\gamma) \tag{13}$$

The noise current integrates on the capacitor within a discharge time interval, which is equivalent to the delay of the loaded tristate inverter,  $t_d$ . In frequency domain, this can be thought of as passing the noise through a linear block whose transfer function is the Laplace transform of a rectangular window of width,  $t_d$  [43]. In terms of the integrated voltage noise

$$\begin{aligned} \overline{V_{n,c(t_d)}^2} &= \frac{1}{C^2} \int_0^\infty S_{i_n}(f) \left| W_{t_d}(f) \right|^2 df \\ &= \frac{KTg_{ds0,n}(1+\gamma)}{2C^2} t_d \end{aligned} \tag{14}$$

Conveniently assuming that the threshold voltage is $V_{DD}/2$ and the discharge current is $I_N$, then $t_d$ can be expressed as follows:

$$\int_{0}^{t_{d}} \frac{I_{N}}{C} dt = \frac{V_{DD}}{2} \Longrightarrow t_{d} = \frac{CV_{DD}}{2I_{N}} \tag{15}$$

Jitter at the toggle point can be obtained from the square root of voltage noise at threshold over the slope,  $I_N/C$ .

$$\begin{aligned} \sigma_{j,TRI-INV} &= \sqrt{\overline{V_{n,c(\varphi_1)}^2} + \overline{V_{n,c(t_d)}^2}} \Big/ (I_N/C) \\ &= \frac{1}{I_N} \sqrt{KTg_{ds0,p}R_P C + \frac{KTg_{ds0,n}(1+\gamma)CV_{DD}}{4I_N}} \end{aligned} \tag{16}$$

For short-channel MOSFETs operating in *strong inversion*, a coefficient $\alpha = g_{m,N}/g_{ds0,n}$ is introduced to capture the drop of $g_{m,N}$. The coefficient $\alpha$ is typically between 0 and 1. Substituting $g_{ds0,n} = g_{m,N}/\alpha$ with $g_{m,N} \approx 2I_N/(V_{gs,N} - V_{tN})$ and $g_{ds0,p} \approx 1/R_P$ in (16), and assuming $V_{gs,N} - V_{tN} = V_{DD}/2$, gives

$$\begin{aligned} \sigma_{j,TRI-INV} &= \frac{1}{I_N} \sqrt{KTC + \frac{KT(1+\gamma)CV_{DD}}{2\alpha(V_{gs,N} - V_{tN})}} \\ &= \frac{1}{I_N} \sqrt{\frac{KT(1+\gamma+\alpha)C}{\alpha}} \end{aligned} \tag{17}$$

This is a compact expression for jitter caused by the current noise integrating on capacitor C. Intuitively, it can be understood that a sharp discharging can reduce the jitter by increasing the current and decreasing loading capacitance as well. However, this inevitably limits the dynamic range due to the shrink of  $t_d$  as in (15). Fortunately, even for a given dynamic range, the jitter can be also optimized by proportionally increasing  $I_N$  and C since the jitter increases only as a square root of C in (17). In other words, if a power consumption budget is specified, a tradeoff between jitter performance and dynamic range needs to be considered.

Fig. 13(a) plots the simulated noise with Spectre noise analysis using C=1.1 pF,  $I_N=2 \text{ mA}$  and  $f_S=50 \text{ MHz}$ . The integrated voltage noise contributions of  $M_{n1}$ ,  $M_{n2}$  and  $M_P$ on the capacitor are separately shown within one period of tristate inverter operation. In reality, the calculated jitter in (17) is conservative since the integrated voltage noise,  $V_{n,P}$  of  $M_P$ decays in the residue-discharging phase,  $\varphi_4$ . Fig. 13(b) shows the simulated and calculated jitter across the capacitance C. With  $\gamma = 2/3$  and  $\alpha = 0.45$ , the simulated result verifies that the jitter increases with the square root of capacitance, as predicted by (17). In addition, the optimization of jitter, which can be achieved by increasing  $I_N$  and C proportionally, is verified in Fig. 13(c).

## B. Jitter in Other Blocks

In applying the noise analysis to the entire noise-shaping TDC, each noise source in Fig. 11 is obtained from Spectre phase-noise simulations. The  $\sigma_{j,BUF}^2$  represents the jitter introduced from the input buffers and built-in test circuit as well as the de-multiplexer in Fig. 9.  $\sigma_{j,TRIG}^2$  denotes the jitter caused by the triggered control circuit in Fig. 10(a). It is



Fig. 13. (a) Contributions of integrated voltage noise on capacitor vs. time. Integrated jitter vs. (b) C and (c)  $I_N$  with fixed ratio of  $C/I_N$ .



Fig. 14. Simulated jitter contributions.

noted that any external sampling jitter to the triggered control circuit, such as the sampling clock or timing generator of Fig. 9, will not degrade the jitter performance since the time errors are correlated for the pair of tristate inverters. In other words, the time errors caused by a common source cancel each other once the time difference or interval is measured. The jitter of the tristate inverter and that of the DTC are represented by  $\sigma_{j,TRI-INV}^2$  and  $\sigma_{j,DTC}^2$ , respectively.

As shown in Fig. 14, the jitter of 208 fs is obtained by integrating noise from 1 kHz to 25 MHz, which is half of the sampling frequency. The jitter can be transferred to phase noise at the 'carrier' frequency of 50 MHz. Note that the integrated jitter here is defined in terms of the edge-to-edge error since the output is a time interval. The  $\sigma_{j,BUF}^2$ , instead of  $\sigma_{j,TRI-INV}^2$ , dominates the jitter performance if the tristate inverter is properly designed. In fact,  $\sigma_{j,BUF}^2$  can be virtually eliminated if the TDC is used as an internal block of, for example, an ADPLL or TDC-assisted ADC. The flicker noise at low frequency offset, as shown in Fig. 14, contributes less than 10% of the total jitter. Therefore, equation (17) can be used to manually calculate the jitter with a reasonable accuracy.

## C. Linearity of Time-Domain Adder/Subtractor

As mentioned in Section I, the linearity of a time-domain based TDC is theoretically superior to that of a charge-pump based TDC since the T-to-V conversion is not required here to be a linear function. Without loss of generality, the T-to-V and V-to-T functions can be expressed as  $v = f(t)$  and  $t = f^{-1}(v)$ , respectively. For the charge-pump, the time difference,  $\Delta t$ , is converted to a proportional voltage difference,  $\Delta v$ , as

$$\Delta v = v_1 - v_2 = \underbrace{f(t_1) - f(t_2)}_{linear \ system} = f(\Delta t) \tag{18}$$

The property of superposition is only satisfied if f(t) is a linear system, which is not always valid in practice due to the nonlinear characteristics of transistors. However, in the proposed time-domain signal processing technique, a nonlinear T-to-V conversion is used and expressed as

$$\Delta v = v_1 - v_2 = f(t_1) - f(t_2) \tag{19}$$

The instantaneous voltages  $v_1$  and  $v_2$  can be converted back to  $\Delta t$  afterwards by applying  $f^{-1}(\cdot)$  to  $v_1$  and  $v_2$  separately. The V-to-T conversion is expressed as

$$\begin{aligned}
f^{-1}(v_1) - f^{-1}(v_2) &= f^{-1} \{f(t_1)\} - f^{-1} \{f(t_2)\} \\
&= t_1 - t_2 = \Delta t
\end{aligned} \tag{20}$$

This clearly shows that the voltage difference,  $\Delta v$ , is not the main concern; it is merely an intermediate quantity. Equations (19) and (20) are only satisfied if  $f(\cdot)$  is invertible and  $f^{-1}(\cdot)$  is unique, which means there is an exact one-to-one mapping between t and v, or in other words, a monotonic T-to-V conversion.
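The argument of (19) and (20) can be illustrated with a short Python sketch. An exponential discharge is used here as the nonlinear, monotonic  $f(t)$ ; this choice, the 400 ps time constant and the function names are assumptions for illustration and do not describe the actual tristate-inverter transfer function.

```python
import math

V_DD = 1.1     # supply voltage in volts
TAU = 400e-12  # assumed discharge time constant in seconds (illustrative only)


def t_to_v(t):
    """Nonlinear but monotonic T-to-V conversion f(t): exponential discharge."""
    return V_DD * math.exp(-t / TAU)


def v_to_t(v):
    """Inverse V-to-T conversion f^{-1}(v)."""
    return -TAU * math.log(v / V_DD)


t1, t2 = 300e-12, 100e-12
v1, v2 = t_to_v(t1), t_to_v(t2)

# Eq. (19): the voltage difference differs from f(t1 - t2) because f is nonlinear.
delta_v = v1 - v2
print(f"v1 - v2    = {delta_v * 1e3:.1f} mV")
print(f"f(t1 - t2) = {t_to_v(t1 - t2) * 1e3:.1f} mV")

# Eq. (20): inverting each voltage separately recovers the time difference.
delta_t = v_to_t(v1) - v_to_t(v2)
print(f"recovered  = {delta_t * 1e12:.3f} ps")
```

The recovered interval equals  $t_1 - t_2$  up to floating-point rounding even though superposition does not hold for the voltages, which is the property that the monotonicity condition guarantees.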

Another observation from (19) and (20) reveals that the linear time-domain signal processing can be realized with nonlinear components. However, non-ideal effects of the switch in the tristate inverter, such as channel charge injection, charge sharing, clock feedthrough and capacitor leakage yield voltage variance on the loading capacitor and hence lead to the deterministic timing offset. The simulated integral nonlinearity (INL) of time-domain adder/subtractor is obtained by slightly offsetting the frequencies of two input clocks. Interestingly, the INLs in Fig. 15(b) are largely independent of loading capacitance. It can be understood that the voltage variance is proportional to I/C if parasitic capacitances of MOSFETs are much lower than C [44]. The time error



Fig. 15. Voltage difference on holding capacitor (a), simulated INL of time-domain adder/subtractor (b).

obtained from the voltage variance over the slope,  $I_N/C$ , is thus independent of loading capacitance.

In fact, the voltage error caused by the non-ideal switch manifests itself as a constant voltage offset and thus leads to zero deterministic jitter for pair-like (i.e., pseudo-differential) tristate inverters. However, as shown in Fig. 15(a), the holding voltages  $V_{C1}$  and  $V_{C2}$  vary with the input time intervals, resulting in a variation of  $I_N$ , and hence of the discharging slope, between the tristate inverters of a pair. Given that the following inverter is triggered at a threshold voltage  $V_{TH}$ =0.55 V, the dynamic range of  $V_{C1}$  and  $V_{C2}$  is limited to between 0.6 and 1.05 V. Fig. 15(b) indicates that the error stays below 330 fs over an input range of 740 ps and below 20 fs over an input range of 20 ps. Since the input range of the noise-shaping TDC is only tens of picoseconds, this small nonlinearity can be dithered by the device noise itself.

## D. Mismatch Correction

In this two-step TDC topology, the nonlinearity is dominated by the delay mismatch of the 1<sup>st</sup> stage (flash TDC). The mismatch of the MASH TDC in the 2<sup>nd</sup> stage, however, usually causes only minor input-independent offsets and negligible nonlinearities due to its small dynamic range and the high linearity of the time-domain adder/subtractor, as shown in Fig. 15. In our previous work [45], the time-domain noise-shaping TDC is used to calibrate the delay mismatch due to its high linearity and resolution. In this work, a ramp input sequence is generated for histogram testing of DNL and INL as well as error correction [48].

Fig. 16 shows a histogram test and correction setup. A linear ramp waveform, which slightly exceeds both ends of the TDC range, is obtained by slightly offsetting the frequency between the two input signals. A large number of samples of the flash TDC are collected for the ramp waveform input, and the number of occurrences of each code is tallied. If the TDC had no DNL or INL errors, all codes would have equal probability of occurrence and there should be the same number of counts



Fig. 16. Mismatch correction based on histogram testing.



Fig. 17. Simulated PSD (a), and INL (b) of TDC with mismatch correction.

in each code bin. The real DNL for each code can be calculated from the probability density function and normalized to the flash TDC resolution. Once the DNL is obtained, it is added to the flash TDC output for mismatch correction. It is noted that the histogram test eliminates the effects of random noise by averaging it out over each code bin.
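The histogram test described above can be sketched in Python as follows. This is a simplified behavioural model, not the measurement code used for the prototype: the 16 delay cells (chosen so that 16 × 20 ps spans 320 ps), the 0.01 ps ramp increment, the random seed and the normalization of the DNL to the average bin width are all assumptions made for illustration.

```python
import bisect
import random

LSB = 20e-12     # nominal delay-cell value (20 ps)
SIGMA = 0.3e-12  # delay-cell standard deviation (300 fs)
N_CELLS = 16     # assumed number of delay cells (illustrative)
STEP = 0.01e-12  # assumed ramp increment per sample (illustrative)

rng = random.Random(1)  # fixed seed so that the run is repeatable
delays = [rng.gauss(LSB, SIGMA) for _ in range(N_CELLS)]

# Threshold k is the accumulated delay of cells 0..k.
edges = []
total = 0.0
for d in delays:
    total += d
    edges.append(total)


def flash_code(t):
    """Flash TDC output: number of thresholds that the input has passed."""
    return bisect.bisect_right(edges, t)


# Linear ramp that slightly exceeds both ends of the TDC range.
start = edges[0] - 0.5 * LSB
n_samples = int((edges[-1] + 0.5 * LSB - start) / STEP)
counts = [0] * (N_CELLS + 1)
for i in range(n_samples):
    counts[flash_code(start + i * STEP)] += 1

# The two end codes collect the over-range samples, so they are excluded.
inner = counts[1:-1]
mean_count = sum(inner) / len(inner)
dnl = [c / mean_count - 1.0 for c in inner]  # in units of the average bin width

# INL is the running sum of the DNL.
inl = []
acc = 0.0
for d in dnl:
    acc += d
    inl.append(acc)

print("DNL [LSB]:", [f"{d:+.3f}" for d in dnl])
print("INL [LSB]:", [f"{v:+.3f}" for v in inl])
```

In this model the estimated DNL of each inner code tracks the corresponding delay-cell error, and the resolution of the estimate is set by the number of ramp samples that fall into each bin.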

To simulate the SNR and INL improvements at the top level, a delay cell (20 ps) with a standard deviation of 300 fs obtained from a Monte-Carlo analysis is applied to the flash TDC. In Fig. 17(a), a 113-kHz, 252-ps peak-to-peak sinusoidal time-domain signal is used as an input. The spectrum shows that spurious tones, which are caused by the nonlinearity of flash TDC, are attenuated after the mismatch correction,



Fig. 18. Block diagram of voltage-controlled delay line.

thus SNR is improved by 6 dB within a 2.5 MHz bandwidth. In order to verify the effectiveness of the mismatch correction, different ensembles of nonlinearities are applied. As shown in Fig. 17(b), the linearity is effectively improved and the INL remains within  $\pm 0.02$  LSB/20 ps.

## V. EXPERIMENTAL RESULTS

The proposed noise-shaping TDC is fabricated in 40-nm CMOS and occupies an active area of 0.08 mm<sup>2</sup>. The die photograph of the prototype IC is shown in Fig. 23. It operates at 50 MHz from a 1.1V supply. The total consumed power of the chip is 1.32 mW, which consists of 0.35 mW for the first-stage flash TDC and 0.97 mW for the second stage MASH 1-1-1 TDC (cascaded with three 1<sup>st</sup>-order noise-shaping TDCs).

To properly characterize the prototype, a time-domain sinusoidal input signal is generated on-chip. As shown in Fig. 18, an array of NMOS transistors with capacitors is added to the load of each delay cell. Two differential sinusoidal voltages are applied to modulate the resistance of the NMOS transistors. In other words, the NMOS transistor is viewed as a tunable resistor and the time constant of the series RC can be adjusted by tuning the control voltage. Note that the delay cells need to be designed with low noise and high enough linearity to avoid degrading the TDC performance.

## A. TDC Measurement Techniques

Measuring the performance of noise-shaping TDCs is challenging due to the difficulty in generating a linear and arbitrary waveform in time-domain. Two measurement setups, as shown in Fig. 19, are employed to characterize the noise and linearity of the prototype TDC.

The first technique utilizes an on-chip delay modulation (DM) by tuning the variable delay-lines as shown in Fig. 19(a). The signal source is followed by a passive low-pass filter to obtain a 'pure' single-tone signal. It is noted that the jitter from the signal source is invisible to the TDC because it is correlated between the two TDC inputs; thus, the jitter requirement of the signal source can be dramatically relaxed. To verify this, a signal source with poor jitter is intentionally used to generate a fixed time interval as a dc input for the TDC. Fig. 19(b) reveals that the intrinsic TDC noise is concealed by the source jitter in the two-port case, but it is clearly visible in the spectrum when using the on-chip DM technique. Due to the non-linearity of the on-chip DM itself, especially under a large dynamic range,



Fig. 19. TDC measurement setup with (a) on-chip DM technique; (b) measured spectra with one/two-port signal source; (c) off-chip DM modulation technique.

this technique is, however, only feasible to measure noise performance.
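The correlation argument behind the on-chip DM setup can be illustrated with a small Monte Carlo sketch in Python. All numbers below (5 ps source jitter, 100 fs independent path jitter, 10 ps dc interval) are assumed values chosen only to make the effect visible; they are not measured data.

```python
import random
import statistics

rng = random.Random(7)  # fixed seed so that the run is repeatable

SIGMA_SOURCE = 5e-12  # assumed rms jitter of a poor signal source
SIGMA_PATH = 0.1e-12  # assumed independent rms jitter of each on-chip path
INTERVAL = 10e-12     # assumed dc time interval set by the on-chip delay line
N = 20000

one_port = []  # both TDC inputs derived from the same source edge
two_port = []  # each TDC input driven by its own, uncorrelated source
for _ in range(N):
    common = rng.gauss(0.0, SIGMA_SOURCE)
    start = common + rng.gauss(0.0, SIGMA_PATH)
    stop = common + INTERVAL + rng.gauss(0.0, SIGMA_PATH)
    one_port.append(stop - start)

    start2 = rng.gauss(0.0, SIGMA_SOURCE) + rng.gauss(0.0, SIGMA_PATH)
    stop2 = rng.gauss(0.0, SIGMA_SOURCE) + INTERVAL + rng.gauss(0.0, SIGMA_PATH)
    two_port.append(stop2 - start2)

print(f"one-port interval jitter: {statistics.pstdev(one_port) * 1e15:.0f} fs")
print(f"two-port interval jitter: {statistics.pstdev(two_port) * 1e15:.0f} fs")
```

In the one-port case the common source term cancels in the difference, so only the independent path jitter remains, whereas in the two-port case the full source jitter appears at the TDC input.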

In the second technique, an off-chip DM is used to extend the measurement range, as shown in Fig. 19(c). The ramp input is generated by two signal sources with frequencies  $f_1(=1/T_1)$  and  $f_2(=1/T_2)$ , respectively. The time step is set by the frequency offset  $\Delta f = f_1 - f_2$ . As (21) and (22) show, given  $\Delta f = 1$  kHz and  $T_1 = 20$  ns, the ramp increases by about 0.4 ps per step.

$$\begin{aligned}
T_{1} + \Delta T &= \frac{1}{f_{1} - \Delta f} = \frac{1}{f_{1} \left(1 - \Delta f / f_{1}\right)} \\
&\approx \frac{1 + \Delta f / f_{1}}{f_{1}} = T_{1} + T_{1} \Delta f / f_{1}
\end{aligned} \tag{21}$$

$$\Delta T / T_1 = \Delta f / f_1 \tag{22}$$

The time deviation  $\Delta T$  is accumulated linearly until it surpasses the TDC detection range. Thus, this measurement technique can achieve a large DM range without any penalty of nonlinearity. The jitter of signal sources, however, will be inevitably added to the TDC's input thus degrading the noise performance. Consequently, in this work, the on-chip and off-chip DM techniques are applied to measure noise and linearity performance, respectively.
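The ramp step implied by (21) and (22) can be checked numerically with a few lines of Python. The sketch compares the exact period difference with the first-order approximation and estimates how many ramp points cover the 320 ps full-scale range reported for this TDC; the variable names are chosen here for illustration.

```python
f1 = 50e6      # frequency of the first signal source in Hz (T1 = 20 ns)
delta_f = 1e3  # frequency offset between the two sources in Hz

t1 = 1.0 / f1
step_exact = 1.0 / (f1 - delta_f) - t1  # left-hand side of (21) minus T1
step_approx = t1 * delta_f / f1         # first-order result from (22)

full_scale = 320e-12                    # TDC full-scale range in seconds
steps = round(full_scale / step_approx) # ramp points needed for one sweep

print(f"exact step  = {step_exact * 1e12:.6f} ps")
print(f"approx step = {step_approx * 1e12:.6f} ps")
print(f"steps per {full_scale * 1e12:.0f} ps sweep = {steps}")
```

The first-order approximation differs from the exact step by a relative amount of about  $\Delta f/f_1$ , which is negligible for a 1 kHz offset at 50 MHz.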



Fig. 20. Measured PSDs. 65,536 pt FFT is performed with a Hanning window and averaged for 4 sequential sequences: (a)  $1^{st}$ , (b)  $2^{nd}$  and (c)  $3^{rd}$ -order noise-shaping at fs=50 MS/s; (d)  $3^{rd}$ -order noise-shaping at fs=100 MS/s.

## B. TDC Measurement Results

Firstly, the on-chip DM test is performed to verify the noise performance of the proposed TDC. A 28 kHz 12.4 ps peak-to-peak time-domain sinusoidal input is generated by tuning the differential control voltages. The measured output power spectral density (PSD) for the 1<sup>st</sup>, 2<sup>nd</sup> and 3<sup>rd</sup> order noise-shaping at *fs*=50 MS/s are depicted in Figs. 20(a-c), respectively, where the shaped noise is clearly visible, with 1/f flicker noise and thermal noise dominating at low frequencies.

From the measured output spectra, the integrated noise up to the signal bandwidth ( $f_{BW} = 2.5$  MHz) for the 1<sup>st</sup>, 2<sup>nd</sup> and 3<sup>rd</sup> order is 325 fs<sub>rms</sub>, 165 fs<sub>rms</sub> and 147 fs<sub>rms</sub>, respectively. A flash TDC with a white quantization-noise PSD would need a step of  $\Delta t_{res} = 3.6$  ps, 1.8 ps and 1.6 ps, respectively, to reach the same integrated noise level at the same OSR. The proposed TDC can operate up to fs=100 MS/s in MASH 1-1-1 mode, as shown in Fig. 20(d), where an integrated noise of 160 fs<sub>rms</sub> within a 5 MHz bandwidth is achieved, corresponding to an equivalent step of 1.7 ps.
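The conversion from integrated noise to equivalent flash resolution follows the definition given in the footnote of Table I,  $\Delta t_{res} = \sqrt{12 \times OSR} \times int\_noise$ . A minimal Python sketch, with a helper name chosen here for illustration, reproduces the 50 MS/s values:

```python
import math


def equivalent_flash_step(int_noise, osr):
    """Flash-TDC step whose white quantization noise, after oversampling by
    `osr`, integrates to `int_noise`: step = sqrt(12 * OSR) * int_noise."""
    return math.sqrt(12.0 * osr) * int_noise


FS = 50e6     # sampling frequency in S/s
F_BW = 2.5e6  # signal bandwidth in Hz
osr = FS / (2.0 * F_BW)  # oversampling ratio

for order, noise in ((1, 325e-15), (2, 165e-15), (3, 147e-15)):
    step = equivalent_flash_step(noise, osr)
    print(f"order {order}: {noise * 1e15:.0f} fs rms -> {step * 1e12:.1f} ps")
```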

Secondly, the off-chip DM test is employed to measure the linearity of the proposed TDC. A ramp input is generated by using 50 MHz and 50.001 MHz clocks. The INL is obtained by subtracting the TDC's output from the ideal ramp and then filtering by a 2.5 MHz low-pass filter. The measured transfer curve and INL for the MASH 1-1-1 mode are shown in Fig. 21, where an INL of  $\pm 0.07$  LSB/20 ps or  $\pm 1.4$  ps is achieved when a full-scale input of 320 ps is applied. Other TDC nonlinearity calibration methods are also possible, such as [4], [49].

The nonlinearity of the 1<sup>st</sup> stage flash TDC can be reduced by applying the mismatch correction technique described in Section IV. The output of the flash TDC in Fig. 22(a) is collected and the number of occurrences of each code (i.e., the histogram) is shown in Fig. 22(b). The estimate of the probability density function can be improved by increasing the number of samples, for example,  $\times 10$  samples in Fig. 22(c).

TABLE I PERFORMANCE SUMMARY AND COMPARISON WITH STATE-OF-THE-ART NOISE-SHAPING TDCs

|                                | This work       |                 |                 | JSSC'09<br>[26] | JSSC'12<br>[27] | JSSC'14<br>[29] | JSSC'12<br>[28]     | TCAS-I'14<br>[46] | JSSC'18<br>[32] |
|--------------------------------|-----------------|-----------------|-----------------|-----------------|-----------------|-----------------|---------------------|-------------------|-----------------|
| Scheme                         | Flash-ΔΣ        |                 |                 | GRO             | Vernier<br>GRO  | SRO             | MASH $\Delta\Sigma$ | Gated<br>SRO      | CT-ΔΣ           |
| Technology                     | 40nm            |                 |                 | 0.13um          | 90nm            | 90nm            | 0.13um              | 65nm              | 65nm            |
| Sampling frequency             | 50MS/s          |                 |                 | 50MS/s          | 25MS/s          | 500MS/s         | 50MS/s              | 400MS/s           | 250MS/s         |
| OSR                            | 10              |                 |                 | 25              | 16              | 250             | 250                 | 50                | 125             |
| Bandwidth (BW)                 | 2.5MHz          |                 |                 | 1MHz            | 800kHz          | 1MHz            | 0.1MHz              | 4MHz              | 1MHz            |
| Shaping-order                  | 1 <sup>st</sup> | 2 <sup>nd</sup> | 3 <sup>rd</sup> | 1 <sup>st</sup> | 1 <sup>st</sup> | 1 <sup>st</sup> | 3 <sup>rd</sup>     | 2 <sup>nd</sup>   | 3 <sup>rd</sup> |
| Integrated noise               | 325fs           | 165fs           | 147fs           | 80fs            | 200fs           | 315fs           | 5.6ps               | 148fs             | 182fs           |
| $\Delta t_{res}$ *             | 3.6ps           | 1.8ps           | 1.6ps           | 1.4ps           | 3.2ps           | 17ps            | 307ps               | 3.6ps             | 7ps             |
| Full-scale range               | 320ps           |                 |                 | 12ns            | 40ns            | 20ns            | 10ns                | 4ns               | 2ns             |
| 2π-Range                       | 320ps           |                 |                 | 564ps           | 45ps            | ~5ns            | 10ns                | ~666ps            | 2ns             |
| Supply voltage [V]             | 1.1             |                 |                 | 1.5             | 1.2             | 1               | 1.2                 | 1                 | 1.2             |
| Power [mW]                     | 0.67            | 1               | 1.32            | 21              | 3.6             | 2               | 1.7                 | 6.72              | 8.4             |
| Active area [mm <sup>2</sup> ] | 0.04            | 0.06            | 0.08            | 0.04            | 0.027           | 0.02            | 0.11                | 0.05              | 0.055           |
| FOM <sub>2</sub> (pJ/step) **  | 0.17            | 0.13            | 0.15            | 1.8             | 12.2            | 0.08            | 5.8                 | 0.23              | 0.47            |

\* Equivalent flash resolution:  $\Delta t_{res} = \sqrt{12 \times OSR} \times int\_noise$ . \*\*  $FOM_{2\pi} = Power/(2 \times BW \times 2^{ENOB})$ , where ENOB = (DR - 1.76)/6.02 and  $DR = 20 \times log_{10}(Range_{2\pi}/int\_noise)$ 



Fig. 21. Measured transfer characteristics of the noise-shaping TDC: (a) TDC output; (b) INL.

The mismatch is obtained through subtracting the number of samples of each code from the ideal one (in case of even distribution), then normalized and added to flash TDC output for mismatch correction. Fig. 22(d) shows the INL is improved to  $\pm 0.03$  LSB/20ps or  $\pm 0.6$  ps, which is mainly limited by the device noise and nonlinearity of the 2<sup>nd</sup> stage noise-shaping TDC. The TDC efficiency is 0.15 pJ/step for 2.5 MHz bandwidth.
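The efficiency figures quoted for this work can be reproduced from the definitions in the footnote of Table I. The following Python sketch uses the power, integrated noise, bandwidth and  $2\pi$  range listed in the "This work" columns; the function name is an illustrative choice.

```python
import math


def fom_2pi(power, bandwidth, range_2pi, int_noise):
    """FoM in J/step following the definitions in the footnote of Table I."""
    dr_db = 20.0 * math.log10(range_2pi / int_noise)  # dynamic range in dB
    enob = (dr_db - 1.76) / 6.02                      # effective number of bits
    return power / (2.0 * bandwidth * 2.0 ** enob)


BW = 2.5e6           # signal bandwidth in Hz
RANGE_2PI = 320e-12  # 2*pi range in seconds

# (shaping order, power in W, integrated noise in s) from the "This work" columns
modes = ((1, 0.67e-3, 325e-15), (2, 1.0e-3, 165e-15), (3, 1.32e-3, 147e-15))
for order, power, noise in modes:
    fom = fom_2pi(power, BW, RANGE_2PI, noise)
    print(f"order {order}: FoM = {fom * 1e12:.2f} pJ/step")
```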

The performance of the proposed TDC is summarized and compared with recent state-of-the-art noise-shaping TDCs in Table I. Note that, as explained in [28], the maximum input amplitude of an RO-based TDC is limited only by the depth of the counter or accumulator, which can be extended to arbitrarily enlarge the effective TDC range. Therefore, for a fair FoM comparison, only  $2\pi$  radians of the input range are considered, since the phase wrapping around  $2\pi$  is accounted for by the counter in [26], [29]. On this basis, the FoM of this



Fig. 22. Measured transfer characteristics of the 1st-stage flash TDC: flash TDC output (a); histogram of delay distribution with (b) 800 points and (c) 8000 points. (d) INL of the two-step TDC with and without mismatch correction.

work is improved by one order-of-magnitude compared with TDCs having similarly fine single-picosecond-level resolution (i.e., [26]). It is also worth noting that references [28] and [29] produce an even worse resolution than a single inverter can provide in the given technology.



Fig. 23. Chip micrograph of the proposed TDC.

## VI. CONCLUSIONS

We have presented a two-step flash- $\Delta \Sigma$  TDC employing a time-domain signal processing technique that is capable of storing, adding and subtracting time information. The measured TDC displays 3<sup>rd</sup>-order shaping of quantization noise and achieves 147 fs<sub>rms</sub> integrated noise, or 1.6 ps equivalent flash resolution, within a 2.5 MHz bandwidth. Operating at a 50 MHz sampling frequency, it consumes 1.32 mW from a 1.1 V supply. We have also demonstrated an off-chip mismatch correction technique to linearize the 1<sup>st</sup> stage flash TDC. The INL of the proposed two-step TDC is only ±0.03 LSB/20 ps or ±0.6 ps.

## REFERENCES

- [1] V. Gutnik and A. Chandrakasan, "On-chip picosecond time measurement," in *Symp. VLSI Circuits. Dig. Tech. Papers*, Honolulu, HI, USA, Jun. 2000, pp. 52–53.
- [2] M. Gersbach *et al.*, "A time-resolved, low-noise single-photon image sensor fabricated in deep-submicron CMOS technology," *IEEE J. Solid-State Circuits*, vol. 47, no. 6, pp. 1394–1407, Jun. 2012.
- [3] K. Karadamoglou, N. P. Paschalidis, E. Sarris, N. Stamatopoulos, G. Kottaras, and V. Paschalidis, "An 11-bit high-resolution and adjustable-range CMOS time-to-digital converter for space science instruments," *IEEE J. Solid-State Circuits*, vol. 39, no. 1, pp. 214–222, Jan. 2004.
- [4] R. B. Staszewski, S. Vemulapalli, P. Vallur, J. Wallberg, and P. T. Balsara, "1.3 V 20 ps time-to-digital converter for frequency synthesis in 90-nm CMOS," *IEEE Trans. Circuits Syst. II, Exp. Briefs*, vol. 53, no. 3, pp. 220–224, Mar. 2006.
- [5] R. B. Staszewski, D. Leipold, C.-M. Hung, and P. T. Balsara, "TDC-based frequency synthesizer for wireless applications," in *IEEE Radio Freq. Integr. Circuits (RFIC) Systems. Dig. Papers*, Fort Worth, TX, USA, Jun. 2004, pp. 215–218.
- [6] I. Nissinen, A. Mantyniemi, and J. Kostamovaara, "A CMOS time-to-digital converter based on a ring oscillator for a laser radar," in *Proc. ESSCIRC 29th Eur. Solid-State Circuits Conf.*, Estoril, Portugal, Sep. 2003, pp. 469–472.
- [7] J. Kostamovaara, K. Maatta, T. Rahkonen, and R. Rankinen, "ECL and CMOS ASICs for time-to-digital conversion," in *Proc. 2nd Annu. IEEE ASIC Seminar Exhibit*, Rochester, NY, USA, Sep. 1989, pp. P5-2/1.
- [8] P. Dudek, S. Szczepanski, and J. V. Hatfield, "A high-resolution CMOS time-to-digital converter utilizing a Vernier delay line," *IEEE J. Solid-State Circuits*, vol. 35, no. 2, pp. 240–247, Feb. 2000.
- [9] J. Yu, F. F. Dai, and R. C. Jaeger, "A 12-bit Vernier ring time-to-digital converter in 0.13μm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 45, no. 4, pp. 830–842, Apr. 2010.
- [10] L. Vercesi, A. Liscidini, and R. Castello, "Two-dimensions Vernier time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 8, pp. 1504–1512, Aug. 2010.
- [11] M. Lee and A. A. Abidi, "A 9 b, 1.25 ps resolution coarse–fine time-to-digital converter in 90 nm CMOS that amplifies a time residue," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 769–777, Apr. 2008.
- [12] K. Kim, Y.-H. Kim, W. Yu, and S. Cho, "A 7 bit, 3.75 ps resolution two-step time-to-digital converter in 65 nm CMOS using pulse-train time amplifier," *IEEE J. Solid-State Circuits*, vol. 48, no. 4, pp. 1009–1017, Apr. 2013.

- [13] E. Raisanen-Ruotsalainen, T. Rahkonen, and J. Kostamovaara, "A high resolution time-to-digital converter based on time-to-voltage interpolation," in *Proc. 23rd Eur. Solid-State Circuits Conf.*, Southampton, U.K., Sep. 1997, pp. 332–335.
- [14] S. Henzler, S. Koeppe, D. Lorenz, W. Kamp, R. Kuenemund, and D. Schmitt-Landsiedel, "A local passive time interpolation concept for variation-tolerant high-resolution time-to-digital conversion," *IEEE J. Solid-State Circuits*, vol. 43, no. 7, pp. 1666–1676, Jul. 2008.
- [15] K. Kim, W. Yu, and S. Cho, "A 9 bit, 1.12 ps resolution 2.5 b/stage pipelined time-to-digital converter in 65 nm CMOS using time-register," *IEEE J. Solid-State Circuits*, vol. 49, no. 4, pp. 1007–1016, Apr. 2014.
- [16] E. Raisanen-Ruotsalainen, T. Rahkonen, and J. Kostamovaara, "A low-power CMOS time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 30, no. 9, pp. 984–990, Sep. 1995.
- [17] P. Chen, S.-L. Liu, and J. Wu, "A CMOS pulse-shrinking delay element for time interval measurement," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 47, no. 9, pp. 954–958, Sep. 2000.
- [18] A. Mantyniemi, T. Rahkonen, and J. Kostamovaara, "A CMOS time-to-digital converter (TDC) based on a cyclic time domain successive approximation interpolation method," *IEEE J. Solid-State Circuits*, vol. 44, no. 11, pp. 3067–3078, Nov. 2009.
- [19] H. Chung, H. Ishikuro, and T. Kuroda, "A 10-bit 80-MS/s decision-select successive approximation TDC in 65-nm CMOS," *IEEE J. Solid-State Circuits*, vol. 47, no. 5, pp. 1232–1241, May 2012.
- [20] E. Temporiti, C. Weltin-Wu, D. Baldi, M. Cusmai, and F. Svelto, "A 3.5 GHz wideband ADPLL with fractional spur suppression through TDC dithering and feedforward compensation," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2723–2736, Dec. 2010.
- [21] R. B. Staszewski, K. Waheed, F. Dulger, and O. E. Eliezer, "Spur-free multirate all-digital PLL for mobile phones in 65 nm CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2904–2919, Dec. 2011.
- [22] J.-S. Kim, Y.-H. Seo, Y. Suh, H.-J. Park, and J.-Y. Sim, "A 300-MS/s, 1.76-ps-resolution, 10-b asynchronous pipelined time-to-digital converter with on-chip digital background calibration in 0.13-μm CMOS," *IEEE J. Solid-State Circuits*, vol. 48, no. 2, pp. 516–526, Feb. 2013.
- [23] C.-W. Yao *et al.*, "A 14-nm 0.14-psrms fractional-N digital PLL with a 0.2-ps resolution ADC-assisted coarse/fine-conversion chopping TDC and TDC nonlinearity calibration," *IEEE J. Solid-State Circuits*, vol. 52, no. 12, pp. 3446–3457, Dec. 2017.
- [24] D. Liao, H. Wang, F. F. Dai, Y. Xu, R. Berenguer, and S. M. Hermoso, "An 802.11a/b/g/n digital fractional-N PLL with automatic TDC linearity calibration for spur cancellation," *IEEE J. Solid-State Circuits*, vol. 52, no. 5, pp. 1210–1220, May 2017.
- [25] H. Wang, F. Foster Dai, and H. Wang, "A reconfigurable Vernier time-to-digital converter with 2-D spiral comparator array and second-order ΔΣ linearization," *IEEE J. Solid-State Circuits*, vol. 53, no. 3, pp. 738–749, Mar. 2018.
- [26] M. Z. Straayer and M. H. Perrott, "A multi-path gated ring oscillator TDC with first-order noise shaping," *IEEE J. Solid-State Circuits*, vol. 44, no. 4, pp. 1089–1098, Apr. 2009.
- [27] P. Lu, A. Liscidini, and P. Andreani, "A 3.6 mW, 90 nm CMOS gated-Vernier time-to-digital converter with an equivalent resolution of 3.2 ps," *IEEE J. Solid-State Circuits*, vol. 47, no. 7, pp. 1626–1635, Jul. 2012.
- [28] Y. Cao, W. De Cock, M. Steyaert, and P. Leroux, "1-1-1 MASH ΔΣ time-to-digital converters with 6 ps resolution and third-order noise-shaping," *IEEE J. Solid-State Circuits*, vol. 47, no. 9, pp. 2093–2106, Sep. 2012.
- [29] A. Elshazly, S. Rao, B. Young, and P. K. Hanumolu, "A 13b 315fsrms 2 mW 500MS/s 1MHz bandwidth highly digital time-to-digital converter using switched ring oscillators," in *Proc. IEEE Int. Solid-State Circuits Conf.*, San Francisco, CA, USA, Feb. 2012, pp. 464–466.
- [30] A. Elshazly, S. Rao, B. Young, and P. K. Hanumolu, "A noiseshaping time-to-digital converter using switched-ring oscillators— Analysis, design, and measurement techniques," *IEEE J. Solid-State Circuits*, vol. 49, no. 5, pp. 1184–1197, May 2014.
- [31] M. Gande, N. Maghari, T. Oh, and U.-K. Moon, "A 71dB dynamic range third-order ΔΣ TDC using charge-pump," in *Proc. Symp. VLSI Circuits* (*VLSIC*), Honolulu, HI, USA, Jun. 2012, pp. 168–169.
- [32] M. B. Dayanik and M. P. Flynn, "Digital fractional-N PLLs based on a continuous-time third-order noise-shaping time-to-digital converter for a 240-GHz FMCW radar system," *IEEE J. Solid-State Circuits*, vol. 53, no. 6, pp. 1719–1730, Jun. 2018.

- [33] J.-P. Hong *et al.*, "A 0.004 mm<sup>2</sup> 250μW ΔΣ TDC with time-difference accumulator and a 0.012mm<sup>2</sup> 2.5mW bang-bang digital PLL using PRNG for low-power SoC applications," in *Proc. IEEE Int. Solid-State Circuits Conf.*, San Francisco, CA, USA, Feb. 2012, pp. 240–242.
- [34] W. Yu, K. Kim, and S. Cho, "A 0.22 ps rms integrated noise 15 MHz bandwidth fourth-order ΔΣ time-to-digital converter using time-domain error-feedback filter," *IEEE J. Solid-State Circuits*, vol. 50, no. 5, pp. 1251–1262, May 2015.
- [35] Y. Wu, P. Lu, and R. B. Staszewski, "A 103fsrms 1.32 mW 50MS/s 1.25MHz bandwidth two-step flash-ΔΣ time-to-digital converter for ADPLL," in *Proc. IEEE Radio Freq. Integr. Circuits Symp. (RFIC)*, Phoenix, AZ, USA, May 2015, pp. 95–98.
- [36] J. L. McCreary and P. R. Gray, "All-MOS charge redistribution analog-to-digital conversion techniques. I," *IEEE J. Solid-State Circuits*, vol. 10, no. 6, pp. 371–379, Dec. 1975.
- [37] R. Schreier and G. C. Temes, Understanding Delta-Sigma Data Converters. Piscataway, NJ, USA: IEEE Press, 2005.
- [38] R. Khoini-Poorfard, L. B. Lim, and D. A. Johns, "Time-interleaved oversampling A/D converters: Theory and practice," *IEEE Trans. Circuits Syst. II, Analog Digit. Signal Process.*, vol. 44, no. 8, pp. 634–645, Aug. 1997.
- [39] Y. Wu, M. Shahmohammadi, Y. Chen, P. Lu, and R. B. Staszewski, "A 3.5–6.8-GHz wide-bandwidth DTC-assisted fractional-N all-digital PLL with a MASH ΔΣ -TDC for low in-band phase noise," *IEEE J. Solid-State Circuits*, vol. 52, no. 7, pp. 1885–1903, Jul. 2017.
- [40] J. A. McNeill, "Jitter in ring oscillators," *IEEE J. Solid-State Circuits*, vol. 32, no. 6, pp. 870–879, Jun. 1997.
- [41] B. H. Leung, "A novel model on phase noise of ring oscillator based on last passage time," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 51, no. 3, pp. 471–482, Mar. 2004.
- [42] Z. Y. Chang and W. M. C. Sansen, Low-Noise Wide-Band Amplifiers in Bipolar and CMOS Technologies. Boston, MA, USA: Springer, 1991.
- [43] A. A. Abidi, "Phase noise and jitter in CMOS ring oscillators," *IEEE J. Solid-State Circuits*, vol. 41, no. 8, pp. 1803–1816, Aug. 2006.
- [44] B. Razavi, Design of Analog CMOS Integrated Circuits, vol. 1. New York, NY, USA: McGraw-Hill, Oct. 2002.
- [45] Y. Wu and R. B. Staszewski, "A 0.5ps 1.4 mW 50MS/s Nyquist bandwidth time amplifier based two-step flash-ΔΣ time-to-digital converter," in *Proc. 2nd Int. Conf. Event-based Control, Commun., Signal Process.* (*EBCCSP*), Kraków, Poland, Jun. 2016, pp. 1–4.
- [46] W. Yu, K. Kim, and S. Cho, "A 148fsrms integrated noise 4 MHz bandwidth second-order ΔΣ time-to-digital converter with gated switched-ring oscillator," *IEEE Trans. Circuits Syst. I: Reg. Papers*, vol. 61, no. 8, pp. 2281–2289, Aug. 2014.
- [47] T. Tokairin, M. Okada, M. Kitsunezuka, T. Maeda, and M. Fukaishi, "A 2.1-to-2.8-GHz low-phase-noise all-digital frequency synthesizer with a time-windowed time-to-digital converter," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2582–2590, Dec. 2010.
- [48] J. Rivoir, "Fully-digital time-to-digital converter for ATE with autonomous calibration," in *Proc. IEEE Int. Test Conf.*, Santa Clara, CA, USA, Oct. 2006, pp. 1–10.
- [49] C.-R. Ho and M. S.-W. Chen, "A fractional-N DPLL with calibration-free multi-phase injection-locked TDC and adaptive single-tone spur cancellation scheme," *IEEE Trans. Circuits Syst. I, Reg. Papers*, vol. 63, no. 8, pp. 1111–1122, Aug. 2016.



**Ying Wu** (Student Member, IEEE) received the B.Sc. degree in microelectronics from the Chongqing University of Posts and Telecommunications, Chongqing, China, in 2008, and the M.Sc. degree in electronic engineering from Prof. Andreani's RF and Mixed-Signal Group, Lund University, Lund, Sweden, in 2011. He is currently pursuing the Ph.D. degree in electronic engineering with the Microelectronics Department, Delft University of Technology, Delft, The Netherlands.

His current research interests include high resolution time-to-digital converters, frequency synthesizer techniques, and analog/mixed-signal integrated circuit design.



**Ping Lu** (Senior Member, IEEE) received the B.S. and Ph.D. (package with M.S.) degrees from the Department of Microelectronics Engineering, Fudan University, Shanghai, China, in 2002 and 2007, respectively.

In 2005, she was an Intern Student with Samsung Electronics, Suwon, South Korea. From 2007 to 2015, she was a Research Fellow with Lund University, Lund, Sweden, where she was involved in high-resolution time-to-digital converters and digital PLL-based frequency synthesizers applied to LTE.

In 2012, she was also an Intern Researcher with Marvell Technology Group Ltd., Pavia, Italy. In September 2015, she joined Silicon Laboratories, Nashua, NH, USA, where she focused on high-performance clock chip design. Her main work included clock system noise modelling/estimation, frequency dividers, phase interpolators, high-speed clock tree, and so on. Since August 2019, she has been with Microsoft Corporation, Raleigh, NC, USA, where she is in charge of clock generation.

Dr. Lu was a recipient of the Motorola Scholarship in 2000, the Samsung Scholarship in 2003, the Infineon Scholarship in 2004, the Philips Scholarship and Applied-Material Scholarship (Shanghai City) in 2005, and the first Honor Award of "Technology Entrepreneurship Cup" Competition from Shanghai Government in 2007. She was a Technical Program Committee Member of the IEEE Norchip Conference from 2011 to 2013 and the IEEE Nordic Circuits and Systems Conference in 2015. She is currently serving on the Technical Program Committee of the IEEE European Solid-State Circuits Conference. She was a Guest Editor of the IEEE SOLID-STATE CIRCUITS LETTERS (SSC-L) in 2019.



**Robert Bogdan Staszewski** (Fellow, IEEE) was born in Bialystok, Poland. He received the B.Sc. (*summa cum laude*), M.Sc., and Ph.D. degrees in electrical engineering from The University of Texas at Dallas, Richardson, TX, USA, in 1991, 1992, and 2002, respectively.

From 1991 to 1995, he was with Alcatel Network Systems, Richardson, TX, USA, where he was involved in SONET cross-connect systems for fiber optics communications. In 1995, he joined Texas Instruments Inc., Dallas, TX, USA, where he was elected as a Distinguished Member of Technical Staff (limited to 2% of Technical Staff). From 1995 to 1999, he was involved in advanced CMOS read channel development for hard disk drives. In 1999, he co-started the Digital RF Processor (DRP) Group, Texas Instruments, with a mission to invent new digitally intensive approaches to traditional RF functions for integrated radios in deeply scaled CMOS technology. From 2007 to 2009, he was a CTO of the DRP Group. In 2009, he joined the Delft University of Technology, Delft, The Netherlands, where he currently holds a guest appointment of a Full Professor (Antoni van Leeuwenhoek Hoogleraar). Since 2014, he has been a Full Professor with the University College Dublin, Dublin, Ireland. He is also a Co-Founder of a startup company, Equal1 Labs, with design centers located in Silicon Valley and Dublin, Ireland, aiming to produce single-chip CMOS quantum computers. He has authored or coauthored five books, seven book chapters, 310 journal and conference publications, and holds 180 issued U.S. patents. His current research interests include nanoscale CMOS architectures and circuits for frequency synthesizers, transmitters, and receivers, as well as quantum computers.

Prof. Staszewski has been a TPC Member of ISSCC, RFIC, ESSCIRC, ISCAS, and RFIT. He was a recipient of the 2012 IEEE Circuits and Systems Industrial Pioneer Award. In May 2019, he received the title of Professor from the President of the Republic of Poland. He was a TPC Chair of the 2019 ESSCIRC Conference, Krakow, Poland.