A Power-Efficient Continuous-Time Incremental Sigma-Delta ADC for Neural Recording Systems

Sha Tao, Ana Rusu

Institutions: Royal Institute of Technology


This is the accepted version of a paper published in *IEEE Transactions on Circuits and Systems Part 1: Regular Papers*. This paper has been peer-reviewed but does not include the final publisher proof-corrections or journal pagination.

Citation for the original published paper (version of record):

A Power-Efficient Continuous-Time Incremental Sigma-Delta ADC for Neural Recording Systems.
*IEEE Transactions on Circuits and Systems Part 1: Regular Papers*

Access to the published version may require subscription.

N.B. When citing this work, cite the original published paper.

Permanent link to this version:
http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-163179
A Power-Efficient Continuous-Time Incremental Sigma-Delta ADC for Neural Recording Systems

Sha Tao, Student Member, IEEE, and Ana Rusu, Member, IEEE

Abstract—This paper presents an analog-to-digital converter (ADC) dedicated to neural recording systems. By using two continuous-time incremental sigma-delta ADCs in a pipeline configuration, the proposed ADC can achieve high resolution without sacrificing the conversion rate. This two-step architecture is also power-efficient, as the resolution requirement for the incremental sigma-delta ADC in each step is significantly relaxed. To further enhance the power efficiency, a class-AB output stage and a dynamic summing comparator are used to implement the sigma-delta modulators. A prototype chip, designed and fabricated in a standard 0.18 µm CMOS process, validates the proposed ADC architecture. Measurement results show that the ADC achieves a peak signal-to-noise-plus-distortion ratio of 75.9 dB over a 4 kHz bandwidth; the power consumption is 34.8 µW, which corresponds to a figure-of-merit of 0.85 pJ/conv.

Index Terms—Analog-to-digital converter (ADC), incremental sigma-delta ADC, two-step ADC, continuous-time.

I. INTRODUCTION

Over the past decade, recordings of neuropotentials using multi-electrode arrays (MEAs) have emerged as an effective solution for brain-computer interface (BCI) research and applications [1], [2]. Consequently, there is growing interest in the development of integrated circuits for multi-channel neural recording systems, which demand low power consumption and small chip area. In such systems, the analog-to-digital converter (ADC) is an important building block that has a dominant impact on the speed and resolution of the entire system. A successive-approximation-register (SAR) ADC, which features medium resolution and excellent power efficiency, is usually adopted in state-of-the-art neural recording systems [2]–[4]. Such a SAR-ADC-based signal chain, however, imposes stringent requirements on the front-end circuitry [5]. By employing a high-resolution ADC, these requirements can be greatly relaxed, and several power-hungry signal conditioning blocks (e.g., additional gain stages and active anti-aliasing filters) can be simplified or even eliminated. This significant reduction of front-end circuit complexity, on the other hand, comes at the cost of designing a high-resolution ADC with high power efficiency.

Sigma-delta (ΣΔ) ADCs achieve high resolution with relaxed matching requirements compared with their Nyquist counterparts. Traditional ΣΔ ADCs capture a band-limited stream of inputs and produce the corresponding outputs as an average. Without a one-to-one mapping between input and output samples, they can hardly be multiplexed between channels unless special techniques are applied [6], [7]. Incremental ΣΔ (IΣΔ) ADCs, on the contrary, reset their modulators and digital filters after each conversion, and thus are well suited to processing multiplexed signals. The first-order IΣΔ ADC [8] can achieve high resolution, but at the cost of a very long conversion time, which results in poor power efficiency. Higher-order and multi-bit IΣΔ ADCs have been developed to speed up the conversion rate of IΣΔ ADCs [9]–[12]. Alternatively, ADC architectures that combine the ΣΔ with a Nyquist ADC, such as extended counting (EC) [13] or extended range (ER) ADCs [14], have been proposed to effectively improve the resolution of IΣΔ ADCs.

IΣΔ ADCs developed up to now almost exclusively employ discrete-time (DT) loop filters. So far, the only continuous-time (CT) IΣΔ prototypes in the literature are the recently reported first-order CT IΣΔ ADCs [15], [16]. IΣΔ ADCs implemented with CT loop filters have relaxed settling and bandwidth requirements on the active blocks compared with their DT counterparts, which leads to a potential power reduction. This advantage still holds even with a sample-and-hold (S/H) preceding the ADC, as the loop filter processes each input in a continuous fashion. The front-end S/H, however, would induce considerable power and noise penalties. Therefore, implementations without a front-end S/H are preferred.

In this paper, we take a step further by proposing a more power-efficient CT IΣΔ alternative for high-resolution multi-channel A/D conversion. The proposed architecture combines the speed of pipelined ADCs and the resolution of ΣΔ ADCs. By pipelining CT IΣΔ stages, high resolution can be achieved without sacrificing the conversion rate [18]. This feature would be important in next-generation neural recording systems, in which high-density MEAs will scale to thousands of electrodes [19]. Compared to higher-order single-loop IΣΔ ADCs, the 2nd-order loop filter in each stage requires less coefficient scaling and is thus more power-efficient. Compared to EC/ER architectures, which require extra cycles for EC/ER conversion, this architecture can operate faster when achieving the same resolution. The paper is organized as follows. Section II describes design considerations of the target neural recording system and derives ADC specifications. Section III presents the operation principle and design methodology of the proposed CT IΣΔ ADC. The circuit implementation of the ADC is detailed in Section IV. Measurement results of the prototype ADC and a comparison with the state of the art are given in Section V. Finally, Section VI concludes the paper.

This work was supported by Swedish Research Council (VR).

The authors are with KTH Royal Institute of Technology, School of Information and Communication Technology, SE-164 40 Kista, Sweden (Email: {tao,arusu}@kth.se).
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS—I: REGULAR PAPERS.

Fig. 1: Simplified architecture of a multi-channel neural recording system based on a high-resolution ADC.

II. SYSTEM ARCHITECTURE AND ADC SPECIFICATIONS

The electrocorticography (ECoG) signal, which records neuron activity from the surface of the brain [1], is the target neural signal. ECoG-based BCI systems are more invasive than traditional electroencephalography (EEG)-based systems, and thus can provide better spatial resolution and record higher-frequency content of the signal. Clinical studies using ECoG recordings in humans show that the functional activation of a cortex is associated with an increase in power in the high-gamma frequency range (60-200 Hz) [20]. It has also been shown that at least a 64-electrode setup is needed for achieving useful and reliable results [21].

The simplified architecture of a multi-channel ECoG-based neural recording system is shown in Fig. 1, where the proposed two-step CT $\Sigma\Delta$ ADC is shared among multiple recording channels. In $\Sigma\Delta$ ADCs, to achieve ideal behavior, the input signal is held constant for the entire conversion, which requires an S/H preceding the ADC. In the proposed system architecture, the S/H is removed from the front-end circuitry. As illustrated later, removing the S/H alters the ideal signal transfer function of the loop filter and marginally attenuates the signal amplitude. This might be a good trade-off since it eliminates the need for implementing both a high-resolution S/H and a low-noise high-order anti-aliasing filter. By this means, further reduction in both power and area can be achieved for the entire integrated neural recording system.

In a multi-channel system, signals from all channels should be fed into a single ADC to save chip area. However, this demands a high sampling rate ADC, which leads to excessive power consumption. A practical approach is to find the optimal number of channels per ADC so as to obtain a better trade-off between power and area. For the targeted 64-channel system, 16 channels are shared by one ADC, to achieve the minimum power-area product for the entire system [22]. Without utilizing a variable-gain amplifier as in many conventional ECoG systems, such as [23], an 80 dB dynamic range (DR) is typically required for the ADC to resolve ECoG signals [24]. So, the proposed ADC should achieve a 13-14 bit resolution and handle inputs from 16 channels with 60-200 Hz per channel.
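The link between the 80 dB dynamic range and the 13-14 bit target can be checked with the usual full-scale sine-wave relation between a signal-to-noise figure in dB and an effective number of bits. The sketch below assumes that relation applies here; the function name is illustrative only.

```python
def bits_from_db(ratio_db: float) -> float:
    """Effective number of bits for a full-scale sine wave: (dB - 1.76) / 6.02."""
    return (ratio_db - 1.76) / 6.02


dr_db = 80.0                      # dynamic range quoted in Section II
bits = bits_from_db(dr_db)
print(f"{dr_db:.0f} dB corresponds to about {bits:.1f} bits")
```

Under this assumption an 80 dB dynamic range sits at the lower end of the 13-14 bit range, which leaves the upper end as margin.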

III. ADC ARCHITECTURE

A. Operation Principle

The block diagram of the proposed two-step CT $\Sigma\Delta$ ADC and the associated timing diagram are shown in Fig. 2. At the beginning of each conversion, one of the input channels is selected by the MUX. The two CT $\Sigma\Delta$ ADCs are then reset and process the sample in a pipelined fashion: stage 1 for coarse conversion and stage 2 for fine conversion. The CT $\Sigma\Delta$ modulator in stage 1 processes the input sample, $U_x$, with an oversampling clock, $f_{OS} = M f_s$, where $M$ is the number of clock cycles per conversion and $f_s$ is the frequency of the periodic reset. After $M$ cycles, the most significant bits (MSBs), $N_1$, are extracted by the digital filter. The modulator and digital filter are then reset, and an analog residue $V_{res}$ is captured by the S/H. Note that $V_{res}$ fits directly within the input range of the second-stage ADC, so this architecture does not require the inter-stage amplification needed in conventional pipelined ADCs. The sampled residue voltage, $U_2$, is passed to the CT $\Sigma\Delta$ modulator in stage 2, and oversampled again with $f_{OS}$. After $M$ cycles, the least significant bits (LSBs), $N_2$, are extracted, and the modulator and digital filter are reset. By digitally combining the MSBs and LSBs from the two conversion stages, a resolution of $N_1 + N_2$ can ideally be achieved. Due to the two-step pipelined operation, the coarse conversion can process the next sample immediately after the fine conversion starts. Therefore, the effective conversion rate is determined by the conversion time of only one stage while the conversion resolution is doubled.

As shown in Fig. 3, for each 2nd-order $\Sigma\Delta$ modulator, the cascaded integrators in feed-forward (CIFF) configuration with input signal feed-forward (IFF) topology is used to reduce the signal swing in the integrators and minimize the performance degradation due to coefficient variations. The use of a single-bit quantizer minimizes the digital filter complexity and avoids the need for linearization techniques in the feedback digital-to-analog converter (DAC). For the feedback DAC, a non-return-to-zero (NRZ) scheme is employed considering the best trade-off between jitter sensitivity and power consumption [25].

B. Design Methodology

Given the specifications described in Section II, the system-level design of the proposed ADC follows these steps.

1) Effective conversion rate: In this work, the multi-channel system consists of 16 channels, and each channel carries a signal bandwidth of 200 Hz. Due to time multiplexing, the effective sampling rate for each channel is given by the ADC's conversion rate divided by the number of channels. By selecting a sampling rate that is slightly higher than the Nyquist rate, the maximum time slot devoted to each channel is: $T_{S,max} = 1/(200 \text{ Hz} \times 2.5 \times 16) = 125\ \mu\text{s}$. This corresponds to an effective conversion rate of $f_S = 8 \text{ kHz}$.
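The timing budget of step 1 can be reproduced with a few lines of arithmetic. The sketch below uses the numbers stated in the text; the value $M = 40$ is the cycle count chosen later in Section IV, and the constant names are illustrative.

```python
SIGNAL_BW_HZ = 200.0   # per-channel signal bandwidth (Section II)
RATE_FACTOR = 2.5      # per-channel sampling rate = 2.5 x bandwidth (1.25 x Nyquist)
N_CHANNELS = 16        # channels multiplexed onto one ADC
M_CYCLES = 40          # clock cycles per conversion (Section IV)

f_s = SIGNAL_BW_HZ * RATE_FACTOR * N_CHANNELS   # conversions per second
t_slot = 1.0 / f_s                              # time slot per channel sample
f_os = M_CYCLES * f_s                           # oversampling clock, f_OS = M * f_s

print(f"f_s = {f_s / 1e3:.0f} kHz, slot = {t_slot * 1e6:.0f} us, f_OS = {f_os / 1e3:.0f} kHz")
```

The resulting 8 kHz conversion rate also explains the 4 kHz ADC bandwidth quoted in the measurement results: it is the Nyquist band of the multiplexed stream.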

2) Resolution of each step: In traditional pipelined ADCs, a range overlap is usually introduced between adjacent conversion stages to make sure that the residue voltage will not exceed the input range of the following stage. In an IΣΔ modulator based conversion stage, on the other hand, the output of the last integrator, where the residue voltage is sampled, is theoretically bounded assuming constant input [26]. In addition, even considering the circuit non-idealities, the voltage swing at the last integrator's output can still be limited within a practical bound (i.e., $\pm V_{ref}$) by applying proper coefficient scaling. Therefore, range overlap is not necessary for the proposed ADC architecture. As the two-step ADC targets a resolution of 14 bits, 8 bits are allocated to each conversion step to account for circuit non-idealities.

3) Time-domain analysis: Assuming that each conversion requires $n$ cycles, the two integrators' outputs $V_{x1}(t)$ and $V_{x2}(t)$ can be expressed in the time domain as [9]:

$$V_{x1}(n) = c_1 b_1 n V_i - c_1 a_1 V_{ref} \sum_{i=0}^{n-1} D_v(i)$$

(1)

$$V_{x2}(n) = c_2 c_1 b_1 \frac{n(n-1)}{2} V_i - c_2 c_1 a_1 V_{ref} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i)$$

(2)

where $V_i$ is the input from one channel that is assumed to change very slowly during each conversion; $V_{x1}(n)$ and $V_{x2}(n)$ are the analog voltages at the two integrators' outputs after $n$ clock cycles; $D_v(i) = \pm 1$ is the modulator output in the $i^{th}$ cycle. Note that through coefficient scaling, $V_{x2}(n)$ can be bounded between the DAC references as:

$$-V_{ref} \leq c_2 c_1 b_1 \frac{n(n-1)}{2} V_i - c_2 c_1 a_1 V_{ref} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i) \leq V_{ref}$$

(3)

Dividing (3) by $c_2 c_1 b_1 \frac{n(n-1)}{2}$ gives:

$$-\frac{V_{ref}}{c_2 c_1 b_1 \frac{n(n-1)}{2}} \leq V_i - \frac{a_1}{b_1 \frac{n(n-1)}{2}} V_{ref} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i) \leq +\frac{V_{ref}}{c_2 c_1 b_1 \frac{n(n-1)}{2}}$$

(4)
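The closed-form expressions for the two integrator outputs can be checked against a cycle-by-cycle behavioral model. The sketch below is a discrete-time approximation of one 2nd-order stage with delaying integrators and a 1-bit quantizer on the feed-forward sum; it is not the authors' continuous-time model, and every coefficient value (including the feed-forward weights `d1`, `d2`) is a placeholder rather than a value from Table I.

```python
def run_modulator(v_in, n_cycles, v_ref=1.0,
                  a1=1.0, b1=1.0, c1=1.0, c2=1.0, d1=2.0, d2=1.0):
    """Behavioral model of one 2nd-order incremental stage (placeholder coefficients).

    Returns the bitstream and both integrator states after n_cycles.
    """
    x1 = x2 = 0.0                       # both integrators start from reset
    bits = []
    for _ in range(n_cycles):
        # 1-bit quantizer on the feed-forward sum (input path has weight 1)
        d = 1 if (v_in + d1 * x1 + d2 * x2) >= 0 else -1
        bits.append(d)
        x2 += c2 * x1                   # second integrator accumulates the old x1
        x1 += c1 * (b1 * v_in - a1 * v_ref * d)
    return bits, x1, x2


def closed_form(bits, v_in, v_ref=1.0, a1=1.0, b1=1.0, c1=1.0, c2=1.0):
    """Evaluate the two integrator outputs from the recorded bitstream."""
    n = len(bits)
    single = sum(bits)                              # sum over i of D(i)
    double = sum(sum(bits[:j]) for j in range(n))   # sum over j of sum_{i<j} D(i)
    vx1 = c1 * b1 * n * v_in - c1 * a1 * v_ref * single
    vx2 = c2 * c1 * b1 * n * (n - 1) / 2 * v_in - c2 * c1 * a1 * v_ref * double
    return vx1, vx2


bits, x1, x2 = run_modulator(v_in=0.3, n_cycles=40)
vx1, vx2 = closed_form(bits, v_in=0.3)
print(abs(x1 - vx1), abs(x2 - vx2))     # both differences are at rounding level
```

The agreement holds for any bitstream, since it follows from the integrator recursions alone. Whether $V_{x2}$ stays within $\pm V_{ref}$ depends on the coefficient scaling, which this sketch does not attempt to reproduce.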

4) Required number of clock cycles: The middle term of (4) denotes the difference between the input $V_i$ and the measurable modulator output $D_v$ after $n$ clock cycles. It is then possible to estimate the input as:

$$\hat{V}_i = \frac{a_1}{b_1 \frac{n(n-1)}{2}} V_{ref} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i)$$

(5)

The difference between $V_i$ and $\hat{V}_i$ is the quantization error, which is bounded within $\pm \frac{1}{2} V_{LSB}$ in an ideal ADC. According to (4), $V_{LSB}$ can be expressed as:

$$V_{LSB} = 2 \times \frac{V_{ref}}{c_2 c_1 b_1 \frac{n(n-1)}{2}}$$

(6)

After $n$ clock cycles, the conversion resolution is given by:

$$n_{bit} = \log_2 \left( \frac{2 \times V_{i,max}}{V_{LSB}} \right) = \log_2 \left( \frac{u_{max} V_{ref}\, c_2 c_1 b_1 \frac{n(n-1)}{2}}{V_{ref}} \right) = \log_2 \left( n(n-1) \right) + \log_2 \left( c_2 c_1 b_1 \right) + \log_2 \left( \frac{u_{max}}{2} \right)$$

(7)

where $u_{max} = \frac{V_{i,max}}{V_{ref}}$ is the normalized maximum input, which limits the peak input amplitude to a fraction of the DAC reference, $V_{ref}$. Accordingly, approximating $n(n-1) \approx n^2$ in (7), the number of clock cycles, $M$, for achieving the desired resolution, $n_{bit}$, can be found as:

$$M \approx 2^{\frac{n_{bit} - \log_2 \left( b_1 c_1 c_2 \right) - \log_2 \left( 0.5 u_{max} \right) }{2}}$$

(8)
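The relation between resolution and cycle count can be explored numerically. The sketch below evaluates the resolution expression $n_{bit} = \log_2\left(u_{max}\, c_2 c_1 b_1\, n(n-1)/2\right)$ and inverts it for the cycle count using the approximation $n(n-1) \approx n^2$. The coefficient product and $u_{max}$ are hypothetical values chosen only for illustration; the actual values belong to Table I.

```python
import math


def resolution_bits(n, coeff_product, u_max):
    """Resolution after n cycles: log2(u_max * c2*c1*b1 * n*(n-1)/2)."""
    return math.log2(u_max * coeff_product * n * (n - 1) / 2)


def cycles_needed(n_bit, coeff_product, u_max):
    """Approximate inverse of resolution_bits, using n*(n-1) ~ n**2."""
    exponent = (n_bit - math.log2(coeff_product) - math.log2(0.5 * u_max)) / 2
    return 2 ** exponent


COEFF_PRODUCT = 0.5   # hypothetical b1*c1*c2, not taken from Table I
U_MAX = 0.8           # hypothetical normalized maximum input

bits_at_40 = resolution_bits(40, COEFF_PRODUCT, U_MAX)
m_estimate = cycles_needed(bits_at_40, COEFF_PRODUCT, U_MAX)
print(f"{bits_at_40:.2f} bits at 40 cycles, inverse gives M ~ {m_estimate:.1f}")
```

Because the coefficient product is below one, its logarithm is negative, so stronger coefficient scaling increases the number of cycles needed for a given resolution. The inverse lands slightly below 40 because of the $n(n-1) \approx n^2$ approximation.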

5) Modulator design: For the 2nd-order modulator, the DT noise transfer function $NTF(z)$ is designed with Schreier's Toolbox [27]. The DT loop filter is derived as $LFB(z) = 1/NTF(z) - 1$. The signal transfer function of the CIFF+IFF topology is unity. However, removing the S/H block in front of an IΣΔ ADC results in a modified $STF$ as [28]:

$$STF(z) = \frac{1 + 2z^{-1} + 3z^{-2} + \cdots + Mz^{-(M-1)}}{M(M+1)/2}$$

(9)

The frequency response of the modified $STF(z)$ is plotted in Fig. 4 for $M = 40$. It shows that the signal is attenuated by approximately 2.55 dB at the edge of the signal band.
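The attenuation introduced by the modified STF can be estimated by evaluating the ramp-weighted sum on the unit circle. The sketch below assumes that $z$ is referred to the oversampling clock (320 kHz for $M = 40$ at an 8 kHz conversion rate) and that the band edge is 4 kHz; both are interpretations of the text rather than statements from it.

```python
import cmath
import math


def stf_magnitude_db(freq_hz, f_os_hz, m):
    """|STF| in dB for STF(z) = sum_{k=1}^{M} k * z^-(k-1) / (M(M+1)/2)."""
    w = 2 * math.pi * freq_hz / f_os_hz
    acc = sum(k * cmath.exp(-1j * w * (k - 1)) for k in range(1, m + 1))
    return 20 * math.log10(abs(acc) / (m * (m + 1) / 2))


M = 40
F_OS = 320e3                                  # assumed oversampling clock
att_dc = stf_magnitude_db(0.0, F_OS, M)       # unity gain at DC
att_edge = stf_magnitude_db(4e3, F_OS, M)     # assumed band edge: 4 kHz
print(f"DC: {att_dc:.2f} dB, band edge: {att_edge:.2f} dB")
```

Under these assumptions the band-edge attenuation comes out at roughly 2.5 dB, in line with the figure quoted for Fig. 4.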

6) CT loop filter coefficients: The CT loop filter is determined by the impulse-invariant transformation [29] as:

$$LF(z) = \mathcal{Z} \left\{ \mathcal{L}^{-1} \left[ LF(s) R_{DAC,NRZ}(s) \right] \Big|_{t=nT_{OS}} \right\}$$

(10)

Transient simulations are used to assure that the second integrator’s output is bounded between $\pm V_{ref}$. The resulting modulator’s coefficients, assuming a normalized sampling rate of 1 (i.e., $f_{OS} = 1$), are listed in Table I.

7) Residue voltage generation: In the proposed two-step architecture, the input of the fine conversion stage is the sampled residue voltage of the coarse conversion stage. As aforementioned, because of the CIFF+IFF loop filter topology and the incremental operation, the quantization error of the coarse conversion stage can be readily obtained in analog form. Following the analysis in steps 3) and 4), the relative quantization error (with respect to 1 LSB) can be found by:

$$q = \frac{\hat{V}_i - V_i}{V_{LSB}} = \frac{c_2 c_1 a_1}{2} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i) - \frac{c_2 c_1 b_1}{2} \frac{n(n-1)}{2} \frac{V_i}{V_{ref}}$$

(11)

By combining (11) and (2), the second integrator's output is determined as $V_{x2}(n) = -2V_{ref} \cdot q$. Then, at the end of each conversion, i.e., after $M$ clock cycles, the residue voltage can be found as $V_{res} = V_{x2}(M) = -2V_{ref} \cdot q$. So far, a constant input, $V_i$, has been assumed in deriving the equations. Removing this assumption does not affect the analog representation of the residue, $V_{res}$. A time-varying input, $V_i(i)$, however, leads to in-band amplitude attenuation, as illustrated in Fig. 4, which can be compensated in the digital domain if needed [14].
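The step from (11) and (2) to the residue can be written out explicitly. Multiplying (11) by $-2V_{ref}$ reproduces, term by term, the expression for the second integrator's output $V_{x2}(n)$ in (2):

$$
\begin{aligned}
-2V_{ref} \cdot q &= -2V_{ref} \left[ \frac{c_2 c_1 a_1}{2} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i) - \frac{c_2 c_1 b_1}{2} \frac{n(n-1)}{2} \frac{V_i}{V_{ref}} \right] \\
&= c_2 c_1 b_1 \frac{n(n-1)}{2} V_i - c_2 c_1 a_1 V_{ref} \sum_{j=0}^{n-1} \sum_{i=0}^{j-1} D_v(i) = V_{x2}(n)
\end{aligned}
$$

Since $|q| \leq \frac{1}{2}$ for an ideal converter, this also shows why the residue stays within $\pm V_{ref}$ and can be applied to the second stage without inter-stage gain.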

8) Digital filter design: The digital filter is designed to make sure that the quantization error of the ΣΔ ADC in each stage, given by (11), can be derived from the second integrator's output, as $-V_{x2}(M)/(2V_{ref})$, after $M$ clock cycles. Following the method in [17], the digital filter transfer function, $H_{DF}(z)$, can be derived as:

$$H_{DF}(z) = \left[ \frac{z^{-2}}{(1-z^{-1})^2} + \frac{1}{2} \frac{z^{-1}}{1-z^{-1}} \right] \frac{2}{M(M-1)}$$

(12)

9) Theoretical resolution: Considering only the quantization noise, the theoretical signal-to-quantization-noise-ratio (SQNR) of the two-step ADC can be estimated as:

$$\text{SQNR}_{2\text{step}} [\text{dB}] \approx 20 \log_{10} \left[ \frac{2V_{ref}}{u_{max} V_{ref}} \frac{1}{M(M-1)} \right] + 20 \log_{10} \left[ \frac{2V_{ref}}{V_{res,peak}} \frac{1}{M(M-1)} \right]$$

(13)

where $V_{res,peak}$ is the peak amplitude of the residue voltage. The theoretical SQNR estimated by (13) is compared against system-level simulation results, when the number of cycles in each conversion, $M$, is swept, as shown in Fig. 5. Good matching between the simulated and estimated SQNRs is achieved. For the sake of comparison, the theoretical results of the EC ΣΔ ADC [13], the ER ΣΔ ADC [14] (8 bits are used for the extended conversion in both cases) and the 3rd-order ΣΔ ADC [17] are included in this plot. As can be seen from the figure, the proposed two-step ADC requires the lowest $M$ to achieve a high theoretical resolution.

It is also interesting to compare the proposed ADC to a 2-2 MASH CT ΣΔ ADC, as both architectures have similar analog complexity and benefit from processing the residue of a 2nd-order ΣΔ stage. Under ideal conditions, the resolution of the proposed architecture is the sum of two 2nd-order ΣΔ ADCs, while the 2-2 MASH architecture achieves a resolution equivalent to 4th-order noise shaping. Thus, for a given $M$, the SQNR of the 2-2 MASH architecture is not as good as the one achieved by the proposed two-step ADC. In addition, the effectiveness of the noise shaping in MASH architectures relies on the perfect cancellation of the quantization noise of the first stage. This indicates that the non-idealities of all integrators in the first stage are equally important in determining the noise leakage to the overall output. As shown later, in the proposed architecture, only the non-idealities of the first integrator limit the overall performance.

10) Circuit non-idealities: Prior to circuit implementation, extensive simulations were performed at system-level in Matlab/Simulink and at behavioral-level in Cadence using Verilog-A models. Circuit non-idealities that are critical for the CT implementations have been examined to derive the specifications of circuit blocks and clock signals. Simulation results show that the 1st integrator in stage 1 limits the overall performance and the specifications for the 2nd integrator as well as the integrators in stage 2, e.g., circuit noise, amplifier’s gain, integrator’s coefficient variation, clock jitter and excess loop delay, are greatly relaxed. In addition, since both the noise and the accuracy of the inter-stage S/H block are set only by the DR of stage 2, the requirements for the inter-stage S/H block are also very relaxed.

IV. CIRCUIT DESIGN AND IMPLEMENTATION

The block diagram of the implemented two-step CT ΣΔ ADC is shown in Fig. 6. The two 2nd-order CT ΣΔ modulators and the S/H between the two stages have been implemented in a 0.18 μm CMOS process. To achieve the target resolution, $M = 40$ is chosen for the 4 kHz signal bandwidth. This corresponds to a 320 kHz oversampling clock frequency. Active-RC integrators are used to implement the loop filters for better linearity, larger signal swing and lower sensitivity to parasitics than their Gm-C counterparts.

Given the coefficients specified in Table I, the values of passive components are derived with respect to the oversampling frequency, \( f_{OS} \). In the proposed architecture, the first integrator in the first stage, Int\(_{11}\), is the most critical block that dominates the noise performance of the entire ADC. The circuit noise at Int\(_{11}\) has five contributors: the input resistance, \( R_{11} \), the DAC resistance, \( R_{dac} \), the reset switch resistance, \( R_{res} \) (\( kT/C \) noise), the input transconductance, \( g_m \), and the flicker noise of the Opamp. Considering that for a power-efficient design, the ADC's performance is limited by circuit noise rather than quantization noise [30], \( R_{11} \), \( R_{dac} \), and \( C_{11} \) can be determined. The passive values of Int\(_{21}\) are determined in a similar way with half of the resolution specification. Due to the noise-shaping characteristics of the loop filter, linearity and noise requirements on Int\(_{12}\) and Int\(_{22}\) are further relaxed with respect to the first integrator. \( R_{sh} \) and \( C_{sh} \) in the S/H are chosen considering the trade-off among sampling accuracy, settling time, thermal noise and loading condition.

### A. Opamps and OTAs

A power-efficient, low-noise Class-A/Class-AB amplifier, OPA\(_1\), is chosen for Int\(_{11}\), as shown in Fig. 7 (a). In the input stage, the PMOS differential pair \( M_1 \) and \( M_2 \) with very large gate area (\( W/L = 320 \mu m/4 \mu m \)) is used to achieve low flicker noise and reduced sensitivity to mismatch. Additionally, the thermal noise contribution from the current source transistors \( M_3 \) and \( M_4 \) is minimized by using source degeneration resistors \( R_{dsc} \). In the output stage, the Class-AB operation is achieved by dynamically biasing the output transistors \( M_5/M_6 \) and \( M_7/M_8 \) through the current mirrors \( M_9/M_{10} \) and \( M_{11}/M_{12} \). The peak transient current delivered from this output stage to the integrating capacitor \( C_{11} \) can be much higher than the DC biasing current. The common-mode voltage of the output stage can be sensed and stabilized by the common-mode feedback (CMFB) shown in Fig. 8 (a) [31]. Since the DC currents in the dynamic biasing current mirrors are set by the DC voltages at nodes 1 and 2, another dedicated feedback loop is usually desired for the input stage [31], [32]. Instead of employing two independent CMFB loops, which may potentially lead to instability, a simple CMFB structure is used here to set the common-mode voltages for the input stage. As shown in Fig. 7 (a), two very large resistors, \( R \), are used to sense and average the DC voltages at nodes 1 and 2, and then feed the result back to the gates of \( M_5 \) and \( M_4 \). The OPA\(_2\) used in Int\(_{21}\) is implemented as a two-stage Class-A Opamp, as shown in Fig. 7 (b), and its CMFB circuit is shown in Fig. 8 (b). The amplifiers employed in the second integrators (Int\(_{12}\) and Int\(_{22}\)) in the loop filters have relaxed swing, GBW, slew rate and loading requirements. They have been implemented with a current-mirror OTA, as shown in Fig. 7 (c). An NMOS input differential pair is adopted to obtain better \( g_m \) efficiency and consequently less current consumption. To avoid loading the OTA resistively, two differential pairs are used to sense the output voltages, as shown in Fig. 8 (c). Linearity of these differential pairs and hence the output swing of the OTA is enhanced by the linearization transistors \( M_{R1} - M_{R4} \) acting as source degeneration resistances [33].

### B. Summation, Quantizer, and Feedback DAC

State-of-the-art designs of CT CIFF loop filters usually employ a dedicated summing amplifier to perform the weighted addition of feed-forward coefficients [31]. An alternative solution is to integrate the weighted addition into the last integrator [34]. This solution, however, imposes tougher requirements (e.g., larger output swing) on the last integrator. For the two-stage architecture, in particular, reusing the last integrator for summation is not desired, as its outputs need to be directly accessed by an inter-stage S/H. In order to save power and area, a dynamic summing comparator is designed to perform both the 1-bit quantization and the weighted addition, as shown in Fig. 9 (a). In this design, the weighted addition is realized with virtually no additional power and area by integrating it into the comparator [36]. The dynamic current consumption is proportional to the clock rate. The weights in the current addition are related to the feed-forward coefficients $d_1$, $d_2$ and 1. When the input transistors are designed with the same length, these coefficients can be realized by sizing the widths of these input transistors.

Fig. 6: Simplified circuit block diagram of the implemented two-step CT ΣΔ ADC.

The 1-bit NRZ DAC, as shown in Fig. 9 (b), consists of two sets of complementary switches and feedback resistors. The positive and negative DAC references $V_{dac+}$ and $V_{dac-}$ are switched by the differential outputs of the quantizer, $Q$ and $\overline{Q}$. In addition, dummy transistors, $M_{ND1}$, $M_{PD1}$, $M_{ND2}$, and $M_{PD2}$, are placed alongside the switch transistors to reduce the glitches induced at the switching instants. Outputs of the feedback resistances $R_{dac}$ are connected to the virtual ground nodes of the first integrating Opamp.

### C. Digital Filter and Combination Logic

As shown in Fig. 2, the two IΣΔ modulators' output bit-streams, \( V_1 \) and \( V_2 \), are digitally filtered before the weighted combination. The ideal transfer function of the digital filter, i.e., (12) in Section III-B, is derived as a matched filter that realizes the digital filter as the exact replica of the analog loop filter. This digital filter is the sum of a cascade of integrators that processes M = 40 samples coming from one of the IΣΔ modulators. It can be treated as an M-length finite impulse response (FIR) filter with appropriate filter coefficients [26]. These coefficients can be obtained by computing the M-length impulse response of the transfer function \( H_{DF}(z) \). The final outputs of the digitized results, \( D_1 \) and \( D_2 \), are the weighted sums of the \( V_1 \) and \( V_2 \) samples with a decimation ratio of \( M \). The digital combination logic in the two-step ADC is expressed as: \( D_{out,2step} = (D_1 \times 2^8 + D_2)/2^8 \).
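The impulse response of \( H_{DF}(z) \) and the combination logic can be sketched as follows. The tap computation follows directly from (12): the delayed double integrator contributes the ramp 0, 0, 1, 2, ..., and the delayed single integrator contributes a half-weight step. The function names are illustrative, and the alignment of the taps with the decimation instant is left open because the text does not specify it.

```python
def digital_filter_taps(m):
    """First m samples of the impulse response of H_DF(z) in (12).

    z^-2/(1-z^-1)^2 contributes 0, 0, 1, 2, 3, ... and
    (1/2) z^-1/(1-z^-1) contributes 0, 1/2, 1/2, ...;
    both are scaled by 2/(M(M-1)).
    """
    scale = 2.0 / (m * (m - 1))
    taps = []
    for k in range(m):
        double_int = max(k - 1, 0)            # delayed double integrator
        single_int = 0.5 if k >= 1 else 0.0   # delayed single integrator, weight 1/2
        taps.append(scale * (double_int + single_int))
    return taps


def combine_two_step(d1, d2, fine_bits=8):
    """Digital combination D_out = (D1 * 2^8 + D2) / 2^8 for an 8-bit fine step."""
    return (d1 * 2 ** fine_bits + d2) / 2 ** fine_bits


taps = digital_filter_taps(40)
print(taps[:4])
print(combine_two_step(3, 128))   # coarse code 3 plus half a coarse step
```

In this form the fine result \( D_2 \) simply occupies the eight binary places below the coarse result \( D_1 \), which is why no range overlap or digital correction logic appears in the combination.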

The presence of circuit non-idealities would induce mismatch between the analog and digital transfer functions. This makes the residue, \( V_{res} \), no longer an accurate representation of the quantization error in the coarse conversion stage. The ADC's sensitivity to the integrator's coefficient variation and the amplifier's finite GBW has been evaluated by behavioral simulations, as shown in Fig. 10 (a) and Fig. 10 (b). As can be seen in Fig. 10, without any calibration, the signal-to-noise and distortion ratio (SNDR) of the two-step ADC is sensitive to both the coefficient variation and the finite GBW in the coarse conversion stage (stage 1). Therefore, an optimal digital filter is required in stage 1 to take better advantage of the quantization error refinement of the proposed architecture. In order to maximize the SNDR of the two-step ADC, the built-in Matlab optimization algorithm fmincon [37] is employed to find the optimal coefficients of the FIR filter. The optimization algorithm searches for a constrained minimum of an objective function of multiple variables, starting from an initial estimate. In particular, by using the coefficients calculated from (12) as the initial estimate, and the coefficients of the M-length FIR filter as the variables, this algorithm aims to minimize the inverse of the SNDR improvement of the optimized filter coefficients with respect to the original ones [17]. This digital filter optimization is a reference-free method that can operate directly on the recorded modulator output stream. It can be run at each stage of the design phase to obtain a set of optimized filter coefficients that takes circuit non-idealities into account. Implementation details and the benchmark of such an FIR digital filter can be found in [38]. As shown in Fig. 10, after applying the optimal filter, the requirements on both the integrator's coefficient variation and the amplifier's finite GBW are much relaxed for stage 1.
Transistor-level and post-layout transient noise simulations show that the two-step CT IΣΔ ADC achieves SNDRs of 84.58 dB and 79.08 dB, respectively. In addition, Monte-Carlo simulations show that the SNDR is not affected significantly by process variation. Simulation results also reveal that when the analog supply goes below 1.1 V, the SNDR starts to degrade. In addition, the simulated SNDR is almost constant over the commercial temperature range (0 to 70 °C).

V. EXPERIMENTAL RESULTS

The proposed two-step CT IΣΔ ADC was fabricated in a standard 0.18 µm CMOS process. Fig. 11 shows the chip micrograph. The active area of the prototype, excluding the bonding pads and I/O drivers, is approximately 0.337 mm². The digital filters, as well as the digital combination of MSBs and LSBs, described in Section IV-C, were implemented in Matlab. Bit-stream samples from the two IΣΔ modulators are captured and fed into the optimization algorithm to get a specific set of filter coefficients that are optimal for the fabricated chip sample. The core circuit is powered by a 1.2 V analog supply and a 1.8 V digital supply with separate grounds. For further noise reduction, decoupling capacitors are placed between power supplies and grounds in the unused chip area. The prototype chip is assembled in a 44-pin plastic leaded chip carrier (PLCC) package and mounted on a customized evaluation board (EVB), as shown in Fig. 12.

A. Measurement Setup

Apart from the device under test (DUT) ADC chip, the EVB features mainly signal conditioning and voltage/current biasing circuitries. The input test signal is brought on board through an SMA connector and then passed through a single-ended to differential conversion (SE-DIFF) buffer. A single-pole RC filter is placed between the buffer outputs and the ADC inputs to reduce the noise contribution due to the driving circuit. To attenuate voltage ripple and noise from the external power supplies, low-dropout regulators (LDOs) are used on the EVB to generate various power supplies and voltage references required for the chip operation. The bias currents generated on board are derived from one of the power supplies using potentiometers and series resistors.

Fig. 12: Photograph of the customized evaluation board.

Fig. 13: Simplified block diagram of the measurement setup.

The measurement setup is depicted in Fig. 13. A sinusoidal input with a frequency within the bandwidth of one channel (approximately 200 Hz) is used as the test signal, and the performance is evaluated over the ADC's bandwidth, which covers 16 channels. An ultra-low distortion function generator (Stanford Research DS360) is used to drive the test EVB. The required synchronized signals for the chip operation are generated by a data pattern generator (Sony/Tektronix DG2020A). As shown in Fig. 13, the generated signals are used as S/H clock (sh), reset signals (rst1, rst2), and oversampling clocks (clk1, clk2) for the two conversion steps. The digital output data streams, i.e., the modulator outputs (v1, v2) and reset signals (rst1, rst2), are captured by a logic analyzer (Tektronix TLA621). These streams are then imported into Matlab, where they are processed by the digital filters described in Section IV-C. A Blackman-Harris window is applied and a fast Fourier transform (FFT) is performed to compute the performance metrics.
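The windowed-spectrum step can be illustrated with a minimal sketch. It is not the authors' Matlab script: the 4-term Blackman-Harris coefficients, the number of bins assigned to the signal and to DC, and the synthetic test tone are all assumptions made here for illustration. A plain DFT is used instead of an FFT so that the example needs only the standard library.

```python
import cmath
import math

# ASSUMPTION: 4-term minimum-sidelobe Blackman-Harris coefficients.
# The paper does not state which Blackman-Harris variant was used.
BH4 = (0.35875, 0.48829, 0.14128, 0.01168)


def blackman_harris(n_points):
    """Periodic 4-term Blackman-Harris window of length n_points."""
    window = []
    for n in range(n_points):
        phase = 2.0 * math.pi * n / n_points
        window.append(BH4[0] - BH4[1] * math.cos(phase)
                      + BH4[2] * math.cos(2.0 * phase)
                      - BH4[3] * math.cos(3.0 * phase))
    return window


def power_spectrum(samples):
    """One-sided power spectrum (bins 0..N/2) of the windowed samples."""
    n_points = len(samples)
    window = blackman_harris(n_points)
    windowed = [s * w for s, w in zip(samples, window)]
    spectrum = []
    for k in range(n_points // 2 + 1):
        acc = 0j
        for n, value in enumerate(windowed):
            # Reduce k*n modulo N to keep the twiddle phase accurate.
            angle = -2.0 * math.pi * ((k * n) % n_points) / n_points
            acc += value * cmath.exp(1j * angle)
        spectrum.append(abs(acc) ** 2)
    return spectrum


def sndr_db(samples, signal_bin, lobe_bins=4, dc_bins=4):
    """SNDR in dB: power in the signal main lobe over all other bins.

    lobe_bins and dc_bins are hypothetical bookkeeping choices.
    """
    spectrum = power_spectrum(samples)
    low, high = signal_bin - lobe_bins, signal_bin + lobe_bins
    signal_power = sum(spectrum[low:high + 1])
    noise_power = sum(p for k, p in enumerate(spectrum)
                      if k > dc_bins and not (low <= k <= high))
    return 10.0 * math.log10(signal_power / noise_power)


# Synthetic check (not measured data): a unit tone at bin 23 plus a
# spur 60 dB below it at bin 101, coherently sampled over 512 points.
N_POINTS = 512
test_signal = [math.sin(2.0 * math.pi * 23 * n / N_POINTS)
               + 1e-3 * math.sin(2.0 * math.pi * 101 * n / N_POINTS)
               for n in range(N_POINTS)]
estimated_sndr = sndr_db(test_signal, signal_bin=23)
print(f"estimated SNDR: {estimated_sndr:.2f} dB")
```

Because the window spreads each tone over a few adjacent bins, the signal power is summed over the whole main lobe rather than read from a single bin. With measured data, `test_signal` would be replaced by the decimated ADC output samples.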

B. Measurement Results

The measured static power consumption of the ADC, excluding the output drivers, is 34.8 µW. According to post-layout simulations, 15.1 µA is consumed in Int11, 2.5 µA in Int12, 3.2 µA in the S/H, 3.1 µA in Int21, 2.0 µA in Int22, and 3.1 µA in the biasing circuits.
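As a quick consistency check, the simulated current breakdown can be summed and converted to power. The sketch below assumes that every listed block is fed from the 1.2 V analog supply, which the text does not state explicitly for each block.

```python
# Block currents in microamperes, in the order they are listed in the text
# (post-layout simulation values quoted from the paper).
block_currents_ua = [15.1, 2.5, 3.2, 3.1, 2.0, 3.1]

# ASSUMPTION: every listed block draws its current from the 1.2 V analog supply.
ANALOG_SUPPLY_V = 1.2

total_current_ua = sum(block_currents_ua)
static_power_uw = total_current_ua * ANALOG_SUPPLY_V

print(f"total current: {total_current_ua:.1f} uA")
print(f"static power:  {static_power_uw:.1f} uW")
```

Under this assumption the breakdown adds up to 29.0 µA, i.e. 34.8 µW, which agrees with the measured static power.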

The measured output spectrum for a −3.2 dBFS sinusoidal input at 174.4 Hz, where 0 dBFS refers to approximately 1.0 Vpp, is shown in Fig. 14. From the power spectral density plot, a spurious-free dynamic range (SFDR) of 88.1 dB has been measured. Fig. 15 presents the measured SNR and SNDR versus the input signal amplitude, demonstrating a peak SNR of 76.6 dB and a peak SNDR of 75.9 dB for an input at −3.2 dBFS, and a dynamic range of 85.5 dB. Fig. 16 (a) shows the measured peak SNRs and SNDRs for various in-band test frequencies. When the test signal frequency is increased to one-third of the ADC bandwidth, there is approximately 2 dB degradation in the SNR/SNDR performance. This is mainly due to the attenuation of the signal amplitude when the S/H in front of the ΣΔ ADC is removed, as explained in Section III-B. The attenuation is negligible when the input signal is slow relative to the ADC's conversion rate, which is the case in the target system, where the signal from one channel has a bandwidth of 60-200 Hz while the ADC's conversion rate is 8 kHz. Fig. 16 (b) presents the performance variation among all five samples available for measurement. The measured samples show consistent SNR/SNDR performance. It is worth mentioning that although tunable capacitive arrays have been implemented for C11 and
This work improves the FOM of CT IΣ∆ ADCs by a decade and achieves one of the lowest FOMs among the state-of-the-art IΣ∆ ADCs, the exceptions being the DT multi-bit architecture applying a Smart-DEM algorithm [12] and the DT zoom-ADC architecture targeting a very low speed sensor application [39]. It is worth mentioning that further FOM improvement of the proposed ADC architecture can be achieved by applying different enhancement techniques. From the architecture point of view, the two-step CT IΣ∆ ADC can be extended to a multi-stage pipeline architecture. In this case, the number of cycles in each conversion and the modulator order in each stage can be adjusted and optimized according to the signal bandwidth and resolution requirements.
From the circuit implementation perspective, state-of-the-art power-efficient circuit techniques, such as inverter-based integrators [39], can be employed to take better advantage of the relaxed circuit specifications in the last pipeline stages.

<table>
<thead>
<tr>
<th>Year</th>
<th>Architecture</th>
<th>Implementation</th>
<th>Conversion Rate (kS/s)</th>
<th>SNDR (dB)</th>
<th>Power (µW)</th>
<th>VDD (V)</th>
<th>Technology (µm)</th>
<th>FOMWalden (pJ/conv)</th>
</tr>
</thead>
<tbody>
<tr>
<td>2014</td>
<td>Two-step ADC</td>
<td>CT</td>
<td>8</td>
<td>75.9</td>
<td>34.8</td>
<td>1.2/1.8</td>
<td>0.18</td>
<td>0.85</td>
</tr>
<tr>
<td>2010</td>
<td>1st-order Σ∆</td>
<td>CT</td>
<td>0.5</td>
<td>58.95</td>
<td>20</td>
<td>1.6</td>
<td>0.5</td>
<td>55.2</td>
</tr>
<tr>
<td>2013</td>
<td>3rd-order Σ∆</td>
<td>CT</td>
<td>4</td>
<td>60</td>
<td>96</td>
<td>1.8</td>
<td>0.15</td>
<td>18.5</td>
</tr>
<tr>
<td>2013</td>
<td>Zoom ADC</td>
<td>DT</td>
<td>0.025</td>
<td>119.8</td>
<td>6.3</td>
<td>1.8</td>
<td>0.16</td>
<td>0.31</td>
</tr>
<tr>
<td>2010</td>
<td>Multi-channel</td>
<td>DT</td>
<td>43.48</td>
<td>81.5</td>
<td>1.8</td>
<td>1.8</td>
<td>0.18</td>
<td>16.1</td>
</tr>
<tr>
<td>2013</td>
<td>Multibit Σ∆</td>
<td>DT</td>
<td>10</td>
<td>105</td>
<td>1.8</td>
<td>1.8</td>
<td>0.18</td>
<td>0.19</td>
</tr>
<tr>
<td>2010</td>
<td>ER (Σ∆+SAR)</td>
<td>DT</td>
<td>1000</td>
<td>84.7</td>
<td>1.8</td>
<td>3.3/1.8</td>
<td>1.8</td>
<td>1.98</td>
</tr>
<tr>
<td>2012</td>
<td>EC ADC</td>
<td>DT</td>
<td>1000</td>
<td>56</td>
<td></td>
<td></td>
<td></td>
<td></td>
</tr>
</tbody>
</table>

* SNDR measurement is not available in [39]. Instead of SNDR, a derived SNR, \(20\log_{10}(\text{Max DC Input}/2\sqrt{2}/\text{Output Noise})\), is used in the FOM calculation.
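The FOM column can be reproduced from the other columns with a short sketch. It assumes the conventional Walden definition, FOM = P / (2^ENOB · f_conv) with ENOB = (SNDR − 1.76)/6.02, which this section names in the table header but does not write out. The function names are hypothetical.

```python
def enob_from_sndr(sndr_db):
    """Effective number of bits from SNDR in dB (conventional definition)."""
    return (sndr_db - 1.76) / 6.02


def walden_fom_pj(power_uw, sndr_db, rate_ksps):
    """Walden figure-of-merit in pJ per conversion step.

    ASSUMPTION: FOM = P / (2**ENOB * f_conv), the usual Walden form.
    """
    enob = enob_from_sndr(sndr_db)
    joules_per_step = (power_uw * 1e-6) / ((2.0 ** enob) * rate_ksps * 1e3)
    return joules_per_step * 1e12


# Row for this work: 34.8 uW, 75.9 dB SNDR, 8 kS/s.
fom_this_work = walden_fom_pj(power_uw=34.8, sndr_db=75.9, rate_ksps=8)

# Row for the 2010 first-order CT design: 20 uW, 58.95 dB, 0.5 kS/s.
fom_first_order = walden_fom_pj(power_uw=20, sndr_db=58.95, rate_ksps=0.5)

print(f"this work:      {fom_this_work:.2f} pJ/conv")    # about 0.85
print(f"1st-order 2010: {fom_first_order:.1f} pJ/conv")  # about 55.2
```

Since the FOM is energy per conversion step, a lower value indicates a more power-efficient converter.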

VI. CONCLUSION

A power-efficient incremental Σ∆ ADC using continuous-time implementation has been presented. To provide a flexible and power-efficient solution for the A/D conversion required in neural recording systems, a two-step ADC architecture, consisting of two second-order CT IΣ∆ ADCs in a pipeline configuration, has been proposed. It has been shown that the ADC prototype, fabricated in a standard 0.18 µm CMOS technology, achieves a peak SNDR of 75.9 dB and a dynamic range of 85.5 dB while consuming a static power of 34.8 µW. The proposed two-step CT IΣ∆ ADC provides inherent flexibility, which can be better exploited if it is generalized to an architecture composed of several conversion stages. For instance, the conversion rate can be adjusted by varying the number of cycles per conversion while maintaining the oversampling frequency; different conversion resolutions can be achieved by using different numbers of stages, as in pipelined ADCs. This makes the proposed ADC a promising solution for next-generation neural recording systems, where both high channel count and high resolution are demanded.

ACKNOWLEDGMENT

We would like to thank Saul Rodriguez, Martin Gustafsson and Julian Garcia for their support and advice on chip implementation and measurement. We also appreciate Håkan Bengtsson, Giti Amozandeh and Mikael Pettersson from Ericsson AB for valuable discussions and comments during design reviews. In addition, we want to acknowledge Agilent Technologies for supplying the necessary lab equipment.

REFERENCES


Sha Tao (S’10) received the B.S. degree in Electronic Engineering from Beijing Jiaotong University, China (2007). She received the M.S. degree in System-on-Chip Design (2009) and the Licentiate degree in Electronic and Computer Systems (2012) from KTH Royal Institute of Technology, Sweden, where she is currently working towards the Ph.D. degree in Information and Communication Technology. Her doctoral work focuses on power-efficient continuous-time sigma-delta ADCs.

Ana Rusu (M’92) received the M.Sc. degree in electronics and telecommunications from TUJ (1983) and Ph.D. degree in electronics from TUCN (1998), Romania. Since 2001, she has been with KTH Royal Institute of Technology, Stockholm, Sweden, where she is Professor at the School of ICT. Her research interests include low/ultra-low power high performance CMOS circuits and systems, RF graphene circuits and high temperature SiC circuits. She has participated in several national and international research projects and has authored or coauthored more than 100 international scientific publications in journals, conference proceedings, books and book chapters.