# A TIME-TO-FIRST SPIKE CMOS IMAGER

Xin Qi, Xiaochuan Guo, and John G. Harris

University of Florida, Department of Electrical and Computer Engineering, Gainesville, FL 32611

### ABSTRACT

A novel time-to-first-spike CMOS imager is presented, in which the timing of a *single spike* from each pixel encodes that pixel's illuminance. This temporal representation of illuminance can widen the image sensor dynamic range to over 100dB. To reduce power consumption, an asynchronous address-event readout technique is incorporated. The imager essentially implements a pixel-level A/D conversion and benefits from the continued process scaling to deep submicron levels. Results are shown from a $32 \times 32$ pixel imager fabricated in the $0.5\mu m$ AMI process, with a measured dynamic range of 104dB in one image. A test chip with one pixel and digital control logic has been fabricated through MOSIS using the TSMC $0.18\mu m$ standard digital CMOS process and has demonstrated the expected functionality. A modified architecture is also proposed for a larger imager array.

# 1. INTRODUCTION

Dynamic range (DR) is an important performance criterion of image sensors. DR is commonly defined as the ratio of the maximum measurable signal to the noise floor under dark conditions. Conventional CMOS image sensors are limited to 60-70dB of DR [1], while typical outdoor scenes have a DR of more than 100dB. To deal with such a wide range of illuminance, researchers have investigated several approaches. One way is to use the logarithmic response of a diode-connected MOS transistor [2]. This type of sensor usually suffers from worse fixed pattern noise (FPN) and a lower signal-to-noise ratio (SNR) than integration-mode image sensors [3]. Well capacity adjustment is another technique, which widens the DR by compressing the illuminance-to-charge curve [1]. However, this scheme also exhibits lower SNR [3]. So far, the most successful technique is multiple sampling, which uses shorter exposure times to capture the brighter part of the scene and longer exposure times to capture the darker regions [3]. This type of image sensor is very promising but fundamentally consumes more power and requires a large data bandwidth to read out the multiple frames. Unlike a conventional CMOS active pixel sensor (APS), a time-based image sensor represents illuminance information in the time domain. It allows each pixel to choose its own optimal integration time, thereby achieving high dynamic range and improved SNR as well. In previous time-based designs, each pixel crudely mimics the behavior of a neuron and works as a free-running continuous oscillator [4], [5]. Although the reconstruction methods differ, these designs all have to read out a large amount of redundant information, which implies more power consumption, larger required data bandwidth, and more frame memory. Another serious problem with these designs is that a long frame time may be needed to collect all the information required for scene recovery, which is not feasible for video mode applications.

Recently, biologists have claimed that the most useful visual information is contained in the first spike after onset of the neuronal response [6]. Inspired by this biological theory, we have proposed a novel time-to-first-spike (TTFS) imager [7]. To overcome the shortcomings of previous time-based designs, the TTFS imager transforms each pixel's illuminance into a pulse event that can *only occur once per pixel per frame*. Since only one spike is output for each pixel, the TTFS imager consumes less power, uses less bandwidth, and requires less frame memory than other time-based imagers. In Section 2, we will show that TTFS is also feasible for video mode applications by varying the reference voltage during frame capture.

The rest of the paper is organized as follows. We describe the principles of the TTFS imager in Section 2. The system architecture and test results from a prototype imager are shown in Section 3. Section 4 describes the $0.18 \mu m$ CMOS implementation. A modified TTFS structure that reduces collision problems is presented in Section 5.

### 2. PRINCIPLES

Acknowledgments to Julian Chen and Texas Instruments for funding and technical support during the early stages of this work.

Basically, the TTFS imager tries to extend the dynamic range on the high end, limited by the power supply. Full details are given elsewhere [7], but in short, when photocurrent discharges a capacitor, the following relation holds:

$$I_{photo} = C \frac{\Delta V}{\Delta t_{int}} \tag{1}$$

where $\Delta V$ is the measured pixel output voltage or signal swing, $C$ is the pixel capacitance (assumed to be constant), $\Delta t_{int}$ is the integration time for each pixel, and $I_{photo}$ is the photocurrent for each pixel (assumed to be constant during each integration time). In the TTFS imager, instead of choosing a fixed integration time as in conventional CMOS imagers, the integration time of each pixel varies with the illuminance. Dynamic range can be enhanced when illuminance is encoded in the temporal information $\Delta t_{int}$. For this representation, the dynamic range is the ratio of the maximum achievable integration time to the shortest detectable integration time. To realize this scheme, there is a comparator inside each pixel. When the voltage on a photodiode drops below a global reference voltage $V_{ref}$, the comparator inverts and the pixel generates a pulse (i.e., it has fired), as shown in figure 1. After a pixel has fired, its address is read out, and the pixel is then disabled for the rest of the frame. The time at which the address is read out represents the illuminance of the pixel. For video mode applications, the maximum achievable integration time is limited by the video frame time, which is typically about 32ms. Current CMOS processes limit typical time-based imagers in video mode to around 60dB of dynamic range, which is about the same as conventional CMOS imagers. However, by slowly increasing the voltage threshold from its lowest value to the reset voltage within one frame time, all pixels of the TTFS imager are guaranteed to pass the threshold and fire (see figure 1). Typically, the TTFS imager is able to widen the dynamic range to over 100dB within a 32ms frame time. Other time-based imagers are not able to take advantage of a globally varying reference voltage since their pixels run asynchronously.
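The encoding described above can be illustrated with a minimal numerical sketch. It assumes a linear reference ramp from a lowest value up to the reset voltage over one frame, and the capacitance and voltage values are placeholders chosen for illustration, not values from the chip; the function names are likewise hypothetical. With a linear ramp, the firing condition $V_{reset} - I_{photo}t/C = V_{ref}(t)$ has a closed-form solution, and the photocurrent is recovered from the firing time by applying (1) with the reference value at that instant.

```python
import math

# Illustrative values only; apart from the 32 ms frame time, none of these
# numbers are taken from the paper.
C_PIXEL = 20e-15   # assumed pixel capacitance C [F]
V_RESET = 2.5      # assumed reset voltage [V]
V_LOW = 0.5        # assumed lowest reference voltage [V]
T_FRAME = 32e-3    # video frame time [s]


def fire_time_fixed(i_photo, v_ref=V_LOW):
    """Firing time with a constant reference: Eq. (1) solved for the time."""
    return C_PIXEL * (V_RESET - v_ref) / i_photo


def fire_time_ramped(i_photo):
    """Firing time when V_ref ramps linearly from V_LOW to V_RESET in T_FRAME.

    Solves V_RESET - i_photo * t / C = V_LOW + (V_RESET - V_LOW) * t / T_FRAME.
    """
    swing = V_RESET - V_LOW
    return swing / (i_photo / C_PIXEL + swing / T_FRAME)


def decode_ramped(t_fire):
    """Recover the photocurrent from a firing time under the linear ramp."""
    delta_v = (V_RESET - V_LOW) * (1.0 - t_fire / T_FRAME)  # swing at t_fire
    return C_PIXEL * delta_v / t_fire


def dynamic_range_db(t_max, t_min):
    """Dynamic range as the ratio of longest to shortest integration time."""
    return 20.0 * math.log10(t_max / t_min)


if __name__ == "__main__":
    for i_photo in (1e-15, 1e-12, 1e-9):
        t_fixed = fire_time_fixed(i_photo)
        t_ramp = fire_time_ramped(i_photo)
        print(f"I={i_photo:.0e} A  fixed={t_fixed:.3e} s  ramped={t_ramp:.3e} s")
```

Under these assumed values, the dimmest pixel in the loop would need far longer than one frame to reach a fixed threshold, whereas with the ramp every pixel with a positive photocurrent fires before the end of the frame.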

## 3. SYSTEM ARCHITECTURE

The TTFS image sensor architecture is shown in figure 2. Unlike most digital circuits in use today, the readout circuits of the TTFS imager operate asynchronously, which has the potential to achieve higher speed, lower power, and improved noise and electromagnetic compatibility (EMC) performance. This address-event readout architecture is widely used in neuromorphic systems and is described well by Boahen [8]. Inside each pixel, there is a photodiode, a comparator, and digital control circuitry. To minimize FPN, an autozeroing technique is adopted to reset the pixel. A prototype $32 \times 32$ TTFS imager was fabricated using the AMI $0.5\mu m$ CMOS technology through MOSIS. This imager achieved a measured DR of 104dB, limited by the optics during test.



Fig. 1. Scheme for Time-to-first-spike imager of still mode

Two captured high dynamic range images are shown in figure 3. The prototype imager has verified the functionality of the TTFS imager architecture. The current version of the TTFS imager has two limitations. First, the array size of $32 \times 32$ in $0.5\mu m$ technology is too small. Section 4 discusses our move to $0.18\mu m$ technology. Second, compared to other time-based imagers, collisions are much more likely to occur because of the global frame synchronization. Section 5 discusses the details of a rolling shutter option that reduces the problem of collisions.



Fig. 2. TTFS: TTFS imager without on-chip memory



**Fig. 3.** Images captured by the prototype TTFS imager (without postprocessing). (A) and (B) show the bright and dark parts of the word 'UF' adjacent to a lamp, respectively. (C) and (D) show the bright and dark parts of an incandescent bulb, respectively.

# 4. IMPLEMENTATION IN SUBMICRON CMOS TECHNOLOGY

To deal with the large pixel size of the prototype imager and to further increase asynchronous readout speed, an advanced submicron technology is necessary. In our current design, we also try to shrink the pixel schematic without degrading its performance. The new pixel schematic for $0.18 \mu m$ technology is shown in figure 4. Global control logic provides $V_{ref}$ or $V_{reset}$ to each pixel according to the control signal *rst*. The pixel works as follows. The photodiode is initially reset to $V_{reset}$ via a negative feedback loop. After *rst* goes high, the photodiode is discharged and the voltage across the photodiode drops linearly. When the voltage drops below $V_{ref}$, the comparator output node flips and the node *req* goes high. The pixel then sends a request to the row arbiter by pulling down $row\_request(m)$. If the row arbiter selects this row by making *row\_select(m)* high, the column request signals $col\_request(n)$ are sent to the column latch. After the $col\_request(n)$ signals are latched, the corresponding control signal $disable\_row(m)$ is generated to disable the pixel by switching on transistor M2. Thus the pixel will not fire again until the next reset phase turns off M2. We implement the front-end circuit using standard thick oxide (3.3 V) transistors (labelled with \*) to avoid high gate and subthreshold leakage currents. Implementing the comparator using thick oxide transistors also makes it possible to use the high power supply (3.3 V) to increase the voltage swing. To shift the high voltage down to the nominal 1.8 V supply, INV1, built with thick oxide transistors, is included as a level shifter. Positive feedback following the comparator is used to make the comparator output node immune to switching noise. Note that the control signal *isolation* is used to disable the positive feedback during the reset phase.
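The request, select, latch, and disable sequence described above can be summarized with a small behavioral sketch. This is not a model of the actual circuit: the `Pixel` class and `readout_cycle` function are hypothetical names, timing is ignored, and the row arbiter is assumed to grant the lowest-numbered requesting row, which the paper does not specify.

```python
from dataclasses import dataclass


@dataclass
class Pixel:
    row: int
    col: int
    req: bool = False       # comparator has flipped, request pending
    disabled: bool = False  # models M2 switched on after readout

    def fire(self):
        """Comparator flips; a disabled pixel cannot fire again this frame."""
        if not self.disabled:
            self.req = True

    def reset(self):
        """Reset phase: M2 is turned off and the pixel is re-armed."""
        self.req = False
        self.disabled = False


def readout_cycle(pixels):
    """Serve one row and return the (row, col) addresses that were latched."""
    # Rows whose row_request line is active.
    requesting = sorted({p.row for p in pixels if p.req})
    if not requesting:
        return []
    m = requesting[0]  # assumption: arbiter grants the lowest requesting row
    # row_select(m) high: pending pixels in row m raise their col_request lines.
    latched = [p for p in pixels if p.row == m and p.req]
    # After the column latch captures them, disable_row(m) disables the pixels.
    for p in latched:
        p.req = False
        p.disabled = True
    return [(p.row, p.col) for p in latched]


if __name__ == "__main__":
    array = [Pixel(r, c) for r in range(2) for c in range(4)]
    array[2].fire()   # pixel (0, 2)
    array[4].fire()   # pixel (1, 0)
    array[7].fire()   # pixel (1, 3)
    print(readout_cycle(array))
    print(readout_cycle(array))
```

In this sketch a pixel that has been read out ignores further `fire()` calls until `reset()` is called, mirroring the once-per-frame firing behavior.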



Fig. 4. Schematic diagram of the TTFS pixel circuit



Fig. 5. Diagram of the analog charge injection through comparator output node

The comparator inside each pixel also works as an opamp during the reset phase. To ensure that autozeroing reduces the FPN to an acceptable level, the opamp open loop gain should be at least 100. Since the comparator only implements a 1-bit comparison, ultrahigh gain and fast response time are not required. Therefore we choose a simple 5-transistor opamp, shown inside the dashed box of figure 5. To save power, the opamp is biased in the subthreshold region. The simulated gain is A = 44dB, and the bandwidth is 114kHz at room temperature with 200nA total bias current.
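As a quick check of the gain requirement, 44dB corresponds to a linear voltage gain of about 158, which exceeds the stated minimum of 100. The short sketch below performs this conversion and evaluates the magnitude of the autozeroed comparator offset, $V_{off}/(A+1)$. The 10 mV raw offset is an assumed value for illustration only, and the function names are hypothetical.

```python
def db_to_linear(gain_db):
    """Convert a voltage gain in dB to a linear ratio."""
    return 10.0 ** (gain_db / 20.0)


def autozero_residual(v_off, gain_linear):
    """Magnitude of the comparator offset remaining after autozeroing."""
    return v_off / (gain_linear + 1.0)


if __name__ == "__main__":
    gain = db_to_linear(44.0)          # simulated open loop gain from the text
    v_off = 10e-3                      # assumed raw comparator offset [V]
    print(f"linear gain: {gain:.1f}")
    print(f"residual offset: {autozero_residual(v_off, gain) * 1e6:.1f} uV")
```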

Offset FPN arises from two sources: comparator offset and variation in the analog switching feedthrough. Comparator offset can be effectively reduced by autozeroing. The analog switching feedthrough has two paths. One comes from the reset switch M0, and the other is through the overlap capacitance $C_{ov\_out}$ (see figure 5). The overall offset residue is given by

$$\begin{aligned} Q_{off,res} \approx \Big( &-\frac{V_{off}}{A+1} + \frac{V_{dd}C_{ov0}}{C_{pd} + C_{M1} + C_{ov0}} + \frac{\alpha C_{ov0}W_0L_0(V_{reset} - V_{tp0})}{C_{pd} + C_{M1}} \\ &+ \frac{V_{reset}C_{ov\_out}}{C_{pd} + C_{M1} + C_{ov\_out}} \Big) \left( C_{pd} + C_{M1} \right) \end{aligned} \tag{2}$$

where $V_{off}$ is the offset voltage of the comparator, $A$ is the open loop gain, $\alpha$ is the proportion of M0's channel charge transferred to the gate of M1, $C_{ov0}$ is the overlap capacitance of M0, $W_0$ and $L_0$ are the width and length of M0 respectively, and $V_{tp0}$ is the threshold voltage of M0. The formula shows that reducing $C_{M1}$ could reduce the offset residue. However, to ensure reliable operation, well capacity must be maximized, which sets the lower bound on $C_{M1}$. The pixel layout size is $12.4 \times 12.1 \mu m^2$ with a fill factor of about 7%.

### 5. TTFS WITH A ROLLING SHUTTER

The asynchronous readout inevitably introduces readout delay error into the reconstructed image. Although the readout delay error is negligible for QCIF size images, it will limit the achievable dynamic range at the short firing time end for larger arrays. In natural images, neighboring pixel intensities are strongly correlated, so neighboring pixels tend to fire at nearly the same time. To avoid the high collision rate in this case, we modified the TTFS architecture with a rolling shutter, called TTFS\_RS. Instead of sharing a global reset, each row has its own reset signal. The essence of the rolling shutter is to spread out the firing times of neighboring pixels. Decorrelating the firing times decreases the number of simultaneous readout requests and therefore reduces the collision rate.

To quantify the efficiency of TTFS\_RS, a Matlab-based TTFS imager simulator was built. Critical timing information is extracted from SPICE simulation with consideration of the worst case. We investigated the mean relative error (MRE) for the high dynamic range image *vinesunset* (from [9]) of size 480×720. The Matlab simulated MREs, listed in Table 1, demonstrate that the TTFS\_RS imager successfully decreases the reconstruction error by spreading the firing times.

Table 1. The MREs of vinesunset for TTFS\_RS

| Shutter delay | 0     | $20\mu s$ | $35 \mu s$ | $50 \mu s$ |
|---------------|-------|-----------|------------|------------|
| TTFS_RS       | 44.9% | 1.22%     | 0.275%     | 0.117%     |
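The trend in Table 1 can be illustrated qualitatively with a toy queueing model. This is a simplified sketch and not the Matlab simulator used above: it assumes a single first-come-first-served readout bus with a hypothetical fixed service time per event (`t_read`), a hypothetical constant `k` relating intensity to firing time, and a uniform test patch instead of the *vinesunset* image. Each row is reset `shutter_delay` later than the previous one, and the reconstructed integration time is the readout time minus the known row reset time.

```python
def simulate_mre(intensity_rows, shutter_delay, t_read=50e-9, k=1e-3):
    """Mean relative error of reconstructed firing times in a toy readout model.

    intensity_rows: list of rows, each a list of positive intensities.
    shutter_delay:  reset offset between consecutive rows [s] (0 = global reset).
    t_read:         assumed bus occupancy per event [s].
    k:              assumed constant so that the firing time equals k / intensity.
    """
    events = []
    for m, row in enumerate(intensity_rows):
        t_reset = m * shutter_delay
        for intensity in row:
            t_true = k / intensity
            events.append((t_reset + t_true, t_reset, t_true))
    events.sort()  # requests are served in order of arrival

    bus_free = 0.0
    errors = []
    for t_request, t_reset, t_true in events:
        t_served = max(t_request, bus_free)   # wait if the bus is busy
        bus_free = t_served + t_read
        t_reconstructed = t_served - t_reset  # row reset time is known
        errors.append(abs(t_reconstructed - t_true) / t_true)
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Uniform bright patch: every pixel would fire at the same instant.
    patch = [[100.0] * 8 for _ in range(8)]
    for delay in (0.0, 1e-6):
        print(f"shutter delay {delay:.0e} s -> MRE {simulate_mre(patch, delay):.4f}")
```

With a global reset, all pixels of the uniform patch request the bus at once and queue behind each other; with a per-row delay longer than the time needed to read one row, only pixels within the same row collide, so the mean relative error drops.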

#### 6. CONCLUSION

We have introduced a time-to-first-spike imager. An asynchronous readout technique was adopted to reduce the power consumption and relax the bandwidth requirements. We fabricated and tested a $32 \times 32$ pixel array in $0.5\mu m$ CMOS technology and discussed its implementation in $0.18\mu m$ technology. One pixel and external digital logic circuits have been fabricated in $0.18\mu m$ CMOS technology. The test results have demonstrated the expected functionality. A larger $128 \times 128$ array is scheduled for fabrication using TSMC $0.18\mu m$ digital logic technology through MOSIS.

#### 7. REFERENCES

- [1] S. Decker, R. D. McGrath, K. Brehmer, and C. G. Sodini, "A 256x256 CMOS imaging array with wide dynamic range pixels and column-parallel digital output," *IEEE Journal of Solid-State Circuits*, vol. 33(12), pp. 2081–2091, 1998.
- [2] C. Mead, Analog VLSI and neural systems, Addison-Wesley, Reading, Massachusetts, 1989.
- [3] D. Yang, A. El Gamal, B. Fowler, and H. Tian, "A 640 x 512 CMOS image sensor with ultrawide dynamic range floating-point pixel-level ADC," *IEEE Journal of Solid-State Circuits*, vol. 34(12), pp. 1821–1834, 1999.
- [4] E. Culurciello, R. Etienne-Cummings, and K. Boahen, "Arbitrated address event representation digital image sensor," *IEEE Journal of Solid-State Circuits*, vol. 38(2), pp. 281–294, 2003.
- [5] W. Yang, "A wide-dynamic range, low-power photosensor array," in *ISSCC Digest of Technical Papers*, 1994, pp. 230–231.
- [6] S. Thorpe, D. Fize, and C. Marlot, "Speed of processing in the human visual system," *Nature*, vol. 381, pp. 520–522, 1996.
- [7] X. Guo, M. Erwin, and J.G. Harris, "Ultra-wide dynamic range CMOS imager using pixel-threshold firing," in *Proceedings of 5th World Multiconference on Systemics, Cybernetics and Informatics*, Orlando, FL, July 2001, vol. XV, pp. 485–489.
- [8] K.A. Boahen, "A throughput-on-demand address-event transmitter for neuromorphic chips," in *Proceedings of ARVLSI*, 1999, pp. 72–86.
- [9] P. Debevec, "Recovering high dynamic range radiance maps from photographs," [online] http://www.debevec.org/Research/HDR/.