

Open access · Journal Article · DOI:10.1109/LPT.2010.2093876


Berkehan Ciftcioglu, Rebecca Berman, Jian Zhang, Zach Darling ...+11 more authors · Institutions: University of Rochester · Published on: 01 Feb 2011 · IEEE Photonics Technology Letters (IEEE) · Topics: Optical interconnect, Optical link, Free-space optical communication




# A 3-D Integrated Intra-Chip Free-Space Optical Interconnect for Many-Core Chips

Berkehan Ciftcioglu, Rebecca Berman, Jian Zhang, Zach Darling, Shang Wang, Jianyun Hu, Jing Xue, Alok Garg, Manish Jain, Ioannis Savidis, Duncan Moore, Michael Huang, Eby G. Friedman, Gary Wicks, and Hui Wu

*Abstract*—This paper presents a new optical interconnect system for intra-chip communications based on free-space optics. It provides all-to-all direct communications using dedicated lasers and photodetectors, hence avoiding packet switching while offering ultra-low latency and scalable bandwidth. A technology demonstration prototype is built on a circuit board using fabricated germanium photodetectors, micro-lenses, commercial vertical-cavity surface-emitting lasers, and micro-mirrors. Transmission loss in an optical link of 10-mm distance and crosstalk between two adjacent links are measured as 5 dB and -26 dB, respectively. The measured small-signal bandwidth of the link is 10 GHz.

## I. INTRODUCTION

Continuing device scaling, if not compensated, degrades the performance and signal integrity of on-chip metal interconnects, hence limiting the performance of many-core microprocessors and high-speed systems-on-chip (SoC). The communications-centric nature of future high-performance computing devices demands a fundamental change in intra- and inter-chip interconnect technologies. Optical interconnect exhibits inherent advantages in delay and bandwidth, hence eliminating the main limitations of its electrical counterparts [1], [2]. The key to optical interconnect's success is to lower power consumption and minimize the system complexity overhead.

Several signaling and networking schemes have been proposed for intra-chip optical interconnect. Applying a conventional packet-switching interconnect architecture to optical networks requires repeated electro-optic (E/O) and optoelectronic (O/E) conversions, diminishing the advantages of optical signaling. To avoid packet-switching, bus or ring structures can be used, which rely on wavelength division multiplexing (WDM) to achieve large bandwidth [3], [4]. These systems, however, typically require precise E/O modulators with accurate wavelengths and minimal transmission losses, and are hence difficult to realize in large-scale systems. In comparison,

This work was partially supported by NSF grant 0829915 and the DOE Office of Inertial Confinement Fusion under Cooperative Agreement No. DE-FC52-08NA28302, the University of Rochester, and the New York State Energy Research and Development Authority. The support of DOE does not constitute an endorsement by DOE of the views expressed in this article.

B. Ciftcioglu, J. Zhang, S. Wang, J. Hu, J. Xue, A. Garg, I. Savidis, M. Huang, E.G. Friedman, and H. Wu are with the Department of Electrical and Computer Engineering, University of Rochester, Rochester, NY 14627 USA (e-mail: hui.wu@rochester.edu).

R. Berman, Z. Darling, M. Jain, D. Moore, and G. Wicks are with the Institute of Optics, University of Rochester, Rochester, NY 14627 USA.

Copyright (c) 2010 IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending a request to pubs-permissions@ieee.org.



Fig. 1. (a) Cross-sectional and (b) top view of an intra-chip free-space optical interconnect system in a 3-D integrated chip stack. The VCSEL arrays are in the center and the photodetectors are on the periphery within each node.

inter-chip optical interconnects mostly rely on point-to-point links using directly modulated lasers integrated above silicon chips and waveguides integrated on the circuit board [5], [6].

Alternatively, free-space optics can be used instead of waveguides in optical interconnect to eliminate the routing/networking complexity. Vertical-cavity surface-emitting lasers (VCSELs) and normal-incidence photodetectors (PDs) can be linked by a free-space optics fabric made of mirrors and lenses. For example, systems based on planar optics [7], macro-optics [8], [9], and micro-optics [10], [11] have been demonstrated for inter- and intra-chip applications.

In this paper, we propose a new optical interconnect system, based on free-space optics, specifically developed for intra-chip communications in many-core microprocessors and SoCs. Different from earlier work, this is a multidisciplinary approach that synergistically integrates photonic devices, circuits, and optics with system and architecture design.

## II. INTRA-CHIP FREE-SPACE OPTICAL INTERCONNECT

In the proposed intra-chip optical interconnect system, GaAs photonics and free-space optics layers are placed on top of the CMOS electronics layer via 3-D integration (Fig. 1). VCSELs in the GaAs photonics layer serve as light sources, hence removing the need for external multi-wavelength lasers used in WDM systems. Each light beam from the digitally modulated, backside-emitting VCSELs is collimated through a dedicated micro-lens, built at the back of the GaAs substrate, which is transparent at the target wavelength. The collimated light bounces from a series of micro-mirrors, implemented on the chip package using low-loss metal coatings. Then, it is focused by another micro-lens onto a PD, built with a thin germanium (Ge) layer on the silicon substrate. In the electronics layer, the



Fig. 2. Intra-chip FSOI link schematic.

transmitted and received electrical signals are converted from/to digital data by the transceiver electronics.

This free-space optical interconnect (FSOI) system uses point-to-point links to construct an all-to-all intra-chip communication network. A multi-bounce beam-guiding scheme enables a single thin transmission medium to provide free-space links between every VCSEL-PD pair at arbitrary locations. Hence, it offers significant signaling and networking advantages over electrical and other optical interconnects. First, FSOI avoids packet-switching and the associated intermediate routing and buffering delays found in electrical networks or packet-switching optical networks. Hence, it exhibits low propagation delay, providing low latency. In addition, the communication links are formed in a fully distributed fashion with minimal arbitration, reducing the arbitration latency in large-scale systems. Second, it exhibits low loss, minimal dispersion, and no bandwidth degradation regardless of topological distance. It eliminates the loss and crosstalk due to the large number of waveguide crossings in densely routed waveguide-based optical interconnects, which limit the optical network performance. Third, it saves a significant amount of energy by a) eliminating packet-switching related energy consumption, b) powering VCSELs down in low duty-cycle operation, and c) avoiding thermal tuning of sensitive E/O modulators in WDM systems. Fourth, the fully distributed nature of FSOI renders this scheme highly scalable (see Sec. III-d). In addition, placing optics and electronics on separate planes accommodates their different technological and scaling constraints. Finally, FSOI's good signal integrity simplifies the optical transceiver electronics, e.g., by removing the need for the equalization used in high-speed electrical interconnects.

## III. FSOI DESIGN

In the following, we demonstrate the design of the proposed FSOI system within a typical many-core microprocessor. The target chip is assumed to have a $2.3 \times 2.3$-cm$^2$ area, the chip size for high-performance processors at introduction [12]. The VCSELs have a 20-$\mu$m device size and a 5-$\mu$m aperture size. The $47 \times 47$-$\mu$m$^2$ PDs are based on our fabricated Ge-on-Si PDs (see Sec. IV). The micro-lens and micro-mirror sizes are scaled with the number of nodes (see Sec. III-d). The predictive technology model (PTM) [13] is used for transistor models.

a) Optics: The optical wavelength is chosen as 980 nm, a good compromise between GaAs VCSEL and Ge PD performance. At this wavelength, the far-field $1/e^2$ VCSEL full-angle divergence is estimated as $16^{\circ}$. Micro-lenses at the backside of a 625-$\mu$m thick GaAs substrate, each of which has an aperture size three times the beam radius [10], are covered with a SiN anti-reflection coating to eliminate the 1.5-dB/lens loss due to reflections at the air/GaAs interface. These micro-lenses can be fabricated using regular photolithography because of their



Fig. 3. (a) The area coverage ratio of VCSELs, PDs and the maximum optical path loss, and (b) BER and bandwidth at different energy efficiencies with respect to number of nodes.

small thickness (as compared to [8]). Correspondingly, the required GaAs substrate thickness is small and hence easy to bond at the wafer level. Gold-coated mirrors with 98% reflection (0.09-dB loss) are placed 1 mm from the micro-lenses on the package and at the backside of the GaAs substrate.
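
The 0.09-dB figure follows directly from the 98% reflectivity. The short sketch below reproduces that conversion and, as an illustration only, accumulates the loss over several bounces; the function names are hypothetical, and the sketch assumes every bounce sees the same reflectivity.

```python
import math


def reflection_loss_db(reflectivity: float) -> float:
    """Optical power loss in dB for one reflection off a mirror."""
    return -10.0 * math.log10(reflectivity)


def multi_bounce_loss_db(reflectivity: float, bounces: int) -> float:
    """Total mirror loss in dB, assuming identical mirrors at every bounce."""
    return bounces * reflection_loss_db(reflectivity)


single = reflection_loss_db(0.98)       # one gold-coated mirror
five = multi_bounce_loss_db(0.98, 5)    # e.g., a five-bounce path
print(f"{single:.3f} dB per bounce, {five:.2f} dB over five bounces")
```

Because losses in dB add, the mirror contribution grows linearly with the number of bounces and stays below half a dB for a five-bounce path under these assumptions.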

b) Signaling: To evaluate interconnect performance, a single-bit FSOI link is analyzed for the longest optical path of 3.24 cm, corresponding to a 115-ps latency, which diagonally crosses the chip with five bounces from the mirrors. Based on device simulation in DAVINCI, the VCSEL exhibits a 170-$\Omega$ differential series resistance and 76-fF capacitance with a threshold current of 0.15 mA; the PD has a 0.4-A/W responsivity at 980 nm with 40-fF capacitance. As shown in Figure 2, a CMOS optical transceiver is designed using 16-nm PTM device parameters. The link bandwidth is calculated as 13 GHz for 0.4-mW optical power, and the BER is less than $10^{-12}$ with 0.5-pJ/b energy efficiency at a 10-Gb/s data rate.
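
Two of these numbers can be sanity-checked with simple arithmetic. The sketch below is an illustration under stated assumptions, not the authors' analysis: it computes the vacuum time of flight over the 3.24-cm path and the average link power implied by 0.5 pJ/b at 10 Gb/s, reading the energy efficiency as total link energy per bit. The function names are hypothetical.

```python
C_VACUUM_M_PER_S = 299_792_458.0  # speed of light in vacuum


def time_of_flight_ps(path_length_cm: float) -> float:
    """Free-space propagation time in picoseconds for a path given in cm."""
    return path_length_cm * 1e-2 / C_VACUUM_M_PER_S * 1e12


def link_power_mw(energy_pj_per_bit: float, data_rate_gbps: float) -> float:
    """Average power in mW: (pJ/b) x (Gb/s) = 1e-12 J x 1e9 /s = 1e-3 W."""
    return energy_pj_per_bit * data_rate_gbps


tof = time_of_flight_ps(3.24)      # longest (diagonal) optical path
power = link_power_mw(0.5, 10.0)   # energy efficiency and data rate from the text
print(f"time of flight: {tof:.1f} ps, link power: {power:.1f} mW")
```

The free-space estimate comes out at about 108 ps, close to but slightly below the 115-ps latency quoted above; the sketch ignores any propagation through the lens substrates or other contributions, which the text does not itemize.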

c) Networking: There are $N$ communication nodes in this system, and each link between a pair of nodes has 8 bits. Hence, each node has $(N - 1) \times 8$ VCSELs. To exploit the sporadic nature of intra-chip communications and to save chip area, each node has $4 \times 8$ PDs. By intentionally avoiding arbitration, data collision at the PDs is allowed and is detected by the microprocessor; a confirmation signal is then sent to the transmit node through a separate VCSEL-PD link to retransmit the data [14]. Given a data rate of 10 Gbps per VCSEL-PD pair, the bandwidth of an 8-bit link is 80 Gbps, and the aggregate interconnect system bandwidth is $N \times 320$ Gbps.
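
The device counts and bandwidth figures above can be collected in a small model. This is a minimal sketch with hypothetical names; it reads the factor of 320 Gbps per node as four 8-bit receive links at 80 Gbps each, which is an interpretation of the text rather than something it states explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FsoiNetwork:
    """Bookkeeping for an all-to-all FSOI network (illustrative only)."""
    nodes: int
    bits_per_link: int = 8            # bits per node-to-node link
    receive_links_per_node: int = 4   # assumed meaning of the "4 x 8" PDs
    data_rate_gbps: float = 10.0      # per VCSEL-PD pair

    def vcsels_per_node(self) -> int:
        # One dedicated 8-bit transmitter group for every other node.
        return (self.nodes - 1) * self.bits_per_link

    def pds_per_node(self) -> int:
        return self.receive_links_per_node * self.bits_per_link

    def link_bandwidth_gbps(self) -> float:
        return self.bits_per_link * self.data_rate_gbps

    def aggregate_bandwidth_gbps(self) -> float:
        # Receiver-limited: each node can absorb 4 links x 80 Gbps.
        return self.nodes * self.receive_links_per_node * self.link_bandwidth_gbps()


net = FsoiNetwork(nodes=36)
print(net.vcsels_per_node(), net.pds_per_node(),
      net.link_bandwidth_gbps(), net.aggregate_bandwidth_gbps())
```

The model makes the asymmetry visible: transmitters grow linearly with the number of nodes, while the receiver count per node stays fixed, which is what allows collisions and motivates the retransmission scheme.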

d) Scaling: A state-of-the-art microprocessor at 45-nm technology has 8 cores; with the core count doubling every generation, this increases to 128 at the 11-nm technology node in a fixed $2.3 \times 2.3$-cm$^2$ chip area. In our system, a communication node is shared between four adjacent cores, similar to [4]. In this one-lens-per-bit design, the area occupied by the optical components depends on the lens sizes. To facilitate the mirror placement, less than half of the chip area is covered by micro-lenses. The PD lens is fixed at 250 $\mu$m for any number of nodes, whereas the VCSEL lens scales from 250 $\mu$m for 2 nodes to 136 $\mu$m for 36 nodes. Correspondingly, it achieves maximum bandwidth densities of 4 Tbps/cm$^2$ and 6.25 Tbps/cm$^2$, respectively, limited by the Gaussian beam divergence. For 36 nodes, the system can achieve a maximum path loss of 1.7 dB and area coverage of 48% (Fig. 3-a) with a BER below $10^{-12}$ at 0.5-pJ/b energy efficiency and 10-Tbps aggregate bandwidth (Fig. 3-b). Therefore, this scheme can be applied to chips with 144 cores, corresponding to the 11-nm node on the technology



Fig. 4. (a) 2-D cross-section of FSOI, and (b) image of the beam collimated by the VCSEL lens at different distances. The beam size is 240 $\mu$m and 250 $\mu$m, corresponding to 1.5-dB and 1.9-dB clipping losses at 1 cm and 2 cm, respectively.

roadmap. For a larger chip size, number of nodes, or number of bits, micro-lenses with a larger numerical aperture (NA) can be built on a thicker GaAs or fused silica substrate and shared by all VCSELs in an 8-bit link, i.e., one lens per link, to enable a larger bandwidth density.
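
The core-count projection and the mapping from cores to communication nodes can be sketched as follows. The intermediate technology nodes (32, 22, and 16 nm) are an assumption added for illustration; the text names only the 45-nm starting point and the 11-nm end point. The function names are hypothetical.

```python
import math


def cores_at_generation(base_cores: int, generations: int) -> int:
    """Core count after a number of generations, doubling each time."""
    return base_cores * 2 ** generations


def communication_nodes(cores: int, cores_per_node: int = 4) -> int:
    """Number of FSOI nodes when each node is shared by four adjacent cores."""
    return math.ceil(cores / cores_per_node)


# Assumed generation sequence between the two nodes named in the text.
TECH_NODES_NM = [45, 32, 22, 16, 11]
roadmap = {nm: cores_at_generation(8, i) for i, nm in enumerate(TECH_NODES_NM)}

for nm, cores in roadmap.items():
    print(f"{nm}-nm node: {cores} cores -> {communication_nodes(cores)} FSOI nodes")

print("144 cores ->", communication_nodes(144), "FSOI nodes")
```

Under these assumptions the 128 cores projected for the 11-nm node need 32 communication nodes, so the 36-node design point (144 cores) leaves some headroom.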

## IV. PROTOTYPE AND FSOI MEASUREMENT RESULTS

A prototype FSOI system is built on a printed circuit board (PCB) using micro-lenses, micro-mirrors, PDs, and VCSELs. The VCSELs are a commercial $1 \times 4$ array (Finisar HFE8004-103), with 2-mW optical power at 850 nm, 40% conversion efficiency, $30^{\circ}$ full-angle beam divergence, and 10-Gb/s speed. The micro-lenses are built on a 525-$\mu$m thick fused silica substrate by melting and reflowing cylindrical photoresist that is 10 $\mu$m thick and 220 $\mu$m in diameter, limited by the 250-$\mu$m VCSEL-to-VCSEL pitch. The resultant spherical shape has a 1.22-mm radius of curvature and a corresponding focal point 730 $\mu$m from the surface of the lenslet. The fabricated $1 \times 4$ array PDs are $47 \times 47$-$\mu$m$^2$ metal-semiconductor-metal Ge PDs with a thin layer of undoped amorphous silicon on the substrate, enabling low dark current and large bandwidth. The PD has a 0.23-A/W responsivity without anti-reflection coating and a 13-GHz bandwidth at 7-V bias and 850 nm [15].

As shown in Figure 4, the VCSELs and PDs are wirebonded to the PCBs with 50-$\Omega$ transmission lines, connected to the instruments via RF connectors at the edge. Two micro-lens arrays are UV-cured to spacers on top of the VCSELs, with a sufficiently wide gap for the wirebonds, and directly onto the PDs. The VCSEL chip is 230 $\mu$m from the back surface and 25 $\mu$m from the focal point of the lens. The 96% reflective mirrors are mounted 1 mm away from the top of the micro-lenses at a $45^{\circ}$ angle. The optical transmission for a specific link and the optical crosstalk of adjacent links with respect to distance are shown in Fig. 5-a. At 0.5-mW laser optical power, the transmission loss is 5 dB at a 10-mm distance and increases to 6.5 dB at 26 mm. The crosstalk power is -29 dB at 10 mm and increases to -25 dB at 26 mm. The small-signal bandwidth of the link is measured as 10 GHz and does not change with distance (Fig. 5-b). The 1.25-dB and 1.5-dB optical power losses from clipping at the VCSEL lens and PD lens, respectively, at 1 cm are predominantly due to the large divergence angle of the commercial VCSEL in conjunction with the small NA of the lenses. To eliminate these clipping losses, a smaller VCSEL beam divergence and/or lenses with a larger aperture size and a larger focal length can be used at both ends [10]. The reflection losses can be mitigated by using low-loss metal for the mirrors and an anti-reflection coating on the lenses.
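
To put the measured losses in absolute terms, the sketch below converts the reported transmission loss into received optical power and an estimated PD photocurrent. It is a rough illustration with hypothetical function names: it assumes the quoted loss is purely optical and that the 0.23-A/W responsivity reported for the PD applies to the power arriving at it.

```python
def db_to_fraction(loss_db: float) -> float:
    """Fraction of optical power remaining after a loss given in dB."""
    return 10.0 ** (-loss_db / 10.0)


def received_power_mw(tx_power_mw: float, loss_db: float) -> float:
    """Optical power at the detector for a given launch power and link loss."""
    return tx_power_mw * db_to_fraction(loss_db)


def photocurrent_ua(power_mw: float, responsivity_a_per_w: float) -> float:
    """Photocurrent in microamps: mW x A/W gives mA, then convert to uA."""
    return power_mw * responsivity_a_per_w * 1e3


rx_10mm = received_power_mw(0.5, 5.0)         # 5-dB loss at 10 mm
rx_26mm = received_power_mw(0.5, 6.5)         # 6.5-dB loss at 26 mm
i_10mm = photocurrent_ua(rx_10mm, 0.23)
clipping_db = 1.25 + 1.5                      # VCSEL-lens plus PD-lens clipping at 1 cm
other_db = 5.0 - clipping_db                  # remainder (reflections and other losses)

print(f"received: {rx_10mm:.3f} mW at 10 mm, {rx_26mm:.3f} mW at 26 mm")
print(f"photocurrent at 10 mm: {i_10mm:.1f} uA")
print(f"clipping: {clipping_db:.2f} dB, other losses: {other_db:.2f} dB")
```

Under these assumptions, lens clipping accounts for more than half of the 5-dB loss at 10 mm, which is consistent with the emphasis above on reducing beam divergence or enlarging the lens aperture.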



Fig. 5. (a) Transmission and crosstalk at different link distances, and (b) small-signal bandwidth at L = 1 cm. Note that the optical transmission changes between -5 and -6.5 dB due to the small change in the beam spot size.

## V. CONCLUSION

A novel intra-chip optical interconnect system using free-space optics and 3-D integrated photonic devices is proposed for many-core processors and SoCs. This system achieves direct communications between individual cores with low latency and loss, while enabling a scalable bandwidth. For technology demonstration purposes, a prototype using individual arrays of VCSELs, micro-lenses, micro-mirrors, and photodetectors has been built on a PCB carrier. The prototype achieves less than 6.5-dB loss and -25-dB crosstalk power up to 2.6 cm with more than 10-GHz bandwidth.

## REFERENCES

- [1] J.W. Goodman, F.J. Leonberger, S.-Y. Kung, and R. Athale, "Optical Interconnections for VLSI Systems," *Proc. of IEEE*, 72:850-866, July 1984.
- [2] A.V. Krishnamoorthy and D.A.B. Miller, "Scaling Optoelectronic-VLSI Circuits into the 21st Century: A Technology Roadmap," *IEEE Journal of Selected Topics in Quantum Electronics*, 2(1):55-76, Apr. 1996.
- [3] R.J. Beausoleil, P.J. Kuekes, G.S. Snider, S.-Y. Wang, and R.S. Williams, "Nanoelectronic and Nanophotonic Interconnect," *Proc. of IEEE*, 96(2):230-247, Feb. 2008.
- [4] N. Kirman et al., "Leveraging Optical Technology in Future Bus-based Chip Multiprocessors," *In Proc. Int'l Symp. on Microarch.*, pages 492-503, Dec. 2006.
- [5] L. Schares et al., "Terabus: Terabit/second-Class Card-Level Optical Interconnect Technologies," *Journal of Selected Topics in Quantum Electronics*, 12(5):1032-1044, Sept.-Oct. 2006.
- [6] I. Young et al., "Optical I/O Technology for Tera-Scale Computing," *IEEE International Solid-State Circuits Conference*, pages 468-69, 2009.
- [7] J. Jahns et al., "Hybrid Integration of Surface-Emitting Microlaser Chip and Planar Optics Substrate for Interconnection Applications," *IEEE Phot. Tech. Lett.*, 4(12):1369-72, 1992.
- [8] M.W. Haney et al., "Description and Evaluation of the FAST-Net SmartPixel-Based Optical Interconnection Prototype," *Proc. of the IEEE*, 88(6):819-28, 2000.
- [9] D.V. Plant et al., "256-Channel Bidirectional Optical Interconnect Using VCSELs and Photodiodes on CMOS," *IEEE Journal of Lightwave Tech.*, 19(8):1093-103, 2001.
- [10] C. Debaes et al., "Low-Cost Microoptical Modules for MCM Level Optical Interconnections," *IEEE J. Sel. Top. Quantum Elect.*, 9(2):518-30, March-April 2003.
- [11] M.J. McFadden et al., "Multiscale Free-Space Optics Interconnects for Intrachip Global Communication: Motivation, Analysis, and Experimental Validation," *Applied Optics*, 45(25):6358-6366, 2006.
- [12] "International Technology Roadmap for Semiconductors," www.itrs.org. 2009.
- [13] "Predictive Technology Models," ptm.asu.edu. 2008.
- [14] J. Xue et al., "An Intra-chip Free-space Optical Interconnect," 37th International Symposium on Computer Architecture (ISCA), 2010.
- [15] Berkehan Ciftcioglu, Jie Zhang, Roman Sobolewski, and Hui Wu, "An 850-nm Normal-Incidence Germanium Metal-Semiconductor-Metal Photodetector with 13-GHz Bandwidth and 8-μA Dark Current," to appear in IEEE Phot. Tech. Lett., 2010.