# A Low-Power Low-Cost Fully-Integrated 60-GHz Transceiver System With OOK Modulation and On-Board Antenna Assembly

Jri Lee, Member, IEEE, Yentso Chen, and Yenlin Huang

Abstract—A fully-integrated 60-GHz transceiver system with on-board antenna assembly is presented. Incorporating on-off keying (OOK) and low-cost antenna designs, this prototype demonstrates a low-power solution for multi-Gb/s wireless communication. The enhanced OOK modulator/demodulator obviates baseband and interface circuitry, revealing a compact solution. Two antenna structures, folded dipole and patch array, are employed to fully examine the performance. Designed and fabricated in 90-nm CMOS technology, the transmitter and the receiver consume 183 and 103 mW and occupy 0.43 and 0.68 mm<sup>2</sup>, respectively. With 4 × 3 patch antenna array, the transceiver achieves error-free operation (BER <  $10^{-12}$ ) for  $2^{31} - 1$  PRBS of 1 Gb/s over a distance of 60 cm.

*Index Terms*—Demodulator, folded dipole antenna, low-noise amplifier (LNA), mixer, modulator, on-off keying (OOK), patch antenna array, power amplifier (PA), voltage-controlled oscillator (VCO), wireless transceiver, 60-GHz RF.

## I. INTRODUCTION

Over the decades, communication engineers have worked to develop better wireless systems to fulfill different requirements, and many standards have been proposed accordingly. Fig. 1 summarizes the specifications of representative wireless standards. For a higher carrier frequency, one generally obtains a larger bandwidth but a shorter communication distance. The emerging 60-GHz band is believed to be a strong candidate for high-speed indoor RF links owing to its 7-GHz available bandwidth. Recent research [1]-[3] has kindled the development of high-speed, short-range, and low-power transceiver ICs for wireless links. Many applications follow, such as uncompressed video over high-definition multimedia interface (HDMI), high-speed file transfer among electronic devices and laptops, and fast movie or video game download from a kiosk. Some wireless personal area network (WPAN) standards have already allocated the spectrum, e.g., IEEE 802.15.3c specifies 4 bands around 60 GHz, each 2.16 GHz wide [4]. Overall, within a range of several meters, video/audio signals and large files can be wirelessly transferred among electronic devices at a data rate of

Manuscript received March 12, 2009; revised August 01, 2009. Current version published February 05, 2010. This paper was approved by Associate Editor Ranjit Gharpurey.

The authors are with the Electrical Engineering Department, National Taiwan University, Taipei, Taiwan (e-mail: jrilee@cc.ee.ntu.edu.tw).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2009.2034806

[Figure plot. Axes: Carrier Frequency (Hz), Bandwidth (Hz), Data Rate (bps), Communication Distance. Standards shown: GSM, DECT, WCDMA, Bluetooth, ZigBee, WiFi, UWB, 60 GHz.]

Fig. 1. Summary of wireless standards.

a few Gb/s. Note that conventional solutions such as Bluetooth typically can only provide a data rate of several Mb/s at most.

While achieving a very high data rate, mm-wave transceivers may suffer from tradeoffs in many aspects. If we directly inherit the well-developed architecture of the 2.4/5.2-GHz WLAN systems without optimization, it would inevitably consume significant power because the interface (e.g., ADCs) and the subsequent baseband circuitry (DSPs) now operate at GHz rates. For example, a 2.5-GSample/s 7-bit ADC dissipates 50 mW in a 45-nm CMOS process [5]. A receiver with I/Q paths and advanced baseband DSP would also burn a few hundred mW at this speed. The sampling clock distribution could be another issue if interleaved digitizers are used. Meanwhile, the propagation loss is of great concern. It is well known that the loss for an isotropic radiation system is given by

$$\frac{P_R}{P_T} = \frac{D_T D_R}{(4\pi)^2} \cdot \left(\frac{\lambda}{R}\right)^2 \tag{1}$$

where $P_T$ and $P_R$ denote the transmitted and received power, $D_T$ and $D_R$ the antenna directivities, $\lambda$ the wavelength, and $R$ the distance [6]. At 60 GHz, the isotropic loss (i.e., $D_T = D_R = 1$) at 10 meters would be as high as 88 dB, implying the importance of antenna design. The future high-speed wireless link could possibly be developed in two directions. For the so-called "digital home" environment, the emphasis is on entertainment applications, e.g., a high-speed uncompressed video signal must be delivered over several meters from a DVD player to a TV set. For such a distance, high-directivity techniques such as antenna arrays and/or beamforming [7], [8] become mandatory, which



Fig. 2. (a) OOK power spectral density. (b) Transceiver architecture (with folded dipole antenna). (c) Link budget for 1-meter error-free communication.

may require high power consumption and large silicon area. An adaptive algorithm [9] is also necessary if the signal is delivered through non-line-of-sight propagation. On the other hand, for fast data communication between portable electronic devices (e.g., a digital camera and a laptop), the communication distance is usually less than 1 meter, and quite a few physical limitations can be relaxed. Such designs emphasize simple structures to minimize power and silicon area, and the radiator design becomes straightforward as well. In present technology, the on-board antenna is still believed to be superior to other alternatives in terms of radiation efficiency and cost. Note that the 60-GHz band is the only solution for Gb/s communication so far. Other high-speed wireless standards such as ultra-wideband (UWB) (480 Mb/s) [10] and IEEE 802.11n (300 Mb/s) are not fast enough for certain applications. For example, if we were to push the data rate of wireless universal serial bus (wireless USB) to 4.8 Gb/s (compatible with USB 3.0 [11]), the 60-GHz band seems to be the only choice.
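The isotropic loss quoted with Eq. (1) can be checked numerically. The following is a minimal sketch that evaluates Eq. (1) directly; the function name `friis_loss_db` and its argument names are illustrative and not part of the original work.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def friis_loss_db(freq_hz, distance_m, d_tx=1.0, d_rx=1.0):
    """Return the link loss -10*log10(P_R/P_T) in dB, following Eq. (1)."""
    wavelength = C / freq_hz
    ratio = d_tx * d_rx / (4 * math.pi) ** 2 * (wavelength / distance_m) ** 2
    return -10 * math.log10(ratio)


loss_10m = friis_loss_db(60e9, 10.0)  # isotropic antennas (D_T = D_R = 1), 10 m
loss_1m = friis_loss_db(60e9, 1.0)    # isotropic antennas, 1 m
print(f"{loss_10m:.1f} dB at 10 m, {loss_1m:.1f} dB at 1 m")
```

With these inputs the loss evaluates to about 88 dB at 10 m and about 68 dB at 1 m, so each tenfold reduction in distance recovers 20 dB.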

This paper presents a low-power low-cost solution for short-distance communication using the 60-GHz unlicensed band. It integrates a purely analog modulator/demodulator into the RF front-end, eliminating the digital interface and baseband circuitry entirely. An on-board antenna is also incorporated in the transceiver system by a special assembly technique. Two fundamental structures, namely the folded dipole and the patch array, are designed to fully investigate the trade-offs. Since the antenna is made of a commercially available material (Rogers RO4003 [12]), the overall chip-on-board assembly achieves very low cost. Designed and fabricated in 90-nm CMOS technology, the transmitter (Tx) and the receiver (Rx) consume 183 mW and 103 mW, while occupying 0.43 mm<sup>2</sup> and 0.68 mm<sup>2</sup>, respectively. With the folded dipole antennas, the transceiver system achieves error-free operation (BER < $10^{-12}$) for $2^{31} - 1$ PRBS of 1.5 Gb/s over a distance of 6 cm. While transmitting 1-Gb/s data, the error-free distance can be extended to 61 cm by using the 4 × 3 patch antenna array.

This paper is organized as follows. Section II describes the transceiver architecture, displaying its advantages and considerations. Sections III and IV cover the design details of the Tx and the Rx, respectively, describing circuit techniques for each block. Section V presents the on-board antenna design together with the assembly technique. Section VI summarizes the measurement results.

## II. TRANSCEIVER ARCHITECTURE

### A. Overview of OOK Modulation

To realize a simple transceiver architecture with non-coherent modulation, we resort to on-off keying (OOK), which is the simplest form of amplitude-shift keying. It is well known that the OOK power spectral density is a sinc-squared function with a mainlobe width of $2/T_b$ around the carrier $f_c$ [Fig. 2(a)], where $T_b$ denotes the bit period. Similar to frequency-shift keying (FSK) and binary phase-shift keying (BPSK), OOK supports a data rate up to half the available bandwidth [13]. For the 60-GHz band, the maximum data rate would be about 3.5 Gb/s, which is adequate for many high-speed applications. It can also be shown [13] that the error probability (or bit error rate, BER) of non-coherent OOK demodulation is given by

$$BER = \frac{1}{2} \exp\left(-\frac{E_b}{2N_0}\right) + \frac{1}{2}Q\left(\sqrt{\frac{E_b}{N_0}}\right) \tag{2}$$



Fig. 3.  $VCO_1$  and OOK modulator.

where $E_b$ and $N_0$ denote the average bit energy and noise power spectral density, respectively. The error rate of OOK demodulation is very close to that of FSK, but is inferior to that of coherent BPSK [which is $Q(\sqrt{2E_b/N_0})$]. However, to avoid the complicated carrier recovery circuit, we choose OOK for such a short-distance system. Note that none of the published 60-GHz solutions utilizing BPSK, QPSK, or 16-QAM [1], [14]–[16] can discard the baseband circuitry.
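To make the comparison between Eq. (2) and coherent BPSK concrete, the sketch below evaluates both expressions for a few values of $E_b/N_0$. The function names and the chosen $E_b/N_0$ values are illustrative assumptions; only the two formulas come from the text.

```python
import math


def q_function(x):
    """Gaussian tail probability Q(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ber_ook_noncoherent(ebn0_db):
    """Eq. (2): BER of non-coherent OOK for Eb/N0 given in dB."""
    ebn0 = 10 ** (ebn0_db / 10)
    return 0.5 * math.exp(-ebn0 / 2) + 0.5 * q_function(math.sqrt(ebn0))


def ber_bpsk_coherent(ebn0_db):
    """BER of coherent BPSK, Q(sqrt(2*Eb/N0)), for Eb/N0 given in dB."""
    ebn0 = 10 ** (ebn0_db / 10)
    return q_function(math.sqrt(2 * ebn0))


# Illustrative Eb/N0 points in dB (assumed values, not measured data)
for ebn0_db in (10, 14, 17):
    print(ebn0_db, ber_ook_noncoherent(ebn0_db), ber_bpsk_coherent(ebn0_db))
```

At every point the BPSK error rate is lower, which is the penalty accepted here in exchange for dropping carrier recovery.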

### B. Architecture

The transceiver design is illustrated in Fig. 2(b). Both the Tx and the Rx are fully integrated and co-designed with the on-board antennas. The transmitter consists of a 60-GHz VCO $(VCO_1)$, an OOK modulator, and a pseudo-differential power amplifier (PA) made of two identical single-ended amplifiers. The input data directly modulates the 60-GHz clock before it is delivered to the PA. The receiver comprises a pseudo-differential LNA, a double-balanced mixer, an IF amplifier, an OOK demodulator, and the subsequent limiting amplifier. The RF signal from the antenna is amplified by the LNA and then down-converted by the mixer to about 10 GHz with another on-chip VCO (VCO<sub>2</sub>) providing the 50-GHz LO signal. The choice of $VCO_2$ frequency is a compromise between the maximum data rate and the IF amplifier bandwidth. Presuming the maximum data rate to be 5 Gb/s, we choose 10 GHz as the intermediate frequency since it ensures at least two cycles of IF signal in each bit after down-conversion. It also avoids the use of inductors in the IF amplifier. As compared with direct OOK detection such as [17], the demodulator in this architecture need not deal with the 60-GHz signal, saving power and reducing circuit complexity considerably. (The transceiver in [17] consumes 1.3 W.) Here, the pseudo-differential realization of the PA and the LNA facilitates the chip-antenna coupling. The OOK modulation obviates the need for complicated interfacing digitizers and subsequent DSPs. Note that the baseband power consumption is no longer a trivial issue since the data rate usually reaches Gb/s. The proposed modulator/demodulator provides analog signal processing with efficiency and robustness. Since non-coherent OOK needs no precise frequency alignment, we incorporate open-loop oscillators in both Tx and Rx for simplicity. As will be demonstrated in Section IV, the frequency drift between VCO<sub>1</sub> and VCO<sub>2</sub> does not affect the BER of this transceiver. If necessary, the Tx structure can be easily modified to comply with the standard frequency arrangement [4] by introducing an mm-wave frequency synthesizer [18], [19]. In this prototype, we limit the maximum data rate to about $3.3 \sim 3.5$ Gb/s to accommodate the approximately 7-GHz available bandwidth. In fact, as shown in Section VI, the transceiver does achieve error-free operation (BER < $10^{-12}$) at 3.3 Gb/s for 2 cm.<sup>1</sup> Note that this prototype is intended to demonstrate the feasibility of 60-GHz low-cost systems, so the whole 7-GHz bandwidth is allocated to one channel. We are not designing a commercially ready chip that satisfies the specifications of 60-GHz standards.
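The frequency plan described above reduces to a few lines of arithmetic. The sketch below is illustrative only; the helper names are assumptions, and it simply restates the numbers given in the text (7-GHz bandwidth, 60-GHz RF, 50-GHz LO, 5-Gb/s presumed maximum rate).

```python
def max_ook_data_rate(bandwidth_hz):
    """OOK main lobe is 2/T_b wide, so the bit rate is at most half the bandwidth."""
    return bandwidth_hz / 2


def if_cycles_per_bit(if_hz, data_rate_bps):
    """Number of IF carrier cycles contained in one bit period."""
    return if_hz / data_rate_bps


rf_hz = 60e9
lo_hz = 50e9
if_hz = rf_hz - lo_hz      # intermediate frequency after down-conversion
image_hz = lo_hz - if_hz   # image band on the other side of the LO

print(max_ook_data_rate(7e9))         # maximum bit rate for the 7-GHz band
print(if_cycles_per_bit(if_hz, 5e9))  # IF cycles per bit at 5 Gb/s
print(image_hz)                       # image frequency
```

The results are 3.5 Gb/s, two IF cycles per bit, and an image at 40 GHz, consistent with the figures used elsewhere in the paper.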

The link budget is shown in Fig. 2(c). The PA provides an output power of about 5 dBm, which leads to -35 dBm input power<sup>2</sup> at the Rx front-end after a 1-meter radiation distance. Here we assume the total antenna gain is equal to $14 \times 2 = 28$ dB. The LNA and the mixer/IF amplifier each provide $15 \sim 20$ dB of gain, delivering approximately -3 dBm input power to the demodulator.<sup>3</sup> In the real implementation, however, the LNA suffers from slight gain degradation, so the longest error-free distance reduces to 61 cm (Section VI).
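The link budget can be reproduced as a short calculation. This is a sketch under the assumptions stated in the text (5-dBm PA output, 14 dB of gain per antenna, 1-m distance); the 32-dB front-end gain is taken from the upper end of the estimate quoted in Section VI, and the variable names are illustrative.

```python
import math

C = 299_792_458.0  # speed of light in m/s


def free_space_loss_db(freq_hz, distance_m):
    """Isotropic free-space path loss in dB."""
    return 20 * math.log10(4 * math.pi * distance_m * freq_hz / C)


pa_out_dbm = 5.0              # PA output power
antenna_gain_db = 14.0 * 2    # Tx plus Rx antenna gain
path_loss_db = free_space_loss_db(60e9, 1.0)

rx_in_dbm = pa_out_dbm + antenna_gain_db - path_loss_db
front_end_gain_db = 32.0      # assumed LNA + mixer + IF amplifier gain
demod_in_dbm = rx_in_dbm + front_end_gain_db

print(round(path_loss_db, 1), round(rx_in_dbm, 1), round(demod_in_dbm, 1))
```

The path loss of about 68 dB gives roughly -35 dBm at the Rx input and roughly -3 dBm at the demodulator input.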

## III. TRANSMITTER BLOCKS

### A. VCO and Modulator

The VCO<sub>1</sub> and modulator designs, along with the waveforms at important nodes, are shown in Fig. 3. Employing the $3\lambda/4$ technique proposed in [18], this VCO extends the resonating inductors twice between the VCO<sub>1</sub> and the modulator to reduce the equivalent loading capacitance seen looking into

<sup>1</sup>This measurement is conducted in a laboratory. We do recognize that in an actual environment, the transceiver performance may be degraded due to multi-path and interference issues.

<sup>2</sup>This value is conservative because we target BER $< 10^{-12}$.

<sup>3</sup>Simulation suggests that the demodulator requires an input of -3 dBm (SNR $\approx 17$ dB) to achieve the specified BER.



Fig. 4. Power amplifier design.

the $M_4$–$M_5$ pair at a cost of slightly higher loss. Here, the two inductors in series in each arm can be modeled as a $\lambda/2$ transmission line [18]. Due to the loss, the loading capacitance at nodes X and Y could appear differently at the VCO side. That is, the loading impedance rotates with a gradually reduced radius on the Smith chart when moving away from the load. For a purely capacitive loading at nodes X and Y and an inductor Q of 5, we obtain a capacitance reduction of about 20% with some additional loss (which will be absorbed by the negative resistance created by the cross-coupled pair). At 60 GHz, the equivalent capacitance reduction becomes more significant since the loading at X and Y is no longer purely capacitive. Nonetheless, transistor-level simulation confirms this observation. Without this technique, the $VCO_1$ with the same power consumption and device dimensions would oscillate at only 36 GHz. Here, the VCO design basically follows that of [18], in which theoretical analysis regarding the $3\lambda/4$ technique is addressed from different aspects in detail. The level-boosting transistor $M_3$ imitates the on-resistance of the bottom switch $M_8$. It not only raises the common-mode level of $CK_{out}$, but also helps to establish a self-biased dc coupling between the two blocks. That is, with proper scaling, the pair $M_4$–$M_5$ experiences optimal input driving from VCO<sub>1</sub>.

In the modulator, switches  $M_6$ ,  $M_7$ , and  $M_8$  completely block the input clock when they are turned off, preventing potential signal leakage to the output. The pMOS switch  $M_9$  is also introduced here to achieve a quick shut off at the output. Simulation shows that such a resetting accelerates the on-off transitions by at least 80 ps, making the transceiver more robust for high data rate. Note that a slight offset on the VCO frequency exists between data Zero and data One since the gate capacitance of  $M_4$  and  $M_5$  would decrease when they are off. It is not an issue at all in OOK modulation because nothing is delivered to the output in the Zero state.

### B. Power Amplifier

The power amplifier design as well as the device parameters are depicted in Fig. 4. Here, the matching between the PA and the modulator is realized as conjugate matching to deliver the maximum power. That is, the on-chip inductors and capacitors $L_1$, $L_2$, $C_1$, and $C_2$ form an impedance of 20 - j70 ($\Omega$) seen looking into the input at 60 GHz, while the modulator's output presents an impedance of 20 + j70 ($\Omega$). Conjugate matching is also applied between the five PA stages. The PA's output impedance is designed as purely 50 $\Omega$ in order to match the 100-$\Omega$ differential antenna. Such an arrangement facilitates the co-design of the chip and the on-board antenna, since a short transmission line is required as the interconnection between the two, and it is usually implemented as 50 $\Omega$. Simulation suggests that the five-stage class-A architecture achieves a peak gain of 10.8 dB, a 1-dB compression point ($P_{1dB}$) of 7.2 dBm, and a maximum power-added efficiency (PAE) of 8.5%. The -10-dB output matching bandwidth is estimated to be 22 GHz. Two identical single-ended PAs are employed to accommodate the differential input of the antennas, arriving at pseudo-differential operation and better common-mode rejection.
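The benefit of the conjugate match between the modulator and the PA input can be quantified with the standard power-transfer expression $4R_SR_L/|Z_S+Z_L|^2$. The sketch below uses the two impedances quoted in the text; the 50-$\Omega$ comparison load is a hypothetical alternative added only for contrast, and the function name is illustrative.

```python
def power_transfer_fraction(z_source, z_load):
    """Fraction of the source's available power that reaches the load."""
    return 4 * z_source.real * z_load.real / abs(z_source + z_load) ** 2


z_modulator = complex(20, 70)   # modulator output impedance at 60 GHz
z_pa_input = complex(20, -70)   # PA input impedance at 60 GHz

matched = power_transfer_fraction(z_modulator, z_pa_input)
# Hypothetical comparison: the same source driving a plain 50-ohm load
unmatched = power_transfer_fraction(z_modulator, complex(50, 0))

print(matched, unmatched)
```

The conjugate pair transfers all of the available power, whereas the hypothetical 50-$\Omega$ load would receive only about 41% of it.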

## IV. RECEIVER BLOCKS

### A. LNA and Mixer

The simple OOK modulation scheme allows one-step down-conversion to 10 GHz for non-coherent demodulation. Fig. 5(a) shows the receiver front-end. Similar to the PAs, two identical LNAs are employed to receive the differential signal from the antenna. Each LNA contains three stages. Fig. 5(b) illustrates the structure of the second and the third stages. Here, a cascode topology with a shunt-peaking technique is used to achieve better isolation and higher conversion gain. Interestingly, the peaking inductor $L_P$ not only resonates out the parasitic capacitance associated with the internal node P, but also provides image rejection to some extent if the series capacitor $C_S$ is properly chosen [20], [21]. As shown in Fig. 5(b), we calculate the equivalent impedance of the peaking network and obtain

$$Z_{\rm in} = \frac{1 + s^2 L_P C_S}{s \left[ (C_P + C_S) + s^2 C_S C_P L_P \right]}. \tag{3}$$

That is, the impedance drops to zero at $(L_PC_S)^{-1/2}$ and approaches infinity at the peaked frequency $[L_PC_SC_P/(C_S + C_P)]^{-1/2}$. In other words, the circuit allows more RF signal current to flow toward the load and rejects the image by shorting its current to ground. Fig. 5(c) shows the simulated image rejection for different $C_S$. Although in this design the image at 40 GHz is somewhat close to the 60-GHz RF, we can still improve the image rejection by approximately 28 dB through this approach. The first stage is designed similar to that of Fig. 5(b) but with no shunt-peaking network, to facilitate the input matching and compact layout. Again, conjugate matching is used between stages. To cover a larger bandwidth, the three stages resonate at slightly different frequencies.
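The zero and pole of Eq. (3) fix the ratio $C_S/C_P$ once the image and RF frequencies are chosen, since $(\omega_{pole}/\omega_{zero})^2 = 1 + C_S/C_P$. The sketch below places the zero at 40 GHz and the pole at 60 GHz. The parasitic capacitance of 40 fF is a hypothetical value chosen for illustration; the actual component values are not given in the text.

```python
import math


def peaking_network_design(f_zero_hz, f_pole_hz, c_p):
    """Solve Eq. (3) for L_P and C_S given the desired zero, pole, and C_P."""
    w_zero = 2 * math.pi * f_zero_hz
    w_pole = 2 * math.pi * f_pole_hz
    c_s = c_p * ((w_pole / w_zero) ** 2 - 1)  # from w_pole^2 = w_zero^2 * (1 + C_S/C_P)
    l_p = 1 / (w_zero ** 2 * c_s)             # from w_zero^2 = 1 / (L_P * C_S)
    return l_p, c_s


def z_in(freq_hz, l_p, c_s, c_p):
    """Eq. (3) evaluated at s = j*2*pi*f."""
    s = 2j * math.pi * freq_hz
    return (1 + s ** 2 * l_p * c_s) / (s * ((c_p + c_s) + s ** 2 * c_s * c_p * l_p))


C_P = 40e-15  # hypothetical parasitic capacitance at node P
L_P, C_S = peaking_network_design(40e9, 60e9, C_P)
f_pole = 1 / (2 * math.pi * math.sqrt(L_P * C_S * C_P / (C_S + C_P)))

print(L_P, C_S, f_pole)
print(abs(z_in(40e9, L_P, C_S, C_P)), abs(z_in(59e9, L_P, C_S, C_P)))
```

For this choice $C_S = 1.25\,C_P$, the impedance is close to a short at the 40-GHz image, and it rises sharply as the frequency approaches 60 GHz.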

The double-balanced mixer design is shown in Fig. 6(a), which incorporates capacitive coupling to feed the RF signal into the mixer core. Similar to [22], the input directly applies to the common-source nodes A and B of the switching quad, saving one stage of devices and significant voltage headroom. The parasitic capacitances associated with the common-source nodes are resonated out by the loading inductors  $L_D$  of the last LNA stage. Resistive loading  $(R_D)$  is sufficient here to provide enough conversion gain while rejecting the undesired LO coupling. Note that a differential output is mandatory for the subsequent demodulator design, which will be described in the next section. Fig. 6(b) depicts the simulated performance



Fig. 5. (a) Rx front-end architecture. (b) 2nd and 3rd LNA stage. (c) Simulated image rejection.



Fig. 6. (a) Mixer. (b) Simulated Rx front-end performance.

of the Rx front-end. The LNA+mixer combination is expected to provide a bandwidth of 5 GHz for 20-dB gain, a minimum noise figure (NF) of 7.3 dB, and a -10-dB return loss ( $S_{11}$ ) bandwidth of greater than 20 GHz.

### B. Demodulator

Conventional envelope detectors such as common-source rectifiers [23] suffer from a few drawbacks. Consider the circuit shown in Fig. 7(a), where the non-zero (differential) input gets rectified and indicates a logic One. The output swing here is usually quite small. Assuming complete switching on $M_1$ and $M_2$ at peak inputs and neglecting body effect and channel-length modulation, we derive the output magnitude $A_{\text{out1}}$ as

$$A_{\text{out1}} \cong \frac{1}{2} (V_{GS1} + A_{\text{in}} - V_{GS2}) \tag{4}$$

where  $V_{GS1}$  and  $V_{GS2}$  denote the gate-source voltages under zero and peak inputs, respectively, and  $A_{in}$  the input magnitude. Note that the factor 1/2 represents the averaging introduced by  $C_P$ . It follows that

$$A_{\text{out1}} \cong \frac{1}{2} \left[ A_{\text{in}} - (\sqrt{2} - 1) \sqrt{\frac{I_{\text{SS}}}{\mu_n C_{\text{ox}} (W/L)_{1,2}}} \right]. \tag{5}$$

In other words,  $A_{out1}$  is proportional to  $A_{in}$  with a slope of 1/2 and a large offset. Such a signal reduction leads to poor signal-to-noise ratio (SNR) and degrades the performance significantly. Higher power consumption may be required as well since the faint output needs to be amplified after all.
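The step from Eq. (4) to Eq. (5) follows from the square-law model: at zero input each device carries $I_{SS}/2$, whereas at peak input one device carries all of $I_{SS}$, so the two overdrive voltages differ by a factor of $\sqrt{2}$. The sketch below checks numerically that the two equations agree. All device values ($I_{SS}$, $\mu_nC_{ox}(W/L)$, $V_{TH}$, $A_{in}$) are hypothetical and chosen only for illustration.

```python
import math


def overdrive(i_d, k):
    """Square-law overdrive V_GS - V_TH for drain current i_d, with k = mu_n*C_ox*(W/L)."""
    return math.sqrt(2 * i_d / k)


def a_out1_eq4(a_in, i_ss, k, v_th):
    """Eq. (4), with V_GS1 and V_GS2 obtained from the square-law model."""
    v_gs1 = v_th + overdrive(i_ss / 2, k)  # zero input: each device carries I_SS/2
    v_gs2 = v_th + overdrive(i_ss, k)      # peak input: one device carries all of I_SS
    return 0.5 * (v_gs1 + a_in - v_gs2)


def a_out1_eq5(a_in, i_ss, k):
    """Eq. (5): closed form with the threshold voltage cancelled."""
    return 0.5 * (a_in - (math.sqrt(2) - 1) * math.sqrt(i_ss / k))


# Hypothetical values for illustration only
I_SS, K, V_TH, A_IN = 1e-3, 10e-3, 0.4, 0.3
print(a_out1_eq4(A_IN, I_SS, K, V_TH), a_out1_eq5(A_IN, I_SS, K))
```

Both forms give the same output, and with these assumed values the offset term removes a large share of the 0.3-V input before the factor of 1/2 is applied.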

To overcome the difficulties, we propose a much more efficient approach combining rectification with amplification. As depicted in Fig. 7(b), the modified circuit rectifies the differential input by the common-source amplifiers  $M_1$  and  $M_2$ , which



Fig. 7. Output swing analysis for (a) conventional, (b) proposed rectifiers, and (c) simulated results.



Fig. 8. (a) OOK demodulator. (b) Received data eye opening as a function of intermediate frequency.

are class-AB biased. By the same token, we obtain the output swing  $A_{\rm out2}$ 

$$A_{\text{out2}} \cong \frac{1}{4} \mu_n C_{\text{ox}} (W/L)_{1,2} R \cdot A_{\text{in}}^2. \tag{6}$$

Here, we assume $M_1$ and $M_2$ are barely on for the case of logic Zero. The bias current flowing through $M_1$ (or $M_2$) is 280 $\mu$A. Now the output becomes proportional to the square of the input, increasing the magnitude significantly. Fig. 7(c) compares the output swings of the two approaches in transistor-level simulation. With a differential input of 600 mV, for example, the proposed rectifier provides more than 4 times larger output than the conventional one. It is worth noting that the single-ended structure in Fig. 7(b) is inevitably subject to supply coupling. Special care is therefore taken in layout to ensure proper bypassing around the local power line. The input magnitude is also enlarged as much as possible to increase the SNR. The complete demodulator design is shown in Fig. 8(a). Here, the class-AB biasing is accomplished by means of $M_3$, $R_b$, and $I_b$. The $R_1$-$C_1$ network filters out the undesired 20-GHz ripple while allowing the recovered data to pass through. That is, its corner frequency must lie between the data rate and twice the IF. The single-ended output $V_Q$ is converted back to a differential signal by extracting the dc level through $R_2$ and $C_2$. As a result, we obtain the following design rule:

$$\frac{1}{2\pi R_2 C_2} \ll \text{Data Rate} < \frac{1}{2\pi R_1 C_1} < 20 \text{ GHz.} \tag{7}$$

With the design parameters listed in Fig. 8(a), the upper and lower bounds of the data rate are 1.65 Gb/s and 1.59 Mb/s, respectively. Note that the demodulator would introduce 7 dB loss if we were to deliver 3.3-Gb/s data. If necessary, it is possible to raise the upper bound to further speed up the operation, but the unwanted ripple would also degrade the SNR. Since  $V_Q$  is relatively faint, the generated  $D_{out}$  would need to be further enlarged to a typical logic level ( $\approx$ 500 mV). This task is carried out by the subsequent limiting amplifier, which is realized as a two-stage buffer with resistive loads. The maximum gain of the limiter is 24 dB. Although not observed in this prototype, the



Fig. 9. (a) Impedance of simple dipole and folded dipole antenna. (b) Assembly of folded-dipole antenna. (c) HFSS simulation setup and its radiation pattern.



Fig. 10. (a) Patch array layout (unit:  $\mu$ m), (b) radiation pattern for 4 × 3 elements.

finite offset of the limiting amplifier may degrade the signal integrity. A compensation pair  $M_8-M_9$  is also introduced to neutralize any possible mismatch along the data path. If necessary, automatic offset cancellation techniques [24], [25] can be directly applied as well to further improve the performance.

It is worth noting that the OOK modulation presents great immunity against VCO frequency drift. Fig. 8(b) illustrates the eye opening as a function of IF for 2-Gb/s data rate. The eye opening stays at around 80% (due to ripples) for all the cases. In other words, the signal integrity is barely affected even for a few GHz frequency variation, primarily because the bit period (500 ps) is much longer than the  $R_1C_1$  time constant (96 ps). That implies the BER can be maintained even when the VCO<sub>2</sub> suffers from significant frequency deviation.
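The numbers quoted for the demodulator filter follow from the $R_1C_1$ time constant. The sketch below treats the $R_1$-$C_1$ network as an ideal first-order low-pass filter, which is a simplifying assumption; the function names are illustrative.

```python
import math


def corner_frequency(tau_s):
    """-3-dB corner of a first-order RC network with time constant tau."""
    return 1 / (2 * math.pi * tau_s)


def lowpass_loss_db(freq_hz, tau_s):
    """Attenuation in dB of an ideal first-order RC low-pass filter."""
    return 10 * math.log10(1 + (2 * math.pi * freq_hz * tau_s) ** 2)


TAU_1 = 96e-12                               # R1*C1 time constant from the text
f_upper = corner_frequency(TAU_1)            # upper bound of the data rate in Eq. (7)
loss_3g3 = lowpass_loss_db(3.3e9, TAU_1)     # loss at 3.3 GHz
loss_ripple = lowpass_loss_db(20e9, TAU_1)   # suppression of the 20-GHz ripple
tau_2 = 1 / (2 * math.pi * 1.59e6)           # R2*C2 implied by the 1.59-Mb/s lower bound

print(f_upper, loss_3g3, loss_ripple, tau_2)
```

Under this assumption the corner falls near 1.66 GHz, the loss at 3.3 GHz is about 7 dB, the 20-GHz ripple is attenuated by more than 20 dB, and the implied $R_2C_2$ time constant is about 100 ns.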

## V. ANTENNA DESIGN

### A. General Consideration

Antenna design and its integration with transceivers are of great concern at mm-wave frequencies. From (1) we see that the propagation loss at 60 GHz is at least 20 dB worse than that of the existing 5-GHz RF systems, since the received power is proportional to $\lambda^2$ for isotropic radiation. That is why the 60-GHz band is usually dedicated to short-distance applications.
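Because the received power in (1) scales with $\lambda^2$, the extra loss when moving from one carrier frequency to another at a fixed distance is $20\log_{10}$ of the frequency ratio. A minimal check, with an illustrative function name:

```python
import math


def extra_loss_db(f_high_hz, f_low_hz):
    """Increase in isotropic free-space loss when the carrier moves from f_low to f_high."""
    return 20 * math.log10(f_high_hz / f_low_hz)


penalty_db = extra_loss_db(60e9, 5e9)
print(round(penalty_db, 1))
```

The penalty between 5 GHz and 60 GHz is about 21.6 dB, in line with the "at least 20 dB" statement.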



Fig. 11. (a) Die photos, (b) board assembly (with folded dipole).

Several attempts have been made to realize a more integrated antenna assembly. For example, it is always desirable to put the antenna on chip [26], [27]. However, this approach is not very realistic (at least in present technology), since the highly-doped silicon substrate ($\rho < 10\ \Omega\cdot\text{cm}$) causes very low radiation efficiency (<10%). Using a silicon lens [8] could overcome this issue to some extent, but it involves a complicated post-fabrication process. A planar antenna [1] might be an alternative, but the yield is not clear. Micromachining techniques such as deep-trench etching may be used to remove the substrate under the antenna, but they also lead to other issues. For instance, the



Fig. 12. PA measurement results: (a) setup, (b) large signal performance, (c)  $S_{22}$ .



Fig. 13. Rx front-end measurements: (a) setup, (b) gain and NF, (c) linearity.

trenched area must be large enough (at least  $1 \times 1 \text{ mm}^2$ ) to accommodate the entire cone-shaped radiation pattern, which in turn requires long interconnection between the core circuit and the antenna. Meanwhile, the whole silicon may become vulnerable, and pulling could be induced because of the substrate coupling. As of now, the on-board antenna is still believed to be the best candidate (even at 60 GHz) in terms of cost and yield. Here we use Rogers RO4003 board [12] for the antenna design. It provides a dielectric constant which is low enough ( $\varepsilon_r = 3.38$ ) to suppress all the higher order modes (other than  $TM_0$ ) of the surface wave. The distance h from the antenna to its ground plane is chosen as a compromise between surface wave suppression and bandwidth. To fully evaluate the transmission capability, we have designed the antenna in two different types, namely, the folded dipole and patch array. The former is a single-element antenna differentially connected to the PA and the LNA, while the latter comprises antenna arrays with different number of elements. We look at design details in the following subsections.

### B. Folded Dipole

The folded dipole structure is superior to a simple dipole since it provides a much higher input impedance. As depicted in Fig. 9(a), the matching would become very difficult if we were to use the simple dipole structure. Here, we choose a dielectric height of $h = 508\ \mu$m, which provides 100 $\Omega$ differential (50 $\Omega$ single-ended) impedance with a -10-dB bandwidth of 8 GHz. If a thinner substrate were used, it would become very difficult to achieve 100-$\Omega$ impedance matching. The antenna gain and bandwidth would be degraded as well. Meanwhile, we cannot use a thicker substrate either. For example, if $h = 800\ \mu$m,



Fig. 14. Setup for antenna measurements.

higher-order modes could appear for 60-GHz operation. It is indeed possible to design a non-50-$\Omega$ antenna and the related matching interface. However, to facilitate testing, we adopt the standard impedance design. Note that a simple dipole with the same *h* would present a differential impedance of only 25.8 $\Omega$. Generally speaking, the folded dipole arrives at a larger bandwidth because its reactance moves more slowly on the Smith chart.

To avoid disturbing the near-end electromagnetic field, the chip should stay away from the antenna by a few millimeters (the equivalent wavelength at 60 GHz on board is 2.8 mm). A differential transmission line of 100 $\Omega$ is therefore required. On the other hand, in a generic process, the thickness of the die may be less than 508 $\mu$m, resulting in longer bonding wires and larger parasitic inductance. In this prototype, we raise the chip by placing BGA tin balls [28] underneath to minimize the bonding distance. Each output node on chip has three gold wires (with 20-$\mu$m diameter) connected in parallel to the antenna trace on board, presenting an overall parasitic inductance of less



[Figure plot: $S_{11}$ (dB) and gain (dBi) versus frequency (GHz), measured and simulated, with E-plane and H-plane radiation patterns.]

Fig. 16. Measured gain, matching, and radiation patterns of  $4 \times 3$  array.

than 100 pH. We do the estimation by measuring the S-parameters of the antenna modules with and without wire bonding. After subtraction, we could obtain the equivalent bonding inductance accordingly. The three-pad arrangement of the RF outputs also preserves the flexibility for on-chip probing on PA and LNA. All the low-speed, dc, and power lines are directly wire bonded to the main board, which is made of regular FR4 material. Fig. 9(c) shows the radiation pattern simulated in HFSS [29]. The folded-dipole antenna is expected to present a gain of 6.1 dBi, bandwidth of 8 GHz, and beamwidth of 72°.
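To see why the bonding inductance matters at 60 GHz, the sketch below converts an inductance into its series reactance and shows the effect of paralleling wires. The 300-pH single-wire inductance is a hypothetical value, and mutual coupling between the wires is ignored, so this is only a first-order illustration.

```python
import math


def parallel_inductance(l_single_h, n_wires):
    """Ideal parallel combination of n identical, uncoupled inductors."""
    return l_single_h / n_wires


def reactance_ohm(l_h, freq_hz):
    """Series reactance of an inductor at the given frequency."""
    return 2 * math.pi * freq_hz * l_h


l_total = parallel_inductance(300e-12, 3)     # hypothetical 300 pH per wire, three wires
x_single = reactance_ohm(300e-12, 60e9)       # one wire alone
x_total = reactance_ohm(l_total, 60e9)        # three wires in parallel
x_bound = reactance_ohm(100e-12, 60e9)        # the 100-pH bound quoted in the text

print(x_single, x_total, x_bound)
```

Even 100 pH corresponds to roughly 38 $\Omega$ at 60 GHz, which is comparable to the 50-$\Omega$ single-ended impedance and explains the effort spent on shortening the wires.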

### C. Patch

To achieve a longer communication distance, we resort to a patch antenna array. To fit the differential PA and LNA, the array must be arranged as illustrated in Fig. 10(a). Here, the signal flows through a differential transmission line and distributes into the elements on both sides. This arrangement ensures constructive radiation, since all the signal currents flow in the same direction at any instant. In other words, the patches have the same direction of polarization. To resonate the patch antenna at its fundamental mode $TM_{010}$, we need a thinner dielectric height ($h \approx 203\ \mu$m), so the tin balls under the chip are not necessary. By enlarging the array, we concentrate more radiated energy and increase the antenna gain at a cost of lower bandwidth and beamwidth. However, due to the loss of the line, the antenna gain reaches a maximum of around 16 dBi for the case of 4 × 3 elements. Fig. 10(b) shows the radiation pattern. Note that the beam would be tilted by a few degrees, because the relatively wide bandwidth affects the equivalent wavelength on the transmission lines and the patch phase. The -10-dB return loss bandwidth has been reduced to 1-2 GHz for the $4 \times 3$ element array.


## VI. EXPERIMENTAL RESULTS

The transceiver chipset has been designed and fabricated in 90-nm CMOS technology. The transmitter consumes 183 mW and the receiver 103 mW. Fig. 11(a) shows the die micrographs, where the Tx occupies  $0.43 \text{ mm}^2$  and the Rx  $0.68 \text{ mm}^2$  including pads. A testing board with folded dipole antenna is also illustrated in Fig. 11(b). To fully evaluate the performance, we have also made individual blocks for testing. The measurement results are summarized as follows.

### A. PA

The PA testing setup and performance are depicted in Fig. 12. Here the device under test is slightly modified to have 50-$\Omega$ input impedance to facilitate the coupling to the equipment. It achieves a peak gain of 9.6 dB, $P_{1dB}$ of 5 dBm, and maximum power-added efficiency (PAE) of 4.3% while consuming 84 mW from a 1.2 V supply. Inaccurate device modeling at 60 GHz may account for the difference between simulated and measured results. The output return loss ($S_{22}$) of the PA reveals a -10-dB bandwidth of at least 15.8 GHz (from 49.2 GHz to 65 GHz), suggesting good matching to the antenna. Note that the actual range could be even larger because the network analyzer (37397D) itself is limited to 65 GHz.



Fig. 17. Measured OOK signal at PA's output in (a) time-domain (data rate = 2.5 Gb/s) (vertical scale: 70 mV/div, horizontal scale: 800 ps/div), (b) frequency domain (data rate = 1 Gb/s).



Fig. 18. Recovered data at 3 Gb/s with  $2^{31} - 1$  PRBS input. (Vertical scale: 60 mV/div, horizontal scale: 150 ps/div).

#### B. LNA/Mixer

We have also measured a standalone Rx front-end (LNA+mixer+IF amplifier). This test chip includes an LNA and a mixer similar to those in the integrated Rx, but with slightly different device sizes and routing methods. Fig. 13 reveals the setup as well as the results. The LNA input match $(S_{11})$ presents a −10-dB bandwidth of 13.6 GHz (from 51.4 GHz to 65 GHz, equipment limited). With a supply of 1.2 V, the front-end achieves a maximum conversion gain of 24.8 dB at 57 GHz and a −3-dB bandwidth of 3.6 GHz (from 54.7 GHz to 58.3 GHz) around it. The measured minimum noise figure (NF) is 7.3 dB. The linearity is also recorded. As can be observed in Fig. 13(b), the $P_{1dB}$ and IIP<sub>3</sub> at 57 GHz are −25.9 dBm and −16.8 dBm, respectively. Note that the slight frequency deviation ($\approx 3$ GHz) of this testing block is also due to inaccurate modeling of passive devices, and has been corrected in the final integrated version. In the integrated version, the LNA+mixer+IF amplifier gain is estimated to be $30 \sim 32$ dB.
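The measured numbers above allow two quick sanity checks, sketched below under stated assumptions. First, an input-referred noise floor is estimated from the minimum NF, taking the 3.6-GHz −3-dB bandwidth as the noise bandwidth; this is an assumption, since the effective bandwidth of the integrated receiver is set by its later stages and the NF is not constant across the band. Second, the gap between IIP<sub>3</sub> and $P_{1dB}$ is compared with the roughly 9.6 dB expected from a memoryless third-order nonlinearity. The helper name `noise_floor_dbm` is hypothetical.

```python
import math

THERMAL_NOISE_DBM_PER_HZ = -174.0  # kT near 290 K, the usual approximation


def noise_floor_dbm(noise_figure_db, bandwidth_hz):
    """Input-referred noise power in dBm for a given NF and noise bandwidth."""
    return (
        THERMAL_NOISE_DBM_PER_HZ
        + 10.0 * math.log10(bandwidth_hz)
        + noise_figure_db
    )


NF_DB = 7.3        # measured minimum NF, from the text
BW_HZ = 3.6e9      # -3-dB bandwidth, ASSUMED here to equal the noise bandwidth
P1DB_DBM = -25.9   # measured input P1dB at 57 GHz
IIP3_DBM = -16.8   # measured IIP3 at 57 GHz

floor_dbm = noise_floor_dbm(NF_DB, BW_HZ)
gap_db = IIP3_DBM - P1DB_DBM

print(f"estimated noise floor: {floor_dbm:.1f} dBm")
print(f"IIP3 - P1dB:           {gap_db:.1f} dB (third-order model: about 9.6 dB)")
```

The measured gap of about 9.1 dB is close to the third-order prediction, so the two linearity readings are mutually consistent.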

#### C. Antenna

Both antenna designs are examined thoroughly and independently. Due to the lack of a fully-differential signal source at 60 GHz, we use the Tx to drive the antenna directly (Fig. 14). By setting the input data to logic One, the antenna radiates a continuous wave at around 60 GHz for the receiving horn to capture. Since the PA's output power, the distance between the testing antenna and the horn, and the gain of the horn antenna are available, we can estimate the antenna gain accordingly. Note that the distance L must be greater than $2D^2/\lambda$ in order to capture the far-field radiation, where D denotes the aperture diameter of the horn [30]. Here we choose L = 25 cm in our testing setup. The radiation pattern is obtained in a similar manner. Fig. 15 illustrates the impedance matching, gain, and antenna patterns in the E- and H-planes for the folded dipole. The −10-dB $S_{11}$ bandwidth measures 8 GHz and the gain 5.5 dBi, which fit the simulation closely. The beamwidths in the E-plane and H-plane are 78° and 66°, respectively. Fig. 16 depicts the measurement results for the $4 \times 3$ patch array. The peak gain is improved to 14.2 dBi, while the −10-dB bandwidth is reduced to about 1 GHz. The beamwidth in both the E- and H-planes becomes 22°. The measured gains for patch antenna arrays with $2 \times 2$, $2 \times 3$, and $4 \times 4$ elements are 8, 10, and 13 dBi, respectively.

Fig. 19. BER test for (a) folded dipole, (b) $4 \times 3$ patch antenna ($2^{31} - 1$ PRBS).

Fig. 20. Demonstration of high-speed real-time video signal delivery.
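The far-field condition and the gain estimate described in the antenna measurement above can be sketched with the Friis transmission equation. This is a minimal illustration, not the exact procedure used in the measurement: the transmit power, horn gain, and received power below are hypothetical round numbers chosen only to exercise the code, and cable and connector losses are ignored. Only the 60-GHz frequency and L = 25 cm come from the text.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s


def wavelength_m(freq_hz):
    return C0 / freq_hz


def max_aperture_for_far_field_m(distance_m, freq_hz):
    """Largest aperture D that still satisfies distance >= 2 * D**2 / lambda."""
    return math.sqrt(distance_m * wavelength_m(freq_hz) / 2.0)


def free_space_path_loss_db(distance_m, freq_hz):
    return 20.0 * math.log10(4.0 * math.pi * distance_m / wavelength_m(freq_hz))


def antenna_gain_dbi(p_rx_dbm, p_tx_dbm, horn_gain_dbi, distance_m, freq_hz):
    """Solve the Friis equation for the gain of the antenna under test."""
    path_loss_db = free_space_path_loss_db(distance_m, freq_hz)
    return p_rx_dbm - p_tx_dbm - horn_gain_dbi + path_loss_db


FREQ_HZ = 60e9
DISTANCE_M = 0.25  # L = 25 cm, from the text

d_max = max_aperture_for_far_field_m(DISTANCE_M, FREQ_HZ)
fspl = free_space_path_loss_db(DISTANCE_M, FREQ_HZ)

# HYPOTHETICAL readings for illustration only, not data from this work.
gain = antenna_gain_dbi(
    p_rx_dbm=-30.0, p_tx_dbm=0.0, horn_gain_dbi=20.0,
    distance_m=DISTANCE_M, freq_hz=FREQ_HZ,
)

print(f"largest horn aperture for far field at 25 cm: {d_max * 100:.1f} cm")
print(f"free-space path loss at 25 cm: {fspl:.1f} dB")
print(f"estimated gain for the hypothetical readings: {gain:.1f} dBi")
```

With a 5-mm wavelength, the chosen L = 25 cm satisfies the far-field criterion for horn apertures up to about 2.5 cm.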

TABLE I Performance Summary

| Block | Parameter | Value |
|-------|-----------|-------|
| Tx | S<sub>22</sub> | −28.1 dB |
| Tx, VCO<sub>1</sub> | Freq. | 59.5 GHz |
| Tx, PA | P<sub>1dB</sub> | 5 dBm |
| Tx, PA | P<sub>out,max</sub> | 7.6 dBm |
| Tx, PA | Gain | 9.6 dB |
| Rx | S<sub>11</sub> | −30.4 dB |
| Rx, VCO<sub>2</sub> | Freq. | 51.1 GHz |
| Rx, LNA+Mixer (standalone) | NF | 7.3 dB |
| Rx, LNA+Mixer (standalone) | Gain | 24.8 dB (with IF amp.) |
| Folded Dipole | Gain | 5.5 dBi |
| Folded Dipole | S<sub>11</sub> | −23.5 dB |
| Folded Dipole | Beamwidth | 78° (E), 66° (H) |
| Folded Dipole | Bandwidth | 8 GHz |
| 2×2 Patch Array | Gain | 8.2 dBi |
| 2×2 Patch Array | Beamwidth | 38° (E), 37° (H) |
| 4×3 Patch Array | Gain | 14.2 dBi |
| 4×3 Patch Array | Beamwidth | 22° (E), 22° (H) |
| Overall | Supply | LNA & PA: 1.2 V; Others: 1.5 V |
| Overall (2<sup>31</sup>−1 PRBS, BER < 10<sup>−12</sup>) | Distance at 2.5 Gb/s | 4.0 cm (F. dipole) |
| Overall (2<sup>31</sup>−1 PRBS, BER < 10<sup>−12</sup>) | Distance at 2 Gb/s | 5.0 cm (F. dipole) |
| Overall (2<sup>31</sup>−1 PRBS, BER < 10<sup>−12</sup>) | Distance at 1.5 Gb/s | 6.0 cm (F. dipole), 38 cm (4×3 patch) |
| Overall (2<sup>31</sup>−1 PRBS, BER < 10<sup>−12</sup>) | Distance at 1 Gb/s | 7.0 cm (F. dipole), 61 cm (4×3 patch) |
| Overall | Area | Tx: 0.43 mm<sup>2</sup>; Rx: 0.68 mm<sup>2</sup> |
| Overall | Power | Tx: 183 mW; Rx: 103 mW |
| Overall | Tech. | 90-nm CMOS |

TABLE II Performance Comparison

|                     | [1]                      | [3]                                                 | This Work                                           |
|---------------------|--------------------------|-----------------------------------------------------|-----------------------------------------------------|
| Carrier Freq.       | 60 GHz                   | 60 GHz                                              | 60 GHz                                              |
| Modulation          | QPSK                     | QPSK                                                | OOK                                                 |
| Max. Distance       | 8 m                      | 1 m                                                 | 60 cm                                               |
| (Data rate > 1Gb/s) |                          | (2 <sup>31</sup> −1 PRBS, BER < 10 <sup>−11</sup> ) | (2 <sup>31</sup> −1 PRBS, BER < 10 <sup>−12</sup> ) |
| Antenna Gain        | 7 dBi x 2                | 25 dBi x 2                                          | 14.2 dBi x 2                                        |
|                     | (Folded Dipole)          | (Horn)                                              | (Patch)                                             |
| Power Dissipation   | Tx: 800 mW <sup>*</sup>  | Tx: 170 mW                                          | Tx: 183 mW                                          |
|                     | Rx: 526 mW <sup>*</sup>  | Rx: 138 mW                                          | Rx: 103 mW                                          |
| Area                | Tx: 6.4 mm <sup>2</sup>  | 6.875 mm²                                           | Tx: 0.43 mm <sup>2</sup>                            |
|                     | Rx: 5.78 mm <sup>2</sup> | (Tx+Rx)                                             | Rx: 0.68 mm <sup>2</sup>                            |
| Technology          | 0.13-μm SiGe             | 90-nm CMOS                                          | 90-nm CMOS                                          |
| Maximum             | 2 Gb/s                   | 4 Gb/s                                              | 3.3 Gb/s<sup>**</sup>                               |
| Data Rate           |                          | (2<sup>31</sup>−1 PRBS, BER < 10<sup>−11</sup>) | (2<sup>31</sup>−1 PRBS, BER < 10<sup>−12</sup>) |

\*Front-end only.

\*\* With folded-dipole antennas.

#### D. Overall Tx and Rx

Applying a pseudo-random bit sequence (PRBS) of length $2^{31}-1$ to the data input, we capture the OOK signal at the PA's output in the time and frequency domains. As shown in Fig. 17(a), it presents the envelope of a random data stream with a single-ended swing of 320 mV<sub>pp</sub> on a 50-$\Omega$ load. Here we use an Agilent 86118A sampling head module with 70-GHz sampling bandwidth to capture the 60-GHz signal. A sinc-function spectrum around the 60-GHz carrier is also observed [Fig. 17(b)], which corresponds to the OOK modulation. Filters could be added in a future design to further suppress the sidelobes. Note that both measurements in Fig. 17 suffer from cable and connector loss of a few dB. Fig. 18 illustrates the receiver output over a 2-cm transmission distance using a single folded dipole antenna. Note that we have no clock and data recovery (CDR) circuit in the receiver. The eyes are already very clean without retiming, demonstrating the effectiveness of the transceiver and its analog signal processing.
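The sinc-shaped spectrum noted above follows from modulating the carrier with random rectangular (NRZ) bits: the continuous part of the power spectrum is proportional to $\mathrm{sinc}^2(\Delta f / R_b)$, with nulls at integer multiples of the bit rate $R_b$ away from the carrier. The sketch below evaluates this envelope. It assumes ideal rectangular pulses, a nominal 60-GHz carrier, and the 1-Gb/s rate of Fig. 17(b); it ignores the discrete carrier line that OOK also produces, and the function name is hypothetical.

```python
import math


def ook_envelope_db(offset_hz, bit_rate_hz):
    """Continuous PSD of random NRZ OOK, in dB relative to the main-lobe peak,
    at a frequency offset from the carrier."""
    x = offset_hz / bit_rate_hz
    if x == 0.0:
        return 0.0
    magnitude = abs(math.sin(math.pi * x) / (math.pi * x))
    # Clamp to avoid log10(0) at the spectral nulls.
    return 20.0 * math.log10(max(magnitude, 1e-12))


CARRIER_HZ = 60e9   # nominal carrier, ASSUMED for illustration
BIT_RATE_HZ = 1e9   # data rate of Fig. 17(b)

# Nulls sit at the carrier plus or minus k times the bit rate.
for k in (1, 2, 3):
    lower = (CARRIER_HZ - k * BIT_RATE_HZ) / 1e9
    upper = (CARRIER_HZ + k * BIT_RATE_HZ) / 1e9
    print(f"null pair {k}: {lower:.0f} GHz and {upper:.0f} GHz")

sidelobe_db = ook_envelope_db(1.5 * BIT_RATE_HZ, BIT_RATE_HZ)
print(f"level at 1.5 x bit rate offset: {sidelobe_db:.1f} dB")
```

The first sidelobe is only about 13 dB below the main lobe, which is why additional filtering would be needed to suppress it further.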

To verify the signal integrity, we have also conducted a complete BER test (Fig. 19). Using a $2^{31} - 1$ PRBS, the transceiver with folded dipole antenna maintains error-free operation up to 6 cm for a data rate of 1.5 Gb/s. The longest transmission distance occurs when the 4 $\times$ 3 antenna array is used. In such a case, a BER of less than $10^{-12}$ is obtained when 1-Gb/s data is delivered over 60 cm. Nonetheless, this transceiver is capable of delivering a full HD 1080p video signal (data rate $\approx$ 1.5 Gb/s) in real time over a distance of 38 cm. Fig. 20 shows the demonstration setup. Table I summarizes the performance of this work, and Table II compares this work with some other previously published 60-GHz transceivers.
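The distance improvement obtained with the patch array can be compared against a simple free-space expectation. Under the Friis equation, received power falls as the square of distance, so raising the total antenna gain by ΔG dB extends the range by a factor of $10^{\Delta G/20}$ at a fixed required received power. The sketch below applies this to the measured gains and distances, assuming, as Table II indicates, that the same antenna type is used at both ends, and that nothing else in the link changes. The function name is hypothetical.

```python
def friis_range_ratio(gain_delta_db_per_antenna, n_antennas=2):
    """Range extension factor for a given gain increase on each antenna,
    at constant required received power in free space."""
    total_delta_db = gain_delta_db_per_antenna * n_antennas
    return 10.0 ** (total_delta_db / 20.0)


PATCH_GAIN_DBI = 14.2   # measured, 4x3 patch array
DIPOLE_GAIN_DBI = 5.5   # measured, folded dipole

predicted = friis_range_ratio(PATCH_GAIN_DBI - DIPOLE_GAIN_DBI)

# Measured error-free distances (cm) from the BER test and Table I.
measured_1g5 = 38.0 / 6.0   # 1.5 Gb/s: patch vs. folded dipole
measured_1g = 61.0 / 7.0    # 1 Gb/s: patch vs. folded dipole

print(f"predicted range ratio:      {predicted:.1f}x")
print(f"measured ratio at 1.5 Gb/s: {measured_1g5:.1f}x")
print(f"measured ratio at 1 Gb/s:   {measured_1g:.1f}x")
```

The free-space prediction of roughly 7.4 times falls between the two measured ratios, so the reported distances are consistent with the measured antenna gains.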

#### VII. CONCLUSION

A compact high-speed transceiver operating in the 60-GHz band has been presented. Fully demonstrating the signal integrity, this transceiver achieves multi-Gb/s data rates with less than 300 mW (Tx: 183 mW, Rx: 103 mW) of overall power consumption. It also reveals a remarkable cost reduction in antenna design as well as system integration, providing a promising solution for future low-cost low-power wireless transceivers between mobile devices.

#### REFERENCES

- [1] S. K. Reynolds *et al.*, "A silicon 60-GHz receiver and transmitter chipset for broadband communications," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2820–2831, Dec. 2006.
- [2] S. Pinel et al., "A 90 nm CMOS 60 GHz radio," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2008, pp. 130–131.
- [3] C. Marcu et al., "A 90 nm CMOS low-power 60 GHz transceiver with integrated baseband circuitry," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2009, pp. 314–315.
- [4] IEEE 802.15.3c. [Online]. Available: https://mentor.ieee.org/802.15/file/07/15-07-0760-03-003c-tensorcom-phy-presentation.ppt
- [5] E. Alpman et al., "A 1.1 V 50 mW 2.5 GS/s 7 b time-interleaved C-2C SAR ADC in 45 nm LP digital CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2009, pp. 76–77.
- [6] C. A. Balanis, Antenna Theory Analysis and Design, 3rd ed. New York: Wiley, 2005.
- [7] H. Krishnaswamy and H. Hashemi, "A fully integrated 24 GHz 4-channel phased-array transceiver in 0.13 μm CMOS based on a variable-phase ring oscillator and PLL architecture," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2007, pp. 124–125.
- [8] A. Babakhani *et al.*, "A 77-GHz phased-array transceiver with on-chip antennas in silicon: Receiver and antennas," *IEEE J. Solid-State Circuits*, vol. 41, no. 12, pp. 2795–2805, Dec. 2006.
- [9] Sibeam. [Online]. Available: http://www.sibeam.com
- [10] Ecma International. [Online]. Available: http://www.ecma-international.org/publications/files/ECMA-ST/ECMA-368.pdf
- [11] USB3.0. [Online]. Available: http://www.usb.org/developers/docs/
- [12] Rogers RO4003, Petlas. [Online]. Available: http://www.petlas.fi/ro4003.htm
- [13] F. Xiong, *Digital Modulation Techniques*, 2nd ed. Boston/London: Artech House, 2006.
- [14] S. Sarkar and J. Laskar, "A single-chip 25 pJ/bit multi-gigabit 60 GHz receiver module," in *IEEE MTT-S Int. Microwave Symp. Dig.*, Jun. 2007, pp. 475–478.
- [15] C. Wang et al., "A 60 GHz low-power six-port transceiver for gigabit software-defined transceiver applications," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2007, pp. 192–193.
- [16] A. Tomkins *et al.*, "A zero-IF 60 GHz transceiver in 65 nm CMOS with >3.5 Gb/s links," in *Proc. IEEE Custom Integrated Circuits Conf.*, Sep. 2008, pp. 471–474.
- [17] K. Ohata et al., "Wireless 1.25 Gb/s transceiver module at 60 GHz-band," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2002, pp. 236–237.
- [18] J. Lee et al., "A 75-GHz phase-locked loop in 90-nm CMOS technology," *IEEE J. Solid-State Circuits*, vol. 43, no. 6, pp. 1014–1026, Jun. 2008.
- [19] K. Scheir et al., "A 57-to-66 GHz quadrature PLL in 45 nm digital CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2009, pp. 494–495.
- [20] H. Samavati et al., "A 5-GHz CMOS wireless LAN receiver front end," *IEEE J. Solid-State Circuits*, vol. 35, no. 5, pp. 765–772, May 2000.
- [21] J. A. Macedo and M. A. Copeland, "A 1.9-GHz silicon receiver with monolithic image filtering," *IEEE J. Solid-State Circuits*, vol. 33, no. 3, pp. 378–386, Mar. 1998.
- [22] B. Razavi, "A mm-wave CMOS heterodyne receiver with on-chip LO and divider," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2007, pp. 188–189.
- [23] G. Zhang et al., "A BiCMOS 10 Gb/s adaptive cable equalizer," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2004, pp. 482–541.
- [24] J. Lee et al., "A 20 Gb/s duobinary transceiver in 90 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2008, pp. 102–103.
- [25] B. Afshar et al., "A robust 24 mW 60 GHz receiver in 90 nm standard CMOS," in *IEEE Int. Solid-State Circuits Conf. (ISSCC) Dig. Tech. Papers*, Feb. 2008, pp. 182–183.

- [26] B. Razavi, "CMOS transceivers for the 60-GHz band," in *RFIC Symp. Dig.*, Jun. 2006, pp. 231–234.
- [27] T. Mitomo et al., "A 60-GHz CMOS receiver front-end with frequency synthesizer," *IEEE J. Solid-State Circuits*, vol. 43, no. 4, pp. 1030–1037, Apr. 2008.
- [28] Accurus. [Online]. Available: http://www.accurus.com.tw/c/4/images/MSDS-2.pdf
- [29] Ansoft. [Online]. Available: http://www.ansoft.com/products/hf/hfss/
- [30] IEEE Standard Definitions of Terms for Antennas, IEEE Std 145-1993, Mar. 1993.



**Jri Lee** (S'03–M'04) received the B.Sc. degree in electrical engineering from National Taiwan University (NTU), Taipei, Taiwan, in 1995, and the M.S. and Ph.D. degrees in electrical engineering from the University of California at Los Angeles (UCLA), both in 2003.

After two years of military service (1995–1997), he was with Academia Sinica, Taipei, Taiwan, from 1997 to 1998, and subsequently Intel Corporation from 2000 to 2002. He joined National Taiwan University (NTU) in 2004, where he is currently Associate Professor of electrical engineering. His current research interests include high-speed wireless and wireline transceivers, phase-locked loops, and data converters.

Prof. Lee is now serving in the Technical Program Committees of the International Solid-State Circuits Conference (ISSCC), Symposium on VLSI Circuits, and Asian Solid-State Circuits Conference (A-SSCC). He received the Beatrice Winner Award for Editorial Excellence at the 2007 ISSCC, the Takuo Sugano Award for Outstanding Far-East Paper at the 2008 ISSCC, the Best Technical Paper Award from the Y. Z. Hsu Memorial Foundation in 2008, the T. Y. Wu Memorial Award from the National Science Council (NSC), Taiwan, in 2008, and the Young Scientist Research Award from Academia Sinica in 2009. He also received the NTU Outstanding Teaching Award in 2007, 2008, and 2009. He served as a guest editor of the IEEE JOURNAL OF SOLID-STATE CIRCUITS in 2008.



Yen-Tso Chen was born in Kaohsiung, Taiwan, in 1985. He received the B.S. and M.S. degrees in electrical engineering from National Taiwan University, Taipei, Taiwan, in 2007 and 2009, respectively.

His research interests include millimeter wave integrated circuits and antenna-in-package (AiP) designs.



Yen-Lin Huang was born in Taipei, Taiwan, in 1981. He received the B.S. degree in electrical engineering from National Tsing-Hua University, Hsinchu, Taiwan, in 2005, and the M.S. degree in electrical engineering from National Taiwan University, Taipei, Taiwan, in 2008. His research interests focused on millimeter-wave transceiver circuit design for multi-gigabit wireless communication.

He is currently with MediaTek Inc., Hsinchu, Taiwan, where he is engaged in the development of millimeter-wave systems for multi-gigabit wireless networking applications.