# A CMOS High-Power Broadband 260-GHz Radiator Array for Spectroscopy

Ruonan Han, Student Member, IEEE, and Ehsan Afshari, Senior Member, IEEE

Abstract—A high-power broadband 260-GHz radiation source using 65-nm bulk CMOS technology is reported. The source is an array of eight harmonic oscillators with mutual coupling through four 130-GHz quadrature oscillators. Based on a novel self-feeding structure, the harmonic oscillator simultaneously achieves the optimum conditions for the fundamental oscillation and the 2nd-harmonic generation. The signals at 260 GHz radiate through eight on-chip slot antennas, and are in-phase combined inside a hemispheric silicon lens attached at the backside of the chip. Similar to the laser pulse-driven photoconductive emitter in many THz spectrometers, the radiation of this source can also be modulated by narrow pulses generated on chip, which achieves broad radiation bandwidth. Without modulation, the chip achieves a measured continuous-wave radiated power of 1.1 mW, and an EIRP of 15.7 dBm. Under modulation, the measured bandwidth of the source is 24.7 GHz. This radiator array consumes 0.8-W DC power from a 1.2-V supply.

*Index Terms*—CMOS, EIRP, harmonic oscillator, on-chip slot antenna, self feeding, signal source, silicon lens, spectroscopy, terahertz.

## I. INTRODUCTION

CMOS circuits working in the millimeter-wave and terahertz (THz) frequency range (100 GHz $\sim$ 1 THz) are attracting increasing attention, due to their promising applications in security, biomedicine, and communication [1]–[3]. Specifically, recent works have demonstrated fully integrated image sensors working up to 1 THz [4]–[9], and wireless data links over 200 GHz [10], [11]. For these applications, a signal source that can generate high radiation power to overcome the large propagation loss in this frequency range is a critical building block. Unfortunately, it is well known that a "terahertz gap" exists, which keeps the generated terahertz power low. This is because this frequency range is too high for electronics and too low for optics [12]. In the context of CMOS technology, the difficulty is mainly attributed to three factors: (i) Despite the aggressive scaling of CMOS, the maximum frequency of oscillation, $f_{\text{max}}$, of the transistor is still below 300 GHz, especially when the device interconnects are included. This sets a theoretical limit, beyond which neither fundamental oscillation nor power amplification is possible [13]. (ii) The thinner gate oxide in advanced technology nodes results in a lower breakdown voltage. This severely reduces the device output power, which is strongly correlated with the voltage swing. (iii) The passive metal structures fabricated in CMOS have high loss, especially in the presence of the lossy silicon substrate. The challenge lies in the thin metal layers and the thick, lossy silicon substrate (to be discussed in detail in Section IV). Because of these drawbacks in CMOS, high-power terahertz generation is more commonly demonstrated in III-V compound semiconductors. For example, using InP HEMTs, a 650-GHz power amplifier module with 3-mW output power is reported in [14]. In [15], 4.2-mW output power is demonstrated with a 600-GHz GaAs diode frequency tripler when cooled to 120 K (1.8 mW at room temperature).

Manuscript received April 05, 2013; revised May 05, 2013; accepted July 03, 2013. Date of publication July 29, 2013; date of current version November 20, 2013. This paper was approved by Guest Editor Hooman Darabi. This work was supported in part by C2S2 Focus Center under the Focus Center Research Program (FCRP), a Semiconductor Research Corporation (SRC) entity, and by the National Science Foundation (NSF). This is an expanded version of the paper from the IEEE International Solid-State Circuits Conference (ISSCC), San Francisco, CA, USA, February 17-21, 2013.

The authors are with the Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY 14853 USA (e-mail: rh383@cornell.edu; ehsan@ece.cornell.edu).

Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org.

Digital Object Identifier 10.1109/JSSC.2013.2272864

In addition to the signal power level, another challenge in CMOS THz sources is the output frequency bandwidth. Bandwidth is especially important for material identification using THz spectroscopy. For example, prior research shows that many types of hazardous gas (e.g., methylchloride [16]) and chemical warfare agents (e.g., sarin [17]) exhibit vibrational resonances between 200 GHz and 300 GHz. To obtain such a spectrum, a broadband radiation source is required.

To overcome the cutoff frequency limitation in CMOS, prior works utilize the device nonlinearity and harmonic generation. The signal sources based on this principle can be further divided into two categories: (i) frequency multipliers and (ii) harmonic oscillators. The frequency multipliers normally have both high output power and wide bandwidth. In [18], the 180-GHz active doubler achieves 0-dBm output power and an 11.1% 3-dB bandwidth. In [19], [20], a traveling-wave doubler at 275 GHz achieves -6.6 dBm and 7.8% for the same metrics. In the 480-GHz passive doubler in [21], the measured output power and frequency range are larger than -6.3 dBm and 4.2% (limited by testing equipment). However, these multipliers need a high-power, wide-tuning-range fundamental signal source to drive them, which is another challenge. In comparison, the second nonlinear circuit category, i.e., the harmonic oscillator, has the advantage of being self-sustaining. The reported output power is competitive with that of the frequency doubler, especially with multi-cell power combining. The 482-GHz triple-push oscillator in [22] achieves -7.9-dBm power. The coupled oscillator in [23], [24] achieves -1.2 dBm. Normally, there is significant power loss in the process of radiation. Nevertheless, the 16-element 280-GHz distributed active radiator (DAR) in [25], [26] still achieves a radiated power of -7.2 dBm and an EIRP (effective isotropic radiated power) of 9.4 dBm. Utilizing a pair of triple-push oscillators and a differential ring antenna, a high radiated power of -4.1 dBm at 288 GHz is reported in [27]. Despite this progress, large frequency tuning in harmonic oscillators remains very challenging. This is mostly due to the lossy MOS varactors used in the resonance tank. In [24], the variable-coupling solution effectively reduces such loss, and achieves a tuning range of 4.5%. Although this is the highest tuning range in prior CMOS THz oscillators, it is still insufficient for THz spectroscopy.

In this paper, we describe the design of a 260-GHz CMOS pulse radiator array, which was first presented by the authors in [28]. In Section II, the architecture of the array based on a narrow-pulse modulation scheme is described. Then in Section III, the key contribution of this work, a novel self-feeding harmonic oscillator structure, is discussed. We show that this structure effectively maximizes the fundamental oscillation power and the 2nd-harmonic generation efficiency. In Section IV, the designs of the other circuit blocks are presented. After the measurement of the radiator array in Section V, a performance comparison is given in the conclusion in Section VI. To the best of our knowledge, this work demonstrates the highest radiated power and EIRP among all CMOS sources in a similar frequency range.

## II. OVERVIEW OF THE RADIATOR ARRAY

To overcome the previously mentioned design tradeoff between the output power and bandwidth, a new system approach is needed. We observe that prior CMOS THz oscillators generate only a single-tone signal, the frequency of which can be continuously changed. Such a scheme is common in the design of RF transceiver circuits, and its counterparts in THz spectroscopy, like the quantum cascade laser (QCL) [29] and the backward wave oscillator (BWO) [30], are popular too. Nevertheless, we should not ignore another important electro-optical method that is also widely used in the physics/optics community. This method is illustrated by the standard Fourier-transform spectrometer in Fig. 1. On the source side, a photoconductive emitter is excited periodically by femtosecond laser pulses. Upon each excitation, the photo-induced current in the emitter semiconductor flows through the differential antenna connected to the semiconductor, and causes radiation. Due to the carrier lifetime, the radiation lasts for a few picoseconds, which corresponds to a wide bandwidth ( $\sim$ THz). The radiation is then used for spectroscopy, which simultaneously samples the molecular response at all in-band frequencies.

In this paper, a similar signal-generation principle is applied to the design of our CMOS THz source, which is expected to replace the bulky and costly electro-optical source in Fig. 1. The architecture of this source is shown in Fig. 2. It contains four radiator units, which are placed symmetrically on the four sides of the chip. Inside each radiator unit, there are two 130-GHz oscillators that generate a second-harmonic signal near 260 GHz. The signal is then gated by a switch before it radiates through a pair of on-chip antennas. The switch is controlled by a narrow-pulse train from a local pulse generator. Fig. 2 also shows that the two harmonic oscillators inside each radiator unit are coupled with a 90° phase shift at $f_0$ through a quadrature oscillator. This way, the signal at $2f_0$ from each radiator unit is differential. This coupling scheme facilitates the symmetrical placement of the on-chip antennas, and also enhances the switching capability, as will be shown in Section IV. Through the in-phase radiation of the eight antennas, beam combining in free space is achieved, which enhances the total power and directivity of the radiation.

Fig. 1. In time-domain and Fourier-transform spectroscopy, broadband terahertz radiation can be generated using ultra-fast laser pulses. Using a similar principle, a broad bandwidth can also be achieved through narrow-THz-pulse generation in CMOS.

From the signal analysis in Fig. 2, we see that since the harmonic oscillators only need to generate a single-tone signal, the power–bandwidth tradeoff in Section I is avoided. This enables us to use a new self-feeding structure which is dedicated to generating a high-power second-harmonic signal (to be shown in Section III). With the pulse modulation, the output radiated signal is still centered around $2f_0$, but has an available bandwidth that is inversely proportional to the width of the controlling pulse (Fig. 2). Since picosecond pulse generation in CMOS has been demonstrated [31], [32], this architecture has the potential to achieve a bandwidth near 100 GHz, which is sufficient to perform THz spectroscopy for some chemicals like methylchloride and sarin [16], [17].
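The inverse relation between the gating-pulse width and the radiated bandwidth can be illustrated with a short sketch. It assumes an ideal rectangular gate applied to a single-tone carrier, which is a simplification and not the measured pulse shape of this chip; the function names and the pulse widths are illustrative only.

```python
import math

def sinc_power(x):
    """Normalized power spectrum of a rectangular gate; x = (f - f_carrier) * tau."""
    if x == 0.0:
        return 1.0
    return (math.sin(math.pi * x) / (math.pi * x)) ** 2

def half_power_bandwidth_hz(tau_s):
    """Full 3-dB bandwidth around the carrier for an ideal rectangular gate of width tau_s."""
    lo, hi = 0.0, 1.0  # sinc_power falls monotonically from 1 to 0 on [0, 1]
    for _ in range(60):  # bisection on the half-power point
        mid = 0.5 * (lo + hi)
        if sinc_power(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 2.0 * lo / tau_s

# Assumed (hypothetical) gate widths; halving the width doubles the bandwidth.
for tau_ps in (40.0, 20.0, 10.0):
    bw_ghz = half_power_bandwidth_hz(tau_ps * 1e-12) / 1e9
    print(f"gate width {tau_ps:4.0f} ps -> 3-dB bandwidth ~ {bw_ghz:5.1f} GHz")
```

Under this idealization the 3-dB bandwidth is roughly $0.89/\tau$, so a gate of about 10 ps would be needed to approach the 100-GHz figure quoted above.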

The capability of modulating the radiation makes this chip also suitable for other applications, like terahertz data transmission and tomography. It can also pair with the THz image sensors that do not have integrated Dicke switches [4]–[7], because the source radiation needs to be chopped at MHz frequency to avoid the flicker noise of the sensors.

## III. HIGH-POWER HARMONIC OSCILLATOR: SELF-FEEDING STRUCTURE

In Fig. 2, the harmonic oscillator inside the radiator array unit is the most important part for realizing high-power terahertz radiation. In this section, we discuss several drawbacks of traditional harmonic oscillators in terms of power generation, and present a new harmonic oscillator structure called *Self-Feeding*. Harmonic generation relies on the transistor nonlinearity, which grows with the fundamental oscillation swing. To maximize the oscillation, in the first part of this section, we briefly revisit the classic optimum gain theory [22] with a different handling of the optimum gain amplitude. This then leads to the basic structure of our proposed self-feeding oscillator. What makes this structure practical is that we also give a rigorous (yet still concise) theory that provides all values of the oscillator design parameters needed to precisely achieve the optimum gain condition. After that, we change the focus to the harmonic generation, and show how the basic self-feeding structure evolves into the final 2nd-harmonic oscillator, in order to greatly increase the available harmonic power of the transistor.

Fig. 2. The architecture of the 260-GHz radiator array.

Fig. 3. The two-port network representation of a MOS transistor.

#### A. Fundamental Oscillation: Optimum Gain Condition

In harmonic oscillators, there is no external load to the fundamental  $(f_0)$  signal. In the steady state, all the fundamental power generated by the transistor (denoted as  $P_{out}$ ) is delivered and dissipated in the peripheral passive network. The network is linear, so higher  $P_{out}$  leads to higher voltage swing and higher nonlinearity. To maximize  $P_{out}$ , we start with the transistor modeling. The traditional lumped model, which contains a transconductance cell and several parasitic components, is complicated for the calculations of  $P_{out}$ . Therefore, to optimize  $P_{out}$ , in [22] and [33] the transistor is modeled as a two-port network with Y-parameters (Fig. 3):

$$[Y_0] = \begin{bmatrix} y_{11} & y_{12} \\ y_{21} & y_{22} \end{bmatrix}. \tag{1}$$

The accurate values of (1) can be directly obtained through S-parameter simulation of the foundry model or measurement of the test structures. In this paper, (1) is obtained from the conversion of the simulated large-signal S-parameters, in order to more accurately capture the transistor behavior in oscillators.



Another advantage of such modeling is that  $P_{out}$  is readily expressed as a function of the root-mean-square voltages and currents of the gate  $(V_1, I_1)$  and the drain  $(V_2, I_2)$ :

$$P_{\text{out}} = -\operatorname{Re}(V_1 I_1^*) - \operatorname{Re}(V_2 I_2^*). \tag{2}$$

Having obtained the Y-parameters of the transistor, the currents are expressed as

$$\begin{cases} I_1 = y_{11}V_1 + y_{12}V_2 \\ I_2 = y_{21}V_1 + y_{22}V_2. \end{cases} \tag{3}$$

Using (2), (3), and the defined gate-to-drain complex voltage gain

$$A = \frac{V_2}{V_1} \tag{4}$$

we derive

$$P_{\text{out}} = -g_{11}|V_1|^2 - g_{22}|V_2|^2 - |V_1||V_2|\left[(g_{12} + g_{21})\cos \angle A + (b_{21} - b_{12})\sin \angle A\right]. \tag{5}$$
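As a sanity check on the step from (2)–(4) to (5), the sketch below evaluates $P_{\text{out}}$ both ways for one set of port voltages. The Y-parameter values in `Y_EXAMPLE` are hypothetical placeholders (in siemens), not data from the foundry model; $g_{ij}$ and $b_{ij}$ are the real and imaginary parts of $y_{ij}$.

```python
import cmath
import math

# Hypothetical Y-parameters (S) for illustration only; not extracted from any real device.
Y_EXAMPLE = ((5e-3 + 25e-3j, -0.5e-3 - 6e-3j),
             (20e-3 - 15e-3j, 6e-3 + 12e-3j))

def p_out_direct(y, v1, v2):
    """P_out from (2), with the port currents taken from (3). v1, v2 are RMS phasors."""
    (y11, y12), (y21, y22) = y
    i1 = y11 * v1 + y12 * v2
    i2 = y21 * v1 + y22 * v2
    return -(v1 * i1.conjugate()).real - (v2 * i2.conjugate()).real

def p_out_closed_form(y, v1, v2):
    """P_out from the closed form (5), using the gain A = V2 / V1 of (4)."""
    (y11, y12), (y21, y22) = y
    theta = cmath.phase(v2 / v1)  # angle of A
    m1, m2 = abs(v1), abs(v2)
    cross = ((y12.real + y21.real) * math.cos(theta)
             + (y21.imag - y12.imag) * math.sin(theta))
    return -y11.real * m1 ** 2 - y22.real * m2 ** 2 - m1 * m2 * cross

v1, v2 = 0.6 + 0.2j, -0.5 + 0.4j  # arbitrary test phasors
print(p_out_direct(Y_EXAMPLE, v1, v2), p_out_closed_form(Y_EXAMPLE, v1, v2))
```

The two printed numbers should agree to floating-point precision for any nonzero `v1`.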

In (5),  $g_{ij}$  and  $b_{ij}$  are the real (i.e., conductance) and imaginary (i.e., susceptance) parts of  $y_{ij}$ , respectively. To maximize the third term of (5), the optimum gain phase  $\angle A_{opt}$  is [22], [33]

$$\angle A_{\rm opt} = \angle -(y_{21} + y_{12}^*). \tag{6}$$
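A brute-force phase sweep offers a quick check of (6). The sketch below compares the closed-form optimum with the phase that maximizes the third term of (5) on a 0.1° grid; the $y_{12}$ and $y_{21}$ values are hypothetical placeholders, not simulated device data.

```python
import cmath
import math

# Hypothetical transfer admittances (S), for illustration only.
Y12_EXAMPLE = -0.5e-3 - 6e-3j
Y21_EXAMPLE = 20e-3 - 15e-3j

def optimum_gain_phase(y12, y21):
    """Optimum gate-to-drain gain phase from (6), in radians within (-pi, pi]."""
    return cmath.phase(-(y21 + y12.conjugate()))

def third_term(y12, y21, theta):
    """Third term of (5) per unit |V1||V2|, as a function of the gain phase theta."""
    return -((y12.real + y21.real) * math.cos(theta)
             + (y21.imag - y12.imag) * math.sin(theta))

def sweep_best_phase(y12, y21, steps=3600):
    """Grid search for the phase that maximizes the third term of (5)."""
    grid = (-math.pi + 2.0 * math.pi * k / steps for k in range(steps))
    return max(grid, key=lambda t: third_term(y12, y21, t))

theta_opt = optimum_gain_phase(Y12_EXAMPLE, Y21_EXAMPLE)
theta_num = sweep_best_phase(Y12_EXAMPLE, Y21_EXAMPLE)
print(math.degrees(theta_opt), math.degrees(theta_num))
```

At the optimum phase the third term equals $|y_{21} + y_{12}^*|$ per unit $|V_1||V_2|$, which is the coefficient that reappears when the phase is substituted back into (5).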

Fig. 4(a) shows the simulated optimum phase of an NMOS transistor (W/L = $27\mu/60n$ ) in a 65-nm CMOS process based on (6). As the frequency increases, an extra phase shift ( $\Delta \varphi_{opt}$ ) beyond $-180^{\circ}$ is required. As Fig. 4(b) shows, at 130 GHz the optimum phase of $-210^{\circ}$ (or $150^{\circ}$ ) provides twice the simulated output power $P_{out}$ that $180^{\circ}$ does. Fig. 5(a) intuitively explains this. First, a delay exists between the input gate voltage $v_g$ and the output drain current $i_d$. This delay is caused by the parasitic R-L-C network of the gate (denoted by $\Delta \varphi_1$ ) and by the feedforward current $i_{gd}$ through $C_{gd}$ (denoted by $\Delta \varphi_2$ ). On the other hand, to maximize the output power at the drain node, the phase of the drain voltage $v_d$ should align with that of the drain current $i_d$; when $i_d$ is delayed with respect to the gate voltage $v_g$, the drain voltage $v_d$ should also be intentionally delayed. The relation between $v_g$ and $i_d$ is represented by the $y_{21}$ of the NMOS. The simulated phase change of $y_{21}$ in Fig. 5(b) closely matches that of $A_{opt}$ in Fig. 4(a), which validates this intuitive explanation. It is also noteworthy that the $\Delta \varphi_{opt}$ in Fig. 4(a) is slightly affected by the large-signal gate/drain voltage amplitude. In the large-signal S-parameter simulation, $\Delta \varphi_{\rm opt}$ changes from 25° to 30° when the gate/drain voltage amplitude changes from 10 mV (small-signal) to 1 V (large-signal). This is because in Fig. 5(a), the transconductance $g_m$ is smaller under large-signal excitation, resulting in a larger $\Delta \varphi_2$. Finally, we also use the Y-parameters extracted from the large-signal S-parameter simulation to calculate $P_{out}$ using (5). In Fig. 4(b), the comparison with the simulation result indicates that the proposed method predicts and optimizes $P_{\rm out}$ well.

Fig. 4. (a) The simulated phase of the optimum gate-to-drain voltage gain of an NMOS (W/L = $27\mu/60n$ ). The negative sign on the left y-axis represents the delay from the gate to the drain. (b) The simulated (solid line) and calculated (dashed line) $P_{out}$ of the transistor at 130 GHz versus varying gate-to-drain phase shift. The amplitudes of $V_1$ and $V_2$ are 1 V.

In [22] and [33], the amplitude of the optimum gain $|A_{opt}|$ is derived by normalizing the output power ( $P_{out}/(|V_1||V_2|)$ or $P_{out}/|V_1|^2$ ). This, however, ignores the voltage limitation of the oscillation swing. For example, when $|A_{opt}|$ is greater than unity $(|V_1| < |V_2|)$, then as the oscillation swing grows, the drain voltage saturates near $V_{DD}$ first, and the gate voltage only approaches $V_{DD}/|A_{opt}|$. Therefore, even if $P_{out}/|V_1|^2$ is maximized, $P_{out}$ is not necessarily the maximum that can be achieved. To correct that, a variable range

$$\begin{cases} |V_1| \le \dfrac{V_{DD}}{\sqrt{2}} \\ |V_2| \le \dfrac{V_{DD}}{\sqrt{2}} \end{cases} \tag{7}$$

is applied to the optimization of $P_{out}$. The factor of $\sqrt{2}$ in (7) converts the peak amplitude limit $V_{DD}$ to a root-mean-square (RMS) value, since $V_1$ and $V_2$ are RMS quantities. In (5), the amplitude and phase of $A_{opt}$ are independent, so $\angle A_{opt}$ in (6) is still valid. Substituting it into (5), we get

$$P_{\text{out}} = -g_{11}|V_1|^2 - g_{22}|V_2|^2 + |y_{12} + y_{21}^*| \cdot |V_1||V_2|. \tag{8}$$



Fig. 5. (a) The extra gate-to-drain voltage phase delay caused by the gate parasitics and the feedforward current through  $C_{gd}$ . (b) The simulated  $y_{21}$  of an NMOS (W/L =  $27\mu/60n$ ).

At maximum  $P_{out}$ , at least one of  $|V_1|$  and  $|V_2|$  reaches  $V_{DD}/\sqrt{2}$ . Then, depending on the Y-parameters of the transistor, there are three possibilities for the parabolic curve of  $P_{out}$  (shown in Fig. 6), which lead to different values of  $|A_{opt}|$ :

$$\begin{aligned} \text{Case 1:}\quad & |A_{\text{opt}}| = 1, & \text{if } & \begin{cases} 2g_{11} < |y_{12} + y_{21}^*| \\ 2g_{22} < |y_{12} + y_{21}^*| \end{cases} \\ \text{Case 2:}\quad & |A_{\text{opt}}| = \frac{2g_{11}}{|y_{12} + y_{21}^*|} > 1, & \text{if } & \begin{cases} 2g_{11} > |y_{12} + y_{21}^*| \\ 2g_{22} < |y_{12} + y_{21}^*| \end{cases} \\ \text{Case 3:}\quad & |A_{\text{opt}}| = \frac{|y_{12} + y_{21}^*|}{2g_{22}} < 1, & \text{if } & \begin{cases} 2g_{11} < |y_{12} + y_{21}^*| \\ 2g_{22} > |y_{12} + y_{21}^*| \end{cases} \end{aligned} \tag{9}$$
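The three cases can be cross-checked numerically by maximizing (8) over the voltage box (7) on a grid. The sketch below is only an illustration: the conductance values are hypothetical, $m$ stands for $|y_{12} + y_{21}^*|$, and $V_{DD}$ is taken as the 1.2-V supply of this chip.

```python
import math

V_MAX = 1.2 / math.sqrt(2.0)  # RMS limit from (7), with V_DD = 1.2 V

def optimum_gain_magnitude(g11, g22, m):
    """|A_opt| per the three cases of (9); m = |y12 + conj(y21)|, requires 4*g11*g22 < m**2."""
    if 4.0 * g11 * g22 >= m * m:
        raise ValueError("condition (10) violated: device cannot deliver net power")
    if 2.0 * g11 > m:   # Case 2: drain at the limit, gate swing backs off
        return 2.0 * g11 / m
    if 2.0 * g22 > m:   # Case 3: gate at the limit, drain swing backs off
        return m / (2.0 * g22)
    return 1.0          # Case 1: both nodes at the limit

def p_out(g11, g22, m, v1, v2):
    """Output power at the optimum gain phase, i.e. (8)."""
    return -g11 * v1 ** 2 - g22 * v2 ** 2 + m * v1 * v2

def grid_search_gain(g11, g22, m, n=300):
    """|V2|/|V1| at the grid point of the box (7) that maximizes (8)."""
    best_p, best_ratio = -math.inf, 1.0
    for i in range(1, n + 1):
        v1 = V_MAX * i / n
        for j in range(1, n + 1):
            v2 = V_MAX * j / n
            p = p_out(g11, g22, m, v1, v2)
            if p > best_p:
                best_p, best_ratio = p, v2 / v1
    return best_ratio

# Hypothetical (g11, g22, m) triples in siemens, one per case.
for g11, g22, m in ((5e-3, 6e-3, 21.5e-3), (15e-3, 2e-3, 20e-3), (2e-3, 15e-3, 20e-3)):
    print(optimum_gain_magnitude(g11, g22, m), grid_search_gain(g11, g22, m))
```

For each triple the closed-form value and the grid result should agree to within the grid resolution.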

It is noteworthy that in (9), it is not possible to have both  $2g_{11}$  and  $2g_{22}$  larger than  $|y_{12} + y_{21}^*|$ . This is because the fundamental oscillation frequency is always below the cut-off frequency  $f_{\text{max}}$ , and the unilateral gain U is greater than unity, which is equivalent to the condition [22]:

$$4g_{11}g_{22} < |y_{12} + y_{21}^*|^2. \tag{10}$$
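The equivalence between $U > 1$ and (10) can be made explicit. The sketch below assumes Mason's unilateral gain in its standard Y-parameter form and a positive denominator ( $g_{11}g_{22} > g_{12}g_{21}$ ); neither assumption is spelled out in the text above.

$$U = \frac{|y_{21} - y_{12}|^2}{4(g_{11}g_{22} - g_{12}g_{21})} > 1 \iff |y_{21} - y_{12}|^2 + 4g_{12}g_{21} > 4g_{11}g_{22}$$

$$|y_{12} + y_{21}^*|^2 = (g_{12} + g_{21})^2 + (b_{21} - b_{12})^2 = |y_{21} - y_{12}|^2 + 4g_{12}g_{21}$$

Combining the two lines gives $4g_{11}g_{22} < |y_{12} + y_{21}^*|^2$. It also follows that $2g_{11}$ and $2g_{22}$ cannot both exceed $|y_{12} + y_{21}^*|$, since their product would then exceed $|y_{12} + y_{21}^*|^2$.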

Normally, the transistor transconductance  $|y_{21}|$  is much larger than the device's input and output conductances  $(g_{11} \text{ and } g_{22})$ . Therefore, Case 1 in (9) gives the maximum  $P_{\text{out}}$ . Actually, the transistor of the 65-nm CMOS process used in our work falls into this category, so  $|A_{\text{opt}}|$  in our oscillator is unity. This is, however, not always true. Case 2 and Case 3 may occur for two reasons: (1) the device loss  $(g_{11} \text{ and } g_{22})$  increases as the frequency approaches  $f_{\text{max}}$ , and (2) the transconductance  $(y_{21})$ decreases as the oscillation swing grows.

As indicated in Fig. 4(b), the $180^{\circ}$ phase shift in conventional push-push oscillators provides only half of the peak $P_{out}$. The power is even lower in the real layout, because the metal line between the transistors' gate and drain makes the $\Delta \varphi$ in Fig. 4(a) smaller than $\Delta \varphi_{opt}$. Using a triple-push structure, the oscillators in [22] and [27] eliminate this problem. However, the generation of the third harmonic is normally weaker than that of the second. Moreover, the output at the center of the triple-stage loop [27] is hard to connect to the large on-chip antenna. Finally, the structure does not facilitate symmetrical coupling in a power-combined oscillator array.

Fig. 6. Three possibilities for the parabolic curve of the output power in the voltage-limited oscillation.

By observation, the transistor inside push-push or triple-push topologies needs one or more other transistors (i.e., an active network) to form an oscillation loop. Having such an active network around the transistor is critical, because the oscillation frequency is so high that the transistor becomes unconditionally stable. As Fig. 7 shows, at 130 GHz the stability factor k of the transistor is 1.19, and the simulated source/load instability regions, $\Gamma_G$ and $\Gamma_D$, are outside of the circle $|\Gamma| < 1$. Therefore, no passive network at the gate and drain can cause oscillation. Nevertheless, if we intentionally degrade the reverse isolation of the transistor by inserting a self-feeding transmission line between the drain and the gate, the transistor becomes conditionally stable, as Fig. 7 shows. This means that once the admittances of the passive terminations at the gate and drain $(Y_G \text{ and } Y_D)$ lie inside the instability region, the transistor is able to oscillate by itself. This concept leads to our proposed self-feeding oscillator structure in Fig. 8(a), which contains a self-feeding line and two shunt components $Y_1$ and $Y_2$. Ideally, both $Y_1$ and $Y_2$ should be lossless, but this is not the case in reality. So we assume $Y_1$ to be lossless, essentially assigning all of the loss to $Y_2$.
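The effect of the self-feeding line on stability can be illustrated with the Rollett stability factor written in Y-parameters. The sketch below is an illustration under stated assumptions: the device Y-parameters are hypothetical (they do not reproduce the simulated k of 1.19), the line is treated as lossless, and the 60-Ω, 48° line values are borrowed from the design choice reported later in this section.

```python
import math

# Hypothetical transistor Y-parameters (S) at the oscillation frequency; illustration only.
Y_EXAMPLE = ((5e-3 + 25e-3j, -0.5e-3 - 6e-3j),
             (20e-3 - 15e-3j, 6e-3 + 12e-3j))

def rollett_k(y):
    """Rollett stability factor from Y-parameters (standard form)."""
    (y11, y12), (y21, y22) = y
    loop = y12 * y21
    return (2.0 * y11.real * y22.real - loop.real) / abs(loop)

def add_feedback_line(y, z0_ohm, phi_deg):
    """Add a lossless line of impedance z0_ohm and length phi_deg between gate and drain."""
    (y11, y12), (y21, y22) = y
    phi = math.radians(phi_deg)
    b_t1 = -1.0 / (z0_ohm * math.tan(phi))  # self term of the line admittance
    b_t2 = 1.0 / (z0_ohm * math.sin(phi))   # transfer term of the line admittance
    return ((y11 + 1j * b_t1, y12 + 1j * b_t2),
            (y21 + 1j * b_t2, y22 + 1j * b_t1))

print("k without line:", rollett_k(Y_EXAMPLE))
print("k with line:   ", rollett_k(add_feedback_line(Y_EXAMPLE, 60.0, 48.0)))
```

With these placeholder values, k drops from above unity to below unity once the line is added, which mirrors the qualitative behavior shown in Fig. 7. Note that k > 1 alone is not a complete test of unconditional stability; positive $g_{11}$ and $g_{22}$ are also required.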

The stability analysis only determines whether the oscillation can occur, not how strong it is. So next, a rigorous method is presented to give the values of the oscillator design parameters ( $Z_0$, $\varphi_{TL}$, $Y_1$ and $Y_2$ ), which precisely achieve the optimum gain conditions ( $\angle A_{opt}$ and $|A_{opt}|$ ) we derived previously. It is important to note that the electrical length $\varphi_{TL}$ of the self-feeding line in Fig. 8(a) does not result in the same voltage phase shift $\angle A$, because in addition to the traveling wave, a standing wave also exists inside the line.

Fig. 7. The simulated stability of a transistor $(W/L = 27\mu/60n)$ with and without a self-feeding line. The stability factor of the network is k. The circles $\Gamma_G$ and $\Gamma_D$ are the stability boundaries of the terminations of the transistor gate $Y_G$ and drain $Y_D$, respectively. When the locations of $Y_G$ and $Y_D$ on the Smith Chart are inside the instability regions, the circuit oscillates.

Fig. 8. (a) The basic block of the self-feeding oscillator structure. (b) The two-port network analysis of the self-feeding structure, which is decomposed into three subnetworks $N_1$, $N_2$ and $N_3$.

First, the whole circuit is considered to be a two-port network with Y-parameter  $[Y_{total}]$ , which relates the external currents and voltages (Fig. 8(b)):

$$\begin{bmatrix} I_{\text{ext},1} \\ I_{\text{ext},2} \end{bmatrix} = [Y_{\text{total}}] \cdot \begin{bmatrix} V_{\text{ext},1} \\ V_{\text{ext},2} \end{bmatrix}. \tag{11}$$

The entire network is composed of three sub two-port networks,  $N_1$ ,  $N_2$  and  $N_3$ . They share the same voltages, and their currents add up. Therefore,  $[Y_{total}]$  is the sum of the parameters of the sub networks:

$$[Y_{\text{total}}] = [Y_0] + [Y_{\text{TL}}] + [Y_{\text{shunt}}] \tag{12}$$

where the transmission line network  $[Y_{TL}]$  is

$$[Y_{\text{TL}}] = j \begin{bmatrix} -\frac{1}{Z_0 \tan \varphi_{\text{TL}}} & \frac{1}{Z_0 \sin \varphi_{\text{TL}}} \\ \frac{1}{Z_0 \sin \varphi_{\text{TL}}} & -\frac{1}{Z_0 \tan \varphi_{\text{TL}}} \end{bmatrix} = j \begin{bmatrix} B_{T1} & B_{T2} \\ B_{T2} & B_{T1} \end{bmatrix} \tag{13}$$

and the shunt components network  $[Y_{\text{shunt}}]$  is

$$[Y_{\text{shunt}}] = \begin{bmatrix} Y_1 & 0 \\ 0 & Y_2 \end{bmatrix} = \begin{bmatrix} G_1 + jB_1 & 0 \\ 0 & G_2 + jB_2 \end{bmatrix}. \tag{14}$$
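The composition in (12)–(14) maps directly onto a few lines of code. The sketch below builds $[Y_{\text{TL}}]$ for a lossless line and sums the three subnetworks; the device and shunt admittances are hypothetical placeholders, and the helper names are illustrative.

```python
import math

def y_tl(z0_ohm, phi_deg):
    """Y-matrix (13) of a lossless line with impedance z0_ohm and electrical length phi_deg."""
    phi = math.radians(phi_deg)
    b_t1 = -1.0 / (z0_ohm * math.tan(phi))
    b_t2 = 1.0 / (z0_ohm * math.sin(phi))
    return [[1j * b_t1, 1j * b_t2], [1j * b_t2, 1j * b_t1]]

def y_shunt(y1, y2):
    """Y-matrix (14) of the two shunt components at the gate (y1) and the drain (y2)."""
    return [[y1, 0j], [0j, y2]]

def y_sum(*mats):
    """Element-wise sum of 2x2 matrices, as in (12): parallel networks share port voltages."""
    return [[sum(m[r][c] for m in mats) for c in range(2)] for r in range(2)]

# Hypothetical values (S) for illustration only.
Y0 = [[5e-3 + 25e-3j, -0.5e-3 - 6e-3j], [20e-3 - 15e-3j, 6e-3 + 12e-3j]]
Y_TOTAL = y_sum(Y0, y_tl(60.0, 48.0), y_shunt(0.0 + 4e-3j, 10e-3 - 8e-3j))
print(Y_TOTAL)
```

For the 60-Ω, 48° line this gives $B_{T1} \approx -15.0$ mS and $B_{T2} \approx 22.4$ mS, and the two terms always satisfy $B_{T1}/B_{T2} = -\cos\varphi_{\text{TL}}$.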

Next, to solve for $[Y_{\text{total}}]$, which contains all circuit design parameters, two conditions are applied to the linear network described by (11)–(14). First, the self-feeding structure in Fig. 8 is supposed to oscillate at the fundamental frequency $f_0$. For an oscillator, which is a self-sustaining network, the external currents $I_{\text{ext},1}$ and $I_{\text{ext},2}$ are zero. Therefore, we get

$$[Y_{\text{total}}] \cdot \begin{bmatrix} V_{\text{ext},1} \\ V_{\text{ext},2} \end{bmatrix} = \begin{bmatrix} I_{\text{ext},1} \\ I_{\text{ext},2} \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}. \tag{15}$$

The second condition relates to our goal: to achieve the optimum gain $A_{opt}$, which is, by definition, the ratio of $V_{ext,2}$ to $V_{ext,1}$. Therefore, (15) becomes

$$[Y_{\text{total}}] \cdot \begin{bmatrix} 1 \\ A_R + jA_I \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix} \tag{16}$$

where  $A_R$  and  $A_I$  are the real and imaginary parts of  $A_{opt}$ , respectively. Based on (16), we get

$$\begin{cases} y_{11} + Y_1 + jB_{T1} + (y_{12} + jB_{T2})(A_R + jA_I) = 0 \\ y_{21} + jB_{T2} + (y_{22} + Y_2 + jB_{T1})(A_R + jA_I) = 0 \end{cases} \tag{17}$$

which can also be expressed in matrix form after separating the real and imaginary parts of each equation:

$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & -A_I \\ 0 & 0 & 1 & 0 & 1 & A_R \\ 0 & A_R & 0 & -A_I & -A_I & 0 \\ 0 & A_I & 0 & A_R & A_R & 1 \end{bmatrix} \cdot \begin{bmatrix} G_1 \\ G_2 \\ B_1 \\ B_2 \\ B_{T1} \\ B_{T2} \end{bmatrix} = \begin{bmatrix} -g_{11} - \operatorname{Re}(A \cdot y_{12}) \\ -b_{11} - \operatorname{Im}(A \cdot y_{12}) \\ -g_{21} - \operatorname{Re}(A \cdot y_{22}) \\ -b_{21} - \operatorname{Im}(A \cdot y_{22}) \end{bmatrix}. \tag{18}$$

As assumed before, $Y_1$ is lossless, so $G_1$ is zero. Then the linear equations (18) are solved. Specifically, for the transmission line design, we get

$$B_{T2} = \frac{1}{Z_0 \sin \varphi_{\text{TL}}} = \frac{g_{11} + \operatorname{Re}(A \cdot y_{12})}{A_I}. \tag{19}$$
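The design procedure of (16)–(19) can be sketched in a few lines. The code below is a minimal illustration under stated assumptions: the device Y-parameters are hypothetical, $|A_{\text{opt}}|$ is taken as unity (Case 1), $Z_0$ is chosen freely, and the branch $\varphi_{\text{TL}} < 90^\circ$ is selected. It solves the two rows of (17) in complex form rather than inverting the real-valued matrix in (18); the two are equivalent.

```python
import cmath
import math

# Hypothetical transistor Y-parameters (S); not extracted from any real device.
Y_EXAMPLE = ((5e-3 + 25e-3j, -0.5e-3 - 6e-3j),
             (20e-3 - 15e-3j, 6e-3 + 12e-3j))

def design_self_feeding(y, z0_ohm, a_mag=1.0):
    """Solve (17) for the line length and shunt admittances, given Z0 and G1 = 0."""
    (y11, y12), (y21, y22) = y
    a = cmath.rect(a_mag, cmath.phase(-(y21 + y12.conjugate())))  # A_opt, phase from (6)
    b_t2 = (y11.real + (a * y12).real) / a.imag                   # (19)
    sin_phi = 1.0 / (z0_ohm * b_t2)
    if not 0.0 < sin_phi <= 1.0:
        raise ValueError("no lossless line with this Z0 satisfies (19)")
    phi = math.asin(sin_phi)                                      # branch below 90 degrees
    b_t1 = -1.0 / (z0_ohm * math.tan(phi))
    y1 = -(y11 + 1j * b_t1) - (y12 + 1j * b_t2) * a               # first row of (17)
    y2 = -(y21 + 1j * b_t2) / a - (y22 + 1j * b_t1)               # second row of (17)
    return {"a": a, "phi_deg": math.degrees(phi), "b_t1": b_t1, "b_t2": b_t2,
            "y1": y1, "y2": y2}

def residuals(y, d):
    """Left-hand sides of (17); both should vanish for a valid design."""
    (y11, y12), (y21, y22) = y
    a, jb1, jb2 = d["a"], 1j * d["b_t1"], 1j * d["b_t2"]
    return (y11 + d["y1"] + jb1 + (y12 + jb2) * a,
            y21 + jb2 + (y22 + d["y2"] + jb1) * a)

design = design_self_feeding(Y_EXAMPLE, 60.0)
print(design["phi_deg"], design["y1"], design["y2"], residuals(Y_EXAMPLE, design))
```

Because the line and $Y_1$ are lossless, all the power generated by the transistor must be dissipated in $G_2$. With $|A_{\text{opt}}| = 1$, the solved $G_2$ should therefore equal $|y_{12} + y_{21}^*| - g_{11} - g_{22}$, which offers a convenient consistency check against (8).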

In (19), the transmission line impedance $Z_0$ and electrical length $\varphi_{\text{TL}}$ are coupled, and their relation is simulated in Fig. 9. As we will show next, this flexibility is also utilized to optimize the harmonic generation efficiency. In our work, we choose $Z_0$ to be 60 $\Omega$ and $\varphi_{\text{TL}}$ to be 48° at 130 GHz (~150 $\mu$m in physical length). The size of the transistor is $27\mu/60n$, which is based on a comprehensive consideration of the output power, the layout, and the feasibility of implementing the $Z_0$ of the self-feeding line associated with the transistor size.
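The quoted physical length can be cross-checked from the electrical length. The sketch below assumes an effective relative permittivity of 4.0 for the on-chip line, which is a typical figure for an oxide dielectric and is not a value given in this paper.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def line_length_m(phi_deg, f_hz, eps_eff):
    """Physical length of a line with electrical length phi_deg at frequency f_hz."""
    guided_wavelength = C0 / (f_hz * math.sqrt(eps_eff))
    return guided_wavelength * phi_deg / 360.0

# eps_eff = 4.0 is an assumption for illustration, not a value from the text.
print(f"{line_length_m(48.0, 130e9, 4.0) * 1e6:.1f} um")
```

Under this assumption the result is close to the ~150 µm stated above.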



Fig. 9. The different combinations of  $Z_0$  and  $\varphi_{TL}$  of the transmission line at 130 GHz to achieve the gain phase and the output power  $P_{out}$ .

#### B. Harmonic Generation and Efficiency Enhancement

To extract the 2nd-harmonic signal out of the basic self-feeding structure described above, the schematic of the final harmonic oscillator in the 260-GHz radiator array is shown in Fig. 10(a). Two self-feeding structures are coupled through their feedback transmission lines  $TL_1$ . Then two additional lines,  $TL_2$ , extract the 2nd-harmonic signals and combine them at the output. The output node is a virtual ground to the fundamental signal, thus reflects it back. This is because the two self-feeding structures are designed to have differential oscillation (to be discussed next). At  $f_0$ , the short-terminated lines  $TL_2$  (78° in length) also provide the susceptance  $B_2$ in (18). The susceptance  $B_1$  of the gate shunt component is provided by a pair of thick-gate MOS varactors. Such varactors have a quality factor of  $6 \sim 7$  at 130 GHz (zero bias) and a dynamic cut-off frequency of 870 GHz [21]. Besides, their capacitances are small (11 fF), so that in addition to the pulse modulation, this source can also continuously change frequency within a small range, without compromising the output power. With the RF-block resistor  $R_1$  (6.6 k $\Omega$ ), the varactors present high impedance to the common-mode signal at  $2f_0$ .

The above operation relies on the out-of-phase coupling mode of the two self-feeding structures. In this mode, the coupled lines $TL_1$ present the even-mode impedance $Z_{\text{even}}$ to each transistor. Therefore, to achieve the optimum gain $A_{\text{opt}}$, we set $Z_{\text{even}}$ to 60 $\Omega$ and the electrical length (at 130 GHz) to 48°, as derived in the previous section. However, by symmetry, the two self-feeding structures can also potentially oscillate in the in-phase mode, which is undesired. In our design, this mode is suppressed, because the coupled lines then present the odd-mode impedance $Z_{\text{odd}}$ to each transistor, and we set $Z_{\text{odd}}$ to only 20 $\Omega$, which gives a gain far from the optimum (and thus a smaller output power at $f_0$ ). Moreover, the quality factor of the coupled lines in this mode is lower, because the generated magnetic fields are partially canceled.

Fig. 10. (a) The differential oscillator based on the self-feeding structure. (b) The desired out-of-phase oscillation mode. (c) The undesired in-phase oscillation mode.

Fig. 11. The equivalent circuits of the NMOS modeled as a power source at $2f_0$ in (a) a push-push oscillator, and (b) an oscillator where the gate signal is zero.

Next, we discuss the efficiency of harmonic generation inside a harmonic oscillator. When a transistor is driven by a large voltage swing at the gate, its channel current is distorted, and harmonic currents are thus generated. This is the fundamental mechanism utilized by almost all harmonic oscillators. The magnitude of the harmonic current is mainly determined by the distortion of the current at $f_0$, which is a function of the fundamental oscillation power $P_{out}$ and the transistor nonlinear I-V relationship $i_d(v_{g,f_0}, v_{d,f_0})$.<sup>1</sup> Since $P_{\text{out}}$ has been maximized in Section III-A and $i_d(v_{g,f_0}, v_{d,f_0})$ is a property of the device itself, the current distortion, namely the harmonic channel current, is modeled as an independent current source at $2f_0$ (denoted as $i_{2f_0}$ in Fig. 11(a)). Then, at $2f_0$, the transistor is considered as a power source, and the amount of available power at the output load is

$$P_{\text{out},2f_0} = \frac{i_{2f_0}^2}{4G_{\text{out}}} \tag{20}$$

<sup>1</sup>The effect of  $v_{g,2f_0}$  on  $i_d$  is represented by the  $g_m$  at  $2f_0$ , and the effect of  $v_{d,2f_0}$  is represented by the  $g_{ds}$  at  $2f_0$ . Therefore, these two effects are not included in  $i_{2f_0}$ .



Fig. 12. The simulated  $G_{\rm out}$  of the NMOS inside a push-push oscillator, and the NMOS with its gate grounded.

where  $G_{\text{out}}$  is the real part (i.e., conductance) of the total internal admittance in shunt with  $i_{2f_0}$ . For harmonic oscillators, it is important to have smaller  $G_{\text{out}}$  to get higher  $P_{\text{out},2f_0}$ .

For the transistor inside a push-push oscillator, the 2nd-harmonic signals at the gate and drain nodes are equal in both magnitude and phase (Fig. 11(a)). The transistor is therefore diode-connected at  $2f_0$ . Unfortunately, for the transistor core (without parasitic capacitances), such a configuration increases  $G_{\text{out}}$  from  $g_{ds}$  to  $g_{ds} + g_m$ . Essentially, this is due to the negative feedback path between the gate and drain. The simulated  $g_{ds}$  and  $g_m$  of the NMOS (W/L = 27  $\mu$m/60 nm) used in our self-feeding oscillator are 3.2 mS and 24.7 mS, respectively. This means that in push-push oscillators the available harmonic power of the NMOS core is reduced by a factor of 8.7. At the frequencies of interest, the parasitic capacitances are not negligible. As Fig. 11(a) shows, the direct gate-to-drain connection shorts the capacitor  $C_{gd}$ . However,  $C_{gs}$ , which is normally larger than  $C_{gd}$ , is in shunt with  $i_{2f_0}$ . This further increases  $G_{out}$  and decreases  $P_{\text{out},2f_0}$ . Such degradation is even more significant in the presence of the inevitable interconnect from the drain of one transistor to the gate of the other (denoted as  $L_c$  in Fig. 11(a)). The final  $G_{out}$  of the circuit in Fig. 11(a) is plotted in Fig. 12, by simulating the total conductance of the diode-connected NMOS. In addition to  $g_{ds}$  and  $g_m$ , the parasitics dominated by  $C_{gs}$  increase  $G_{out}$  to 35.4 mS at 260 GHz. With an  $L_c$  of only 5 pH,  $G_{out}$  is further increased to 43.5 mS at 260 GHz.

The above analysis indicates that the reduction of the available harmonic power is mainly due to the presence of the 2nd-harmonic signal at the gate. As a comparison, in Fig. 11(b), the gate signal is shorted to ground through a bypass capacitor  $C_{bps}$ . This way, the negative feedback path is eliminated, and the output conductance of the NMOS core becomes  $g_{ds}$ . At high frequency,  $G_{out}$  is increased by  $C_{gd}$ , but since  $C_{gd}$  is smaller than  $C_{gs}$ , such degradation is less than that in the push-push oscillator case. As Fig. 12 shows, the simulated  $G_{out}$  of the NMOS in this case is 11.5 mS at 260 GHz. The available harmonic power  $P_{out,2f_0}$  is therefore 3 to 4 times larger than that of the push-push oscillator.
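As a quick numerical check of (20): for a fixed  $i_{2f_0}$ , the available 2nd-harmonic power scales as  $1/G_{out}$ , so the ratios quoted above follow directly from the simulated conductances. The sketch below is only illustrative; the function name and the 1-mA test current are assumptions, and the test current cancels in every ratio.

```python
def available_power_w(i_2f0_a, g_out_s):
    """Available power of a current source i_2f0 shunted by G_out, as in Eq. (20)."""
    return i_2f0_a ** 2 / (4.0 * g_out_s)


# Simulated values quoted in the text, in siemens.
G_DS = 3.2e-3
G_M = 24.7e-3
G_OUT_PUSH_PUSH = 35.4e-3      # diode-connected NMOS with parasitics, 260 GHz
G_OUT_PUSH_PUSH_LC = 43.5e-3   # same, with a 5-pH interconnect L_c
G_OUT_GATE_GROUNDED = 11.5e-3  # gate bypassed to ground, 260 GHz

I_TEST = 1e-3  # hypothetical harmonic current in amperes (cancels in the ratios)

# Penalty of the diode connection for the parasitic-free transistor core.
core_penalty = available_power_w(I_TEST, G_DS) / available_power_w(I_TEST, G_DS + G_M)

# Improvement of the gate-grounded case over the two push-push cases.
gain_vs_push_push = (available_power_w(I_TEST, G_OUT_GATE_GROUNDED)
                     / available_power_w(I_TEST, G_OUT_PUSH_PUSH))
gain_vs_push_push_lc = (available_power_w(I_TEST, G_OUT_GATE_GROUNDED)
                        / available_power_w(I_TEST, G_OUT_PUSH_PUSH_LC))

print(f"core penalty of diode connection: {core_penalty:.2f}x")
print(f"gate-grounded vs push-push:       {gain_vs_push_push:.2f}x")
print(f"gate-grounded vs push-push + L_c: {gain_vs_push_push_lc:.2f}x")
```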

Similar gate isolation is implemented by the self-feeding lines in our proposed self-feeding harmonic oscillator. As Fig. 13



Fig. 13. The small, lossy impedance of the transistor gate is transformed into a much higher one at 260 GHz through the self-feeding transmission line.

shows, the flexible choice of  $\varphi_{TL}$  in (19) and Fig. 9 is utilized. With an electrical length of  $48^{\circ}$  at  $f_0$ , the coupled lines are slightly over a quarter wavelength ( $96^{\circ}$ ) at  $2f_0$ . This near-quarter-wavelength line is in series with the transistor gate, and in simulation, the small impedance of the gate is increased by a factor of 3.5. Therefore, the  $2f_0$  signal path from the drain to the gate is blocked. The waveforms in Fig. 14(a) are the 2nd-harmonic signals at the gate and drain from the harmonic-balance simulation of the self-feeding harmonic oscillator shown in Fig. 10(a). It can be seen that with the coupled-line blocking, the signal level at the gate is 8.5 dB lower than that at the drain. The simulated output conductance,  $G_{out}$ , of one of the self-feeding structures (Fig. 13) is plotted in Fig. 14(b). The loss of the transmission lines is included. At 260 GHz,  $G_{out}$  is only 10.7 mS, which is even slightly lower than that of the gate-grounded NMOS in Fig. 12. This is because  $V_{G,2f_0}$  and  $V_{D,2f_0}$  in Fig. 14(a) are nearly out of phase, which creates a negative conductance through  $g_m$  in the NMOS core and partially cancels the loss. Lastly, it is also noteworthy that without the shunt gate capacitance, the drain is better impedance-matched to the extraction line  $TL_2$  in Fig. 10(a). The signal at  $2f_0$  inside  $TL_2$  is therefore mainly a traveling wave, which reduces the signal loss caused by multiple reflections.
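The impedance transformation described above can be illustrated with the standard input-impedance formula of a lossless transmission line. This is only a sketch under stated assumptions: the line is treated as a single lossless line with a 60-$\Omega$ characteristic impedance, and the gate impedance `Z_GATE` is a hypothetical value chosen for illustration, so the resulting ratio is not expected to reproduce the simulated factor of 3.5 (which includes line loss and the actual device).

```python
import math


def input_impedance(z_load, z0, theta_deg):
    """Input impedance of a lossless line (Z0, electrical length theta) loaded by z_load."""
    t = math.tan(math.radians(theta_deg))
    return z0 * (z_load + 1j * z0 * t) / (z0 + 1j * z_load * t)


THETA_F0_DEG = 48.0              # electrical length at f0, from the text
theta_2f0_deg = 2.0 * THETA_F0_DEG  # electrical length doubles at 2*f0

Z0 = 60.0                        # assumed line impedance (ohms)
Z_GATE = complex(10.0, -20.0)    # hypothetical small, lossy gate impedance (ohms)

z_in = input_impedance(Z_GATE, Z0, theta_2f0_deg)

print(f"electrical length at 2f0: {theta_2f0_deg:.0f} degrees")
print(f"|Z_gate| = {abs(Z_GATE):.1f} ohm, |Z_in| = {abs(z_in):.1f} ohm")
```

Because the line is close to 90° long at  $2f_0$ , it behaves approximately as a quarter-wave transformer, so a small load impedance appears as a much larger one at the drain side.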

From the above analyses, we conclude that in comparison to a conventional push-push oscillator, the proposed self-feeding harmonic oscillator increases the fundamental oscillation power by a factor of 2, and reduces the output conductance of the current source at  $2f_0$  by a factor of about 4 (from 43.5 mS to 10.7 mS). The simulated output power of a single self-feeding harmonic oscillator is 0.82 mW. When the antenna feed line and output matching stubs are included, the output power is 0.6 mW. The simulated DC power consumption of one harmonic oscillator is 49 mW from a 1.2-V supply.

## IV. OTHER CIRCUIT BUILDING BLOCKS

In this section, the designs of other building blocks of the 260-GHz CMOS radiator array are described, which include the 130-GHz quadrature oscillator, the 260-GHz switch with pulse modulation, and the 260-GHz on-chip antenna.

## A. 130-GHz Quadrature Oscillator

As is shown in Fig. 2, the eight differential self-feeding oscillators are inter-coupled through four quadrature oscillators. The schematic of the quadrature oscillator is shown in Fig. 15(a).



Fig. 14. (a) The simulated 2nd-harmonic voltage waveforms at the gate and drain nodes of the self-feeding harmonic oscillator. (b) The simulated output conductance,  $G_{\rm out}$ , of the self-feeding harmonic oscillator.

It is composed of four single-transistor amplifier stages connected end to end. At the fundamental oscillation frequency (~130 GHz), the transistor is unconditionally stable (k = 1.14). Therefore, for each amplifier stage, the input and output transmission-line networks are designed to achieve the simultaneous conjugate matching [34] (Fig. 15(b)). This way, each amplifier not only has the maximum available gain  $G_{ma}$ , but also has a phase shift that is easily adjusted to  $-270^{\circ}$  by changing the lines between the stages [35]. The simulated  $S_{21}$  of each stage (including the loss of lines) in Fig. 16 (Mode 1) has a magnitude of 2.1 dB at 137 GHz, where the phase shift is  $-270^{\circ}$ . Therefore, the loop oscillates near this frequency with the desired quadrature phase.

Unfortunately, the loop can also potentially oscillate in another mode with an undesired phase. As is shown in Fig. 16, at a lower frequency near 100 GHz (Mode 2), the stage phase shift is  $-180^{\circ}$ , and the gain is also larger than one. To address this issue, in Fig. 15(a) the source nodes of the MOSFETs on the two opposite sides are combined, and a transmission line  $TL_s$ ( $\varphi = 30^{\circ}$ at 130 GHz) is inserted between the combined node and the ground. In the desired quadrature oscillation mode, the currents in those two source nodes are out of phase, and cancel once combined (Fig. 16). This way the line  $TL_s$  does not change the operation described above. But in the undesired out-of-phase mode, the two currents are in phase, and they flow into  $TL_s$  after being combined. The simulation results in Fig. 16 indicate that



Fig. 15. (a) The schematic of the quadrature oscillator and (b) the simultaneous conjugate matching of the stage.



Fig. 16. The simulated stage gain of the quadrature oscillator in the desired quadrature mode (Mode 1) and undesired out-of-phase mode (Mode 2).

with such source degeneration, the stage gain in the undesired mode (Mode 2) is suppressed below 0 dB. This circuit modification therefore selects the correct mode. In simulation, the quadrature oscillator consumes 76-mW DC power.
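The mode selection described above can be summarized with the usual loop conditions: the total phase around the four-stage ring must be a multiple of 360°, and the loop gain must exceed 0 dB. The following sketch checks both modes; the stage gain assumed for Mode 2 after source degeneration is a hypothetical placeholder, since the text only states that it is below 0 dB.

```python
N_STAGES = 4


def loop_phase_closes(stage_phase_deg, n_stages=N_STAGES):
    """True if the accumulated loop phase is an integer multiple of 360 degrees."""
    return (stage_phase_deg * n_stages) % 360 == 0


def can_oscillate(stage_phase_deg, stage_gain_db, n_stages=N_STAGES):
    """Phase condition plus loop gain above 0 dB."""
    return loop_phase_closes(stage_phase_deg, n_stages) and stage_gain_db * n_stages > 0


# Mode 1: -270 degrees and 2.1 dB per stage near 137 GHz (values from the text).
mode1_ok = can_oscillate(-270, 2.1)
mode1_step_deg = -270 % 360   # 90-degree step between stages, i.e. quadrature

# Mode 2: -180 degrees per stage near 100 GHz; the degenerated gain below is
# a hypothetical number standing in for "suppressed below 0 dB".
MODE2_GAIN_DB_ASSUMED = -0.5
mode2_phase_ok = loop_phase_closes(-180)
mode2_ok = can_oscillate(-180, MODE2_GAIN_DB_ASSUMED)

print("Mode 1 oscillates:", mode1_ok, "| stage-to-stage step:", mode1_step_deg, "deg")
print("Mode 2 phase closes:", mode2_phase_ok, "| oscillates:", mode2_ok)
```

Both modes satisfy the phase condition, which is why the gain of Mode 2 has to be suppressed by  $TL_s$  rather than relying on phase alone.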



Fig. 17. (a) Shunt switches based on MOSFET and MOS varactor, with the imaginary parts tuned out by an ideal inductor. (b) Simulated OFF and ON impedance ratios of PMOS, NMOS and MOS varactor.

## B. Sub-Millimeter-Wave Switch and Pulse Modulation

Normally, a MOS transistor is used for switching. However, at millimeter-wave and terahertz frequencies, the parasitic capacitors of the transistor leak a significant part of the signal when the channel is pinched off [36]. Note that even if the capacitance is tuned out by a shunt inductor, such a leakage path still exists, because the quality factor of the parasitic capacitors at such high frequency is low. To evaluate the switching capability of the device, the impedance ratio of the device between its OFF and ON states is simulated, after the imaginary part of the impedance is tuned out by a lossless inductor (Fig. 17(a)). A high  $Z_{OFF}/Z_{ON}$  is desired. However, Fig. 17(b) indicates that this ratio for an NMOS decreases from  $65 \times$  at 50 GHz to only  $9 \times$  at 260 GHz. To make things worse, since the signal lines to be switched carry the power supply  $(V_{DD})$ , only PMOS (instead of NMOS) can be used, and the impedance ratio of PMOS (Fig. 17(b)) is only  $4 \times$  at 260 GHz. Such a low OFF/ON ratio leads to inefficient switching, which reduces the power and bandwidth of the pulse-modulated output.

On the other hand, the simulated  $Z_{OFF}/Z_{ON}$  of an n-type MOS varactor in the same process is as high as  $21 \times$  at 260 GHz. This is due to several reasons. (1) The switching of MOS transistors relies on the resistive change of the channel,  $\Delta R_{\rm ch}$ . To reduce  $Z_{ON}$ ( $\approx R_{\rm ch,on}$ ), a larger channel width is needed, which directly increases the lossy parasitic capacitance and reduces  $Z_{OFF}$ . In comparison, a varactor switch utilizes the capacitance change  $C_{\rm max}/C_{\rm min}$  of the core (excluding the parasitics), which is not limited by the device size. The parasitic capacitance of the varactor, therefore, is minimized. (2) At higher frequency, while the  $Z_{ON}$  of the MOS transistor ( $\approx R_{\rm ch,on}$ ) remains the same, the  $Z_{ON}$  of the varactor is smaller, which partially compensates for the degradation of  $Z_{OFF}/Z_{ON}$  due to the smaller  $Z_{OFF}$  at higher

frequency. (3) The silicon for current conduction in MOS varactors is in accumulation mode, while that in MOS transistors is in inversion mode, which has more loss. In addition to the above reasons, more analysis is to be done to fully explain the results in Fig. 17(b), because the nonlinearity and loss comparison of different devices is highly dependent on process and layout. The superior nonlinearity and loss performance of MOS varactor in the same process has also been demonstrated by another 480-GHz passive frequency doubler work [21]. In the schematic shown in Fig. 18(a), a pair of MOS varactors modulates the differential signals, which are driven by two self-feeding oscillators inside a radiator unit. So the varactor bottom control node behaves as a virtual ground at  $2f_0$ . This way the large and lossy parasitics, including the n-well to p-substrate capacitor and the digital pulse generator output impedance, are invisible to the 260-GHz signal, hence the associated loss is eliminated. The tuning inductors in Fig. 17(a) are absorbed into the design of the output networks.

Near each switch, a local digital pulse generator is placed to provide the narrow-pulse control train. Compared to a scheme with a single central pulse generator, our solution minimizes the dispersion caused by a long distribution network, which would smooth out the sharp pulses. The schematic of the pulse generator is shown in Fig. 18(a) [37]. A multi-GHz signal feeds the inputs of a NOR gate through two inverter chains with a small delay mismatch  $\Delta t_{delay}$ . The output of the NOR gate is therefore a sharp pulse with a width close to  $\Delta t_{delay}$ . The transistors of the pulse generators are 2.5-V thick-gate I/O devices, which provide a larger pulse amplitude to fully turn the varactor switches on and off. The simulated differential output waveform of one self-feeding oscillator pair, with pulse modulation, is shown in Fig. 18(b). The pulse width is 45 ps, and in the idle cycle, 80% of the radiation power is attenuated.
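The pulse-generation principle can be sketched with ideal logic. The model below assumes that one inverter chain has a net inversion relative to the other (a common edge-detector arrangement, not stated explicitly in the text), uses an ideal 50%-duty square wave with a 286-ps period (close to a 3.5-GHz input), and sets the delay mismatch to 45 ps. All names and delay values are illustrative.

```python
def square_wave(t_ps, period_ps):
    """Ideal 50%-duty square wave: 1 during the first half period, 0 otherwise."""
    return 1 if (t_ps % period_ps) < period_ps // 2 else 0


def nor(a, b):
    """Two-input NOR: high only when both inputs are low."""
    return 0 if (a or b) else 1


def pulse_train(period_ps, delay_a_ps, delay_b_ps, n_periods=4):
    """NOR output sampled every 1 ps over an integer number of input periods."""
    out = []
    for t in range(n_periods * period_ps):
        a = square_wave(t - delay_a_ps, period_ps)      # non-inverting chain
        b = 1 - square_wave(t - delay_b_ps, period_ps)  # inverting chain (assumed)
        out.append(nor(a, b))
    return out


PERIOD_PS = 286        # about 3.5 GHz
DELAY_A_PS = 100       # hypothetical delay of the first chain
DELTA_T_PS = 45        # delay mismatch between the two chains
N_PERIODS = 4

wave = pulse_train(PERIOD_PS, DELAY_A_PS, DELAY_A_PS + DELTA_T_PS, N_PERIODS)
pulse_width_ps = sum(wave) // N_PERIODS   # high time per period at 1-ps steps

print(f"pulse width per period: {pulse_width_ps} ps")
```

In this idealized model the output is high only during the interval between one input edge arriving through the faster chain and the same edge arriving through the slower chain, so the pulse width equals the delay mismatch.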

## C. Slot Antenna and Radiation With A Silicon Lens

To radiate the high-power terahertz pulse, the on-chip antenna should have broad bandwidth and high radiation efficiency. However, there is normally a design tradeoff between these two merits. To avoid the lossy silicon substrate, some antennas like microstrip patch have a ground plane which reflects the radiation to the front side. But in CMOS process, the radiator plate and ground plate are so close (< 10  $\mu$ m), that the resonance cavity they form has a very high quality factor. Therefore the impedance matching bandwidth is only around 5% [4]. On the other hand, for the antennas without ground shield, the bandwidth is greatly improved. But in this case, most of the wave is absorbed into the silicon and travels towards the back side. At the silicon-to-air interface on the back, the power reflection rate for an incident angle  $\theta_i$  is [38]

$$R_{Si} = \left(\frac{n_1 \cos \theta_i - n_2 \sqrt{1 - n_{Si}^2 \sin^2 \theta_i}}{n_1 \cos \theta_i + n_2 \sqrt{1 - n_{Si}^2 \sin^2 \theta_i}}\right)^2$$
(21)

where the refractive index of the silicon  $n_{Si}$  is 3.45. The values of  $n_1$  and  $n_2$  are  $n_{Si}$  and 1 for s-polarization, and 1 and  $n_{Si}$ for p-polarization. Based on the calculated plots in Fig. 19(a), the total reflection critical angle is only 16° due to the large



Fig. 18. (a) The schematic of the 260-GHz switch with pulse modulation. (b) Simulated output waveform of a self-feeding oscillator pair with pulse modulation.

 $n_{Si}$ . All waves outside this small "window" are totally reflected and trapped in the substrate. The radiation efficiency is therefore greatly reduced. To reduce the substrate wave and loss, the wafer is thinned to about 100  $\mu$m in some works [7], [25].
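A minimal numerical sketch of (21) is given below. It evaluates the power reflection for both polarizations and the critical angle  $\arcsin(1/n_{Si})$ , which comes out just under 17° and corresponds to the narrow window discussed above. The function names are illustrative, and total internal reflection is handled by returning unity reflection when the square-root argument becomes negative.

```python
import math

N_SI = 3.45  # refractive index of silicon, from the text


def critical_angle_deg(n_si=N_SI):
    """Total-internal-reflection angle at the silicon-to-air interface."""
    return math.degrees(math.asin(1.0 / n_si))


def reflectance(theta_i_deg, polarization, n_si=N_SI):
    """Power reflection of Eq. (21); polarization is 's' or 'p'."""
    theta = math.radians(theta_i_deg)
    radicand = 1.0 - (n_si * math.sin(theta)) ** 2
    if radicand <= 0.0:
        return 1.0  # beyond the critical angle: total reflection
    n1, n2 = (n_si, 1.0) if polarization == "s" else (1.0, n_si)
    cos_i = math.cos(theta)
    root = math.sqrt(radicand)
    return ((n1 * cos_i - n2 * root) / (n1 * cos_i + n2 * root)) ** 2


print(f"critical angle: {critical_angle_deg():.1f} deg")
for angle in (0, 5, 10, 15, 20):
    print(f"theta_i = {angle:2d} deg: "
          f"R_s = {reflectance(angle, 's'):.3f}, R_p = {reflectance(angle, 'p'):.3f}")
```

At normal incidence both polarizations give a reflection of about 30%, which is the value that applies to a hemispheric lens where the rays meet the surface nearly perpendicularly.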

In our work, an array of 8 slot antennas is used. Without a ground reflector, the bandwidth for impedance matching  $(|S_{11}| < -10 \text{ dB})$  is over 60 GHz. Compared to the commonly used dipole antenna, a slot antenna better suppresses the undesired front-side radiation [39], and the wide metal plane for current conduction reduces loss. Moreover, being a slot in a ground plane (Fig. 20(a)), the antenna fits better into the chip layout for a compact, efficient feed-line network. A shorter distance between antenna elements is also crucial to reduce the side lobes of the combined array beam. To handle the reflection issue, a high-resistivity silicon lens is attached to the chip backside (Fig. 20(b)). The lens is hemispheric, so that the incident wave in each direction is very close to normal to the lens surface, and thus has the minimum reflection. In HFSS [40], the simulated directivity of one slot antenna element, including the 260-$\mu$m-thick substrate (10  $\Omega \cdot$ cm), is 7.6 dBi for the radiation into a hypothetical semi-infinite silicon space (Fig. 21(a)). The directivity is further enhanced by 9 dB in the 8-element array (Fig. 21(b)). The simulated radiation efficiency from the antenna array to the inside of the silicon lens is 60%. Even with 30% reflection at the lens surface (calculated in Fig. 19(a)), the total radiation efficiency (antenna-to-air) is still as high as 42%.
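The directivity and efficiency budget above can be reproduced with simple arithmetic. The sketch assumes an ideal array gain of  $10\log_{10}N$  for  $N$  in-phase elements, which is an idealization used here only to show where the 9-dB figure comes from; the variable names are illustrative.

```python
import math

N_ELEMENTS = 8
ELEMENT_DIRECTIVITY_DBI = 7.6      # simulated single-slot directivity (from the text)
EFFICIENCY_INTO_LENS = 0.60        # antenna array to the inside of the lens
LENS_SURFACE_REFLECTION = 0.30     # power reflection at the lens surface

# Ideal in-phase array gain (assumption: identical, perfectly combined elements).
array_gain_db = 10.0 * math.log10(N_ELEMENTS)
array_directivity_dbi = ELEMENT_DIRECTIVITY_DBI + array_gain_db

# Antenna-to-air efficiency: what enters the lens times what leaves its surface.
total_efficiency = EFFICIENCY_INTO_LENS * (1.0 - LENS_SURFACE_REFLECTION)

print(f"array gain:         {array_gain_db:.2f} dB")
print(f"array directivity:  {array_directivity_dbi:.1f} dBi")
print(f"total efficiency:   {100.0 * total_efficiency:.0f} %")
```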



Fig. 19. (a) The calculated reflection rate at the silicon-to-air interface. (b) The illustration of the radiated and reflected waves from an on-chip antenna.

The diameter of the silicon lens  $D_{\text{lens}}$  is 10 mm, which is about 40× larger than the chip thickness  $h_{\text{chip}}$ . Therefore, the convergence of the array radiation beam in Fig. 21(b) caused by the lens is very small [41]. According to the geometrical optics in Fig. 20(b), the convergence effect is more significant for a large polar angle  $\theta$ , or in other words, for antennas with less directivity. This is a difference between the previous works, which integrate a silicon lens with a single antenna element [41] or an array of independent elements [7], and our work, where the array beam is much more concentrated in the vertical direction. Even at the edge of the main lobe ( $\theta = 13^{\circ}$ ), where the radiation intensity has already dropped to ~2% of the peak, the deviation  $\Delta \theta_i$  is only  $0.7^{\circ}$ .
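The small deviation quoted above is consistent with a simple geometrical-optics estimate. The sketch below assumes a point source on the lens axis, located a distance  $h_{\text{chip}}$  (taken as the 260-$\mu$m substrate thickness) behind the center of the hemisphere. Applying the law of sines to the triangle formed by the source, the lens center, and the point where the ray meets the lens surface gives  $\sin\Delta\theta_i = (h_{\text{chip}}/R_{\text{lens}})\sin\theta$ . This derivation and the function name are assumptions made for illustration, not taken from the original analysis.

```python
import math


def incidence_deviation_deg(theta_deg, h_chip_mm, lens_radius_mm):
    """Angle between the ray and the lens-surface normal for an on-axis source
    placed h_chip behind the center of a hemispheric lens."""
    s = (h_chip_mm / lens_radius_mm) * math.sin(math.radians(theta_deg))
    return math.degrees(math.asin(s))


H_CHIP_MM = 0.26        # substrate thickness from the text (260 um)
LENS_RADIUS_MM = 5.0    # half of the 10-mm lens diameter

deviation_deg = incidence_deviation_deg(13.0, H_CHIP_MM, LENS_RADIUS_MM)
print(f"deviation at theta = 13 deg: {deviation_deg:.2f} deg")
```

The deviation grows with  $\theta$  and with  $h_{\text{chip}}/R_{\text{lens}}$ , which is why a lens much larger than the chip thickness leaves a directive array beam almost unchanged.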

With a hyper-hemispheric lens, more directive radiation beam and higher EIRP (effective isotropic radiated power) are expected. However, since the wave radiated from the chip to the lens is not normal to the lens surface, the total radiated power is reduced by the higher reflection loss (Fig. 19(a)). Meanwhile, the beam convergence by the hyper-hemispheric lens makes it difficult to compare the simulated pattern in Fig. 21 with measurements. Therefore, hyper-hemispheric lens is not used in this work.

## V. PROTOTYPE AND EXPERIMENTAL RESULTS

The 260-GHz radiator array is implemented using a 65-nm bulk CMOS technology. Fig. 22 shows the micrograph of the die, which has an area of  $1.5 \times 1.5 \text{ mm}^2$ . Fig. 22 also illustrates the packaging of the chip and the integration with the silicon lens. First, the edges of the chip front side are glued onto an FR-4 PCB, which has a hole to expose the chip pads. Then wires are bonded to connect the chip pads with the PCB pads. Since the EM field of the slot antennas is concentrated in the back side, the bond wires do not interfere with the circuit operation. As the



Fig. 20. (a) The implementation of the 260-GHz slot antenna. (b) The 8-element on-chip antenna array with a silicon lens attached on the chip back.

die photo shows, the antennas are laid out in the diagonal direction of the chip; so to align the radiation E-field with that of the receiver antenna in the measurement, the chip is rotated by 45° when mounted on the PCB. Next, the silicon lens is pressed onto the chip backside by a 2-D micromanipulator. The alignment of the lens with respect to the chip is adjusted until the direction of the output radiation beam is perpendicular to the chip. Finally, the lens is glued to the PCB. The chip consumes 0.8-W power from a 1.2-V DC supply. The large DC current flows through only 40 RF transmission lines (W = 2  $\mu$m) into the transistors, so it is critical to prevent overheating that may burn out the lines. Unfortunately, since the two sides of the chip are occupied by the bond wires and the silicon lens, respectively, it is not possible to mount a heat sink as a thermal pathway. A cooling fan mounted at the front of the chip is used instead. A photo of the packaging is shown in Fig. 23.

The setup for measuring the radiation frequency and spectrum is shown in Fig. 24. The modulation function of the chip is first turned off. The radiation from the chip is received by a diagonal horn antenna (gain = 25 dBi) cascaded by a VDI WR-3.4 even-harmonic mixer (EHM). The diode-based mixer is first forward biased with a 10- $\mu$ A current to operate in the direct power detection mode. Its output response is much faster than that of the calorimeter, which greatly helps the initial silicon lens-to-chip alignment. Next, through a PMP-MD4A diplexer,



Fig. 21. The simulated radiation pattern of (a) the slot antenna unit and (b) the 8-element slot antenna array.



Fig. 22. The micrograph of the 260-GHz radiator array in CMOS and the integration of the silicon lens.

an LO signal is fed into the harmonic mixer. The input radiation signal is mixed with the 16th harmonic of the LO signal, and is thus down-converted to a low-frequency signal ( $f_{\rm IF} \approx 1$  GHz). The IF signal is measured by a spectrum analyzer after being amplified by an LNA (gain  $\approx 30$  dB). The chip radiation frequency is given by  $f_{\rm RF} = N \cdot f_{\rm LO} \pm f_{\rm IF}$  (N = 16), but in practice, the input radiation signal can also be mixed with other LO harmonics. In our measurement, the value of N is determined by the fact that  $f_{\rm IF}$  is shifted by 160 MHz if  $f_{\rm LO}$  is intentionally changed by 10 MHz. When  $f_{\rm LO}$  is 16.19 GHz, the measured IF spectrum around 1 GHz is obtained (Fig. 25). This gives a measured radiation frequency of 260 GHz. Fig. 25 also shows the measured phase noise spectrum from 500 kHz to 5 MHz. At 1-MHz offset, the phase noise is -78.3 dBc/Hz. In Fig. 10(a), there is a pair



Fig. 23. The photo of the chip package and the testing setup with sub-harmonic mixer.



Fig. 24. The block diagram of the testing setups for measuring the radiation frequency and spectrum of the 260-GHz radiator array.



Fig. 25. The measured baseband spectrum from the sub-harmonic mixer.

of small-value varactors inside each self-feeding oscillator. By changing the varactor bottom plate bias,  $V_{tune}$ , the measured radiation frequency of the chip is changed by 3.7 GHz, shown in Fig. 26.
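The harmonic-number identification and the resulting radiation frequency described above amount to a short calculation. In the sketch below, the upper-sideband sign in  $f_{\rm RF} = N \cdot f_{\rm LO} \pm f_{\rm IF}$  is an assumption chosen because it matches the reported 260-GHz result; the function names are illustrative.

```python
def harmonic_number(delta_f_if_hz, delta_f_lo_hz):
    """LO harmonic index N, from the IF shift caused by a known LO shift."""
    return round(abs(delta_f_if_hz / delta_f_lo_hz))


def rf_frequency_hz(n, f_lo_hz, f_if_hz, upper_sideband=True):
    """f_RF = N * f_LO +/- f_IF; the sign must be known or assumed."""
    return n * f_lo_hz + (f_if_hz if upper_sideband else -f_if_hz)


# Values from the text: a 10-MHz LO step shifts the IF by 160 MHz.
n = harmonic_number(160e6, 10e6)

# f_LO = 16.19 GHz and f_IF of about 1 GHz (upper sideband assumed).
f_rf_hz = rf_frequency_hz(n, 16.19e9, 1e9)

print(f"N = {n}")
print(f"f_RF = {f_rf_hz / 1e9:.2f} GHz")
```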

The aperture diameter of the horn,  $D_h$ , is 5.6 mm [42], which results in a minimum far-field distance of 54 mm in free space  $(2D_h^2/\lambda_0, [43], [44])$ . Due to the silicon lens with a radius of



Fig. 26. The measured frequency tuning range of the radiated signal.



Fig. 27. The measured radiation pattern of the 260-GHz signal source.

5 mm (equivalent to  $5 \cdot \sqrt{\epsilon_{Si}} = 17$  mm in free space), the distance between the horn antenna and the silicon lens surface should be larger than 37 mm. In our measurement, it is 40 mm (45 mm from the horn to the chip antenna). Next, the radiation pattern is characterized by rotating the chip in the azimuth  $\varphi$  and elevation  $\theta$  directions (Fig. 24) with a set of servo motors. The measured normalized intensity,  $F(\theta, \varphi)$ , in the E-plane ( $\varphi = 0$ ) and H-plane ( $\varphi = \pi/2$ ) is shown in Fig. 27. The directivity of the chip with the backside radiation ( $\theta = \pi$ ) is determined by [43]

$$D_{0} = 4\pi \frac{F(\theta_{0},\varphi_{0})}{\int_{0}^{2\pi} \int_{0}^{\pi} F(\theta,\varphi) \sin \theta d\theta d\varphi}$$
$$\approx 4\pi \frac{F_{\theta=\pi}}{\frac{2\pi}{M} \frac{2\pi}{2N} \sum_{j=1}^{M} \sum_{i=1}^{N} F(\theta_{i},\varphi_{j}) \sin \theta_{i}}, \qquad (22)$$

where M, the number of measured pattern cuts, equals 2. For highly directive antennas, the two orthogonal planes (E-plane and H-plane) are adequate [43], [44]. In (22), N, which equals 90, is the number of measured points within each pattern cut  $(\Delta \theta = (\pi/2N)(180/\pi) = 1^{\circ})$ . The measured directivity of the chip is 15.2 dBi. The higher side lobes in the measurement cause the 1.4-dB directivity degradation compared to the simulation in Fig. 21(b). Next, an Erickson PM4 calorimeter is used to accurately measure the radiation power (Fig. 28).<sup>2</sup> At the same far-field distance, the measured power is 48  $\mu$W, which gives



Fig. 28. The block diagram of the testing setup for the accurate radiation power measurement of the 260-GHz radiator array.



Fig. 29. The measured radiation spectrum of the 260-GHz radiator array with and without the narrow-pulse modulation.

an EIRP of 15.7 dBm based on the Friis equation [45]. This is equivalent to a 37-mW isotropic source for the same radiation power density. The actual radiation power of our source is  $(EIRP_{dBm}-Directivity_{dB})$ , which is 0.5 dBm, or 1.1 mW in the measurement.
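The measurement chain above (far-field distance, the discrete directivity estimate of (22), and the Friis-based EIRP) can be sketched numerically as follows. Several assumptions are made: the link distance is taken as the 45 mm from the horn to the chip antenna, the angle in `directivity_from_cuts` is measured from the beam axis with midpoint sampling, and the uniform test pattern is synthetic. With these assumptions the EIRP comes out within a few tenths of a dB of the reported 15.7 dBm.

```python
import math

C_M_PER_S = 299_792_458.0
FREQ_HZ = 260e9


def far_field_distance_m(aperture_m, freq_hz):
    """Minimum far-field distance 2*D^2/lambda in free space."""
    return 2.0 * aperture_m ** 2 / (C_M_PER_S / freq_hz)


def directivity_from_cuts(cuts, f_peak):
    """Discrete form of Eq. (22): M pattern cuts, each with N samples over 90 degrees."""
    m, n = len(cuts), len(cuts[0])
    d_theta = math.pi / (2 * n)
    total = sum(f * math.sin((i + 0.5) * d_theta)   # midpoint sampling (assumed)
                for cut in cuts for i, f in enumerate(cut))
    return 4.0 * math.pi * f_peak / ((2.0 * math.pi / m) * d_theta * total)


def eirp_dbm(p_rx_w, g_rx_dbi, distance_m, freq_hz):
    """EIRP from the Friis equation: received power minus receive gain plus path loss."""
    path_loss_db = 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / C_M_PER_S)
    return 10.0 * math.log10(p_rx_w / 1e-3) - g_rx_dbi + path_loss_db


far_field_m = far_field_distance_m(5.6e-3, FREQ_HZ)

# Sanity check with a synthetic pattern: uniform over one hemisphere gives D = 2.
d_hemisphere = directivity_from_cuts([[1.0] * 90, [1.0] * 90], 1.0)

eirp_est_dbm = eirp_dbm(48e-6, 25.0, 45e-3, FREQ_HZ)   # 45-mm distance is assumed

# Radiated power from the reported EIRP and measured directivity.
radiated_dbm = 15.7 - 15.2
radiated_mw = 10.0 ** (radiated_dbm / 10.0)

print(f"far-field distance: {far_field_m * 1e3:.1f} mm")
print(f"hemisphere directivity check: {d_hemisphere:.3f}")
print(f"estimated EIRP: {eirp_est_dbm:.1f} dBm")
print(f"radiated power: {radiated_mw:.2f} mW")
```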

Finally, the narrow-pulse modulation of the chip is turned on by injecting a 3.5-GHz sinusoidal signal into the modulation port. Then, by sweeping the LO frequency  $f_{\rm LO}$  of the mixer, the frequencies of the other sidebands are found. Although the power of each sideband cannot be measured separately by the power meter, it can still be estimated using the mixer, based on its relative difference from the single-tone power measured before. The measured radiation spectrum is plotted in Fig. 29. In total, 6 sidebands are measured above the noise floor. They are spaced by 3.5 GHz, which gives a null-to-null radiation bandwidth of 21 GHz. Fig. 29 also presents the simulated radiation spectrum, which has a bandwidth as high as 40 GHz. In the measurement, the radiation frequency can also be continuously changed by 3.7 GHz, as described earlier. The entire spectrum is therefore shifted by the same amount, and since this tuning range exceeds the 3.5-GHz sideband spacing, the ranges swept by neighboring sidebands overlap. This means the radiation from our 260-GHz array continuously covers a bandwidth of 24.7 GHz. In Fig. 29, the lowest measured power of the sidebands is  $-32$  dBm (0.6  $\mu$W), which is higher than the typical average power of the incoherent blackbody source inside a Fourier transform spectroscopy system (0.1  $\mu$W, [46]). This indicates the feasibility of integrating the 260-GHz radiator array into an FTIR-based spectrometer.
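The bandwidth figures above follow from simple bookkeeping, sketched below. The continuity condition (tuning range at least as large as the sideband spacing) is the reasoning stated in the text; the variable and function names are illustrative.

```python
N_SIDEBANDS = 6
SIDEBAND_SPACING_GHZ = 3.5    # equal to the modulation frequency
TUNING_RANGE_GHZ = 3.7        # measured varactor tuning range
CENTER_FREQ_GHZ = 260.0


def dbm_to_microwatt(p_dbm):
    """Convert a power level in dBm to microwatts."""
    return 10.0 ** (p_dbm / 10.0) * 1e3


null_to_null_ghz = N_SIDEBANDS * SIDEBAND_SPACING_GHZ

# The comb fills the gaps between sidebands only if it can be shifted by at
# least one sideband spacing.
coverage_is_continuous = TUNING_RANGE_GHZ >= SIDEBAND_SPACING_GHZ

covered_bandwidth_ghz = null_to_null_ghz + TUNING_RANGE_GHZ
fractional_bandwidth_pct = 100.0 * covered_bandwidth_ghz / CENTER_FREQ_GHZ
weakest_sideband_uw = dbm_to_microwatt(-32.0)

print(f"null-to-null bandwidth: {null_to_null_ghz:.1f} GHz")
print(f"continuous coverage:    {coverage_is_continuous}")
print(f"covered bandwidth:      {covered_bandwidth_ghz:.1f} GHz "
      f"({fractional_bandwidth_pct:.1f} %)")
print(f"weakest sideband:       {weakest_sideband_uw:.2f} uW")
```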

## VI. CONCLUSIONS

The self-feeding oscillator structure proposed in this paper achieves the optimum gain conditions for the fundamental

<sup>2</sup>The effect of the chip infrared radiation is examined by rotating the powered chip by 90°, in order to preserve the infrared radiation but block the coupling at 260 GHz using the polarization orthogonality of the antennas. The thermal effect is not observed, which may be due to the active cooling on the other side of the PCB. The power measurement is also calibrated using the sub-harmonic mixer in its direct-power-detection mode (Fig. 24).

TABLE I

Performance Comparison of Sub-Millimeter-Wave Signal Sources in CMOS

| References | Frequency (GHz) | Output Power (dBm) | EIRP (dBm) | Bandwidth | Phase Noise @ 1 MHz | DC Power (W) | Area (mm<sup>2</sup>) | CMOS Technology |
|------------|-----------------|--------------------|------------|-----------|---------------------|--------------|-----------------------|-----------------|
| [24]       | 290             | -1.2<sup>†</sup>   | -          | 4.5%      | -78 dBc/Hz          | 0.32         | 0.36                  | 65-nm Bulk      |
| [25]       | 280             | -7.2<sup>††</sup>  | 9.4        | 3.2       | -                   | 0.81         | 7.3                   | 45-nm SOI       |
| [10]       | 260             | -                  | 5          | -         | -                   | 0.69         | ~3                    | 65-nm Bulk      |
| [27]       | 288             | -4.1<sup>†††</sup> | -          | 0.7%      | -87 dBc/Hz          | 0.28         | 0.29                  | 65-nm Bulk      |
| [11]       | 210             | -                  | 5.13       | >6.7%     | -                   | 0.24         | 3.5                   | 32-nm SOI       |
| This Work  | 260             | 0.5<sup>†††</sup>  | 15.7       | 9.5%      | -78.3 dBc/Hz        | 0.8          | 2.3                   | 65-nm Bulk      |

<sup>†</sup>: Power measured through probing, not radiation

<sup>††</sup>: Wafer thinning is used

<sup>†††</sup>: Silicon lens is used (hyperspheric in [27], and hemispheric in this work)

oscillation, and therefore maximizes the device voltage swings for the nonlinear frequency conversion  $(f_0$ -to- $2f_0)$ . Meanwhile, by blocking the negative-feedback loop and the path to the lossy gate load for the signal at  $2f_0$ , the available harmonic power from the transistor is greatly improved. The CMOS prototype deploying 8 self-feeding units demonstrates an EIRP of 15.7 dBm and radiated power of 1.1 mW. Meanwhile, the narrow-pulse modulation scheme of the chip effectively broadens the radiation spectrum to 24.7 GHz. The performance of the chip is summarized in Table I, along with a comparison with other state-of-the-art sub-millimeter-wave CMOS signal sources. Our 260-GHz radiator array achieves the highest radiated power, EIRP, and bandwidth in the table.

#### ACKNOWLEDGMENT

The authors acknowledge the TSMC University Shuttle Program for chip fabrication. The authors also thank the previous and current group members Dr. Yahya Tousi, Dr. Guansheng Li, Prof. Omeed Momeni, and Dr. Muhammad Adnan for the productive discussion, and they thank Jared Strait and Prof. Farhan Rana from Cornell University for the helpful suggestions on measurement.

#### REFERENCES

- [1] P. H. Siegel, "Terahertz technology," *IEEE Trans. Microw. Theory Tech.*, vol. 50, no. 3, pp. 910–928, Mar. 2002.
- [2] M. Tonouchi, "Cutting-edge terahertz technology," *Nature Photonics*, vol. 1, no. 2, pp. 97–105, Feb. 2007.
- [3] E. Seok, D. Shim, C. Mao, R. Han, S. Sankaran, C. Cao, W. Knap, and K. K. O, "Progress and challenges towards terahertz CMOS integrated circuits," *IEEE J. Solid-State Circuits*, vol. 45, no. 8, pp. 1554–1564, Aug. 2010.
- [4] R. Han, Y. Zhang, D. Coquillat, H. Videlier, W. Knap, E. Brown, and K. K. O, "A 280-GHz Schottky diode detector in 130-nm digital CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 11, pp. 2602–2612, Nov. 2011.
- [5] R. Han, Y. Zhang, Y. Kim, D. Kim, H. Shichijo, E. Afshari, and K. K. O, "280GHz and 860GHz image sensors using Schottky-barrier diodes in 0.13µm digital CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2012.
- [6] F. Schuster, H. Videlier, A. Dupret, D. Coquillat, M. Sakowicz, J. Rostaing, M. Tchagaspanian, B. Giffard, and W. Knap, "A broadband THz imager in a low-cost CMOS technology," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2011.
- [7] H. Sherry, J. Grzyb, Y. Zhao, R. Hadi, A. Cathelin, A. Kaiser, and U. Pfeiffer, "A 1kPixel CMOS camera chip for 25fps real-time terahertz imaging applications," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2012.

- [8] R. Hadi, J. Grzyb, B. Heinemann, and U. Pfeiffer, "Terahertz detector arrays in a high-performance SiGe HBT technology," in *IEEE Bipolar/ BiCMOS Circuits and Technology Meeting*, Portland, OR, USA, Oct. 2012.
- [9] T. Morf, B. Klein, M. Despont, U. Drechsler, L. Kull, D. Corcos, D. Elad, N. Kaminski, M. Braendli, C. Menolfi, M. Kossel, P. Francese, T. Toifl, and D. Plettemeier, "Room-temperature THz imaging based on antenna-coupled MOSFET bolometer," in *Int. Conf. Micro Electro Mechanical Systems*, Taipei, Taiwan, Jan. 2013.
- [10] J. Park, S. Kang, S. Thyagarajan, E. Alon, and A. M. Niknejad, "A 260 GHz fully integrated CMOS transceiver for wireless chip-to-chip communication," in *Symp. VLSI Circuits Dig.*, Jun. 2012, pp. 48–49.
- [11] Z. Wang, P. Chiang, P. Nazari, C. Wang, Z. Chen, and P. Heydari, "A 210 GHz fully integrated differential transceiver with fundamental-frequency VCO in 32nm SOI CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2013.
- [12] T. W. Crowe, W. L. Bishop, D. W. Porterfield, J. L. Hesler, and R. M. Weikle, "Opening the terahertz window with integrated diode circuits," *IEEE J. Solid-State Circuits*, vol. 40, no. 10, pp. 2104–2110, Oct. 2005.
- [13] M. Gupta, "Power gain in feedback amplifiers, a classic revisited," *IEEE Trans. Microw. Theory Tech.*, vol. 40, no. 5, pp. 864–879, May 1992.
- [14] V. Radisic, K. Leong, X. Mei, S. Sarkozy, W. Yoshida, and W. R. Deal, "Power amplification at 0.65THz using InP HEMTs," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 3, pp. 724–729, Mar. 2012.
- [15] A. Maestrini, J. S. Ward, J. J. Gill, H. S. Javadi, E. Schlecht, C. Tripon-Canseliet, G. Chattopadhyay, and I. Mehdi, "A 540–640-GHz high-efficiency four-anode frequency tripler," *IEEE Trans. Microw. Theory Tech.*, vol. 53, no. 9, pp. 2835–2843, Sep. 2005.
- [16] N. Gopalsami and A. C. Raptis, "Millimeter-wave radar sensing of airborne chemicals," *IEEE Trans. Microw. Theory Tech.*, vol. 49, no. 4, pp. 646–653, Apr. 2001.
- [17] H. J. Hansen, "Standoff detection using millimeter and submillimeter wave spectroscopy," *Proc. IEEE*, vol. 95, no. 8, pp. 1691–1704, Aug. 2007.
- [18] B. Cetinoneri, Y. Atesal, A. Fung, and G. Rebeiz, "W-band amplifiers with 6-dB noise figure and milliwatt-level 170–200-GHz doublers in 45-nm CMOS," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 3, pp. 692–701, Mar. 2012.
- [19] O. Momeni and E. Afshari, "A 220–275 GHz traveling wave frequency doubler with -6.6 dBm power at 244 GHz in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2011, pp. 286–288.
- [20] O. Momeni and E. Afshari, "A broadband mm-Wave and terahertz traveling-wave frequency multiplier on CMOS," *IEEE J. Solid-State Circuits*, vol. 46, no. 12, pp. 2966–2976, Dec. 2011.
- [21] R. Han and E. Afshari, "A high-power broadband passive terahertz frequency doubler in CMOS," *IEEE Trans. Microw. Theory Tech.*, vol. 61, no. 3, pp. 1150–1160, Mar. 2013.
- [22] O. Momeni and E. Afshari, "High power terahertz and sub-millimeter wave oscillator design: a systematic approach," *IEEE J. Solid-State Circuits*, vol. 46, no. 3, pp. 583–597, Mar. 2011.
- [23] Y. Tousi, O. Momeni, and E. Afshari, "A 283-to-296 GHz VCO with 0.76 mW peak output power in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, Feb. 2012, pp. 258–259.

- [24] Y. Tousi, O. Momeni, and E. Afshari, "A novel CMOS high-power terahertz VCO based on coupled oscillators: Theory and implementation," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 3032–3042, Dec. 2012.
- [25] K. Sengupta and A. Hajimiri, "A 0.28 THz 4x4 power-generation and beam-steering array," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2012.
- [26] K. Sengupta and A. Hajimiri, "A 0.28 THz power-generation and beam-steering array in CMOS based on distributed active radiators," *IEEE J. Solid-State Circuits*, vol. 47, no. 12, pp. 3032–3042, Dec. 2012.
- [27] Y. Zhao, J. Grzyb, and U. R. Pfeiffer, "A 288-GHz lens-integrated balanced triple-push source in a 65-nm CMOS technology," in *European Solid-State Circuits Conf. (ESSCIRC)*, Bordeaux, France, Sep. 2012.
- [28] R. Han and E. Afshari, "A 260 GHz broadband source with 1.1 mW continuous-wave radiated power and EIRP of 15.7 dBm in 65 nm CMOS," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2013.
- [29] B. S. Williams, "Terahertz quantum-cascade lasers," *Nature Photonics*, vol. 1, pp. 517–525, Sep. 2007.
- [30] R. Ives, C. Kory, M. Read, J. Neilson, M. Caplan, N. Chubun, S. Schwartzkopf, and R. Witherspoon, "Development of backward-wave oscillators for terahertz applications," *Proc. SPIE*, vol. 5070, pp. 71–82, Aug. 2003.
- [31] E. Afshari and A. Hajimiri, "Non-linear transmission lines for pulse shaping in silicon," *IEEE J. Solid-State Circuits*, vol. 40, no. 3, pp. 744–752, Mar. 2005.
- [32] W. Lee, M. Adnan, O. Momeni, and E. Afshari, "A nonlinear lattice for high amplitude, picosecond pulse generation in CMOS," *IEEE Trans. Microw. Theory Tech.*, vol. 60, no. 2, pp. 370–380, Feb. 2012.
- [33] R. Spence, *Linear Active Networks*. New York, NY, USA: Wiley-Interscience, 1970.
- [34] D. M. Pozar, *Microwave Engineering*, 3rd ed. New York, NY, USA: Wiley, 2005.
- [35] L. Franca-Neto, R. Bishop, and B. Bloechel, "64 GHz and 100 GHz VCOs in 90 nm CMOS using optimum pumping method," in *IEEE Int. Solid-State Circuits Conf. Dig. Tech. Papers*, San Francisco, CA, USA, Feb. 2004.
- [36] C. M. Ta, E. Skafidas, and R. J. Evans, "A 60-GHz CMOS transmit/receive switch," in *Proc. IEEE Radio-Frequency Integrated Circuits Symp.*, Jun. 2007, pp. 725–728.
- [37] A. Arbabian, S. Callender, S. Kang, B. Afshar, J. Chien, and A. M. Niknejad, "A 90 GHz hybrid switching pulsed-transmitter for medical imaging," *IEEE J. Solid-State Circuits*, vol. 45, no. 12, pp. 2667–2681, Dec. 2010.
- [38] C. A. Brau, *Modern Problems in Classical Electrodynamics*. New York, NY, USA: Oxford Univ. Press, 2004.
- [39] N. G. Alexopoulos, P. B. Katehi, and D. B. Rutledge, "Substrate optimization for integrated circuit antennas," *IEEE Trans. Microw. Theory Tech.*, vol. 31, no. 7, pp. 550–557, Jul. 1983.
- [40] *High Frequency Structure Simulator (HFSS) User Guide*, ANSYS Inc. [Online]. Available: http://www.ansys.com/.
- [41] D. F. Filipovic, S. S. Gearhart, and G. M. Rebeiz, "Double-slot antennas on extended hemispherical and elliptical silicon dielectric lenses," *IEEE Trans. Microw. Theory Tech.*, vol. 41, no. 10, pp. 1738–1749, Oct. 1993.
- [42] *Nominal Horn Specifications*. Virginia Diodes Inc., Charlottesville, VA, USA, 2011.
- [43] C. A. Balanis, *Antenna Theory: Analysis and Design*. New York, NY, USA: Harper & Row, 1982.
- [44] *IEEE Standard Test Procedures for Antennas*, ANSI/IEEE Std 149–1979 (R2008) (Reaffirmed 1990, 2003, 2008), 1979.
- [45] H. T. Friis, "A note on a simple transmission formula," *Proc. IRE*, vol. 34, no. 5, pp. 254–256, May 1946.
- [46] P. Y. Han, M. Tani, M. Usami, S. Kono, R. Kersting, and X.-C. Zhang, "A direct comparison between terahertz time-domain spectroscopy and far-infrared Fourier transform spectroscopy," *J. Appl. Phys.*, vol. 89, no. 4, pp. 2357–2359, Feb. 2001.



**Ruonan Han** was born in Hohhot, China, in 1984. He received the B.Sc. degree in microelectronics from Fudan University, Shanghai, China, in 2007, the M.Sc. degree in electrical engineering from the University of Florida, Gainesville, FL, USA, in 2009, and is currently working toward the Ph.D. degree in electrical engineering at Cornell University, Ithaca, NY, USA. His doctoral research is focused on terahertz signal generation and detection circuits using CMOS and GaN technologies.

In the summer of 2012, he was an intern with Rambus Inc., Sunnyvale, CA, USA, where he was involved with fast-locking clock and data recovery circuits. He has authored or coauthored over 20 journal and conference publications.

Mr. Han is a student member of the IEEE Solid-State Circuits Society. He is a reviewer for the IEEE TRANSACTIONS ON MICROWAVE THEORY AND TECHNIQUES and the IEEE International Symposium on Circuits and Systems (ISCAS). He was the recipient of the Best Student Paper Award (2nd place) of the 2012 Radio-Frequency Integrated Circuits (RFIC) Symposium and the Helic Student Scholarship of the 2010 Custom Integrated Circuits Conference (CICC). He was also the recipient of the Irwin and Joan Jacobs Fellowship (2011), the John M. Olin Fellowship (2010), and the IEEE Solid-State Circuits Society Pre-Doctoral Achievement Award (2012–2013).



**Ehsan Afshari** was born in 1979. He received the B.Sc. degree in electronics engineering from the Sharif University of Technology, Tehran, Iran, in 2001, and the M.S. and Ph.D. degrees in electrical engineering from the California Institute of Technology, Pasadena, CA, USA, in 2003 and 2006, respectively.

In August 2006, he joined the faculty of the Department of Electrical and Computer Engineering, Cornell University, Ithaca, NY, USA. His research interests are millimeter-wave and terahertz electronics and low-noise integrated circuits for applications in communication systems, sensing, and biomedical devices.

Prof. Afshari is the chair of the IEEE Ithaca Section and chair of Cornell Highly Integrated Physical Systems (CHIPS). He is a member of the International Technical Committee, IEEE Solid-State Circuit Conference (ISSCC), the Analog Signal Processing Technical Committee, IEEE Circuits and Systems Society, the Technical Program Committee, IEEE Custom Integrated Circuits Conference (CICC), and the Technical Program Committee, IEEE International Conference on Ultra-Wideband (ICUWB). He was the recipient of the National Science Foundation CAREER Award (2010), Cornell College of Engineering Michael Tien Excellence in Teaching Award (2010), Defense Advanced Research Projects Agency (DARPA) Young Faculty Award (2008), and Iran's Best Engineering Student Award presented by the President of Iran (2001). He was also the recipient of the Best Paper Award of the Custom Integrated Circuits Conference (CICC) (2003), First Place of the Stanford-Berkeley-Caltech Inventors Challenge (2005), the Best Undergraduate Paper Award of the Iranian Conference on Electrical Engineering (1999), the Silver Medal of the Physics Olympiad (1997), and the Award of Excellence in Engineering Education from the Association of Professors and Scholars of Iranian Heritage (APSIH) (2004).