
Showing papers on "CMOS" published in 2020


Journal ArticleDOI
TL;DR: In this paper, the authors provide a tutorial overview of recent efforts to develop computing systems based on spin waves instead of charges and voltages, and discuss the current status and the challenges of combining spin-wave gates into circuits and ultimately computing systems, considering essential aspects such as gate interconnection, logic level restoration, input–output consistency, and fan-out achievement.
Abstract: This paper provides a tutorial overview over recent vigorous efforts to develop computing systems based on spin waves instead of charges and voltages. Spin-wave computing can be considered a subfield of spintronics, which uses magnetic excitations for computation and memory applications. The Tutorial combines backgrounds in spin-wave and device physics as well as circuit engineering to create synergies between the physics and electrical engineering communities to advance the field toward practical spin-wave circuits. After an introduction to magnetic interactions and spin-wave physics, the basic aspects of spin-wave computing and individual spin-wave devices are reviewed. The focus is on spin-wave majority gates as they are the most prominently pursued device concept. Subsequently, we discuss the current status and the challenges to combine spin-wave gates and obtain circuits and ultimately computing systems, considering essential aspects such as gate interconnection, logic level restoration, input–output consistency, and fan-out achievement. We argue that spin-wave circuits need to be embedded in conventional complementary metal–oxide–semiconductor (CMOS) circuits to obtain complete functional hybrid computing systems. The state of the art of benchmarking such hybrid spin-wave–CMOS systems is reviewed, and the current challenges to realize such systems are discussed. The benchmark indicates that hybrid spin-wave–CMOS systems promise ultralow-power operation and may ultimately outperform conventional CMOS circuits in terms of the power-delay-area product. Current challenges to achieve this goal include low-power signal restoration in spin-wave circuits as well as efficient spin-wave transducers.

169 citations
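
The majority gate highlighted in the abstract above can be illustrated at the purely logical level. The following is a minimal Python sketch, not a model of spin-wave physics; the function name `majority` is hypothetical and does not come from the paper. It checks the well-known property that a three-input majority gate reduces to AND or OR when one input is held constant, which is one reason majority gates are attractive as a basic logic primitive.

```python
def majority(a: int, b: int, c: int) -> int:
    """Return 1 if at least two of the three binary inputs are 1, else 0."""
    return 1 if (a + b + c) >= 2 else 0


# Holding one input at 0 yields AND; holding it at 1 yields OR.
for a in (0, 1):
    for b in (0, 1):
        assert majority(a, b, 0) == (a & b)
        assert majority(a, b, 1) == (a | b)
```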


Proceedings ArticleDOI
01 Feb 2020
TL;DR: This work implements a fully-integrated 784-100-10 MLP model on an integrated CIM chip with 158.8kb analog ReRAMs and proposes a low-power interface design with resolution-adjustable LPAR-ADC to realize flexible tradeoff between system accuracy and power consumption.
Abstract: Non-volatile memory (NVM) based computing-in-memory (CIM) shows significant advantages in handling deep learning tasks for artificial intelligence (AI) applications. To overcome the decreasing cost effectiveness of transistor scaling and the intrinsic inefficiency of data-shuttling in the von-Neumann architecture, CIM is proposed to realize high-speed and low-power system with parallel multiplication accumulation (MAC) computing [1] [2]. However, current demonstrations are mainly based on single macro and present limited computing parallelism. Realizing a fully-integrated CIM chip with a complete neural network model is still missing. The major challenges lie in: (1) The IR drop and transient errors when carrying out MAC operations in non-volatile memory arrays decrease the computing accuracy and further limit the parallelism; (2) The inefficiency of the interface blocks between different arrays due to the power overhead of the A/D and D/A converters (shown in Fig. 33.2.1). To address these challenges, this work proposes: (1) A sign-weighted 2T2R (SW-2T2R) array to reduce IR drop by decreasing the accumulative SL current (ISL), and eventually boost the computing parallelism; (2) a low-power interface design with resolution-adjustable LPAR-ADC to realize flexible tradeoff between system accuracy and power consumption. In this manner, this work implements a fully-integrated 784-100-10 MLP model on an integrated CIM chip with 158.8kb analog ReRAMs. This chip realizes high recognition accuracy (94.4%) on MNIST database, high inference speed (77 µs/image), and 78.4 TOPS/W peak energy efficiency. The CMOS circuits are fabricated in a 130nm process.

154 citations
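
The relation between the 784-100-10 MLP and the 158.8kb ReRAM capacity quoted in the abstract above can be checked with simple arithmetic. The sketch below rests on two assumptions that the abstract does not state: bias terms are not stored, and each signed weight occupies two ReRAM cells (as the "2T2R" name suggests). Under those assumptions the cell count comes out at 158,800, which is consistent with the quoted figure if "kb" means 1000 bits.

```python
# Layer widths of the MLP named in the abstract: 784 inputs, 100 hidden, 10 outputs.
layer_sizes = [784, 100, 10]

# Weights of a fully connected network: product of adjacent layer widths, summed.
# Bias terms are ignored here (assumption).
num_weights = sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

# Assumption: each signed weight occupies two ReRAM cells, as "2T2R" suggests.
CELLS_PER_WEIGHT = 2
num_cells = num_weights * CELLS_PER_WEIGHT

print(num_weights)  # 79400
print(num_cells)    # 158800
```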


Journal ArticleDOI
TL;DR: XNOR-SRAM is a mixed-signal in-memory computing (IMC) SRAM macro that computes ternary-XNOR-and-accumulate (XAC) operations in binary/ternary deep neural networks (DNNs) without row-by-row data access and represents among the best tradeoff in energy efficiency and DNN accuracy.
Abstract: We present XNOR-SRAM, a mixed-signal in-memory computing (IMC) SRAM macro that computes ternary-XNOR-and-accumulate (XAC) operations in binary/ternary deep neural networks (DNNs) without row-by-row data access. The XNOR-SRAM bitcell embeds circuits for ternary XNOR operations, which are accumulated on the read bitline (RBL) by simultaneously turning on all 256 rows, essentially forming a resistive voltage divider. The analog RBL voltage is digitized with a column-multiplexed 11-level flash analog-to-digital converter (ADC) at the XNOR-SRAM periphery. XNOR-SRAM is prototyped in a 65-nm CMOS and achieves the energy efficiency of 403 TOPS/W for ternary-XAC operations with 88.8% test accuracy for the CIFAR-10 data set at 0.6-V supply. This marks 33× better energy efficiency and 300× better energy–delay product than conventional digital hardware and also represents among the best tradeoff in energy efficiency and DNN accuracy.

130 citations
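
A behavioral view of the XNOR-and-accumulate (XAC) operation described in the abstract above may help. The sketch below is an idealized digital model, not the analog circuit: it assumes ternary activations in {-1, 0, +1}, binary weights in {-1, +1}, and a uniform 11-level quantizer over the full range of a 256-row column. The encoding and the uniform level spacing are assumptions, and the names `xac` and `quantize` are hypothetical.

```python
def xac(activations, weights):
    """Accumulate elementwise products, the arithmetic equivalent of XNOR-and-accumulate."""
    return sum(x * w for x, w in zip(activations, weights))


def quantize(value, n_rows=256, levels=11):
    """Map an accumulated value in [-n_rows, n_rows] to an ADC code in 0..levels-1.

    Uniform level spacing is an assumption made for illustration.
    """
    step = 2 * n_rows / (levels - 1)
    code = round((value + n_rows) / step)
    return max(0, min(levels - 1, code))


# One 256-row column: all weights +1, half of the activations +1 and half 0.
acts = [1] * 128 + [0] * 128
wts = [1] * 256
column_sum = xac(acts, wts)   # 128
code = quantize(column_sum)   # ADC code for the accumulated value
```

The point of the sketch is that a single 11-level read replaces 256 row-by-row accesses, at the cost of quantizing the column sum.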


Proceedings ArticleDOI
01 Feb 2020
TL;DR: Compute-in-memory (CIM) parallelizes multiply-and-average (MAV) computations and reduces off-chip weight access to reduce energy consumption and latency, but the single-bit weight precision of prior work cannot meet the requirement for high-precision operations and scalability for large neural networks.
Abstract: Compute-in-memory (CIM) parallelizes multiply-and-average (MAV) computations and reduces off-chip weight access to reduce energy consumption and latency, specifically for AI edge devices. Prior CIM approaches demonstrated tradeoffs for area, noise margin, process variation and weight precision. 6T SRAM [1]–[3] provides the smallest cell area for CIM, but cell stability limits the number of activated cells, resulting in low parallelization. 10T and twin-8T [4]–[5] isolate the read/write paths for noise margin improvement, however both require special design of the bit cell using logic layout rules, resulting in over a 2x area overhead compared to foundry yield-optimized 6T SRAMs. Furthermore, single-bit precision of weights, in prior work [1]–[4], cannot meet the requirement for high-precision operations and scalability for large neural networks.

115 citations


Journal ArticleDOI
TL;DR: A neutralized bi-directional technique is introduced in this work to reduce the chip area significantly, so that compact and low-cost 5G millimeter-wave MIMO systems could be realized.
Abstract: This article presents a low-cost and area-efficient 28-GHz CMOS phased-array beamformer chip for 5G millimeter-wave dual-polarized multiple-in-multiple-out (MIMO) (DP-MIMO) systems. A neutralized bi-directional technique is introduced in this work to reduce the chip area significantly. With the proposed technique, exactly the same circuit chain is shared between the transmitter and receiver. To further minimize the area, an active bi-directional vector-summing phase shifter is also introduced. Area-efficient and high-resolution active phase shifting could be realized in both TX and RX modes. In measurement, the achieved saturated output power for the TX-mode beamformer is 15.1 dBm. The RX-mode noise figure is 4.2 dB at 28 GHz. To evaluate the over-the-air performance, 16 H+16 V sub-array modules are implemented in this work. Each of the sub-array modules consists of four 4 H+4 V chips. Two sub-array modules in this work are capable of scanning the beam from −50° to +50°. A saturated EIRP of 45.6 dBm is realized by 32 TX-mode beamformers. Within 1-m distance, a maximum SC-mode data rate of 15 Gb/s and the 5G new radio downlink packets transmission in 256-QAM could be supported by the module. A 2×2 DP-MIMO communication is also demonstrated with two 5G new radio 64-QAM uplink streams. Thanks to the proposed area-efficient bi-directional technique, the required core area for a single element-beamformer is only 0.58 mm². Compact and low-cost 5G millimeter-wave MIMO systems could be realized.

113 citations


Journal ArticleDOI
TL;DR: This article proposes a general-purpose hybrid in-/near-memory compute SRAM (CRAM) that combines an 8T transposable bit cell with vector-based, bit-serial in-memory arithmetic to accommodate a wide range of bit-widths, as well as a complete set of operation types, including integer and floating-point addition, multiplication, and division.
Abstract: This article proposes a general-purpose hybrid in-/near-memory compute SRAM (CRAM) that combines an 8T transposable bit cell with vector-based, bit-serial in-memory arithmetic to accommodate a wide range of bit-widths, from single to 32 or 64 bits, as well as a complete set of operation types, including integer and floating-point addition, multiplication, and division. This approach provides the flexibility and programmability necessary for evolving software algorithms ranging from neural networks to graph and signal processing. The proposed design was implemented in a small Internet of Things (IoT) processor in the 28-nm CMOS consisting of a Cortex-M0 CPU and 8 CRAM banks of 16 kB each (128 kB total). The system achieves 475-MHz operation at 1.1 V and, with all CRAMs active, produces 30 GOPS or 1.4 GFLOPS on 32-bit operands. It achieves an energy efficiency of 0.56 TOPS/W for 8-bit multiplication and 5.27 TOPS/W for 8-bit addition at 0.6 V and 114 MHz.

103 citations


Journal ArticleDOI
Akira Goda
TL;DR: In this article, a gate-all-around cell architecture realized excellent reliability and program/read performance by having a large physical cell size and good shielding of cell-to-cell interference.
Abstract: Since the introduction of a 3-D NAND product in 2014, the areal density has increased by more than 8 times (from 0.96 to 7.80 Gb/mm2) in the recent five years. The increase of word-line (WL) stacking from 24 to 128 layers, the scaling of bits per cell from 2 to 3 bits/cell and 4 bits/cell, and a CMOS under array technology enabled this successful 3-D NAND density scaling. A gate-all-around cell architecture realized excellent reliability and program/read performance by having a large physical cell size and good shielding of cell-to-cell interference. In the newly introduced 3-D NAND devices, several new cell phenomena have been reported. Unique temperature dependence and threshold-voltage instability are observed due to the polysilicon channel. Down-coupling of the floating body and its impact on program disturb and hot electron injection were reported. The cell-to-cell interference is enhanced by the effective gate length modulation due to the no-lightly doped drain (LDD) string architecture. For the future 3-D NAND scaling, WL stacking will continue to be a key driver. In addition, XYZ dimension shrink of the cell will become another key critical scaling direction in order to relieve the cost and device challenges introduced by the WL stacking. Various device and structure solutions for the XYZ cell scaling have been suggested. The good engineering in number of electrons and cell-to-cell interference in the XYZ scaled cell will be critical in order to maintain the excellent reliability and performance of 3-D NAND.

99 citations


Journal ArticleDOI
04 Mar 2020
TL;DR: Because parameter variations and faults at advanced nanoscales become difficult to control and prevent, ensuring the complete accuracy of signals, logic values, devices, and interconnects will significantly increase manufacturing and verification costs.
Abstract: Computing systems are conventionally designed to operate as accurately as possible. However, this trend faces severe technology challenges, such as power consumption, circuit reliability, and high performance. For nearly half a century, performance and power consumption of computing systems have been consistently improved by relying mostly on technology scaling. As per Dennard’s scaling, the size of a transistor has been considerably shrunk and the supply voltage has been reduced over the years, such that circuits operate at higher frequencies but nearly at the same power dissipation level. However, as Dennard’s scaling tends toward an end, it is difficult to further improve performance under the same power constraints. Power consumption has been a major concern, and it is now an industry-wide problem of critical importance. In addition to power, reliability deteriorates when the feature size of complementary metal–oxide–semiconductor (CMOS) technology is reduced below 7 nm, because parameter variations and faults at advanced nanoscales become difficult to control and prevent. Thus, to ensure the complete accuracy of signals, logic values, devices, and interconnects, manufacturing and verification costs will increase significantly.

98 citations


Journal ArticleDOI
TL;DR: New architectures, simulation methods, and process technology for nano-scale transistors on the approach to the end of ITRS technology are presented and new metrology techniques that may appear in the near future are discussed.
Abstract: The international technology roadmap of semiconductors (ITRS) is approaching the historical end point and we observe that the semiconductor industry is driving complementary metal oxide semiconductor (CMOS) further towards unknown zones. Today's transistors with 3D structure and integrated advanced strain engineering differ radically from the original planar 2D ones due to the scaling down of the gate and source/drain regions according to Moore's law. This article presents a review of new architectures, simulation methods, and process technology for nano-scale transistors on the approach to the end of ITRS technology. The discussions cover innovative methods, challenges and difficulties in device processing, as well as new metrology techniques that may appear in the near future.

89 citations


Journal ArticleDOI
01 Jun 2020
TL;DR: A monolithically integrated electro-optical transmitter that can achieve symbol rates beyond 100 GBd is reported, and addresses key challenges in monolithic integration through co-design of the electronic and plasmonic layers, including thermal design, packaging and a nonlinear organic electro-optic material.
Abstract: To address the challenge of increasing data rates, next-generation optical communication networks will require the co-integration of electronics and photonics. Heterogeneous integration of these technologies has shown promise, but will eventually become bandwidth-limited. Faster monolithic approaches will therefore be needed, but monolithic approaches using complementary metal–oxide–semiconductor (CMOS) electronics and silicon photonics are typically limited by their underlying electronic or photonic technologies. Here, we report a monolithically integrated electro-optical transmitter that can achieve symbol rates beyond 100 GBd. Our approach combines advanced bipolar CMOS with silicon plasmonics, and addresses key challenges in monolithic integration through co-design of the electronic and plasmonic layers, including thermal design, packaging and a nonlinear organic electro-optic material. To illustrate the potential of our technology, we develop two modulator concepts—an ultra-compact plasmonic modulator and a silicon-plasmonic modulator with photonic routing—both directly processed onto the bipolar CMOS electronics. The monolithic integration of electronic and plasmonic technologies can be used to create electro-optic transmitters capable of symbol rates beyond 100 GBd.

78 citations


Journal ArticleDOI
TL;DR: Based on the simulation results, it can be stated that the proposed hybrid FA circuit is an attractive alternative in the data path design of modern high-speed Central Processing Units.
Abstract: A novel design of a hybrid Full Adder (FA) using Pass Transistors (PTs), Transmission Gates (TGs) and Conventional Complementary Metal Oxide Semiconductor (CCMOS) logic is presented. Performance analysis of the circuit has been conducted using Cadence toolset. For comparative analysis, the performance parameters have been compared with twenty existing FA circuits. The proposed FA has also been extended up to a word length of 64 bits in order to test its scalability. Only the proposed FA and five of the existing designs have the ability to operate without utilizing buffer in intermediate stages while extended to 64 bits. According to simulation results, the proposed design demonstrates notable performance in power consumption and delay which accounted for low power delay product. Based on the simulation results, it can be stated that the proposed hybrid FA circuit is an attractive alternative in the data path design of modern high-speed Central Processing Units.

Journal ArticleDOI
TL;DR: This work presents a resistive RAM-based in-memory computing (IMC) design, which is fabricated in 90-nm CMOS with monolithic integration of RRAM devices and demonstrates improvements in throughput and energy–delay product (EDP) compared with the state-of-the-art literature.
Abstract: Deep neural network (DNN) hardware designs have been bottlenecked by conventional memories, such as SRAM due to density, leakage, and parallel computing challenges. Resistive devices can address the density and volatility issues but have been limited by peripheral circuit integration. In this work, we present a resistive RAM (RRAM)-based in-memory computing (IMC) design, which is fabricated in 90-nm CMOS with monolithic integration of RRAM devices. We integrated a 128×64 RRAM array with CMOS peripheral circuits, including row/column decoders and flash analog-to-digital converters (ADCs), which collectively become a core component for scalable RRAM-based IMC for large DNNs. To maximize IMC parallelism, we assert all 128 wordlines of the RRAM array simultaneously, perform analog computing along the bitlines, and digitize the bitline voltages using ADCs. The resistance distribution of low-resistance states is tightened by an iterative write-verify scheme. Prototype chip measurements demonstrate high binary DNN accuracy of 98.5% for MNIST and 83.5% for CIFAR-10 data sets, with 24 TOPS/W and 158 GOPS. This represents 22.3× and 10.1× improvements in throughput and energy–delay product (EDP), respectively, compared with the state-of-the-art literature, which can enable intelligent functionalities for area-/energy-constrained edge computing devices.

Journal ArticleDOI
TL;DR: This work presents a cryo-CMOS microwave signal generator for frequency-multiplexed qubit control; by exploiting the on-chip 4096-instruction memory, the capability to translate quantum algorithms to microwave signals is demonstrated by coherently controlling a spin qubit at both 14 and 18 GHz.
Abstract: Building a large-scale quantum computer requires the co-optimization of both the quantum bits (qubits) and their control electronics. By operating the CMOS control circuits at cryogenic temperatures (cryo-CMOS), and hence in close proximity to the cryogenic solid-state qubits, a compact quantum-computing system can be achieved, thus promising scalability to the large number of qubits required in a practical application. This work presents a cryo-CMOS microwave signal generator for frequency-multiplexed control of 4×32 qubits (32 qubits per RF output). A digitally intensive architecture offering full programmability of phase, amplitude, and frequency of the output microwave pulses and a wideband RF front end operating from 2 to 20 GHz allow targeting both spin qubits and transmons. The controller comprises a qubit-phase-tracking direct digital synthesis (DDS) back end for coherent qubit control and a single-sideband (SSB) RF front end optimized for minimum leakage between the qubit channels. Fabricated in Intel 22-nm FinFET technology, it achieves a 48-dB SNR and 45-dB spurious-free dynamic range (SFDR) in a 1-GHz data bandwidth when operating at 3 K, thus enabling high-fidelity qubit control. By exploiting the on-chip 4096-instruction memory, the capability to translate quantum algorithms to microwave signals has been demonstrated by coherently controlling a spin qubit at both 14 and 18 GHz.

Journal ArticleDOI
TL;DR: In this paper, the authors demonstrate voltage-controlled, symmetric and analog potentiation and depression of a ferroelectric Hf0.57Zr0.43O2 (HZO) field effect transistor (FeFET) with good linearity.
Abstract: Neuromorphic computing architectures enable the dense colocation of memory and processing elements within a single circuit. This colocation removes the communication bottleneck of transferring data between separate memory and computing units as in standard von Neumann architectures for data-critical applications including machine learning. The essential building blocks of neuromorphic systems are nonvolatile synaptic elements such as memristors. Key memristor properties include a suitable nonvolatile resistance range, continuous linear resistance modulation, and symmetric switching. In this work, we demonstrate voltage-controlled, symmetric and analog potentiation and depression of a ferroelectric Hf0.57Zr0.43O2 (HZO) field-effect transistor (FeFET) with good linearity. Our FeFET operates with low writing energy (fJ) and fast programming time (40 ns). Retention measurements have been performed over 4 bit depth with low noise (1%) in the tungsten oxide (WOx) readout channel. By adjusting the channel thickness from 15 to 8 nm, the on/off ratio of the FeFET can be engineered from 1 to 200% with an on-resistance ideally >100 kΩ, depending on the channel geometry. The device concept uses earth-abundant materials and is compatible with a back end of line (BEOL) integration into complementary metal-oxide-semiconductor (CMOS) processes. It therefore has great potential for the fabrication of high-density, large-scale integrated arrays of artificial analog synapses.

Journal ArticleDOI
TL;DR: This review examines potential CMOS monolithic and hybrid approaches in a variety of wide bandgap materials, whose devices serve power and RF electronics applications by switching large currents and voltages rapidly with low losses.
Abstract: Power and RF electronics applications have spurred massive investment into a range of wide and ultrawide bandgap semiconductor devices which can switch large currents and voltages rapidly with low losses. However, the end systems using these devices are often limited by the parasitics of integrating and driving these chips from the silicon complementary metal–oxide-semiconductor-based design (CMOS) circuitry necessary for complex control logic. For that reason, implementation of CMOS logic directly in the wide bandgap platform has become a way for each maturing material to compete. This review examines potential CMOS monolithic and hybrid approaches in a variety of wide bandgap materials.

Journal ArticleDOI
TL;DR: In this article, the authors present E- and W-band low-noise amplifiers (LNA) in GlobalFoundries 22-nm CMOS fully depleted silicon-on-insulator (FD-SOI) for narrowband and wideband applications.
Abstract: This article presents E- and W-band low-noise amplifiers (LNA) in GlobalFoundries 22-nm CMOS fully depleted silicon-on-insulator (FD-SOI). Both amplifiers employ a three-stage cascode design with gain-boosting transformer loads. Design procedures are presented for E- and W-band LNAs for narrowband and wideband applications. The E-band LNA focuses on a high-gain, low-power implementation, and results in a gain and noise figure (NF) of 20 and 4.6 dB at 77 GHz with a 3-dB bandwidth of 12 GHz, and an input P1dB of −27.4 dBm, for a power consumption of 9 mW. The W-band LNA focuses on wideband applications and results in a peak gain of 18.2 dB with a 3-dB bandwidth of 31 GHz, for a power consumption of 16 mW. The LNAs have a high figure-of-merit (FoM) and show very low-power operation in the 70–100 GHz range. Application areas are in phased arrays for 5G with hundreds or thousands of elements, automotive radars at 77 GHz, and sensors at 94 GHz.

Proceedings ArticleDOI
08 Mar 2020
TL;DR: GLOBALFOUNDRIES' monolithic 45nm CMOS-Silicon Photonics 300mm high-volume manufacturing platform based on 45nm RF technology node, and optimized for high performance and low power short-reach optical interconnects for on-chip and chip-to-chip applications will be discussed.
Abstract: GLOBALFOUNDRIES' monolithic 45nm CMOS-Silicon Photonics 300mm high-volume manufacturing platform based on 45nm RF technology node, and optimized for high performance and low power short-reach optical interconnects for on-chip and chip-to-chip applications will be discussed.

Journal ArticleDOI
TL;DR: This paper proposes a novel latch design, namely QNUTL, that can completely tolerate MNUs such as double-node upsets, triple-node upsets (TNUs), and even quadruple-node upsets (QNUs), and replaces the DICEs in the QNUTL latch by clock-gating (CG) based ones to significantly reduce power consumption.
Abstract: With the rapid advancement of CMOS technologies, nano-scale CMOS latches have become increasingly sensitive to multiple-node upset (MNU) errors caused by radiations. First, this paper proposes a novel latch design, namely QNUTL that can completely tolerate MNUs such as double-node upsets, triple-node upsets (TNUs), and even quadruple-node upsets (QNUs). The latch is mainly constructed from three dual-interlocked-storage-cells (DICEs) and a triple-level soft-error interceptive module (SIM) that consists of six 2-input C-elements. Due to the single-node-upset self-recoverability of DICEs and the soft-error interception of the SIM, the latch can completely tolerate any QNU. Next, by replacing the DICEs in the QNUTL latch by clock-gating (CG) based ones, a QNUTL-CG latch is proposed to significantly reduce power consumption. Simulation results demonstrate the MNU-tolerance of the proposed latches. Moreover, owing to the use of a high-speed transmission path, clock-gating, and a few transistors, the proposed QNUTL-CG latch has low overhead in terms of area, D-Q delay, CLK-Q delay, and setup time, compared with the state-of-the-art TNU-tolerant latch (TNUTL) which is not QNU-tolerant.

Journal ArticleDOI
TL;DR: The design is complemented by a theoretical investigation of 1/f noise upconversion caused by short-channel effects in the cross-coupled transistors, obtaining the first instance of a closed-form phase noise expression in the 1/f³ region.
Abstract: Class-C operation is leveraged to implement a K-band CMOS voltage-controlled oscillator (VCO) where the upconversion of 1/f current noise from the cross-coupled transistors in the oscillator core is robustly contained at a very low level. Implemented in a bulk 28-nm CMOS technology, the 12%-tuning-range VCO shows a phase noise as low as −112 dBc/Hz at 1-MHz offset (−86 dBc/Hz at 100 kHz offset) from a 19.5 GHz carrier while consuming 20.7 mW, achieving a figure of merit (FoM) of −185 dBc/Hz. The design is complemented by a theoretical investigation of 1/f noise upconversion caused by short-channel effects in the cross-coupled transistors, obtaining the first instance of a closed-form phase noise expression in the 1/f³ region.
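
The figure of merit quoted in the abstract above can be reproduced from the other reported numbers. The sketch below assumes the commonly used oscillator FoM definition, FoM = PN − 20·log10(f0/Δf) + 10·log10(P / 1 mW); the abstract does not say which definition the authors use, so this is an assumption, and the function name `vco_fom` is hypothetical.

```python
import math


def vco_fom(phase_noise_dbc_hz, carrier_hz, offset_hz, power_mw):
    """Common oscillator figure of merit in dBc/Hz (assumed definition)."""
    return (
        phase_noise_dbc_hz
        - 20 * math.log10(carrier_hz / offset_hz)
        + 10 * math.log10(power_mw)
    )


# Values taken from the abstract: -112 dBc/Hz at 1-MHz offset, 19.5-GHz carrier, 20.7 mW.
fom = vco_fom(-112.0, 19.5e9, 1e6, 20.7)
print(round(fom, 1))  # about -184.6, which rounds to the quoted -185 dBc/Hz
```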

Journal ArticleDOI
TL;DR: In this paper, the authors proposed a wideband common-gate (CG) common-source (CS) low-noise amplifier with a dual complementary pMOS-nMOS configuration to provide a current-reuse output.
Abstract: This article proposes a novel wideband common-gate (CG) common-source (CS) low-noise amplifier (LNA) with a dual complementary pMOS–nMOS configuration to provide a current-reuse output. Triple-path noise-cancellation is effectively revealed to eliminate the thermal noise of the two CG transistors. Simultaneously, partial cancellation of intrinsic third-order distortion of output-stage transistors improves the input third-order intercept point (IIP3). In addition, we embed a resistive feedback in one of the auxiliary CS amplifiers to balance the multiple tradeoffs between noise figure (NF), input matching (S11), and forward gain (S21). Fabricated in 65-nm CMOS, the proposed wideband LNA exhibits an IIP3 of 2.2–6.8 dBm and an NF of 3.3–5.3 dB across a 19-GHz BW while consuming 20.3 mW at 1.6 V. S11 is π-type input-matching network. The LNA exhibits a peak S21 of 12.8 dB and occupies a very compact die area of 0.096 mm².

Journal ArticleDOI
TL;DR: A fully integrated 76–81-GHz frequency-modulated, continuous-wave (FMCW) radar transceiver (TRX) in a 65-nm CMOS is presented and real-time experimental results show that the distance and the angular resolution of the MIMO radar achieved are 5 cm and 9°.
Abstract: A fully integrated 76–81-GHz frequency-modulated, continuous-wave (FMCW) radar transceiver (TRX) in a 65-nm CMOS is presented. Two transmitters (TXs) and three receivers (RXs) are integrated for multiple-input multiple-output (MIMO) processing. A 38.5-GHz mixed-mode phase-locked loop (PLL) with reconfigurable loop bandwidth and a frequency doubling scheme are employed to generate the reconfigurable FMCW chirp waveforms. The coarse-to-fine-segmented current DAC is utilized to support sawtooth FMCW chirps with fast frequency ramping-down capability, and the delay lock loop (DLL)-based delay time calibration is used to improve the linearity of the embedded 2-D Vernier time-to-digital converter (TDC). Passive voltage-mode down-conversion is utilized to improve the RX linearity against TX leakage and short-range interference. A bottom-switching Gilbert-type modulator in the TX is proposed to realize the bi-phase modulation, and the magnetically coupled resonator technique is used to effectively expand the link bandwidth. The measurement results show that the FMCW TRX could generate reconfigurable chirps with the bandwidth from 250 MHz to 4 GHz and the period from 30 µs to 10 ms. The root-mean-square (rms) frequency error is 110 kHz for a sawtooth chirp with 4-GHz bandwidth and 300-µs period. The TX maximum output power is 13.4 dBm and is adjustable within 3 dB by reconfiguring its low dropout regulator (LDO) voltage. The RX achieves a 15.3-dB noise figure at 600-kHz IF and a −8.5-dBm RF input-referred P1dB. The overall power consumption is 921 mW, with two TXs and three RXs powered ON. Based on the proposed TRX chip, prototype hardware and a data process algorithm are developed. Real-time experimental results show that the distance and the angular resolution of the MIMO radar achieved are 5 cm and 9°, respectively.
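
The chirp bandwidths reported in the abstract above can be related to range resolution through the textbook FMCW relation ΔR = c / (2B). The sketch below applies that relation to the 250-MHz and 4-GHz bandwidth limits and computes the slope of the 4-GHz, 300-µs chirp. The function names are hypothetical, and the abstract does not state which chirp setting produced the measured 5-cm distance resolution, so the ideal figures are only a reference point.

```python
C_M_PER_S = 299_792_458.0  # speed of light in vacuum


def range_resolution_m(bandwidth_hz):
    """Ideal FMCW range resolution, c / (2B), in meters."""
    return C_M_PER_S / (2.0 * bandwidth_hz)


def chirp_slope_hz_per_s(bandwidth_hz, period_s):
    """Frequency ramp rate of a linear sawtooth chirp."""
    return bandwidth_hz / period_s


res_wide = range_resolution_m(4e9)       # about 0.0375 m (3.75 cm)
res_narrow = range_resolution_m(250e6)   # about 0.60 m
slope = chirp_slope_hz_per_s(4e9, 300e-6)  # about 1.33e13 Hz/s
```

The ideal 3.75 cm for a 4-GHz chirp is finer than the measured 5 cm, as expected for a bound that ignores windowing and system non-idealities.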

Journal ArticleDOI
TL;DR: In this article, several gate and channel engineered MOSFET structures are analyzed and compared for the sub-45-nm technology node in terms of subthreshold parameters (IOFF, subthreshold slope, and DIBL) and analog/RF performance.
Abstract: CMOS technology is one of the most frequently used technologies in the semiconductor industry as it can be successfully integrated with ICs. Every two years the number of MOS transistors doubles because the size of the MOSFET is reduced. Reducing the size of the MOSFET reduces the channel length, which causes short channel effects and increases the leakage current. To reduce the short channel effects new designs and technologies are implemented. Double gate MOSFET design has shown improvement in performance as amplifiers over a single MOSFET. Silicon-based MOSFET design can be used in a harsh environment. It has been used in various applications such as in detecting biomolecules. The increase in number of gates increases the current drive capability of transistors. GAA MOSFET is an example of a quadruple gate around the four sides of channel that increases gate control over the channel region. It also increases effective channel width that improves drain current and reduces leakage current keeping short channel effects under limit. Junctionless MOSFET operates faster and uses less power with increase in ON-state current leading to a good value of ION/IOFF ratio. In this paper, several gate and channel engineered MOSFET structures are analyzed and compared for sub-45-nm technology node. A comparison among different MOSFET structures has been made for subthreshold performance parameters in terms of IOFF, subthreshold slope and DIBL values. The analog/RF performance is analyzed for transconductance, effective transistor capacitances, stability factor and critical frequency. The paper also covers different applications of advanced MOSFET structures in analog/digital or IoT/biomedical applications.

Journal ArticleDOI
TL;DR: Highly uniform RS and a high on/off ratio were demonstrated in graphene-oxide-based RRAM by embedding gold nanoparticles into the device, which allowed reliable multilevel storage and may offer a route to develop reliable digital memristors for ANNs.
Abstract: Traditional metal-oxide semiconductor devices are inadequate for use in artificial neural networks (ANNs) owing to their high power consumption, complex structures, and difficult fabrication techniques. Resistive random access memory (RRAM) is a promising candidate for ANNs owing to its simple structure, low power consumption, and excellent compatibility with CMOS. Moreover, it can mimic synaptic functions because of its multilevel resistive switching (RS) behavior. Herein, we demonstrate highly uniform RS and a high on/off ratio of RRAM based on graphene oxide by embedding gold nanoparticles into the device. This allowed reliable multilevel storage. Further, multilevel RRAM based on spike-timing-dependent-plasticity learning rules was used for image pattern recognition. These findings may offer a route to develop reliable digital memristors for ANNs.

Journal ArticleDOI
TL;DR: In this paper, a universal compact FeFET-based CAM design, FeCAM, with search and storage functionality enabled in digital and analog domains simultaneously was proposed, which can store and search inputs in either digital or analog domain.
Abstract: Ferroelectric field effect transistors (FeFETs) are being actively investigated with the potential for in-memory computing (IMC) over other nonvolatile memories (NVMs). Content addressable memories (CAMs) are a form of IMC that performs parallel searches for matched entries over a memory array for a given input query. CAMs are widely used for data-centric applications that involve pattern matching and search functionality. To accommodate the ever expanding data, it is attractive to resort to analog CAM for memory density improvement. However, the digital CAM design nowadays based on standard CMOS or emerging NVMs (e.g., resistive storage devices) is already challenging due to area, power, and cost penalties. Thus, it can be extremely expensive to achieve analog CAM with those technologies due to added cell components. As such, we propose, for the first time, a universal compact FeFET-based CAM design, FeCAM, with search and storage functionality enabled in digital and analog domains simultaneously. By exploiting the multilevel-cell (MLC) states of FeFET, FeCAM can store and search inputs in either digital or analog domain. We perform a device-circuit codesign of the proposed FeCAM and validate its functionality and performance using an experimentally calibrated FeFET model. Circuit level simulation results demonstrate that FeCAM can either store continuous matching ranges or encode 3-bit data in a single CAM cell. When compared with the existing digital CMOS-based CAM approaches, FeCAM is found to improve both memory density by 22.4× and energy saving by 8.6×/3.2× for analog/digital modes, respectively. In the CAM-related application, our evaluations show that FeCAM can achieve 60.5×/23.1× saving in area/search energy compared with conventional CMOS-based CAMs.

Journal ArticleDOI
TL;DR: An output-capacitorless low-dropout regulator (OCL-LDO) using a dual-active feedback frequency compensation (DAFFC) scheme with both transient and stability enhancement has been presented in this article.
Abstract: An output-capacitorless low-dropout regulator (OCL-LDO) using a dual-active feedback frequency compensation (DAFFC) scheme with both transient and stability enhancement has been presented in this paper. The DAFFC scheme consists of two parallel active feedback paths, which creates two pole-zero pairs to effectively enhance the stability and transient response for the proposed OCL-LDO. Compared to the conventional single-path active-feedback frequency compensation method, the proposed DAFFC technique has provided one more design freedom with one more active feedback loop deployed and has been proved to be capable of obtaining better compensation effects with the same capacitor budget. Besides, the induced extra ac currents by the two active feedback loops have also enhanced the transient response of the proposed OCL-LDO. To substantiate the proposed DAFFC, a telescopic cascode output stage for error amplifier, and two on-chip compensation capacitors (5 and 1 pF, respectively) are needed. The proposed OCL-LDO has been implemented in 65-nm CMOS technology and the active chip area is 0.0105 mm². The output voltage is 0.8 V, and the minimum input voltage is 0.95 V at 100-mA loading current. The proposed OCL-LDO can work stably in a load range of 0 to 100 mA with 14-μA quiescent current.
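
The headline numbers in this abstract imply a dropout voltage and a current efficiency that can be derived with simple arithmetic. The sketch below does so under the assumption that the 14-μA quiescent current also applies at the full 100-mA load, which the abstract does not state explicitly. The function names are hypothetical.

```python
def dropout_voltage_v(v_in_min: float, v_out: float) -> float:
    """Dropout voltage as minimum input voltage minus output voltage."""
    return v_in_min - v_out


def current_efficiency(i_load_a: float, i_q_a: float) -> float:
    """Fraction of the input current that reaches the load."""
    return i_load_a / (i_load_a + i_q_a)


def power_efficiency(v_in: float, v_out: float, i_load_a: float, i_q_a: float) -> float:
    """Output power over input power for a linear regulator."""
    return (v_out * i_load_a) / (v_in * (i_load_a + i_q_a))


if __name__ == "__main__":
    # Values quoted in the abstract; quiescent current at full load is assumed.
    print(f"dropout = {dropout_voltage_v(0.95, 0.8) * 1e3:.0f} mV")
    print(f"current efficiency = {current_efficiency(100e-3, 14e-6) * 100:.3f} %")
    print(f"power efficiency = {power_efficiency(0.95, 0.8, 100e-3, 14e-6) * 100:.2f} %")
```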

Proceedings ArticleDOI
10 May 2020
TL;DR: In this article, an optical phased array with a record 8192 individually-addressed elements driven by flip-chip CMOS spanning a 100° × 17° field of view is presented.
Abstract: We present an optical phased array with a record 8192 individually-addressed elements driven by flip-chip CMOS spanning a 100° × 17° field of view. The reticle-sized PIC+CMOS beam steering engine enables near cm-scale apertures for long-range applications.

Journal ArticleDOI
TL;DR: A MEMS piezoelectric energy harvester is designed and co-integrated with an active synchronized switch harvesting on inductor (SSHI) rectification circuit designed in a CMOS process to achieve high output power for system miniaturization.
Abstract: Piezoelectric vibration energy harvesting has drawn much interest to power distributed wireless sensor nodes for Internet of Things (IoT) applications where ambient kinetic energy is available. For certain applications, the harvesting system should be small and able to generate sufficient output power. Standard rectification topologies such as the full-bridge rectifier are typically inefficient when adapted to power conditioning from miniaturized harvesters. Therefore, active rectification circuits have been researched to improve overall power conversion efficiency, and meet both the output power and miniaturization requirements while employing a MEMS harvester. In this paper, a MEMS piezoelectric energy harvester is designed and co-integrated with an active synchronized switch harvesting on inductor (SSHI) rectification circuit designed in a CMOS process to achieve high output power for system miniaturization. The system is fully integrated on a nail-size board, which is ready to provide a stable DC power for low-power mini sensors. A MEMS energy harvester of 0.005 cm³ size, co-integrated with the CMOS conditioning circuit, outputs a peak rectified DC power of 40.6 μW and achieves a record DC power density of 8.12 mW/cm³ when compared to state-of-the-art harvesters.
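
The quoted power density follows directly from the peak DC power and the harvester volume. A minimal sketch of that arithmetic, with an illustrative function name, is:

```python
def power_density_mw_per_cm3(power_w: float, volume_cm3: float) -> float:
    """DC power density in mW/cm^3 from power in watts and volume in cm^3."""
    return power_w / volume_cm3 * 1e3


if __name__ == "__main__":
    # 40.6 uW peak rectified DC power from a 0.005 cm^3 harvester (from the abstract).
    print(f"{power_density_mw_per_cm3(40.6e-6, 0.005):.2f} mW/cm^3")
```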

Journal ArticleDOI
TL;DR: In this paper, the authors describe the impact of static and dynamic MRM nonlinearity on PAM4 signaling and present a dual-path nonlinear pre-distortion technique to compensate for both effects.
Abstract: Microring modulators (MRMs) with CMOS electronics enable compact low power transmitter solutions for 400G Ethernet and future on-package optical transceivers. In this paper, we present a 112 Gb/s PAM4 transmitter using silicon photonic MRM, on-chip laser and co-packaged 28 nm CMOS driver. We describe the impact of static and dynamic MRM nonlinearity on PAM4 signaling and present a dual path nonlinear pre-distortion technique to compensate both effects. PAM4 measurement results of our transmitter at 112 Gb/s show that TDECQ <0.7 dB is achieved from 30 °C to 60 °C while dissipating 6 pJ/bit. We also present link level measurements at 112 Gb/s PAM4 obtained by coupling this transmitter with our previously published CMOS TIA-based receiver, to demonstrate the feasibility of low cost optical transceivers through CMOS integration of optical interface circuits.
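
The energy-per-bit figure can be turned into an approximate power figure by multiplying by the line rate. The sketch below does this, assuming the quoted 6 pJ/bit applies at the full 112 Gb/s. The abstract does not itemize which blocks that energy covers, so the result is only indicative.

```python
def power_w(bit_rate_bps: float, energy_per_bit_j: float) -> float:
    """Average power implied by a bit rate and an energy-per-bit figure."""
    return bit_rate_bps * energy_per_bit_j


if __name__ == "__main__":
    # 112 Gb/s at 6 pJ/bit (values from the abstract).
    print(f"{power_w(112e9, 6e-12) * 1e3:.0f} mW")
```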

Journal ArticleDOI
TL;DR: The design exploits a three-stage structure with a Reversed Miller Compensation Scheme, where the input stage is based on a non-tailed bulk-driven differential pair, and the resulting amplifier outperforms other ultra-low-voltage OTAs in terms of a DC voltage gain and power efficiency.
Abstract: A new solution for an ultra-low-voltage, ultra-low-power operational transconductance amplifier (OTA) is presented in the paper. The design exploits a three-stage structure with a Reversed Miller Compensation Scheme, where the input stage is based on a non-tailed bulk-driven differential pair. Optimization of the structure for very low supply voltage is discussed. The resulting amplifier outperforms other ultra-low-voltage OTAs in terms of a DC voltage gain and power efficiency, expressed by standard figures of merit. Experimental verification using a 0.18 μm CMOS technology, with supply voltage of 0.3 V, showed a dissipation power of 13 nW, a DC voltage gain of 98 dB, a gain-bandwidth product of 3.1 kHz and an average slew-rate of 9.1 V/ms at 30 pF load capacitance. The experimental results agree well with simulations.

Journal ArticleDOI
24 Aug 2020
TL;DR: In this article, the authors report a radiation-hardened field effect transistor (FET) that uses semiconducting carbon nanotubes as the channel material, an ion gel as the gate and polyimide as the substrate.
Abstract: Electronic devices that operate in outer space and nuclear reactors require radiation-hardened transistors. However, high-energy radiation can damage the channel, gate oxide and substrate of a field-effect transistor (FET), and redesigning all vulnerable parts to make them more resistant to total ionizing dose irradiation has proved challenging. Here, we report a radiation-hardened FET that uses semiconducting carbon nanotubes as the channel material, an ion gel as the gate and polyimide as the substrate. The FETs exhibit a radiation tolerance of up to 15 Mrad at a dose rate of 66.7 rad s−1, which is notably higher than the tolerance of silicon-based transistors (1 Mrad). The devices can also be used to make complementary metal–oxide–semiconductor (CMOS)-like inverters with similarly high tolerances. Furthermore, we show that radiation-damaged FETs can be recovered by annealing at a moderate temperature of 100 °C for 10 min. By using carbon nanotubes as a channel material, an ion gel as a gate and polyimide as a substrate, field-effect transistors can be created that have a high radiation tolerance and can be repaired by annealing.
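
To give a sense of scale for the dose figures, the sketch below estimates how long an exposure at the quoted dose rate would take to accumulate the quoted total dose. It assumes a constant dose rate throughout, which the abstract does not confirm, and the function name is illustrative.

```python
def exposure_time_hours(total_dose_rad: float, dose_rate_rad_per_s: float) -> float:
    """Hours needed to reach a total dose at a constant dose rate."""
    return total_dose_rad / dose_rate_rad_per_s / 3600.0


if __name__ == "__main__":
    # 15 Mrad at 66.7 rad/s (values from the abstract).
    print(f"{exposure_time_hours(15e6, 66.7):.1f} h")
    # Ratio to the 1 Mrad tolerance quoted for silicon-based transistors.
    print(f"{15e6 / 1e6:.0f}x the quoted silicon tolerance")
```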