Showing papers on "Chip" published in 2021


Journal ArticleDOI
09 Jun 2021-Nature
TL;DR: In this article, the authors presented a deep reinforcement learning approach to chip floorplanning, which can automatically generate chip floorplans that are superior or comparable to those produced by humans in all key metrics, including power consumption, performance and chip area.
Abstract: Chip floorplanning is the engineering task of designing the physical layout of a computer chip. Despite five decades of research [1], chip floorplanning has defied automation, requiring months of intense effort by physical design engineers to produce manufacturable layouts. Here we present a deep reinforcement learning approach to chip floorplanning. In under six hours, our method automatically generates chip floorplans that are superior or comparable to those produced by humans in all key metrics, including power consumption, performance and chip area. To achieve this, we pose chip floorplanning as a reinforcement learning problem, and develop an edge-based graph convolutional neural network architecture capable of learning rich and transferable representations of the chip. As a result, our method utilizes past experience to become better and faster at solving new instances of the problem, allowing chip design to be performed by artificial agents with more experience than any human designer. Our method was used to design the next generation of Google’s artificial intelligence (AI) accelerators, and has the potential to save thousands of hours of human effort for each new generation. Finally, we believe that more powerful AI-designed hardware will fuel advances in AI, creating a symbiotic relationship between the two fields. Machine learning tools are used to greatly accelerate chip layout design, by posing chip floorplanning as a reinforcement learning problem and using neural networks to generate high-performance chip layouts.

124 citations


Journal ArticleDOI
13 May 2021-Nature
TL;DR: In this paper, a cryogenic CMOS control chip operating at 3 kelvin is reported that drives silicon quantum bits cooled to 20 millikelvin; the authors use it to coherently control qubits encoded in the spin of single electrons confined in silicon quantum dots.
Abstract: The most promising quantum algorithms require quantum processors that host millions of quantum bits when targeting practical applications [1]. A key challenge towards large-scale quantum computation is the interconnect complexity. In current solid-state qubit implementations, an important interconnect bottleneck appears between the quantum chip in a dilution refrigerator and the room-temperature electronics. Advanced lithography supports the fabrication of both control electronics and qubits in silicon using technology compatible with complementary metal oxide semiconductors (CMOS) [2]. When the electronics are designed to operate at cryogenic temperatures, they can ultimately be integrated with the qubits on the same die or package, overcoming the ‘wiring bottleneck’ [3–6]. Here we report a cryogenic CMOS control chip operating at 3 kelvin, which outputs tailored microwave bursts to drive silicon quantum bits cooled to 20 millikelvin. We first benchmark the control chip and find an electrical performance consistent with qubit operations of 99.99 per cent fidelity, assuming ideal qubits. Next, we use it to coherently control actual qubits encoded in the spin of single electrons confined in silicon quantum dots [7–9] and find that the cryogenic control chip achieves the same fidelity as commercial instruments at room temperature. Furthermore, we demonstrate the capabilities of the control chip by programming a number of benchmarking protocols, as well as the Deutsch–Jozsa algorithm [10], on a two-qubit quantum processor. These results open up the way towards a fully integrated, scalable silicon-based quantum computer. A cryogenic CMOS control chip operating at 3 K is used to demonstrate coherent control and simple algorithms on silicon qubits operating at 20 mK.

93 citations
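
For readers unfamiliar with the Deutsch–Jozsa algorithm named in the abstract above, the following is a minimal state-vector sketch of its two-qubit form (one input qubit plus one ancilla). It is an idealized, noise-free simulation in plain Python and not the authors' pulse-level implementation on spin qubits; the function names `hadamard_on`, `oracle`, and `deutsch_jozsa` are hypothetical.

```python
import math

def hadamard_on(state, qubit):
    """Apply a Hadamard gate to one qubit of a 2-qubit state.

    The state is a list of 4 amplitudes indexed by 2*x + y,
    where x is the input qubit (qubit 0) and y the ancilla (qubit 1).
    """
    s = 1 / math.sqrt(2)
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        x, y = idx >> 1, idx & 1
        bit = x if qubit == 0 else y
        for out_bit in (0, 1):
            # H|b> = (|0> + (-1)^b |1>) / sqrt(2)
            sign = -1.0 if (bit and out_bit) else 1.0
            nx, ny = (out_bit, y) if qubit == 0 else (x, out_bit)
            new[2 * nx + ny] += sign * s * amp
    return new

def oracle(state, f):
    """Apply U_f: |x, y> -> |x, y XOR f(x)> for a one-bit function f."""
    new = [0.0] * 4
    for idx, amp in enumerate(state):
        x, y = idx >> 1, idx & 1
        new[2 * x + (y ^ f(x))] += amp
    return new

def deutsch_jozsa(f):
    """Classify f as 'constant' or 'balanced' with a single oracle call."""
    state = [0.0, 1.0, 0.0, 0.0]      # start in |x=0, y=1>
    state = hadamard_on(state, 0)
    state = hadamard_on(state, 1)
    state = oracle(state, f)
    state = hadamard_on(state, 0)
    # Probability of measuring the input qubit in |1>
    p_x1 = state[2] ** 2 + state[3] ** 2
    return "balanced" if p_x1 > 0.5 else "constant"

if __name__ == "__main__":
    print(deutsch_jozsa(lambda x: 0))      # a constant function
    print(deutsch_jozsa(lambda x: x))      # a balanced function
```

In this ideal setting the input qubit ends in |0> for a constant function and in |1> for a balanced one, so a single measurement decides the question.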


Journal ArticleDOI
25 Jan 2021
TL;DR: In this article, the authors present a platform based on complementary metal–oxide–semiconductor (CMOS) technology operating with qubits close to 100 mK, which can generate static and dynamic signals for the control of many qubits.
Abstract: Scaled-up quantum computers will require control interfaces capable of the manipulation and readout of large numbers of qubits, which usually operate at millikelvin temperatures. Advanced complementary metal–oxide–semiconductor (CMOS) technology is an attractive platform for delivering such interfaces. However, this approach is generally discounted due to its high power dissipation, which can lead to the heating of fragile qubits. Here we report a CMOS-based platform that can provide multiple electrical signals for the control of qubits at 100 mK. We demonstrate a chip that is configured by digital input signals at room temperature and uses on-chip circuit cells that are based on switched capacitors to generate static and dynamic voltages for the parallel control of qubits. We use our CMOS chip to bias a quantum dot device and to switch the conductance of a quantum dot via voltage pulses generated on the chip. Based on measurements from six cells, we determine the average power dissipation for generating control pulses of 100 mV to be 18 nW per cell. We estimate that a scaled-up system containing a thousand cells could be cooled by a commercially available dilution refrigerator. A platform based on complementary metal–oxide–semiconductor (CMOS) technology operating with qubits close to 100 mK can generate static and dynamic signals for the control of many qubits.

74 citations
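
The scale-up estimate in the abstract above can be reproduced with a short back-of-the-envelope calculation. This is only a sketch that assumes dissipation scales linearly with the number of cells and that every cell dissipates the measured 18 nW average; the abstract does not state the refrigerator's cooling power, so no comparison against it is attempted. The function name is hypothetical.

```python
POWER_PER_CELL_W = 18e-9   # 18 nW average per cell, from the abstract
NUM_CELLS = 1000           # "a thousand cells", from the abstract

def total_dissipation_w(power_per_cell_w, num_cells):
    """Total dissipation, assuming identical cells and linear scaling."""
    return power_per_cell_w * num_cells

if __name__ == "__main__":
    total_w = total_dissipation_w(POWER_PER_CELL_W, NUM_CELLS)
    print(f"{total_w * 1e6:.1f} uW")   # prints "18.0 uW"
```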


Journal ArticleDOI
TL;DR: This article presents the theoretical basis of phase change materials (PCMs) applied to chip thermal management and summarizes the application progress of PCMs in electronic chip thermal management, drawing on recent related literature.

51 citations


Posted Content
TL;DR: In this paper, a multi-skilled diffractive neural network based on a metasurface device is demonstrated for on-chip multi-channel sensing and multitasking at the speed of light in the visible.
Abstract: Replacing electrons with photons is a compelling route towards light-speed, highly parallel, and low-power artificial intelligence computing. Recently, all-optical diffractive deep neural networks have been demonstrated. However, the existing architectures often comprise bulky components and, most critically, they cannot mimic the human brain for multitasking. Here, we demonstrate a multi-skilled diffractive neural network based on a metasurface device, which can perform on-chip multi-channel sensing and multitasking at the speed of light in the visible. The metasurface is integrated with a complementary metal oxide semiconductor imaging sensor. A polarization multiplexing scheme of the subwavelength nanostructures is applied to construct a multi-channel classifier framework for simultaneous recognition of digital and fashionable items. The areal density of the artificial neurons can reach up to 6.25 × 10⁶/mm² multiplied by the number of channels. Our platform provides an integrated solution with all-optical on-chip sensing and computing for applications in machine vision, autonomous driving, and precision medicine.

40 citations


Journal ArticleDOI
TL;DR: In this article, the authors present μBrain, the first digital yet fully event-driven architecture without a clock, with co-located memory and processing capability that exploits event-based processing to reduce an always-on system's overall energy consumption.
Abstract: The development of brain-inspired neuromorphic computing architectures as a paradigm for Artificial Intelligence (AI) at the edge is a candidate solution that can meet strict energy and cost reduction constraints in the Internet of Things (IoT) application areas. Toward this goal, we present μBrain: the first digital yet fully event-driven architecture without a clock, with co-located memory and processing capability that exploits event-based processing to reduce an always-on system's overall energy consumption (μW dynamic operation). The chip area in a 40 nm Complementary Metal Oxide Semiconductor (CMOS) digital technology is 2.82 mm² including pads (1.42 mm² without pads). This small area footprint enables μBrain integration in re-trainable sensor ICs to perform various signal processing tasks, such as data preprocessing, dimensionality reduction, feature selection, and application-specific inference. We present an instantiation of the μBrain architecture in a 40 nm CMOS digital chip and demonstrate its efficiency in a radar-based gesture classification with a power consumption of 70 μW and energy consumption of 340 nJ per classification. As a digital architecture, μBrain is fully synthesizable and lends itself to a fast development-to-deployment cycle in Application-Specific Integrated Circuits (ASIC). To the best of our knowledge, μBrain is the first tiny-scale digital, spike-based, fully parallel, non-Von-Neumann architecture (without schedules, clocks, or state machines). For these reasons, μBrain is ultra-low-power and offers software-to-hardware fidelity. μBrain enables always-on neuromorphic computing in IoT sensor nodes that require running on battery power for years.

39 citations
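
The two figures quoted in the abstract above (70 μW and 340 nJ per classification) imply a time per classification if one assumes that both numbers describe the same operating condition and that 70 μW is the average power while classifying. That assumption is not confirmed by the abstract, so the sketch below is only an illustration; the function names are hypothetical.

```python
POWER_W = 70e-6                        # 70 uW, from the abstract
ENERGY_PER_CLASSIFICATION_J = 340e-9   # 340 nJ, from the abstract

def time_per_classification_s(energy_j, power_w):
    """Time implied by E = P * t, assuming constant average power."""
    return energy_j / power_w

def classifications_per_second(energy_j, power_w):
    """Implied classification rate under the same assumption."""
    return power_w / energy_j

if __name__ == "__main__":
    t = time_per_classification_s(ENERGY_PER_CLASSIFICATION_J, POWER_W)
    rate = classifications_per_second(ENERGY_PER_CLASSIFICATION_J, POWER_W)
    print(f"{t * 1e3:.2f} ms per classification, about {rate:.0f} per second")
```

Under these assumptions the result is roughly 4.9 ms per classification.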



Journal ArticleDOI
TL;DR: In this paper, a finite element model-based adaptive ML method is presented for chip package reliability prediction and design optimization. It employs a validated multi-scale finite element model for training data generation, and an adaptive sampling scheme is developed to optimize the training process.
Abstract: Machine learning (ML) is widely used for building data-driven models that are highly useful for optimization. In this study, a finite element model-based adaptive ML method is presented for chip package reliability prediction and design optimization. This ML method employs a validated multi-scale finite element model for training data generation. An adaptive sampling scheme is developed to optimize the training process with a steepest descent algorithm. The developed method was used to optimize ultra low-k chip package design. The effects of ten key design parameters on chip packaging reliability were considered. Multiple ML algorithms were evaluated for model development. It is shown that the adaptive sampling method performs much better than existing sequential sampling methods and that the finite element-based ML model can be used to achieve improved prediction accuracy for chip package design optimization.

27 citations


Journal ArticleDOI
TL;DR: This paper develops two different chip sets for on-off-keying modulation based THz transceivers which include carrier generators, modulators, THz amplifiers, and baseband amplifiers and confirms the accuracy of the derived BER expression.
Abstract: Terahertz (THz) communication is a promising technique for chip-to-chip communication and wireless personal area networks. In this paper, we present an experimental study and design to realize such THz communication systems. We develop two different chip sets for on-off-keying (OOK) modulation based THz transceivers which include carrier generators, modulators, THz amplifiers, and baseband amplifiers. Specifically, the first chip set integrates the circuit blocks for the OOK modulation without the THz amplifier for short-range communication. In addition, the second chip set design includes the THz amplifier modules to extend the coverage of transmission. For these two chip sets, we experimentally demonstrate the feasibility of the wireless communication at THz frequency bands and assess performance using the bit error rate (BER) analysis. We estimate the BER by calculating the signal-to-noise ratio (SNR) based on the eye diagram and compare with actual BER measurements and Monte Carlo simulations. We also address the impact of the distance, the transmit power, and the data rate for the proposed THz transceivers based on the link budget analysis, and confirm the accuracy of the derived BER expression.

27 citations
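
The abstract above mentions estimating BER from an SNR read off the eye diagram. A common textbook approximation for OOK with Gaussian noise and an optimally placed decision threshold is BER = 0.5 · erfc(Q / √2), where Q = (μ1 − μ0) / (σ1 + σ0). The sketch below uses that approximation; the paper's own derived BER expression is not given in the abstract and may differ, and the eye-diagram values used in the example are hypothetical.

```python
import math

def q_factor(mean_one, mean_zero, sigma_one, sigma_zero):
    """Q-factor from the mean and standard deviation of the '1' and '0' rails."""
    return (mean_one - mean_zero) / (sigma_one + sigma_zero)

def ber_from_q(q):
    """Gaussian-noise BER approximation for OOK: 0.5 * erfc(Q / sqrt(2))."""
    return 0.5 * math.erfc(q / math.sqrt(2))

if __name__ == "__main__":
    # Hypothetical eye-diagram measurements in arbitrary voltage units
    q = q_factor(mean_one=1.0, mean_zero=0.0, sigma_one=0.1, sigma_zero=0.1)
    print(f"Q = {q:.2f}, estimated BER = {ber_from_q(q):.2e}")
```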


Journal ArticleDOI
TL;DR: In this article, optical reservoir computing was used to mitigate nonlinear distortions in a 32 Gbps OOK signal to below the 0.2 × 10⁻³ FEC limit using a photonic reservoir.
Abstract: Nonlinearity mitigation in optical fiber networks is typically handled by electronic Digital Signal Processing (DSP) chips. Such DSP chips are costly, power-hungry and can introduce high latencies. Therefore, optical techniques are investigated which are more efficient in both power consumption and processing cost. One such machine learning technique is optical reservoir computing, in which a photonic chip can be trained on certain tasks, with the potential advantages of higher speed, reduced power consumption and lower latency compared to its electronic counterparts. In this paper, experimental results are presented where nonlinear distortions in a 32 Gbps OOK signal are mitigated to below the 0.2 × 10⁻³ FEC limit using a photonic reservoir. Furthermore, the results of the reservoir chip are compared to a tapped delay line filter to clearly show that the system performs nonlinear equalisation.

26 citations


Journal ArticleDOI
TL;DR: The developed system constitutes a fully digital, bidirectional 32-channel interface to the brain and offers low-noise recording, a state-of-the-art neurostimulator capable of both current- and voltage-controlled stimulation with high-voltage compliance, on-chip 16-bit data digitization as well as safety features such as electrode impedance estimation and charge balancing.
Abstract: This article presents the integration of a 32-channel neuromodulation system on chip (SoC) that is developed for chronic implantation in humans. The application-specific integrated circuit (ASIC) offers low-noise recording, a state-of-the-art (SotA) neurostimulator capable of both current- and voltage-controlled stimulation with high-voltage compliance, on-chip 16-bit data digitization as well as safety features such as electrode impedance estimation and charge balancing. The chip communicates through two distinct SPI interfaces for independent command and data transfer. Thus, the developed system constitutes a fully digital, bidirectional 32-channel interface to the brain.

Journal ArticleDOI
TL;DR: The article discusses the advantages of the «System on a Chip» design technology used to create ECB in Russia and proposes measures for structuring the program for the development of domestic SoC technology.
Abstract: The article discusses the advantages of the «System on a Chip» design technology used to create ECB in Russia. Measures for structuring the program for the development of domestic SoC technology are proposed. Attention is paid to the Fabless – Foundry (Design – Production) model. The interactions between designers, manufacturers and consumers are described from each side.

Journal ArticleDOI
TL;DR: In this paper, a dual RSFQ/ERSFQ cell library for the MIT-LL SFQ5ee process is presented, which can be used with the superconductor EDA tools suite that is being developed.
Abstract: The cell library is the keystone component that enables adoption of advanced electronic design automation (EDA) tools, such as logic synthesis and automatic place-and-route. The EDA tools are essential for scaling circuit complexity by orders of magnitude. We have designed a dual RSFQ/ERSFQ cell library for the MIT-LL SFQ5ee process that can be used with the superconductor EDA tools suite that is being developed. In addition to satisfying the margins criterion, the performance of each cell has been optimized for Monte-Carlo statistical variations across multiple process corners, including minimizing the spread of timing distributions. To enable a digital design flow using HDL simulations with timing back-annotation, Liberty files have been developed for multiple process corners, using load-dependent timing characterization. The cells have been designed for a standard height of 40 μm with a grid size of 20 μm. The library provides dedicated tracks for signal and power routing. Multiple independent biases are supported for RSFQ designs. The cells can be interconnected either by abutting or using passive transmission lines. Dedicated moat slots have been provided which are uniformly distributed across the cell. All cells are re-optimized post-layout. The library currently contains 22 unique types of cells. Initial validation of the cell library was performed by designing RSFQ and ERSFQ shift registers for the MIT-LL SFQ5ee fabrication process, which yielded wide operating margins. In addition, we present measurement results for a chip designed and fabricated to characterize several library cells using a multiplexing scheme.

Journal ArticleDOI
TL;DR: In this article, the authors present a miniaturized, minimally invasive high-density neural recording interface that occupies only a 1.53 mm² footprint for hybrid integration of a flexible polyimide neural probe and a 256-channel integrated circuit chip.
Abstract: We report a miniaturized, minimally invasive high-density neural recording interface that occupies only a 1.53 mm² footprint for hybrid integration of a flexible probe and a 256-channel integrated circuit chip. To achieve such a compact form factor, we developed a custom flip-chip bonding technique using anisotropic conductive film and analog circuit-under-pad in a tiny pitch of 75 μm. To enhance signal-to-noise ratios, we applied a reference-replica topology that can provide the matched input impedance for signal and reference paths in low-noise amplifiers (LNAs). The analog front-end (AFE) consists of LNAs, buffers, programmable gain amplifiers, 10b ADCs, a reference generator, a digital controller, and serial-peripheral interfaces (SPIs). The AFE consumes 51.92 μW from 1.2 V and 1.8 V supplies in an area of 0.0161 mm² per channel, implemented in a 180 nm CMOS process. The AFE shows > 60 dB mid-band CMRR, 6.32 μVrms input-referred noise from 0.5 Hz to 10 kHz, and 48 MΩ input impedance at 1 kHz. The fabricated AFE chip was directly flip-chip bonded with a 256-channel flexible polyimide neural probe and assembled in a tiny head-stage PCB. Full functionalities of the fabricated 256-channel interface were validated in both in vitro and in vivo experiments, demonstrating that the presented hybrid neural recording interface is suitable for various neuroscience studies in the quest for large-scale, miniaturized recording systems.

Journal ArticleDOI
TL;DR: In this paper, a coupled system consisting of a microscope and a high-speed camera in conjunction with a laser light source was used to obtain high-resolution pictures with a frame rate of more than 100 kHz.

Journal ArticleDOI
TL;DR: The bit-serial binary operation allows for bit-accurate operation and high DNN accuracy that multibit analog compute-in-memory designs struggle to attain and provides favorable energy tradeoffs compared with small-integer digital DNN accelerators.
Abstract: A binary neural network (BNN) chip explores the limits of energy efficiency and computational density for an all-digital deep neural network (DNN) inference accelerator. The chip intersperses data storage and computation using computation near memory (CNM) to reduce interconnect and data movement costs. It performs wide inner product operations to leverage parallelism inherent in DNN computations. The BNN chip leverages lightweight pipelining at a near-threshold voltage (NTV) to reduce the overhead of sequential elements. It employs optimized data access patterns to reduce memory accesses for convolutional operation with pooling layers. The combination of these techniques enables the BNN chip to achieve a peak energy efficiency of 617 TOPS/W. The digital BNN chip approaches the energy efficiency of analog in-memory techniques while also ensuring deterministic, scalable, and bit-accurate operation. Moreover, the all-digital design leverages process scaling and does not require additional memory transistors or passive devices to attain a peak compute density of 418 TOPS/mm² and a memory density of 414 KB/mm². The binary design is extended to enable bit-serial integer precision operation with a reconfigurable 1-b multiplication circuit and element-wise partial sum shift and accumulate. This technique allows for fine-grain mixed precision and retains energy efficiency by exploiting parallelism inherent in DNNs. The bit-serial binary operation allows for bit-accurate operation and high DNN accuracy that multibit analog compute-in-memory designs struggle to attain. It provides favorable energy tradeoffs compared with small-integer digital DNN accelerators.

Journal ArticleDOI
TL;DR: A 256-pixel CMOS sensor array with in-pixel dual electrochemical and impedance detection modalities for rapid, multi-dimensional characterization of exoelectrogens is presented in this paper.
Abstract: The paper presents a 256-pixel CMOS sensor array with in-pixel dual electrochemical and impedance detection modalities for rapid, multi-dimensional characterization of exoelectrogens. The CMOS IC has 16 parallel readout channels, allowing it to perform multiple measurements with high throughput and enabling the chip to handle different samples simultaneously. The chip contains a total of 2 × 256 working electrodes of size 44 μm × 52 μm, along with 16 reference electrodes of dimensions 56 μm × 399 μm and 32 counter electrodes of dimensions 399 μm × 106 μm, which together facilitate the high-resolution screening of the test samples. The chip was fabricated in a standard 130 nm BiCMOS process. The on-chip electrodes are subjected to additional fabrication processes, including a critical Al-etch step that ensures the excellent biocompatibility and long-term reliability of the CMOS sensor array in bio-environments. The electrochemical sensing modality is verified by detecting the electroactive analyte NaFeEDTA and the exoelectrogenic Shewanella oneidensis MR-1 bacteria, illustrating the chip's ability to quantify the generated electrochemical current and distinguish between different analyte concentrations. The impedance measurements with the HEK-293 cancer cells cultured on-chip successfully capture the cell-to-surface adhesion information between the electrodes and the cancer cells. The reported CMOS sensor array outperforms the conventional discrete setups for exoelectrogen characterization in terms of spatial resolution and speed, which demonstrates the chip's potential to radically accelerate synthetic biology engineering.

Journal ArticleDOI
TL;DR: In this paper, a scalable laser-driven nanophotonic electron accelerator on a chip is presented, which requires only a single incident laser pulse and can be fabricated straightforwardly on commercial silicon-on-insulator wafers.
Abstract: A simple way of implementing a scalable laser-driven nanophotonic electron accelerator on a chip is presented. The design requires only a single incident laser pulse and can be fabricated straightforwardly on commercial silicon-on-insulator wafers. We investigate the low-energy regime of tabletop electron microscopes where the silicon structures safely allow peak gradients of about 150 MeV/m. By means of a three-dimensional alternating-phase-focusing scheme, we obtain about half of the peak gradient as the average gradient with six-dimensional confinement and full-length scalability. The structures are completely designed within the device layer of the wafer and can be arranged in stages. We choose the stages as energy doublers and outline how errors in the handshake between the stages can be corrected by on-chip steerers. Since the electron pulse length in the attosecond realm is preserved, our chip is the ideal energy booster for ultrafast-electron-diffraction machines, opening the megaelectronvolt scale on tabletop setups.

Posted ContentDOI
TL;DR: This work demonstrates wafer-scale processing of a 2D semiconductor for building integrated circuits with the functions of AI computation, including memory, multiply-and-accumulate (MAC), activation function, and weight update circuits.
Abstract: Recently, research on two-dimensional (2D) semiconductors has begun to translate from fundamental investigation into rudimentary functional circuits. In this work, we unveil the first functional MoS2 artificial neural network (ANN) chip, including multiply-and-accumulate (MAC), memory and activation function circuits. This MoS2 ANN chip is realized by fabricating 818 field-effect transistors (FETs) on a wafer-scale and high-homogeneity MoS2 film, with a gate-last process to realize top-gate-structured FETs. A 62-level simulation program with integrated circuit emphasis (SPICE) model is utilized to design and optimize our analog ANN circuit. To demonstrate a practical application, tactile digit recognition was demonstrated based on our ANN circuits. After training, the digit recognition rate exceeds 97%. Our work not only demonstrates the potential of 2D semiconductors in wafer-scale integrated circuits, but also paves the way for their future application in AI computation.

Journal ArticleDOI
Qiman Cheng, Shilie Zheng, Qiang Zhang, Jun Ji, Hui Yu, Xianmin Zhang
TL;DR: A two-dimensional integrated optical true time delay network (OTTDN) with real-time monitoring and adjustment is proposed; it can be used in multi-beam systems whose beams are controlled independently and simultaneously.

Journal ArticleDOI
TL;DR: In this article, waveguide-integrated superconducting nanowire single-photon detectors (SNSPDs) are used to achieve low dead times together with low dark-count rates and to demonstrate a QKD experiment at a 2.6 GHz clock rate.
Abstract: Quantum key distribution (QKD) can greatly benefit from photonic integration, which enables implementing low-loss, alignment-free, and scalable photonic circuitry. At the same time, superconducting nanowire single-photon detectors (SNSPDs) are an ideal detector technology for QKD due to their high efficiency, low dark-count rate, and low jitter. We present a QKD receiver chip featuring the full photonic circuitry needed for different time-based protocols, including single-photon detectors. By utilizing waveguide-integrated SNSPDs we achieve low dead times together with low dark-count rates and demonstrate a QKD experiment at a 2.6 GHz clock rate, yielding secret-key rates of 2.5 Mbit/s for low channel attenuations of 2.5 dB without detector saturation. Due to the broadband 3D polymer couplers, the receiver chip can be operated over a wide wavelength range in the telecom band, thus paving the way for highly parallelized wavelength-division multiplexing implementations.

Journal ArticleDOI
TL;DR: This work presents a real-time 100Base-TX Ethernet physical layer (PHY) transceiver chip implemented in a 180-nm technology that implements highly accurate hardware timestamping by using an 8-bit digital-to-phase converter generating 256 phases of the 125 MHz system clock.
Abstract: This work presents a real-time 100Base-TX Ethernet physical layer (PHY) transceiver chip implemented in a 180-nm technology. The PHY chip implements highly accurate hardware timestamping by using an 8-bit digital-to-phase converter (DPC) generating 256 phases of the 125 MHz system clock. Using these clock phases, spaced 31.25 ps apart from each other, phase relationships can be evaluated to timestamp ingress and egress frames with improved resolution. Connected by up to 120-m-long category five unshielded twisted-pair cables, two of these PHY chips are used to demonstrate the synchronization in a two-node scenario. A proportional–integral (PI)-clock servo is used at the slave for the synchronization. To optimize the performance, the parameters of the controller need to be chosen carefully. To do so, a simple model is used to find a suitable bandwidth of the controller as a tradeoff between the noise of the timestamping and the noise of the oscillator. All in all, a synchronization accuracy with a standard deviation of only 64 ps and a mean offset of well below 100 ps is achieved in the given scenario. To the best of our knowledge, this is the highest synchronization accuracy over copper-based Ethernet reported to date.
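
The 31.25 ps phase spacing quoted in the abstract above follows directly from dividing one period of the 125 MHz clock into 2^8 = 256 equal phases. A minimal sketch of that arithmetic, with a hypothetical function name:

```python
CLOCK_HZ = 125e6   # system clock frequency, from the abstract
DPC_BITS = 8       # digital-to-phase converter resolution, from the abstract

def phase_step_s(clock_hz, bits):
    """Spacing between adjacent clock phases: one period divided by 2**bits."""
    period_s = 1.0 / clock_hz
    return period_s / (2 ** bits)

if __name__ == "__main__":
    step = phase_step_s(CLOCK_HZ, DPC_BITS)
    print(f"{2 ** DPC_BITS} phases, {step * 1e12:.2f} ps apart")
```

One 125 MHz period is 8 ns, and 8 ns / 256 gives the 31.25 ps stated by the authors.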

Journal ArticleDOI
TL;DR: This work presents a split-tree SCL decoder that works by dividing a polar code’s decoding tree into sub-trees following a split-tree decoding algorithm and improves the throughput and latency proportionally to the split factor.
Abstract: Polar codes are capacity-achieving channel codes and they have recently been adopted for fifth-generation (5G) enhanced mobile broadband (eMBB) control channels. Using successive cancellation list (SCL) decoding, the error-correction performance of polar codes can surpass state-of-the-art codes of a comparable length. However, the sequential SC decoding incurs a long latency, and list decoding requires complex tracking of candidates. We present a split-tree SCL decoder that works by dividing a polar code’s decoding tree into sub-trees following a split-tree decoding algorithm. The sub-trees are decoded in parallel by smaller sub-decoders that reconcile their decisions in every decoding stage. The split-tree list decoder architecture improves the throughput and latency proportionally to the split factor. By exploiting under-utilized hardware resources, we apply frame interleaving to further increase throughput and employ dynamic clock gating to reduce energy. The results are demonstrated in a 0.64-mm² 40-nm test chip that implements a split-4, list-2, eight-frame-interleaved decoding architecture. The chip supports configurable code lengths up to 1024 bit and variable code rates. At 0.9 V and room temperature, the chip achieves 3.25 Gb/s with 42.8-mW power, or 13.2 pJ/b, and demonstrates competitive error-correction performance.
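
The energy-per-bit figure in the abstract above is the ratio of power to throughput. The short sketch below reproduces it from the two quoted numbers; the function name is hypothetical.

```python
POWER_W = 42.8e-3          # 42.8 mW, from the abstract
THROUGHPUT_BPS = 3.25e9    # 3.25 Gb/s, from the abstract

def energy_per_bit_pj(power_w, throughput_bps):
    """Energy per decoded bit in picojoules: power divided by throughput."""
    return power_w / throughput_bps * 1e12

if __name__ == "__main__":
    e = energy_per_bit_pj(POWER_W, THROUGHPUT_BPS)
    print(f"{e:.1f} pJ/b")   # prints "13.2 pJ/b"
```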

Journal ArticleDOI
11 Jan 2021
TL;DR: This paper reviews the broadband potential of the most common interconnect types in use and their performance demonstrated so far, covering wirebonding, approaches with chips embedded in a substrate, and flip-chip.
Abstract: Connecting chips within a module is a basic requirement in transforming MMIC performance to system functionality. More and more applications demand operation at high mm-wave frequencies or with ultra-large bandwidth. While semiconductor devices have seen tremendous progress in terms of their frequency limits, the chip interconnects lag behind and often form the bottleneck in realizing such systems. This paper reviews the broadband potential of the most common interconnect types in use and their performance demonstrated so far, covering wirebonding, approaches with chips embedded in a substrate, and flip-chip. Additionally, as an intermediate solution between system-on-chip and system-in-a-package, semiconductor hetero-integration on the chip level is included. As is discussed, bond wire interconnects are the most limited in bandwidth among the four types and reach the 100 GHz band only at the expense of narrowband characteristics. Dedicated embedded-chip packaging techniques show significantly better performance; bandwidths on the order of 100 GHz have been shown in the literature. Flip-chip clearly has the highest potential; interconnects covering the range from DC to 500 GHz have been demonstrated and are presented in the paper. Hetero-integration on the chip proves to allow for very broadband interconnects between elements and circuits on the compound chip as well: for an InP-on-BiCMOS process, a bandwidth of 325 GHz was achieved, and even higher values seem to be feasible.

Journal ArticleDOI
TL;DR: In this article, a circuit modeling method for the online training and testing process of a neuromorphic chip crossbar array based on resistive random access memory (RRAM) is presented.
Abstract: This article presents a novel circuit modeling method for the online training and testing process of a neuromorphic chip crossbar array based on resistive random access memory (RRAM). A modified RRAM compact model is developed to realize fast and accurate updates of multiple conductance levels. Two training mechanisms, with and without a write-verify scheme, are modeled and investigated for classifying MNIST handwritten digits, and both achieve a good recognition accuracy of more than 96%. The parasitic model of the unit cell of the interconnects is constructed by the domain decomposition method (DDM) and the partial element equivalent circuit (PEEC) method, which is suitable for building a crossbar array of any size. The impact of the parasitic effects of the interconnects on the recognition accuracy, with and without the write-verify scheme, is analyzed and compared. The weights trained with the write-verify scheme show better robustness to parasitic noise, but training with the write-verify scheme takes longer to process the same amount of data.

Journal ArticleDOI
TL;DR: In this paper, a 64-channel SiN-Si based one-dimensional (1D) OPA chip has been designed to handle high beam power and achieve a large scanning range.
Abstract: The optical power handling of an optical phased array (OPA) scanning beam determines its targeted detection distance. So far, few investigations have addressed the limits on beam power. To the best of our knowledge, this paper is the first to explore the capability of a silicon-photonics-based OPA circuit for high-power applications. A 64-channel SiN-Si based one-dimensional (1D) OPA chip has been designed to handle high beam power and achieve a large scanning range. The chip was fabricated on a standard silicon photonics platform. The main lobe power of our chip can reach 720 mW and its peak side-lobe level (PSLL) is -10.33 dB. We obtain a wide scanning range of 110° in the horizontal direction at a wavelength of 1550 nm, with a compressed longitudinal divergence angle of 0.02° for each scanning beam.

Journal ArticleDOI
TL;DR: The silicon-on-insulator processor outperforms the silicon nitride one in terms of footprint and energy efficiency and the lower extinction ratio of Mach–Zehnder elements in the latter platform limits their expressivity.
Abstract: Reconfigurable linear optical processors can be used to perform linear transformations and are instrumental in effectively computing matrix–vector multiplications required in each neural network layer. In this paper, we characterize and compare two thermally tuned photonic integrated processors realized in silicon-on-insulator and silicon nitride platforms suited for extracting feature maps in convolutional neural networks. The reduction in bit resolution when crossing the processor is mainly due to optical losses, in the range 2.3–3.3 for the silicon-on-insulator chip and in the range 1.3–2.4 for the silicon nitride chip. However, the lower extinction ratio of Mach–Zehnder elements in the latter platform limits their expressivity (i.e., the capacity to implement any transformation) to 75%, compared to 97% of the former. Finally, the silicon-on-insulator processor outperforms the silicon nitride one in terms of footprint and energy efficiency.

Journal ArticleDOI
TL;DR: In this paper, a 3-way chip-to-chip communication via 3D photonic structure has been proposed, where the mechanism of the work is understood with the help of absorbance and reflectance of the s...
Abstract: A proposal is made in this paper to realize 3-way chip-to-chip communication via a 3D photonic structure. The mechanism of the work is understood with the help of absorbance and reflectance of the s...


Journal ArticleDOI
TL;DR: In this article, the authors investigated the mechanism of chip segmentation by combining numerical and experimental methods and found that the adiabatic shear band (ASB) is generated not only from the chip root but also from the free chip surface.
Abstract: Chip segmentation causes fluctuation of the cutting force and worsens tool wear and surface finish, and thereby plays an important role in the machining process. Although extensive research has been carried out on segmented chip formation, the mechanism of chip segmentation has remained under debate. This paper aims to investigate the mechanism by combining numerical and experimental methods. Finite element (FE) models of the orthogonal cutting process of A2024–T351 aluminum alloy and Ti6Al4V titanium alloy were developed with three numerical formulations: Lagrangian (LAG), arbitrary Lagrangian–Eulerian (ALE), and coupled Eulerian and Lagrangian (CEL). The appropriate model for predicting the segmented chip formation process was selected by systematic comparison. The mechanism of chip segmentation was then thoroughly investigated with the selected numerical model. The model revealed that the adiabatic shear band (ASB) is generated not only from the chip root but also from the free chip surface. The finding was then validated by observing the microstructure of chips from high-speed dry-cutting tests.