Journal ArticleDOI

An Analog Neural Network Computing Engine Using CMOS-Compatible Charge-Trap-Transistor (CTT)

TL;DR: An analog neural network computing engine based on the CMOS-compatible charge-trap transistor (CTT) is proposed; applied to MNIST digit classification, it achieves performance comparable to state-of-the-art fully connected neural networks using 8-bit fixed-point resolution.
Abstract: An analog neural network computing engine based on the CMOS-compatible charge-trap transistor (CTT) is proposed in this paper. CTT devices are used as analog multipliers. Compared to digital multipliers, the CTT-based analog multiplier shows significant area and power reductions. The proposed computing engine is composed of a scalable CTT multiplier array and energy-efficient analog-digital interfaces. By implementing the sequential analog fabric, the engine's mixed-signal interfaces are simplified, and the hardware overhead remains constant regardless of the size of the array. A proof-of-concept 784-by-784 CTT computing engine is implemented in TSMC 28-nm CMOS technology and occupies 0.68 mm². The simulated performance achieves 76.8 TOPS (8-bit) at a 500-MHz clock frequency while consuming 14.8 mW. As an example, we use this computing engine to address a classic pattern recognition problem, classifying handwritten digits from the MNIST database, and obtain performance comparable to state-of-the-art fully connected neural networks using 8-bit fixed-point resolution.
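For intuition, here is a minimal NumPy sketch of the 8-bit fixed-point multiply-accumulate scheme the abstract describes, with each weight idealized as one of 256 programmable CTT levels. This is not the authors' implementation: the 784-by-784 shape comes from the paper, while the weights, input, and scaling choices are illustrative assumptions.

```python
# Minimal sketch of an 8-bit fixed-point MAC over a 784x784 weight array.
import numpy as np

def quantize_8bit(x, scale):
    """Map real values to signed 8-bit integers, an idealized stand-in
    for programming a CTT cell to one of 256 analog levels."""
    return np.clip(np.round(x / scale), -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
W = rng.standard_normal((784, 784)) * 0.05   # trained weights (placeholder)
x = rng.random(784)                          # flattened 28x28 MNIST image

w_scale, x_scale = np.abs(W).max() / 127, np.abs(x).max() / 127
Wq = quantize_8bit(W, w_scale)
xq = quantize_8bit(x, x_scale)

# Integer MAC (what the analog array computes, one column per output),
# then rescale back to real units at the ADC/digital interface.
y = (Wq.astype(np.int32) @ xq.astype(np.int32)) * (w_scale * x_scale)
print(y.shape)   # (784,)
```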
Citations
Journal ArticleDOI
TL;DR: In this article, a multifunctional infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT) is presented, which can receive optical images transmitted over a broad spectral range in the infrared and perform inference computation to process and recognize the images with 92% accuracy.
Abstract: Image sensors with internal computing capability enable in-sensor computing that can significantly reduce the communication latency and power consumption for machine vision in distributed systems and robotics. Two-dimensional semiconductors have many advantages in realizing such intelligent vision sensors because of their tunable electrical and optical properties and amenability to heterogeneous integration. Here, we report a multifunctional infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT). By controlling the stored charges in the gate dielectric layers electrically and optically, the bP-PPT's electrical conductance and photoresponsivity can be locally or remotely programmed with 5-bit precision to implement an in-sensor convolutional neural network (CNN). The sensor array can receive optical images transmitted over a broad spectral range in the infrared and perform inference computation to process and recognize the images with 92% accuracy. The demonstrated bP image sensor array can be scaled up to build a more complex vision-sensory neural network, which will find many promising applications for distributed and remote multispectral sensing.
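As a rough illustration of the in-sensor CNN idea, the sketch below treats each pixel's programmable photoresponsivity as a convolution weight snapped to 5-bit precision (the figure reported in the abstract). The kernel, image, and quantization scheme are assumptions for illustration, not the authors' device model.

```python
import numpy as np

def quantize_levels(w, n_bits=5):
    """Snap weights to 2**n_bits uniformly spaced responsivity levels."""
    levels = 2 ** n_bits
    lo, hi = w.min(), w.max()
    step = (hi - lo) / (levels - 1)
    return lo + np.round((w - lo) / step) * step

rng = np.random.default_rng(1)
kernel = quantize_levels(rng.standard_normal((3, 3)))  # 5-bit "responsivities"
image = rng.random((32, 32))                           # stand-in intensity map

# The sensor sums photocurrents (responsivity x intensity) over each patch,
# i.e. a plain valid-mode 2D correlation performed by the pixels themselves:
out = np.empty((30, 30))
for i in range(30):
    for j in range(30):
        out[i, j] = np.sum(kernel * image[i:i+3, j:j+3])
```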

32 citations

Posted Content
TL;DR: In this article, a multispectral infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT) is presented.
Abstract: Image sensors with internal computing capability enable in-sensor computing that can significantly reduce the communication latency and power consumption for machine vision in distributed systems and robotics. Two-dimensional semiconductors are uniquely advantageous in realizing such intelligent vision sensors because of their tunable electrical and optical properties and amenability to heterogeneous integration. Here, we report a multifunctional infrared image sensor based on an array of black phosphorus programmable phototransistors (bP-PPT). By controlling the stored charges in the gate dielectric layers electrically and optically, the bP-PPT's electrical conductance and photoresponsivity can be locally or remotely programmed with high precision to implement an in-sensor convolutional neural network (CNN). The sensor array can receive optical images transmitted over a broad spectral range in the infrared and perform inference computation to process and recognize the images with 92% accuracy. The demonstrated multispectral infrared imaging and in-sensor computing with the black phosphorus optoelectronic sensor array can be scaled up to build a more complex vision neural network, which will find many promising applications for distributed and remote multispectral sensing.

32 citations

Journal ArticleDOI
TL;DR: In this article, the authors present three-terminal synaptic devices based on the ferroelectric field-effect transistor (FeFET) and discuss the switching physics of the intermediate states, back-end-of-line (BEOL) integration, and 3D NAND architecture design.
Abstract: Neuro-inspired deep learning algorithms have shown a promising future for artificial intelligence (AI). Despite the remarkable progress of software-based neural networks, the traditional von Neumann hardware architecture suffers from limited energy efficiency when facing unprecedentedly large amounts of data. To meet the performance requirements of neuro-inspired computing, large-scale vector-matrix multiplication is preferably performed in situ, namely compute-in-memory (CIM). Non-volatile memory devices with different materials have been proposed for weight storage as synaptic devices. Among them, HfO2-based ferroelectric devices have attracted great attention because of their low energy consumption, good CMOS compatibility, and multi-bit-per-cell potential. In this review, recent trends and prospects of ferroelectric synaptic devices are surveyed. First, we present the three-terminal synaptic devices based on the ferroelectric field-effect transistor (FeFET) and discuss the switching physics of the intermediate states, back-end-of-line (BEOL) integration, and 3D NAND architecture design. Then, we introduce the hybrid-precision synapse concept, which leverages the volatile charge on the gate capacitor of the FeFET together with the non-volatile polarization of its gate dielectric. Lastly, we review two-terminal synaptic devices using the ferroelectric tunnel junction (FTJ) and the ferroelectric capacitor (FeCAP), and analyze the design margins of crossbar arrays built from them.
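A toy sketch of the hybrid-precision synapse concept described above: the most significant bits of a weight live in the non-volatile ferroelectric polarization, while the least significant bits live as volatile charge on the FeFET gate capacitor and absorb frequent training updates. The 4/4 bit split and the integer encoding are illustrative assumptions, not figures from the review.

```python
LSB_BITS = 4                         # assumed 4 MSB / 4 LSB split

def split_weight(w_int):
    """Decompose an 8-bit unsigned weight into NV MSBs and volatile LSBs."""
    msb = w_int >> LSB_BITS          # written rarely: polarization switching
    lsb = w_int & (2**LSB_BITS - 1)  # updated every step: capacitor charge
    return msb, lsb

def merge_weight(msb, lsb):
    """Recombine the two fields into the effective synaptic weight."""
    return (msb << LSB_BITS) | lsb

w = 0b10110110                       # example 8-bit weight (182)
msb, lsb = split_weight(w)
assert merge_weight(msb, lsb) == w
```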

18 citations

Journal ArticleDOI
TL;DR: A comprehensive investigation of the programming behavior of CTTs, including analog retention, intra- and inter-device variation, and fine-tuning, both for individual devices and for devices in an integrated array, reveals the promise of the CTT as a CMOS-only analog memory device.
Abstract: Since our demonstration of unsupervised learning using CMOS-only charge-trap transistors (CTTs) as analog synapses, there has been increasing interest in exploiting the device for various other neural network (NN) applications. However, most of these studies are limited to mere simulation due to the absence of detailed experimental device characterization. In this article, we provide a comprehensive investigation of the programming behavior of CTTs, including analog retention, intra- and inter-device variation, and fine-tuning of the device, both for individual devices and for devices in an integrated array. It is found that, after programming, the channel current gradually increases to a higher level, and the shift is larger when the device is programmed to a higher threshold voltage. With this post-programming current increase appropriately accounted for, individual devices can be programmed to an equivalent precision of five bits, and three bits can be achieved for devices in an array. Our results reveal the promising future of using the CTT as a CMOS-only analog memory device.
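The fine-tuning described in the abstract suggests a program-and-verify loop that deliberately undershoots the target so that the post-programming current increase carries the cell onto the target level. The sketch below captures that idea; the drift fraction, tolerance, and pulse interface are invented placeholders, not measured CTT behavior.

```python
def program_verify(read_current, apply_pulse, target, drift_frac=0.03,
                   tol=0.01, max_pulses=50):
    """Program toward target*(1 - drift_frac): the cell is left slightly
    low on purpose so the gradual post-programming current increase
    carries it onto the target level."""
    effective_target = target * (1.0 - drift_frac)
    for _ in range(max_pulses):
        i = read_current()                             # verify step
        if abs(i - effective_target) <= tol * target:
            return True                                # converged
        apply_pulse(positive=(i < effective_target))   # nudge up or down
    return False                                       # did not converge
```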

18 citations


Cites methods from "An Analog Neural Network Computing ..."

  • For example, in [34], an analog inference engine using CTTs as the analog synapses was proposed, and the simulations show a significant performance edge over conventional digital approaches.


Proceedings ArticleDOI
Yufei Ma, Yuan Du, Li Du, Jun Lin, Zhongfeng Wang
07 Sep 2020
TL;DR: Recent trends in IMC, from techniques (SRAM, flash, RRAM, and other types of non-volatile memory) to architectures and applications, are surveyed to serve as a guide to future advances in computing in-memory (CIM).
Abstract: To overcome the memory bottleneck of the von Neumann architecture, various memory-centric computing techniques are emerging to reduce the latency and energy consumption caused by data communication. The great success of artificial intelligence (AI) algorithms, which involve large numbers of computations and data movements, has motivated and accelerated recent research on in-memory computing (IMC) techniques that significantly reduce or even eliminate off-chip data accesses; the memory then not only stores data but can also directly output computation results. For example, the multiply-and-accumulate (MAC) operations in deep learning algorithms can be realized by accessing the memory using the input activations. This paper investigates the recent trends of IMC, from techniques (SRAM, flash, RRAM, and other types of non-volatile memory) to architecture and applications, serving as a guide to future advances in computing in-memory (CIM).
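The MAC-by-memory-access idea reduces to Ohm's and Kirchhoff's laws on a crossbar: stored conductances multiply applied wordline voltages, and each bitline sums the resulting currents into one dot product. A minimal NumPy sketch with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(2)
G = rng.uniform(1e-6, 1e-4, size=(128, 64))  # cell conductances (siemens)
V = rng.uniform(0.0, 0.2, size=128)          # input activations as voltages

# Each cell contributes I = G*V (Ohm's law) and each bitline sums its
# column's currents (Kirchhoff's current law): one analog MAC per column.
I_bitline = V @ G                            # shape (64,): dot products
```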

11 citations


Cites background or methods from "An Analog Neural Network Computing ..."

  • We then survey recent computing-in-memory research based on different types of NVM devices, including Flash memory, ReRAM, STT-MRAM, PCM, and emerging non-volatile devices [38].


  • KEYWORDS: In-memory computing (IMC), SRAM, NVM, deep learning, convolutional neural networks (CNNs).


  • [Row from the citing paper's comparison table, listing this work (Ref. [34]) alongside Refs. [22], [23], [38], [25], [26], and [30].]


References
Proceedings ArticleDOI
27 Jun 2016
TL;DR: In this article, the authors propose a residual learning framework to ease the training of networks substantially deeper than those used previously; an ensemble of their residual nets won 1st place on the ILSVRC 2015 classification task.
Abstract: Deeper neural networks are more difficult to train. We present a residual learning framework to ease the training of networks that are substantially deeper than those used previously. We explicitly reformulate the layers as learning residual functions with reference to the layer inputs, instead of learning unreferenced functions. We provide comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth. On the ImageNet dataset we evaluate residual nets with a depth of up to 152 layers—8× deeper than VGG nets [40] but still having lower complexity. An ensemble of these residual nets achieves 3.57% error on the ImageNet test set. This result won the 1st place on the ILSVRC 2015 classification task. We also present analysis on CIFAR-10 with 100 and 1000 layers. The depth of representations is of central importance for many visual recognition tasks. Solely due to our extremely deep representations, we obtain a 28% relative improvement on the COCO object detection dataset. Deep residual nets are foundations of our submissions to ILSVRC & COCO 2015 competitions1, where we also won the 1st places on the tasks of ImageNet detection, ImageNet localization, COCO detection, and COCO segmentation.
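The residual reformulation is simple to state in code: the block learns F(x) and outputs x + F(x), making the identity mapping the easy default. Below is a minimal dense (non-convolutional) sketch with illustrative shapes; the paper's actual blocks are convolutional with batch normalization.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def residual_block(x, W1, W2):
    """Output x + F(x): the layers learn the residual F, and the
    identity shortcut makes 'do nothing' the easy-to-learn default."""
    return relu(x + W2 @ relu(W1 @ x))

rng = np.random.default_rng(3)
x = rng.standard_normal(64)
W1 = rng.standard_normal((64, 64)) * 0.1
W2 = rng.standard_normal((64, 64)) * 0.1
y = residual_block(x, W1, W2)    # same shape as x, so blocks can stack
```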

123,388 citations

Proceedings Article
03 Dec 2012
TL;DR: A large, deep convolutional neural network achieved state-of-the-art ImageNet classification performance, as discussed by the authors; it consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully connected layers with a final 1000-way softmax.
Abstract: We trained a large, deep convolutional neural network to classify the 1.2 million high-resolution images in the ImageNet LSVRC-2010 contest into the 1000 different classes. On the test data, we achieved top-1 and top-5 error rates of 37.5% and 17.0%, which is considerably better than the previous state-of-the-art. The neural network, which has 60 million parameters and 650,000 neurons, consists of five convolutional layers, some of which are followed by max-pooling layers, and three fully-connected layers with a final 1000-way softmax. To make training faster, we used non-saturating neurons and a very efficient GPU implementation of the convolution operation. To reduce overfitting in the fully-connected layers we employed a recently-developed regularization method called "dropout" that proved to be very effective. We also entered a variant of this model in the ILSVRC-2012 competition and achieved a winning top-5 test error rate of 15.3%, compared to 26.2% achieved by the second-best entry.
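The dropout regularizer named here is easy to sketch. The version below is the now-common "inverted" variant, which rescales at training time rather than at test time as the original paper did; the rate p=0.5 matches the value used in the paper's fully connected layers.

```python
import numpy as np

def dropout(activations, p=0.5, rng=None):
    """Inverted dropout: zero each unit with probability p at train time
    and rescale the survivors, leaving the test-time pass unchanged."""
    rng = rng or np.random.default_rng()
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

h = np.ones(8)
print(dropout(h))   # about half the units zeroed, the rest scaled to 2.0
```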

73,978 citations

Proceedings Article
04 Sep 2014
TL;DR: This work investigates the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting using an architecture with very small convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
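The parameter savings behind stacking small filters are a one-line calculation: for C input and output channels, two 3x3 layers cover a 5x5 receptive field with 18C² weights versus 25C², and three 3x3 layers cover 7x7 with 27C² versus 49C². A quick check with an illustrative channel count:

```python
# Parameter counts for stacked small filters vs. one large filter,
# for C input channels and C output channels (biases ignored).
C = 256                            # illustrative channel count
two_3x3   = 2 * (3 * 3 * C * C)    # 5x5 receptive field: 1,179,648
one_5x5   = 5 * 5 * C * C          # same receptive field: 1,638,400
three_3x3 = 3 * (3 * 3 * C * C)    # 7x7 receptive field: 1,769,472
one_7x7   = 7 * 7 * C * C          # same receptive field: 3,211,264
print(two_3x3 < one_5x5, three_3x3 < one_7x7)   # True True
```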

55,235 citations

Proceedings Article
01 Jan 2015
TL;DR: In this paper, the authors investigated the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting and showed that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 layers.
Abstract: In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3x3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16-19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.

49,914 citations

Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, gradient-based learning is shown to synthesize complex decision surfaces that classify high-dimensional patterns such as handwritten characters, and a graph transformer network (GTN) paradigm is proposed for globally training multi-module recognition systems.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient-based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.
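The convolution-plus-subsampling pattern credited here for handling 2D shape variability is compact to sketch: a shared kernel slides over the image (translation-equivariant feature detection), then pooling discards small spatial shifts. The kernel and image below are illustrative placeholders, with average pooling standing in for LeNet-style subsampling.

```python
import numpy as np

def conv2d_valid(img, kernel):
    """Slide a shared kernel over the image (valid mode, stride 1)."""
    kh, kw = kernel.shape
    H, W = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(img[i:i+kh, j:j+kw] * kernel)
    return out

def avg_pool2x2(fm):
    """2x2 average pooling (assumes even height and width)."""
    return (fm[0::2, 0::2] + fm[0::2, 1::2]
            + fm[1::2, 0::2] + fm[1::2, 1::2]) / 4.0

img = np.random.default_rng(5).random((28, 28))   # MNIST-sized input
fmap = avg_pool2x2(conv2d_valid(img, np.ones((3, 3)) / 9.0))
print(fmap.shape)                                  # (13, 13)
```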

42,067 citations