Author

Li Du

Other affiliations: Southeast University, Qualcomm
Bio: Li Du is an academic researcher from the University of California, Los Angeles. The author has contributed to research in the topics of computer science and convolution. The author has an h-index of 10 and has co-authored 37 publications receiving 347 citations. Previous affiliations of Li Du include Southeast University and Qualcomm.

Papers published on a yearly basis

Papers
Journal ArticleDOI
TL;DR: In this paper, a streaming hardware accelerator for CNN-based image detection in Internet of Things (IoT) devices is proposed; it supports an arbitrary convolution window size through filter decomposition, and the max-pooling function can be computed in parallel with convolution.
Abstract: Convolutional neural networks (CNNs) offer significant accuracy in image detection. To implement image detection using a CNN in Internet of Things (IoT) devices, a streaming hardware accelerator is proposed. The proposed accelerator optimizes energy efficiency by avoiding unnecessary data movement. With a unique filter decomposition technique, the accelerator can support an arbitrary convolution window size. In addition, the max-pooling function can be computed in parallel with convolution using a separate pooling unit, thus improving throughput. A prototype accelerator was implemented in TSMC 65-nm technology with a core size of 5 mm2. The accelerator supports major CNNs and achieves a peak throughput of 152 GOPS and an energy efficiency of 434 GOPS/W at 350 mW, making it a promising hardware accelerator for intelligent IoT devices.
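As a rough illustration of the shift-and-accumulate idea behind such a filter decomposition (the abstract does not spell out the exact dataflow, so the 3x3 tile size, the zero-padding scheme, and the function names below are assumptions for this sketch), a behavioral NumPy model can decompose a large kernel into small sub-kernels and sum the shifted partial results:

import numpy as np

def corr2d(img, ker):
    # Plain "valid" 2-D cross-correlation (the CNN convention).
    k = ker.shape[0]
    H, W = img.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for y in range(out.shape[0]):
        for x in range(out.shape[1]):
            out[y, x] = np.sum(img[y:y + k, x:x + k] * ker)
    return out

def corr2d_decomposed(img, ker, tile=3):
    # Emulate arbitrary window sizes with small fixed-size passes: zero-pad the
    # kernel to a multiple of the tile size, run one tile x tile pass per
    # sub-kernel, then shift-and-accumulate the partial sums.
    k = ker.shape[0]
    k_pad = -(-k // tile) * tile
    pad = k_pad - k
    ker_p = np.zeros((k_pad, k_pad))
    ker_p[:k, :k] = ker
    img_p = np.pad(img, ((0, pad), (0, pad)), mode="constant")
    H, W = img.shape
    out = np.zeros((H - k + 1, W - k + 1))
    for i in range(0, k_pad, tile):
        for j in range(0, k_pad, tile):
            sub = ker_p[i:i + tile, j:j + tile]
            if not sub.any():
                continue                      # all-zero sub-kernel, nothing to add
            part = corr2d(img_p, sub)         # one small tile x tile pass
            out += part[i:i + out.shape[0], j:j + out.shape[1]]
    return out

# Sanity check: the decomposed result matches a direct 5x5 convolution.
img = np.random.rand(16, 16)
ker = np.random.rand(5, 5)
assert np.allclose(corr2d(img, ker), corr2d_decomposed(img, ker))

One way to read the hardware analogy is that each small pass maps onto a fixed-size multiply-accumulate array, while the shift-and-accumulate step reuses partial sums on chip instead of moving data back to memory.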

175 citations

Journal ArticleDOI
TL;DR: The adaptive multiband scheme mitigates equalization requirements and enhances the energy efficiency by avoiding frequency notches and utilizing the maximum available signal-to-noise ratio and channel bandwidth.
Abstract: A cognitive tri-band transmitter (TX) with a forwarded clock using multiband signaling and high-order digital signal modulations is presented for serial-link applications. The TX learns an arbitrary channel response by sending a sweep of continuous-wave tones, detecting the power level at the receiver side, and then adapting the modulation scheme, data bandwidth, and carrier frequencies according to the detected channel information. The supported modulation schemes range from non-return-to-zero (NRZ)/quadrature phase-shift keying (QPSK) to 16-level pulse-amplitude modulation (PAM-16)/256-quadrature amplitude modulation (256-QAM). The proposed highly reconfigurable TX can cope with low-cost serial channels, such as low-cost connectors, cables, or multidrop buses with deep and narrow notches in the frequency domain (e.g., a 40-dB loss at notches). The adaptive multiband scheme mitigates equalization requirements and enhances energy efficiency by avoiding frequency notches and utilizing the maximum available signal-to-noise ratio and channel bandwidth. The implemented TX prototype consumes 14.7 mW and occupies 0.016 mm2 in a 28-nm CMOS process. It achieves a maximum data rate of 16 Gb/s with a forwarded clock through one differential pair and a best-reported energy-efficiency figure of merit of 20.4 µW/Gb/s/dB, calculated from the power consumed to transmit each gigabit per second of data while simultaneously overcoming each decibel of worst-case channel loss within the Nyquist frequency.
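The figure of merit can be checked against the quoted numbers. The worst-case in-band loss is not stated in this abstract, so the roughly 45-dB value below is back-solved from the reported power, data rate, and FoM rather than taken from the paper:

power_uW = 14.7e3      # 14.7 mW transmitter power, expressed in microwatts
rate_gbps = 16.0       # maximum data rate, Gb/s
fom_reported = 20.4    # reported FoM, uW / (Gb/s) / dB

# Back-solve the implied worst-case channel loss within the Nyquist frequency.
loss_dB = power_uW / (rate_gbps * fom_reported)
print(f"implied worst-case loss: {loss_dB:.1f} dB")        # ~45 dB

# Recompute the FoM from its definition: power per Gb/s of data per dB of loss overcome.
print(f"FoM = {power_uW / (rate_gbps * loss_dB):.1f} uW/Gb/s/dB")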

32 citations

Journal ArticleDOI
TL;DR: An analog neural network computing engine based on a CMOS-compatible charge-trap transistor (CTT) is proposed and achieves performance comparable to state-of-the-art fully connected neural networks using 8-bit fixed-point resolution.
Abstract: An analog neural network computing engine based on CMOS-compatible charge-trap transistors (CTTs) is proposed in this paper. CTT devices are used as analog multipliers. Compared to digital multipliers, the CTT-based analog multiplier shows significant area and power reduction. The proposed computing engine is composed of a scalable CTT multiplier array and energy-efficient analog-digital interfaces. By implementing the sequential analog fabric, the engine's mixed-signal interfaces are simplified, and the hardware overhead remains constant regardless of the size of the array. A proof-of-concept 784-by-784 CTT computing engine is implemented in TSMC 28-nm CMOS technology and occupies 0.68 mm2. The simulated performance achieves 76.8 TOPS (8-bit) at a 500-MHz clock frequency while consuming 14.8 mW. As an example, we use this computing engine to address a classic pattern-recognition problem, classifying handwritten digits from the MNIST database, and obtain performance comparable to state-of-the-art fully connected neural networks using 8-bit fixed-point resolution.
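As a purely behavioral sketch of what such an array computes (not the mixed-signal circuit itself; the random weights, the input vector, and the quantize helper below are illustrative assumptions), a 784-by-784 analog multiplier array performing one matrix-vector product at 8-bit resolution can be modeled as:

import numpy as np

def quantize(x, bits=8):
    # Uniform signed fixed-point quantization, behavioral only.
    scale = (2 ** (bits - 1) - 1) / np.max(np.abs(x))
    return np.round(x * scale) / scale

rng = np.random.default_rng(0)
W = rng.standard_normal((784, 784))   # weights, e.g. one fully connected MNIST layer
x = rng.standard_normal(784)          # flattened 28 x 28 input image

# Each cell multiplies one weight by one input; summing along a column models the
# analog current summation, so the whole array computes one matrix-vector product.
y_analog = quantize(W) @ quantize(x)
y_ideal = W @ x
print(np.max(np.abs(y_analog - y_ideal)))   # residual error comes from quantization only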

26 citations

Journal ArticleDOI
TL;DR: This brief discusses an oscillator-based capacitive 3-D touch-sensing circuit for mobile devices that uses correlated double sampling to achieve a high sensing resolution in the Z-direction and employs bootstrapping circuitry to reduce the mobile screen's interchannel-coupling effects.
Abstract: This brief discusses an oscillator-based capacitive 3-D touch-sensing circuit for mobile devices. The proposed 3-D touch sensor uses correlated double sampling to achieve a high sensing resolution in the Z-direction and employs bootstrapping circuitry to reduce the mobile screen's interchannel-coupling effects. Additionally, to reduce chip area and simplify assembly, the sensing oscillator is implemented with inverter-based active resonators instead of either on- or off-chip inductors. The prototyped 3-D touch sensor is fabricated in a 65-nm CMOS process, occupies an area of 2 mm2, and consumes 2.3 mW from a 1-V power supply. Measured together with a 3.4-inch HTC standard mobile screen, the sensor achieves an 11-cm Z-direction sensing range with a 1-cm resolution, demonstrating the feasibility of 3-D finger-position sensing in a mobile device.
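A toy numerical model of why correlated double sampling helps in the Z-direction is sketched below; the drift magnitude, noise level, and the 0.02 "touch" offset are made-up numbers for illustration, not measurements from the brief:

import numpy as np

rng = np.random.default_rng(1)
n = 1000
drift = 0.5 + 0.01 * np.arange(n) / n        # slow offset drift common to both samples
touch = 0.02                                  # small capacitance shift caused by the finger

ref = drift + rng.normal(0.0, 0.005, n)              # reference sample (no touch)
sig = drift + touch + rng.normal(0.0, 0.005, n)      # signal sample taken shortly after

cds = sig - ref                      # correlated double sampling: the shared drift cancels
print(f"recovered touch signal: {cds.mean():.3f}")   # ~0.02 despite the drift and noise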

17 citations

Patent
16 May 2018
TL;DR: A convolution operation device is disclosed that includes a convolution calculation module, a memory, and a buffer device coupled to both; the buffer device retrieves a plurality of new data from the memory and inputs the new data to each of the convolution units.
Abstract: A convolution operation device includes a convolution calculation module, a memory and a buffer device. The convolution calculation module has a plurality of convolution units, and each convolution unit performs a convolution operation according to a filter and a plurality of current data, and leaves a part of the current data after the convolution operation. The buffer device is coupled to the memory and the convolution calculation module for retrieving a plurality of new data from the memory and inputting the new data to each of the convolution units. The new data are not a duplicate of the current data. A convolution operation method is also disclosed.
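A software sketch of the data-reuse idea described in the claims (the array shapes, the function name, and the read counting are illustrative, not the patented hardware): at each step only the newly needed column is fetched from memory, while the rest of the current window is kept inside the convolution unit.

import numpy as np

def conv_row_with_reuse(rows, kernel):
    # 'rows' is a k x W strip of the image; produce one output row of a k x k
    # convolution, fetching only one new column per step after the first window.
    k = kernel.shape[0]
    W = rows.shape[1]
    window = rows[:, :k].copy()               # initial window: k columns fetched
    columns_fetched = k
    outputs = [float(np.sum(window * kernel))]
    for x in range(k, W):
        new_col = rows[:, x]                  # the only "new data" read from memory
        columns_fetched += 1
        window = np.concatenate([window[:, 1:], new_col[:, None]], axis=1)
        outputs.append(float(np.sum(window * kernel)))
    return np.array(outputs), columns_fetched

rows = np.arange(3 * 8, dtype=float).reshape(3, 8)
out, fetched = conv_row_with_reuse(rows, np.ones((3, 3)))
print(out)        # six output values for a 3x3 kernel sliding over 8 columns
print(fetched)    # 8 columns read, versus 3 * 6 = 18 without reusing the current data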

17 citations


Cited by
01 Jan 2016
Design of Analog CMOS Integrated Circuits (textbook).

912 citations

Journal ArticleDOI
TL;DR: This paper constitutes the first holistic tutorial on the development of ANN-based ML techniques tailored to the needs of future wireless networks and overviews how artificial neural network (ANN)-based ML algorithms can be employed for solving various wireless networking problems.
Abstract: In order to effectively provide ultra-reliable low-latency communications and pervasive connectivity for Internet of Things (IoT) devices, next-generation wireless networks can leverage intelligent, data-driven functions enabled by the integration of machine learning (ML) notions across the wireless core and edge infrastructure. In this context, this paper provides a comprehensive tutorial that overviews how artificial neural network (ANN)-based ML algorithms can be employed for solving various wireless networking problems. For this purpose, we first present a detailed overview of a number of key types of ANNs, including recurrent, spiking, and deep neural networks, that are pertinent to wireless networking applications. For each type of ANN, we present the basic architecture as well as specific examples that are particularly important and relevant to wireless network design, such as echo state networks, liquid state machines, and long short-term memory networks. Then, we provide an in-depth overview of the variety of wireless communication problems that can be addressed using ANNs, ranging from communication using unmanned aerial vehicles to virtual reality applications over wireless networks, as well as edge computing and caching. For each individual application, we present the main motivation for using ANNs along with the associated challenges, provide a detailed example for a use-case scenario, and outline future work that can be addressed using ANNs. In a nutshell, this paper constitutes the first holistic tutorial on the development of ANN-based ML techniques tailored to the needs of future wireless networks.
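For a concrete feel of one of the ANN types the tutorial highlights, a minimal echo state network in NumPy is sketched below on a toy one-step-ahead prediction task; the reservoir size, the weight scaling, and the sine-wave stand-in for a wireless time series are all assumptions made here, not taken from the paper.

import numpy as np

rng = np.random.default_rng(0)
n_in, n_res = 1, 100

# Input and reservoir weights are random and fixed; only the linear readout is trained.
W_in = rng.uniform(-0.5, 0.5, (n_res, n_in))
W_res = rng.uniform(-0.5, 0.5, (n_res, n_res))
W_res *= 0.9 / np.max(np.abs(np.linalg.eigvals(W_res)))   # keep spectral radius below 1

def run_reservoir(u):
    # Collect reservoir states for an input sequence u of shape (T, n_in).
    x = np.zeros(n_res)
    states = []
    for u_t in u:
        x = np.tanh(W_in @ u_t + W_res @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: predict the next sample of a sine wave from the current one.
t = np.arange(0, 20, 0.05)
u = np.sin(t)[:, None]
states = run_reservoir(u[:-1])
target = u[1:, 0]
W_out = np.linalg.lstsq(states, target, rcond=None)[0]    # least-squares readout
print(np.mean((states @ W_out - target) ** 2))            # small training error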

666 citations

Journal ArticleDOI
TL;DR: By consolidating information scattered across the communication, networking, and DL areas, this survey can help readers to understand the connections between enabling technologies while promoting further discussions on the fusion of edge intelligence and intelligent edge, i.e., Edge DL.
Abstract: Ubiquitous sensors and smart devices from factories and communities are generating massive amounts of data, and ever-increasing computing power is driving the core of computation and services from the cloud to the edge of the network. As an important enabler broadly changing people's lives, from face recognition to ambitious smart factories and cities, developments of artificial intelligence (especially deep learning, DL) based applications and services are thriving. However, due to efficiency and latency issues, the current cloud computing service architecture hinders the vision of "providing artificial intelligence for every person and every organization at everywhere". Thus, unleashing DL services using resources at the network edge near the data sources has emerged as a desirable solution. Therefore, edge intelligence, aiming to facilitate the deployment of DL services by edge computing, has received significant attention. In addition, DL, as the representative technique of artificial intelligence, can be integrated into edge computing frameworks to build intelligent edge for dynamic, adaptive edge maintenance and management. With regard to mutually beneficial edge intelligence and intelligent edge, this paper introduces and discusses: 1) the application scenarios of both; 2) the practical implementation methods and enabling technologies, namely DL training and inference in the customized edge computing framework; and 3) challenges and future trends of more pervasive and fine-grained intelligence. We believe that by consolidating information scattered across the communication, networking, and DL areas, this survey can help readers to understand the connections between enabling technologies while promoting further discussions on the fusion of edge intelligence and intelligent edge, i.e., Edge DL.

611 citations

Journal ArticleDOI
TL;DR: A thorough investigation of deep learning in its applications and mechanisms is sought, as a categorical collection of state of the art in deep learning research, to provide a broad reference for those seeking a primer on deep learning and its various implementations, platforms, algorithms, and uses in a variety of smart-world systems.
Abstract: Deep learning has exploded in the public consciousness, primarily as predictive and analytical products suffuse our world, in the form of numerous human-centered smart-world systems, including targeted advertisements, natural language assistants and interpreters, and prototype self-driving vehicle systems. Yet to most, the underlying mechanisms that enable such human-centered smart products remain obscure. In contrast, researchers across disciplines have been incorporating deep learning into their research to solve problems that could not have been approached before. In this paper, we seek to provide a thorough investigation of deep learning in its applications and mechanisms. Specifically, as a categorical collection of state of the art in deep learning research, we hope to provide a broad reference for those seeking a primer on deep learning and its various implementations, platforms, algorithms, and uses in a variety of smart-world systems. Furthermore, we hope to outline recent key advancements in the technology, and provide insight into areas, in which deep learning can improve investigation, as well as highlight new areas of research that have yet to see the application of deep learning, but could nonetheless benefit immensely. We hope this survey provides a valuable reference for new deep learning practitioners, as well as those seeking to innovate in the application of deep learning.

411 citations