Author

Li Jing

Other affiliations: Peking University, Facebook
Bio: Li Jing is an academic researcher from the Massachusetts Institute of Technology. The author has contributed to research in the topics Artificial neural network and Recurrent neural network, has an h-index of 17, and has co-authored 56 publications receiving 1,315 citations. Previous affiliations of Li Jing include Peking University and Facebook.


Papers
Journal Article (DOI)
TL;DR: In this paper, artificial neural networks are used to approximate light scattering by multilayer nanoparticles. The network needs to be trained on only a small sampling of the data to approximate the simulation to high precision.
Abstract: We propose a method to use artificial neural networks to approximate light scattering by multilayer nanoparticles. We find that the network needs to be trained on only a small sampling of the data to approximate the simulation to high precision. Once the neural network is trained, it can simulate such optical processes orders of magnitude faster than conventional simulations. Furthermore, the trained neural network can be used to solve nanophotonic inverse design problems by using backpropagation, where the gradient is analytical, not numerical.
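
To make the two-stage idea concrete, here is a minimal sketch in PyTorch (not the authors' code; the layer sizes, spectrum length, and training details are invented for illustration): a surrogate network stands in for the scattering simulation, and gradient descent on its input performs inverse design using the analytical gradient from backpropagation.

```python
import torch
import torch.nn as nn

# Invented sizes: 5 shell thicknesses in, a 200-point scattering spectrum out.
surrogate = nn.Sequential(
    nn.Linear(5, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, 200))

# Stage 1 (assumed already done): fit `surrogate` on simulated
# (thicknesses, spectrum) pairs from a conventional solver.

# Stage 2: inverse design. Because the surrogate is differentiable, we can
# descend on the *input*, with an analytical gradient via backpropagation.
target = torch.rand(200)                     # desired spectrum (placeholder)
design = torch.rand(5, requires_grad=True)   # initial structure guess
opt = torch.optim.Adam([design], lr=1e-2)

for _ in range(500):
    loss = nn.functional.mse_loss(surrogate(design), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
```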

576 citations

Journal Article (DOI)
TL;DR: The concept of an intelligent (that is, self-adaptive) cloak driven by deep learning is proposed and a metasurface cloak is presented as an example implementation, which exhibits a millisecond response time to an ever-changing incident wave and the surrounding environment.
Abstract: Becoming invisible at will has fascinated humanity for centuries and in the past decade it has attracted a great deal of attention owing to the advent of metamaterials. However, state-of-the-art invisibility cloaks typically work in a deterministic system or in conjunction with outside help to achieve active cloaking. Here, we propose the concept of an intelligent (that is, self-adaptive) cloak driven by deep learning and present a metasurface cloak as an example implementation. In the experiment, the metasurface cloak exhibits a millisecond response time to an ever-changing incident wave and the surrounding environment, without any human intervention. Our work brings the available cloaking strategies closer to a wide range of real-time, in situ applications, such as moving stealth vehicles. The approach opens the way to facilitating other intelligent metadevices in the microwave regime and across the wider electromagnetic spectrum and, more generally, enables automatic solutions of electromagnetic inverse design problems. A deep-learning-enabled metasurface cloak actively self-adapts to take into account changing microwave illumination and varying physical surroundings.
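
A rough sketch of the control loop implied here, with invented sensor and metasurface dimensions (the paper's actual network and hardware interface are not reproduced): once the controller is trained offline, cloaking reduces to a single forward pass, which is what makes a millisecond response plausible.

```python
import torch
import torch.nn as nn

# Invented sizes: 16 probe readings in, one phase code per metasurface cell out.
n_sensors, n_cells = 16, 64

controller = nn.Sequential(                # assumed trained offline on simulated
    nn.Linear(n_sensors, 256), nn.ReLU(),  # (environment, cloaking-solution) pairs
    nn.Linear(256, n_cells), nn.Tanh())

controller.eval()
with torch.no_grad():                      # deployment is inference only:
    sensed = torch.rand(1, n_sensors)      # current incident wave + surroundings
    phases = torch.pi * controller(sensed) # per-cell reflection phase to apply
```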

259 citations

Proceedings Article (DOI)
23 Feb 2018
TL;DR: A method to use artificial neural networks to approximate light scattering by multilayer nanoparticles is proposed, and it is found that the network needs to be trained on only a small sampling of the data in order to approximate the simulation to high precision.
Abstract: We propose a method to use artificial neural networks to approximate light scattering by multilayer nanoparticles. We find the network needs to be trained on only a small sampling of the data in order to approximate the simulation to high precision. Once the neural network is trained, it can simulate such optical processes orders of magnitude faster than conventional simulations. Furthermore, the trained neural network can be used to solve nanophotonic inverse design problems by using backpropagation, where the gradient is analytical, not numerical.

185 citations

Proceedings Article
06 Aug 2017
TL;DR: This work presents a new architecture for implementing Efficient Unitary Neural Networks (EUNNs), and finds that this architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed.
Abstract: Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing Efficient Unitary Neural Networks (EUNNs); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely O(1) per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.
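
The tunable-capacity and O(1)-per-parameter claims can be illustrated with a real-valued (orthogonal) simplification of the construction, not the paper's complex-valued implementation: parameterize the recurrent matrix as a product of layers of disjoint 2x2 rotations, so each parameter is a single angle and the number of layers tunes capacity.

```python
import torch
import torch.nn as nn

class GivensLayer(nn.Module):
    """One layer of disjoint 2x2 rotations on neighbouring coordinate pairs."""
    def __init__(self, dim, offset):
        super().__init__()
        self.n_pairs = (dim - offset) // 2
        self.idx = torch.arange(offset, offset + 2 * self.n_pairs, 2)
        self.theta = nn.Parameter(torch.zeros(self.n_pairs))  # one angle per pair

    def forward(self, h):
        a, b = h[..., self.idx], h[..., self.idx + 1]
        c, s = torch.cos(self.theta), torch.sin(self.theta)
        out = h.clone()
        out[..., self.idx] = c * a - s * b      # rotate each pair by its angle;
        out[..., self.idx + 1] = s * a + c * b  # the map is exactly orthogonal
        return out

class EfficientOrthogonal(nn.Module):
    """Product of rotation layers: depth tunes capacity, one parameter per angle."""
    def __init__(self, dim, depth):
        super().__init__()
        self.layers = nn.ModuleList(GivensLayer(dim, k % 2) for k in range(depth))

    def forward(self, h):
        for layer in self.layers:
            h = layer(h)
        return h

U = EfficientOrthogonal(dim=32, depth=8)  # shallow depth: a restricted subspace
h = U(torch.randn(5, 32))                 # the norm of each row is preserved
```

With enough alternating layers such a product can reach any orthogonal matrix, while truncating the depth restricts it to a subspace, mirroring the paper's tunable-capacity claim.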

168 citations

Journal Article (DOI)
TL;DR: A comprehensive review of quantum cloning can be found in this article. The authors give a complete description of the important developments in quantum cloning and some related topics, and present detailed formulations on which further study can be based.
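
For context, the result that makes perfect copying impossible, and approximate cloning machines worth reviewing, is the no-cloning theorem; a one-line version of the standard linearity argument:

```latex
\[
  U\bigl(\lvert\psi\rangle \otimes \lvert 0\rangle\bigr)
    = \lvert\psi\rangle \otimes \lvert\psi\rangle
  \;\;\text{for all } \lvert\psi\rangle
  \;\Longrightarrow\;
  \langle\psi\vert\phi\rangle = \langle\psi\vert\phi\rangle^{2}
  \;\Longrightarrow\;
  \langle\psi\vert\phi\rangle \in \{0,1\}.
\]
```

Only mutually orthogonal states can be copied exactly, which is why the formulations collected in such a review concern cloning machines that are optimal but necessarily approximate.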

104 citations


Cited by
Posted Content
TL;DR: A systematic evaluation of generic convolutional and recurrent architectures for sequence modeling concludes that the common association between sequence modeling and recurrent networks should be reconsidered, and that convolutional networks should be regarded as a natural starting point for sequence modeling tasks.
Abstract: For most deep learning practitioners, sequence modeling is synonymous with recurrent networks. Yet recent results indicate that convolutional architectures can outperform recurrent networks on tasks such as audio synthesis and machine translation. Given a new sequence modeling task or dataset, which architecture should one use? We conduct a systematic evaluation of generic convolutional and recurrent architectures for sequence modeling. The models are evaluated across a broad range of standard tasks that are commonly used to benchmark recurrent networks. Our results indicate that a simple convolutional architecture outperforms canonical recurrent networks such as LSTMs across a diverse range of tasks and datasets, while demonstrating longer effective memory. We conclude that the common association between sequence modeling and recurrent networks should be reconsidered, and convolutional networks should be regarded as a natural starting point for sequence modeling tasks. To assist related work, we have made code available at this http URL .
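
The building block behind such convolutional sequence models is a causal, dilated 1-D convolution; a minimal sketch of the idea (a generic TCN-style stack with residuals and nonlinearities stripped out, not the paper's released code):

```python
import torch
import torch.nn as nn

class CausalConv1d(nn.Module):
    """1-D convolution that only looks at the past (causal padding)."""
    def __init__(self, channels, kernel_size=3, dilation=1):
        super().__init__()
        self.trim = (kernel_size - 1) * dilation  # future outputs to cut off
        self.conv = nn.Conv1d(channels, channels, kernel_size,
                              padding=self.trim, dilation=dilation)

    def forward(self, x):                  # x: (batch, channels, time)
        return self.conv(x)[..., :-self.trim] if self.trim else self.conv(x)

# Exponentially growing dilation gives long effective memory (receptive field
# ~2^depth) with no recurrence to unroll, so training parallelizes over time.
tcn = nn.Sequential(*[CausalConv1d(32, dilation=2 ** k) for k in range(6)])
out = tcn(torch.randn(8, 32, 100))         # same length out: (8, 32, 100)
```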

2,776 citations

Journal Article (DOI)
TL;DR: The LSTM cell and its variants are reviewed to explore the learning capacity of the LSTM cell, and LSTM networks are divided into two broad categories: LSTM-dominated networks and integrated LSTM networks.
Abstract: Recurrent neural networks (RNNs) have been widely adopted in research areas concerned with sequential data, such as text, audio, and video. However, RNNs consisting of sigma cells or tanh cells are...
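
For reference, the standard LSTM cell update that the review's taxonomy is built around can be written out in a few lines (illustrative PyTorch with the usual gate ordering assumed, not code from the review):

```python
import torch

def lstm_cell(x, h, c, W_x, W_h, b):
    """Standard LSTM cell update, written out gate by gate."""
    gates = x @ W_x.T + h @ W_h.T + b    # all four gates in one matmul
    i, f, g, o = gates.chunk(4, dim=-1)  # input, forget, candidate, output
    i, f, o = i.sigmoid(), f.sigmoid(), o.sigmoid()
    g = g.tanh()
    c_next = f * c + i * g               # forget old memory, write new memory
    h_next = o * c_next.tanh()           # expose a gated view of the memory
    return h_next, c_next

D, H = 10, 20
x, h, c = torch.randn(3, D), torch.zeros(3, H), torch.zeros(3, H)
h, c = lstm_cell(x, h, c, torch.randn(4 * H, D), torch.randn(4 * H, H),
                 torch.zeros(4 * H))
```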

1,561 citations

Posted Content
TL;DR: This paper proposes the weight-dropped LSTM, which uses DropConnect on hidden-to-hidden weights as a form of recurrent regularization, and introduces NT-ASGD, a variant of the averaged stochastic gradient method wherein the averaging trigger is determined using a non-monotonic condition as opposed to being tuned by the user.
Abstract: Recurrent neural networks (RNNs), such as long short-term memory networks (LSTMs), serve as a fundamental building block for many sequence learning tasks, including machine translation, language modeling, and question answering. In this paper, we consider the specific problem of word-level language modeling and investigate strategies for regularizing and optimizing LSTM-based models. We propose the weight-dropped LSTM which uses DropConnect on hidden-to-hidden weights as a form of recurrent regularization. Further, we introduce NT-ASGD, a variant of the averaged stochastic gradient method, wherein the averaging trigger is determined using a non-monotonic condition as opposed to being tuned by the user. Using these and other regularization strategies, we achieve state-of-the-art word level perplexities on two data sets: 57.3 on Penn Treebank and 65.8 on WikiText-2. In exploring the effectiveness of a neural cache in conjunction with our proposed model, we achieve an even lower state-of-the-art perplexity of 52.8 on Penn Treebank and 52.0 on WikiText-2.
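
A hedged sketch of the weight-dropping idea (not the released AWD-LSTM code; `WeightDropLSTMCell` is an invented name): DropConnect zeroes individual hidden-to-hidden connections rather than whole activations, and in the paper the mask is sampled once per sequence and reused at every step.

```python
import torch
import torch.nn as nn
from torch.func import functional_call  # PyTorch >= 2.0

class WeightDropLSTMCell(nn.Module):
    """DropConnect on the hidden-to-hidden weights of a standard LSTM cell."""
    def __init__(self, input_size, hidden_size, p=0.5):
        super().__init__()
        self.cell = nn.LSTMCell(input_size, hidden_size)
        self.p = p

    def forward(self, x, state, mask=None):
        if not self.training or self.p == 0:
            return self.cell(x, state)
        if mask is None:
            # Sample one mask per sequence and pass it back in at each step.
            w = self.cell.weight_hh
            mask = torch.bernoulli(torch.full_like(w, 1 - self.p)) / (1 - self.p)
        dropped = {"weight_hh": self.cell.weight_hh * mask}
        return functional_call(self.cell, dropped, (x, state))
```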

899 citations

Journal Article (DOI)
TL;DR: A tandem neural network architecture is demonstrated that tolerates inconsistent training instances in the inverse design of nanophotonic devices and provides a way to train large neural networks for the inverse design of complex photonic structures.
Abstract: Data inconsistency leads to a slow training process when deep neural networks are used for the inverse design of photonic devices, an issue that arises from the fundamental property of nonuniqueness in all inverse scattering problems. Here we show that by combining forward modeling and inverse design in a tandem architecture, one can overcome this fundamental issue, allowing deep neural networks to be effectively trained by data sets that contain nonunique electromagnetic scattering instances. This paves the way for using deep neural networks to design complex photonic structures that require large training data sets.
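
A minimal sketch of the tandem idea with invented dimensions (8 design parameters, a 64-point response; not the authors' code): the forward model is trained first and frozen, then the inverse network is trained through it, so the loss compares responses rather than designs and nonunique training pairs no longer conflict.

```python
import torch
import torch.nn as nn

# Invented dimensions: 8 design parameters -> 64-point optical response.
forward_net = nn.Sequential(nn.Linear(8, 128), nn.ReLU(), nn.Linear(128, 64))
inverse_net = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 8))

# Stage 1 (assumed already done): fit forward_net on (design, response) pairs.
for p in forward_net.parameters():
    p.requires_grad_(False)              # freeze the trained forward model

opt = torch.optim.Adam(inverse_net.parameters(), lr=1e-3)
targets = torch.rand(32, 64)             # placeholder batch of desired responses

for _ in range(1000):
    designs = inverse_net(targets)       # propose a design for each target
    predicted = forward_net(designs)     # what would that design actually do?
    # Comparing responses (not designs) sidesteps nonuniqueness: any design
    # that reproduces the target response gets zero loss.
    loss = nn.functional.mse_loss(predicted, targets)
    opt.zero_grad()
    loss.backward()
    opt.step()
```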

619 citations