Showing papers on "Artificial neural network published in 2019"

PDF

Open Access

Journal Article•DOI•

Physics-informed neural networks: A deep learning framework for solving forward and inverse problems involving nonlinear partial differential equations

[...]

Maziar Raissi¹, Paris Perdikaris², George Em Karniadakis¹•Institutions (2)

Brown University¹, University of Pennsylvania²

01 Feb 2019-Journal of Computational Physics

TL;DR: In this article, the authors introduce physics-informed neural networks, which are trained to solve supervised learning tasks while respecting any given laws of physics described by general nonlinear partial differential equations.

...read moreread less

5,448 citations

Journal Article•DOI•

Dynamic Graph CNN for Learning on Point Clouds

[...]

Yue Wang¹, Yongbin Sun¹, Ziwei Liu², Sanjay E. Sarma¹, Michael M. Bronstein³, Justin Solomon¹ - Show less +2 more•Institutions (3)

Massachusetts Institute of Technology¹, University of California, Berkeley², Imperial College London³

10 Oct 2019-ACM Transactions on Graphics

TL;DR: This work proposes a new neural network module suitable for CNN-based high-level tasks on point clouds, including classification and segmentation called EdgeConv, which acts on graphs dynamically computed in each layer of the network.

...read moreread less

Abstract: Point clouds provide a flexible geometric representation suitable for countless applications in computer graphics; they also comprise the raw output of most 3D data acquisition devices. While hand-designed features on point clouds have long been proposed in graphics and vision, however, the recent overwhelming success of convolutional neural networks (CNNs) for image analysis suggests the value of adapting insight from CNN to the point cloud world. Point clouds inherently lack topological information, so designing a model to recover topology can enrich the representation power of point clouds. To this end, we propose a new neural network module dubbed EdgeConv suitable for CNN-based high-level tasks on point clouds, including classification and segmentation. EdgeConv acts on graphs dynamically computed in each layer of the network. It is differentiable and can be plugged into existing architectures. Compared to existing modules operating in extrinsic space or treating each point independently, EdgeConv has several appealing properties: It incorporates local neighborhood information; it can be stacked applied to learn global shape properties; and in multi-layer systems affinity in feature space captures semantic characteristics over potentially long distances in the original embedding. We show the performance of our model on standard benchmarks, including ModelNet40, ShapeNetPart, and S3DIS.

...read moreread less

3,727 citations

Journal Article•DOI•

Object Detection With Deep Learning: A Review

[...]

Zhong-Qiu Zhao¹, Peng Zheng¹, Shou-Tao Xu¹, Xindong Wu²•Institutions (2)

Hefei University of Technology¹, University of Louisiana at Lafayette²

28 Jan 2019-IEEE Transactions on Neural Networks

TL;DR: In this article, a review of deep learning-based object detection frameworks is provided, focusing on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further.

...read moreread less

Abstract: Due to object detection’s close relationship with video analysis and image understanding, it has attracted much research attention in recent years. Traditional object detection methods are built on handcrafted features and shallow trainable architectures. Their performance easily stagnates by constructing complex ensembles that combine multiple low-level image features with high-level context from object detectors and scene classifiers. With the rapid development in deep learning, more powerful tools, which are able to learn semantic, high-level, deeper features, are introduced to address the problems existing in traditional architectures. These models behave differently in network architecture, training strategy, and optimization function. In this paper, we provide a review of deep learning-based object detection frameworks. Our review begins with a brief introduction on the history of deep learning and its representative tool, namely, the convolutional neural network. Then, we focus on typical generic object detection architectures along with some modifications and useful tricks to improve detection performance further. As distinct specific detection tasks exhibit different characteristics, we also briefly survey several specific tasks, including salient object detection, face detection, and pedestrian detection. Experimental analyses are also provided to compare various methods and draw some meaningful conclusions. Finally, several promising directions and tasks are provided to serve as guidelines for future work in both object detection and relevant neural network-based learning systems.

...read moreread less

3,097 citations

Journal Article•DOI•

Continual lifelong learning with neural networks: A review.

[...]

German Ignacio Parisi¹, Ronald Kemker², Jose L. Part³, Christopher Kanan², Stefan Wermter¹ - Show less +1 more•Institutions (3)

University of Hamburg¹, Rochester Institute of Technology², Heriot-Watt University³

01 May 2019-Neural Networks

TL;DR: This review critically summarize the main challenges linked to lifelong learning for artificial learning systems and compare existing neural network approaches that alleviate, to different extents, catastrophic forgetting.

...read moreread less

2,095 citations

Journal Article•DOI•

Regularized Evolution for Image Classifier Architecture Search

[...]

Esteban Real¹, Alok Aggarwal¹, Yanping Huang¹, Quoc V. Le¹•Institutions (1)

Google¹

17 Jul 2019

TL;DR: AmoebaNet-A as mentioned in this paper modified the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes and achieved state-of-the-art performance.

...read moreread less

Abstract: The effort devoted to hand-crafting neural network image classifiers has motivated the use of architecture search to discover them automatically. Although evolutionary algorithms have been repeatedly applied to neural network topologies, the image classifiers thus discovered have remained inferior to human-crafted ones. Here, we evolve an image classifier— AmoebaNet-A—that surpasses hand-designs for the first time. To do this, we modify the tournament selection evolutionary algorithm by introducing an age property to favor the younger genotypes. Matching size, AmoebaNet-A has comparable accuracy to current state-of-the-art ImageNet models discovered with more complex architecture-search methods. Scaled to larger size, AmoebaNet-A sets a new state-of-theart 83.9% top-1 / 96.6% top-5 ImageNet accuracy. In a controlled comparison against a well known reinforcement learning algorithm, we give evidence that evolution can obtain results faster with the same hardware, especially at the earlier stages of the search. This is relevant when fewer compute resources are available. Evolution is, thus, a simple method to effectively discover high-quality architectures.

...read moreread less

2,076 citations

Journal Article•DOI•

Virtual Adversarial Training: A Regularization Method for Supervised and Semi-Supervised Learning

[...]

Takeru Miyato, Shin-ichi Maeda, Masanori Koyama¹, Shin Ishii²•Institutions (2)

Ritsumeikan University¹, Kyoto University²

01 Aug 2019-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: Virtual adversarial training (VAT) as discussed by the authors is a regularization method based on virtual adversarial loss, which is a measure of local smoothness of the conditional label distribution given input.

...read moreread less

Abstract: We propose a new regularization method based on virtual adversarial loss: a new measure of local smoothness of the conditional label distribution given input. Virtual adversarial loss is defined as the robustness of the conditional label distribution around each input data point against local perturbation. Unlike adversarial training, our method defines the adversarial direction without label information and is hence applicable to semi-supervised learning. Because the directions in which we smooth the model are only “virtually” adversarial, we call our method virtual adversarial training (VAT). The computational cost of VAT is relatively low. For neural networks, the approximated gradient of virtual adversarial loss can be computed with no more than two pairs of forward- and back-propagations. In our experiments, we applied VAT to supervised and semi-supervised learning tasks on multiple benchmark datasets. With a simple enhancement of the algorithm based on the entropy minimization principle, our VAT achieves state-of-the-art performance for semi-supervised learning tasks on SVHN and CIFAR-10.

...read moreread less

1,991 citations

Journal Article•DOI•

One Pixel Attack for Fooling Deep Neural Networks

[...]

Jiawei Su¹, Danilo Vasconcellos Vargas¹, Kouichi Sakurai¹•Institutions (1)

Kyushu University¹

04 Jan 2019-IEEE Transactions on Evolutionary Computation

TL;DR: This paper proposes a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE), which requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE.

...read moreread less

Abstract: Recent research has revealed that the output of deep neural networks (DNNs) can be easily altered by adding relatively small perturbations to the input vector. In this paper, we analyze an attack in an extremely limited scenario where only one pixel can be modified. For that we propose a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE). It requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE. The results show that 67.97% of the natural images in Kaggle CIFAR-10 test dataset and 16.04% of the ImageNet (ILSVRC 2012) test images can be perturbed to at least one target class by modifying just one pixel with 74.03% and 22.91% confidence on average. We also show the same vulnerability on the original CIFAR-10 dataset. Thus, the proposed attack explores a different take on adversarial machine learning in an extreme limited scenario, showing that current DNNs are also vulnerable to such low dimension attacks. Besides, we also illustrate an important application of DE (or broadly speaking, evolutionary computation) in the domain of adversarial machine learning: creating tools that can effectively generate low-cost adversarial attacks against neural networks for evaluating robustness.

...read moreread less

1,702 citations

Proceedings Article•DOI•

Heterogeneous Graph Attention Network

[...]

Xiao Wang¹, Houye Ji¹, Chuan Shi¹, Bai Wang¹, Yanfang Ye², Peng Cui³, Philip S. Yu⁴ - Show less +3 more•Institutions (4)

Beijing University of Posts and Telecommunications¹, West Virginia University², Tsinghua University³, University of Illinois at Chicago⁴

13 May 2019

TL;DR: Wang et al. as discussed by the authors proposed a heterogeneous graph neural network based on the hierarchical attention, including node-level and semantic-level attentions, which can generate node embedding by aggregating features from meta-path based neighbors in a hierarchical manner.

...read moreread less

Abstract: Graph neural network, as a powerful graph representation technique based on deep learning, has shown superior performance and attracted considerable research interest. However, it has not been fully considered in graph neural network for heterogeneous graph which contains different types of nodes and links. The heterogeneity and rich semantic information bring great challenges for designing a graph neural network for heterogeneous graph. Recently, one of the most exciting advancements in deep learning is the attention mechanism, whose great potential has been well demonstrated in various areas. In this paper, we first propose a novel heterogeneous graph neural network based on the hierarchical attention, including node-level and semantic-level attentions. Specifically, the node-level attention aims to learn the importance between a node and its meta-path based neighbors, while the semantic-level attention is able to learn the importance of different meta-paths. With the learned importance from both node-level and semantic-level attention, the importance of node and meta-path can be fully considered. Then the proposed model can generate node embedding by aggregating features from meta-path based neighbors in a hierarchical manner. Extensive experimental results on three real-world heterogeneous graphs not only show the superior performance of our proposed model over the state-of-the-arts, but also demonstrate its potentially good interpretability for graph analysis.

...read moreread less

1,467 citations

Journal Article•DOI•

Survey on deep learning with class imbalance

[...]

Justin M. Johnson¹, Taghi M. Khoshgoftaar¹•Institutions (1)

Florida Atlantic University¹

01 Mar 2019-Journal of Big Data

TL;DR: Examination of existing deep learning techniques for addressing class imbalanced data finds that research in this area is very limited, that most existing work focuses on computer vision tasks with convolutional neural networks, and that the effects of big data are rarely considered.

...read moreread less

Abstract: The purpose of this study is to examine existing deep learning techniques for addressing class imbalanced data. Effective classification with imbalanced data is an important area of research, as high class imbalance is naturally inherent in many real-world applications, e.g., fraud detection and cancer detection. Moreover, highly imbalanced data poses added difficulty, as most learners will exhibit bias towards the majority class, and in extreme cases, may ignore the minority class altogether. Class imbalance has been studied thoroughly over the last two decades using traditional machine learning models, i.e. non-deep learning. Despite recent advances in deep learning, along with its increasing popularity, very little empirical work in the area of deep learning with class imbalance exists. Having achieved record-breaking performance results in several complex domains, investigating the use of deep neural networks for problems containing high levels of class imbalance is of great interest. Available studies regarding class imbalance and deep learning are surveyed in order to better understand the efficacy of deep learning when applied to class imbalanced data. This survey discusses the implementation details and experimental results for each study, and offers additional insight into their strengths and weaknesses. Several areas of focus include: data complexity, architectures tested, performance interpretation, ease of use, big data application, and generalization to other domains. We have found that research in this area is very limited, that most existing work focuses on computer vision tasks with convolutional neural networks, and that the effects of big data are rarely considered. Several traditional methods for class imbalance, e.g. data sampling and cost-sensitive learning, prove to be applicable in deep learning, while more advanced methods that exploit neural network feature learning abilities show promising results. The survey concludes with a discussion that highlights various gaps in deep learning from class imbalanced data for the purpose of guiding future research.

...read moreread less

1,377 citations

Proceedings Article•DOI•

Graph Neural Networks for Social Recommendation

[...]

Wenqi Fan¹, Yao Ma², Qing Li¹, Yuan He, Eric Zhao, Jiliang Tang², Dawei Yin - Show less +3 more•Institutions (2)

City University of Hong Kong¹, Michigan State University²

13 May 2019

TL;DR: This paper provides a principled approach to jointly capture interactions and opinions in the user-item graph and proposes the framework GraphRec, which coherently models two graphs and heterogeneous strengths for social recommendations.

...read moreread less

Abstract: In recent years, Graph Neural Networks (GNNs), which can naturally integrate node information and topological structure, have been demonstrated to be powerful in learning on graph data. These advantages of GNNs provide great potential to advance social recommendation since data in social recommender systems can be represented as user-user social graph and user-item graph; and learning latent factors of users and items is the key. However, building social recommender systems based on GNNs faces challenges. For example, the user-item graph encodes both interactions and their associated opinions; social relations have heterogeneous strengths; users involve in two graphs (e.g., the user-user social graph and the user-item graph). To address the three aforementioned challenges simultaneously, in this paper, we present a novel graph neural network framework (GraphRec) for social recommendations. In particular, we provide a principled approach to jointly capture interactions and opinions in the user-item graph and propose the framework GraphRec, which coherently models two graphs and heterogeneous strengths. Extensive experiments on two real-world datasets demonstrate the effectiveness of the proposed framework GraphRec.

...read moreread less

1,111 citations

Journal Article•DOI•

Recent advances in physical reservoir computing: A review

[...]

Gouhei Tanaka¹, Toshiyuki Yamane², Jean Benoit Héroux², Ryosho Nakane¹, Naoki Kanazawa², Seiji Takeda², Hidetoshi Numata², Daiju Nakano², Akira Hirose¹ - Show less +5 more•Institutions (2)

University of Tokyo¹, IBM²

01 Jul 2019-Neural Networks

TL;DR: An overview of recent advances in physical reservoir computing is provided by classifying them according to the type of the reservoir to expand its practical applications and develop next-generation machine learning systems.

...read moreread less

Journal Article•DOI•

Memristive crossbar arrays for brain-inspired computing

[...]

Qiangfei Xia¹, Jianhua Yang¹•Institutions (1)

University of Massachusetts Amherst¹

20 Mar 2019-Nature Materials

TL;DR: The challenges in the integration and use in computation of large-scale memristive neural networks are discussed, both as accelerators for deep learning and as building blocks for spiking neural networks.

...read moreread less

Abstract: With their working mechanisms based on ion migration, the switching dynamics and electrical behaviour of memristive devices resemble those of synapses and neurons, making these devices promising candidates for brain-inspired computing. Built into large-scale crossbar arrays to form neural networks, they perform efficient in-memory computing with massive parallelism by directly using physical laws. The dynamical interactions between artificial synapses and neurons equip the networks with both supervised and unsupervised learning capabilities. Moreover, their ability to interface with analogue signals from sensors without analogue/digital conversions reduces the processing time and energy overhead. Although numerous simulations have indicated the potential of these networks for brain-inspired computing, experimental implementation of large-scale memristive arrays is still in its infancy. This Review looks at the progress, challenges and possible solutions for efficient brain-inspired computation with memristive implementations, both as accelerators for deep learning and as building blocks for spiking neural networks.

...read moreread less

Journal Article•DOI•

A State-of-the-Art Survey on Deep Learning Theory and Architectures

[...]

Zahangir Alom, Tarek M. Taha, Chris Yakopcic, Stefan Westberg, Paheding Sidike, Mst Shamima Nasrin, Mahmudul Hasan, Brian Van Essen, Abdul A. S. Awwal, Vijayan K. Asari - Show less +6 more

05 Mar 2019-Electronics

TL;DR: This survey presents a brief survey on the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network and goes on to cover Convolutional Neural Network, Recurrent Neural Network (RNN), and Deep Reinforcement Learning (DRL).

...read moreread less

Abstract: In recent years, deep learning has garnered tremendous success in a variety of application domains. This new field of machine learning has been growing rapidly and has been applied to most traditional application domains, as well as some new areas that present more opportunities. Different methods have been proposed based on different categories of learning, including supervised, semi-supervised, and un-supervised learning. Experimental results show state-of-the-art performance using deep learning when compared to traditional machine learning approaches in the fields of image processing, computer vision, speech recognition, machine translation, art, medical imaging, medical information processing, robotics and control, bioinformatics, natural language processing, cybersecurity, and many others. This survey presents a brief survey on the advances that have occurred in the area of Deep Learning (DL), starting with the Deep Neural Network (DNN). The survey goes on to cover Convolutional Neural Network (CNN), Recurrent Neural Network (RNN), including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), Auto-Encoder (AE), Deep Belief Network (DBN), Generative Adversarial Network (GAN), and Deep Reinforcement Learning (DRL). Additionally, we have discussed recent developments, such as advanced variant DL techniques based on these DL approaches. This work considers most of the papers published after 2012 from when the history of deep learning began. Furthermore, DL approaches that have been explored and evaluated in different application domains are also included in this survey. We also included recently developed frameworks, SDKs, and benchmark datasets that are used for implementing and evaluating deep learning approaches. There are some surveys that have been published on DL using neural networks and a survey on Reinforcement Learning (RL). However, those papers have not discussed individual advanced techniques for training large-scale deep learning models and the recently developed method of generative models.

...read moreread less

Journal Article•DOI•

Review of Deep Learning Algorithms and Architectures

[...]

Ajay Shrestha¹, Ausif Mahmood¹•Institutions (1)

University of Bridgeport¹

22 Apr 2019-IEEE Access

TL;DR: This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time, and delve into the math behind training algorithms used in recent deep networks.

...read moreread less

Abstract: Deep learning (DL) is playing an increasingly important role in our lives. It has already made a huge impact in areas, such as cancer diagnosis, precision medicine, self-driving cars, predictive forecasting, and speech recognition. The painstakingly handcrafted feature extractors used in traditional learning, classification, and pattern recognition systems are not scalable for large-sized data sets. In many cases, depending on the problem complexity, DL can also overcome the limitations of earlier shallow networks that prevented efficient training and abstractions of hierarchical representations of multi-dimensional training data. Deep neural network (DNN) uses multiple (deep) layers of units with highly optimized algorithms and architectures. This paper reviews several optimization methods to improve the accuracy of the training and to reduce training time. We delve into the math behind training algorithms used in recent deep networks. We describe current shortcomings, enhancements, and implementations. The review also covers different types of deep architectures, such as deep convolution networks, deep residual networks, recurrent neural networks, reinforcement learning, variational autoencoders, and others.

...read moreread less

Proceedings Article•DOI•

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

[...]

Chenxi Liu¹, Liang-Chieh Chen², Florian Schroff², Hartwig Adam², Wei Hua², Alan L. Yuille¹, Li Fei-Fei³ - Show less +3 more•Institutions (3)

Johns Hopkins University¹, Google², Stanford University³

15 Jun 2019

TL;DR: Li et al. as mentioned in this paper proposed to search the network level structure in addition to the cell level structure, which formed a hierarchical architecture search space and achieved state-of-the-art performance without any ImageNet pretraining.

...read moreread less

Abstract: Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure, while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction which exhibits a lot more network level architectural variations. Therefore, we propose to search the network level structure in addition to the cell level structure, which forms a hierarchical architecture search space. We present a network level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.

...read moreread less

Journal Article•DOI•

All-optical spiking neurosynaptic networks with self-learning capabilities.

[...]

Johannes Feldmann¹, Nathan Youngblood², C.D. Wright³, Harish Bhaskaran², Wolfram H. P. Pernice¹ - Show less +1 more•Institutions (3)

University of Münster¹, University of Oxford², University of Exeter³

08 May 2019-Nature

TL;DR: An optical version of a brain-inspired neurosynaptic system, using wavelength division multiplexing techniques, is presented that is capable of supervised and unsupervised learning.

...read moreread less

Abstract: Software implementations of brain-inspired computing underlie many important computational tasks, from image processing to speech recognition, artificial intelligence and deep learning applications. Yet, unlike real neural tissue, traditional computing architectures physically separate the core computing functions of memory and processing, making fast, efficient and low-energy computing difficult to achieve. To overcome such limitations, an attractive alternative is to design hardware that mimics neurons and synapses. Such hardware, when connected in networks or neuromorphic systems, processes information in a way more analogous to brains. Here we present an all-optical version of such a neurosynaptic system, capable of supervised and unsupervised learning. We exploit wavelength division multiplexing techniques to implement a scalable circuit architecture for photonic neural networks, successfully demonstrating pattern recognition directly in the optical domain. Such photonic neurosynaptic networks promise access to the high speed and high bandwidth inherent to optical systems, thus enabling the direct processing of optical telecommunication and visual data. An optical version of a brain-inspired neurosynaptic system, using wavelength division multiplexing techniques, is presented that is capable of supervised and unsupervised learning.

...read moreread less

Journal Article•DOI•

Deep Learning Approach for Intelligent Intrusion Detection System

[...]

R. Vinayakumar¹, Mamoun Alazab², K. P. Soman¹, Prabaharan Poornachandran¹, Ameer Al-Nemrat³, Sitalakshmi Venkatraman - Show less +2 more•Institutions (3)

Amrita Vishwa Vidyapeetham¹, Charles Darwin University², University of East London³

03 Apr 2019-IEEE Access

TL;DR: A highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet is proposed which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.

...read moreread less

Abstract: Machine learning techniques are being widely used to develop an intrusion detection system (IDS) for detecting and classifying cyberattacks at the network-level and the host-level in a timely and automatic manner. However, many challenges arise since malicious attacks are continually changing and are occurring in very large volumes requiring a scalable solution. There are different malware datasets available publicly for further research by cyber security community. However, no existing study has shown the detailed analysis of the performance of various machine learning algorithms on various publicly available datasets. Due to the dynamic nature of malware with continuously changing attacking methods, the malware datasets available publicly are to be updated systematically and benchmarked. In this paper, a deep neural network (DNN), a type of deep learning model, is explored to develop a flexible and effective IDS to detect and classify unforeseen and unpredictable cyberattacks. The continuous change in network behavior and rapid evolution of attacks makes it necessary to evaluate various datasets which are generated over the years through static and dynamic approaches. This type of study facilitates to identify the best algorithm which can effectively work in detecting future cyberattacks. A comprehensive evaluation of experiments of DNNs and other classical machine learning classifiers are shown on various publicly available benchmark malware datasets. The optimal network parameters and network topologies for DNNs are chosen through the following hyperparameter selection methods with KDDCup 99 dataset. All the experiments of DNNs are run till 1,000 epochs with the learning rate varying in the range [0.01-0.5]. The DNN model which performed well on KDDCup 99 is applied on other datasets, such as NSL-KDD, UNSW-NB15, Kyoto, WSN-DS, and CICIDS 2017, to conduct the benchmark. Our DNN model learns the abstract and high-dimensional feature representation of the IDS data by passing them into many hidden layers. Through a rigorous experimental testing, it is confirmed that DNNs perform well in comparison with the classical machine learning classifiers. Finally, we propose a highly scalable and hybrid DNNs framework called scale-hybrid-IDS-AlertNet which can be used in real-time to effectively monitor the network traffic and host-level events to proactively alert possible cyberattacks.

...read moreread less

Journal Article•DOI•

Deep Learning With Edge Computing: A Review

[...]

Jiasi Chen¹, Xukan Ran¹•Institutions (1)

University of California, Riverside¹

15 Jul 2019

TL;DR: This paper will provide an overview of applications where deep learning is used at the network edge, discuss various approaches for quickly executing deep learning inference across a combination of end devices, edge servers, and the cloud, and describe the methods for training deep learning models across multiple edge devices.

...read moreread less

Abstract: Deep learning is currently widely used in a variety of applications, including computer vision and natural language processing. End devices, such as smartphones and Internet-of-Things sensors, are generating data that need to be analyzed in real time using deep learning or used to train deep learning models. However, deep learning inference and training require substantial computation resources to run quickly. Edge computing, where a fine mesh of compute nodes are placed close to end devices, is a viable way to meet the high computation and low-latency requirements of deep learning on edge devices and also provides additional benefits in terms of privacy, bandwidth efficiency, and scalability. This paper aims to provide a comprehensive review of the current state of the art at the intersection of deep learning and edge computing. Specifically, it will provide an overview of applications where deep learning is used at the network edge, discuss various approaches for quickly executing deep learning inference across a combination of end devices, edge servers, and the cloud, and describe the methods for training deep learning models across multiple edge devices. It will also discuss open challenges in terms of systems performance, network technologies and management, benchmarks, and privacy. The reader will take away the following concepts from this paper: understanding scenarios where deep learning at the network edge can be useful, understanding common techniques for speeding up deep learning inference and performing distributed training on edge devices, and understanding recent trends and opportunities.

...read moreread less

Proceedings Article•DOI•

Comprehensive Privacy Analysis of Deep Learning: Passive and Active White-box Inference Attacks against Centralized and Federated Learning

[...]

Milad Nasr¹, Reza Shokri², Amir Houmansadr¹•Institutions (2)

University of Massachusetts Amherst¹, National University of Singapore²

19 May 2019

TL;DR: The reasons why deep learning models may leak information about their training data are investigated and new algorithms tailored to the white-box setting are designed by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks.

...read moreread less

Abstract: Deep neural networks are susceptible to various inference attacks as they remember information about their training data. We design white-box inference attacks to perform a comprehensive privacy analysis of deep learning models. We measure the privacy leakage through parameters of fully trained models as well as the parameter updates of models during training. We design inference algorithms for both centralized and federated learning, with respect to passive and active inference attackers, and assuming different adversary prior knowledge. We evaluate our novel white-box membership inference attacks against deep learning algorithms to trace their training data records. We show that a straightforward extension of the known black-box attacks to the white-box setting (through analyzing the outputs of activation functions) is ineffective. We therefore design new algorithms tailored to the white-box setting by exploiting the privacy vulnerabilities of the stochastic gradient descent algorithm, which is the algorithm used to train deep neural networks. We investigate the reasons why deep learning models may leak information about their training data. We then show that even well-generalized models are significantly susceptible to white-box membership inference attacks, by analyzing state-of-the-art pre-trained and publicly available models for the CIFAR dataset. We also show how adversarial participants, in the federated learning setting, can successfully run active membership inference attacks against other participants, even when the global model achieves high prediction accuracies.

...read moreread less

Journal Article•DOI•

Deep learning for electroencephalogram (EEG) classification tasks: a review.

[...]

Alexander Craik¹, Yongtian He¹, Jose L. Contreras-Vidal¹•Institutions (1)

University of Houston¹

26 Feb 2019-Journal of Neural Engineering

TL;DR: Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.

...read moreread less

Abstract: Objective Electroencephalography (EEG) analysis has been an important tool in neuroscience with applications in neuroscience, neural engineering (e.g. Brain-computer interfaces, BCI's), and even commercial applications. Many of the analytical tools used in EEG studies have used machine learning to uncover relevant information for neural classification and neuroimaging. Recently, the availability of large EEG data sets and advances in machine learning have both led to the deployment of deep learning architectures, especially in the analysis of EEG signals and in understanding the information it may contain for brain functionality. The robust automatic classification of these signals is an important step towards making the use of EEG more practical in many applications and less reliant on trained professionals. Towards this goal, a systematic review of the literature on deep learning applications to EEG classification was performed to address the following critical questions: (1) Which EEG classification tasks have been explored with deep learning? (2) What input formulations have been used for training the deep networks? (3) Are there specific deep learning network structures suitable for specific types of tasks? Approach A systematic literature review of EEG classification using deep learning was performed on Web of Science and PubMed databases, resulting in 90 identified studies. Those studies were analyzed based on type of task, EEG preprocessing methods, input type, and deep learning architecture. Main results For EEG classification tasks, convolutional neural networks, recurrent neural networks, deep belief networks outperform stacked auto-encoders and multi-layer perceptron neural networks in classification accuracy. The tasks that used deep learning fell into five general groups: emotion recognition, motor imagery, mental workload, seizure detection, event related potential detection, and sleep scoring. For each type of task, we describe the specific input formulation, major characteristics, and end classifier recommendations found through this review. Significance This review summarizes the current practices and performance outcomes in the use of deep learning for EEG classification. Practical suggestions on the selection of many hyperparameters are provided in the hope that they will promote or guide the deployment of deep learning to EEG datasets in future research.

...read moreread less

Journal Article•DOI•

A New Deep Transfer Learning Based on Sparse Auto-Encoder for Fault Diagnosis

[...]

Long Wen¹, Liang Gao¹, Xinyu Li¹•Institutions (1)

Huazhong University of Science and Technology¹

01 Jan 2019-IEEE Transactions on Systems, Man, and Cybernetics

TL;DR: A new DTL method is proposed, which uses a three-layer sparse auto-encoder to extract the features of raw data, and applies the maximum mean discrepancy term to minimizing the discrepancy penalty between the features from training data and testing data.

...read moreread less

Abstract: Fault diagnosis plays an important role in modern industry. With the development of smart manufacturing, the data-driven fault diagnosis becomes hot. However, traditional methods have two shortcomings: 1) their performances depend on the good design of handcrafted features of data, but it is difficult to predesign these features and 2) they work well under a general assumption: the training data and testing data should be drawn from the same distribution, but this assumption fails in many engineering applications. Since deep learning (DL) can extract the hierarchical representation features of raw data, and transfer learning provides a good way to perform a learning task on the different but related distribution datasets, deep transfer learning (DTL) has been developed for fault diagnosis. In this paper, a new DTL method is proposed. It uses a three-layer sparse auto-encoder to extract the features of raw data, and applies the maximum mean discrepancy term to minimizing the discrepancy penalty between the features from training data and testing data. The proposed DTL is tested on the famous motor bearing dataset from the Case Western Reserve University. The results show a good improvement, and DTL achieves higher prediction accuracies on most experiments than DL. The prediction accuracy of DTL, which is as high as 99.82%, is better than the results of other algorithms, including deep belief network, sparse filter, artificial neural network, support vector machine and some other traditional methods. What is more, two additional analytical experiments are conducted. The results show that a good unlabeled third dataset may be helpful to DTL, and a good linear relationship between the final prediction accuracies and their standard deviations have been observed.

...read moreread less

Journal Article•DOI•

Deep learning in spiking neural networks

[...]

Amirhossein Tavanaei¹, Masoud Ghodrati², Saeed Reza Kheradpisheh³, Timothée Masquelier⁴, Anthony S. Maida¹ - Show less +1 more•Institutions (4)

University of Louisiana at Lafayette¹, Monash University, Clayton campus², Kharazmi University³, University of Toulouse⁴

01 Mar 2019-Neural Networks

TL;DR: The emerging picture is that SNNs still lag behind ANNs in terms of accuracy, but the gap is decreasing, and can even vanish on some tasks, while SNN's typically require many fewer operations and are the better candidates to process spatio-temporal data.

...read moreread less

Journal Article•DOI•

Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent

[...]

Jaehoon Lee¹, Lechao Xiao¹, Samuel S. Schoenholz¹, Yasaman Bahri¹, Roman Novak¹, Jascha Sohl-Dickstein¹, Jeffrey Pennington¹ - Show less +3 more•Institutions (1)

Google¹

18 Feb 2019-arXiv: Machine Learning

TL;DR: In this article, the authors show that for wide neural networks the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters.

...read moreread less

Abstract: A longstanding goal in deep learning research has been to precisely characterize training and generalization. However, the often complex loss landscapes of neural networks have made a theory of learning dynamics elusive. In this work, we show that for wide neural networks the learning dynamics simplify considerably and that, in the infinite width limit, they are governed by a linear model obtained from the first-order Taylor expansion of the network around its initial parameters. Furthermore, mirroring the correspondence between wide Bayesian neural networks and Gaussian processes, gradient-based training of wide neural networks with a squared loss produces test set predictions drawn from a Gaussian process with a particular compositional kernel. While these theoretical results are only exact in the infinite width limit, we nevertheless find excellent empirical agreement between the predictions of the original network and those of the linearized version even for finite practically-sized networks. This agreement is robust across different architectures, optimization methods, and loss functions.

...read moreread less

Proceedings Article•DOI•

Revisiting Self-Supervised Visual Representation Learning

[...]

Alexander Kolesnikov¹, Xiaohua Zhai¹, Lucas Beyer¹•Institutions (1)

Google¹

15 Jun 2019

TL;DR: This study revisits numerous previously proposed self-supervised models, conducts a thorough large scale study and uncovers multiple crucial insights about standard recipes for CNN design that do not always translate to self- supervised representation learning.

...read moreread less

Abstract: Unsupervised visual representation learning remains a largely unsolved problem in computer vision research. Among a big body of recently proposed approaches for unsupervised learning of visual representations, a class of self-supervised techniques achieves superior performance on many challenging benchmarks. A large number of the pretext tasks for self-supervised learning have been studied, but other important aspects, such as the choice of convolutional neural networks (CNN), has not received equal attention. Therefore, we revisit numerous previously proposed self-supervised models, conduct a thorough large scale study and, as a result, uncover multiple crucial insights. We challenge a number of common practices in self-supervised visual representation learning and observe that standard recipes for CNN design do not always translate to self-supervised representation learning. As part of our study, we drastically boost the performance of previously proposed techniques and outperform previously published state-of-the-art results by a large margin. We will release the code for reproducing our experiments when the anonymity requirements are lifted.

...read moreread less

Journal Article•DOI•

Highly Accurate Machine Fault Diagnosis Using Deep Transfer Learning

[...]

Siyu Shao¹, Stephen McAleer², Ruqiang Yan¹, Pierre Baldi²•Institutions (2)

Southeast University¹, University of California, Irvine²

01 Apr 2019-IEEE Transactions on Industrial Informatics

TL;DR: A novel deep learning framework to achieve highly accurate machine fault diagnosis using transfer learning to enable and accelerate the training of deep neural network is developed.

...read moreread less

Abstract: We develop a novel deep learning framework to achieve highly accurate machine fault diagnosis using transfer learning to enable and accelerate the training of deep neural network. Compared with existing methods, the proposed method is faster to train and more accurate. First, original sensor data are converted to images by conducting a Wavelet transformation to obtain time-frequency distributions. Next, a pretrained network is used to extract lower level features. The labeled time-frequency images are then used to fine-tune the higher levels of the neural network architecture. This paper creates a machine fault diagnosis pipeline and experiments are carried out to verify the effectiveness and generalization of the pipeline on three main mechanical datasets including induction motors, gearboxes, and bearings with sizes of 6000, 9000, and 5000 time series samples, respectively. We achieve state-of-the-art results on each dataset, with most datasets showing test accuracy near 100%, and in the gearbox dataset, we achieve significant improvement from 94.8% to 99.64%. We created a repository including these datasets located at mlmechanics.ics.uci.edu.

...read moreread less

Journal Article•DOI•

Deep learning for cellular image analysis

[...]

Erick Moen¹, Dylan Bannon¹, Takamasa Kudo², William Graf¹, Markus W. Covert², David Van Valen¹ - Show less +2 more•Institutions (2)

California Institute of Technology¹, Stanford University²

27 May 2019-Nature Methods

TL;DR: The intersection between deep learning and cellular image analysis is reviewed and an overview of both the mathematical mechanics and the programming frameworks of deep learning that are pertinent to life scientists are provided.

...read moreread less

Abstract: Recent advances in computer vision and machine learning underpin a collection of algorithms with an impressive ability to decipher the content of images. These deep learning algorithms are being applied to biological images and are transforming the analysis and interpretation of imaging data. These advances are positioned to render difficult analyses routine and to enable researchers to carry out new, previously impossible experiments. Here we review the intersection between deep learning and cellular image analysis and provide an overview of both the mathematical mechanics and the programming frameworks of deep learning that are pertinent to life scientists. We survey the field's progress in four key applications: image classification, image segmentation, object tracking, and augmented microscopy. Last, we relay our labs' experience with three key aspects of implementing deep learning in the laboratory: annotating training data, selecting and training a range of neural network architectures, and deploying solutions. We also highlight existing datasets and implementations for each surveyed application.

...read moreread less

Proceedings Article•DOI•

Meta-Transfer Learning for Few-Shot Learning

[...]

Qianru Sun, Yaoyao Liu¹, Tat-Seng Chua², Bernt Schiele•Institutions (2)

Tianjin University¹, National University of Singapore²

15 Jun 2019

TL;DR: In this paper, the authors proposed a meta-transfer learning approach to adapt a base-learner to a new task for which only a few labeled samples are available, which learns scaling and shifting functions of DNN weights for each task.

...read moreread less

Abstract: Meta-learning has been proposed as a framework to address the challenging few-shot learning setting. The key idea is to leverage a large number of similar few-shot tasks in order to learn how to adapt a base-learner to a new task for which only a few labeled samples are available. As deep neural networks (DNNs) tend to overfit using a few samples only, meta-learning typically uses shallow neural networks (SNNs), thus limiting its effectiveness. In this paper we propose a novel few-shot learning method called meta-transfer learning (MTL) which learns to adapt a deep NN for few shot learning tasks. Specifically, "meta" refers to training multiple tasks, and "transfer" is achieved by learning scaling and shifting functions of DNN weights for each task. In addition, we introduce the hard task (HT) meta-batch scheme as an effective learning curriculum for MTL. We conduct experiments using (5-class, 1-shot) and (5-class, 5-shot) recognition tasks on two challenging few-shot learning benchmarks: miniImageNet and Fewshot-CIFAR100. Extensive comparisons to related works validate that our meta-transfer learning approach trained with the proposed HT meta-batch scheme achieves top performance. An ablation study also shows that both components contribute to fast convergence and high accuracy.

...read moreread less

Proceedings Article•DOI•

Invariant Information Clustering for Unsupervised Image Classification and Segmentation

[...]

Xu Ji¹, Andrea Vedaldi¹, João F. Henriques¹•Institutions (1)

University of Oxford¹

01 Oct 2019

TL;DR: IIC as mentioned in this paper learns a neural network classifier from scratch, given only unlabeled data samples, and achieves state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation.

...read moreread less

Abstract: We present a novel clustering objective that learns a neural network classifier from scratch, given only unlabelled data samples. The model discovers clusters that accurately match semantic classes, achieving state-of-the-art results in eight unsupervised clustering benchmarks spanning image classification and segmentation. These include STL10, an unsupervised variant of ImageNet, and CIFAR10, where we significantly beat the accuracy of our closest competitors by 6.6 and 9.5 absolute percentage points respectively. The method is not specialised to computer vision and operates on any paired dataset samples; in our experiments we use random transforms to obtain a pair from each image. The trained network directly outputs semantic labels, rather than high dimensional representations that need external processing to be usable for semantic clustering. The objective is simply to maximise mutual information between the class assignments of each pair. It is easy to implement and rigorously grounded in information theory, meaning we effortlessly avoid degenerate solutions that other clustering methods are susceptible to. In addition to the fully unsupervised mode, we also test two semi-supervised settings. The first achieves 88.8% accuracy on STL10 classification, setting a new global state-of-the-art over all existing methods (whether supervised, semi-supervised or unsupervised). The second shows robustness to 90% reductions in label coverage, of relevance to applications that wish to make use of small amounts of labels. github.com/xu-ji/IIC

...read moreread less

Journal Article•DOI•

Deep learning: new computational modelling techniques for genomics

[...]

Gökcen Eraslan¹, Žiga Avsec¹, Julien Gagneur¹, Fabian J. Theis¹•Institutions (1)

Technische Universität München¹

01 Jul 2019-Nature Reviews Genetics

TL;DR: This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.

...read moreread less

Abstract: As a data-driven science, genomics largely utilizes machine learning to capture dependencies in data and derive novel biological hypotheses. However, the ability to extract new insights from the exponentially increasing volume of genomics data requires more expressive machine learning models. By effectively leveraging large data sets, deep learning has transformed fields such as computer vision and natural language processing. Now, it is becoming the method of choice for many genomics modelling tasks, including predicting the impact of genetic variation on gene regulatory mechanisms such as DNA accessibility and splicing. This Review describes different deep learning techniques and how they can be applied to extract biologically relevant information from large, complex genomic data sets.

...read moreread less

Proceedings Article•DOI•

Learning a Unified Classifier Incrementally via Rebalancing

[...]

Saihui Hou¹, Xinyu Pan², Chen Change Loy³, Zilei Wang¹, Dahua Lin² - Show less +1 more•Institutions (3)

University of Science and Technology of China¹, The Chinese University of Hong Kong², Nanyang Technological University³

01 Jun 2019

TL;DR: This work develops a new framework for incrementally learning a unified classifier, e.g. a classifier that treats both old and new classes uniformly, and incorporates three components, cosine normalization, less-forget constraint, and inter-class separation, to mitigate the adverse effects of the imbalance.

...read moreread less

Abstract: Conventionally, deep neural networks are trained offline, relying on a large dataset prepared in advance. This paradigm is often challenged in real-world applications, e.g. online services that involve continuous streams of incoming data. Recently, incremental learning receives increasing attention, and is considered as a promising solution to the practical challenges mentioned above. However, it has been observed that incremental learning is subject to a fundamental difficulty -- catastrophic forgetting, namely adapting a model to new data often results in severe performance degradation on previous tasks or classes. Our study reveals that the imbalance between previous and new data is a crucial cause to this problem. In this work, we develop a new framework for incrementally learning a unified classifier, e.g. a classifier that treats both old and new classes uniformly. Specifically, we incorporate three components, cosine normalization, less-forget constraint, and inter-class separation, to mitigate the adverse effects of the imbalance. Experiments show that the proposed method can effectively rebalance the training process, thus obtaining superior performance compared to the existing methods. On CIFAR-100 and ImageNet, our method can reduce the classification errors by more than 6% and 13% respectively, under the incremental setting of 10 phases.

...read moreread less

Collapse