Showing papers on "Artificial neural network published in 2018"

PDF

Open Access

Proceedings Article•DOI•

MobileNetV2: Inverted Residuals and Linear Bottlenecks

[...]

Mark Sandler¹, Andrew Howard¹, Menglong Zhu¹, Andrey Zhmoginov¹, Liang-Chieh Chen¹ - Show less +1 more•Institutions (1)

18 Jun 2018

TL;DR: MobileNetV2 as mentioned in this paper is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers and intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity.

...read moreread less

Abstract: In this paper we describe a new mobile architecture, MobileNetV2, that improves the state of the art performance of mobile models on multiple tasks and benchmarks as well as across a spectrum of different model sizes. We also describe efficient ways of applying these mobile models to object detection in a novel framework we call SSDLite. Additionally, we demonstrate how to build mobile semantic segmentation models through a reduced form of DeepLabv3 which we call Mobile DeepLabv3. is based on an inverted residual structure where the shortcut connections are between the thin bottleneck layers. The intermediate expansion layer uses lightweight depthwise convolutions to filter features as a source of non-linearity. Additionally, we find that it is important to remove non-linearities in the narrow layers in order to maintain representational power. We demonstrate that this improves performance and provide an intuition that led to this design. Finally, our approach allows decoupling of the input/output domains from the expressiveness of the transformation, which provides a convenient framework for further analysis. We measure our performance on ImageNet [1] classification, COCO object detection [2], VOC image segmentation [3]. We evaluate the trade-offs between accuracy, and number of operations measured by multiply-adds (MAdd), as well as actual latency, and the number of parameters.

...read moreread less

9,381 citations

Proceedings Article•DOI•

Non-local Neural Networks

[...]

Xiaolong Wang¹, Ross Girshick¹, Abhinav Gupta², Kaiming He¹•Institutions (2)

Facebook¹, Carnegie Mellon University²

18 Jun 2018

TL;DR: In this article, the non-local operation computes the response at a position as a weighted sum of the features at all positions, which can be used to capture long-range dependencies.

...read moreread less

Abstract: Both convolutional and recurrent operations are building blocks that process one local neighborhood at a time. In this paper, we present non-local operations as a generic family of building blocks for capturing long-range dependencies. Inspired by the classical non-local means method [4] in computer vision, our non-local operation computes the response at a position as a weighted sum of the features at all positions. This building block can be plugged into many computer vision architectures. On the task of video classification, even without any bells and whistles, our nonlocal models can compete or outperform current competition winners on both Kinetics and Charades datasets. In static image recognition, our non-local models improve object detection/segmentation and pose estimation on the COCO suite of tasks. Code will be made available.

...read moreread less

8,059 citations

Proceedings Article•DOI•

Graph Attention Networks

[...]

Petar Veličković¹, Guillem Cucurull², Arantxa Casanova³, Adriana Romero⁴, Pietro Liò¹, Yoshua Bengio⁵ - Show less +2 more•Institutions (5)

University of Cambridge¹, Autonomous University of Barcelona², Polytechnic University of Catalonia³, HEC Montréal⁴, Université de Montréal⁵

15 Feb 2018

TL;DR: Graph Attention Networks (GATs) as mentioned in this paper leverage masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations.

...read moreread less

Abstract: We present graph attention networks (GATs), novel neural network architectures that operate on graph-structured data, leveraging masked self-attentional layers to address the shortcomings of prior methods based on graph convolutions or their approximations. By stacking layers in which nodes are able to attend over their neighborhoods' features, we enable (implicitly) specifying different weights to different nodes in a neighborhood, without requiring any kind of costly matrix operation (such as inversion) or depending on knowing the graph structure upfront. In this way, we address several key challenges of spectral-based graph neural networks simultaneously, and make our model readily applicable to inductive as well as transductive problems. Our GAT models have achieved or matched state-of-the-art results across four established transductive and inductive graph benchmarks: the Cora, Citeseer and Pubmed citation network datasets, as well as a protein-protein interaction dataset (wherein test graphs remain unseen during training).

...read moreread less

7,904 citations

Proceedings Article•DOI•

ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices

[...]

Xiangyu Zhang, Xinyu Zhou, Mengxiao Lin, Jian Sun

18 Jun 2018

TL;DR: ShuffleNet as discussed by the authors utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy, and achieves an actual speedup over AlexNet while maintaining comparable accuracy.

...read moreread less

Abstract: We introduce an extremely computation-efficient CNN architecture named ShuffleNet, which is designed specially for mobile devices with very limited computing power (e.g., 10-150 MFLOPs). The new architecture utilizes two new operations, pointwise group convolution and channel shuffle, to greatly reduce computation cost while maintaining accuracy. Experiments on ImageNet classification and MS COCO object detection demonstrate the superior performance of ShuffleNet over other structures, e.g. lower top-1 error (absolute 7.8%) than recent MobileNet [12] on ImageNet classification task, under the computation budget of 40 MFLOPs. On an ARM-based mobile device, ShuffleNet achieves ~13A— actual speedup over AlexNet while maintaining comparable accuracy.

...read moreread less

4,503 citations

Proceedings Article•DOI•

Learning Transferable Architectures for Scalable Image Recognition

[...]

Barret Zoph¹, Vijay K. Vasudevan¹, Jonathon Shlens¹, Quoc V. Le¹•Institutions (1)

Google¹

18 Jun 2018

TL;DR: NASNet as discussed by the authors proposes to search for an architectural building block on a small dataset and then transfer the block to a larger dataset, which enables transferability and achieves state-of-the-art performance.

...read moreread less

Abstract: Developing neural network image classification models often requires significant architecture engineering. In this paper, we study a method to learn the model architectures directly on the dataset of interest. As this approach is expensive when the dataset is large, we propose to search for an architectural building block on a small dataset and then transfer the block to a larger dataset. The key contribution of this work is the design of a new search space (which we call the "NASNet search space") which enables transferability. In our experiments, we search for the best convolutional layer (or "cell") on the CIFAR-10 dataset and then apply this cell to the ImageNet dataset by stacking together more copies of this cell, each with their own parameters to design a convolutional architecture, which we name a "NASNet architecture". We also introduce a new regularization technique called ScheduledDropPath that significantly improves generalization in the NASNet models. On CIFAR-10 itself, a NASNet found by our method achieves 2.4% error rate, which is state-of-the-art. Although the cell is not searched for directly on ImageNet, a NASNet constructed from the best cell achieves, among the published works, state-of-the-art accuracy of 82.7% top-1 and 96.2% top-5 on ImageNet. Our model is 1.2% better in top-1 accuracy than the best human-invented architectures while having 9 billion fewer FLOPS - a reduction of 28% in computational demand from the previous state-of-the-art model. When evaluated at different levels of computational cost, accuracies of NASNets exceed those of the state-of-the-art human-designed models. For instance, a small version of NASNet also achieves 74% top-1 accuracy, which is 3.1% better than equivalently-sized, state-of-the-art models for mobile platforms. Finally, the image features learned from image classification are generically useful and can be transferred to other computer vision problems. On the task of object detection, the learned features by NASNet used with the Faster-RCNN framework surpass state-of-the-art by 4.0% achieving 43.1% mAP on the COCO dataset.

...read moreread less

4,384 citations

Proceedings Article•DOI•

Path Aggregation Network for Instance Segmentation

[...]

Shu Liu¹, Lu Qi¹, Haifang Qin², Jianping Shi³, Jiaya Jia¹ - Show less +1 more•Institutions (3)

The Chinese University of Hong Kong¹, Peking University², SenseTime³

18 Jun 2018

TL;DR: PANet as mentioned in this paper enhances the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and topmost feature.

...read moreread less

Abstract: The way that information propagates in neural networks is of great importance. In this paper, we propose Path Aggregation Network (PANet) aiming at boosting information flow in proposal-based instance segmentation framework. Specifically, we enhance the entire feature hierarchy with accurate localization signals in lower layers by bottom-up path augmentation, which shortens the information path between lower layers and topmost feature. We present adaptive feature pooling, which links feature grid and all feature levels to make useful information in each level propagate directly to following proposal subnetworks. A complementary branch capturing different views for each proposal is created to further improve mask prediction. These improvements are simple to implement, with subtle extra computational overhead. Yet they are useful and make our PANet reach the 1st place in the COCO 2017 Challenge Instance Segmentation task and the 2nd place in Object Detection task without large-batch training. PANet is also state-of-the-art on MVD and Cityscapes.

...read moreread less

3,784 citations

Proceedings Article•

Towards Deep Learning Models Resistant to Adversarial Attacks.

[...]

Aleksander Madry¹, Aleksandar Makelov¹, Ludwig Schmidt¹, Dimitris Tsipras¹, Adrian Vladu² - Show less +1 more•Institutions (2)

Massachusetts Institute of Technology¹, Boston University²

15 Feb 2018

TL;DR: This article studied the adversarial robustness of neural networks through the lens of robust optimization and identified methods for both training and attacking neural networks that are reliable and, in a certain sense, universal.

...read moreread less

Abstract: Recent work has demonstrated that deep neural networks are vulnerable to adversarial examples—inputs that are almost indistinguishable from natural data and yet classified incorrectly by the network. In fact, some of the latest findings suggest that the existence of adversarial attacks may be an inherent weakness of deep learning models. To address this problem, we study the adversarial robustness of neural networks through the lens of robust optimization. This approach provides us with a broad and unifying view on much of the prior work on this topic. Its principled nature also enables us to identify methods for both training and attacking neural networks that are reliable and, in a certain sense, universal. In particular, they specify a concrete security guarantee that would protect against any adversary. These methods let us train networks with significantly improved resistance to a wide range of adversarial attacks. They also suggest the notion of security against a first-order adversary as a natural and broad security guarantee. We believe that robustness against such well-defined classes of adversaries is an important stepping stone towards fully resistant deep learning models. Code and pre-trained models are available at this https URL and this https URL.

...read moreread less

3,581 citations

Journal Article•DOI•

Identifying Medical Diagnoses and Treatable Diseases by Image-Based Deep Learning

[...]

Daniel S. Kermany¹, Daniel S. Kermany², Michael H. Goldbaum¹, Wenjia Cai¹, Carolina C. S. Valentim¹, Huiying Liang², Sally L. Baxter¹, Alex McKeown, Ge Yang¹, Xiaokang Wu³, Fangbing Yan³, Justin Dong², Made K. Prasadha¹, Jacqueline Pei¹, Jacqueline Pei², Magdalene Yin Lin Ting¹, Jie Zhu², Christina Li¹, Sierra Hewett¹, Sierra Hewett², Jason Dong², Ian Ziyar¹, Alexander Shi¹, Runze Zhang¹, Lianghong Zheng, Rui Hou, William Shi¹, Xin Fu², Xin Fu¹, Yaou Duan¹, Viet Anh Nguyen Huu¹, Viet Anh Nguyen Huu², Cindy Wen¹, Edward Zhang¹, Edward Zhang², Charlotte Zhang¹, Charlotte Zhang², Oulan Li², Oulan Li¹, Xiaobo Wang, Michael A Singer⁴, Xiaodong Sun⁵, Jie Xu⁶, Ali Tafreshi, M. Anthony Lewis⁷, Huimin Xia², Kang Zhang - Show less +43 more•Institutions (7)

University of California, San Diego¹, Guangzhou Medical University², Sichuan University³, University of Texas Health Science Center at San Antonio⁴, Shanghai Jiao Tong University⁵, Capital Medical University⁶, Qualcomm⁷

22 Feb 2018-Cell

TL;DR: A diagnostic tool based on a deep-learning framework for the screening of patients with common treatable blinding retinal diseases, which demonstrates performance comparable to that of human experts in classifying age-related macular degeneration and diabetic macular edema.

...read moreread less

2,750 citations

Proceedings Article•DOI•

Unsupervised Feature Learning via Non-parametric Instance Discrimination

[...]

Zhirong Wu¹, Yuanjun Xiong², Stella X. Yu¹, Dahua Lin²•Institutions (2)

University of California, Berkeley¹, The Chinese University of Hong Kong²

18 Jun 2018

TL;DR: This work forms this intuition as a non-parametric classification problem at the instance-level, and uses noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes.

...read moreread less

Abstract: Neural net classifiers trained on data with annotated class labels can also capture apparent visual similarity among categories without being directed to do so. We study whether this observation can be extended beyond the conventional domain of supervised learning: Can we learn a good feature representation that captures apparent similarity among instances, instead of classes, by merely asking the feature to be discriminative of individual instances? We formulate this intuition as a non-parametric classification problem at the instance-level, and use noise-contrastive estimation to tackle the computational challenges imposed by the large number of instance classes. Our experimental results demonstrate that, under unsupervised learning settings, our method surpasses the state-of-the-art on ImageNet classification by a large margin. Our method is also remarkable for consistently improving test performance with more training data and better network architectures. By fine-tuning the learned feature, we further obtain competitive results for semi-supervised learning and object detection tasks. Our non-parametric model is highly compact: With 128 features per image, our method requires only 600MB storage for a million images, enabling fast nearest neighbour retrieval at the run time.

...read moreread less

2,533 citations

Book•

Neural Networks and Deep Learning

[...]

Charu C. Aggarwal

01 Jan 2018

2,291 citations

Journal Article•DOI•

Convolutional neural networks: an overview and application in radiology

[...]

Rikiya Yamashita¹, Rikiya Yamashita², Mizuho Nishio², Richard K. G. Do¹, Kaori Togashi² - Show less +1 more•Institutions (2)

Memorial Sloan Kettering Cancer Center¹, Kyoto University²

22 Jun 2018-Insights Into Imaging

TL;DR: A perspective on the basic concepts of convolutional neural network and its application to various radiological tasks is offered, and its challenges and future directions in the field of radiology are discussed.

...read moreread less

Abstract: Convolutional neural network (CNN), a class of artificial neural networks that has become dominant in various computer vision tasks, is attracting interest across a variety of domains, including radiology. CNN is designed to automatically and adaptively learn spatial hierarchies of features through backpropagation by using multiple building blocks, such as convolution layers, pooling layers, and fully connected layers. This review article offers a perspective on the basic concepts of CNN and its application to various radiological tasks, and discusses its challenges and future directions in the field of radiology. Two challenges in applying CNN to radiological tasks, small dataset and overfitting, will also be covered in this article, as well as techniques to minimize them. Being familiar with the concepts and advantages, as well as limitations, of CNN is essential to leverage its potential in diagnostic radiology, with the goal of augmenting the performance of radiologists and improving patient care. • Convolutional neural network is a class of deep learning methods which has become dominant in various computer vision tasks and is attracting interest across a variety of domains, including radiology. • Convolutional neural network is composed of multiple building blocks, such as convolution layers, pooling layers, and fully connected layers, and is designed to automatically and adaptively learn spatial hierarchies of features through a backpropagation algorithm. • Familiarity with the concepts and advantages, as well as limitations, of convolutional neural network is essential to leverage its potential to improve radiologist performance and, eventually, patient care.

...read moreread less

Posted Content•

Relational inductive biases, deep learning, and graph networks

[...]

04 Jun 2018-arXiv: Learning

TL;DR: It is argued that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective.

...read moreread less

Abstract: Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one's experiences--a hallmark of human intelligence from infancy--remains a formidable challenge for modern AI. The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between "hand-engineering" and "end-to-end" learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias--the graph network--which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.

...read moreread less

Proceedings Article•DOI•

Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

[...]

Benoit Jacob¹, Skirmantas Kligys¹, Bo Chen¹, Menglong Zhu¹, Matthew Tang¹, Andrew Howard¹, Hartwig Adam¹, Dmitry Kalenichenko¹ - Show less +4 more•Institutions (1)

Google¹

18 Jun 2018

TL;DR: A quantization scheme is proposed that allows inference to be carried out using integer- only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware.

...read moreread less

Abstract: The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.

...read moreread less

Journal Article•DOI•

Deep Learning for Computer Vision: A Brief Review.

[...]

Athanasios Voulodimos¹, Nikolaos Doulamis², Anastasios Doulamis², Eftychios Protopapadakis²•Institutions (2)

Technological Educational Institute of Athens¹, National Technical University of Athens²

01 Feb 2018-Computational Intelligence and Neuroscience

TL;DR: A brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders are provided.

...read moreread less

Abstract: Over the last years deep learning methods have been shown to outperform previous state-of-the-art machine learning techniques in several fields, with computer vision being one of the most prominent cases. This review paper provides a brief overview of some of the most significant deep learning schemes used in computer vision problems, that is, Convolutional Neural Networks, Deep Boltzmann Machines and Deep Belief Networks, and Stacked Denoising Autoencoders. A brief account of their history, structure, advantages, and limitations is given, followed by a description of their applications in various computer vision tasks, such as object detection, face recognition, action and activity recognition, and human pose estimation. Finally, a brief overview is given of future directions in designing deep learning schemes for computer vision problems and the challenges involved therein.

...read moreread less

Journal Article•DOI•

Methods for interpreting and understanding deep neural networks

[...]

Grégoire Montavon¹, Wojciech Samek², Klaus-Robert Müller¹, Klaus-Robert Müller³, Klaus-Robert Müller⁴ - Show less +1 more•Institutions (4)

Technical University of Berlin¹, Heinrich Hertz Institute², Max Planck Society³, Korea University⁴

01 Feb 2018-Digital Signal Processing

TL;DR: The second part of the tutorial focuses on the recently proposed layer-wise relevance propagation (LRP) technique, for which the author provides theory, recommendations, and tricks, to make most efficient use of it on real data.

...read moreread less

Journal Article•DOI•

Learning without Forgetting

[...]

Zhizhong Li¹, Derek Hoiem¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

01 Dec 2018-IEEE Transactions on Pattern Analysis and Machine Intelligence

TL;DR: In this paper, the authors propose a Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities, which performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques.

...read moreread less

Abstract: When building a unified vision system or gradually adding new apabilities to a system, the usual assumption is that training data for all tasks is always available. However, as the number of tasks grows, storing and retraining on such data becomes infeasible. A new problem arises where we add new capabilities to a Convolutional Neural Network (CNN), but the training data for its existing capabilities are unavailable. We propose our Learning without Forgetting method, which uses only new task data to train the network while preserving the original capabilities. Our method performs favorably compared to commonly used feature extraction and fine-tuning adaption techniques and performs similarly to multitask learning that uses original task data we assume unavailable. A more surprising observation is that Learning without Forgetting may be able to replace fine-tuning with similar old and new task datasets for improved new task performance.

...read moreread less

Proceedings Article•

Neural Tangent Kernel: Convergence and Generalization in Neural Networks

[...]

Arthur Jacot¹, Franck Gabriel², Clément Hongler¹•Institutions (2)

École Polytechnique Fédérale de Lausanne¹, Imperial College London²

20 Jun 2018

TL;DR: This talk will introduce this formalism and give a number of results on the Neural Tangent Kernel and explain how they give us insight into the dynamics of neural networks during training and into their generalization features.

...read moreread less

Abstract: At initialization, artificial neural networks (ANNs) are equivalent to Gaussian processes in the infinite-width limit, thus connecting them to kernel methods. We prove that the evolution of an ANN during training can also be described by a kernel: during gradient descent on the parameters of an ANN, the network function (which maps input vectors to output vectors) follows the so-called kernel gradient associated with a new object, which we call the Neural Tangent Kernel (NTK). This kernel is central to describe the generalization features of ANNs. While the NTK is random at initialization and varies during training, in the infinite-width limit it converges to an explicit limiting kernel and stays constant during training. This makes it possible to study the training of ANNs in function space instead of parameter space. Convergence of the training can then be related to the positive-definiteness of the limiting NTK. We then focus on the setting of least-squares regression and show that in the infinite-width limit, the network function follows a linear differential equation during training. The convergence is fastest along the largest kernel principal components of the input data with respect to the NTK, hence suggesting a theoretical motivation for early stopping. Finally we study the NTK numerically, observe its behavior for wide networks, and compare it to the infinite-width limit.

...read moreread less

Journal Article•DOI•

Road Extraction by Deep Residual U-Net

[...]

Zhengxin Zhang¹, Qingjie Liu¹, Yunhong Wang¹•Institutions (1)

Beihang University¹

08 Mar 2018-IEEE Geoscience and Remote Sensing Letters

TL;DR: A semantic segmentation neural network, which combines the strengths of residual learning and U-Net, is proposed for road area extraction, which outperforms all the comparing methods and demonstrates its superiority over recently developed state of the arts methods.

...read moreread less

Abstract: Road extraction from aerial images has been a hot research topic in the field of remote sensing image analysis. In this letter, a semantic segmentation neural network, which combines the strengths of residual learning and U-Net, is proposed for road area extraction. The network is built with residual units and has similar architecture to that of U-Net. The benefits of this model are twofold: first, residual units ease training of deep networks. Second, the rich skip connections within the network could facilitate information propagation, allowing us to design networks with fewer parameters, however, better performance. We test our network on a public road data set and compare it with U-Net and other two state-of-the-art deep-learning-based road extraction methods. The proposed approach outperforms all the comparing methods, which demonstrates its superiority over recently developed state of the arts.

...read moreread less

Book Chapter•DOI•

A Survey on Deep Transfer Learning

[...]

Chuanqi Tan¹, Fuchun Sun¹, Tao Kong¹, Wenchang Zhang¹, Chao Yang¹, Chunfang Liu¹ - Show less +2 more•Institutions (1)

Tsinghua University¹

04 Oct 2018

TL;DR: Deep transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates researchers to use transfer learning to solve the problem of insufficient training data as mentioned in this paper.

...read moreread less

Abstract: As a new classification platform, deep learning has recently received increasing attention from researchers and has been successfully applied to many domains. In some domains, like bioinformatics and robotics, it is very difficult to construct a large-scale well-annotated dataset due to the expense of data acquisition and costly annotation, which limits its development. Transfer learning relaxes the hypothesis that the training data must be independent and identically distributed (i.i.d.) with the test data, which motivates us to use transfer learning to solve the problem of insufficient training data. This survey focuses on reviewing the current researches of transfer learning by using deep neural network and its applications. We defined deep transfer learning, category and review the recent research works based on the techniques used in deep transfer learning.

...read moreread less

Proceedings Article•DOI•

Maximum Classifier Discrepancy for Unsupervised Domain Adaptation

[...]

Kuniaki Saito¹, Kohei Watanabe¹, Yoshitaka Ushiku¹, Tatsuya Harada¹•Institutions (1)

University of Tokyo¹

18 Jun 2018

TL;DR: MCD-DA as discussed by the authors aligns distributions of source and target by utilizing the task-specific decision boundaries between classes to detect target samples that are far from the support of the source.

...read moreread less

Abstract: In this work, we present a method for unsupervised domain adaptation. Many adversarial learning methods train domain classifier networks to distinguish the features as either a source or target and train a feature generator network to mimic the discriminator. Two problems exist with these methods. First, the domain classifier only tries to distinguish the features as a source or target and thus does not consider task-specific decision boundaries between classes. Therefore, a trained generator can generate ambiguous features near class boundaries. Second, these methods aim to completely match the feature distributions between different domains, which is difficult because of each domain's characteristics. To solve these problems, we introduce a new approach that attempts to align distributions of source and target by utilizing the task-specific decision boundaries. We propose to maximize the discrepancy between two classifiers' outputs to detect target samples that are far from the support of the source. A feature generator learns to generate target features near the support to minimize the discrepancy. Our method outperforms other methods on several datasets of image classification and semantic segmentation. The codes are available at https://github.com/mil-tokyo/MCD_DA

...read moreread less

Journal Article•DOI•

State-of-the-art in artificial neural network applications: A survey

[...]

Oludare Isaac Abiodun¹, Oludare Isaac Abiodun², Aman Jantan¹, Abiodun Esther Omolara³, Kemi Victoria Dada⁴, Nachaat AbdElatif Mohamed⁵, Humaira Arshad⁶ - Show less +3 more•Institutions (6)

Universiti Sains Malaysia¹, ECWA Bingham University², Ahmadu Bello University³, Nasarawa State University⁴, University of California, Berkeley⁵, Islamia University⁶

01 Nov 2018-Heliyon

TL;DR: The study found that neural-network models such as feedforward and feedback propagation artificial neural networks are performing better in its application to human problems and proposed feedforwardand feedback propagation ANN models for research focus based on data analysis factors like accuracy, processing speed, latency, fault tolerance, volume, scalability, convergence, and performance.

...read moreread less

Proceedings Article•DOI•

Deep Image Prior

[...]

Victor Lempitsky¹, Andrea Vedaldi², Dmitry Ulyanov¹•Institutions (2)

Skolkovo Institute of Science and Technology¹, University of Oxford²

18 Jun 2018

TL;DR: It is shown that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, superresolution, and inpainting.

...read moreread less

Abstract: Deep convolutional networks have become a popular tool for image generation and restoration. Generally, their excellent performance is imputed to their ability to learn realistic image priors from a large number of example images. In this paper, we show that, on the contrary, the structure of a generator network is sufficient to capture a great deal of low-level image statistics prior to any learning. In order to do so, we show that a randomly-initialized neural network can be used as a handcrafted prior with excellent results in standard inverse problems such as denoising, superresolution, and inpainting. Furthermore, the same prior can be used to invert deep neural representations to diagnose them, and to restore images based on flash-no flash input pairs. Apart from its diverse applications, our approach highlights the inductive bias captured by standard generator network architectures. It also bridges the gap between two very popular families of image restoration methods: learning-based methods using deep convolutional networks and learning-free methods based on handcrafted image priors such as self-similarity.

...read moreread less

Journal Article•DOI•

Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection

[...]

Sergey Levine¹, Peter Pastor, Alex Krizhevsky¹, Julian Ibarz¹, Deirdre Quillen¹ - Show less +1 more•Institutions (1)

Google¹

01 Apr 2018-The International Journal of Robotics Research

TL;DR: The approach achieves effective real-time control, can successfully grasp novel objects, and corrects mistakes by continuous servoing, and illustrates that data from different robots can be combined to learn more reliable and effective grasping.

...read moreread less

Abstract: We describe a learning-based approach to hand-eye coordination for robotic grasping from monocular images. To learn hand-eye coordination for grasping, we trained a large convolutional neural netwo...

...read moreread less

Posted Content•

Deep Clustering for Unsupervised Learning of Visual Features

[...]

Mathilde Caron¹, Piotr Bojanowski¹, Armand Joulin¹, Matthijs Douze¹•Institutions (1)

Facebook¹

15 Jul 2018-arXiv: Computer Vision and Pattern Recognition

TL;DR: This work presents DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features and outperforms the current state of the art by a significant margin on all the standard benchmarks.

...read moreread less

Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.

...read moreread less

Journal Article•DOI•

Image reconstruction by domain-transform manifold learning

[...]

Bo Zhu¹, Jeremiah Zhe Liu¹, Stephen F. Cauley¹, Bruce R. Rosen¹, Matthew S. Rosen¹ - Show less +1 more•Institutions (1)

Harvard University¹

21 Mar 2018-Nature

TL;DR: A unified framework for image reconstruction—automated transform by manifold approximation (AUTOMAP)—which recasts image reconstruction as a data-driven supervised learning task that allows a mapping between the sensor and the image domain to emerge from an appropriate corpus of training data is presented.

...read moreread less

Abstract: Image reconstruction is essential for imaging applications across the physical and life sciences, including optical and radar systems, magnetic resonance imaging, X-ray computed tomography, positron emission tomography, ultrasound imaging and radio astronomy. During image acquisition, the sensor encodes an intermediate representation of an object in the sensor domain, which is subsequently reconstructed into an image by an inversion of the encoding function. Image reconstruction is challenging because analytic knowledge of the exact inverse transform may not exist a priori, especially in the presence of sensor non-idealities and noise. Thus, the standard reconstruction approach involves approximating the inverse function with multiple ad hoc stages in a signal processing chain, the composition of which depends on the details of each acquisition strategy, and often requires expert parameter tuning to optimize reconstruction performance. Here we present a unified framework for image reconstruction-automated transform by manifold approximation (AUTOMAP)-which recasts image reconstruction as a data-driven supervised learning task that allows a mapping between the sensor and the image domain to emerge from an appropriate corpus of training data. We implement AUTOMAP with a deep neural network and exhibit its flexibility in learning reconstruction transforms for various magnetic resonance imaging acquisition strategies, using the same network architecture and hyperparameters. We further demonstrate that manifold learning during training results in sparse representations of domain transforms along low-dimensional data manifolds, and observe superior immunity to noise and a reduction in reconstruction artefacts compared with conventional handcrafted reconstruction methods. In addition to improving the reconstruction performance of existing acquisition methodologies, we anticipate that AUTOMAP and other learned reconstruction approaches will accelerate the development of new acquisition strategies across imaging modalities.

...read moreread less

Book Chapter•DOI•

Deep Clustering for Unsupervised Learning of Visual Features

[...]

Mathilde Caron¹, Piotr Bojanowski¹, Armand Joulin¹, Matthijs Douze¹•Institutions (1)

Facebook¹

08 Sep 2018

TL;DR: DeepCluster as discussed by the authors is a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features, and uses the subsequent assignments as supervision to update the weights of the network.

...read moreread less

Abstract: Clustering is a class of unsupervised learning methods that has been extensively applied and studied in computer vision. Little work has been done to adapt it to the end-to-end training of visual features on large-scale datasets. In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features. DeepCluster iteratively groups the features with a standard clustering algorithm, k-means, and uses the subsequent assignments as supervision to update the weights of the network. We apply DeepCluster to the unsupervised training of convolutional neural networks on large datasets like ImageNet and YFCC100M. The resulting model outperforms the current state of the art by a significant margin on all the standard benchmarks.

...read moreread less

Posted Content•

Federated Learning with Non-IID Data.

[...]

Yue Zhao, Meng Li, Liangzhen Lai, Naveen Suda, Damon Civin, Vikas Chandra - Show less +2 more

02 Jun 2018-arXiv: Learning

TL;DR: This work presents a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices, and shows that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data.

...read moreread less

Abstract: Federated learning enables resource-constrained edge compute devices, such as mobile phones and IoT devices, to learn a shared model for prediction, while keeping the training data local. This decentralized approach to train models provides privacy, security, regulatory and economic benefits. In this work, we focus on the statistical challenge of federated learning when local data is non-IID. We first show that the accuracy of federated learning reduces significantly, by up to 55% for neural networks trained for highly skewed non-IID data, where each client device trains only on a single class of data. We further show that this accuracy reduction can be explained by the weight divergence, which can be quantified by the earth mover's distance (EMD) between the distribution over classes on each device and the population distribution. As a solution, we propose a strategy to improve training on non-IID data by creating a small subset of data which is globally shared between all the edge devices. Experiments show that accuracy can be increased by 30% for the CIFAR-10 dataset with only 5% globally shared data.

...read moreread less

Journal Article•DOI•

Solving high-dimensional partial differential equations using deep learning

[...]

Jiequn Han¹, Arnulf Jentzen², Weinan E¹•Institutions (2)

Princeton University¹, ETH Zurich²

21 Aug 2018-Proceedings of the National Academy of Sciences of the United States of America

TL;DR: A deep learning-based approach that can handle general high-dimensional parabolic PDEs using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function.

...read moreread less

Abstract: Developing algorithms for solving high-dimensional partial differential equations (PDEs) has been an exceedingly difficult task for a long time, due to the notoriously difficult problem known as the “curse of dimensionality.” This paper introduces a deep learning-based approach that can handle general high-dimensional parabolic PDEs. To this end, the PDEs are reformulated using backward stochastic differential equations and the gradient of the unknown solution is approximated by neural networks, very much in the spirit of deep reinforcement learning with the gradient acting as the policy function. Numerical results on examples including the nonlinear Black–Scholes equation, the Hamilton–Jacobi–Bellman equation, and the Allen–Cahn equation suggest that the proposed algorithm is quite effective in high dimensions, in terms of both accuracy and cost. This opens up possibilities in economics, finance, operational research, and physics, by considering all participating agents, assets, resources, or particles together at the same time, instead of making ad hoc assumptions on their interrelationships.

...read moreread less

Proceedings Article•DOI•

Single-Shot Refinement Neural Network for Object Detection

[...]

Shifeng Zhang¹, Longyin Wen², Xiao Bian², Zhen Lei¹, Stan Z. Li³ - Show less +1 more•Institutions (3)

Chinese Academy of Sciences¹, General Electric², Macau University of Science and Technology³

18 Jun 2018

TL;DR: RefineDet as discussed by the authors proposes an anchor refinement module and an object detection module to adjust the locations and sizes of anchors to provide better initialization for the subsequent regressor, which achieves state-of-the-art detection accuracy with high efficiency.

...read moreread less

Abstract: For object detection, the two-stage approach (e.g., Faster R-CNN) has been achieving the highest accuracy, whereas the one-stage approach (e.g., SSD) has the advantage of high efficiency. To inherit the merits of both while overcoming their disadvantages, in this paper, we propose a novel single-shot based detector, called RefineDet, that achieves better accuracy than two-stage methods and maintains comparable efficiency of one-stage methods. RefineDet consists of two inter-connected modules, namely, the anchor refinement module and the object detection module. Specifically, the former aims to (1) filter out negative anchors to reduce search space for the classifier, and (2) coarsely adjust the locations and sizes of anchors to provide better initialization for the subsequent regressor. The latter module takes the refined anchors as the input from the former to further improve the regression accuracy and predict multi-class label. Meanwhile, we design a transfer connection block to transfer the features in the anchor refinement module to predict locations, sizes and class labels of objects in the object detection module. The multitask loss function enables us to train the whole network in an end-to-end way. Extensive experiments on PASCAL VOC 2007, PASCAL VOC 2012, and MS COCO demonstrate that RefineDet achieves state-of-the-art detection accuracy with high efficiency. Code is available at https://github.com/sfzhang15/RefineDet.

...read moreread less

Proceedings Article•

ProxylessNAS: Direct Neural Architecture Search on Target Task and Hardware

[...]

Han Cai¹, Ligeng Zhu², Song Han³•Institutions (3)

Shanghai Jiao Tong University¹, Simon Fraser University², Massachusetts Institute of Technology³

02 Dec 2018

TL;DR: ProxylessNAS is presented, which can directly learn the architectures for large-scale target tasks and target hardware platforms and apply ProxylessNAS to specialize neural architectures for hardware with direct hardware metrics (e.g. latency) and provide insights for efficient CNN architecture design.

...read moreread less

Abstract: Neural architecture search (NAS) has a great impact by automatically designing effective neural network architectures. However, the prohibitive computational demand of conventional NAS algorithms (e.g. $10^4$ GPU hours) makes it difficult to \emph{directly} search the architectures on large-scale tasks (e.g. ImageNet). Differentiable NAS can reduce the cost of GPU hours via a continuous representation of network architecture but suffers from the high GPU memory consumption issue (grow linearly w.r.t. candidate set size). As a result, they need to utilize~\emph{proxy} tasks, such as training on a smaller dataset, or learning with only a few blocks, or training just for a few epochs. These architectures optimized on proxy tasks are not guaranteed to be optimal on the target task. In this paper, we present \emph{ProxylessNAS} that can \emph{directly} learn the architectures for large-scale target tasks and target hardware platforms. We address the high memory consumption issue of differentiable NAS and reduce the computational cost (GPU hours and GPU memory) to the same level of regular training while still allowing a large candidate set. Experiments on CIFAR-10 and ImageNet demonstrate the effectiveness of directness and specialization. On CIFAR-10, our model achieves 2.08\% test error with only 5.7M parameters, better than the previous state-of-the-art architecture AmoebaNet-B, while using 6$\times$ fewer parameters. On ImageNet, our model achieves 3.1\% better top-1 accuracy than MobileNetV2, while being 1.2$\times$ faster with measured GPU latency. We also apply ProxylessNAS to specialize neural architectures for hardware with direct hardware metrics (e.g. latency) and provide insights for efficient CNN architecture design.

...read moreread less

Collapse