
Showing papers on "Robustness (computer science)" published in 2018


Proceedings ArticleDOI
18 Jun 2018
TL;DR: StarGAN as discussed by the authors proposes a unified model architecture to perform image-to-image translation for multiple domains using only a single model, which leads to superior quality of translated images compared to existing models as well as the capability of flexibly translating an input image to any desired target domain.
Abstract: Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
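A minimal sketch (not the authors' code) of the conditioning mechanism such a single-generator, multi-domain setup relies on: the target-domain label is tiled spatially and concatenated to the input image's channels before being fed to the generator. The function name and tensor shapes below are illustrative assumptions.

```python
import torch

def concat_domain_label(images, labels, num_domains):
    """images: (N, C, H, W); labels: (N,) integer target-domain ids."""
    n, _, h, w = images.shape
    one_hot = torch.zeros(n, num_domains, device=images.device)
    one_hot.scatter_(1, labels.view(-1, 1), 1.0)
    # Tile the one-hot label over the spatial dimensions and append it as extra channels.
    label_map = one_hot.view(n, num_domains, 1, 1).expand(n, num_domains, h, w)
    return torch.cat([images, label_map], dim=1)  # (N, C + num_domains, H, W)

x = torch.randn(4, 3, 128, 128)                       # a batch of input images
y = torch.tensor([0, 2, 1, 4])                        # desired target domain per image
g_input = concat_domain_label(x, y, num_domains=5)    # what the generator would consume
```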

2,479 citations


Posted Content
TL;DR: Co-teaching as discussed by the authors trains two deep neural networks simultaneously, and lets them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, the two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself.
Abstract: Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they would first memorize training data of clean labels and then those of noisy labels. Therefore in this paper, we propose a new deep learning paradigm called Co-teaching for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.
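The training loop described above translates into a short update step. Below is a minimal sketch under assumed interfaces (two PyTorch classifiers and their optimizers), not the authors' implementation; in the paper the kept fraction also shrinks over epochs, since networks memorize noisy labels later in training.

```python
import torch
import torch.nn.functional as F

def co_teaching_step(net1, net2, opt1, opt2, x, y, keep_ratio):
    """One mini-batch of Co-teaching: select small-loss samples, then cross-update."""
    num_keep = max(1, int(keep_ratio * x.size(0)))

    # Per-sample losses; small loss is taken as evidence of a clean label.
    with torch.no_grad():
        loss1 = F.cross_entropy(net1(x), y, reduction="none")
        loss2 = F.cross_entropy(net2(x), y, reduction="none")
    idx1 = torch.argsort(loss1)[:num_keep]  # samples net1 trusts
    idx2 = torch.argsort(loss2)[:num_keep]  # samples net2 trusts

    # Each network is updated only on the samples selected by its peer.
    opt1.zero_grad()
    F.cross_entropy(net1(x[idx2]), y[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x[idx1]), y[idx1]).backward()
    opt2.step()
```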

866 citations


Proceedings ArticleDOI
Yuhua Chen1, Wen Li1, Christos Sakaridis1, Dengxin Dai1, Luc Van Gool1 
08 Mar 2018
TL;DR: Chen et al. as discussed by the authors designed two domain adaptation components, on image level and instance level, to reduce the domain discrepancy in Faster R-CNN, which are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner.
Abstract: Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

843 citations


Proceedings ArticleDOI
20 May 2018
TL;DR: This work presents AI2, the first sound and scalable analyzer for deep neural networks, and introduces abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers.
Abstract: We present AI2, the first sound and scalable analyzer for deep neural networks. Based on overapproximation, AI2 can automatically prove safety properties (e.g., robustness) of realistic neural networks (e.g., convolutional neural networks). The key insight behind AI2 is to phrase reasoning about safety and robustness of neural networks in terms of classic abstract interpretation, enabling us to leverage decades of advances in that area. Concretely, we introduce abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers. This allows us to handle real-world neural networks, which are often built out of those types of layers. We present a complete implementation of AI2 together with an extensive evaluation on 20 neural networks. Our results demonstrate that: (i) AI2 is precise enough to prove useful specifications (e.g., robustness), (ii) AI2 can be used to certify the effectiveness of state-of-the-art defenses for neural networks, (iii) AI2 is significantly faster than existing analyzers based on symbolic analysis, which often take hours to verify simple fully connected networks, and (iv) AI2 can handle deep convolutional networks, which are beyond the reach of existing methods.
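As an illustration of the certification idea (hedged: AI2 itself uses more precise abstract domains such as zonotopes, not plain intervals), the sketch below propagates interval bounds through an affine layer and a ReLU; if the resulting output bounds keep the true class above all others for every input in the perturbation region, robustness is proved.

```python
import numpy as np

def affine_interval(lo, hi, W, b):
    """Sound bounds on W @ x + b for all x with lo <= x <= hi (elementwise)."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def relu_interval(lo, hi):
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Bounds for every input inside an l_inf ball of radius eps around x.
x, eps = np.array([0.2, -0.1]), 0.05
lo, hi = x - eps, x + eps
W, b = np.array([[1.0, -2.0], [0.5, 3.0]]), np.array([0.1, -0.2])
lo, hi = relu_interval(*affine_interval(lo, hi, W, b))
print(lo, hi)  # every reachable activation of this layer lies in [lo, hi]
```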

841 citations


Proceedings Article
27 Sep 2018
TL;DR: It is shown that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization, and it is argued that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers.
Abstract: We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.

822 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting is presented, which exploits redundancies in large deep networks to free up parameters that can then be employed to learn new tasks.
Abstract: This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting. Inspired by network pruning techniques, we exploit redundancies in large deep networks to free up parameters that can then be employed to learn new tasks. By performing iterative pruning and network re-training, we are able to sequentially "pack" multiple tasks into a single network while ensuring minimal drop in performance and minimal storage overhead. Unlike prior work that uses proxy losses to maintain accuracy on older tasks, we always optimize for the task at hand. We perform extensive experiments on a variety of network architectures and large-scale datasets, and observe much better robustness against catastrophic forgetting than prior work. In particular, we are able to add three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieve accuracies close to those of separately trained networks for each task.
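A minimal sketch of the pruning step that frees capacity for a new task (hypothetical helper, not the paper's code): within the weights owned by the current task, the smallest-magnitude fraction is zeroed out and handed to the next task, while weights assigned to earlier tasks are left untouched.

```python
import torch

def prune_for_new_task(weight, owned_mask, prune_fraction):
    """weight: a layer's weight tensor; owned_mask: True where an earlier task
    already owns the weight. Returns the updated ownership mask."""
    free = ~owned_mask                       # weights still belonging to the current task
    magnitudes = weight[free].abs()
    k = int(prune_fraction * magnitudes.numel())
    if k > 0:
        threshold = magnitudes.kthvalue(k).values
        released = free & (weight.abs() <= threshold)
        weight.data[released] = 0.0          # freed slots: retrained by the next task
        owned_mask = owned_mask | (free & ~released)  # the rest stays with this task
    return owned_mask
```

After the freed weights are retrained on the new task (with gradients masked so earlier tasks' weights stay frozen), the same prune-and-retrain cycle repeats for each subsequent task.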

803 citations


Proceedings Article
29 Jan 2018
TL;DR: This work proposes a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.
Abstract: While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but often followed by new, stronger attacks that defeat these defenses. Can we somehow end this arms race? In this work, we study this problem for neural networks with one hidden layer. We first propose a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value. Second, as this certificate is differentiable, we jointly optimize it with the network parameters, providing an adaptive regularizer that encourages robustness against all attacks. On MNIST, our approach produces a network and a certificate that no attack that perturbs each pixel by at most \epsilon = 0.1 can cause more than 35% test error.

758 citations


Proceedings ArticleDOI
06 Mar 2018
TL;DR: GeoNet as mentioned in this paper proposes an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively.
Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and egomotion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.

725 citations


Journal ArticleDOI
TL;DR: SAF R-CNN as discussed by the authors introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges, and outputs from all of the sub-networks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales.
Abstract: In this paper, we consider the problem of pedestrian detection in natural scenes. Intuitively, instances of pedestrians with different spatial scales may exhibit dramatically different features. Thus, large variance in instance scales, which results in undesirable large intracategory variance in features, may severely hurt the performance of modern object instance detection methods. We argue that this issue can be substantially alleviated by the divide-and-conquer philosophy. Taking pedestrian detection as an example, we illustrate how we can leverage this philosophy to develop a Scale-Aware Fast R-CNN (SAF R-CNN) framework. The model introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges. Outputs from all of the subnetworks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales, via a gate function defined over the sizes of object proposals. Extensive evaluations on several challenging pedestrian detection datasets well demonstrate the effectiveness of the proposed SAF R-CNN. Particularly, our method achieves state-of-the-art performance on Caltech [P. Dollar, C. Wojek, B. Schiele, and P. Perona, “Pedestrian detection: An evaluation of the state of the art,” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 34, no. 4, pp. 743–761, Apr. 2012], and obtains competitive results on INRIA [N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. , 2005, pp. 886–893], ETH [A. Ess, B. Leibe, and L. V. Gool, “Depth and appearance for mobile scene analysis,” in Proc. Int. Conf. Comput. Vis ., 2007, pp. 1–8], and KITTI [A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit ., 2012, pp. 3354–3361].

716 citations


Proceedings ArticleDOI
01 Jan 2018
TL;DR: Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.
Abstract: Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they would first memorize training data of clean labels and then those of noisy labels. Therefore in this paper, we propose a new deep learning paradigm called ''Co-teaching'' for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.

657 citations


Proceedings Article
15 Feb 2018
TL;DR: Two approaches to increase model robustness are explored, structure-invariant word representations and robust training on noisy texts, and it is found that a model based on a character convolutional neural network is able to simultaneously learn representations robust to multiple kinds of noise.
Abstract: Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems. Unfortunately, they are also very brittle and easily falter when presented with noisy data. In this paper, we confront NMT models with synthetic and natural sources of noise. We find that state-of-the-art models fail to translate even moderately noisy texts that humans have no trouble comprehending. We explore two approaches to increase model robustness: structure-invariant word representations and robust training on noisy texts. We find that a model based on a character convolutional neural network is able to simultaneously learn representations robust to multiple kinds of noise.
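A minimal sketch of the kind of synthetic character noise used to stress-test such models (whitespace tokenization is an assumption here, and these are only two of the noise types studied): swapping one adjacent pair of interior characters, and scrambling a word's interior while keeping its first and last characters.

```python
import random

def swap_noise(word):
    """Swap one adjacent pair of interior characters."""
    if len(word) < 4:
        return word
    i = random.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def scramble_noise(word):
    """Shuffle all interior characters, keeping the first and last in place."""
    if len(word) < 4:
        return word
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def noisy_sentence(sentence, noise_fn, prob=0.5):
    return " ".join(noise_fn(w) if random.random() < prob else w
                    for w in sentence.split())

print(noisy_sentence("the quick brown fox jumps over the lazy dog", scramble_noise))
```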

Journal ArticleDOI
TL;DR: A convolutional neural network is proposed to detect crack patches in each video frame, while the proposed data fusion scheme maintains the spatiotemporal coherence of cracks in videos, and the Naïve Bayes decision making discards false positives effectively.
Abstract: Regular inspection of nuclear power plant components is important to guarantee safe operations. However, current practice is time consuming, tedious, and subjective, which involves human technicians reviewing the inspection videos and identifying cracks on reactors. A few vision-based crack detection approaches have been developed for metallic surfaces, and they typically perform poorly when used for analyzing nuclear inspection videos. Detecting these cracks is a challenging task since they are tiny, and noisy patterns exist on the components’ surfaces. This study proposes a deep learning framework, based on a convolutional neural network (CNN) and a Naive Bayes data fusion scheme, called NB-CNN, to analyze individual video frames for crack detection while a novel data fusion scheme is proposed to aggregate the information extracted from each video frame to enhance the overall performance and robustness of the system. To this end, a CNN is proposed to detect crack patches in each video frame, while the proposed data fusion scheme maintains the spatiotemporal coherence of cracks in videos, and the Naive Bayes decision making discards false positives effectively. The proposed framework achieves a 98.3% hit rate against 0.1 false positives per frame that is significantly higher than state-of-the-art approaches as presented in this paper.

Proceedings Article
27 Sep 2018
TL;DR: Verification of piecewise-linear neural networks as a mixed integer program that is able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack for every network.
Abstract: Neural networks have demonstrated considerable success on a wide variety of real-world problems. However, neural networks can be fooled by adversarial examples – slightly perturbed inputs that are misclassified with high confidence. Verification of networks enables us to gauge their vulnerability to such adversarial examples. We formulate verification of piecewise-linear neural networks as a mixed integer program. Our verifier finds minimum adversarial distortions two to three orders of magnitude more quickly than the state-of-the-art. We achieve this via tight formulations for non-linearities, as well as a novel presolve algorithm that makes full use of all information available. The computational speedup enables us to verify properties on convolutional networks with an order of magnitude more ReLUs than had been previously verified by any complete verifier, and we determine for the first time the exact adversarial accuracy of an MNIST classifier to perturbations with bounded $l_\infty$ norm $\epsilon = 0.1$. On this network, we find an adversarial example for 4.38% of samples, and a certificate of robustness for the remainder. Across a variety of robust training procedures, we are able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack for every network.
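For reference, the standard mixed-integer encoding of a single ReLU (a textbook big-M formulation stated here informally, not the paper's exact tightened constraints) looks as follows, where the pre-activation is known to lie in [l, u] with l < 0 < u and z is a binary variable; sharper bounds l, u directly give a tighter program, which is what the presolve step targets.

```latex
\begin{aligned}
  y &\ge x, & y &\ge 0, \\
  y &\le x - l\,(1 - z), & y &\le u\,z, \\
  z &\in \{0, 1\},
\end{aligned}
\qquad \text{so } z = 1 \Rightarrow y = x \text{ and } z = 0 \Rightarrow y = 0 .
```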

Journal ArticleDOI
TL;DR: Simulation results corroborate that the proposed deep learning based scheme can achieve better performance in terms of the DOA estimation and the channel estimation compared with conventional methods, and the proposed scheme is further evaluated by extensive simulations in various cases to test its robustness.
Abstract: The recent concept of massive multiple-input multiple-output (MIMO) can significantly improve the capacity of the communication network, and it has been regarded as a promising technology for the next-generation wireless communications. However, the fundamental challenge of existing massive MIMO systems is that high computational complexity and complicated spatial structures bring great difficulties to exploit the characteristics of the channel and the sparsity of these multi-antenna systems. To address this problem, in this paper, we focus on channel estimation and direction-of-arrival (DOA) estimation, and a novel framework that integrates massive MIMO into deep learning is proposed. To realize end-to-end performance, a deep neural network (DNN) is employed to conduct offline and online learning procedures, which is effective to learn the statistics of the wireless channel and the spatial structures in the angle domain. Concretely, the DNN is first trained by simulated data in different channel conditions with the aid of offline learning, and then corresponding output data can be obtained based on current input data during the online learning process. In order to realize super-resolution channel estimation and DOA estimation, two algorithms based on deep learning are developed, in which the DOA can be estimated directly in the angle domain without additional complexity. Furthermore, simulation results corroborate that the proposed deep learning based scheme can achieve better performance in terms of the DOA estimation and the channel estimation compared with conventional methods, and the proposed scheme is further evaluated by extensive simulations in various cases to test its robustness.

Journal ArticleDOI
TL;DR: The analysis shows that augmenting off-board information to sensory information has the potential to enable low-cost localization systems with high accuracy and robustness; however, their performance depends on the penetration rate of nearby connected vehicles or infrastructure and the quality of network service.
Abstract: For an autonomous vehicle to operate safely and effectively, an accurate and robust localization system is essential. While there are a variety of vehicle localization techniques in the literature, there has been little effort to compare these techniques and identify their potentials and limitations for autonomous vehicle applications. Hence, this paper evaluates the state-of-the-art vehicle localization techniques and investigates their applicability to autonomous vehicles. The analysis starts by discussing the techniques which merely use the information obtained from on-board vehicle sensors. It is shown that some techniques can achieve the accuracy required for autonomous driving but suffer from the high cost of the sensors and from sensor performance limitations in different driving scenarios (e.g., cornering and intersections) and different environmental conditions (e.g., darkness and snow). This paper continues the analysis by considering the techniques which benefit from off-board information obtained from V2X communication channels, in addition to vehicle sensory information. The analysis shows that augmenting off-board information to sensory information has the potential to enable low-cost localization systems with high accuracy and robustness; however, their performance depends on the penetration rate of nearby connected vehicles or infrastructure and the quality of network service.

Proceedings Article
15 Feb 2018
TL;DR: A simple modification to standard neural network architectures, thermometer encoding, is proposed, which significantly increases the robustness of the network to adversarial examples, and the properties of these networks are explored, providing evidence that thermometer encodings help neural networks to find more non-linear decision boundaries.
Abstract: It is well known that it is possible to construct "adversarial examples" for neural networks: inputs which are misclassified by the network yet indistinguishable from true data. We propose a simple modification to standard neural network architectures, thermometer encoding, which significantly increases the robustness of the network to adversarial examples. We demonstrate this robustness with experiments on the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets, and show that models with thermometer-encoded inputs consistently have higher accuracy on adversarial examples, without decreasing generalization. State-of-the-art accuracy under the strongest known white-box attack was increased from 93.20% to 94.30% on MNIST and 50.00% to 79.16% on CIFAR-10. We explore the properties of these networks, providing evidence that thermometer encodings help neural networks to find more non-linear decision boundaries.
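A minimal sketch of the encoding itself (not the authors' code): each scalar input value in [0, 1] is replaced by a k-dimensional binary vector whose i-th entry is 1 whenever the value is at least i/k, turning each input channel into a discretized, thermometer-style representation.

```python
import numpy as np

def thermometer_encode(x, k=16):
    """x: array of values in [0, 1]; returns an array with a trailing axis of size k."""
    levels = (np.arange(1, k + 1) / k).reshape((1,) * x.ndim + (k,))
    return (x[..., None] >= levels).astype(np.float32)

pixels = np.array([0.0, 0.3, 0.95])
print(thermometer_encode(pixels, k=4))
# [[0. 0. 0. 0.]
#  [1. 0. 0. 0.]
#  [1. 1. 1. 0.]]
```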

Journal ArticleDOI
TL;DR: This paper proposes a multi-scale strategy for speeding up marker detection in video sequences by wisely selecting the most appropriate scale for detection, identification and corner estimation.

Proceedings Article
25 Apr 2018
TL;DR: It is demonstrated that regularizing input gradients makes them more naturally interpretable as rationales for model predictions, and that networks trained with input gradient regularization also exhibit robustness to transferred adversarial examples generated to fool all of the other models.
Abstract: Deep neural networks have proven remarkably effective at solving many classification problems, but have been criticized recently for two major weaknesses: the reasons behind their predictions are uninterpretable, and the predictions themselves can often be fooled by small adversarial perturbations. These problems pose major obstacles for the adoption of neural networks in domains that require security or transparency. In this work, we evaluate the effectiveness of defenses that differentiably penalize the degree to which small changes in inputs can alter model predictions. Across multiple attacks, architectures, defenses, and datasets, we find that neural networks trained with this input gradient regularization exhibit robustness to transferred adversarial examples generated to fool all of the other models. We also find that adversarial examples generated to fool gradient-regularized models fool all other models equally well, and actually lead to more "legitimate," interpretable misclassifications as rated by people (which we confirm in a human subject experiment). Finally, we demonstrate that regularizing input gradients makes them more naturally interpretable as rationales for model predictions. We conclude by discussing this relationship between interpretability and robustness in deep neural networks.
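A minimal sketch (not the authors' code) of the defense evaluated above: penalize the squared norm of the gradient of the training loss with respect to the inputs, computed via double backpropagation, and add it to the usual task loss.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    x = x.clone().requires_grad_(True)
    task_loss = F.cross_entropy(model(x), y)
    # create_graph=True keeps the graph so the penalty itself can be differentiated
    # with respect to the model parameters (double backpropagation).
    (input_grad,) = torch.autograd.grad(task_loss, x, create_graph=True)
    penalty = input_grad.pow(2).sum() / x.size(0)
    return task_loss + lam * penalty

# Training then calls gradient_regularized_loss(model, x, y).backward() as usual.
```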

Posted Content
TL;DR: This paper presents the first certified defense that both scales to large networks and datasets and applies broadly to arbitrary model types, based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism.
Abstract: Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks, but they either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google's Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired formalism, that provides a rigorous, generic, and flexible foundation for defense.
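Illustrative sketch only (not the PixelDP implementation): the mechanism reduces to adding calibrated noise inside the network and predicting from the expected scores over many noise draws; PixelDP calibrates the noise to the sensitivity of the pre-noise layers so that a differential-privacy argument bounds how much a norm-bounded perturbation can shift those expected scores.

```python
import torch

def noisy_expected_scores(model, x, sigma=0.25, num_draws=100):
    """Monte Carlo estimate of the expected softmax scores under input noise."""
    with torch.no_grad():
        scores = torch.stack([
            torch.softmax(model(x + sigma * torch.randn_like(x)), dim=-1)
            for _ in range(num_draws)
        ])
    return scores.mean(dim=0)  # prediction = argmax of these expected scores
```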

Proceedings Article
15 Feb 2018
TL;DR: The MAC network is presented, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning that is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.
Abstract: We present the MAC network, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning. MAC moves away from monolithic black-box neural architectures towards a design that encourages both transparency and versatility. The model approaches problems by decomposing them into a series of attention-based reasoning steps, each performed by a novel recurrent Memory, Attention, and Composition (MAC) cell that maintains a separation between control and memory. By stringing the cells together and imposing structural constraints that regulate their interaction, MAC effectively learns to perform iterative reasoning processes that are directly inferred from the data in an end-to-end approach. We demonstrate the model's strength, robustness and interpretability on the challenging CLEVR dataset for visual reasoning, achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model. More importantly, we show that the model is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: Qualitative and quantitative evaluations of the PPFNet network suggest increased recall, improved robustness and invariance as well as a vital step in the 3D descriptor extraction performance.
Abstract: We present PPFNet - Point Pair Feature NETwork for deeply learning a globally informed 3D local feature descriptor to find correspondences in unorganized point clouds. PPFNet learns local descriptors on pure geometry and is highly aware of the global context, an important cue in deep learning. Our 3D representation is computed as a collection of point-pair-features combined with the points and normals within a local vicinity. Our permutation invariant network design is inspired by PointNet and sets PPFNet to be ordering-free. As opposed to voxelization, our method is able to consume raw point clouds to exploit the full sparsity. PPFNet uses a novel N-tuple loss and architecture injecting the global information naturally into the local descriptor. It shows that context awareness also boosts the local feature representation. Qualitative and quantitative evaluations of our network suggest increased recall, improved robustness and invariance as well as a vital step in the 3D descriptor extraction performance.
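A minimal sketch (not the authors' code) of the classical point pair feature that the representation builds on: for two oriented points it records the pair distance and the three angles between the normals and the line connecting the points.

```python
import numpy as np

def angle(u, v):
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    """p1, p2: 3D points; n1, n2: their (unit) surface normals."""
    d = p2 - p1
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

f = point_pair_feature(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                       np.array([0.1, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
print(f)  # [0.1, pi/2, pi/2, pi/2] for this configuration
```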

Proceedings ArticleDOI
03 Sep 2018
TL;DR: The experimental results demonstrate that DeepRoad can detect thousands of inconsistent behaviors for DNN-based autonomous driving systems, and effectively validate input images to potentially enhance the system robustness as well.
Abstract: While Deep Neural Networks (DNNs) have established the fundamentals of image-based autonomous driving systems, they may exhibit erroneous behaviors and cause fatal accidents. To address the safety issues in autonomous driving systems, a recent set of testing techniques have been designed to automatically generate artificial driving scenes to enrich test suite, e.g., generating new input images transformed from the original ones. However, these techniques are insufficient due to two limitations: first, many such synthetic images often lack diversity of driving scenes, and hence compromise the resulting efficacy and reliability. Second, for machine-learning-based systems, a mismatch between training and application domain can dramatically degrade system accuracy, such that it is necessary to validate inputs for improving system robustness. In this paper, we propose DeepRoad, an unsupervised DNN-based framework for automatically testing the consistency of DNN-based autonomous driving systems and online validation. First, DeepRoad automatically synthesizes large amounts of diverse driving scenes without using image transformation rules (e.g. scale, shear and rotation). In particular, DeepRoad is able to produce driving scenes with various weather conditions (including those with rather extreme conditions) by applying Generative Adversarial Networks (GANs) along with the corresponding real-world weather scenes. Second, DeepRoad utilizes metamorphic testing techniques to check the consistency of such systems using synthetic images. Third, DeepRoad validates input images for DNN-based systems by measuring the distance of the input and training images using their VGGNet features. We implement DeepRoad to test three well-recognized DNN-based autonomous driving systems in Udacity self-driving car challenge. The experimental results demonstrate that DeepRoad can detect thousands of inconsistent behaviors for these systems, and effectively validate input images to potentially enhance the system robustness as well.

Posted Content
TL;DR: In this article, the authors show that there is a trade-off between the standard accuracy of a model and its robustness to adversarial perturbations in a simple and natural setting.
Abstract: We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.

Posted Content
TL;DR: The DkNN algorithm is evaluated on several datasets, and it is shown that the confidence estimates accurately identify inputs outside the model's training manifold, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
Abstract: Deep neural networks (DNNs) enable innovative applications of machine learning like image recognition, machine translation, or malware detection. However, deep learning is often criticized for its lack of robustness in adversarial settings (e.g., vulnerability to adversarial inputs) and general inability to rationalize its predictions. In this work, we exploit the structure of deep learning to enable new learning-based inference and decision strategies that achieve desirable properties such as robustness and interpretability. We take a first step in this direction and introduce the Deep k-Nearest Neighbors (DkNN). This hybrid classifier combines the k-nearest neighbors algorithm with representations of the data learned by each layer of the DNN: a test input is compared to its neighboring training points according to the distance that separates them in the representations. We show that the labels of these neighboring points afford confidence estimates for inputs outside the model's training manifold, including on malicious inputs like adversarial examples, and thereby provide protection against inputs that are outside the model's understanding. This is because the nearest neighbors can be used to estimate the nonconformity of, i.e., the lack of support for, a prediction in the training data. The neighbors also constitute human-interpretable explanations of predictions. We evaluate the DkNN algorithm on several datasets, and show that the confidence estimates accurately identify inputs outside the model's training manifold, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
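A minimal sketch of the DkNN idea under assumed interfaces (layer_fns are hypothetical wrappers mapping a batch of inputs to each layer's representation; this is not the authors' code): collect the k nearest training points in every layer's representation space and score a candidate label by how many of those neighbors disagree with it, its nonconformity; the paper then turns such scores into calibrated credibility estimates.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

class SimpleDkNN:
    def __init__(self, layer_fns, train_x, train_y, k=5):
        self.layer_fns, self.train_y, self.k = layer_fns, np.asarray(train_y), k
        # One nearest-neighbor index per layer, built on that layer's representations.
        self.indexes = [NearestNeighbors(n_neighbors=k).fit(fn(train_x))
                        for fn in layer_fns]

    def nonconformity(self, x, label):
        """x: a single input shaped (1, ...); returns the number of per-layer
        neighbors whose training label differs from `label`."""
        score = 0
        for fn, index in zip(self.layer_fns, self.indexes):
            _, nbrs = index.kneighbors(fn(x))
            score += int(np.sum(self.train_y[nbrs[0]] != label))
        return score
```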

Posted Content
Yuhua Chen1, Wen Li1, Christos Sakaridis1, Dengxin Dai1, Luc Van Gool1 
TL;DR: Chen et al. as discussed by the authors designed two domain adaptation components, on image level and instance level, to reduce the domain discrepancy in Faster R-CNN, which are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner.
Abstract: Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

Proceedings Article
01 Nov 2018
TL;DR: This paper introduces CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points and facilitates the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
Abstract: Finding the minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers is known to be a challenging problem. Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum distortion, and some recent progress has been made in this direction by exploiting the piece-wise linear nature of ReLU activations. However, a generic robustness certification for general activation functions still remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions. The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to the four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.
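The core relaxation can be written compactly (stated informally here, not in the paper's full notation): on each neuron's known pre-activation interval, the activation is sandwiched between two linear functions, and these per-neuron bounds are composed backwards through the network to give a certified bound on the output margin.

```latex
\alpha_L\,(x + \beta_L) \;\le\; \sigma(x) \;\le\; \alpha_U\,(x + \beta_U)
\qquad \text{for all } x \in [l, u].
```

For a ReLU with l < 0 < u, for example, the chord u(x - l)/(u - l) is a valid upper bound and αx with any α in [0, 1] is a valid lower bound; adaptively choosing such slopes per neuron is what tightens the certificate relative to a fixed choice.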

Journal ArticleDOI
TL;DR: The proposed framework generalizes adversarial training, as well as previous approaches for increasing local stability of ANNs, and increases the robustness of the network to existing adversarial examples, while making it harder to generate new ones.

Journal ArticleDOI
TL;DR: Differentiable Inductive Logic Programming (DILP) as mentioned in this paper is a differentiable inductive logic framework that can not only solve tasks which traditional ILP systems are suited for, but also shows a robustness to noise and error in the training data which ILP cannot cope with.
Abstract: Artificial Neural Networks are powerful function approximators capable of modelling solutions to a wide variety of problems, both supervised and unsupervised. As their size and expressivity increases, so too does the variance of the model, yielding a nearly ubiquitous overfitting problem. Although mitigated by a variety of model regularisation methods, the common cure is to seek large amounts of training data--which is not necessarily easily obtained--that sufficiently approximates the data distribution of the domain we wish to test on. In contrast, logic programming methods such as Inductive Logic Programming offer an extremely data-efficient process by which models can be trained to reason on symbolic domains. However, these methods are unable to deal with the variety of domains neural networks can be applied to: they are not robust to noise in or mislabelling of inputs, and perhaps more importantly, cannot be applied to non-symbolic domains where the data is ambiguous, such as operating on raw pixels. In this paper, we propose a Differentiable Inductive Logic framework, which can not only solve tasks which traditional ILP systems are suited for, but shows a robustness to noise and error in the training data which ILP cannot cope with. Furthermore, as it is trained by backpropagation against a likelihood objective, it can be hybridised by connecting it with neural networks over ambiguous data in order to be applied to domains which ILP cannot address, while providing data efficiency and generalisation beyond what neural networks on their own can achieve.

Proceedings Article
Gagandeep Singh1, Timon Gehr1, Matthew Mirman1, Markus Püschel1, Martin Vechev1 
01 Jan 2018
TL;DR: A new method and system, called DeepZ, for certifying neural network robustness based on abstract interpretation that handles ReLU, Tanh and Sigmoid activation functions, is significantly more scalable and precise, and is sound with respect to floating point arithmetic.
Abstract: We present a new method and system, called DeepZ, for certifying neural network robustness based on abstract interpretation. Compared to state-of-the-art automated verifiers for neural networks, DeepZ: (i) handles ReLU, Tanh and Sigmoid activation functions, (ii) supports feedforward and convolutional architectures, (iii) is significantly more scalable and precise, and (iv) is sound with respect to floating point arithmetic. These benefits are due to carefully designed approximations tailored to the setting of neural networks. As an example, DeepZ achieves a verification accuracy of 97% on a large network with 88,500 hidden units under $L_{\infty}$ attack with $\epsilon = 0.1$ with an average runtime of 133 seconds.

Journal ArticleDOI
TL;DR: Deep learning is used to detect physical-layer attributes for the identification of cognitive radio devices, and the method is based on the empirical principle that manufacturing variability among wireless transmitters that conform to the same standard creates unique, repeatable signatures in each transmission.
Abstract: With the increasing presence of cognitive radio networks as a means to address limited spectral resources, improved wireless security has become a necessity. In particular, the potential of a node to impersonate a licensed user demonstrates the need for techniques to authenticate a radio's true identity. In this paper, we use deep learning to detect physical-layer attributes for the identification of cognitive radio devices, and demonstrate the performance of our method on a set of IEEE 802.15.4 devices. Our method is based on the empirical principle that manufacturing variability among wireless transmitters that conform to the same standard creates unique, repeatable signatures in each transmission, which can then be used as a fingerprint for device identification and verification. We develop a framework for training a convolutional neural network using the time-domain complex baseband error signal and demonstrate 92.29% identification accuracy on a set of seven 2.4 GHz commercial ZigBee devices. We also demonstrate the robustness of our method over a wide range of signal-to-noise ratios.