
Showing papers on "Robustness (computer science)" published in 2018


Proceedings ArticleDOI
18 Jun 2018
TL;DR: StarGAN as discussed by the authors proposes a unified model architecture to perform image-to-image translation for multiple domains using only a single model, which leads to superior quality of translated images compared to existing models as well as the capability of flexibly translating an input image to any desired target domain.
Abstract: Recent studies have shown remarkable success in image-to-image translation for two domains. However, existing approaches have limited scalability and robustness in handling more than two domains, since different models should be built independently for every pair of image domains. To address this limitation, we propose StarGAN, a novel and scalable approach that can perform image-to-image translations for multiple domains using only a single model. Such a unified model architecture of StarGAN allows simultaneous training of multiple datasets with different domains within a single network. This leads to StarGAN's superior quality of translated images compared to existing models as well as the novel capability of flexibly translating an input image to any desired target domain. We empirically demonstrate the effectiveness of our approach on facial attribute transfer and facial expression synthesis tasks.
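A minimal sketch (not the authors' code) of the conditioning mechanism such a single-generator, multi-domain setup relies on: the target-domain label is tiled spatially and concatenated to the input image's channels before being fed to the generator. The function name and tensor shapes below are illustrative assumptions.

```python
import torch

def concat_domain_label(images, labels, num_domains):
    """images: (N, C, H, W); labels: (N,) integer target-domain ids."""
    n, _, h, w = images.shape
    one_hot = torch.zeros(n, num_domains, device=images.device)
    one_hot.scatter_(1, labels.view(-1, 1), 1.0)
    # Tile the one-hot label over the spatial dimensions and append it as extra channels.
    label_map = one_hot.view(n, num_domains, 1, 1).expand(n, num_domains, h, w)
    return torch.cat([images, label_map], dim=1)  # (N, C + num_domains, H, W)

x = torch.randn(4, 3, 128, 128)                       # a batch of input images
y = torch.tensor([0, 2, 1, 4])                        # desired target domain per image
g_input = concat_domain_label(x, y, num_domains=5)    # what the generator would consume
```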

2,479 citations


Posted Content
TL;DR: Co-teaching as discussed by the authors trains two deep neural networks simultaneously, and lets them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, the two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself.
Abstract: Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they would first memorize training data of clean labels and then those of noisy labels. Therefore in this paper, we propose a new deep learning paradigm called Co-teaching for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.
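The training loop described above translates into a short update step. Below is a minimal sketch under assumed interfaces (two PyTorch classifiers and their optimizers), not the authors' implementation; in the paper the kept fraction also shrinks over epochs, since networks memorize noisy labels later in training.

```python
import torch
import torch.nn.functional as F

def co_teaching_step(net1, net2, opt1, opt2, x, y, keep_ratio):
    """One mini-batch of Co-teaching: select small-loss samples, then cross-update."""
    num_keep = max(1, int(keep_ratio * x.size(0)))

    # Per-sample losses; small loss is taken as evidence of a clean label.
    with torch.no_grad():
        loss1 = F.cross_entropy(net1(x), y, reduction="none")
        loss2 = F.cross_entropy(net2(x), y, reduction="none")
    idx1 = torch.argsort(loss1)[:num_keep]  # samples net1 trusts
    idx2 = torch.argsort(loss2)[:num_keep]  # samples net2 trusts

    # Each network is updated only on the samples selected by its peer.
    opt1.zero_grad()
    F.cross_entropy(net1(x[idx2]), y[idx2]).backward()
    opt1.step()

    opt2.zero_grad()
    F.cross_entropy(net2(x[idx1]), y[idx1]).backward()
    opt2.step()
```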

866 citations


Proceedings ArticleDOI
Yuhua Chen1, Wen Li1, Christos Sakaridis1, Dengxin Dai1, Luc Van Gool1 
08 Mar 2018
TL;DR: Chen et al. as discussed by the authors designed two domain adaptation components, on image level and instance level, to reduce the domain discrepancy in Faster R-CNN, which are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner.
Abstract: Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc., and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

843 citations


Proceedings ArticleDOI
20 May 2018
TL;DR: This work presents AI2, the first sound and scalable analyzer for deep neural networks, and introduces abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers.
Abstract: We present AI2, the first sound and scalable analyzer for deep neural networks. Based on overapproximation, AI2 can automatically prove safety properties (e.g., robustness) of realistic neural networks (e.g., convolutional neural networks). The key insight behind AI2 is to phrase reasoning about safety and robustness of neural networks in terms of classic abstract interpretation, enabling us to leverage decades of advances in that area. Concretely, we introduce abstract transformers that capture the behavior of fully connected and convolutional neural network layers with rectified linear unit activations (ReLU), as well as max pooling layers. This allows us to handle real-world neural networks, which are often built out of those types of layers. We present a complete implementation of AI2 together with an extensive evaluation on 20 neural networks. Our results demonstrate that: (i) AI2 is precise enough to prove useful specifications (e.g., robustness), (ii) AI2 can be used to certify the effectiveness of state-of-the-art defenses for neural networks, (iii) AI2 is significantly faster than existing analyzers based on symbolic analysis, which often take hours to verify simple fully connected networks, and (iv) AI2 can handle deep convolutional networks, which are beyond the reach of existing methods.
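As an illustration of the certification idea (hedged: AI2 itself uses more precise abstract domains such as zonotopes, not plain intervals), the sketch below propagates interval bounds through an affine layer and a ReLU; if the resulting output bounds keep the true class above all others for every input in the perturbation region, robustness is proved.

```python
import numpy as np

def affine_interval(lo, hi, W, b):
    """Sound bounds on W @ x + b for all x with lo <= x <= hi (elementwise)."""
    center, radius = (lo + hi) / 2.0, (hi - lo) / 2.0
    new_center = W @ center + b
    new_radius = np.abs(W) @ radius
    return new_center - new_radius, new_center + new_radius

def relu_interval(lo, hi):
    return np.maximum(lo, 0.0), np.maximum(hi, 0.0)

# Bounds for every input inside an l_inf ball of radius eps around x.
x, eps = np.array([0.2, -0.1]), 0.05
lo, hi = x - eps, x + eps
W, b = np.array([[1.0, -2.0], [0.5, 3.0]]), np.array([0.1, -0.2])
lo, hi = relu_interval(*affine_interval(lo, hi, W, b))
print(lo, hi)  # every reachable activation of this layer lies in [lo, hi]
```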

841 citations


Proceedings Article
27 Sep 2018
TL;DR: It is shown that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization, and it is argued that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers.
Abstract: We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.

822 citations


Proceedings ArticleDOI
18 Jun 2018
TL;DR: In this article, a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting is presented, which exploits redundancies in large deep networks to free up parameters that can then be employed to learn new tasks.
Abstract: This paper presents a method for adding multiple tasks to a single deep neural network while avoiding catastrophic forgetting. Inspired by network pruning techniques, we exploit redundancies in large deep networks to free up parameters that can then be employed to learn new tasks. By performing iterative pruning and network re-training, we are able to sequentially "pack" multiple tasks into a single network while ensuring minimal drop in performance and minimal storage overhead. Unlike prior work that uses proxy losses to maintain accuracy on older tasks, we always optimize for the task at hand. We perform extensive experiments on a variety of network architectures and large-scale datasets, and observe much better robustness against catastrophic forgetting than prior work. In particular, we are able to add three fine-grained classification tasks to a single ImageNet-trained VGG-16 network and achieve accuracies close to those of separately trained networks for each task.
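A minimal sketch of the pruning step that frees capacity for a new task (hypothetical helper, not the paper's code): within the weights owned by the current task, the smallest-magnitude fraction is zeroed out and handed to the next task, while weights assigned to earlier tasks are left untouched.

```python
import torch

def prune_for_new_task(weight, owned_mask, prune_fraction):
    """weight: a layer's weight tensor; owned_mask: True where an earlier task
    already owns the weight. Returns the updated ownership mask."""
    free = ~owned_mask                       # weights still belonging to the current task
    magnitudes = weight[free].abs()
    k = int(prune_fraction * magnitudes.numel())
    if k > 0:
        threshold = magnitudes.kthvalue(k).values
        released = free & (weight.abs() <= threshold)
        weight.data[released] = 0.0          # freed slots: retrained by the next task
        owned_mask = owned_mask | (free & ~released)  # the rest stays with this task
    return owned_mask
```

After the freed weights are retrained on the new task (with gradients masked so earlier tasks' weights stay frozen), the same prune-and-retrain cycle repeats for each subsequent task.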

803 citations


Proceedings Article
29 Jan 2018
TL;DR: This work proposes a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value, providing an adaptive regularizer that encourages robustness against all attacks.
Abstract: While neural networks have achieved high accuracy on standard image classification benchmarks, their accuracy drops to nearly zero in the presence of small adversarial perturbations to test inputs. Defenses based on regularization and adversarial training have been proposed, but often followed by new, stronger attacks that defeat these defenses. Can we somehow end this arms race? In this work, we study this problem for neural networks with one hidden layer. We first propose a method based on a semidefinite relaxation that outputs a certificate that for a given network and test input, no attack can force the error to exceed a certain value. Second, as this certificate is differentiable, we jointly optimize it with the network parameters, providing an adaptive regularizer that encourages robustness against all attacks. On MNIST, our approach produces a network and a certificate that no attack that perturbs each pixel by at most \epsilon = 0.1 can cause more than 35% test error.

758 citations


Proceedings ArticleDOI
06 Mar 2018
TL;DR: GeoNet as mentioned in this paper proposes an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively.
Abstract: We propose GeoNet, a jointly unsupervised learning framework for monocular depth, optical flow and egomotion estimation from videos. The three components are coupled by the nature of 3D scene geometry, jointly learned by our framework in an end-to-end manner. Specifically, geometric relationships are extracted over the predictions of individual modules and then combined as an image reconstruction loss, reasoning about static and dynamic scene parts separately. Furthermore, we propose an adaptive geometric consistency loss to increase robustness towards outliers and non-Lambertian regions, which resolves occlusions and texture ambiguities effectively. Experimentation on the KITTI driving dataset reveals that our scheme achieves state-of-the-art results in all of the three tasks, performing better than previously unsupervised methods and comparably with supervised ones.

725 citations


Journal ArticleDOI
TL;DR: SAF R-CNN as discussed by the authors introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges, and outputs from all of the sub-networks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales.
Abstract: In this paper, we consider the problem of pedestrian detection in natural scenes. Intuitively, instances of pedestrians with different spatial scales may exhibit dramatically different features. Thus, large variance in instance scales, which results in undesirable large intracategory variance in features, may severely hurt the performance of modern object instance detection methods. We argue that this issue can be substantially alleviated by the divide-and-conquer philosophy. Taking pedestrian detection as an example, we illustrate how we can leverage this philosophy to develop a Scale-Aware Fast R-CNN (SAF R-CNN) framework. The model introduces multiple built-in subnetworks which detect pedestrians with scales from disjoint ranges. Outputs from all of the subnetworks are then adaptively combined to generate the final detection results that are shown to be robust to large variance in instance scales, via a gate function defined over the sizes of object proposals. Extensive evaluations on several challenging pedestrian detection datasets well demonstrate the effectiveness of the proposed SAF R-CNN. Particularly, our method achieves state-of-the-art performance on Caltech [P. Dollar, C. Wojek, B. Schiele, and P. Perona, “Pedestrian detection: An evaluation of the state of the art,” IEEE Trans. Pattern Anal. Mach. Intell. , vol. 34, no. 4, pp. 743–761, Apr. 2012], and obtains competitive results on INRIA [N. Dalal and B. Triggs, “Histograms of oriented gradients for human detection,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit. , 2005, pp. 886–893], ETH [A. Ess, B. Leibe, and L. V. Gool, “Depth and appearance for mobile scene analysis,” in Proc. Int. Conf. Comput. Vis ., 2007, pp. 1–8], and KITTI [A. Geiger, P. Lenz, and R. Urtasun, “Are we ready for autonomous driving? The KITTI vision benchmark suite,” in Proc. IEEE Conf. Comput. Vis. Pattern Recognit ., 2012, pp. 3354–3361].

716 citations


Proceedings ArticleDOI
01 Jan 2018
TL;DR: Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.
Abstract: Deep learning with noisy labels is practically challenging, as the capacity of deep models is so high that they can totally memorize these noisy labels sooner or later during training. Nonetheless, recent studies on the memorization effects of deep neural networks show that they would first memorize training data of clean labels and then those of noisy labels. Therefore in this paper, we propose a new deep learning paradigm called ''Co-teaching'' for combating noisy labels. Namely, we train two deep neural networks simultaneously, and let them teach each other given every mini-batch: firstly, each network feeds forward all data and selects some data of possibly clean labels; secondly, two networks communicate with each other what data in this mini-batch should be used for training; finally, each network back propagates the data selected by its peer network and updates itself. Empirical results on noisy versions of MNIST, CIFAR-10 and CIFAR-100 demonstrate that Co-teaching is much superior to the state-of-the-art methods in the robustness of trained deep models.

657 citations


Proceedings Article
15 Feb 2018
TL;DR: Two approaches to increase model robustness are explored, structure-invariant word representations and robust training on noisy texts, and it is found that a model based on a character convolutional neural network is able to simultaneously learn representations robust to multiple kinds of noise.
Abstract: Character-based neural machine translation (NMT) models alleviate out-of-vocabulary issues, learn morphology, and move us closer to completely end-to-end translation systems. Unfortunately, they are also very brittle and easily falter when presented with noisy data. In this paper, we confront NMT models with synthetic and natural sources of noise. We find that state-of-the-art models fail to translate even moderately noisy texts that humans have no trouble comprehending. We explore two approaches to increase model robustness: structure-invariant word representations and robust training on noisy texts. We find that a model based on a character convolutional neural network is able to simultaneously learn representations robust to multiple kinds of noise.
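A minimal sketch of the kind of synthetic character noise used to stress-test such models (whitespace tokenization is an assumption here, and these are only two of the noise types studied): swapping one adjacent pair of interior characters, and scrambling a word's interior while keeping its first and last characters.

```python
import random

def swap_noise(word):
    """Swap one adjacent pair of interior characters."""
    if len(word) < 4:
        return word
    i = random.randrange(1, len(word) - 2)
    chars = list(word)
    chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

def scramble_noise(word):
    """Shuffle all interior characters, keeping the first and last in place."""
    if len(word) < 4:
        return word
    middle = list(word[1:-1])
    random.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]

def noisy_sentence(sentence, noise_fn, prob=0.5):
    return " ".join(noise_fn(w) if random.random() < prob else w
                    for w in sentence.split())

print(noisy_sentence("the quick brown fox jumps over the lazy dog", scramble_noise))
```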

Journal ArticleDOI
TL;DR: A convolutional neural network is proposed to detect crack patches in each video frame, while the proposed data fusion scheme maintains the spatiotemporal coherence of cracks in videos, and the Naïve Bayes decision making discards false positives effectively.
Abstract: Regular inspection of nuclear power plant components is important to guarantee safe operations. However, current practice is time consuming, tedious, and subjective, which involves human technicians reviewing the inspection videos and identifying cracks on reactors. A few vision-based crack detection approaches have been developed for metallic surfaces, and they typically perform poorly when used for analyzing nuclear inspection videos. Detecting these cracks is a challenging task since they are tiny, and noisy patterns exist on the components’ surfaces. This study proposes a deep learning framework, based on a convolutional neural network (CNN) and a Naive Bayes data fusion scheme, called NB-CNN, to analyze individual video frames for crack detection while a novel data fusion scheme is proposed to aggregate the information extracted from each video frame to enhance the overall performance and robustness of the system. To this end, a CNN is proposed to detect crack patches in each video frame, while the proposed data fusion scheme maintains the spatiotemporal coherence of cracks in videos, and the Naive Bayes decision making discards false positives effectively. The proposed framework achieves a 98.3% hit rate against 0.1 false positives per frame that is significantly higher than state-of-the-art approaches as presented in this paper.

Proceedings Article
27 Sep 2018
TL;DR: Verification of piecewise-linear neural networks as a mixed integer program that is able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack for every network.
Abstract: Neural networks have demonstrated considerable success on a wide variety of real-world problems. However, neural networks can be fooled by adversarial examples – slightly perturbed inputs that are misclassified with high confidence. Verification of networks enables us to gauge their vulnerability to such adversarial examples. We formulate verification of piecewise-linear neural networks as a mixed integer program. Our verifier finds minimum adversarial distortions two to three orders of magnitude more quickly than the state-of-the-art. We achieve this via tight formulations for non-linearities, as well as a novel presolve algorithm that makes full use of all information available. The computational speedup enables us to verify properties on convolutional networks with an order of magnitude more ReLUs than had been previously verified by any complete verifier, and we determine for the first time the exact adversarial accuracy of an MNIST classifier to perturbations with bounded $l_\infty$ norm $\epsilon = 0.1$. On this network, we find an adversarial example for 4.38% of samples, and a certificate of robustness for the remainder. Across a variety of robust training procedures, we are able to certify more samples than the state-of-the-art and find more adversarial examples than a strong first-order attack for every network.
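For reference, the standard mixed-integer encoding of a single ReLU (a textbook big-M formulation stated here informally, not the paper's exact tightened constraints) looks as follows, where the pre-activation is known to lie in [l, u] with l < 0 < u and z is a binary variable; sharper bounds l, u directly give a tighter program, which is what the presolve step targets.

```latex
\begin{aligned}
  y &\ge x, & y &\ge 0, \\
  y &\le x - l\,(1 - z), & y &\le u\,z, \\
  z &\in \{0, 1\},
\end{aligned}
\qquad \text{so } z = 1 \Rightarrow y = x \text{ and } z = 0 \Rightarrow y = 0 .
```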

Journal ArticleDOI
TL;DR: Simulation results corroborate that the proposed deep learning based scheme can achieve better performance in terms of the DOA estimation and the channel estimation compared with conventional methods, and the proposed scheme is further evaluated by extensive simulations in various cases to test its robustness.
Abstract: The recent concept of massive multiple-input multiple-output (MIMO) can significantly improve the capacity of the communication network, and it has been regarded as a promising technology for the next-generation wireless communications. However, the fundamental challenge of existing massive MIMO systems is that high computational complexity and complicated spatial structures bring great difficulties to exploit the characteristics of the channel and the sparsity of these multi-antenna systems. To address this problem, in this paper, we focus on channel estimation and direction-of-arrival (DOA) estimation, and a novel framework that integrates massive MIMO into deep learning is proposed. To realize end-to-end performance, a deep neural network (DNN) is employed to conduct offline and online learning procedures, which is effective to learn the statistics of the wireless channel and the spatial structures in the angle domain. Concretely, the DNN is first trained by simulated data in different channel conditions with the aid of offline learning, and then corresponding output data can be obtained based on current input data during the online learning process. In order to realize super-resolution channel estimation and DOA estimation, two algorithms based on deep learning are developed, in which the DOA can be estimated directly in the angle domain without additional complexity. Furthermore, simulation results corroborate that the proposed deep learning based scheme can achieve better performance in terms of the DOA estimation and the channel estimation compared with conventional methods, and the proposed scheme is further evaluated by extensive simulations in various cases to test its robustness.

Journal ArticleDOI
TL;DR: The analysis shows that augmenting off-board information to sensory information has the potential to enable low-cost localization systems with high accuracy and robustness; however, their performance depends on the penetration rate of nearby connected vehicles or infrastructure and the quality of network service.
Abstract: For an autonomous vehicle to operate safely and effectively, an accurate and robust localization system is essential. While there are a variety of vehicle localization techniques in the literature, there has been little effort to compare these techniques and identify their potentials and limitations for autonomous vehicle applications. Hence, this paper evaluates the state-of-the-art vehicle localization techniques and investigates their applicability to autonomous vehicles. The analysis starts by discussing the techniques which merely use the information obtained from on-board vehicle sensors. It is shown that some techniques can achieve the accuracy required for autonomous driving but suffer from the high cost of the sensors and from sensor performance limitations in different driving scenarios (e.g., cornering and intersections) and different environmental conditions (e.g., darkness and snow). This paper continues the analysis by considering the techniques which benefit from off-board information obtained from V2X communication channels, in addition to vehicle sensory information. The analysis shows that augmenting off-board information to sensory information has the potential to enable low-cost localization systems with high accuracy and robustness; however, their performance depends on the penetration rate of nearby connected vehicles or infrastructure and the quality of network service.

Proceedings Article
15 Feb 2018
TL;DR: A simple modification to standard neural network architectures, thermometer encoding, is proposed, which significantly increases the robustness of the network to adversarial examples, and the properties of these networks are explored, providing evidence that thermometer encodings help neural networks to find more non-linear decision boundaries.
Abstract: It is well known that it is possible to construct "adversarial examples" for neural networks: inputs which are misclassified by the network yet indistinguishable from true data. We propose a simple modification to standard neural network architectures, thermometer encoding, which significantly increases the robustness of the network to adversarial examples. We demonstrate this robustness with experiments on the MNIST, CIFAR-10, CIFAR-100, and SVHN datasets, and show that models with thermometer-encoded inputs consistently have higher accuracy on adversarial examples, without decreasing generalization. State-of-the-art accuracy under the strongest known white-box attack was increased from 93.20% to 94.30% on MNIST and 50.00% to 79.16% on CIFAR-10. We explore the properties of these networks, providing evidence that thermometer encodings help neural networks to find more non-linear decision boundaries.
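A minimal sketch of the encoding itself (not the authors' code): each scalar input value in [0, 1] is replaced by a k-dimensional binary vector whose i-th entry is 1 whenever the value is at least i/k, turning each input channel into a discretized, thermometer-style representation.

```python
import numpy as np

def thermometer_encode(x, k=16):
    """x: array of values in [0, 1]; returns an array with a trailing axis of size k."""
    levels = (np.arange(1, k + 1) / k).reshape((1,) * x.ndim + (k,))
    return (x[..., None] >= levels).astype(np.float32)

pixels = np.array([0.0, 0.3, 0.95])
print(thermometer_encode(pixels, k=4))
# [[0. 0. 0. 0.]
#  [1. 0. 0. 0.]
#  [1. 1. 1. 0.]]
```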

Journal ArticleDOI
TL;DR: This paper proposes a multi-scale strategy for speeding up marker detection in video sequences by wisely selecting the most appropriate scale for detection, identification and corner estimation.

Proceedings Article
25 Apr 2018
TL;DR: It is demonstrated that regularizing input gradients makes them more naturally interpretable as rationales for model predictions, and that networks trained with input gradient regularization also exhibit robustness to transferred adversarial examples generated to fool all of the other models.
Abstract: Deep neural networks have proven remarkably effective at solving many classification problems, but have been criticized recently for two major weaknesses: the reasons behind their predictions are uninterpretable, and the predictions themselves can often be fooled by small adversarial perturbations. These problems pose major obstacles for the adoption of neural networks in domains that require security or transparency. In this work, we evaluate the effectiveness of defenses that differentiably penalize the degree to which small changes in inputs can alter model predictions. Across multiple attacks, architectures, defenses, and datasets, we find that neural networks trained with this input gradient regularization exhibit robustness to transferred adversarial examples generated to fool all of the other models. We also find that adversarial examples generated to fool gradient-regularized models fool all other models equally well, and actually lead to more "legitimate," interpretable misclassifications as rated by people (which we confirm in a human subject experiment). Finally, we demonstrate that regularizing input gradients makes them more naturally interpretable as rationales for model predictions. We conclude by discussing this relationship between interpretability and robustness in deep neural networks.
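A minimal sketch (not the authors' code) of the defense evaluated above: penalize the squared norm of the gradient of the training loss with respect to the inputs, computed via double backpropagation, and add it to the usual task loss.

```python
import torch
import torch.nn.functional as F

def gradient_regularized_loss(model, x, y, lam=0.1):
    x = x.clone().requires_grad_(True)
    task_loss = F.cross_entropy(model(x), y)
    # create_graph=True keeps the graph so the penalty itself can be differentiated
    # with respect to the model parameters (double backpropagation).
    (input_grad,) = torch.autograd.grad(task_loss, x, create_graph=True)
    penalty = input_grad.pow(2).sum() / x.size(0)
    return task_loss + lam * penalty

# Training then calls gradient_regularized_loss(model, x, y).backward() as usual.
```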

Posted Content
TL;DR: This paper presents the first certified defense that both scales to large networks and datasets and applies broadly to arbitrary model types, based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired privacy formalism.
Abstract: Adversarial examples that fool machine learning models, particularly deep neural networks, have been a topic of intense research interest, with attacks and defenses being developed in a tight back-and-forth. Most past defenses are best effort and have been shown to be vulnerable to sophisticated attacks. Recently a set of certified defenses have been introduced, which provide guarantees of robustness to norm-bounded attacks, but they either do not scale to large datasets or are limited in the types of models they can support. This paper presents the first certified defense that both scales to large networks and datasets (such as Google's Inception network for ImageNet) and applies broadly to arbitrary model types. Our defense, called PixelDP, is based on a novel connection between robustness against adversarial examples and differential privacy, a cryptographically-inspired formalism, that provides a rigorous, generic, and flexible foundation for defense.
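Illustrative sketch only (not the PixelDP implementation): the mechanism reduces to adding calibrated noise inside the network and predicting from the expected scores over many noise draws; PixelDP calibrates the noise to the sensitivity of the pre-noise layers so that a differential-privacy argument bounds how much a norm-bounded perturbation can shift those expected scores.

```python
import torch

def noisy_expected_scores(model, x, sigma=0.25, num_draws=100):
    """Monte Carlo estimate of the expected softmax scores under input noise."""
    with torch.no_grad():
        scores = torch.stack([
            torch.softmax(model(x + sigma * torch.randn_like(x)), dim=-1)
            for _ in range(num_draws)
        ])
    return scores.mean(dim=0)  # prediction = argmax of these expected scores
```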

Proceedings Article
15 Feb 2018
TL;DR: The MAC network is presented, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning that is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.
Abstract: We present the MAC network, a novel fully differentiable neural network architecture, designed to facilitate explicit and expressive reasoning. MAC moves away from monolithic black-box neural architectures towards a design that encourages both transparency and versatility. The model approaches problems by decomposing them into a series of attention-based reasoning steps, each performed by a novel recurrent Memory, Attention, and Composition (MAC) cell that maintains a separation between control and memory. By stringing the cells together and imposing structural constraints that regulate their interaction, MAC effectively learns to perform iterative reasoning processes that are directly inferred from the data in an end-to-end approach. We demonstrate the model's strength, robustness and interpretability on the challenging CLEVR dataset for visual reasoning, achieving a new state-of-the-art 98.9% accuracy, halving the error rate of the previous best model. More importantly, we show that the model is computationally-efficient and data-efficient, in particular requiring 5x less data than existing models to achieve strong results.

Proceedings ArticleDOI
18 Jun 2018
TL;DR: Qualitative and quantitative evaluations of the PPFNet network suggest increased recall, improved robustness and invariance as well as a vital step in the 3D descriptor extraction performance.
Abstract: We present PPFNet - Point Pair Feature NETwork for deeply learning a globally informed 3D local feature descriptor to find correspondences in unorganized point clouds. PPFNet learns local descriptors on pure geometry and is highly aware of the global context, an important cue in deep learning. Our 3D representation is computed as a collection of point-pair-features combined with the points and normals within a local vicinity. Our permutation invariant network design is inspired by PointNet and sets PPFNet to be ordering-free. As opposed to voxelization, our method is able to consume raw point clouds to exploit the full sparsity. PPFNet uses a novel N-tuple loss and architecture injecting the global information naturally into the local descriptor. It shows that context awareness also boosts the local feature representation. Qualitative and quantitative evaluations of our network suggest increased recall, improved robustness and invariance as well as a vital step in the 3D descriptor extraction performance.
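A minimal sketch (not the authors' code) of the classical point pair feature that the representation builds on: for two oriented points it records the pair distance and the three angles between the normals and the line connecting the points.

```python
import numpy as np

def angle(u, v):
    u = u / np.linalg.norm(u)
    v = v / np.linalg.norm(v)
    return np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))

def point_pair_feature(p1, n1, p2, n2):
    """p1, p2: 3D points; n1, n2: their (unit) surface normals."""
    d = p2 - p1
    return np.array([np.linalg.norm(d), angle(n1, d), angle(n2, d), angle(n1, n2)])

f = point_pair_feature(np.array([0.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0]),
                       np.array([0.1, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]))
print(f)  # [0.1, pi/2, pi/2, pi/2] for this configuration
```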

Proceedings ArticleDOI
03 Sep 2018
TL;DR: The experimental results demonstrate that DeepRoad can detect thousands of inconsistent behaviors for DNN-based autonomous driving systems, and effectively validate input images to potentially enhance the system robustness as well.
Abstract: While Deep Neural Networks (DNNs) have established the fundamentals of image-based autonomous driving systems, they may exhibit erroneous behaviors and cause fatal accidents. To address the safety issues in autonomous driving systems, a recent set of testing techniques have been designed to automatically generate artificial driving scenes to enrich test suite, e.g., generating new input images transformed from the original ones. However, these techniques are insufficient due to two limitations: first, many such synthetic images often lack diversity of driving scenes, and hence compromise the resulting efficacy and reliability. Second, for machine-learning-based systems, a mismatch between training and application domain can dramatically degrade system accuracy, such that it is necessary to validate inputs for improving system robustness. In this paper, we propose DeepRoad, an unsupervised DNN-based framework for automatically testing the consistency of DNN-based autonomous driving systems and online validation. First, DeepRoad automatically synthesizes large amounts of diverse driving scenes without using image transformation rules (e.g. scale, shear and rotation). In particular, DeepRoad is able to produce driving scenes with various weather conditions (including those with rather extreme conditions) by applying Generative Adversarial Networks (GANs) along with the corresponding real-world weather scenes. Second, DeepRoad utilizes metamorphic testing techniques to check the consistency of such systems using synthetic images. Third, DeepRoad validates input images for DNN-based systems by measuring the distance of the input and training images using their VGGNet features. We implement DeepRoad to test three well-recognized DNN-based autonomous driving systems in Udacity self-driving car challenge. The experimental results demonstrate that DeepRoad can detect thousands of inconsistent behaviors for these systems, and effectively validate input images to potentially enhance the system robustness as well.

Posted Content
TL;DR: In this article, the authors show that there is a trade-off between the standard accuracy of a model and its robustness to adversarial perturbations in a simple and natural setting.
Abstract: We show that there may exist an inherent tension between the goal of adversarial robustness and that of standard generalization. Specifically, training robust models may not only be more resource-consuming, but also lead to a reduction of standard accuracy. We demonstrate that this trade-off between the standard accuracy of a model and its robustness to adversarial perturbations provably exists in a fairly simple and natural setting. These findings also corroborate a similar phenomenon observed empirically in more complex settings. Further, we argue that this phenomenon is a consequence of robust classifiers learning fundamentally different feature representations than standard classifiers. These differences, in particular, seem to result in unexpected benefits: the representations learned by robust models tend to align better with salient data characteristics and human perception.

Posted Content
TL;DR: The DkNN algorithm is evaluated on several datasets, and it is shown that the confidence estimates accurately identify inputs outside the model's training manifold, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
Abstract: Deep neural networks (DNNs) enable innovative applications of machine learning like image recognition, machine translation, or malware detection. However, deep learning is often criticized for its lack of robustness in adversarial settings (e.g., vulnerability to adversarial inputs) and general inability to rationalize its predictions. In this work, we exploit the structure of deep learning to enable new learning-based inference and decision strategies that achieve desirable properties such as robustness and interpretability. We take a first step in this direction and introduce the Deep k-Nearest Neighbors (DkNN). This hybrid classifier combines the k-nearest neighbors algorithm with representations of the data learned by each layer of the DNN: a test input is compared to its neighboring training points according to the distance that separates them in the representations. We show that the labels of these neighboring points afford confidence estimates for inputs outside the model's training manifold, including on malicious inputs like adversarial examples, and thereby provide protection against inputs that are outside the model's understanding. This is because the nearest neighbors can be used to estimate the nonconformity of, i.e., the lack of support for, a prediction in the training data. The neighbors also constitute human-interpretable explanations of predictions. We evaluate the DkNN algorithm on several datasets, and show that the confidence estimates accurately identify inputs outside the model's training manifold, and that the explanations provided by nearest neighbors are intuitive and useful in understanding model failures.
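A minimal sketch of the DkNN idea under assumed interfaces (layer_fns are hypothetical wrappers mapping a batch of inputs to each layer's representation; this is not the authors' code): collect the k nearest training points in every layer's representation space and score a candidate label by how many of those neighbors disagree with it, its nonconformity; the paper then turns such scores into calibrated credibility estimates.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

class SimpleDkNN:
    def __init__(self, layer_fns, train_x, train_y, k=5):
        self.layer_fns, self.train_y, self.k = layer_fns, np.asarray(train_y), k
        # One nearest-neighbor index per layer, built on that layer's representations.
        self.indexes = [NearestNeighbors(n_neighbors=k).fit(fn(train_x))
                        for fn in layer_fns]

    def nonconformity(self, x, label):
        """x: a single input shaped (1, ...); returns the number of per-layer
        neighbors whose training label differs from `label`."""
        score = 0
        for fn, index in zip(self.layer_fns, self.indexes):
            _, nbrs = index.kneighbors(fn(x))
            score += int(np.sum(self.train_y[nbrs[0]] != label))
        return score
```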

Posted Content
Yuhua Chen1, Wen Li1, Christos Sakaridis1, Dengxin Dai1, Luc Van Gool1 
TL;DR: Chen et al. as discussed by the authors designed two domain adaptation components, on image level and instance level, to reduce the domain discrepancy in Faster R-CNN, which are based on H-divergence theory and are implemented by learning a domain classifier in an adversarial training manner.
Abstract: Object detection typically assumes that training and test data are drawn from an identical distribution, which, however, does not always hold in practice. Such a distribution mismatch will lead to a significant performance drop. In this work, we aim to improve the cross-domain robustness of object detection. We tackle the domain shift on two levels: 1) the image-level shift, such as image style, illumination, etc, and 2) the instance-level shift, such as object appearance, size, etc. We build our approach based on the recent state-of-the-art Faster R-CNN model, and design two domain adaptation components, on image level and instance level, to reduce the domain discrepancy. The two domain adaptation components are based on H-divergence theory, and are implemented by learning a domain classifier in adversarial training manner. The domain classifiers on different levels are further reinforced with a consistency regularization to learn a domain-invariant region proposal network (RPN) in the Faster R-CNN model. We evaluate our newly proposed approach using multiple datasets including Cityscapes, KITTI, SIM10K, etc. The results demonstrate the effectiveness of our proposed approach for robust object detection in various domain shift scenarios.

Proceedings Article
01 Nov 2018
TL;DR: This paper introduces CROWN, a general framework to certify robustness of neural networks with general activation functions for given input data points and facilitates the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation.
Abstract: Finding the minimum distortion of adversarial examples and thus certifying robustness in neural network classifiers is known to be a challenging problem. Nevertheless, recently it has been shown to be possible to give a non-trivial certified lower bound of minimum distortion, and some recent progress has been made in this direction by exploiting the piece-wise linear nature of ReLU activations. However, a generic robustness certification for general activation functions still remains largely unexplored. To address this issue, in this paper we introduce CROWN, a general framework to certify robustness of neural networks with general activation functions. The novelty in our algorithm consists of bounding a given activation function with linear and quadratic functions, hence allowing it to tackle general activation functions including but not limited to the four popular choices: ReLU, tanh, sigmoid and arctan. In addition, we facilitate the search for a tighter certified lower bound by adaptively selecting appropriate surrogates for each neuron activation. Experimental results show that CROWN on ReLU networks can notably improve the certified lower bounds compared to the current state-of-the-art algorithm Fast-Lin, while having comparable computational efficiency. Furthermore, CROWN also demonstrates its effectiveness and flexibility on networks with general activation functions, including tanh, sigmoid and arctan.
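The core relaxation can be written compactly (stated informally here, not in the paper's full notation): on each neuron's known pre-activation interval, the activation is sandwiched between two linear functions, and these per-neuron bounds are composed backwards through the network to give a certified bound on the output margin.

```latex
\alpha_L\,(x + \beta_L) \;\le\; \sigma(x) \;\le\; \alpha_U\,(x + \beta_U)
\qquad \text{for all } x \in [l, u].
```

For a ReLU with l < 0 < u, for example, the chord u(x - l)/(u - l) is a valid upper bound and αx with any α in [0, 1] is a valid lower bound; adaptively choosing such slopes per neuron is what tightens the certificate relative to a fixed choice.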

Journal ArticleDOI
TL;DR: The proposed framework generalizes adversarial training, as well as previous approaches for increasing local stability of ANNs, and increases the robustness of the network to existing adversarial examples, while making it harder to generate new ones.

Journal ArticleDOI
TL;DR: Differentiable Inductive Logic Programming (DILP) as mentioned in this paper is a differentiable inductive logic framework that can not only solve tasks which traditional ILP systems are suited for, but also shows a robustness to noise and error in the training data which ILP cannot cope with.
Abstract: Artificial Neural Networks are powerful function approximators capable of modelling solutions to a wide variety of problems, both supervised and unsupervised. As their size and expressivity increases, so too does the variance of the model, yielding a nearly ubiquitous overfitting problem. Although mitigated by a variety of model regularisation methods, the common cure is to seek large amounts of training data--which is not necessarily easily obtained--that sufficiently approximates the data distribution of the domain we wish to test on. In contrast, logic programming methods such as Inductive Logic Programming offer an extremely data-efficient process by which models can be trained to reason on symbolic domains. However, these methods are unable to deal with the variety of domains neural networks can be applied to: they are not robust to noise in or mislabelling of inputs, and perhaps more importantly, cannot be applied to non-symbolic domains where the data is ambiguous, such as operating on raw pixels. In this paper, we propose a Differentiable Inductive Logic framework, which can not only solve tasks which traditional ILP systems are suited for, but shows a robustness to noise and error in the training data which ILP cannot cope with. Furthermore, as it is trained by backpropagation against a likelihood objective, it can be hybridised by connecting it with neural networks over ambiguous data in order to be applied to domains which ILP cannot address, while providing data efficiency and generalisation beyond what neural networks on their own can achieve.

Proceedings Article
Gagandeep Singh1, Timon Gehr1, Matthew Mirman1, Markus Püschel1, Martin Vechev1 
01 Jan 2018
TL;DR: A new method and system, called DeepZ, for certifying neural network robustness based on abstract interpretation that handles ReLU, Tanh and Sigmoid activation functions, is significantly more scalable and precise, and is sound with respect to floating point arithmetic.
Abstract: We present a new method and system, called DeepZ, for certifying neural network robustness based on abstract interpretation. Compared to state-of-the-art automated verifiers for neural networks, DeepZ: (i) handles ReLU, Tanh and Sigmoid activation functions, (ii) supports feedforward and convolutional architectures, (iii) is significantly more scalable and precise, and (iv) is sound with respect to floating point arithmetic. These benefits are due to carefully designed approximations tailored to the setting of neural networks. As an example, DeepZ achieves a verification accuracy of 97% on a large network with 88,500 hidden units under $L_{\infty}$ attack with $\epsilon = 0.1$ with an average runtime of 133 seconds.

Journal ArticleDOI
TL;DR: Deep learning is used to detect physical-layer attributes for the identification of cognitive radio devices, and the method is based on the empirical principle that manufacturing variability among wireless transmitters that conform to the same standard creates unique, repeatable signatures in each transmission.
Abstract: With the increasing presence of cognitive radio networks as a means to address limited spectral resources, improved wireless security has become a necessity. In particular, the potential of a node to impersonate a licensed user demonstrates the need for techniques to authenticate a radio's true identity. In this paper, we use deep learning to detect physical-layer attributes for the identification of cognitive radio devices, and demonstrate the performance of our method on a set of IEEE 802.15.4 devices. Our method is based on the empirical principle that manufacturing variability among wireless transmitters that conform to the same standard creates unique, repeatable signatures in each transmission, which can then be used as a fingerprint for device identification and verification. We develop a framework for training a convolutional neural network using the time-domain complex baseband error signal and demonstrate 92.29% identification accuracy on a set of seven 2.4 GHz commercial ZigBee devices. We also demonstrate the robustness of our method over a wide range of signal-to-noise ratios.