Open AccessPosted Content
Certified Adversarial Robustness via Randomized Smoothing
TLDR
Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification on smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.Abstract:
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at this http URL.read more
Citations
More filters
Journal ArticleDOI
Solving inverse problems using data-driven models
TL;DR: This survey paper aims to give an account of some of the main contributions in data-driven inverse problems.
Posted Content
Test-Time Training with Self-Supervision for Generalization under Distribution Shifts
TL;DR: This work turns a single unlabeled test sample into a self-supervised learning problem, on which the model parameters are updated before making a prediction, which leads to improvements on diverse image classification benchmarks aimed at evaluating robustness to distribution shifts.
Posted Content
Adversarial Examples Are Not Bugs, They Are Features
Andrew Ilyas,Shibani Santurkar,Dimitris Tsipras,Logan Engstrom,Brandon Tran,Aleksander Madry +5 more
TL;DR: The authors demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans.
Posted Content
Fast is better than free: Revisiting adversarial training
TL;DR: In this paper, the fast gradient sign method (FGSM) was used to train empirically robust models using a much weaker and cheaper adversary, an approach that was previously believed to be ineffective, rendering the method no more costly than standard training.
Proceedings ArticleDOI
Privacy Risks of Securing Machine Learning Models against Adversarial Examples
TL;DR: This paper measures the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples, and proposes two new inference methods that exploit structural properties of robust models on adversarially perturbed data.
References
More filters
Proceedings ArticleDOI
Deep Residual Learning for Image Recognition
TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won the 1st place on the ILSVRC 2015 classification task.
Proceedings ArticleDOI
ImageNet: A large-scale hierarchical image database
TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Dissertation
Learning Multiple Layers of Features from Tiny Images
TL;DR: In this paper, the authors describe how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images, described in the next section.
Proceedings Article
Intriguing properties of neural networks
Christian Szegedy,Wojciech Zaremba,Ilya Sutskever,Joan Bruna,Dumitru Erhan,Ian Goodfellow,Rob Fergus,Rob Fergus +7 more
TL;DR: It is found that there is no distinction between individual highlevel units and random linear combinations of high level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains of the semantic information in the high layers of neural networks.
Proceedings Article
Explaining and Harnessing Adversarial Examples
TL;DR: It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.