Open Access · Posted Content

Certified Adversarial Robustness via Randomized Smoothing

TL;DR
Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification; on smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.
Abstract
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at this http URL.
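As a concrete illustration of the procedure described in the abstract, the sketch below shows Monte Carlo prediction and certification for a smoothed classifier. It is a minimal sketch under stated assumptions, not the authors' released implementation: `base_classifier` (any function mapping a batch of noisy inputs to integer labels), the sample sizes, and the confidence level are placeholders, and the certified radius uses the $\sigma\,\Phi^{-1}(\underline{p_A})$ form obtained when the runner-up probability is bounded by $1-\underline{p_A}$.

```python
# Minimal sketch of randomized-smoothing prediction and certification (assumed
# helper names; not the authors' released code).
import numpy as np
from scipy.stats import beta, norm


def noisy_counts(base_classifier, x, sigma, n, num_classes):
    """Classify n Gaussian-noisy copies of x and count the votes per class."""
    noisy = x[None, ...] + sigma * np.random.randn(n, *x.shape)
    return np.bincount(base_classifier(noisy), minlength=num_classes)


def certify(base_classifier, x, sigma, n0=100, n=10000, alpha=0.001, num_classes=10):
    """Return (predicted class, certified L2 radius), or (None, 0.0) to abstain."""
    # Small sample to guess the top class, then a large sample to bound its probability.
    guess = int(np.argmax(noisy_counts(base_classifier, x, sigma, n0, num_classes)))
    k = noisy_counts(base_classifier, x, sigma, n, num_classes)[guess]
    # One-sided Clopper-Pearson lower confidence bound on p_A at level 1 - alpha.
    p_a_lower = beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0
    if p_a_lower <= 0.5:
        return None, 0.0  # cannot certify any radius with this sample
    return guess, float(sigma * norm.ppf(p_a_lower))  # radius = sigma * Phi^{-1}(p_A)
```

Larger `n` tightens the lower bound on $p_A$ and hence the certified radius, at the cost of more forward passes through the base classifier.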


Citations
Journal Article · DOI

Solving inverse problems using data-driven models

TL;DR: This survey paper aims to give an account of some of the main contributions in data-driven inverse problems.
Posted Content

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

TL;DR: This work turns a single unlabeled test sample into a self-supervised learning problem on which the model parameters are updated before a prediction is made, leading to improvements on diverse image classification benchmarks that evaluate robustness to distribution shifts.
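A minimal sketch of the test-time procedure summarized above, assuming a model split into a shared feature extractor, a main classification head, and a rotation-prediction head; all names are hypothetical, and rotation prediction stands in for whatever self-supervised task is used.

```python
# Sketch of test-time training on a single test batch x of shape (N, C, H, W).
# `features`, `aux_head`, and `main_head` are assumed torch.nn.Module placeholders.
import torch
import torch.nn.functional as F


def rotate_batch(x):
    """Return the four 90-degree rotations of x and their rotation labels."""
    rotations = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(x.shape[0])
    return torch.cat(rotations, dim=0), labels


def test_time_adapt_and_predict(features, aux_head, main_head, x, steps=10, lr=1e-3):
    """Update the shared features on the self-supervised loss of the test input,
    then predict with the adapted features."""
    optimizer = torch.optim.SGD(features.parameters(), lr=lr)
    for _ in range(steps):
        rotated, rot_labels = rotate_batch(x)
        loss = F.cross_entropy(aux_head(features(rotated)), rot_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return main_head(features(x)).argmax(dim=-1)
```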
Posted Content

Adversarial Examples Are Not Bugs, They Are Features

TL;DR: The authors demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans.
Posted Content

Fast is better than free: Revisiting adversarial training

TL;DR: In this paper, the fast gradient sign method (FGSM), a much weaker and cheaper adversary previously believed to be ineffective for adversarial training, is shown to produce empirically robust models, rendering adversarial training no more costly than standard training.
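A minimal sketch of one FGSM adversarial-training step in the spirit of the summary above; the model, optimizer, step size, and pixel range are assumed placeholders, and the random start inside the perturbation ball reflects the paper's emphasis on random initialization.

```python
# One FGSM adversarial-training step with a random start (assumed placeholder
# model/optimizer; images are assumed to live in [0, 1]).
import torch
import torch.nn.functional as F


def fgsm_train_step(model, optimizer, x, y, epsilon):
    # Random initialization inside the L-infinity ball of radius epsilon.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    F.cross_entropy(model(x + delta), y).backward()
    # Single FGSM step (step size set to epsilon here for simplicity), then project back.
    delta = (delta + epsilon * delta.grad.sign()).clamp(-epsilon, epsilon).detach()
    adv = (x + delta).clamp(0.0, 1.0)
    # Standard parameter update on the perturbed batch.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```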
Proceedings Article · DOI

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

TL;DR: This paper measures the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples, and proposes two new inference methods that exploit structural properties of robust models on adversarially perturbed data.
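For context, the sketch below is a generic confidence-thresholding membership-inference baseline, not either of the two inference methods proposed in the paper above; all names and numbers are illustrative.

```python
# Generic membership-inference baseline: guess "member" when the model is more
# confident than a threshold tuned on data with known membership (illustrative only).
import numpy as np


def membership_guess(confidences, threshold=0.9):
    """Guess True ('training member') when predicted-class confidence >= threshold."""
    return np.asarray(confidences) >= threshold


# Illustrative evaluation against known membership labels.
confidences = np.array([0.99, 0.95, 0.60, 0.45])
is_member = np.array([True, True, False, False])
attack_accuracy = np.mean(membership_guess(confidences) == is_member)
```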
References
Proceedings Article · DOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.
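A minimal sketch of a basic residual block in the spirit of the framework summarized above; the layer shapes are illustrative, not the paper's exact architectures.

```python
# Basic residual block: learn a residual F(x) and add the identity shortcut.
import torch.nn as nn
import torch.nn.functional as F


class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # y = F(x) + x
```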
Proceedings Article · DOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: In this dissertation, the author describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Proceedings Article

Intriguing properties of neural networks

TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Proceedings Article

Explaining and Harnessing Adversarial Examples

TL;DR: It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature; this view is supported by new quantitative results and provides the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
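This is also the paper that introduced the fast gradient sign method referenced above; under the linear view, a single step in the direction of the loss-gradient sign,

$$x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big),$$

suffices to produce adversarial examples within an $\ell_\infty$ ball of radius $\epsilon$.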