Open Access · Posted Content

Certified Adversarial Robustness via Randomized Smoothing

TL;DR
Strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification; on smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies.
Abstract
We show how to turn any classifier that classifies well under Gaussian noise into a new classifier that is certifiably robust to adversarial perturbations under the $\ell_2$ norm. This "randomized smoothing" technique has been proposed recently in the literature, but existing guarantees are loose. We prove a tight robustness guarantee in $\ell_2$ norm for smoothing with Gaussian noise. We use randomized smoothing to obtain an ImageNet classifier with e.g. a certified top-1 accuracy of 49% under adversarial perturbations with $\ell_2$ norm less than 0.5 (=127/255). No certified defense has been shown feasible on ImageNet except for smoothing. On smaller-scale datasets where competing approaches to certified $\ell_2$ robustness are viable, smoothing delivers higher certified accuracies. Our strong empirical results suggest that randomized smoothing is a promising direction for future research into adversarially robust classification. Code and models are available at this http URL.
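As a concrete illustration of the procedure described in the abstract, the sketch below shows Monte Carlo prediction and certification for a smoothed classifier. It is a minimal sketch under stated assumptions, not the authors' released implementation: `base_classifier` (any function mapping a batch of noisy inputs to integer labels), the sample sizes, and the confidence level are placeholders, and the certified radius uses the $\sigma\,\Phi^{-1}(\underline{p_A})$ form obtained when the runner-up probability is bounded by $1-\underline{p_A}$.

```python
# Minimal sketch of randomized-smoothing prediction and certification (assumed
# helper names; not the authors' released code).
import numpy as np
from scipy.stats import beta, norm


def noisy_counts(base_classifier, x, sigma, n, num_classes):
    """Classify n Gaussian-noisy copies of x and count the votes per class."""
    noisy = x[None, ...] + sigma * np.random.randn(n, *x.shape)
    return np.bincount(base_classifier(noisy), minlength=num_classes)


def certify(base_classifier, x, sigma, n0=100, n=10000, alpha=0.001, num_classes=10):
    """Return (predicted class, certified L2 radius), or (None, 0.0) to abstain."""
    # Small sample to guess the top class, then a large sample to bound its probability.
    guess = int(np.argmax(noisy_counts(base_classifier, x, sigma, n0, num_classes)))
    k = noisy_counts(base_classifier, x, sigma, n, num_classes)[guess]
    # One-sided Clopper-Pearson lower confidence bound on p_A at level 1 - alpha.
    p_a_lower = beta.ppf(alpha, k, n - k + 1) if k > 0 else 0.0
    if p_a_lower <= 0.5:
        return None, 0.0  # cannot certify any radius with this sample
    return guess, float(sigma * norm.ppf(p_a_lower))  # radius = sigma * Phi^{-1}(p_A)
```

Larger `n` tightens the lower bound on $p_A$ and hence the certified radius, at the cost of more forward passes through the base classifier.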


Citations
Journal Article · DOI

Solving inverse problems using data-driven models

TL;DR: This survey paper aims to give an account of some of the main contributions in data-driven inverse problems.
Posted Content

Test-Time Training with Self-Supervision for Generalization under Distribution Shifts

TL;DR: This work turns a single unlabeled test sample into a self-supervised learning problem on which the model parameters are updated before a prediction is made, leading to improvements on diverse image classification benchmarks that evaluate robustness to distribution shifts.
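A minimal sketch of the test-time procedure summarized above, assuming a model split into a shared feature extractor, a main classification head, and a rotation-prediction head; all names are hypothetical, and rotation prediction stands in for whatever self-supervised task is used.

```python
# Sketch of test-time training on a single test batch x of shape (N, C, H, W).
# `features`, `aux_head`, and `main_head` are assumed torch.nn.Module placeholders.
import torch
import torch.nn.functional as F


def rotate_batch(x):
    """Return the four 90-degree rotations of x and their rotation labels."""
    rotations = [torch.rot90(x, k, dims=(2, 3)) for k in range(4)]
    labels = torch.arange(4).repeat_interleave(x.shape[0])
    return torch.cat(rotations, dim=0), labels


def test_time_adapt_and_predict(features, aux_head, main_head, x, steps=10, lr=1e-3):
    """Update the shared features on the self-supervised loss of the test input,
    then predict with the adapted features."""
    optimizer = torch.optim.SGD(features.parameters(), lr=lr)
    for _ in range(steps):
        rotated, rot_labels = rotate_batch(x)
        loss = F.cross_entropy(aux_head(features(rotated)), rot_labels)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    with torch.no_grad():
        return main_head(features(x)).argmax(dim=-1)
```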
Posted Content

Adversarial Examples Are Not Bugs, They Are Features

TL;DR: The authors demonstrate that adversarial examples can be directly attributed to the presence of non-robust features: features derived from patterns in the data distribution that are highly predictive, yet brittle and incomprehensible to humans.
Posted Content

Fast is better than free: Revisiting adversarial training

TL;DR: In this paper, the fast gradient sign method (FGSM), a much weaker and cheaper adversary previously believed to be ineffective for adversarial training, is shown to produce empirically robust models, rendering adversarial training no more costly than standard training.
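A minimal sketch of one FGSM adversarial-training step in the spirit of the summary above; the model, optimizer, step size, and pixel range are assumed placeholders, and the random start inside the perturbation ball reflects the paper's emphasis on random initialization.

```python
# One FGSM adversarial-training step with a random start (assumed placeholder
# model/optimizer; images are assumed to live in [0, 1]).
import torch
import torch.nn.functional as F


def fgsm_train_step(model, optimizer, x, y, epsilon):
    # Random initialization inside the L-infinity ball of radius epsilon.
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    F.cross_entropy(model(x + delta), y).backward()
    # Single FGSM step (step size set to epsilon here for simplicity), then project back.
    delta = (delta + epsilon * delta.grad.sign()).clamp(-epsilon, epsilon).detach()
    adv = (x + delta).clamp(0.0, 1.0)
    # Standard parameter update on the perturbed batch.
    optimizer.zero_grad()
    adv_loss = F.cross_entropy(model(adv), y)
    adv_loss.backward()
    optimizer.step()
    return adv_loss.item()
```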
Proceedings Article · DOI

Privacy Risks of Securing Machine Learning Models against Adversarial Examples

TL;DR: This paper measures the success of membership inference attacks against six state-of-the-art defense methods that mitigate the risk of adversarial examples, and proposes two new inference methods that exploit structural properties of robust models on adversarially perturbed data.
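For context, the sketch below is a generic confidence-thresholding membership-inference baseline, not either of the two inference methods proposed in the paper above; all names and numbers are illustrative.

```python
# Generic membership-inference baseline: guess "member" when the model is more
# confident than a threshold tuned on data with known membership (illustrative only).
import numpy as np


def membership_guess(confidences, threshold=0.9):
    """Guess True ('training member') when predicted-class confidence >= threshold."""
    return np.asarray(confidences) >= threshold


# Illustrative evaluation against known membership labels.
confidences = np.array([0.99, 0.95, 0.60, 0.45])
is_member = np.array([True, True, False, False])
attack_accuracy = np.mean(membership_guess(confidences) == is_member)
```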
References
Proceedings Article · DOI

Deep Residual Learning for Image Recognition

TL;DR: In this article, the authors proposed a residual learning framework to ease the training of networks that are substantially deeper than those used previously; it won 1st place in the ILSVRC 2015 classification task.
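A minimal sketch of a basic residual block in the spirit of the framework summarized above; the layer shapes are illustrative, not the paper's exact architectures.

```python
# Basic residual block: learn a residual F(x) and add the identity shortcut.
import torch.nn as nn
import torch.nn.functional as F


class BasicResidualBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return F.relu(out + x)  # y = F(x) + x
```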
Proceedings Article · DOI

ImageNet: A large-scale hierarchical image database

TL;DR: A new database called “ImageNet” is introduced, a large-scale ontology of images built upon the backbone of the WordNet structure, much larger in scale and diversity and much more accurate than the current image datasets.
Dissertation

Learning Multiple Layers of Features from Tiny Images

TL;DR: In this dissertation, the author describes how to train a multi-layer generative model of natural images using a dataset of millions of tiny colour images.
Proceedings Article

Intriguing properties of neural networks

TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Proceedings Article

Explaining and Harnessing Adversarial Examples

TL;DR: It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature; this view is supported by new quantitative results and provides the first explanation of the most intriguing fact about adversarial examples: their generalization across architectures and training sets.
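This is also the paper that introduced the fast gradient sign method referenced above; under the linear view, a single step in the direction of the loss-gradient sign,

$$x_{\mathrm{adv}} = x + \epsilon \cdot \operatorname{sign}\big(\nabla_x J(\theta, x, y)\big),$$

suffices to produce adversarial examples within an $\ell_\infty$ ball of radius $\epsilon$.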