Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Open Access, Posted Content

- 01 Feb 2018 -

TL;DR

This work identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples, and develops attack techniques to overcome this effect.

Abstract:

We identify obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples. While defenses that cause obfuscated gradients appear to defeat iterative optimization-based attacks, we find defenses relying on this effect can be circumvented. We describe characteristic behaviors of defenses exhibiting the effect, and for each of the three types of obfuscated gradients we discover, we develop attack techniques to overcome it. In a case study, examining non-certified white-box-secure defenses at ICLR 2018, we find obfuscated gradients are a common occurrence, with 7 of 9 defenses relying on obfuscated gradients. Our new attacks successfully circumvent 6 completely, and 1 partially, in the original threat model each paper considers.
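The abstract does not spell out the attack techniques, so the following is only an illustrative sketch of the underlying phenomenon, not the authors' code. It assumes a toy linear classifier with a rounding step in front of it (both hypothetical, as are all function names). The rounding step makes the true gradient zero almost everywhere, so an iterative gradient attack stalls; treating that step as the identity on the backward pass gives a usable gradient again.

```python
import numpy as np

def quantize(x, levels=8):
    """Non-differentiable preprocessing (hypothetical defense): snap to a grid."""
    return np.round(x * (levels - 1)) / (levels - 1)

def logit(x, w, b):
    """Score of a toy linear classifier applied after the preprocessing step."""
    return float(quantize(x) @ w + b)

def numeric_grad(x, w, b, eps=1e-4):
    """Central finite differences through the full pipeline, rounding included."""
    g = np.zeros_like(x)
    for i in range(x.size):
        d = np.zeros_like(x)
        d[i] = eps
        g[i] = (logit(x + d, w, b) - logit(x - d, w, b)) / (2 * eps)
    return g

def approx_grad(x, w, b):
    """Backward pass that treats quantize as the identity, so d(logit)/dx ~ w."""
    return w.copy()

def attack(x, w, b, grad_fn, step=0.02, budget=0.1, iters=20):
    """Iterative signed-gradient descent on the logit inside an L-inf ball."""
    x_adv = x.copy()
    for _ in range(iters):
        x_adv = x_adv - step * np.sign(grad_fn(x_adv, w, b))
        x_adv = np.clip(x_adv, x - budget, x + budget)  # stay within the budget
        x_adv = np.clip(x_adv, 0.0, 1.0)                # stay a valid input
    return x_adv

# Toy input placed on grid points, away from any rounding boundary (assumption).
x = np.array([3.0, 4.0, 2.0, 5.0]) / 7.0
w = np.array([1.0, -1.0, 2.0, -0.5])
b = 0.1

x_masked = attack(x, w, b, numeric_grad)  # uses the masked (zero) gradient
x_bypass = attack(x, w, b, approx_grad)   # uses the approximated gradient

print("clean logit:          ", logit(x, w, b))
print("masked-gradient logit:", logit(x_masked, w, b))
print("approx-gradient logit:", logit(x_bypass, w, b))
```

In this sketch the attack driven by the masked gradient never moves the input, which could be misread as robustness, while the attack driven by the approximated gradient changes the sign of the score within the same perturbation budget.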

Citations

Proceedings Article

Audio Adversarial Examples: Targeted Attacks on Speech-to-Text

Nicholas Carlini, +1 more

TL;DR: A white-box iterative optimization-based attack applied end-to-end to Mozilla's DeepSpeech implementation has a 100% success rate, and the feasibility of this attack introduces a new domain for studying adversarial examples.

Posted Content

Provable defenses against adversarial examples via the convex outer adversarial polytope

J. Zico Kolter, +1 more

- 02 Nov 2017 -

arXiv: Learning

TL;DR: A method to learn deep ReLU-based classifiers that are provably robust against norm-bounded adversarial perturbations, and it is shown that the dual problem to this linear program can be represented itself as a deep network similar to the backpropagation network, leading to very efficient optimization approaches that produce guaranteed bounds on the robust loss.

Posted Content

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks.

Francesco Croce, +1 more

- 03 Mar 2020 -

arXiv: Learning

TL;DR: Two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function are proposed and combined with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness.

Proceedings Article

Wild Patterns: Ten Years After the Rise of Adversarial Machine Learning

Battista Biggio, +1 more

TL;DR: A thorough overview of the evolution of this research area over the last ten years and beyond is provided, starting from pioneering, earlier work on the security of non-deep learning algorithms up to more recent work aimed to understand the security properties of deep learning algorithms, in the context of computer vision and cybersecurity tasks.

Posted Content

On Evaluating Adversarial Robustness

Nicholas Carlini, +8 more

- 18 Feb 2019 -

arXiv: Learning

TL;DR: The methodological foundations are discussed, commonly accepted best practices are reviewed, and new methods for evaluating defenses to adversarial examples are suggested.

References

Proceedings Article

Deep Residual Learning for Image Recognition

Kaiming He, +3 more

TL;DR: The authors propose a residual learning framework to ease the training of networks that are substantially deeper than those used previously, which won 1st place in the ILSVRC 2015 classification task.

Journal Article

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are trained simultaneously: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Journal Article

Learning representations by back-propagating errors

David E. Rumelhart, +2 more

- 01 Jan 1988 -

Nature

TL;DR: Back-propagation repeatedly adjusts the weights of the connections in the network so as to minimize a measure of the difference between the actual output vector of the net and the desired output vector, which helps to represent important features of the task domain.

Proceedings Article

Rethinking the Inception Architecture for Computer Vision

Christian Szegedy, +4 more

TL;DR: The authors explore ways to scale up networks that use the added computation as efficiently as possible, through suitably factorized convolutions and aggressive regularization.

Dissertation

Learning Multiple Layers of Features from Tiny Images

Alex Krizhevsky

TL;DR: The author describes how to train a multi-layer generative model of natural images, using a dataset of millions of tiny colour images.

Related Papers (5)

Explaining and Harnessing Adversarial Examples

Intriguing properties of neural networks

Towards Evaluating the Robustness of Neural Networks

Towards Deep Learning Models Resistant to Adversarial Attacks.

Deep Residual Learning for Image Recognition