
Showing papers by Nicholas Carlini published in 2017


Proceedings ArticleDOI
22 May 2017
TL;DR: In this paper, the authors demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with 100% probability.
Abstract: Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to adversarial examples: given an input x and any target classification t, it is possible to find a new input x' that is similar to x but classified as t. This makes it difficult to apply neural networks in security-critical areas. Defensive distillation is a recently proposed approach that can take an arbitrary neural network and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from 95% to 0.5%. In this paper, we demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with 100% probability. Our attacks are tailored to three distance metrics used previously in the literature, and when compared to previous adversarial example generation algorithms, our attacks are often much more effective (and never worse). Furthermore, we propose using high-confidence adversarial examples in a simple transferability test that we show can also be used to break defensive distillation. We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.

6,528 citations
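The attack in the entry above is formulated as an optimization problem: find the smallest perturbation, under a chosen distance metric, that drives the classifier to the target class t. Below is a minimal sketch of that structure for the L2 metric. Everything concrete here is an illustrative assumption: a random linear model stands in for the neural network, the loop is plain gradient descent rather than Adam, and the constants c and kappa are arbitrary; the paper additionally uses a change of variables to keep pixels in range and a binary search over c.

```python
import numpy as np

# Toy linear classifier standing in for a neural network's logits
# (illustrative stand-in; the paper attacks real image classifiers).
rng = np.random.default_rng(0)
W = rng.normal(size=(10, 32))   # 10 classes, 32 input features
b = rng.normal(size=10)

def logits(x):
    return W @ x + b

def l2_targeted_attack(x, target, c=1.0, kappa=0.5, lr=0.01, steps=500):
    """Minimize ||delta||_2^2 + c * max(max_{i != t} Z(x+delta)_i - Z(x+delta)_t, -kappa)
    by plain gradient descent on delta."""
    delta = np.zeros_like(x)
    for _ in range(steps):
        z = logits(x + delta)
        # Highest-scoring class other than the target.
        other = int(np.argmax(np.where(np.arange(len(z)) == target, -np.inf, z)))
        margin = z[other] - z[target]
        grad = 2.0 * delta                       # gradient of the distance term
        if margin > -kappa:                      # hinge term still active
            grad += c * (W[other] - W[target])   # its gradient for a linear model
        delta -= lr * grad
    return x + delta

x = rng.normal(size=32)
x_adv = l2_targeted_attack(x, target=3)
print("original class:", int(np.argmax(logits(x))),
      "adversarial class:", int(np.argmax(logits(x_adv))),
      "L2 distortion:", round(float(np.linalg.norm(x_adv - x)), 3))
```

The margin term pushes the target logit above every other logit by kappa, while the distance term keeps the perturbation small; trading the two off against each other is what makes the resulting examples low-distortion.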


Proceedings ArticleDOI
03 Nov 2017
TL;DR: In this paper, the authors survey ten recent proposals for detecting adversarial examples and compare their efficacy, conclude that all of them can be defeated by constructing new loss functions, and propose several simple guidelines for evaluating future defenses.
Abstract: Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.

1,703 citations
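The common thread in the attacks above is that each detector is folded into the objective the attacker optimizes, rather than being attacked as an afterthought. A minimal sketch of one such adaptive loss follows. The linear classifier, the linear detector, its zero threshold, and the weights c and d are all illustrative assumptions rather than the constructions used in the paper; the point is only the shape of the combined objective.

```python
import numpy as np

rng = np.random.default_rng(1)
W = rng.normal(size=(10, 32)); b = rng.normal(size=10)   # toy classifier (stand-in)
v = rng.normal(size=32)                                  # toy detector (stand-in)

def classifier_logits(x):
    return W @ x + b

def detector_score(x):
    # Positive score => the input is flagged as adversarial (hypothetical detector).
    return v @ x

def combined_loss(x_adv, x, target, c=1.0, d=1.0, kappa=0.5):
    """Stay close to x, reach the target class with margin kappa, and keep the
    detector score below its flagging threshold (0 here)."""
    z = classifier_logits(x_adv)
    other = np.max(np.delete(z, target))
    misclassify = max(other - z[target] + kappa, 0.0)
    evade_detector = max(detector_score(x_adv), 0.0)
    return np.sum((x_adv - x) ** 2) + c * misclassify + d * evade_detector

def numeric_grad(f, x, eps=1e-4):
    # Finite-difference gradient, good enough for a low-dimensional toy example.
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def adaptive_attack(x, target, steps=300, lr=0.05):
    x_adv = x.copy()
    for _ in range(steps):
        x_adv -= lr * numeric_grad(lambda xa: combined_loss(xa, x, target), x_adv)
    return x_adv

x = rng.normal(size=32)
x_adv = adaptive_attack(x, target=7)
print("class:", int(np.argmax(classifier_logits(x_adv))),
      "detector score:", round(float(detector_score(x_adv)), 3),
      "L2:", round(float(np.linalg.norm(x_adv - x)), 3))
```

If the detector term were dropped, the perturbed input would likely be flagged; adding such a term is roughly what "constructing a new loss function" refers to in the abstract above.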


Posted Content
TL;DR: It is shown that an adaptive adversary can create adversarial examples with low distortion, implying that an ensemble of weak defenses is not sufficient to provide a strong defense against adversarial examples.
Abstract: Ongoing research has proposed several methods to defend neural networks against adversarial examples, many of which researchers have shown to be ineffective. We ask whether a strong defense can be created by combining multiple (possibly weak) defenses. To answer this question, we study three defenses that follow this approach. Two of these are recently proposed defenses that intentionally combine components designed to work well together. A third defense combines three independent defenses. For all the components of these defenses and the combined defenses themselves, we show that an adaptive adversary can create adversarial examples successfully with low distortion. Thus, our work implies that an ensemble of weak defenses is not sufficient to provide a strong defense against adversarial examples.

268 citations
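When the defense is a combination of components, the adaptive adversary described above optimizes against all of the components at once rather than one at a time. The sketch below shows that joint objective in its simplest form: three random linear classifiers stand in for the combined defense components (a loose illustration, not the defenses studied in the paper), and the attacker minimizes the sum of their per-component losses plus the distortion.

```python
import numpy as np

rng = np.random.default_rng(2)

# Three toy "defended models" standing in for the components of a combined
# defense (illustrative stand-ins; each is just a random linear classifier).
defenses = [(rng.normal(size=(10, 32)), rng.normal(size=10)) for _ in range(3)]

def margin_loss(W, b, x, target, kappa=0.5):
    # Hinge on how far the target class is from being this component's top prediction.
    z = W @ x + b
    return max(np.max(np.delete(z, target)) - z[target] + kappa, 0.0)

def ensemble_loss(x_adv, x, target, c=1.0):
    # Jointly penalize every component, so a single perturbation fools all of them.
    return (np.sum((x_adv - x) ** 2)
            + c * sum(margin_loss(W, b, x_adv, target) for W, b in defenses))

def numeric_grad(f, x, eps=1e-4):
    g = np.zeros_like(x)
    for i in range(len(x)):
        e = np.zeros_like(x); e[i] = eps
        g[i] = (f(x + e) - f(x - e)) / (2 * eps)
    return g

def attack(x, target, steps=400, lr=0.05):
    x_adv = x.copy()
    for _ in range(steps):
        x_adv -= lr * numeric_grad(lambda xa: ensemble_loss(xa, x, target), x_adv)
    return x_adv

x = rng.normal(size=32)
x_adv = attack(x, target=2)
print("per-component predictions:", [int(np.argmax(W @ x_adv + b)) for W, b in defenses],
      "L2:", round(float(np.linalg.norm(x_adv - x)), 3))
```

Summing the component losses lets one perturbation satisfy all of them together, which loosely matches the abstract's finding that the combined defenses can still be defeated with low distortion.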


01 Jan 2017
TL;DR: In this article, the authors show that an adaptive adversary can create adversarial examples with low distortion, implying that an ensemble of weak defenses is not sufficient to provide a strong defense against adversarial examples.
Abstract: Ongoing research has proposed several methods to defend neural networks against adversarial examples, many of which researchers have shown to be ineffective. We ask whether a strong defense can be created by combining multiple (possibly weak) defenses. To answer this question, we study three defenses that follow this approach. Two of these are recently proposed defenses that intentionally combine components designed to work well together. A third defense combines three independent defenses. For all the components of these defenses and the combined defenses themselves, we show that an adaptive adversary can create adversarial examples successfully with low distortion. Thus, our work implies that an ensemble of weak defenses is not sufficient to provide a strong defense against adversarial examples.

231 citations


Posted Content
TL;DR: It is found that adversarial examples can be constructed that defeat MagNet and "Efficient Defenses..." with only a slight increase in distortion.
Abstract: MagNet and "Efficient Defenses..." were recently proposed as defenses against adversarial examples. We find that we can construct adversarial examples that defeat these defenses with only a slight increase in distortion.

217 citations


Posted Content
TL;DR: In this article, the authors survey ten recent proposals that are designed for detection of adversarial examples and compare their efficacy, concluding that all of them can be defeated by constructing new loss functions.
Abstract: Neural networks are known to be vulnerable to adversarial examples: inputs that are close to natural inputs but classified incorrectly. In order to better understand the space of adversarial examples, we survey ten recent proposals that are designed for detection and compare their efficacy. We show that all can be defeated by constructing new loss functions. We conclude that adversarial examples are significantly harder to detect than previously appreciated, and the properties believed to be intrinsic to adversarial examples are in fact not. Finally, we propose several simple guidelines for evaluating future proposed defenses.

164 citations


Posted Content
TL;DR: It is demonstrated that one of the recent ICLR defense proposals, adversarial retraining, provably succeeds at increasing the distortion required to construct adversarial examples by a factor of 4.2.
Abstract: The ability to deploy neural networks in real-world, safety-critical systems is severely limited by the presence of adversarial examples: slightly perturbed inputs that are misclassified by the network. In recent years, several techniques have been proposed for increasing robustness to adversarial examples --- and yet most of these have been quickly shown to be vulnerable to future attacks. For example, over half of the defenses proposed by papers accepted at ICLR 2018 have already been broken. We propose to address this difficulty through formal verification techniques. We show how to construct provably minimally distorted adversarial examples: given an arbitrary neural network and input sample, we can construct adversarial examples which we prove are of minimal distortion. Using this approach, we demonstrate that one of the recent ICLR defense proposals, adversarial retraining, provably succeeds at increasing the distortion required to construct adversarial examples by a factor of 4.2.

115 citations
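The phrase "provably minimally distorted" in the entry above means the distortion bound comes from a verifier rather than from the failure of any particular attack: the paper uses the Reluplex solver to prove that no adversarial example exists within a given radius, and searches over that radius. The sketch below shows only the search structure. A toy linear model is used so that the robustness check has an exact closed form; this closed form is an illustrative stand-in for the SMT-based verification the paper performs on small ReLU networks, and the L-infinity metric and search bounds are assumptions of the sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.normal(size=(10, 32)); b = rng.normal(size=10)   # toy linear "network" (stand-in)

def predict(x):
    return int(np.argmax(W @ x + b))

def certified_robust(x, eps):
    """Exact check for the toy linear model: True iff no perturbation with
    ||delta||_inf <= eps changes the predicted label. (A verifier such as
    Reluplex answers the same question for small ReLU networks.)"""
    i = predict(x)
    z = W @ x + b
    for j in range(len(b)):
        if j == i:
            continue
        worst_margin = (z[i] - z[j]) - eps * np.sum(np.abs(W[i] - W[j]))
        if worst_margin <= 0:
            return False
    return True

def minimal_distortion(x, lo=0.0, hi=10.0, tol=1e-4):
    """Binary search for the smallest eps at which robustness fails, i.e. the
    provably minimal L-infinity distortion of any adversarial example
    (assumes some adversarial example exists within radius hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if certified_robust(x, mid):
            lo = mid   # still robust at mid: the minimal distortion is larger
        else:
            hi = mid   # an adversarial example exists within radius mid
    return hi

x = rng.normal(size=32)
print("predicted class:", predict(x),
      "provably minimal L_inf distortion (toy model):", round(minimal_distortion(x), 4))
```

With a sound and complete verifier in place of certified_robust, the same binary search yields a distortion bound that holds against every possible attack, which is what lets the paper state that adversarial retraining provably increases the required distortion.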