Showing papers by "Nicholas Carlini" published in 2016

Open Access

Proceedings Article•

Nicholas Carlini¹, Pratyush Mishra¹, Tavish Vaidya², Yuankai Zhang², Micah Sherr², Clay Shields², David Wagner¹, Wenchao Zhou²

University of California, Berkeley¹, Georgetown University²

10 Aug 2016

TL;DR: This paper explores how voice interfaces can be attacked with hidden voice commands that are unintelligible to human listeners but are interpreted as commands by devices.

Abstract: Voice interfaces are becoming more ubiquitous and are now the primary input method for many devices. We explore in this paper how they can be attacked with hidden voice commands that are unintelligible to human listeners but which are interpreted as commands by devices. We evaluate these attacks under two different threat models. In the black-box model, an attacker uses the speech recognition system as an opaque oracle. We show that the adversary can produce difficult to understand commands that are effective against existing systems in the black-box model. Under the white-box model, the attacker has full knowledge of the internals of the speech recognition system and uses it to create attack commands that we demonstrate through user testing are not understandable by humans. We then evaluate several defenses, including notifying the user when a voice command is accepted; a verbal challenge-response protocol; and a machine learning approach that can detect our attacks with 99.8% accuracy.

545 citations

Posted Content•

Technical Report on the CleverHans v2.1.0 Adversarial Examples Library

[...]

03 Oct 2016-arXiv: Learning

TL;DR: The core functionalities of the CleverHans library are presented, namely the attacks based on adversarial examples and defenses to improve the robustness of machine learning models to these attacks.

Abstract: CleverHans is a software library that provides standardized reference implementations of adversarial example construction techniques and adversarial training. The library may be used to develop more robust machine learning models and to provide standardized benchmarks of models' performance in the adversarial setting. Benchmarks constructed without a standardized implementation of adversarial example construction are not comparable to each other, because a good result may indicate a robust model or it may merely indicate a weak implementation of the adversarial example construction procedure. This technical report is structured as follows. Section 1 provides an overview of adversarial examples in machine learning and of the CleverHans software. Section 2 presents the core functionalities of the library: namely the attacks based on adversarial examples and defenses to improve the robustness of machine learning models to these attacks. Section 3 describes how to report benchmark results using the library. Section 4 describes the versioning system.

400 citations

Posted Content•

Defensive Distillation is Not Robust to Adversarial Examples

Nicholas Carlini, David Wagner

14 Jul 2016-arXiv: Cryptography and Security

TL;DR: It is shown that defensive distillation is not secure: it is no more resistant to targeted misclassification attacks than unprotected neural networks.

Abstract: We show that defensive distillation is not secure: it is no more resistant to targeted misclassification attacks than unprotected neural networks.

297 citations

Posted Content•

Towards Evaluating the Robustness of Neural Networks

Nicholas Carlini¹, David Wagner¹

University of California, Berkeley¹

16 Aug 2016-arXiv: Cryptography and Security

TL;DR: It is demonstrated that defensive distillation does not significantly increase the robustness of neural networks, and three new attack algorithms are introduced that succeed on both distilled and undistilled neural networks with 100% probability.

Abstract: Neural networks provide state-of-the-art results for most machine learning tasks. Unfortunately, neural networks are vulnerable to adversarial examples: given an input $x$ and any target classification $t$, it is possible to find a new input $x'$ that is similar to $x$ but classified as $t$. This makes it difficult to apply neural networks in security-critical areas. Defensive distillation is a recently proposed approach that can take an arbitrary neural network, and increase its robustness, reducing the success rate of current attacks' ability to find adversarial examples from $95\%$ to $0.5\%$. In this paper, we demonstrate that defensive distillation does not significantly increase the robustness of neural networks by introducing three new attack algorithms that are successful on both distilled and undistilled neural networks with $100\%$ probability. Our attacks are tailored to three distance metrics used previously in the literature, and when compared to previous adversarial example generation algorithms, our attacks are often much more effective (and never worse). Furthermore, we propose using high-confidence adversarial examples in a simple transferability test we show can also be used to break defensive distillation. We hope our attacks will be used as a benchmark in future defense attempts to create neural networks that resist adversarial examples.

224 citations