Open Access · Posted Content

Transferability in Machine Learning: from Phenomena to Black-Box Attacks using Adversarial Samples

TL;DR
New transferability attacks are introduced between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees.
Abstract
Many machine learning models are vulnerable to adversarial examples: inputs that are specially crafted to cause a machine learning model to produce an incorrect output. Adversarial examples that affect one model often affect another model, even if the two models have different architectures or were trained on different training sets, so long as both models were trained to perform the same task. An attacker may therefore train their own substitute model, craft adversarial examples against the substitute, and transfer them to a victim model, with very little information about the victim. Recent work has further developed a technique that uses the victim model as an oracle to label a synthetic training set for the substitute, so the attacker need not even collect a training set to mount the attack. We extend these recent techniques using reservoir sampling to greatly enhance the efficiency of the training procedure for the substitute model. We introduce new transferability attacks between previously unexplored (substitute, victim) pairs of machine learning model classes, most notably SVMs and decision trees. We demonstrate our attacks on two commercial machine learning classification systems from Amazon (96.19% misclassification rate) and Google (88.94%) using only 800 queries of the victim model, thereby showing that existing machine learning approaches are in general vulnerable to systematic black-box attacks regardless of their structure.
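As a rough illustration of the reservoir sampling step used to cap the cost of substitute training, the sketch below implements standard reservoir sampling (Algorithm R) in Python. It is not the authors' code; the function name, the choice of k, and the idea of sampling candidate synthetic points before querying the victim oracle are illustrative assumptions.

    import random

    def reservoir_sample(stream, k, rng=random):
        """Keep a uniform random sample of k items from a stream of unknown length."""
        reservoir = []
        for i, item in enumerate(stream):
            if i < k:
                reservoir.append(item)   # fill the reservoir first
            else:
                j = rng.randint(0, i)    # each new item survives with probability k / (i + 1)
                if j < k:
                    reservoir[j] = item
        return reservoir

    # Hypothetical usage: keep only 200 synthetic points per augmentation round,
    # so only those points need to be labeled by querying the victim model.
    # kept = reservoir_sample(candidate_synthetic_points, k=200)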


Citations
Posted Content

Towards Deep Learning Models Resistant to Adversarial Attacks

TL;DR: This work studies the adversarial robustness of neural networks through the lens of robust optimization, and suggests the notion of security against a first-order adversary as a natural and broad security guarantee.
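For readers unfamiliar with the first-order adversary studied in that work, a minimal PyTorch-style sketch of the projected gradient descent (PGD) attack on an L-infinity ball is shown below; it is a generic illustration assuming inputs in [0, 1], not the paper's reference implementation, and the step sizes are placeholder values.

    import torch

    def pgd_attack(model, loss_fn, x, y, eps=0.3, alpha=0.01, steps=40):
        """L-infinity PGD: random start, then iterated signed-gradient ascent with projection."""
        x0 = x.detach()
        x_adv = (x0 + torch.empty_like(x0).uniform_(-eps, eps)).clamp(0.0, 1.0)
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = loss_fn(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                       # ascend the loss
                x_adv = torch.max(torch.min(x_adv, x0 + eps), x0 - eps)   # project onto the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)                             # stay in valid input range
        return x_adv.detach()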
Book Chapter

Adversarial examples in the physical world

TL;DR: It is found that a large fraction of adversarial examples are classified incorrectly even when perceived through a camera, which shows that even in physical-world scenarios, machine learning systems are vulnerable to adversarial examples.
Proceedings Article

Adversarial Examples Are Not Easily Detected: Bypassing Ten Detection Methods

TL;DR: In this paper, the authors survey ten recent proposals for detecting adversarial examples and compare their efficacy, concluding that all can be defeated by constructing new loss functions, and they propose several simple guidelines for evaluating future proposed defenses.
Proceedings Article

Robust Physical-World Attacks on Deep Learning Visual Classification

TL;DR: This work proposes a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions and shows that adversarial examples generated using RP2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including viewpoints.
Journal Article

Threat of Adversarial Attacks on Deep Learning in Computer Vision: A Survey

TL;DR: A comprehensive survey on adversarial attacks on deep learning in computer vision can be found in this paper, where the authors review works that design adversarial attacks, analyze the existence of such attacks, and propose defenses against them.
References
Posted Content

Distilling the Knowledge in a Neural Network

TL;DR: This work shows that distilling the knowledge in an ensemble of models into a single model can significantly improve the acoustic model of a heavily used commercial system, and it introduces a new type of ensemble composed of one or more full models and many specialist models that learn to distinguish fine-grained classes that the full models confuse.
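A minimal sketch of the soft-target loss behind this kind of distillation is given below, assuming PyTorch-style logits from a student and a teacher (or ensemble average); the temperature value is illustrative, not taken from the paper.

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, temperature=2.0):
        """Soft-target loss: match the student's softened distribution to the teacher's."""
        soft_targets = F.softmax(teacher_logits / temperature, dim=-1)
        log_probs = F.log_softmax(student_logits / temperature, dim=-1)
        # The T^2 factor keeps soft-target gradients on the same scale as hard-label gradients
        return F.kl_div(log_probs, soft_targets, reduction="batchmean") * temperature ** 2

In practice this term is typically combined with the ordinary cross-entropy on the true labels.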
Proceedings Article

Intriguing properties of neural networks

TL;DR: It is found that there is no distinction between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, and it is suggested that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.
Book

Machine Learning : A Probabilistic Perspective

TL;DR: This textbook offers a comprehensive and self-contained introduction to the field of machine learning, based on a unified, probabilistic approach, and is suitable for upper-level undergraduates with an introductory-level college math background and beginning graduate students.
Proceedings Article

Explaining and Harnessing Adversarial Examples

TL;DR: It is argued that the primary cause of neural networks' vulnerability to adversarial perturbation is their linear nature, supported by new quantitative results while giving the first explanation of the most intriguing fact about them: their generalization across architectures and training sets.
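The fast gradient sign method proposed in that paper follows directly from this linearity argument; a minimal PyTorch-style sketch is shown below, with the epsilon value and the [0, 1] input range assumed for illustration.

    import torch

    def fgsm(model, loss_fn, x, y, eps=0.1):
        """Fast gradient sign method: a single step of size eps along the gradient sign."""
        x = x.detach().clone().requires_grad_(True)
        loss = loss_fn(model(x), y)
        grad = torch.autograd.grad(loss, x)[0]
        # Move each feature by +/- eps in the direction that increases the loss
        return (x + eps * grad.sign()).clamp(0.0, 1.0).detach()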
Proceedings ArticleDOI

The Limitations of Deep Learning in Adversarial Settings

TL;DR: This work formalizes the space of adversaries against deep neural networks (DNNs) and introduces a novel class of algorithms to craft adversarial samples based on a precise understanding of the mapping between inputs and outputs of DNNs.
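A loose sketch in the spirit of that paper's Jacobian-based saliency map is given below; it scores each input feature of a single example by how much it pushes the model toward a chosen target class. The shapes, clamping choices, and simplifications are assumptions, not the paper's exact algorithm.

    import torch

    def jacobian_saliency(model, x, target_class):
        """Per-feature saliency for pushing a single input x toward target_class."""
        x = x.detach().clone().requires_grad_(True)
        logits = model(x)                               # assumed shape (1, num_classes)
        target_score = logits[0, target_class]
        other_score = logits[0].sum() - target_score
        grad_target = torch.autograd.grad(target_score, x, retain_graph=True)[0]
        grad_other = torch.autograd.grad(other_score, x)[0]
        # Salient features increase the target class score while decreasing the others
        return grad_target.clamp(min=0) * grad_other.clamp(max=0).abs()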