Practical Black-Box Attacks against Machine Learning

doi:10.1145/3052973.3053009

Open Access, Proceedings Article

Nicolas Papernot, +5 more

- pp 506-519

TLDR

This work introduces the first practical demonstration of an attacker controlling a remotely hosted DNN with no knowledge of the model internals or its training data, and finds that this black-box attack strategy can evade defense strategies previously found to make adversarial example crafting harder.

Abstract:

Machine learning (ML) models, e.g., deep neural networks (DNNs), are vulnerable to adversarial examples: malicious inputs modified to yield erroneous model outputs, while appearing unmodified to human observers. Potential attacks include having malicious content like malware identified as legitimate or controlling vehicle behavior. Yet, all existing adversarial example attacks require knowledge of either the model internals or its training data. We introduce the first practical demonstration of an attacker controlling a remotely hosted DNN with no such knowledge. Indeed, the only capability of our black-box adversary is to observe labels given by the DNN to chosen inputs. Our attack strategy consists in training a local model to substitute for the target DNN, using inputs synthetically generated by an adversary and labeled by the target DNN. We use the local substitute to craft adversarial examples, and find that they are misclassified by the targeted DNN. To perform a real-world and properly-blinded evaluation, we attack a DNN hosted by MetaMind, an online deep learning API. We find that their DNN misclassifies 84.24% of the adversarial examples crafted with our substitute. We demonstrate the general applicability of our strategy to many ML techniques by conducting the same attack against models hosted by Amazon and Google, using logistic regression substitutes. They yield adversarial examples misclassified by Amazon and Google at rates of 96.19% and 88.94%. We also find that this black-box attack strategy is capable of evading defense strategies previously found to make adversarial example crafting harder.
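The attack loop described in the abstract (label a synthetic dataset with the target, train a local substitute, craft adversarial examples on the substitute, then submit them to the target) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the "remote" model is a hypothetical local function `oracle_labels`, the substitute is a small logistic regression, the synthetic inputs come from a Jacobian-style augmentation step, and the perturbation is a fast-gradient-sign step. The abstract itself does not name these choices, and the toy data is two-dimensional rather than images.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the remote DNN. The attacker may only call
# oracle_labels(); W_SECRET and B_SECRET are never read by the attack code.
W_SECRET = np.array([1.0, -1.0])
B_SECRET = 0.0

def oracle_labels(x):
    """Label-only access: return a 0/1 label for each row of x."""
    return (x @ W_SECRET + B_SECRET > 0).astype(int)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_substitute(x, y, epochs=500, lr=0.5):
    """Fit a logistic-regression substitute with batch gradient descent."""
    w, b = np.zeros(x.shape[1]), 0.0
    for _ in range(epochs):
        err = sigmoid(x @ w + b) - y  # d(cross-entropy)/d(logit)
        w -= lr * (x.T @ err) / len(x)
        b -= lr * err.mean()
    return w, b

# 1. Start from a small seed set and ask the oracle for its labels.
x_sub = rng.uniform(-1.0, 1.0, size=(10, 2))
y_sub = oracle_labels(x_sub)

# 2. Alternate between training the substitute and generating synthetic
#    inputs. Each new point steps along the sign of the gradient of the
#    substitute's probability for the oracle-assigned label (an assumed,
#    Jacobian-style augmentation), then is labeled by the oracle.
LAMBDA, ROUNDS = 0.1, 3
for _ in range(ROUNDS):
    w, b = train_substitute(x_sub, y_sub)
    direction = (2 * y_sub - 1)[:, None] * np.sign(w)
    x_new = x_sub + LAMBDA * direction
    x_sub = np.vstack([x_sub, x_new])
    y_sub = np.concatenate([y_sub, oracle_labels(x_new)])
w, b = train_substitute(x_sub, y_sub)

# 3. Craft adversarial examples on the substitute only (fast gradient sign
#    of the cross-entropy loss with respect to the input).
EPS = 0.5
x_test = rng.uniform(-1.0, 1.0, size=(200, 2))
y_test = oracle_labels(x_test)
loss_grad = (sigmoid(x_test @ w + b) - y_test)[:, None] * w
x_adv = x_test + EPS * np.sign(loss_grad)

# 4. Measure transferability: how often the oracle's label changes.
transfer_rate = float(np.mean(oracle_labels(x_adv) != y_test))
print(f"oracle misclassifies {transfer_rate:.1%} of substitute-crafted inputs")
```

The rate printed by this toy has no bearing on the 84.24%, 96.19%, and 88.94% figures reported in the abstract; the sketch only shows that the substitute is trained and attacked without ever reading the oracle's parameters.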

Citations

Proceedings Article

Boosting Adversarial Attacks with Momentum

Yinpeng Dong, +6 more

TL;DR: This work proposes a broad class of momentum-based iterative algorithms that boost adversarial attacks by integrating a momentum term into the iterative attack process, which stabilizes update directions and helps escape poor local maxima during the iterations, yielding more transferable adversarial examples.

Posted Content

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples

Anish Athalye, +2 more

- 01 Feb 2018 -

arXiv: Learning

TL;DR: This work identifies obfuscated gradients, a kind of gradient masking, as a phenomenon that leads to a false sense of security in defenses against adversarial examples, and develops attack techniques to overcome this effect.

Journal Article

One Pixel Attack for Fooling Deep Neural Networks

Jiawei Su, +2 more

- 04 Jan 2019 -

IEEE Transactions on Evolutionary Comput...

TL;DR: This paper proposes a novel method for generating one-pixel adversarial perturbations based on differential evolution (DE), which requires less adversarial information (a black-box attack) and can fool more types of networks due to the inherent features of DE.

Proceedings Article

Robust Physical-World Attacks on Deep Learning Visual Classification

Kevin Eykholt, +8 more

TL;DR: This work proposes a general attack algorithm, Robust Physical Perturbations (RP2), to generate robust visual adversarial perturbations under different physical conditions and shows that adversarial examples generated using RP2 achieve high targeted misclassification rates against standard-architecture road sign classifiers in the physical world under various environmental conditions, including viewpoints.

Posted Content

Concrete Problems in AI Safety

Dario Amodei, +5 more

- 21 Jun 2016 -

arXiv: Artificial Intelligence

TL;DR: This work presents a list of five practical research problems related to accident risk, categorized according to whether the problem originates from having the wrong objective function, an objective function that is too expensive to evaluate frequently, or undesirable behavior during the learning process.

References

Journal Article

Gradient-based learning applied to document recognition

Yann LeCun, +6 more

TL;DR: In this article, a graph transformer network (GTN) is proposed for handwritten character recognition, which can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters.

Proceedings Article

Glove: Global Vectors for Word Representation

Jeffrey Pennington, +2 more

TL;DR: This work proposes a new global log-bilinear regression model that combines the advantages of the two major model families in the literature, global matrix factorization and local context window methods, and produces a vector space with meaningful substructure.

Posted Content

Distilling the Knowledge in a Neural Network

Geoffrey E. Hinton, +2 more

- 09 Mar 2015 -

arXiv: Machine Learning

TL;DR: This work shows that it can significantly improve the acoustic model of a heavily used commercial system by distilling the knowledge in an ensemble of models into a single model and introduces a new type of ensemble composed of one or more full models and many specialist models which learn to distinguish fine-grained classes that the full models confuse.

Posted Content

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

Kaiming He, +3 more

- 06 Feb 2015 -

arXiv: Computer Vision and Pattern Recog...

TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.

Proceedings Article

Intriguing properties of neural networks

Christian Szegedy, +7 more

TL;DR: No distinction is found between individual high-level units and random linear combinations of high-level units, according to various methods of unit analysis, which suggests that it is the space, rather than the individual units, that contains the semantic information in the high layers of neural networks.

Related Papers (5)

Explaining and Harnessing Adversarial Examples

Intriguing properties of neural networks

Towards Evaluating the Robustness of Neural Networks

DeepFool: A Simple and Accurate Method to Fool Deep Neural Networks

Towards Deep Learning Models Resistant to Adversarial Attacks