Open Access · Posted Content
Dataset Security for Machine Learning: Data Poisoning, Backdoor Attacks, and Defenses.
Micah Goldblum, Dimitris Tsipras, Chulin Xie, Xinyun Chen, Avi Schwarzschild, Dawn Song, Aleksander Madry, Bo Li, Tom Goldstein
TL;DR
In this article, the authors systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space.
Abstract:
As machine learning systems grow in scale, so do their training data requirements, forcing practitioners to automate and outsource the curation of training data in order to achieve state-of-the-art performance. The absence of trustworthy human supervision over the data collection process exposes organizations to security vulnerabilities; training data can be manipulated to control and degrade the downstream behaviors of learned models. The goal of this work is to systematically categorize and discuss a wide range of dataset vulnerabilities and exploits, approaches for defending against these threats, and an array of open problems in this space. In addition to describing various poisoning and backdoor threat models and the relationships among them, we develop their unified taxonomy.
Citations
Posted Content
Evaluating Large Language Models Trained on Code
Mark Chen, Jerry Tworek, Heewoo Jun, Qiming Yuan, Henrique Ponde de Oliveira Pinto, Jared Kaplan, Harrison Edwards, Yuri Burda, Nicholas Joseph, Greg Brockman, Alex Ray, Raul Puri, Gretchen Krueger, Michael Petrov, Heidy Khlaaf, Girish Sastry, Pamela Mishkin, Brooke Chan, Scott Gray, Nick Ryder, Mikhail Pavlov, Alethea Power, Lukasz Kaiser, Mohammad Bavarian, Clemens Winter, Philippe Tillet, Felipe Petroski Such, Dave Cummings, Matthias Plappert, Fotios Chantzis, Elizabeth A. Barnes, Ariel Herbert-Voss, William H. Guss, Alex Nichol, Alex Paino, Nikolas Tezak, Jie Tang, Igor Babuschkin, Suchir Balaji, Shantanu Jain, William Saunders, Christopher Hesse, Andrew N. Carr, Jan Leike, Joshua Achiam, Vedant Misra, Evan Morikawa, Alec Radford, Matthew M. Knight, Miles Brundage, Mira Murati, Katie Mayer, Peter Welinder, Bob McGrew, Dario Amodei, Samuel McCandlish, Ilya Sutskever, Wojciech Zaremba
TL;DR: Codex is a GPT language model fine-tuned on publicly available code from GitHub; the authors study its Python code-writing capabilities and show that repeated sampling from the model is a surprisingly effective strategy for producing working solutions to difficult prompts.
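The repeated-sampling strategy summarized above is typically scored with the pass@k metric from that paper: the probability that at least one of k samples, drawn without replacement from n generations of which c are correct, passes the tests. A minimal sketch of the unbiased estimator (the function name here is illustrative):

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k), computed as a
    numerically stable product to avoid huge binomial coefficients."""
    if n - c < k:
        # Fewer than k incorrect samples exist, so any draw of k must
        # include at least one correct sample.
        return 1.0
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))
```

For example, with a single draw (k=1) the estimator reduces to the empirical success rate c/n.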
Proceedings ArticleDOI
Strong Data Augmentation Sanitizes Poisoning and Backdoor Attacks Without an Accuracy Tradeoff
Eitan Borgnia, Valeriia Cherepanova, Liam Fowl, Amin Ghiasi, Jonas Geiping, Micah Goldblum, Tom Goldstein, Arjun Gupta
TL;DR: This paper shows that strong data augmentations, such as mixup and CutMix, can significantly diminish the threat of poisoning and backdoor attacks without trading off performance; the authors further verify the effectiveness of this simple defense against adaptive poisoning methods and compare it to baselines, including the popular differentially private SGD (DP-SGD) defense.
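For reference, the mixup augmentation named in this defense blends pairs of examples and their one-hot labels with a Beta-distributed coefficient, which dilutes the influence of any single (possibly poisoned) example. A minimal NumPy sketch, not the paper's implementation:

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=1.0, rng=None):
    """Convexly combine two examples and their one-hot labels with a
    mixing coefficient lam ~ Beta(alpha, alpha)."""
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1.0 - lam) * x2
    y = lam * y1 + (1.0 - lam) * y2
    return x, y, lam
```

CutMix follows the same recipe but splices a rectangular patch of one image into the other instead of blending pixel values.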
Posted Content
Property Inference From Poisoning
TL;DR: This paper proposes a poisoning attack that allows the adversary to learn the prevalence in the training data of any property it chooses, showing that poisoning can boost information leakage significantly and should be considered a stronger threat model in sensitive applications.
Journal ArticleDOI
Adversarial XAI Methods in Cybersecurity
Aditya Kuppa, Nhien-An Le-Khac
TL;DR: This paper proposes a black-box attack that leverages explainable artificial intelligence (XAI) methods to compromise the confidentiality and privacy properties of underlying classifiers, and shows that such methods can also facilitate powerful attacks, including poisoning and backdoor attacks.
Posted Content
What Doesn't Kill You Makes You Robust(er): Adversarial Training against Poisons and Backdoors.
TL;DR: In this paper, the authors extend the adversarial training framework to instead defend against (training-time) poisoning and backdoor attacks, and they show that this defense withstands adaptive attacks, generalizes to diverse threat models, and incurs a better performance trade-off than previous defenses.
References
Posted Content
Backdoor Attacks Against Deep Learning Systems in the Physical World
TL;DR: This study confirms that (physical) backdoor attacks are not a hypothetical phenomenon but pose a serious real-world threat to critical classification tasks, and that new, more robust defenses against backdoors in the physical world are needed.
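The physical attacks studied here use real-world objects as triggers, but the underlying backdoor recipe is the same as the classic digital (BadNets-style) one: stamp a small trigger pattern onto a fraction of the training images and relabel them to the attacker's target class. A minimal sketch, with the helper name and patch details purely illustrative:

```python
import numpy as np

def apply_backdoor(images, labels, target_label, poison_frac=0.05, rng=None):
    """Stamp a 3x3 white trigger patch into the corner of a random fraction
    of images and relabel them to the target class; returns poisoned copies
    and the indices of the poisoned examples."""
    rng = rng or np.random.default_rng()
    images, labels = images.copy(), labels.copy()
    n_poison = int(len(images) * poison_frac)
    idx = rng.choice(len(images), size=n_poison, replace=False)
    images[idx, -3:, -3:] = 1.0  # trigger patch in the bottom-right corner
    labels[idx] = target_label
    return images, labels, idx
```

A model trained on the poisoned set behaves normally on clean inputs but predicts the target class whenever the trigger is present at test time.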
Posted Content
Certified Robustness to Label-Flipping Attacks via Randomized Smoothing
TL;DR: This work presents a unifying view of randomized smoothing over arbitrary functions, and uses this novel characterization to propose a new strategy for building classifiers that are pointwise-certifiably robust to general data poisoning attacks.
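The core mechanism behind such certificates is to replace a single trained classifier with a majority vote over classifiers trained on randomly label-flipped copies of the data, so that a bounded number of adversarial flips cannot change the vote. A minimal sketch of the smoothed prediction, assuming a generic `base_learner(X, y, x_test)` callable (an illustrative interface, not the paper's code, and without the certification step):

```python
import numpy as np

def smoothed_prediction(base_learner, X, y, x_test,
                        n_samples=100, flip_prob=0.1, rng=None):
    """Majority vote of base classifiers trained on randomly
    label-flipped copies of (X, y)."""
    rng = rng or np.random.default_rng()
    labels = np.unique(y)
    votes = {}
    for _ in range(n_samples):
        flips = rng.random(len(y)) < flip_prob
        y_noisy = y.copy()
        # Resample flipped labels uniformly (may keep the original label).
        y_noisy[flips] = rng.choice(labels, size=flips.sum())
        pred = int(base_learner(X, y_noisy, x_test))
        votes[pred] = votes.get(pred, 0) + 1
    return max(votes, key=votes.get)
```

The paper then bounds how many training-label flips are needed to change this vote, which yields the pointwise certificate.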
Posted Content
Data Poisoning Attacks on Federated Machine Learning
TL;DR: Experimental results on real-world datasets show that federated multi-task learning model is very sensitive to poisoning attacks, when the attackers either directly poison the target nodes or indirectly poison the related nodes by exploiting the communication protocol.
Posted Content
A Little Is Enough: Circumventing Defenses For Distributed Learning
TL;DR: This paper shows that small but well-crafted changes are sufficient, leading to a novel non-omniscient attack on distributed learning that goes undetected by all existing defenses.
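The crafted-update idea can be sketched as follows, assuming the attacker estimates the coordinate-wise mean and standard deviation of the benign workers' updates and shifts the aggregate by a small factor z that stays within the benign spread (the helper name and fixed z are illustrative; the paper derives z from the number of workers):

```python
import numpy as np

def little_is_enough_update(benign_updates, z):
    """Malicious update that deviates from the coordinate-wise mean of the
    benign updates by z standard deviations -- small enough to pass
    statistical outlier defenses, yet biasing the aggregate."""
    mu = np.mean(benign_updates, axis=0)
    sigma = np.std(benign_updates, axis=0)
    return mu - z * sigma
```

Because the malicious update lies within the empirical range of honest updates, robust aggregators based on trimming or medians accept it.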
Posted Content
On the Effectiveness of Mitigating Data Poisoning Attacks with Gradient Shaping
TL;DR: This work studies the feasibility of an attack-agnostic defense relying on artifacts common to all poisoning attacks, and proposes a prerequisite for a generic poisoning defense: it must bound gradient magnitudes and minimize differences in orientation.
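The "bound gradient magnitudes" prerequisite is exactly what per-example gradient clipping (the non-private half of DP-SGD) provides. A minimal NumPy sketch, with the helper name illustrative rather than taken from the paper:

```python
import numpy as np

def shape_gradients(per_example_grads, clip_norm=1.0):
    """Clip each per-example gradient to a maximum L2 norm, then average.
    This bounds the influence any single (possibly poisoned) example can
    exert on a training step."""
    shaped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
        shaped.append(g * scale)
    return np.mean(shaped, axis=0)
```

DP-SGD additionally adds calibrated Gaussian noise to the clipped average; the gradient-shaping analysis asks how much of the defensive benefit comes from the clipping alone.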