On the Necessity of Auditable Algorithmic Definitions for Machine Unlearning.

Open Access, Posted Content

Anvith Thudi, +3 more

22 Oct 2021

arXiv: Learning

TLDR

In this paper, the authors show that even for a given training trajectory one cannot formally prove the absence of certain data points used during training, since one can obtain the same model using different datasets.

Abstract:

Machine unlearning, i.e., having a model forget some of its training data, has become increasingly important as privacy legislation promotes variants of the right-to-be-forgotten. In the context of deep learning, approaches to machine unlearning fall broadly into two classes: exact unlearning, where an entity formally removes a data point's impact on the model by retraining the model from scratch, and approximate unlearning, where an entity approximates the model parameters one would obtain by exact unlearning in order to save on compute costs. In this paper we first show that the definition underlying approximate unlearning, which seeks to prove that the approximately unlearned model is close to an exactly retrained model, is incorrect because one can obtain the same model using different datasets. Thus one could unlearn without modifying the model at all. We then turn to exact unlearning approaches and ask how to verify their claims of unlearning. Our results show that even for a given training trajectory one cannot formally prove the absence of certain data points used during training. We thus conclude that unlearning is only well-defined at the algorithmic level, where an entity's only possible auditable claim to unlearning is that they used a particular algorithm designed to allow for external scrutiny during an audit.
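The claim that the same model can be obtained from different datasets can be illustrated with a deliberately tiny example. The sketch below is not the construction used in the paper; it is a toy one-parameter model, with hypothetical names (`train_mean_model`, `dataset_with_point`, `dataset_without_point`), trained by full-batch gradient descent on a squared loss. Because the gradient depends only on the mean of the data, two datasets with the same mean yield the same parameter, even though only one of them contains the point to be forgotten.

```python
def train_mean_model(data, lr=0.1, steps=200, w0=0.0):
    """Full-batch gradient descent on the average of 0.5 * (w - x)**2.

    Returns the final parameter and the list of parameters visited
    along the way (the training trajectory).
    """
    w = w0
    trajectory = [w]
    for _ in range(steps):
        # Gradient of the average loss; algebraically equal to w - mean(data).
        grad = sum(w - x for x in data) / len(data)
        w -= lr * grad
        trajectory.append(w)
    return w, trajectory


# Hypothetical toy datasets: the first contains the point 3.0 that should
# be "forgotten"; the second does not, but has the same mean.
dataset_with_point = [1.0, 3.0]
dataset_without_point = [2.0, 2.0]

w_with, traj_with = train_mean_model(dataset_with_point)
w_without, traj_without = train_mean_model(dataset_without_point)

# Largest gap between the two trajectories at any step
# (zero up to floating-point rounding).
max_gap = max(abs(a - b) for a, b in zip(traj_with, traj_without))

print(f"final parameter (with point):    {w_with:.9f}")
print(f"final parameter (without point): {w_without:.9f}")
print(f"max trajectory gap:              {max_gap:.2e}")
```

In this toy setting, inspecting the final parameter, or even the whole trajectory, cannot reveal whether 3.0 was in the training set, which is the kind of ambiguity the abstract argues makes parameter-level definitions of unlearning problematic.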
