Journal ArticleDOI
Learning to Generate Image Embeddings with User-level Differential Privacy
Zheng Xu,Maxwell D. Collins,Yuxiao Wang,Liviu Panait,Se-Heum Oh,Sean Augenstein,Ting Liu,Florian Schroff,H. Brendan McMahan +8 more
TL;DR: DP-FedEmb as discussed by the authors is a variant of federated learning algorithms with per-user sensitivity control and noise addition, to train from user-partitioned data centralized in the datacenter.
Abstract: Small on-device models have been successfully trained with user-level differential privacy (DP) for next-word prediction and image classification tasks in the past. However, existing methods can fail when directly applied to learn embedding models using supervised training data with a large class space. To achieve user-level DP for large image-to-embedding feature extractors, we propose DP-FedEmb, a variant of federated learning algorithms with per-user sensitivity control and noise addition, to train from user-partitioned data centralized in the datacenter. DP-FedEmb combines virtual clients, partial aggregation, private local fine-tuning, and public pretraining to achieve strong privacy-utility trade-offs. We apply DP-FedEmb to train image embedding models for faces, landmarks, and natural species, and demonstrate its superior utility under the same privacy budget on the benchmark datasets DigiFace, EMNIST, GLD, and iNaturalist. We further illustrate that it is possible to achieve strong user-level DP guarantees of $\epsilon<4$ while keeping the utility drop within 5%, when millions of users can participate in training.
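The per-user sensitivity control and noise addition described in the abstract can be illustrated with a minimal NumPy sketch. This is not the paper's DP-FedEmb implementation; the function and parameter names (`dp_aggregate`, `clip_norm`, `noise_multiplier`) are hypothetical, and the sketch only shows the generic clip-then-noise aggregation step that such algorithms build on:

```python
import numpy as np

def dp_aggregate(user_updates, clip_norm, noise_multiplier, rng=None):
    """Bound each user's contribution (sensitivity control), then add
    Gaussian noise calibrated to that bound before averaging."""
    rng = np.random.default_rng() if rng is None else rng
    clipped = []
    for update in user_updates:
        norm = np.linalg.norm(update)
        # Scale the update down so its L2 norm is at most clip_norm.
        clipped.append(update * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise standard deviation is proportional to the per-user sensitivity.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(user_updates)
```

With `noise_multiplier = 0` this reduces to plain clipped averaging; the formal (ε, δ) guarantee comes from calibrating the multiplier with a privacy accountant, which this sketch omits.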
Citations
Journal ArticleDOI
How to DP-fy ML: A Practical Guide to Machine Learning with Differential Privacy
Natalia Ponomareva,Hussein Hazimeh,Alexey Kurakin,Zheng Xu,Carson E. Denison,H. Brendan McMahan,Sergei Vassilvitskii,Steve Chien,Abhradeep Guha Thakurta +8 more
TL;DR: Differential privacy has become a gold standard for making formal statements about data anonymization, as mentioned in this paper, and while some adoption of DP has happened in industry, attempts to apply DP to real-world complex ML models are still few and far between.
Journal ArticleDOI
Differentially Private Diffusion Models Generate Useful Synthetic Images
Sahra Ghalebikesabi,Leonard Berrada,Sven Gowal,Ira Ktena,Robert Stanforth,Jamie Hayes,Soham De,Samuel L. Smith,Olivia Wiles,Borja Balle +9 more
TL;DR: In this article, the authors used differential privacy to fine-tune ImageNet-pretrained diffusion models with more than 80M parameters, obtaining SOTA results on CIFAR-10 and Camelyon17 in terms of both FID and the accuracy of downstream classifiers trained on synthetic data.
Proceedings ArticleDOI
Federated Learning of Gboard Language Models with Differential Privacy
TL;DR: In this article, the authors train and deploy language models (LMs) with federated learning (FL) and differential privacy (DP) in Google Keyboard (Gboard), using the recent DP-Follow the Regularized Leader (DP-FTRL) algorithm.
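The core mechanism behind DP-FTRL is releasing noisy prefix sums of gradients through a binary tree, so each released sum carries noise from only O(log T) tree nodes instead of one fresh draw per step. Below is a minimal illustrative sketch of that tree-aggregation idea, not the Gboard implementation; all names are hypothetical:

```python
import numpy as np

def tree_noisy_prefix_sums(values, sigma, rng=None):
    """Noisy prefix sums via binary-tree aggregation: the prefix [1..t]
    is covered by the dyadic intervals given by the set bits of t, and
    each interval (tree node) gets one reusable noise sample."""
    rng = np.random.default_rng() if rng is None else rng
    node_noise = {}  # (level, index) -> cached Gaussian noise sample
    out, prefix = [], 0.0
    for t in range(1, len(values) + 1):
        prefix += values[t - 1]
        noise, rem, level = 0.0, t, 0
        while rem > 0:
            if rem & 1:  # this dyadic interval covers part of [1..t]
                key = (level, rem)
                if key not in node_noise:
                    node_noise[key] = rng.normal(0.0, sigma)
                noise += node_noise[key]
            rem >>= 1
            level += 1
        out.append(prefix + noise)
    return out
```

Each prefix sum accumulates at most popcount(t) ≤ log2(T) + 1 noise samples, which is why DP-FTRL can avoid the sampling assumptions of DP-SGD while keeping per-step noise low.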
Journal ArticleDOI
An Empirical Evaluation of Federated Contextual Bandit Algorithms
TL;DR: In this article, the authors propose federated contextual bandits for learning from sensitive data local to user devices, where the learning can be done using implicit signals generated as users interact with the applications of interest, rather than requiring access to explicit labels.
Journal ArticleDOI
Can Public Large Language Models Help Private Cross-device Federated Learning?
TL;DR: The authors proposed a distribution matching algorithm with theoretical grounding to sample public data close to the private data distribution, which significantly improves the sample efficiency of (pre-)training on public data; distillation techniques further improve the privacy-utility trade-off.
References
Posted Content
Deep Residual Learning for Image Recognition
TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
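The residual idea, learning a correction F(x) and adding it back to the input, can be illustrated with a toy fully connected block. This is a sketch of the principle only, not the paper's convolutional architecture, and the names are illustrative:

```python
import numpy as np

def residual_block(x, w1, w2):
    """y = relu(x + F(x)): the block learns the residual F rather than
    the full mapping, so the identity is trivially representable (F = 0)."""
    relu = lambda z: np.maximum(z, 0.0)
    f = relu(x @ w1) @ w2          # the residual function F(x)
    return relu(x + f)             # skip connection adds x back
```

Because zero weights make the block an identity (up to the ReLU), stacking many such blocks does not degrade the representable function, which is the intuition behind training much deeper networks.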
Proceedings Article
Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift
Sergey Ioffe,Christian Szegedy +1 more
TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
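The training-time transform itself is simple to sketch: normalize each feature over the batch, then restore expressiveness with learned scale and shift parameters. This is a minimal NumPy sketch, not the paper's full algorithm (which also maintains running statistics for inference):

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Normalize each feature to zero mean / unit variance over the
    batch dimension, then apply learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta
```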
Gradient-based learning applied to document recognition
Yann LeCun,Léon Bottou,Yoshua Bengio,Patrick Haffner
TL;DR: This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task, and Convolutional neural networks are shown to outperform all other techniques.
Posted Content
MobileNetV2: Inverted Residuals and Linear Bottlenecks
TL;DR: A new mobile architecture, MobileNetV2, is described that improves the state-of-the-art performance of mobile models on multiple tasks and benchmarks, as well as across a spectrum of model sizes, and allows decoupling of the input/output domains from the expressiveness of the transformation.
Posted Content
A Simple Framework for Contrastive Learning of Visual Representations
TL;DR: It is shown that the composition of data augmentations plays a critical role in defining effective predictive tasks, that introducing a learnable nonlinear transformation between the representation and the contrastive loss substantially improves the quality of the learned representations, and that contrastive learning benefits from larger batch sizes and more training steps compared to supervised learning.
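The contrastive objective pairs two augmented views of each image and treats all other embeddings in the batch as negatives. Below is an illustrative NumPy sketch of the normalized-temperature cross-entropy (NT-Xent) loss this framework uses, not the authors' code; for real batches the log-sum-exp should subtract the row maximum for numerical stability:

```python
import numpy as np

def nt_xent_loss(z1, z2, temperature=0.5):
    """NT-Xent: for each of the 2N augmented views, the partner view is
    the positive and the other 2N-2 embeddings are negatives."""
    z = np.concatenate([z1, z2], axis=0)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarity
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)                     # exclude self-pairs
    n = len(z1)
    # Positive for row i is its augmented partner at i + n (mod 2n).
    pos = np.concatenate([np.arange(n, 2 * n), np.arange(0, n)])
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -log_prob[np.arange(2 * n), pos].mean()
```

Larger batches help precisely because they supply more negatives in the denominator of this softmax, which matches the paper's observation about batch size.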