Open Access · Posted Content

Personalized Federated Learning with Gaussian Processes

TL;DR
In this paper, a solution to PFL based on Gaussian processes (GPs) with deep kernel learning is presented, in which a shared kernel function, parameterized by a neural network, is learned across all clients, with a personal GP classifier for each client.
Abstract
Federated learning aims to learn a global model that performs well on client devices with limited cross-client communication. Personalized federated learning (PFL) further extends this setup to handle data heterogeneity between clients by learning personalized models. A key challenge in this setting is to learn effectively across clients even though each client has unique data that is often limited in size. Here we present pFedGP, a solution to PFL that is based on Gaussian processes (GPs) with deep kernel learning. GPs are highly expressive models that work well in the low-data regime due to their Bayesian nature. However, applying GPs to PFL raises multiple challenges. Mainly, GPs' performance depends heavily on access to a good kernel function, and learning a kernel requires a large training set. Therefore, we propose learning a shared kernel function across all clients, parameterized by a neural network, with a personal GP classifier for each client. We further extend pFedGP to include inducing points using two novel methods: the first improves generalization in the low-data regime, and the second reduces the computational cost. We derive a PAC-Bayes generalization bound on novel clients and empirically show that it gives non-vacuous guarantees. Extensive experiments on standard PFL benchmarks with CIFAR-10, CIFAR-100, and CINIC-10, and on a new setup of learning under input noise, show that pFedGP achieves well-calibrated predictions while significantly outperforming baseline methods, with accuracy gains of up to 21%.
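The core idea of the abstract — one kernel function, built on a shared neural feature map, reused by a personal GP on each client's local data — can be sketched in a few lines. This is a minimal illustrative sketch, not the authors' implementation: the feature map here is a fixed random layer standing in for the jointly trained network, and each "personal GP" is a plain GP regressor on ±1 labels rather than the paper's GP classifier.

```python
import numpy as np

rng = np.random.default_rng(0)

# Shared "deep kernel": a tiny fixed feature map standing in for the
# neural network that, in pFedGP, all clients would train jointly.
W = rng.normal(size=(2, 8))

def embed(X):
    """Shared neural feature map phi(x) (a single tanh layer here)."""
    return np.tanh(X @ W)

def rbf_kernel(A, B, lengthscale=1.0):
    """RBF kernel computed in the shared embedding space."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

class ClientGP:
    """Personal GP fit on one client's local data (regression form)."""
    def __init__(self, X, y, noise=1e-2):
        self.Z = embed(X)
        K = rbf_kernel(self.Z, self.Z) + noise * np.eye(len(X))
        self.alpha = np.linalg.solve(K, y)  # cache (K + noise*I)^{-1} y

    def predict(self, X_new):
        # Posterior mean: k(x*, X) @ alpha
        return rbf_kernel(embed(X_new), self.Z) @ self.alpha

# Two clients with heterogeneous local tasks share the same kernel,
# but each fits its own GP — the personalization step.
X1 = rng.normal(size=(20, 2)); y1 = np.sign(X1[:, 0])
X2 = rng.normal(size=(20, 2)); y2 = np.sign(X2[:, 1])
gp1, gp2 = ClientGP(X1, y1), ClientGP(X2, y2)
pred1, pred2 = gp1.predict(X1), gp2.predict(X2)
```

In the actual method, the shared network parameters `W` would be updated via federated averaging of kernel-learning gradients, while `alpha` (the per-client GP posterior) never leaves the client.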


Citations
Posted Content

Inference-Time Personalized Federated Learning.

TL;DR: In this paper, the authors propose a new learning setup, Inference-Time PFL (IT-PFL), in which a model trained on a set of clients must later be evaluated on novel, unlabeled clients at inference time.
References
Proceedings Article

Personalized Federated Learning using Hypernetworks

TL;DR: In this article, a central hypernetwork model is trained to generate one model per client, maintaining the capacity to produce unique and diverse personal models; since the hypernetwork parameters are never transmitted, this approach decouples the communication cost from the trainable model size.
Proceedings Article

Towards Ubiquitous Learning: A First Measurement of On-Device Training Performance

TL;DR: In this article, the authors performed comprehensive measurements of on-device training with the state-of-the-art training library, 6 mobile phones, and 5 classical neural networks, and reported metrics of training time, energy consumption, memory footprint, hardware utilization, and thermal dynamics.
Proceedings Article

Bayesian Few-Shot Classification with One-vs-Each Pólya-Gamma Augmented Gaussian Processes

TL;DR: This article proposes a Gaussian process classifier based on a novel combination of Pólya-gamma augmentation and the one-vs-each softmax approximation, which allows efficient marginalization over functions rather than model parameters.
Proceedings Article

Automated Augmented Conjugate Inference for Non-conjugate Gaussian Process Models

TL;DR: In this article, the authors propose automated augmented conjugate inference, a new inference method for non-conjugate Gaussian process (GP) models, which automatically constructs an auxiliary-variable augmentation that renders the GP model conditionally conjugate.
Proceedings Article

Fast Adaptation with Linearized Neural Networks

TL;DR: In this paper, the authors propose to embed the inductive biases of linearized neural networks into Gaussian processes via a kernel built from the network's Jacobian, which can be used for transfer learning.