Open Access · Posted Content

A Theorem of the Alternative for Personalized Federated Learning.

TLDR
This paper shows how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view, and reveals a surprising theorem of the alternative for personalized federated learning.
Abstract
A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is, therefore, necessary to achieve optimal results from each individual's perspective. In this paper, we show how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view. Our analysis reveals a surprising theorem of the alternative for personalized federated learning: there exists a threshold such that (a) if a certain measure of data heterogeneity is below this threshold, the FedAvg algorithm [McMahan et al., 2017] is minimax optimal; (b) when the measure of heterogeneity is above this threshold, then doing pure local training (i.e., clients solve empirical risk minimization problems on their local datasets without any communication) is minimax optimal. As an implication, our results show that the presumably difficult (infinite-dimensional) problem of adapting to client-wise heterogeneity can be reduced to a simple binary decision problem of choosing between the two baseline algorithms. Our analysis relies on a new notion of algorithmic stability that takes into account the nature of federated learning.
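
The reduction to a binary decision admits a compact illustration. The sketch below is purely schematic: the heterogeneity proxy (the spread of client-wise empirical risk minimizers) and the numeric threshold are illustrative placeholders, not the quantities the paper characterizes for smooth, strongly convex losses.

```python
import numpy as np

def choose_algorithm(heterogeneity: float, threshold: float) -> str:
    """Binary decision suggested by the theorem of the alternative:
    low heterogeneity -> FedAvg, high heterogeneity -> pure local training.
    Both arguments are placeholders for the paper's precise quantities."""
    return "FedAvg" if heterogeneity <= threshold else "pure local training"

# Hypothetical proxy: spread of the clients' local empirical risk minimizers.
local_minimizers = np.array([[0.9, 1.1], [1.0, 0.9], [1.1, 1.0]])  # one row per client
spread = np.max(np.linalg.norm(local_minimizers - local_minimizers.mean(axis=0), axis=1))
print(choose_algorithm(spread, threshold=0.5))   # -> "FedAvg" for this toy data
```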

Citations
Proceedings Article

FedAvg with Fine Tuning: Local Updates Lead to Representation Learning

TL;DR: In the multi-task linear representation setting, the generalizability of FedAvg's output stems from its ability to learn the common data representation across clients' tasks by leveraging the diversity among client data distributions via local updates.
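
A minimal numerical sketch of this mechanism, under assumptions not taken from the paper: synthetic clients share a ground-truth representation `B_star` but have different heads `W_star[i]`; FedAvg averages locally updated copies of the factored model, and each client then fine-tunes only its own head. Dimensions, step sizes, and round counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
d, k, n, m = 20, 3, 50, 10            # ambient dim, representation dim, samples/client, clients
B_star = rng.normal(size=(d, k))      # shared ground-truth representation
W_star = rng.normal(size=(m, k))      # client-specific heads
X = rng.normal(size=(m, n, d))
y = np.einsum('ind,dk,ik->in', X, B_star, W_star) + 0.1 * rng.normal(size=(m, n))

B, w = 0.1 * rng.normal(size=(d, k)), 0.1 * rng.normal(size=k)   # global model (B, w)
lr, local_steps = 0.01, 5

for _ in range(300):                  # FedAvg rounds: local GD on (B, w), then average
    updates = []
    for i in range(m):
        Bi, wi = B.copy(), w.copy()
        for _ in range(local_steps):
            r = X[i] @ Bi @ wi - y[i]                  # residuals on client i
            grad_B = X[i].T @ np.outer(r, wi) / n
            grad_w = Bi.T @ (X[i].T @ r) / n
            Bi, wi = Bi - lr * grad_B, wi - lr * grad_w
        updates.append((Bi, wi))
    B = np.mean([u[0] for u in updates], axis=0)       # server-side averaging
    w = np.mean([u[1] for u in updates], axis=0)

# Fine-tuning: each client refits only its own head on the learned representation.
W_personal = np.stack([np.linalg.lstsq(X[i] @ B, y[i], rcond=None)[0] for i in range(m)])
mse = np.mean([np.mean((X[i] @ B @ W_personal[i] - y[i]) ** 2) for i in range(m)])

Q = np.linalg.qr(B)[0]                # orthonormal basis of the learned column space
print("representation alignment (1.0 = perfect):",
      np.linalg.norm(Q.T @ B_star) / np.linalg.norm(B_star))
print("average personalized training MSE:", mse)
```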
Journal Article

Adaptive and Robust Multi-task Learning

Yaqiong Duan, +1 more · 10 Feb 2022
TL;DR: In this article, a family of adaptive methods is proposed that automatically utilizes possible similarities among the tasks while carefully handling their differences; robustness against outlier tasks is also addressed.
Proceedings Article

Privacy-Preserving Federated Multi-Task Linear Regression: A One-Shot Linear Mixing Approach Inspired By Graph Regularization

TL;DR: This work focuses on the federated multi-task linear regression setting, where each machine possesses its own data for individual tasks and sharing the full local data between machines is prohibited, and proposes a novel fusion framework that only requires a one-shot communication of local estimates.
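
A rough sketch of the flavor of such a one-shot fusion, under assumptions not taken from the paper: each client sends only its local least-squares estimate, and the server smooths the stacked estimates with a graph-Laplacian penalty over an assumed client-similarity graph. The exact mixing operator and privacy mechanism proposed in the paper may differ.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, d, lam = 10, 15, 5, 1.0         # clients, samples/client, features, mixing strength

# Synthetic related tasks: coefficients are small perturbations of a common vector.
beta_common = rng.normal(size=d)
Beta_true = beta_common + 0.05 * rng.normal(size=(m, d))
X = rng.normal(size=(m, n, d))
y = np.einsum('ind,id->in', X, Beta_true) + rng.normal(size=(m, n))

# Step 1 (local; raw data never leaves a client): ordinary least squares per task.
Beta_local = np.stack([np.linalg.lstsq(X[i], y[i], rcond=None)[0] for i in range(m)])

# Step 2 (one-shot mixing at the server): solve
#   min_B ||B - Beta_local||_F^2 + lam * tr(B^T L B),
# where L is the Laplacian of the client-similarity graph (fully connected here),
# which has the closed form B = (I + lam * L)^{-1} Beta_local.
A = np.ones((m, m)) - np.eye(m)       # adjacency of the similarity graph
L = np.diag(A.sum(axis=1)) - A        # graph Laplacian
Beta_fused = np.linalg.solve(np.eye(m) + lam * L, Beta_local)

rel_err = lambda B: np.linalg.norm(B - Beta_true) / np.linalg.norm(Beta_true)
print(f"local-only error: {rel_err(Beta_local):.3f}   fused error: {rel_err(Beta_fused):.3f}")
```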
Posted Content

Personalized Federated Learning with Gaussian Processes

TL;DR: In this paper, a solution to PFL based on Gaussian processes (GPs) with deep kernel learning is presented: a shared kernel function across all clients, parameterized by a neural network, is combined with a personal GP classifier for each client.
Journal Article

Personalized Federated Learning with Multiple Known Clusters

TL;DR: This work develops an algorithm that allows each cluster to communicate independently, derives convergence results, and studies a hierarchical linear model to theoretically demonstrate that this approach outperforms both agents learning independently and agents learning a single shared weight.
References
Proceedings Article

Stochastic Convex Optimization.

TL;DR: Stochastic convex optimization is studied, and it is shown that the key ingredient is strong convexity and regularization, which is a sufficient but not necessary condition for meaningful non-trivial learnability.
Posted Content

Federated Learning with Personalization Layers.

TL;DR: FedPer, a base + personalization-layer approach for federated training of deep feedforward neural networks that can combat the ill effects of statistical heterogeneity, is proposed.
Journal Article

Adaptive Personalized Federated Learning

TL;DR: Information-theoretically, it is proved that the mixture of local and global models can reduce the generalization error, and a communication-reduced bilevel optimization method is proposed that reduces the number of communication rounds to $O(\sqrt{T})$ and achieves a convergence rate of $O(1/T)$ with some residual error.
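
A toy illustration of the local/global mixture idea only (not of the paper's bilevel, communication-reduced procedure): each client interpolates between its purely local fit and a pooled "global" fit, choosing the mixing weight on held-out local data. All names and constants below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, d = 5, 30, 8
theta_common = rng.normal(size=d)
Theta_true = theta_common + 0.3 * rng.normal(size=(m, d))    # heterogeneous clients
X = rng.normal(size=(m, n, d))
y = np.einsum('ind,id->in', X, Theta_true) + rng.normal(size=(m, n))

theta_local = np.stack([np.linalg.lstsq(X[i], y[i], rcond=None)[0] for i in range(m)])
theta_global = np.linalg.lstsq(X.reshape(-1, d), y.reshape(-1), rcond=None)[0]  # stand-in for a FedAvg model

X_val = rng.normal(size=(m, 500, d))                         # held-out data per client
y_val = np.einsum('ind,id->in', X_val, Theta_true)

for i in range(m):                    # personalized model: alpha * local + (1 - alpha) * global
    mse, alpha = min(
        (np.mean((X_val[i] @ (a * theta_local[i] + (1 - a) * theta_global) - y_val[i]) ** 2), a)
        for a in np.linspace(0.0, 1.0, 21)
    )
    print(f"client {i}: alpha = {alpha:.2f}, validation MSE = {mse:.3f}")
```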
Journal Article

Information-Theoretic Lower Bounds on the Oracle Complexity of Stochastic Convex Optimization

TL;DR: In this paper, the complexity of stochastic convex optimization in an oracle model of computation is studied, and a new notion of discrepancy between functions is introduced that reduces problems of convex optimization to statistical parameter estimation, which in turn can be lower bounded using information-theoretic methods.
Journal Article

Practical Aspects of the Moreau--Yosida Regularization: Theoretical Preliminaries

TL;DR: The most important part of this study concerns second-order differentiability: the existence of a second-order development of f implies that its Moreau--Yosida regularization has a Hessian.
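
For a concrete instance of the Moreau--Yosida regularization: the envelope of the nondifferentiable absolute value with parameter mu is the Huber function, which is continuously differentiable. The brute-force minimization below simply checks this closed form numerically; the function names and grid are ad hoc, not from the paper.

```python
import numpy as np

def moreau_envelope(f, x, mu, grid):
    """Brute-force Moreau envelope: e_mu f(x) = min_y f(y) + (x - y)^2 / (2 * mu)."""
    return np.min(f(grid) + (x - grid) ** 2 / (2 * mu))

mu = 0.5
grid = np.linspace(-5.0, 5.0, 200001)          # fine 1-D grid for the inner minimization
for x in np.linspace(-2.0, 2.0, 9):
    env = moreau_envelope(np.abs, x, mu, grid)
    huber = x**2 / (2 * mu) if abs(x) <= mu else abs(x) - mu / 2   # closed form for f = |.|
    print(f"x = {x:+.2f}   envelope = {env:.4f}   Huber = {huber:.4f}")
```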