Open Access · Posted Content
A Theorem of the Alternative for Personalized Federated Learning.
TLDR
This paper shows how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view, and reveals a surprising theorem of the alternative for personalized federated learning.
Abstract:
A widely recognized difficulty in federated learning arises from the statistical heterogeneity among clients: local datasets often come from different but not entirely unrelated distributions, and personalization is, therefore, necessary to achieve optimal results from each individual's perspective. In this paper, we show how the excess risks of personalized federated learning with a smooth, strongly convex loss depend on data heterogeneity from a minimax point of view. Our analysis reveals a surprising theorem of the alternative for personalized federated learning: there exists a threshold such that (a) if a certain measure of data heterogeneity is below this threshold, the FedAvg algorithm [McMahan et al., 2017] is minimax optimal; (b) when the measure of heterogeneity is above this threshold, then doing pure local training (i.e., clients solve empirical risk minimization problems on their local datasets without any communication) is minimax optimal. As an implication, our results show that the presumably difficult (infinite-dimensional) problem of adapting to client-wise heterogeneity can be reduced to a simple binary decision problem of choosing between the two baseline algorithms. Our analysis relies on a new notion of algorithmic stability that takes into account the nature of federated learning.
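The dichotomy in the abstract can be seen in a minimal numerical sketch. This is not the paper's construction; it is an illustration under assumed settings (1-D mean estimation with squared loss, a smooth strongly convex case, with `simulate`, `n_clients`, and the uniform spread `R` as hypothetical choices) of why a shared FedAvg-style model wins at low heterogeneity and pure local training wins at high heterogeneity:

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sketch (not the paper's procedure): each client estimates its own
# mean theta_i under squared loss. The spread R plays the role of the data
# heterogeneity measure; we compare a FedAvg-style shared estimate (the grand
# average) against pure local empirical risk minimization (per-client averages).
def simulate(R, n_clients=20, n_samples=10, noise=1.0):
    thetas = rng.uniform(-R, R, n_clients)          # client-specific targets
    data = thetas[:, None] + noise * rng.standard_normal((n_clients, n_samples))
    local = data.mean(axis=1)                       # pure local training
    global_avg = data.mean()                        # shared model for everyone
    err_local = float(np.mean((local - thetas) ** 2))
    err_global = float(np.mean((global_avg - thetas) ** 2))
    return err_local, err_global

# Below some heterogeneity threshold the shared model has lower risk;
# above it, local training does.
for R in (0.05, 5.0):
    e_loc, e_glob = simulate(R)
    print(f"R={R}: local={e_loc:.3f}, global={e_glob:.3f}")
```

The shared estimate averages away noise but inherits a bias of order `R`, while local estimates are unbiased but noisy; the crossover between the two regimes is the binary decision the abstract describes.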
Citations
Proceedings ArticleDOI
FedAvg with Fine Tuning: Local Updates Lead to Representation Learning
TL;DR: In the multi-task linear representation setting, the generalizability of FedAvg's output stems from its power to learn the common data representation among the clients' tasks by leveraging the diversity among client data distributions via local updates.
Journal Article
Adaptive and Robust Multi-task Learning
Yaqiong Duan, Kaizheng Wang +1 more
TL;DR: In this article, a family of adaptive methods is proposed that automatically utilizes possible similarities among tasks while carefully handling their differences, and their robustness against outlier tasks is examined.
Proceedings ArticleDOI
Privacy-Preserving Federated Multi-Task Linear Regression: A One-Shot Linear Mixing Approach Inspired By Graph Regularization
TL;DR: This work focuses on the federated multi-task linear regression setting, where each machine possesses its own data for an individual task and sharing full local data between machines is prohibited, and proposes a novel fusion framework that requires only a one-shot communication of local estimates.
Posted Content
Personalized Federated Learning with Gaussian Processes
TL;DR: In this paper, a solution to PFL based on Gaussian processes (GPs) with deep kernel learning is presented, in which a shared kernel function across all clients, parameterized by a neural network, is combined with a personal GP classifier for each client.
Journal ArticleDOI
Personalized Federated Learning with Multiple Known Clusters
TL;DR: This work develops an algorithm that allows each cluster to communicate independently, derives convergence results, and studies a hierarchical linear model to theoretically demonstrate that this approach outperforms both agents learning independently and agents learning a single shared weight.
References
Proceedings Article
Minibatch vs Local SGD for Heterogeneous Distributed Learning
TL;DR: In this paper, the authors analyzed Local SGD and Minibatch SGD in the heterogeneous distributed setting, where each machine has access to stochastic gradient estimates for a different, machine-specific, convex objective, and machines can only communicate intermittently.
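The trade-off this reference analyzes can be sketched with a toy comparison. This is an illustration under assumed settings, not the paper's analysis: two machines with quadratic objectives and exact (noise-free) gradients, which isolates the effect of heterogeneity; `local_sgd` and `minibatch_gd` are hypothetical names for the two communication patterns:

```python
import numpy as np

# Toy setting: machine m minimizes f_m(x) = 0.5 * a[m] * (x - b[m])**2.
# The machines have heterogeneous curvatures and optima.
a = np.array([1.0, 2.0])           # machine-specific curvatures
b = np.array([0.0, 1.0])           # machine-specific optima
x_star = (a * b).sum() / a.sum()   # minimizer of the average objective

def local_sgd(rounds=50, K=10, lr=0.1):
    """Each machine takes K local gradient steps, then iterates are averaged."""
    x = 0.0
    for _ in range(rounds):
        finals = []
        for am, bm in zip(a, b):
            xm = x
            for _ in range(K):
                xm -= lr * am * (xm - bm)
            finals.append(xm)
        x = float(np.mean(finals))  # intermittent communication
    return x

def minibatch_gd(rounds=50, K=10, lr=0.1):
    """Average gradients across machines at every step (same total budget)."""
    x = 0.0
    for _ in range(rounds * K):
        x -= lr * float(np.mean(a * (x - b)))
    return x
```

With heterogeneous objectives, the local method's fixed point is biased away from `x_star` (client drift), while the minibatch method converges to `x_star`; this is the kind of gap the heterogeneous-setting analysis quantifies.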
Proceedings Article
Backpropagation Convergence Via Deterministic Nonmonotone Perturbed Minimization
TL;DR: The fundamental backpropagation algorithm for training artificial neural networks is cast as a deterministic nonmonotone perturbed gradient method, and the results presented cover serial and parallel online BP, modified BP with a momentum term, and BP with weight decay.
Posted Content
A Unified Analysis of Stochastic Gradient Methods for Nonconvex Federated Optimization.
Zhize Li, Peter Richtárik +1 more
TL;DR: This paper provides a single convergence analysis for all methods that satisfy the proposed unified assumption of the second moment of the stochastic gradient, thereby offering a unified understanding of SGD variants in the nonconvex regime instead of relying on dedicated analyses of each variant.
Posted Content
A No-Free-Lunch Theorem for MultiTask Learning.
Steve Hanneke, Samory Kpotufe +1 more
TL;DR: This work considers a seemingly favorable classification scenario in which all tasks share a common optimal classifier and which can be shown to admit a broad range of regimes with improved oracle rates, yet shows that no adaptive algorithm attaining these rates exists.
Posted Content
Distributed Stochastic Multi-Task Learning with Graph Regularization.
TL;DR: It is shown how simply skewing the averaging weights or controlling the stepsize allows learning different, but related, tasks on the different machines.
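The idea of skewing averaging weights can be made concrete with a small sketch. This is a hedged illustration, not that paper's algorithm; the tasks, noise level, and the `skewed_average` weighting scheme are all assumed for the example:

```python
import numpy as np

rng = np.random.default_rng(1)

# Four machines hold noisy estimates of related-but-different task parameters.
# Skewing the averaging weight toward each machine's own estimate trades
# variance reduction (borrowing strength from others) against bias (the tasks
# genuinely differ).
thetas = np.array([0.0, 0.5, 1.0, 1.5])              # related but distinct tasks
est = thetas + 0.5 * rng.standard_normal((2000, 4))  # repeated noisy local estimates

def skewed_average(estimates, self_weight):
    """Machine m keeps self_weight on itself and spreads the rest uniformly."""
    M = estimates.shape[-1]
    W = np.full((M, M), (1.0 - self_weight) / (M - 1))
    np.fill_diagonal(W, self_weight)
    return estimates @ W.T

def mse(self_weight):
    return float(np.mean((skewed_average(est, self_weight) - thetas) ** 2))

# self_weight = 0.25 is the uniform average, 1.0 is purely local training;
# an intermediate skew can beat both extremes on related tasks.
for w in (0.25, 0.7, 1.0):
    print(f"self_weight={w}: MSE={mse(w):.3f}")
```

The intermediate weight outperforming both the uniform average and purely local estimation is the effect the TL;DR alludes to: skewed weights interpolate between a single shared model and fully separate per-machine models.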