Journal ArticleDOI

Federated Optimization Algorithms with Random Reshuffling and Gradient Compression

TLDR
This work develops Q-RR, a distributed variant of Random Reshuffling with gradient compression; shows how to reduce the variance coming from gradient quantization through the use of control iterates; and proposes Q-NASTYA, a variant of Q-RR that better fits Federated Learning applications.
Abstract
Gradient compression is a popular technique for improving the communication complexity of stochastic first-order methods in distributed training of machine learning models. However, existing works consider only with-replacement sampling of stochastic gradients. In contrast, it is well known in practice and was recently confirmed in theory that stochastic methods based on without-replacement sampling, e.g., the Random Reshuffling (RR) method, perform better than methods that sample the gradients with replacement. In this work, we close this gap in the literature and provide the first analysis of methods with gradient compression and without-replacement sampling. We first develop a distributed variant of random reshuffling with gradient compression (Q-RR), and show how to reduce the variance coming from gradient quantization through the use of control iterates. Next, to better fit Federated Learning applications, we incorporate local computation and propose a variant of Q-RR called Q-NASTYA. Q-NASTYA uses local gradient steps and different local and global stepsizes. We then show how to reduce the compression variance in this setting as well. Finally, we prove convergence results for the proposed methods and outline several settings in which they improve upon existing algorithms.
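To make the method description above concrete, here is a minimal single-process simulation sketch of one Q-RR-style epoch: every node reshuffles its local data, compresses the stochastic gradient of its current sample, and the server averages the compressed messages and takes a step. The compressor (rand_k), the data layout, and the stepsize are illustrative assumptions, not choices taken from the paper.

```python
import numpy as np

def rand_k(v, k, rng):
    """Random-k sparsification: keep k random coordinates and rescale so the
    compressor is unbiased (one common compressor; the paper's may differ)."""
    mask = np.zeros(v.size)
    idx = rng.choice(v.size, size=k, replace=False)
    mask[idx] = v.size / k
    return v * mask

def q_rr_epoch(x, local_grads, gamma, k, rng):
    """One epoch of a Q-RR-style update (illustrative sketch).

    x           : current model, shape (d,)
    local_grads : local_grads[m][i] is a callable returning the gradient of the
                  i-th local function on node m at a given point
    gamma       : stepsize (a placeholder; the paper analyzes specific choices)
    """
    n = len(local_grads[0])                              # local dataset size per node
    perms = [rng.permutation(n) for _ in local_grads]    # each node reshuffles without replacement
    for i in range(n):
        # each node compresses the gradient of its i-th permuted sample ...
        msgs = [rand_k(grads[perm[i]](x), k, rng)
                for grads, perm in zip(local_grads, perms)]
        # ... and the server averages the compressed gradients and takes a step
        x = x - gamma * np.mean(msgs, axis=0)
    return x
```

The variance-reduced variant mentioned in the abstract would, roughly speaking, compress differences between gradients and control iterates rather than the raw gradients; that extra bookkeeping, and the local steps and two stepsizes of Q-NASTYA, are omitted from this sketch.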

Citations
Journal ArticleDOI

Provably Doubly Accelerated Federated Learning: The First Theoretically Successful Combination of Local Training and Compressed Communication

TL;DR: This paper proposes the first algorithm for distributed optimization and federated learning that harnesses local training and compressed communication jointly and converges linearly to an exact solution, with a doubly accelerated rate.
Journal ArticleDOI

FedVQCS: Federated Learning via Vector Quantized Compressed Sensing

TL;DR: Simulation results on the MNIST and CIFAR-10 datasets demonstrate that the proposed framework provides more than a 2.5% increase in classification accuracy compared to state-of-the-art FL frameworks when the communication overhead of the local model update transmission is less than 0.1 bit per local model entry.
Journal ArticleDOI

Federated Learning with Regularized Client Participation

TL;DR: In this article, the authors propose a regularized client participation scheme in which each client joins the learning process once every $R$ communication rounds (a period referred to as a meta epoch), which leads to a reduction in the variance caused by client sampling.
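As a rough illustration of the participation pattern described above (not the paper's exact scheme), the sketch below builds a schedule in which each of the clients participates exactly once per meta epoch of R = num_clients / cohort_size rounds; the cohort size and reshuffling rule are assumptions.

```python
import numpy as np

def regularized_participation(num_clients, cohort_size, num_meta_epochs, seed=0):
    """Yield (round_index, cohort): every client appears exactly once per meta
    epoch of R = num_clients // cohort_size rounds (illustrative sketch)."""
    assert num_clients % cohort_size == 0
    rounds_per_meta_epoch = num_clients // cohort_size
    rng = np.random.default_rng(seed)
    t = 0
    for _ in range(num_meta_epochs):
        order = rng.permutation(num_clients)     # reshuffle the clients each meta epoch
        for r in range(rounds_per_meta_epoch):
            cohort = order[r * cohort_size:(r + 1) * cohort_size].tolist()
            yield t, cohort
            t += 1

# Example: 8 clients, cohorts of 2, so a meta epoch spans R = 4 rounds.
for t, cohort in regularized_participation(8, 2, num_meta_epochs=1):
    print(t, cohort)
```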

CD-GraB: Coordinating Distributed Example Orders for Provably Accelerated Training

TL;DR: Coordinated Distributed Gradient Balancing (CD-GraB) uses insights from prior work on kernel thinning to translate the benefits of provably faster permutation-based example ordering to distributed settings.
Journal ArticleDOI

Improving Accelerated Federated Learning with Compression and Importance Sampling

TL;DR: In this paper, the authors present a complete method for federated learning that incorporates all necessary ingredients, namely Local Training, Compression, and Partial Participation, and obtain state-of-the-art convergence guarantees in the considered setting.
References
Posted Content

Deep Residual Learning for Image Recognition

TL;DR: This work presents a residual learning framework to ease the training of networks that are substantially deeper than those used previously, and provides comprehensive empirical evidence showing that these residual networks are easier to optimize, and can gain accuracy from considerably increased depth.
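For reference, a minimal residual block in PyTorch showing the core idea of the summary above: the block learns a residual F(x) and adds it to an identity shortcut. This is a simplified sketch (identity shortcuts only), not the full architecture from the paper.

```python
import torch
import torch.nn as nn

class BasicResidualBlock(nn.Module):
    """Minimal residual block: output = ReLU(F(x) + x); the full ResNet also
    uses strided and projection shortcuts when shapes change."""
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, kernel_size=3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)

    def forward(self, x):
        out = torch.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return torch.relu(out + x)   # shortcut: the block learns the residual H(x) - x
```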
Journal ArticleDOI

LIBSVM: A library for support vector machines

TL;DR: Issues such as solving SVM optimization problems, theoretical convergence, multiclass classification, probability estimates, and parameter selection are discussed in detail.
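As a small usage illustration of the kind of problem LIBSVM addresses, the snippet below trains an RBF-kernel SVM through scikit-learn's SVC, which wraps LIBSVM internally; it is not an example of LIBSVM's own command-line tools, and the dataset and hyperparameters are arbitrary.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC   # scikit-learn's SVC is built on LIBSVM

X, y = make_classification(n_samples=200, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# RBF-kernel SVM; C and gamma are the parameters whose selection the paper discusses.
clf = SVC(kernel="rbf", C=1.0, gamma="scale", probability=True)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("class probabilities for one test point:", clf.predict_proba(X_test[:1]))
```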
Proceedings Article

Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift

TL;DR: Applied to a state-of-the-art image classification model, Batch Normalization achieves the same accuracy with 14 times fewer training steps, and beats the original model by a significant margin.
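A minimal NumPy sketch of the batch normalization transform at training time: activations are normalized by the mini-batch mean and variance and then scaled and shifted by learnable parameters. Running statistics for inference and the backward pass are omitted.

```python
import numpy as np

def batch_norm_forward(x, gamma, beta, eps=1e-5):
    """x: (batch, features). Returns the normalized, scaled, and shifted activations."""
    mu = x.mean(axis=0)                     # per-feature mini-batch mean
    var = x.var(axis=0)                     # per-feature mini-batch variance
    x_hat = (x - mu) / np.sqrt(var + eps)   # normalization step
    return gamma * x_hat + beta             # learnable scale and shift

x = np.random.randn(32, 4)
out = batch_norm_forward(x, gamma=np.ones(4), beta=np.zeros(4))
print(out.mean(axis=0).round(6), out.std(axis=0).round(3))   # ~0 mean, ~1 std per feature
```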
Posted Content

Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification

TL;DR: This work proposes a Parametric Rectified Linear Unit (PReLU) that generalizes the traditional rectified unit and derives a robust initialization method that particularly considers the rectifier nonlinearities.
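A small NumPy sketch of the two ideas in this summary: the PReLU activation with a learnable negative slope, and the rectifier-aware ("He") initialization with standard deviation sqrt(2 / fan_in). The shapes and the initial slope below are illustrative.

```python
import numpy as np

def prelu(x, a):
    """Parametric ReLU: identity for x > 0 and a learnable slope a for x <= 0."""
    return np.where(x > 0, x, a * x)

def he_init(fan_in, fan_out, rng=None):
    """He (Kaiming) initialization: zero-mean Gaussian with std sqrt(2 / fan_in),
    derived to preserve activation variance under ReLU-like rectifiers."""
    rng = rng or np.random.default_rng(0)
    return rng.normal(0.0, np.sqrt(2.0 / fan_in), size=(fan_in, fan_out))

W = he_init(256, 128)
h = prelu(np.random.randn(10, 256) @ W, a=0.25)   # 0.25 is a common initial slope
print(h.shape)
```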
Posted Content

Communication-Efficient Learning of Deep Networks from Decentralized Data

TL;DR: This work presents a practical method for the federated learning of deep networks based on iterative model averaging, and conducts an extensive empirical evaluation, considering five different model architectures and four datasets.
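A minimal sketch of the iterative model averaging at the core of this method (widely known as FedAvg): in each round, selected clients run a few local SGD steps from the current global model, and the server averages the returned models weighted by local dataset sizes. The client-selection rule, step counts, and learning rate here are placeholders.

```python
import numpy as np

def local_sgd(x, grad_fn, num_steps, lr):
    """Client-side update: a few SGD steps starting from the global model (illustrative)."""
    for _ in range(num_steps):
        x = x - lr * grad_fn(x)
    return x

def fedavg_round(x_global, clients, num_steps=10, lr=0.1):
    """One communication round of iterative model averaging.
    clients: list of (grad_fn, num_local_examples) pairs for the sampled clients."""
    updates, weights = [], []
    for grad_fn, n_k in clients:
        updates.append(local_sgd(x_global.copy(), grad_fn, num_steps, lr))
        weights.append(float(n_k))
    # data-size-weighted average of the client models
    return np.average(updates, axis=0, weights=weights)
```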
Trending Questions (1)
What are the top 3 federated learning SOTA algorithms?

The paper does not explicitly mention the top three federated learning SOTA algorithms.