Open Access · Journal Article (DOI)

One-Bit Over-the-Air Aggregation for Communication-Efficient Federated Edge Learning: Design and Convergence Analysis

TL;DR
A comprehensive analysis of the effects of wireless channel hostilities on the convergence rate of the proposed FEEL scheme is provided, showing that the hostilities slow down the convergence of the learning process by introducing a scaling factor and a bias term into the gradient norm.
Abstract
Federated edge learning (FEEL) is a popular framework for model training at an edge server using data distributed at edge devices (e.g., smartphones and sensors) without compromising their privacy. In the FEEL framework, edge devices periodically transmit high-dimensional stochastic gradients to the edge server, where these gradients are aggregated and used to update a global model. When the edge devices share the same communication medium, the multiple access channel (MAC) from the devices to the edge server creates a communication bottleneck. To overcome this bottleneck, an efficient broadband analog transmission scheme has recently been proposed, featuring the aggregation of analog-modulated gradients (or local models) via the waveform-superposition property of the wireless medium. However, the assumed linear analog modulation makes it difficult to deploy this technique in modern wireless systems that exclusively use digital modulation. To address this issue, we propose in this work a novel digital version of broadband over-the-air aggregation, called one-bit broadband digital aggregation (OBDA). The new scheme features one-bit gradient quantization followed by digital quadrature amplitude modulation (QAM) at the edge devices and majority-vote-based over-the-air decoding at the edge server. We provide a comprehensive analysis of the effects of wireless channel hostilities (channel noise, fading, and channel estimation errors) on the convergence rate of the proposed FEEL scheme. The analysis shows that these hostilities slow down the convergence of the learning process by introducing a scaling factor and a bias term into the gradient norm. However, we show that all the negative effects vanish as the number of participating devices grows, albeit at a different rate for each type of channel hostility.
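The core of the OBDA scheme can be conveyed with a short numerical sketch. The Python snippet below is a simplified illustration under our own assumptions (the QAM mapping, fading, and channel estimation errors are omitted; the channel is an ideal Gaussian MAC; all function names are hypothetical), showing one-bit quantization at the devices, superposition over the channel, and majority-vote decoding at the server:

```python
import numpy as np

def one_bit_quantize(grad):
    """One-bit gradient quantization: keep only the sign of each entry."""
    return np.sign(grad)

def obda_round(local_grads, noise_std=0.5, rng=None):
    """Simulate one OBDA aggregation round over an ideal Gaussian MAC.

    Each device transmits the signs of its local gradient; the channel
    superposes (sums) the transmitted symbols, and the server observes
    the sum plus receiver noise and decodes by majority vote, i.e.,
    takes the sign of the noisy aggregate.
    """
    rng = rng or np.random.default_rng(0)
    # Superposition over the MAC: entrywise sum of all one-bit symbols.
    superposed = np.sum([one_bit_quantize(g) for g in local_grads], axis=0)
    # Additive receiver noise.
    received = superposed + rng.normal(0.0, noise_std, size=superposed.shape)
    # Majority-vote decoding: the sign of the noisy aggregate.
    return np.sign(received)

# Toy example: 5 devices, 4-dimensional gradients.
rng = np.random.default_rng(1)
grads = [rng.normal(size=4) for _ in range(5)]
print(obda_round(grads, rng=rng))
```

In the actual scheme, pairs of quantized gradient entries are mapped onto 4-QAM symbols before transmission; the sketch above skips modulation and works directly with the signs.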


Citations
Journal Article (DOI)

Accelerating DNN Training in Wireless Federated Edge Learning Systems

TL;DR: In this paper, a federated edge learning framework is proposed to aggregate local learning updates at the network edge in lieu of users' raw data to accelerate the training process of deep neural networks.
Journal Article (DOI)

Communication-Efficient Edge AI: Algorithms and Systems

TL;DR: In this paper, a comprehensive survey of the recent developments in various techniques for overcoming the communication challenges in edge AI systems is presented, and the authors also introduce communication-efficient techniques, from both algorithmic and system perspectives, for training and inference tasks at the network edge.
Journal Article (DOI)

Convergence of Update Aware Device Scheduling for Federated Learning at the Wireless Edge

TL;DR: This work designs novel scheduling and resource allocation policies that decide which subset of devices transmits at each round, and how resources should be allocated among the participating devices, based not only on their channel conditions but also on the significance of their local model updates.
Journal Article (DOI)

Client Selection and Bandwidth Allocation in Wireless Federated Learning Networks: A Long-Term Perspective

TL;DR: In this paper, a stochastic optimization problem for joint client selection and bandwidth allocation under long-term client energy constraints is formulated, and a new algorithm that utilizes only currently available wireless channel information yet achieves a long-term performance guarantee is proposed.
References
Proceedings Article

Communication-Efficient Learning of Deep Networks from Decentralized Data

TL;DR: In this paper, the authors present a decentralized approach for federated learning of deep networks based on iterative model averaging, and conduct an extensive empirical evaluation considering five different model architectures and four datasets.
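As a rough illustration of this iterative model averaging (the FedAvg idea), the following sketch, with local training abstracted away and hypothetical names, computes the new global model as the data-size-weighted average of the clients' locally updated models:

```python
import numpy as np

def fedavg_round(client_models, client_sizes):
    """One round of federated averaging: the new global model is the
    data-size-weighted mean of the clients' locally updated models."""
    weights = np.asarray(client_sizes, dtype=float)
    weights /= weights.sum()
    return sum(w * m for w, m in zip(weights, client_models))

# Example: three clients with unequal amounts of local data.
models = [np.array([1.0, 2.0]), np.array([3.0, 0.0]), np.array([0.0, 1.0])]
print(fedavg_round(models, client_sizes=[100, 50, 50]))
```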
Journal Article (DOI)

A Survey on Mobile Edge Computing: The Communication Perspective

TL;DR: A comprehensive survey of state-of-the-art MEC research with a focus on joint radio-and-computational resource management is provided in this paper, and a set of issues, challenges, and future research directions for MEC is discussed.
Posted Content

Federated Learning: Strategies for Improving Communication Efficiency

TL;DR: Two ways to reduce the uplink communication costs are proposed: structured updates, where the user directly learns an update from a restricted space parametrized using a smaller number of variables, e.g., a low-rank factorization or a random mask; and sketched updates, where the user learns a full model update and then compresses it using a combination of quantization, random rotations, and subsampling.
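The sketched-update idea can be illustrated with a simplified composition of two of the listed ingredients, subsampling and sign quantization (random rotations omitted); the helper names below are our own:

```python
import numpy as np

def sketch_update(update, keep_frac=0.1, rng=None):
    """Compress a model update by random subsampling plus one-bit sign
    quantization with a shared scale; only the indices, signs, and one
    scalar need to be uploaded."""
    rng = rng or np.random.default_rng(0)
    d = update.size
    idx = rng.choice(d, size=max(1, int(keep_frac * d)), replace=False)
    signs = np.sign(update[idx])
    scale = np.abs(update[idx]).mean()
    return idx, signs, scale

def unsketch(idx, signs, scale, d, keep_frac=0.1):
    """Server-side reconstruction: place the quantized entries back and
    rescale to compensate for subsampling."""
    est = np.zeros(d)
    est[idx] = scale * signs / keep_frac
    return est
```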
Proceedings Article

QSGD: Communication-Efficient SGD via Gradient Quantization and Encoding

TL;DR: Quantized SGD (QSGD) is a family of compression schemes for gradient updates that provides convergence guarantees for convex and nonconvex objectives, even under asynchrony, and can be extended to stochastic variance-reduced techniques.
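A minimal sketch of the stochastic quantization behind QSGD (simplified; s controls the number of quantization levels, and the randomized rounding makes the quantizer an unbiased estimator of the input):

```python
import numpy as np

def qsgd_quantize(v, s=4, rng=None):
    """QSGD-style stochastic quantization: each entry is randomly rounded
    to one of s+1 levels of |v_i| / ||v||, preserving sign and norm, so
    that the quantized vector is unbiased in expectation."""
    rng = rng or np.random.default_rng(0)
    norm = np.linalg.norm(v)
    if norm == 0.0:
        return np.zeros_like(v)
    ratio = np.abs(v) / norm * s                  # values in [0, s]
    lower = np.floor(ratio)
    # Round up with probability (ratio - lower), otherwise round down.
    levels = lower + (rng.random(v.shape) < (ratio - lower))
    return norm * np.sign(v) * levels / s
```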
Journal Article (DOI)

Computation Over Multiple-Access Channels

TL;DR: It is shown that there is no source-channel separation theorem even when the individual sources are independent, and joint source-channel strategies are developed that are optimal when the structure of the channel probability transition matrix and the function are appropriately matched.