Federated Machine Learning: Survey, Multi-Level Classification, Desirable Criteria and Future Directions in Communication and Networking Systems

doi:10.1109/COMST.2021.3058573

This is an electronic reprint of the original article.

This reprint may differ from the original in pagination and typographic detail.

Wahab, Omar Abdel; Otrok, Hadi; Mourad, Azzam; Taleb, Tarik

Federated Machine Learning

Published in:

IEEE Communications Surveys and Tutorials

DOI:

10.1109/COMST.2021.3058573

Published: 01/02/2021

Document Version

Peer reviewed version

Please cite the original version:

Wahab, O. A., Otrok, H., Mourad, A., & Taleb, T. (2021). Federated Machine Learning: Survey, Multi-Level

Classification, Desirable Criteria and Future Directions in Communication and Networking Systems. IEEE

Communications Surveys and Tutorials, 23(2), 1342-1397. [9352033].

https://doi.org/10.1109/COMST.2021.3058573

Personal use of this material is permitted. Permission from IEEE must be obtained for all other

uses, in any current or future media, including reprinting/republishing this material for

advertising or promotional purposes, creating new collective works, for resale or redistribution to

servers or lists, or reuse of any copyrighted component of this work in other works.

IEEE COMMUNICATIONS SURVEYS & TUTORIALS 1

Federated Machine Learning: Survey, Multi-Level

Classiﬁcation, Desirable Criteria and Future Directions in

Communication and Networking Systems

Omar Abdel Wahab, Azzam Mourad, Hadi Otrok and Tarik Taleb

Abstract—The communication and networking ﬁeld is hungry for machine learning decision-making solutions to replace the traditional

model-driven approaches that have proved insufficiently rich to capture the ever-growing complexity and heterogeneity of the modern

systems in the ﬁeld. Traditional machine learning solutions assume the existence of (cloud-based) central entities that are in charge of

processing the data. Nonetheless, the difﬁculty of accessing private data, together with the high cost of transmitting raw data to the

central entity gave rise to a decentralized machine learning approach called Federated Learning. The main idea of federated learning is

to perform an on-device collaborative training of a single machine learning model without having to share the raw training data with any

third-party entity. Although a few survey articles on federated learning already exist in the literature, the motivation of this survey stems

from three essential observations. The ﬁrst one is the lack of a ﬁne-grained multi-level classiﬁcation of the federated learning literature,

where the existing surveys base their classiﬁcation on only one criterion or aspect. The second observation is that the existing surveys

focus only on some common challenges, but disregard other essential aspects such as reliable client selection, resource management

and training service pricing. The third observation is the lack of explicit and straightforward directives for researchers to help them

design future federated learning solutions that overcome the state-of-the-art research gaps. To address these points, we ﬁrst provide a

comprehensive tutorial on federated learning and its associated concepts, technologies and learning approaches. We then survey and

highlight the applications and future directions of federated learning in the domain of communication and networking. Thereafter, we

design a three-level classification scheme that first categorizes the federated learning literature based on the high-level challenge that

it tackles. Then, we classify each high-level challenge into a set of specific low-level challenges to foster a better understanding of the

topic. Finally, we provide, within each low-level challenge, a ﬁne-grained classiﬁcation based on the technique used to address this

particular challenge. For each category of high-level challenges, we provide a set of desirable criteria and future research directions

that are aimed to help the research community design innovative and efﬁcient future solutions. To the best of our knowledge, our

survey is the most comprehensive in terms of challenges and techniques it covers and the most ﬁne-grained in terms of the multi-level

classiﬁcation scheme it presents.

Index Terms—Federated Learning; Federated Learning Tutorial; Multi-Level Classiﬁcation; Statistical Challenges; Transfer Learning;

Machine Learning; Security; Communication and Networking Systems.

1 INTRODUCTION

The fast-growing adoption of Internet of Things (IoT) and social

networking applications is leading to an unprecedented growth in the

volumes of data that are generated on a daily basis. In particular,

the International Data Corporation (IDC) anticipates that, by 2025,

there will be 79 ZB of data created by billions of IoT devices,

pushing organizations to rethink their data governance, retention, and

usage strategies. Storing and analyzing such large volumes of data

has long been done on the cloud, owing to the large number of

advantages that the cloud computing technology provides, such as

cost efﬁciency and unlimited computing and storage capabilities [1],

[2], [3]. Nonetheless, due to the ever-rising data privacy concerns

O. Abdel Wahab is with the Department of Computer Science and Engineering, Université du Québec en Outaouais, Gatineau, Canada (e-mail: omar.abdulwahab@uqo.ca).

H. Otrok is with the Department of EECS, Center on Cyber-Physical Systems, Khalifa University of Science and Technology, Abu Dhabi, UAE (e-mail: Hadi.Otrok@ku.ac.ae).

A. Mourad is with the Department of Mathematics and Computer Science, Lebanese American University, Beirut, Lebanon (e-mail: azzam.mourad@lau.edu.lb).

T. Taleb is with the Department of Communications and Networking, Aalto University, Espoo 02150, Finland, with the Centre for Wireless Communications (CWC), University of Oulu, Oulu 90570, Finland, and with the Department of Computer and Information Security, Sejong University, Seoul 05006, South Korea (e-mail: tarik.taleb@aalto.fi).

and network limitations, a purely centralized cloud-based data storage

and analytics approach becomes unrealistic. In fact, data owners are often

concerned about sharing their data with a third party, whether it

is a well-known organization or one unknown to them. In this context,

strict legislation such as the US Consumer Privacy Bill of Rights¹

and the European Commission’s General Data Protection Regulation (GDPR)²

has been designed to protect users’ privacy. For instance,

Articles 5 and 6 of the GDPR restrict data collection and

storage to only what is user-consented and strictly indispensable for

processing. Moving to the network limitation problem and the emergence

of low-latency applications requiring fast analysis, the fact that the

cloud data centers are often deployed in locations that are far from

those of the data owners leads to high data processing delays due to

the long-distance communications. In the light of these two crucial

factors, the trend in data storage and analysis is shifting from being

cloud-based and centralized to being distributed and on-device [4],

[5]. The key enabling technology for such a shift is edge

computing [6], [7], wherein edge nodes such as smartphones, sensors,

micro servers, autonomous vehicles and home gateways are supplied

with computing and storage capabilities to enable them to host

and analyze data locally with minimal delay. Edge nodes then

periodically communicate with the cloud servers to send them the

processed data for historical and long-term storage.

1. https://www.congress.gov/bill/116th-congress/senate-bill/2968/text

2. https://gdpr.eu/data-privacy/

To make this idea feasible, the machine learning process had to be

adapted to this vision, enabling what is

known as machine learning at the edge. In this context, the new

paradigm of Federated Learning (FL) was proposed in 2016 by

McMahan et al. [8] to enable local and distributed machine learning

training at the level of edge nodes or end devices. The main idea

of federated learning is to enable a large number of edge devices

or servers storing local data observations, called clients, to locally

and collaboratively train one single machine learning model without

having to share their raw data. A coordinating server (often called a

parameter server³) then aggregates the contributions from all the

clients, derives an updated model and shares this model with the

participating clients to beneﬁt from their learning experience and to

enable them to pursue their local training in future iterations. Federated

learning substantially differs from the centralized (cloud-based)

machine learning paradigm and poses additional unique challenges in

the following aspects [9]:

• Privacy: In federated learning, the raw data never leaves the user’s device since the training is done locally on each device. Nonetheless, having more users involved in one collaborative model increases the risk of inference attacks that aim to infer sensitive information from the users’ training data.

• Communication: In federated learning, no raw data need to be communicated to any central server, which reduces the amount of information that needs to be transmitted over the network. However, since the machine learning model is trained collaboratively, many model updates need to be communicated between the clients and the server over many iterations, which poses additional communication costs.

• Latency: With federated learning, the decision-making models are trained locally on the edge/end devices instead of being sent to the cloud, leading to lower latency and waiting times.

• Statistical Heterogeneity: Given that the training data on each client device depends on its own usage patterns, the local dataset of one client in federated learning is not expected to be representative of the overall data distribution. Similarly, as clients use their services or applications to varying degrees, the local datasets across clients tend to have varying sizes.

• Massive Distribution: The number of clients that participate in the federated training is expected to be significantly larger than the average number of training samples per client.

• Connectivity: In federated learning, client devices are frequently offline or on slow or expensive connections. This means that the connectivity in federated learning is limited and that the process of selecting clients to participate in the federated training might be biased toward certain conditions (e.g., local time zone, device being charged or not, etc.).
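The training cycle described before this list (clients train locally, a parameter server aggregates their contributions and redistributes the updated model) can be illustrated with a minimal sketch. It assumes a size-weighted averaging rule in the spirit of federated averaging, a toy one-dimensional linear model and synthetic client data; the function names (`local_train`, `aggregate`, `federated_round`, `make_client`) and all numeric settings are hypothetical choices for illustration, not taken from the surveyed works.

```python
import random


def local_train(weights, data, lr=0.05, epochs=5):
    """Client side: a few gradient-descent steps on the local squared error
    of a 1-D linear model y = w * x + b. Only the weights are returned;
    the raw (x, y) pairs never leave this function."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum(2 * ((w * x + b) - y) * x for x, y in data) / n
        grad_b = sum(2 * ((w * x + b) - y) for x, y in data) / n
        w, b = w - lr * grad_w, b - lr * grad_b
    return (w, b)


def aggregate(updates):
    """Parameter server side: average the client models, weighting each
    one by its local dataset size (an assumed aggregation rule)."""
    total = sum(n for _, n in updates)
    w = sum(weights[0] * n for weights, n in updates) / total
    b = sum(weights[1] * n for weights, n in updates) / total
    return (w, b)


def federated_round(global_weights, clients):
    """One communication round: broadcast, local training, aggregation."""
    updates = [(local_train(global_weights, data), len(data)) for data in clients]
    return aggregate(updates)


def make_client(rng, n, lo, hi):
    """Synthetic client: n noisy samples of y = 3x + 1 with x drawn from [lo, hi]."""
    data = []
    for _ in range(n):
        x = rng.uniform(lo, hi)
        data.append((x, 3.0 * x + 1.0 + rng.gauss(0.0, 0.01)))
    return data


rng = random.Random(0)
# Clients differ in dataset size and input range, mimicking statistical heterogeneity.
clients = [make_client(rng, 20, 0.0, 1.0),
           make_client(rng, 5, 1.0, 2.0),
           make_client(rng, 50, -1.0, 0.0)]

model = (0.0, 0.0)
for _ in range(200):
    model = federated_round(model, clients)
print(model)  # should end up close to the generating parameters (3, 1)
```

In this sketch only the two model parameters cross the client/server boundary in each round, which is the trade-off noted under Communication: no raw data is transmitted, but many rounds of model updates are.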

From the technical perspective, federated learning can be implemented

using two main strategies: Horizontal Federated Learning (HFL) and

Vertical Federated Learning (VFL) [10]. In HFL, the participating

client devices share the same set of features but target different

populations. An example of HFL could be two banks operating in

the same country. Even though the clientele of the banks is non-overlapping,

their data are likely to have a similar feature space since

they adopt similar business models and operate in the same country. In

VFL, the client devices share the same population but target different

sets of features. An example of VFL is two companies offering two

different services (e.g., counselling and shipping) but having a large

intersection at the level of the clienteles. Such companies might be

interested in cooperating on the (distinct) feature spaces they own so

that each gains a better understanding of its own business situation.
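The difference between the two strategies can be made concrete with a small data-partitioning sketch. The customer table, feature names and party names below are hypothetical and serve only to show which dimension of the data each strategy splits: HFL partitions by rows (samples), VFL by columns (features).

```python
# Hypothetical toy table: one row per customer, one column per feature.
records = {
    "alice": {"age": 34, "balance": 1200, "parcels": 3},
    "bob": {"age": 51, "balance": 560, "parcels": 0},
    "carol": {"age": 27, "balance": 9800, "parcels": 7},
    "dave": {"age": 45, "balance": 310, "parcels": 2},
}


def horizontal_split(table, id_groups):
    """HFL-style partition: each party holds different rows, same columns."""
    return [{sid: dict(table[sid]) for sid in ids} for ids in id_groups]


def vertical_split(table, feature_groups):
    """VFL-style partition: each party holds the same rows, different columns."""
    return [
        {sid: {f: row[f] for f in feats} for sid, row in table.items()}
        for feats in feature_groups
    ]


def feature_space(part):
    """Set of feature names held by one party."""
    return {f for row in part.values() for f in row}


# Two banks: disjoint clienteles, identical feature space.
bank_a, bank_b = horizontal_split(records, [["alice", "bob"], ["carol", "dave"]])

# Two service providers: shared clientele, disjoint feature spaces.
service_x, service_y = vertical_split(records, [["age", "balance"], ["parcels"]])

print(sorted(bank_a), sorted(bank_b))
print(sorted(feature_space(service_x)), sorted(feature_space(service_y)))
```

In practice the sample overlap in VFL is usually partial rather than total; the full overlap here is a simplification.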

3. In the rest of the paper, we use the term parameter server to refer to the

central coordinating server.

1.1 Related Work

A few recent survey articles on federated learning have been published.

We discuss these surveys hereafter and highlight the unique

contributions of our work. In [11], the authors provide a detailed

survey on the challenges and research directions of federated learning.

In particular, they discuss the challenges related to communication

efﬁciency, data privacy, data heterogeneity and model aggregation.

In [12], the authors classify the federated learning approaches based

on six aspects, i.e., machine learning model, data distribution, communication

architecture, privacy mechanism, scale of federation and

motivation of federation. In [13], the authors discuss the unique

features and challenges of federated learning, offer a broad overview

of the literature and highlight several future research directions. In

particular, they consider four challenges of federated learning, i.e.,

communication efficiency, systems heterogeneity, statistical heterogeneity

and privacy. In [10], the authors discuss the definitions,

architectures and applications of the federated learning framework. They

classify the literature of federated learning based on the learning

architecture, resulting in three categories: vertical federated learning,

horizontal federated learning and federated transfer learning. Different

from these surveys, we consider in this work a wider set of challenges

such as client selection and scheduling, and service pricing. Moreover,

different from these surveys, we provide in this work a more fine-grained

three-level classification of the current literature based on

the challenge that they address, the sub-challenges that exist within

each challenge and the techniques used to address each particular

sub-challenge. Furthermore, we deﬁne a set of desirable criteria and

future research directions that we believe are necessary to address

each underlying challenge.

The potential of federated learning in the domains of wireless

communication and mobile edge network has been studied in [14] and

[15] respectively. The authors of [14] investigate the role of federated

learning in the emerging 5G technology. Several use cases that

demonstrate how federated learning could be effective in addressing

key challenges related to 5G are discussed. In the context of edge

computing and content caching, discussions supported with simulation

results show that federated learning is an effective means of predicting

popular content on mobile devices while preserving the privacy of the

users’ data. Moving to spectrum management, federated learning can

be capitalized on to allow each radio to transfer its local utilization

model to a central aggregator, which then leverages these data to

create a global learning model. This global model can then be used to

derive efﬁcient spectrum access decision-making models. Finally, in

the context of 5G core network, vertical federated learning, in which

distributed datasets share the same sample space but differ in the

feature space, can be used to design intelligent network management

techniques. The idea is to allow each entity to manage some specific

features (e.g., access mobility management function, session management

function, etc.) of the whole dataset that encompasses all

users of the network. Different from our work, which addresses the

different aspects of federated learning, this survey is restricted to

discussing the role of federated learning in the domain of wireless

communications. In [15], the authors present a survey that combines

the concepts of federated learning and Mobile Edge Computing (MEC).

After presenting a tutorial on federated learning and explaining its

role as an enabling technology for MEC optimization, the authors

classify the federated learning approaches into two categories, i.e.,

federated learning at mobile edge networks and federated learning

for mobile edge networks. The ﬁrst category gathers the approaches

that address the challenges of implementing federated training on

the edge devices, while the second category gathers the approaches

that investigate federated learning as a means for optimizing MECs.

These two surveys are restricted to only discussing the potential

of federated learning in different aspects of networking, but they

provide no classiﬁcation of the existing federated learning literature

nor desirable criteria for future solutions. In this survey, we believe

that, besides illustrating the potential of federated learning in the

communication and networking domain, providing a multi-level fine-grained

classification of the federated learning literature in general

would help researchers in the domain better understand the field, which

would enable them to design more detailed and efficient solutions.

In [16], the authors survey the current progress on federated

learning in the domain of healthcare informatics. They classify the

current approaches in terms of statistical challenges, communication

efﬁciency, privacy and security issues. Different from this survey

which is speciﬁc to the healthcare informatics domain, our survey is

oriented to the communication and networking research community.

In [17], the authors survey the topic of distributed machine learning

with federated learning as an example. The distributed machine

learning is divided into three main processes, i.e., machine learning

optimizers, distributed optimization and data aggregation. Thereafter,

the federated learning framework is introduced and discussed only

from the perspective of communication efﬁciency. Different from

this survey, our survey is speciﬁc to federated learning, where we

address the different aspects and application domains of this emerging

concept.

We summarize in Table 1 the main similarities and differences

between our survey and the existing surveys on federated learning.

1.2 Contributions

The motivation for this survey stems from three main observations. The

first one is that the existing survey papers focus only on some common

challenges of federated learning such as statistical challenges,

communication efficiency, security and privacy. Nonetheless, there

exist other substantial challenges that need further investigation

such as service pricing and client selection and scheduling. In this

work, we provide a comprehensive survey that considers all these

aspects to provide the reader with a holistic view of the federated

learning paradigm.

The second observation is the lack of a fine-grained multi-level

classification of the federated learning literature, where

the classification schemes in the existing surveys are based on only

one aspect such as the addressed challenges, learning architecture

or role of federated learning in a particular application domain (e.g.,

healthcare and networking). In this work, we go one step further

and propose a three-level ﬁne-grained classiﬁcation scheme. First, we

classify the federated learning approaches based on the challenge that

they address. Then, we classify each corresponding challenge into

several speciﬁc sub-challenges to enable a better understanding of the

topic. Finally, we provide a classiﬁcation within each sub-challenge

based on the technique used to address the underlying sub-challenge.

Even though a couple of survey papers [14], [15] discuss the potential

of federated learning in the networking domain, these surveys do

not provide any classiﬁcation of the federated learning literature.

Different from these papers, our vision in this work is that providing a

detailed and ﬁne-grained classiﬁcation of the broad federated learning

literature in an accessible fashion would help the communication and

networking research community better understand the finest details

of the domain. This would enable them to design more thoughtful

and targeted solutions. For example, by learning the statistical and

security challenges that federated learning encounters, along with the

techniques that are used in the literature to address them, a researcher

in the domain of communication and networking would be able

to design a more holistic federated learning-based communication

solution that also deals with the non-independent and identically

distributed (non-IID) nature of the data and the malicious attacks that can be

launched against the distributed training process.

The third observation is the lack of explicit and elaborate

directives for researchers to help them design future federated

learning solutions. We deﬁne in this work, for each underlying

challenge, a set of desirable criteria and future research directions that

we believe are helpful for the success and effectiveness of the future

federated learning solutions. In summary, the proposed classiﬁcation

scheme and criteria aim to help (1) readers to easily visualize the

current challenges of federated learning along with the state-of-the-art

techniques that are employed to address them; (2) research community

to have a clear roadmap on how to design prospective solutions based

on a set of explicit and well-deﬁned criteria; and (3) beginners in the

ﬁeld to easily grasp the main concepts of federated learning and to be

on the lookout for the current trends in this emerging ﬁeld.

We also provide an accessible tutorial on FL, its alternative

learning paradigms (i.e., distributed learning, parallel learning, ensemble

learning and gossip learning), and its enabling technologies

(i.e., Internet of Things (IoT), cloud computing, edge computing

and 5G/6G networks). Additionally, we discuss the applications of

federated learning in the domain of communication and networking

and highlight some future promising applications of federated learning

in this domain.

1.3 Survey Methodology

The approaches chosen to be included in our survey are selected

from papers published between 2016 (the year when the concept of

federated learning was first introduced) and 2020 in refereed journals

and conferences as well as in preprints, resulting in 130 surveyed

papers. We believe that we have covered most of the papers that addressed

problems related to federated learning. The strategy followed

to gather these papers consisted of (1) searching for the keyword

“federated learning” on many existing search engines; and (2) tracking

the citations of the collected papers to make sure that we cover the

articles that may not be returned in the search engine’s result set.

The classification scheme consists of three interdependent levels.

In the first level, the current federated learning approaches are

categorized based on the high-level challenge they address. In the

second level, each high-level challenge is broken down into several

speciﬁc low-level sub-challenges. In the third level, a classiﬁcation

within each sub-challenge is provided based on the technique that is

used to deal with that sub-challenge. Note that in some cases, it is

possible for an article to appear in more than one category of high-level

challenges. For example, if a certain article mainly addresses a

statistical challenge of federated learning but also provides a privacy-preservation

component, the article would appear under both the

statistical challenges category and privacy concerns category. In such

a case, only the statistical part of the article is classiﬁed and discussed

under the statistical challenges category, whereas the privacy part is

classiﬁed and discussed under the privacy concerns category.
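The three-level scheme, including the rule that one article may be filed under several high-level challenges, can be pictured as a nested mapping. The sketch below is only an illustration of that structure: the article identifiers, sub-challenge labels and technique labels are hypothetical placeholders, and only the two high-level category names come from the example in the text.

```python
from collections import defaultdict

# level 1 (high-level challenge) -> level 2 (sub-challenge)
#   -> level 3 (technique) -> list of article identifiers
scheme = defaultdict(lambda: defaultdict(lambda: defaultdict(list)))


def classify(article, challenge, sub_challenge, technique):
    """File one aspect of an article under the three-level scheme."""
    scheme[challenge][sub_challenge][technique].append(article)


def categories_of(article):
    """High-level challenges under which an article appears."""
    found = {
        challenge
        for challenge, subs in scheme.items()
        for techniques in subs.values()
        for articles in techniques.values()
        if article in articles
    }
    return sorted(found)


# Hypothetical article whose main contribution is statistical but which also
# has a privacy-preservation component: each part is filed separately.
classify("paper-A", "statistical challenges", "sub-challenge S1", "technique T1")
classify("paper-A", "privacy concerns", "sub-challenge P1", "technique T2")
classify("paper-B", "privacy concerns", "sub-challenge P1", "technique T3")

print(categories_of("paper-A"))
```

Filing each part of an article separately mirrors the rule stated above: only the statistical part is discussed under the statistical category and only the privacy part under the privacy category.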

The criteria that are deﬁned in this survey have been inspired

by our readings of the surveyed papers. We do not claim that our

proposed criteria cover all the necessary aspects for improvement;

but we believe that these criteria could be quite useful for designing

innovative solutions and overcoming some persisting challenges. It is

worth mentioning that, in some cases, not all the criteria deﬁned for a

particular aspect (e.g., communication efﬁciency, client selection and

scheduling, etc.) need to be met to design an “ideal solution”. A subset

or a combination of these criteria might be enough to design a good

solution.

1.4 Survey Insights

As mentioned earlier, our classiﬁcation scheme consists of three

levels. The ﬁrst classiﬁcation level, which is based on the high-

level addressed challenge, resulted in six categories, i.e., statistical

challenges, communication efficiency, client selection and scheduling,

security concerns, privacy concerns and service pricing. This

classiﬁcation scheme is depicted in Fig. 1. We provide in Fig. 2
