Journal ArticleDOI

An Overview of Recent Progress in the Study of Distributed Multi-Agent Coordination

TL;DR: In this article, the authors reviewed some main results and progress in distributed multi-agent coordination, focusing on papers published in major control systems and robotics journals since 2006 and proposed several promising research directions along with some open problems that are deemed important for further investigations.
Abstract: This paper reviews some main results and progress in distributed multi-agent coordination, focusing on papers published in major control systems and robotics journals since 2006. Distributed coordination of multiple vehicles, including unmanned aerial vehicles, unmanned ground vehicles, and unmanned underwater vehicles, has been a very active research subject studied extensively by the systems and control community. The recent results in this area are categorized into several directions, such as consensus, formation control, optimization, and estimation. After the review, a short discussion section is included to summarize the existing research and to propose several promising research directions along with some open problems that are deemed important for further investigations.

Summary (7 min read)

Introduction

  • As the popularity of big data analytics has continued to grow, so has the need for accessible and scalable machine-learning implementations.
  • In recent years, Apache Spark’s machine-learning library, MLlib, has been used to fulfill this need.
  • For all four of the applications tested, Smart-MLlib’s implementation outperformed Spark’s MLlib in every configuration by at least 90%.
  • On average, the authors outperformed Spark’s MLlib by over 800%.

Acknowledgments

  • I would like to start by thanking my advisor, Dr. Gagan Agrawal, for his support and guidance throughout my career at Ohio State.
  • My parents, Adam and Lorna, for their support and advice throughout my college career.
  • The trend of data-driven decision making has created a new multi-billion dollar big-data technology and services industry [11].

1.1 Motivation

  • In part to demonstrate the support for cyclic dataflow in Spark, its makers implemented a full machine-learning library, coined MLlib, on top of the framework.
  • This library not only provides concrete examples of iterative applications in Spark, but also provides an extremely user-friendly API for production-quality, machine-learning algorithms.
  • For many of the algorithms, users need less than 20 lines of code to build complex statistical models from stored semi-structured data [24] (a representative invocation is sketched after this list).
  • Other distributed-computing frameworks, such as the authors’ Smart (in-Situ MApReduce liTe) system, have been shown to outperform Spark by at least an order of magnitude for some machine-learning tasks [28].
  • Building a machine-learning library that mimics Spark’s MLlib on top of Smart could lead to increased performance without compromising the easy-to-use interface.
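As a point of reference for how compact MLlib programs are, the sketch below shows a complete k-means invocation against Spark's public RDD-based MLlib API; the input path and parameter values are illustrative only.

    import org.apache.spark.{SparkConf, SparkContext}
    import org.apache.spark.mllib.clustering.KMeans
    import org.apache.spark.mllib.linalg.Vectors

    // Cluster whitespace-separated numeric records with Spark's MLlib k-means.
    val sc = new SparkContext(new SparkConf().setAppName("kmeans-example"))
    val points = sc.textFile("hdfs:///data/points.txt")   // illustrative path
      .map(line => Vectors.dense(line.split(' ').map(_.toDouble)))
      .cache()
    val model = KMeans.train(points, 4, 100)              // k = 4, 100 iterations
    model.clusterCenters.foreach(println)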

1.2 Our Contributions

  • The authors present a machine-learning library prototype, comparable to Spark’s MLlib, built on top of their Smart system.
  • In addition to presenting the specific Smart-MLlib applications in this paper, the authors will also describe the underlying architecture of their system.
  • The main focus of their architectural discussion revolves around launching Smart’s native jobs from within Scala’s Java virtual machine (JVM) environment.
  • After describing the complications surrounding this issue, the authors explain why the scala.sys.process package is used to launch Smart jobs.
  • In addition to these performance results, the authors also show that Smart-MLlib scales better than Spark’s MLlib in almost all cases.

1.3 Organization

  • This thesis is organized in the following way: Chapter 2 discusses the background and motivation for MapReduce, Spark, Smart, and MLlib.
  • Chapter 4 covers each of Smart-MLlib’s algorithms by giving a brief overview, exploring the Smart implementation, presenting the Smart-MLlib API, and comparing the usage of Smart-MLlib to the usage of Spark’s MLlib for the same algorithm.
  • Chapter 5 presents and analyzes the experiments that were conducted to compare the libraries.
  • Finally, Chapter 6 concludes the thesis with an overview of what was covered and provides topics for further work.
  • First, the authors cover the history and basic ideas of the MapReduce programming model.

2.1 MapReduce

  • In the early 2000s, distributed computing was already making a large impact on the technology industry [2].
  • The ability to leverage clusters of computers to speed up computation was motivating Google and many other companies to implement hundreds of special-purpose distributed applications [4].
  • Straightforward algorithms were obfuscated by the details required to build distributed algorithms: explicit parallelization of the computation, intelligent distribution of the data, etc.
  • The model provides a very simple API containing only two core functions: map and reduce (illustrated by the toy example after this list).
  • This simple dataflow, coupled with MapReduce’s straightforward, functional-style API, has made the programming model very popular for a variety of applications [7, 5].
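To make the model concrete, the toy word count below expresses a computation as map and reduce functions; this is a Scala sketch of the programming model itself, not of any particular framework's API.

    // map: emit (word, 1) for every word in an input record.
    def mapFn(record: String): Seq[(String, Int)] =
      record.split("\\s+").toSeq.map(word => (word, 1))

    // reduce: sum the counts emitted for one key.
    def reduceFn(key: String, values: Seq[Int]): (String, Int) =
      (key, values.sum)

    val records = Seq("the quick brown fox", "the lazy dog")
    val counts = records.flatMap(mapFn)
      .groupBy(_._1)
      .map { case (word, pairs) => reduceFn(word, pairs.map(_._2)) }
    // counts: Map(the -> 2, quick -> 1, brown -> 1, ...)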

2.2 Spark

  • Spark was introduced in 2010 to fulfill the need for a general-purpose, parallel-processing framework with built-in support for nonlinear dataflows [30].
  • In addition to defining these new distributed datasets, Spark defined a series of operations on RDDs that supported parallel computation.
  • RDD operations can be loosely grouped into two categories: transformations and actions [25] (see the sketch following this list).
  • Examples of actions include reduce and collect.
  • Finally, an action is performed and the program is terminated.
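A small sketch of this lazy dataflow, assuming an existing SparkContext named sc: transformations only record lineage, and computation happens when an action is invoked.

    val nums = sc.parallelize(1 to 1000000)     // build an RDD from a local range
    val squares = nums.map(x => x.toLong * x)   // transformation: lazy
    val evens = squares.filter(_ % 2 == 0)      // transformation: still lazy
    val total = evens.reduce(_ + _)             // action: triggers the computation
    val firstFew = evens.take(5)                // another action, like collect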

2.3 Smart

  • Smart [28] is the next generation of a parallel-computing framework that has evolved from FREERIDE (FRamework for Rapid Implementation of Data Mining Engines) [13] and MATE (Map-reduce with an AlternaTE API) [12].
  • Instead of map and reduce phases of computation, Smart uses reduction and combination phases.
  • More specifically, Smart processes data in the following way.
  • After every data chunk has been reduced, merge is used to combine all of the reduction maps into a single combination map.
  • Two functions not shown are process_extra_data, which initializes the combination map, and convert, which converts the combination map to an output result (a toy rendition of these phases follows this list).
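Because Smart itself is a C++/MPI system, the following is only an informal Scala rendition of the phases just described, using a toy reduction object that computes a global mean; the function names loosely mirror Smart's API as summarized above.

    // A toy reduction object: running count and sum of the values seen so far.
    case class SumObj(var count: Long = 0L, var total: Double = 0.0)

    // gen_key: route every chunk to the single reduction object (key 0).
    def genKey(chunk: Array[Double]): Int = 0

    // accumulate: reduce one data chunk into a reduction object.
    def accumulate(chunk: Array[Double], obj: SumObj): Unit = {
      obj.count += chunk.length
      obj.total += chunk.sum
    }

    // merge: combine all per-thread reduction maps into a single combination map.
    def merge(maps: Seq[Map[Int, SumObj]]): Map[Int, SumObj] =
      maps.flatten.groupBy(_._1).map { case (key, entries) =>
        val combined = SumObj()
        entries.foreach { case (_, o) =>
          combined.count += o.count
          combined.total += o.total
        }
        key -> combined
      }

    // convert: turn the final combination map into an output result (here, a mean).
    def convert(combined: Map[Int, SumObj]): Double =
      combined(0).total / combined(0).count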

3.1 System Overview

  • At its core, Smart-MLlib is a Scala-based API that is used to execute machine-learning algorithms on Smart.
  • Connecting a JVM-based language, like Scala, with a native language, like C++, is not a new problem.
  • Smart is not just a C++ library, but a C++ library that uses MPI to distribute work across clusters of nodes.
  • The Scala API prepares an mpiexec command complete with all arguments needed by Smart. Using scala.sys.process, the Smart job is then executed with the mpiexec command prepared in Step 2 (a sketch of this launch step follows this list).
  • The Scala API returns the model from Step 7 to the user as a Scala object.
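A minimal sketch of Steps 2 and 3 using scala.sys.process follows; the executable name, flags, and paths are hypothetical stand-ins, not Smart's actual command line.

    import scala.sys.process._

    val numProcesses = 4
    val inputPath    = "/data/points.bin"         // hypothetical
    val modelPath    = "/tmp/kmeans_model.txt"    // hypothetical

    // Step 2: assemble the mpiexec command with all arguments Smart needs.
    val cmd = Seq("mpiexec", "-n", numProcesses.toString,
                  "./smart_kmeans",               // hypothetical Smart executable
                  "--input", inputPath,
                  "--model-out", modelPath)

    // Step 3: run the native MPI job and block until it exits.
    val exitStatus: Int = cmd.!
    require(exitStatus == 0, s"Smart job failed with exit status $exitStatus")
    // Steps 5-7 (per Figure 3.1): the executable saves the model to modelPath,
    // which the Scala side then parses back into a model object for the user.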

3.1.1 System Advantages

  • Architecting the system in the ways outlined above provides several advantages over other possible designs.
  • First, calling Smart as an external command, as it would be called without the MLlib wrapper, ensures that all necessary runtime configurations are properly set up.
  • Beyond the simplification of launching Smart jobs, the architecture also forces every MLlib algorithm to define a savable model.
  • As shown in Step 5 of Figure 3.1, this requirement comes from the Smart executable, which finishes only after saving a model to disk.
  • Additionally, since the model is saved to persistent storage, if the JVM process crashes after a Smart job terminates, the machine-learning job doesn’t need to be executed again.

3.1.2 System Disadvantages

  • While the system architecture utilized has several benefits, it also introduces a few disadvantages.
  • The only real usable information Scala gets directly from Smart is the exit status after the Smart job finishes execution.
  • In addition to the loss of control over the execution of Smart jobs, the architecture also brings additional latency into the system.
  • Writing and reading from disk is a very slow way to communicate between two processes.
  • Fortunately, these failures are infrequent, and many of the issues that do appear can easily be handled in the language in which they occur.

3.2 Smart-MLlib Walkthrough

  • In Section 3.1, an overview of the system architecture was presented.
  • The main tasks for which these applications are responsible are as follows.

4.1 K-Means Clustering

  • The first algorithm to be covered is k-means clustering (k-means).
  • K-means is an unsupervised machine-learning technique used to separate a dataset into k groups such that each group contains similar patterns [23].
  • K-means, in particular, typically uses the Euclidean distance between two patterns as the measure of similarity [10].
  • The goal of the algorithm is to group all of the patterns into k clusters in such a way that the sum of the squared distance between every data pattern and its assigned cluster center is minimized [23].
  • The basic k-means algorithm works as follows: 1) the initial k centers are set; 2) each data point in the dataset is assigned to the nearest center; 3) each center recomputes its location as the mean of all data points assigned to it; and 4) Steps 2 and 3 are repeated until a stopping condition is met (a minimal sequential sketch follows this list).
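A minimal single-machine Scala sketch of Steps 1 through 4; initialization here naively takes the first k points, whereas real implementations choose initial centers more carefully.

    def kMeans(points: Seq[Array[Double]], k: Int, iters: Int): Seq[Array[Double]] = {
      def dist2(a: Array[Double], b: Array[Double]): Double =
        a.zip(b).map { case (x, y) => (x - y) * (x - y) }.sum

      var centers: Seq[Array[Double]] = points.take(k)     // Step 1: initial centers
      for (_ <- 1 to iters) {                              // Step 4: repeat
        // Step 2: assign each point to its nearest center.
        val assigned =
          points.groupBy(p => centers.indices.minBy(i => dist2(p, centers(i))))
        // Step 3: recompute each center as the mean of its assigned points.
        centers = centers.indices.map { i =>
          assigned.get(i) match {
            case Some(ps) =>
              val sums = ps.reduce((a, b) => a.zip(b).map { case (x, y) => x + y })
              sums.map(_ / ps.size)
            case None => centers(i)                        // keep a center with no points
          }
        }
      }
      centers
    }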

4.1.1 Smart’s Implementation

  • To fully describe Smart’s k-means implementation, both the reduction object and core API functions need to be defined.
  • The pseudocode for these items can be seen in Listing 4.1, which was adapted from Smart’s original paper [28].
  • Next, merge accumulates all the reduction maps produced by accumulate into a single combination map.
  • The final combination map, which holds one ClusterObj for each cluster in the algorithm, is then updated by post_combine (mirrored informally in the sketch after this list).
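Listing 4.1 is not reproduced in this summary; informally, and rendered loosely in Scala rather than Smart's C++, the k-means reduction object behaves as follows.

    // One ClusterObj per cluster: a running sum of assigned points plus their count.
    case class ClusterObj(sum: Array[Double], var count: Long)

    // accumulate: fold one point into the ClusterObj of its nearest center.
    def accumulate(point: Array[Double], obj: ClusterObj): Unit = {
      for (d <- point.indices) obj.sum(d) += point(d)
      obj.count += 1
    }

    // post_combine: after merge, each center becomes the mean of its points.
    def postCombine(obj: ClusterObj): Array[Double] =
      obj.sum.map(_ / obj.count.toDouble)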

4.1.2 Smart-MLlib Interface

  • The k-means API in Smart-MLlib is currently implemented as a single function.
  • Table 4.1 gives the API and describes all of the formal parameters that may be specified.
  • From examining the table, it is clear that the interface supplies options for declaring the initial k-means model, determining the number of clusters to use (i.e. k), and setting the number of iterations for the algorithm.
  • The remaining parameters provide general information on the data being processed and the environment Smart uses to execute the distributed algorithm (a hypothetical rendering of such a call follows this list).
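Table 4.1 is not reproduced in this summary, so the call below is a hypothetical rendering of the interface; the object and parameter names are invented for illustration only.

    // Hypothetical stand-in for the Smart-MLlib k-means entry point.
    object SmartKMeans {
      def train(dataPath: String, numClusters: Int,
                numIterations: Int, numNodes: Int): Array[Array[Double]] =
        Array.empty // stub: the real call launches a Smart MPI job
    }

    val model = SmartKMeans.train(
      dataPath      = "/data/points.bin",   // data being processed
      numClusters   = 4,                    // k
      numIterations = 1000,                 // iteration count
      numNodes      = 16                    // Smart execution environment
    )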

4.2 Linear Regression

  • Linear regression is a technique used to model the relationship between variables [29].
  • In the version the authors discuss, the idea is to determine the linear combination of independent variables that will best explain a single dependent variable within a dataset.
  • For a more comprehensive explanation of the linear regression algorithm, please refer to Spark’s MLlib guide [24]; the standard objective and gradient update are recalled after this list.
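For reference, the standard least-squares objective and its gradient-descent update, in generic notation rather than the thesis's own, are

    \min_{\mathbf{w}} \; J(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} \bigl( \mathbf{w}^\top \mathbf{x}_i - y_i \bigr)^2,
    \qquad
    \mathbf{w} \leftarrow \mathbf{w} - \frac{\alpha}{n} \sum_{i=1}^{n} \bigl( \mathbf{w}^\top \mathbf{x}_i - y_i \bigr)\, \mathbf{x}_i,

where $\alpha$ is the learning rate (with the constant factor of 2 absorbed into $\alpha$); these quantities correspond to the learning-rate and iteration parameters discussed in Section 4.2.2.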

4.2.1 Smart’s Implementation

  • The Smart implementation for linear regression can be fully described by defining both the reduction object and the core API functions for the algorithm.
  • In addition, WeightObj is responsible for accumulating the number of points processed and the sum of weighted errors detected throughout the linear regression program.
  • In accumulate, the input pair is processed, and both the size and the sum of weighted errors, which is based on the input vector and the current model weights, are reduced into the reduction object. merge then further accumulates the reduction maps produced by accumulate into a single combination map.

4.2.2 Smart-MLlib Interface

  • The Smart-MLlib linear regression API is shown in Table 4.2.
  • The interface provides options for altering both the learning rate of the algorithm as well as the number of iterations the algorithm should perform.
  • In addition to these parameters, the API also allows users to specify an initial model and gives an option for including a bias or intercept term in the linear-regression computation.
  • The remaining parameters provide relevant information about the data being processed and the cluster environment Smart is using to execute the distributed algorithm.

4.3 Gaussian Mixture Model

  • A Gaussian mixture model (GMM) is a probability distribution that is constructed using a weighted combination of k Gaussian functions.
  • The aim of training a GMM is to modify the weights (i.e. linear coefficients), means, and covariance matrices of the Gaussian functions in order to maximize the likelihood that a particular dataset could be generated by the mixture model [19].
  • Typically, a GMM is trained through the utilization of the expectation-maximization (EM) algorithm [19, 18].
  • Within the context of GMM, the EM algorithm works as follows: 1) the initial k Gaussians are selected; 2) the responsibility of each Gaussian to every data point is determined; 3) based on the responsibilities computed in the previous step, the Gaussian weights, means, and covariance matrices are updated; and 4) Steps 2 and 3 are repeated until convergence.
  • For a more comprehensive explanation of the GMM algorithm, please refer to Spark’s MLlib guide [24]; the standard EM updates are recalled after this list.
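In generic notation, one EM iteration for a mixture of $k$ Gaussians computes responsibilities (the E-step) and then re-estimates the mixture parameters (the M-step):

    \gamma_{ij} = \frac{w_j \, \mathcal{N}(x_i \mid \mu_j, \Sigma_j)}{\sum_{l=1}^{k} w_l \, \mathcal{N}(x_i \mid \mu_l, \Sigma_l)},
    \qquad N_j = \sum_{i=1}^{n} \gamma_{ij},

    w_j = \frac{N_j}{n}, \qquad
    \mu_j = \frac{1}{N_j} \sum_{i=1}^{n} \gamma_{ij}\, x_i, \qquad
    \Sigma_j = \frac{1}{N_j} \sum_{i=1}^{n} \gamma_{ij}\, (x_i - \mu_j)(x_i - \mu_j)^\top.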

4.3.1 Smart’s Implementation

  • Smart’s GMM implementation can be described through defining the algorithm’s reduction object and core API functions.
  • Since this algorithm only requires a single reduction object, gen_key simply returns a constant number.

4.3.2 Smart-MLlib Interface

  • Since the Gaussian mixture model algorithm is similar in flavor to the k-means algorithm, the Smart-MLlib interface for these applications is almost identical.
  • As Table 4.3 shows, the GMM interface provides options for specifying the initial model, the number of Gaussians to use, and the number of iterations for the algorithm to complete.
  • The other parameters are used to specify both the data being processed and the system on which the Smart job will be running.

4.4 Support Vector Machine

  • Support vector machines (SVMs) are used to classify data into two groups.
  • Unlike various other types of classifiers that do not make determinations on the “goodness” of a classification (e.g. perceptrons [20]), SVMs attempt to optimally classify datasets [10].
  • The classification is defined with a hyperplane that separates the dataset into two classes.
  • For a more comprehensive explanation of the SVM algorithm, please refer to Spark’s MLlib guide [24]; the standard soft-margin objective is recalled after this list.
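For reference, the standard soft-margin linear SVM objective, in generic notation, is

    \min_{\mathbf{w},\, b} \; \frac{\lambda}{2} \lVert \mathbf{w} \rVert^2
    + \frac{1}{n} \sum_{i=1}^{n} \max\bigl( 0,\; 1 - y_i (\mathbf{w}^\top \mathbf{x}_i + b) \bigr),
    \qquad y_i \in \{-1, +1\},

where $\lambda$ corresponds to the regParam parameter and the bias term $b$ is included only when the interface's bias option is set (see Section 4.4.2).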

4.4.1 Smart’s Implementation

  • The Smart implementation for SVM can be described by defining the algorithm’s reduction object and core API functions.
  • As this algorithm requires only one reduction object per reduction map, gen_key always returns a constant number.
  • Using this master combination map, post_combine performs a gradient-descent update on the weights of the model using all of the accumulated values.

4.4.2 Smart-MLlib Interface

  • As Table 4.4 suggests, the Smart-MLlib SVM API provides options for specifying the initial SVM model, setting the number of iterations, and declaring if the SVM model being produced should include a bias term.
  • In addition to these parameters, the interface also gives users the opportunity to alter the algorithm’s learning rule through the learningRate and regParam parameters.
  • As with the other Smart-MLlib interfaces discussed, the remaining parameters exist to provide information on the data being processed and the system being used to execute the SVM algorithm.

5.1 Environment

  • The authors’ experiments were all conducted on the same homogeneous, multi-core computing cluster.
  • Specifically, their tests were performed using 4-, 8-, 16-, and 32-node configurations.
  • The processors on each node provide a combined total of eight computing cores that run at a base frequency of 2.53 GHz.
  • The Spark experiments, on the other hand, used Spark’s standalone cluster for communication.
  • This translates to Smart using one MPI process and eight OpenMP threads per node and Spark using one executor and eight executor cores per node.

5.2 K-Means Clustering Experiments

  • For the k-means experiments, the performance of the basic k-means implementation was compared between Smart-MLlib and Spark’s MLlib.
  • The performance reported for each configuration is an average of five independent trials.
  • In all of the tests, k-means was run with four cluster centers for exactly 1000 iterations on 16-dimensional input.
  • Since Spark will stop iterating when a default convergence condition is met, the source code was modified to ensure all iterations actually occurred.

5.2.1 Results

  • The results of all the k-means experiments can be seen in Figure 5.1.
  • In addition to outperforming Spark in head-to-head experiments, Smart-MLlib also out-scaled Spark’s MLlib.
  • Figure 5.2 shows this by tracking Smart’s speedup over Spark while increasing the number of nodes.
  • As these results are consistent with all other results within this chapter, please refer to Section 5.6 for a detailed analysis.

5.3 Linear Regression Experiments

  • For the linear regression experiments, the performance of the linear regression implementation on Smart-MLlib and Spark’s MLlib was compared.
  • As in Section 5.2, the results reported for each configuration are an average of five independent trials.
  • In all of the tests, the linear regression processed input with 15 dimensions and 1 output dimension and ran for exactly 1000 iterations.
  • Note that the Spark source code had to be modified to guarantee all 1000 iterations of the algorithm were completed.

5.3.1 Results

  • The results of the linear regression experiments can be seen in Figure 5.3.
  • From examining the graphs, it is clear that the Smart-MLlib implementation outperforms Spark’s in every configuration.
  • As with k-means, the Smart-MLlib version also scales better than Spark’s.
  • Figure 5.4 shows this superior scaling graphically.
  • As these results are consistent with all other results in this chapter, please refer to Section 5.6 for a detailed analysis.

5.4 Gaussian Mixture Model Experiments

  • The Gaussian mixture model (GMM) experiments were conducted to compare the performance of GMM on Smart-MLlib and Spark’s MLlib.
  • The results reported for each configuration are the average of five independent tests.
  • Since GMM takes substantially longer to execute than the other algorithms covered, each trial was run for only 100 iterations using a four-Gaussian model.

5.4.1 Results

  • The Gaussian mixture model results can be seen in Figure 5.5.
  • Results from k-means, linear regression, and SVM show Smart having roughly a 2 to 15 times advantage over Spark; for the GMM tests, however, this range balloons to a 13 to 54 times advantage.
  • For all of the other algorithms presented, the Smart-MLlib implementation scales better than Spark’s in all cases; GMM is the one exception.
  • In Figure 5.6, the authors see a single negatively sloping line segment for both input sizes plotted.
  • The authors suspect that this decrease in input size allowed each Spark executor to cache one or more additional RDDs, resulting in significantly improved performance.

5.5 SVM Experiments

  • For the SVM experiments, the performance of the linear SVM implementation is compared between Smart-MLlib and Spark’s MLlib.
  • Each test ran for exactly 1000 iterations on samples with 15 input dimensions and 1 output dimension.
  • As the number of nodes increases, Smart-MLlib’s SVM performance improves relative to Spark’s MLlib.
  • Figure 5.8 shows this graphically through the positively sloped line segments.

5.6 Analysis and Discussion

  • All of the results presented in this chapter show that, for the algorithms discussed, Smart-MLlib performs strictly better than Spark’s MLlib.
  • In every configuration tested, the Smart implementation performed at least 90% better than the Spark implementation.
  • The performance advantages of Smart result from three key differences between Smart and Spark [28].


Cao, Y., Yu, W., Ren, W., & Chen, G. (2013). An overview of recent progress in the study of distributed multi-agent coordination. IEEE Transactions on Industrial Informatics, 9(1), 427–438. https://doi.org/10.1109/TII.2012.2219061
© 2005-2012 IEEE.


An Overview of Recent Progress in the Study
of Distributed Multi-agent Coordination
Yongcan Cao, Member, IEEE, Wenwu Yu, Member, IEEE,
Wei Ren, Member, IEEE, and Guanrong Chen, Fellow, IEEE
Abstract
This article reviews some main results and progress in distributed multi-agent coordination, with
the focus on papers published in major control systems and robotics journals since 2006. Distributed
coordination of multiple vehicles, including unmanned aerial vehicles (UAVs), unmanned ground vehicles
(UGVs) and unmanned underwater vehicles (UUVs), has been a very active research subject studied
extensively by the systems and control community. The recent results in this area are categorized into
several directions, such as consensus, formation control, optimization, distributed task assignment, and
estimation. After the review, a short discussion section is included to summarize the existing research
and to propose several promising research directions along with some open problems that are deemed
important and therefore deserving of further investigation.
Index Terms
Distributed coordination, formation control, sensor network, multi-agent system
I. INTRODUCTION
Control theory and practice may date back to the beginning of the last century, when the Wright brothers
attempted their first test flight in 1903. Since then, control theory has gradually gained popularity, receiving
more and wider attention, especially during World War II, when it was developed and applied to fire-
control systems, missile navigation and guidance, as well as various electronic automation devices.
This work was supported by ......, and the Hong Kong RGC under GRF Grant CityU1114/11E.
Y. Cao and W. Ren are with the Department of Electrical and Computer Engineering, Utah State University, Logan, Utah
84322, USA. W. Yu is with the Department of Mathematics, Southeast University, Nanjing 210096, China. G. Chen is with the
Department of Electronic Engineering, City University of Hong Kong, Hong Kong SAR, China.
Manuscript submitted to IEEE Transactions on Industrial Informatics on 31 July 2011.
In the past several decades, modern control theory was further advanced due to the booming of aerospace
technology based on large-scale engineering systems.
During the rapid and sustained development of the modern control theory, technology for controlling
a single vehicle, albeit higher-dimensional and complex, has become relatively mature and has produced
many effective control tools such as PID control, adaptive control, nonlinear control, intelligent control,
and robust control methodologies. In the past two decades in particular, control of multiple vehicles
has received increasing demands spurred by the fact that many benefits can be obtained when a single
complicated vehicle is equivalently replaced by multiple yet simpler vehicles. In this endeavor, two approaches are commonly adopted for controlling multiple vehicles: a centralized approach and a distributed
approach. The centralized approach is based on a basic assumption that a central station is available and
powerful enough to control a whole group of vehicles. Essentially, the centralized approach is a direct
extension of the traditional single-vehicle-based control philosophy and strategy. On the contrary, the
distributed approach does not require a central station for control, at the cost of becoming far more
complex than the centralized one in structure and organization. Although both approaches are considered
practical depending on the situations and conditions of the real applications, the distributed approach
is believed more promising due to many inevitable physical constraints such as limited resources and
energy, short wireless communication ranges, narrow bandwidths, and large sizes of vehicles to manage
and control. Therefore, the focus of this overview is placed on the distributed approach.
In distributed control of a group of autonomous vehicles such as UAVs, UGVs and UUVs, the main
objective typically is to have the whole group of vehicles working in a cooperative fashion through
a distributed protocol. Here, cooperative refers to a close relationship among all vehicles in the group
where information sharing plays a central role. The distributed approach has many advantages in achieving cooperative group performance, especially its low operational costs, fewer system requirements, high robustness, strong adaptivity, and flexible scalability, and it has therefore been widely recognized and
appreciated.
The study of distributed control of multiple vehicles was perhaps first motivated by the work in
distributed computing [1], management science [2], [3], and statistical physics [4]. In the control systems
society, some pioneering works are generally referred to [5], [6], where an asynchronous agreement
problem was studied for distributed decision-making problems. Thereafter, some consensus algorithms
were studied under various information-flow constraints [7]–[11]. There are several journal special issues
on the related topics published after 2006, including the IEEE Transactions on Control Systems Technology (vol. 15, no. 4, 2007), Proceedings of the IEEE (vol. 94, no. 4, 2007), ASME Journal of Dynamic Systems, Measurement, and Control (vol. 129, no. 5, 2007), SIAM Journal of Control and Optimization (vol. 48, no. 1, 2009), and International Journal of Robust and Nonlinear Control (vol. 21, no. 12, 2011).
In addition, there are some more recent reviews and progress reports given in the surveys [12]–[15] and
the books [16]–[21].
This article reviews some main results and recent progress in distributed multi-agent coordination,
published in major control systems and robotics journals since 2006. For results before 2006, the readers
are referred to [12]–[15].
Specifically, this article reviews the recent research results in the following directions, which are not
independent but may actually overlap to some extent:
1. Consensus and the like (synchronization, rendezvous). Consensus refers to the group behavior that
all the agents asymptotically reach a certain common agreement through a local distributed protocol,
with or without predefined common speed and orientation.
2. Distributed formation and the like (flocking). Distributed formation refers to the group behavior
that all the agents form a pre-designed geometrical configuration through local interactions with or
without a common reference.
3. Distributed optimization. This refers to algorithmic developments for the analysis and optimization
of large-scale distributed systems.
4. Distributed task assignment. This refers to the implementation of a task-assignment algorithm in a
distributed fashion based on local information.
5. Distributed estimation and control. This refers to distributed control design based on local estimation
about the needed global information.
The rest of this article is organized as follows. In Section II, basic notations of graph theory and
stochastic matrices are introduced. Sections III, IV, V, VI, and VII describe the recent research results
and progress in consensus, formation control, optimization, task assignment, and estimation, respectively.
Finally, the article is concluded by a short section of discussions with future perspectives.
II. PRELIMINARIES
This section introduces basic concepts and notations of graph theory and stochastic matrices.
A. Graph Theory
For a system of n connected agents, its network topology may be modeled as a directed graph denoted
$\mathcal{G} = (\mathcal{V}, \mathcal{W})$, where $\mathcal{V} = \{v_1, v_2, \cdots, v_n\}$ and $\mathcal{W} \subseteq \mathcal{V} \times \mathcal{V}$ are, respectively, the set of agents and the set of edges.
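For orientation, much of the consensus literature surveyed in this article studies dynamics of the following prototypical single-integrator form (a standard formulation, stated here for the reader's convenience):

    \dot{x}_i(t) = \sum_{j=1}^{n} a_{ij} \bigl( x_j(t) - x_i(t) \bigr), \qquad i = 1, \ldots, n,

where $a_{ij} > 0$ if agent $j$ can send information to agent $i$ over the graph $\mathcal{G}$ and $a_{ij} = 0$ otherwise.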

Citations
Proceedings Article
05 Dec 2016
TL;DR: A simple neural model is explored, called CommNet, that uses continuous communication for fully cooperative tasks and the ability of the agents to learn to communicate amongst themselves is demonstrated, yielding improved performance over non-communicative agents and baselines.
Abstract: Many tasks in AI require the collaboration of multiple agents. Typically, the communication protocol between agents is manually specified and not altered during training. In this paper we explore a simple neural model, called CommNet, that uses continuous communication for fully cooperative tasks. The model consists of multiple agents and the communication between them is learned alongside their policy. We apply this model to a diverse set of tasks, demonstrating the ability of the agents to learn to communicate amongst themselves, yielding improved performance over non-communicative agents and baselines. In some cases, it is possible to interpret the language devised by the agents, revealing simple but effective strategies for solving the task at hand.

804 citations

Journal ArticleDOI
TL;DR: An overview of recent advances in event-triggered consensus of MASs is provided, and some in-depth analysis is made on several event-triggered schemes, including event-based sampling schemes, model-based event-triggered schemes, sampled-data-based event-triggered schemes, and self-triggered sampling schemes.
Abstract: Event-triggered consensus of multiagent systems (MASs) has attracted tremendous attention from both theoretical and practical perspectives due to the fact that it enables all agents eventually to reach an agreement upon a common quantity of interest while significantly alleviating utilization of communication and computation resources. This paper aims to provide an overview of recent advances in event-triggered consensus of MASs. First, a basic framework of multiagent event-triggered operational mechanisms is established. Second, representative results and methodologies reported in the literature are reviewed and some in-depth analysis is made on several event-triggered schemes, including event-based sampling schemes, model-based event-triggered schemes, sampled-data-based event-triggered schemes, and self-triggered sampling schemes. Third, two examples are outlined to show applicability of event-triggered consensus in power sharing of microgrids and formation control of multirobot systems, respectively. Finally, some challenging issues on event-triggered consensus are proposed for future research.

770 citations

Posted Content
TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised end-to-end fashion in simulated or laboratory settings, where global state information is available and communication constraints are lifted.
Abstract: In many real-world settings, a team of agents must coordinate their behaviour while acting in a decentralised way. At the same time, it is often possible to train the agents in a centralised fashion in a simulated or laboratory setting, where global state information is available and communication constraints are lifted. Learning joint action-values conditioned on extra state information is an attractive way to exploit centralised learning, but the best strategy for then extracting decentralised policies is unclear. Our solution is QMIX, a novel value-based method that can train decentralised policies in a centralised end-to-end fashion. QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations. We structurally enforce that the joint-action value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning, and guarantees consistency between the centralised and decentralised policies. We evaluate QMIX on a challenging set of StarCraft II micromanagement tasks, and show that QMIX significantly outperforms existing value-based multi-agent reinforcement learning methods.

693 citations

Journal ArticleDOI
TL;DR: Focusing on different kinds of constraints on the controller and the self-dynamics of each individual agent, as well as the coordination schemes, the recent results are categorized into consensus with constraints, event-based consensus, consensus over signed networks, and consensus of heterogeneous agents.
Abstract: In this paper, we mainly review the topics in consensus and coordination of multi-agent systems, which have received a tremendous surge of interest and progressed rapidly in the past few years. Focusing on different kinds of constraints on the controller and the self-dynamics of each individual agent, as well as the coordination schemes, we categorize the recent results into the following directions: consensus with constraints, event-based consensus, consensus over signed networks, and consensus of heterogeneous agents. We also review some applications of the very well developed consensus algorithms to the topics such as economic dispatch problem in smart grid and k -means clustering algorithms.

595 citations

Journal ArticleDOI
TL;DR: This paper provides an overview and makes a deep investigation on sampled-data-based event-triggered control and filtering for networked systems, finding that a sampled-data-based event-triggered scheme can ensure a positive minimum inter-event time and make it possible to jointly design suitable feedback controllers and event-triggered threshold parameters.
Abstract: This paper provides an overview and makes a deep investigation on sampled-data-based event-triggered control and filtering for networked systems. Compared with some existing event-triggered and self-triggered schemes, a sampled-data-based event-triggered scheme can ensure a positive minimum inter-event time and make it possible to jointly design suitable feedback controllers and event-triggered threshold parameters. Thus, more attention has been paid to the sampled-data-based event-triggered scheme. A deep investigation is first made on the sampled-data-based event-triggered scheme. Then, recent results on sampled-data-based event-triggered state feedback control, dynamic output feedback control, $H_\infty$ filtering for networked systems are surveyed and analyzed. An overview on sampled-data-based event-triggered consensus for distributed multiagent systems is given. Finally, some challenging issues are addressed to direct the future research.

572 citations

References
Journal ArticleDOI
TL;DR: A distinctive feature of this work is to address consensus problems for networks with directed information flow by establishing a direct connection between the algebraic connectivity of the network and the performance of a linear consensus protocol.
Abstract: In this paper, we discuss consensus problems for networks of dynamic agents with fixed and switching topologies. We analyze three cases: 1) directed networks with fixed topology; 2) directed networks with switching topology; and 3) undirected networks with communication time-delays and fixed topology. We introduce two consensus protocols for networks with and without time-delays and provide a convergence analysis in all three cases. We establish a direct connection between the algebraic connectivity (or Fiedler eigenvalue) of the network and the performance (or negotiation speed) of a linear consensus protocol. This required the generalization of the notion of algebraic connectivity of undirected graphs to digraphs. It turns out that balanced digraphs play a key role in addressing average-consensus problems. We introduce disagreement functions for convergence analysis of consensus protocols. A disagreement function is a Lyapunov function for the disagreement network dynamics. We proposed a simple disagreement function that is a common Lyapunov function for the disagreement dynamics of a directed network with switching topology. A distinctive feature of this work is to address consensus problems for networks with directed information flow. We provide analytical tools that rely on algebraic graph theory, matrix theory, and control theory. Simulations are provided that demonstrate the effectiveness of our theoretical results.

11,658 citations

Journal ArticleDOI
05 Mar 2007
TL;DR: A theoretical framework for analysis of consensus algorithms for multi-agent networked systems with an emphasis on the role of directed information flow, robustness to changes in network topology due to link/node failures, time-delays, and performance guarantees is provided.
Abstract: This paper provides a theoretical framework for analysis of consensus algorithms for multi-agent networked systems with an emphasis on the role of directed information flow, robustness to changes in network topology due to link/node failures, time-delays, and performance guarantees. An overview of basic concepts of information consensus in networks and methods of convergence and performance analysis for the algorithms are provided. Our analysis framework is based on tools from matrix theory, algebraic graph theory, and control theory. We discuss the connections between consensus problems in networked dynamic systems and diverse applications including synchronization of coupled oscillators, flocking, formation control, fast consensus in small-world networks, Markov processes and gossip-based algorithms, load balancing in networks, rendezvous in space, distributed sensor fusion in sensor networks, and belief propagation. We establish direct connections between spectral and structural properties of complex networks and the speed of information diffusion of consensus algorithms. A brief introduction is provided on networked systems with nonlocal information flow that are considerably faster than distributed systems with lattice-type nearest neighbor interactions. Simulation results are presented that demonstrate the role of small-world effects on the speed of consensus algorithms and cooperative control of multivehicle formations

9,715 citations

Book
01 Jan 2009
TL;DR: The Laplacian of a Graph and Cuts and Flows are compared to the Rank Polynomial.
Abstract: Graphs.- Groups.- Transitive Graphs.- Arc-Transitive Graphs.- Generalized Polygons and Moore Graphs.- Homomorphisms.- Kneser Graphs.- Matrix Theory.- Interlacing.- Strongly Regular Graphs.- Two-Graphs.- Line Graphs and Eigenvalues.- The Laplacian of a Graph.- Cuts and Flows.- The Rank Polynomial.- Knots.- Knots and Eulerian Cycles.- Glossary of Symbols.- Index.

8,307 citations

Journal ArticleDOI
TL;DR: A theoretical explanation for the observed behavior of the Vicsek model, which proves to be a graphic example of a switched linear system which is stable, but for which there does not exist a common quadratic Lyapunov function.
Abstract: In a recent Physical Review Letters article, Vicsek et al. propose a simple but compelling discrete-time model of n autonomous agents (i.e., points or particles) all moving in the plane with the same speed but with different headings. Each agent's heading is updated using a local rule based on the average of its own heading plus the headings of its "neighbors." In their paper, Vicsek et al. provide simulation results which demonstrate that the nearest neighbor rule they are studying can cause all agents to eventually move in the same direction despite the absence of centralized coordination and despite the fact that each agent's set of nearest neighbors change with time as the system evolves. This paper provides a theoretical explanation for this observed behavior. In addition, convergence results are derived for several other similarly inspired models. The Vicsek model proves to be a graphic example of a switched linear system which is stable, but for which there does not exist a common quadratic Lyapunov function.

8,233 citations

Proceedings ArticleDOI
01 Aug 1987
TL;DR: In this article, an approach based on simulation as an alternative to scripting the paths of each bird individually is explored, with the simulated birds being the particles and the aggregate motion of the simulated flock is created by a distributed behavioral model much like that at work in a natural flock; the birds choose their own course.
Abstract: The aggregate motion of a flock of birds, a herd of land animals, or a school of fish is a beautiful and familiar part of the natural world. But this type of complex motion is rarely seen in computer animation. This paper explores an approach based on simulation as an alternative to scripting the paths of each bird individually. The simulated flock is an elaboration of a particle systems, with the simulated birds being the particles. The aggregate motion of the simulated flock is created by a distributed behavioral model much like that at work in a natural flock; the birds choose their own course. Each simulated bird is implemented as an independent actor that navigates according to its local perception of the dynamic environment, the laws of simulated physics that rule its motion, and a set of behaviors programmed into it by the "animator." The aggregate motion of the simulated flock is the result of the dense interaction of the relatively simple behaviors of the individual simulated birds.

7,365 citations

Frequently Asked Questions (16)
Q1. What are the contributions in "An overview of recent progress in the study of distributed multi-agent coordination"?

This article reviews some main results and progress in distributed multi-agent coordination, with the focus on papers published in major control systems and robotics journals since 2006. Distributed coordination of multiple vehicles, including unmanned aerial vehicles (UAVs), unmanned ground vehicles (UGVs) and unmanned underwater vehicles (UUVs), has been a very active research subject studied extensively by the systems and control community. After the review, a short discussion section is included to summarize the existing research and to propose several promising research directions along with some open problems that are deemed important and therefore deserving of further investigation.

In order to show consensus in multi-agent systems with time-varying network structures, stochastic matrix theory [5]–[7], [10] and convexity analysis [11] are often applied. 

Distributed surveillance has a number of potential applications, such as border security guarding, forest fire monitoring, and oil spill patrolling.

If enough information of the group reference is known, such as acceleration and/or velocity information of the group reference, flocking with a dynamic group reference can be solved by employing a gradient-based control law [203]–[205]. 

Distributed task assignment refers to the study of task assignment of a group of dynamical agents in a distributed manner, which can be roughly categorized into coverage control, scheduling, and surveillance. 

Although the study of consensus under various system dynamics is due to the existence of complex dynamics in practical systems, it is also interesting to observe that system dynamics play an important role in determining the final consensus state. 

An interesting problem in formation tracking is to design a distributed control algorithm to drive a team of agents to track some desired state. 

The formation tracking problem can be converted to a traditional stability problem by redefining the variables as the errors between each agent’s state and the group reference. 

Because time delay might affect the system stability, it is important to study under what conditions consensus can still be guaranteed even if time delay exists. 

In other cases, when considering random communication failures, random packet drops, and communication channel instabilities inherent in physical communication channels, it is necessary and important to study the consensus problem in a stochastic setting where the network topology evolves according to some random distributions.

Note that the existing research on consensus in a sampled-data framework mainly focuses on the simple system dynamics and thus the closed-loop system can be represented in terms of a linear matrix equation. 

The main approach to maintaining the connectivity of a team of agents is to define some artificial potentials (between any pair of agents) in a proper way such that if two agents are neighbors initially then they will always communicate with each other thereafter [206], [219]–[228]. 

In [233], an incremental subgradient approach was used to solve the optimization problem for a ring type of network.

Due to the existence of the group reference, formation tracking is usually much more challenging than formation producing, and control algorithms for the latter might not be useful for the former.

Although both approaches are considered practical depending on the situations and conditions of the real applications, the distributed approach is believed more promising due to many inevitable physical constraints such as limited resources and energy, short wireless communication ranges, narrow bandwidths, and large sizes of vehicles to manage and control. 

It remains a challenging problem to incorporate both dynamics of consensus and probabilistic filtering (Kalman) into a unified methodology.