
Showing papers in "Autonomous Agents and Multi-Agent Systems in 2021"


Journal ArticleDOI
TL;DR: A comprehensive view of logic-based technologies for agents and multi-agent systems is provided by making them the subject of a systematic literature review (SLR) and the resulting technologies are discussed and evaluated from two different perspectives.
Abstract: Precisely when the success of sub-symbolic artificial intelligence (AI) techniques leads many non-computer-scientists and non-technical media to identify them with AI as a whole, symbolic approaches are getting more and more attention as those that could make AI amenable to human understanding. Given the recurring cycles in the history of AI, we expect that a revamp of technologies often tagged as “classical AI”, in particular logic-based ones, will take place in the next few years. On the other hand, agents and multi-agent systems (MAS) have been at the core of the design of intelligent systems since their very beginning, and their long-term connection with logic-based technologies, which characterised their early days, might open new ways to engineer explainable intelligent systems. This is why understanding the current status of logic-based technologies for MAS is nowadays of paramount importance. Accordingly, this paper aims at providing a comprehensive view of those technologies by making them the subject of a systematic literature review (SLR). The resulting technologies are discussed and evaluated from two different perspectives: the MAS and the logic-based ones.

57 citations


Journal ArticleDOI
TL;DR: This article analyses what is needed to provide verified reliable behaviour of an autonomous system, analyses what can be done with state-of-the-art automated verification, and proposes a roadmap towards developing regulatory guidelines.
Abstract: A computational system is called autonomous if it is able to make its own decisions, or take its own actions, without human supervision or control. The capability and spread of such systems have reached the point where they are beginning to touch much of everyday life. However, regulators grapple with how to deal with autonomous systems, for example how could we certify an Unmanned Aerial System for autonomous use in civilian airspace? We here analyse what is needed in order to provide verified reliable behaviour of an autonomous system, analyse what can be done as the state-of-the-art in automated verification, and propose a roadmap towards developing regulatory guidelines, including articulating challenges to researchers, to engineers, and to regulators. Case studies in seven distinct domains illustrate the article.

35 citations


Journal ArticleDOI
TL;DR: This paper proposes a privacy recommendation model for images using tags, together with an agent, pelte, that implements it; pelte can accurately predict privacy settings even when a user has shared only a few images with others, the images have only a few tags, or the user’s friends have varying privacy preferences.
Abstract: Image sharing is a service offered by many online social networks. In order to preserve privacy of images, users need to think through and specify a privacy setting for each image that they upload. This is difficult for two main reasons: first, research shows that many times users do not know their own privacy preferences, but only become aware of them over time. Second, even when users know their privacy preferences, editing these privacy settings is cumbersome and requires too much effort, interfering with the quick sharing behavior expected on an online social network. Accordingly, this paper proposes a privacy recommendation model for images using tags and an agent that implements this, namely pelte. Each user agent makes use of the privacy settings that its user has set for previous images to automatically predict the privacy setting for an image that is uploaded to be shared. When in doubt, the agent analyzes the sharing behavior of other users in the user’s network to be able to recommend to its user what should be considered private. Contrary to existing approaches that assume all the images are available to a centralized model, pelte is compatible with distributed environments since each agent accesses only the privacy settings of the images that the agent owner has shared or those that have been shared with the user. Our simulations on a real-life dataset show that pelte can accurately predict privacy settings even when a user has shared only a few images with others, the images have only a few tags, or the user’s friends have varying privacy preferences.

25 citations


Journal ArticleDOI
TL;DR: In this paper, the authors explored the effectiveness of different trust repair strategies from an intelligent agent by measuring the development of human trust and advice taking in a Human-Agent Teaming task.
Abstract: The role of intelligent agents becomes more social as they are expected to act in direct interaction, involvement and/or interdependency with humans and other artificial entities, as in Human-Agent Teams (HAT). The highly interdependent and dynamic nature of teamwork demands correctly calibrated trust among team members. Trust violations are an inevitable aspect of the cycle of trust and since repairing damaged trust proves to be more difficult than building trust initially, effective trust repair strategies are needed to ensure durable and successful team performance. The aim of this study was to explore the effectiveness of different trust repair strategies from an intelligent agent by measuring the development of human trust and advice taking in a Human-Agent Teaming task. Data for this study were obtained using a task environment resembling a first-person shooter game. Participants carried out a mission in collaboration with their artificial team member. A trust violation was provoked when the agent failed to detect an approaching enemy. After this, the agent offered one of four trust repair strategies, composed of the apology components explanation and expression of regret (either one alone, both or neither). Our results indicated that expressing regret was crucial for effective trust repair. After trust declined due to the violation by the agent, trust only significantly recovered when an expression of regret was included in the apology. This effect was stronger when an explanation was added. In this context, the intelligent agent was the most effective in its attempt of rebuilding trust when it provided an apology that was both affective, and informational. Finally, the implications of our findings for the design and study of Human-Agent trust repair are discussed.

14 citations


Journal ArticleDOI
TL;DR: A new deep Bayesian policy reuse algorithm, a.k.a. DPN-BPR+ is proposed, by extending the recent BPR+ algorithm with a neural network as the value-function approximator and taking advantage of the opponent model to infer the other agents’ policy from reward signals and its behavior.
Abstract: One challenging problem in multiagent systems is to cooperate or compete with non-stationary agents that change behavior from time to time. An agent in such a non-stationary environment is usually supposed to be able to quickly detect the other agents’ policy during online interaction, and then adapt its own policy accordingly. This article studies efficient policy detecting and reusing techniques when playing against non-stationary agents in cooperative or competitive Markov games. We propose a new deep Bayesian policy reuse algorithm, a.k.a. DPN-BPR+, by extending the recent BPR+ algorithm with a neural network as the value-function approximator. To detect policy accurately, we propose the rectified belief model, which takes advantage of the opponent model to infer the other agents’ policy from reward signals and its behavior. Instead of directly storing individual policies as BPR+ does, we introduce a distilled policy network that serves as the policy library, and policy distillation to achieve efficient online policy learning and reuse. DPN-BPR+ inherits all the advantages of BPR+. In experiments, we evaluate DPN-BPR+ in terms of detection accuracy, cumulative reward and speed of convergence in four complex Markov games with raw visual inputs, including two cooperative games and two competitive games. Empirical results show that our proposed DPN-BPR+ approach has better performance than existing algorithms in all these Markov games.
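
The policy-detection step described above rests on a Bayesian belief over a library of candidate opponent policies. The following is a minimal sketch of that generic Bayes-rule update only, not the paper's rectified belief model or its neural components; the policy names and likelihood values are hypothetical and chosen purely for illustration.

```python
def update_belief(belief, likelihoods):
    """Bayes-rule update of a belief over candidate opponent policies.

    belief: dict mapping policy name -> prior probability.
    likelihoods: dict mapping policy name -> probability of the observed
        signal (for example an episode reward) under that policy.
    """
    unnormalized = {p: belief[p] * likelihoods[p] for p in belief}
    total = sum(unnormalized.values())
    if total == 0:
        # The signal is impossible under every candidate: keep the prior.
        return dict(belief)
    return {p: w / total for p, w in unnormalized.items()}


# Hypothetical two-policy library and made-up likelihoods.
prior = {"aggressive": 0.5, "defensive": 0.5}
posterior = update_belief(prior, {"aggressive": 0.75, "defensive": 0.25})
most_likely = max(posterior, key=posterior.get)
```

After each interaction the agent would respond with the policy that performs best against `most_likely` (or against the whole posterior); how that response policy is stored and learned is where the distilled policy network comes in.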

13 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed charging behavior model can better capture the bounded rationality of human players in the charging activity compared with state-of-the-art behavior models.
Abstract: Optimal placement of charging stations for electric vehicles (EVs) is critical for providing convenient charging service to EV owners and promoting public acceptance of EVs. There has been a lot of work on EV charging station placement, yet EV drivers’ charging strategy, which plays an important role in determining charging stations’ performance, is missing. EV drivers choose among charging stations according to various factors, including the distance, the charging fare, and the queuing conditions at different stations. In turn, some factors, like queuing conditions, are greatly influenced by EV drivers’ choices: as more EVs visit the same station, longer queuing durations should be expected. This work first proposes a behavior model to capture the decision making of EV drivers in choosing charging stations, based on which an optimal charging station placement model is presented to minimize the social cost (defined as the congestion in charging stations suffered by all EV drivers). Through analyzing EV drivers’ decision-making in the charging process, we propose a k-Level nested Quantal Response Equilibrium charging behavior model inspired by the Quantal Response Equilibrium model and the level-k thinking model. We then design a set of user studies to simulate charging scenarios and collect data from human players to learn the parameters of different behavior models. Experimental results show that our charging behavior model can better capture the bounded rationality of human players in the charging activity compared with state-of-the-art behavior models. Furthermore, to evaluate the proposed charging behavior model, we formulate the charging station placement problem with it and design an algorithm to solve the problem. It is shown that our approach obtains placements with significantly better performance, to varying extents, especially when the budget is limited and relatively low.
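
The quantal response idea behind the behavior model can be illustrated with the standard logit choice rule, in which a boundedly rational driver picks lower-cost stations more often but not always. This is a sketch of that basic rule only, with fixed costs; it is not the k-Level nested model of the paper, and it does not compute an equilibrium in which queuing costs depend on other drivers' choices. The cost values and the `rationality` parameter name are illustrative assumptions.

```python
import math


def quantal_response(costs, rationality):
    """Logit choice probabilities over options; lower cost is more likely.

    rationality = 0 gives a uniform choice; larger values approach
    always picking the cheapest option.
    """
    lowest = min(costs)
    # Shifting by the minimum keeps the exponentials numerically stable.
    weights = [math.exp(-rationality * (c - lowest)) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical perceived costs (distance + fare + queuing) of three stations.
station_costs = [2.0, 3.0, 5.0]
uniform = quantal_response(station_costs, 0.0)
sharp = quantal_response(station_costs, 10.0)
```

With `rationality` set to 0 the driver chooses uniformly at random, and with a large value the choice concentrates on the cheapest station, which is the bounded-rationality range such models are fitted over.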

10 citations


Journal Article
TL;DR: The deep convolutional network MAPFAST (Multi-Agent Path Finding Algorithm SelecTor) is developed, which takes a MAPF problem instance and attempts to select the fastest algorithm to use from a portfolio of algorithms.
Abstract: Solving the Multi-Agent Path Finding (MAPF) problem optimally is known to be NP-Hard for both make-span and total arrival time minimization. While many algorithms have been developed to solve MAPF problems, there is no dominating optimal MAPF algorithm that works well in all types of problems and no standard guidelines for when to use which algorithm. In this work, we develop the deep convolutional network MAPFAST (Multi-Agent Path Finding Algorithm SelecTor), which takes a MAPF problem instance and attempts to select the fastest algorithm to use from a portfolio of algorithms. We improve the performance of our model by including single-agent shortest paths in the instance embedding given to our model and by utilizing supplemental loss functions in addition to a classification loss. We evaluate our model on a large and diverse dataset of MAPF instances, showing that it outperforms all individual algorithms in its portfolio as well as the state-of-the-art optimal MAPF algorithm selector. We also provide an analysis of algorithm behavior in our dataset to gain a deeper understanding of optimal MAPF algorithms' strengths and weaknesses to help other researchers leverage different heuristics in algorithm designs.

8 citations



Journal ArticleDOI
TL;DR: In this article, the authors propose a formalism to model and reason about reconfigurable multi-agent systems, where agents interact and communicate in different modes so that they can pursue joint tasks; agents may dynamically synchronize, exchange data, adapt their behaviour, and reconfigure their communication interfaces.
Abstract: We propose a formalism to model and reason about reconfigurable multi-agent systems. In our formalism, agents interact and communicate in different modes so that they can pursue joint tasks; agents may dynamically synchronize, exchange data, adapt their behaviour, and reconfigure their communication interfaces. Inspired by existing multi-robot systems, we represent a system as a set of agents (each with local state) that execute independently and influence each other only by means of message exchange. Agents are able to sense their local states and, partially, their surroundings. We extend LTL to be able to reason explicitly about the intentions of agents in the interaction and their communication protocols. We also study the complexity of satisfiability and model-checking of this extension.

6 citations


Journal Article
TL;DR: In this work, a novel communication mechanism called Intrinsic Motivated Multi-Agent Communication (IMMAC) is presented, which uses an observation-dependent intrinsic value to represent the importance of observed information and an attentional mechanism based on intrinsic values to control communication.
Abstract: Efficient communication is a promising way to achieve cooperation among agents in many real-world scenarios. However, aimless and motiveless information sharing may not work or may even degrade the cooperative performance. Typically, multi-agent communication behaviors are motivated by extrinsic rewards from the environment. We summarize this mechanism as ‘Communicate what rewards you’. In this work, we present a novel communication mechanism called Intrinsic Motivated Multi-Agent Communication (IMMAC). Our key insight can be summarized as ‘Communicate what surprises you’. Concretely, we use an observation-dependent intrinsic value to represent the importance of observed information. Then a gating mechanism and an attentional mechanism based on intrinsic values are designed to control communication. By encouraging agents to communicate and focus on the observations with uncertain and important information, our algorithm achieves superior communication efficiency and cooperative performance. We evaluate IMMAC on a variety of challenging tasks, and demonstrate that intrinsic values are sufficient to drive efficient communication behaviors. Moreover, we found that the combination of intrinsic values and extrinsic values can further improve the communication efficiency. Consequently, intrinsic motivation is a promising way to control communication and it is capable of being a good complement to the existing extrinsically motivated communication methods.

6 citations



Journal ArticleDOI
TL;DR: It is proved that tournaments with a static schedule and at least five players always include irrelevant matches, and dynamic schedules for an arbitrary number of players can be devised that avoid irrelevant matches.
Abstract: We consider tournaments played by a set of players in order to establish a ranking among them. We introduce the notion of irrelevant match, as a match that does not influence the ultimate ranking of the involved parties. After discussing the basic properties of this notion, we seek out tournaments that have no irrelevant matches, focusing on the class of tournaments where each player challenges each other exactly once. We prove that tournaments with a static schedule and at least five players always include irrelevant matches. Conversely, dynamic schedules for an arbitrary number of players can be devised that avoid irrelevant matches, at least for one of the players involved in each match. Finally, we prove by computational means that there exist tournaments where all matches are relevant to both players, at least up to eight players.


Journal ArticleDOI
TL;DR: In this article, a formal definition and mathematical model of real-time multi-agent systems (RT-MAS) is proposed and the results obtained by testing the dynamics characterizing the RT-MAS model within the simulator MAXIM-GPRT are presented.
Abstract: Since its dawn as a discipline, Artificial Intelligence (AI) has focused on mimicking the human mental processes. As AI applications matured, the interest for employing them into real-world complex systems (i.e., coupling AI with Cyber-Physical Systems—CPS) kept increasing. In the last decades, the multi-agent systems (MAS) paradigm has been among the most relevant approaches fostering the development of intelligent systems. In numerous scenarios, MAS boosted distributed autonomous reasoning and behaviors. However, many real-world applications (e.g., CPS) demand the respect of strict timing constraints. Unfortunately, current AI/MAS theories and applications only reason “about time” and are incapable of acting “in time” guaranteeing any timing predictability. This paper analyzes the MAS compliance with strict timing constraints (real-time compliance)—crucial for safety-critical applications such as healthcare, industry 4.0, and automotive. Moreover, it elicits the main reasons for the lack of real-time satisfiability in MAS (originated from current theories, standards, and implementations). In particular, traditional internal agent schedulers (general-purpose-like), communication middlewares, and negotiation protocols have been identified as co-factors inhibiting real-time compliance. To pave the road towards reliable and predictable MAS, this paper postulates a formal definition and mathematical model of real-time multi-agent systems (RT-MAS). Furthermore, this paper presents the results obtained by testing the dynamics characterizing the RT-MAS model within the simulator MAXIM-GPRT. Thus, it has been possible to analyze the deadline miss ratio between the algorithms employed in the most popular frameworks and the proposed ones. Finally, discussing the obtained results, the ongoing and future steps are outlined.

Journal ArticleDOI
TL;DR: In this paper, an actor-critic architecture with model-free reinforcement learning is used to learn a negotiation strategy expressed as a deep neural network, which can adapt to different e-market settings without the need to be pre-programmed.
Abstract: We present a novel negotiation model that allows an agent to learn how to negotiate during concurrent bilateral negotiations in unknown and dynamic e-markets. The agent uses an actor-critic architecture with model-free reinforcement learning to learn a strategy expressed as a deep neural network. We pre-train the strategy by supervision from synthetic market data, thereby decreasing the exploration time required for learning during negotiation. As a result, we can build automated agents for concurrent negotiations that can adapt to different e-market settings without the need to be pre-programmed. Our experimental evaluation shows that our deep reinforcement learning based agents outperform two existing well-known negotiation strategies in one-to-many concurrent bilateral negotiations for a range of e-market settings.


Journal Article
TL;DR: The challenge of building a seeing-eye robot is proposed, as a thought-provoking use-case that helps identify the challenges to be faced when creating behaviors for robot assistants in general.
Abstract: Automated care systems are becoming more tangible than ever: recent breakthroughs in robotics and machine learning can be used to address the need for automated care created by the increasing aging population. However, such systems require overcoming several technological, ethical, and social challenges. One inspirational manifestation of these challenges can be observed in the training of seeing-eye dogs for visually impaired people. A seeing-eye dog is not just trained to obey its owner, but also to “intelligently disobey”: if it is given an unsafe command from its handler, it is taught to disobey it or even insist on a different course of action. This paper proposes the challenge of building a seeing-eye robot, as a thought-provoking use-case that helps identify the challenges to be faced when creating behaviors for robot assistants in general. Through this challenge, this paper delineates the prerequisites that an automated care system will need to have in order to perform intelligent disobedience and to serve as a true agent for its handler.

Journal ArticleDOI
TL;DR: In this paper, the authors show that Game Description Language (GDL) can be used to define general and complex negotiation domains, including the commonly used test-beds Genius and Colored Trails.
Abstract: Recently, it has been proposed that Game Description Language (GDL) could be used to define negotiation domains. This would open up an entirely new, declarative, approach to Automated Negotiations in which a single algorithm could negotiate over any domain, as long as that domain is expressible in GDL. However, until now, the feasibility of this approach has only been demonstrated on a few toy-world problems. Therefore, in this paper we show that GDL is a truly unifying language that can also be used to define more general and more complex negotiation domains. We demonstrate this by showing that some of the most commonly used test-beds in the Automated Negotiations literature, namely Genius and Colored Trails, can be described in GDL. More specifically, we formally prove that the set of possible agreements of any negotiation domain from Genius (either linear or non-linear) can be modeled as a set of strategies over a deterministic extensive-form game. Furthermore, we show that this game can be effectively described in GDL and we show experimentally that, given only this GDL description, we can explore the agreement space efficiently using entirely generic domain-independent algorithms. In addition, we show that the same holds for negotiation domains in the Colored Trails framework. This means that one could indeed implement a single negotiating agent that is capable of negotiating over a broad class of negotiation domains, including Genius and Colored Trails.

Journal ArticleDOI
TL;DR: In this paper, a hierarchical AV behaviour model is proposed for the holistic evaluation of autonomous and mixed traffic by unifying a wide spectrum of AV functionality, including long-term planning, path planning, complex platooning manoeuvres, and low-level longitudinal and lateral control.
Abstract: Microscopic agent-based traffic simulation is an important tool for the efficient and safe resolution of various traffic challenges accompanying the introduction of autonomous vehicles on the roads. Both the variety of questions that can be asked and the quality of answers provided by simulations, however, depend on the underlying models. In mixed traffic, the two most critical models are the models describing the driving behaviour of humans and AVs, respectively. This paper presents AVDM (Autonomous Vehicle Driving Model), a hierarchical AV behaviour model that allows the holistic evaluation of autonomous and mixed traffic by unifying a wide spectrum of AV functionality, including long-term planning, path planning, complex platooning manoeuvres, and low-level longitudinal and lateral control. The model consists of hierarchically layered modules bidirectionally connected by messages and commands. On top, a high-level planning module makes decisions whether to join/form platoons and how to follow the vehicle’s route. A platooning manoeuvres layer guides involved AVs through the manoeuvres chosen to be executed, assisted by the trajectory planning layer, which, after finding viable paths through complex traffic conditions, sends simple commands to the low-level control layer to execute those paths. The model has been implemented in the BEHAVE mixed traffic simulation tool and achieved a 92% success rate for platoon joining manoeuvres in mixed traffic conditions. As a proof of concept, we conducted a mixed traffic simulation study showing that enabling platooning on a highway scenario shifts the velocity-density curve upwards despite the additional lane changing and manoeuvring it induces.

Journal ArticleDOI
TL;DR: In this paper, the authors provide a complete computational account of argumentation-based negotiation under incomplete opponent profiles, and present a negotiation framework based on these ideas, along with experimental evidence that highlights the advantages of their approach.
Abstract: Computational argumentation has taken a predominant place in the modeling of negotiation dialogues over the last years. A competent agent participating in a negotiation process is expected to decide its next move taking into account an, often incomplete, model of its opponent. This work provides a complete computational account of argumentation-based negotiation under incomplete opponent profiles. After the agent identifies its best option, in any state of a negotiation, it looks for suitable arguments that support this option in the theory of its opponent. As the knowledge on the opponent is uncertain, the challenge is to find arguments that, ideally, support the selected option despite the uncertainty. We present a negotiation framework based on these ideas, along with experimental evidence that highlights the advantages of our approach.

Journal ArticleDOI
TL;DR: In this article, the dominant set selection problem is formulated as finding the preferences over all possible sets of objects, by grounding the preferences over features to preferences over the objects themselves and then lifting these preferences to preferences over all possible sets of objects.
Abstract: Decision makers can often be confronted with the need to select a subset of objects from a set of candidate objects by just counting on preferences regarding the objects’ features. Here we formalise this problem as the dominant set selection problem. Solving this problem amounts to finding the preferences over all possible sets of objects. We accomplish this by: (i) grounding the preferences over features to preferences over the objects themselves; and (ii) lifting these preferences to preferences over all possible sets of objects. This is achieved by combining lex-cel, a method from the literature, with our novel anti-lex-cel method, which we formally (and thoroughly) study. Furthermore, we provide a binary integer program encoding to solve the problem. Finally, we illustrate our overall approach by applying it to the selection of value-aligned norm systems.
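
For readers unfamiliar with lex-cel, the sketch below shows the rule in its usual social-ranking form, as understood from the literature rather than from this abstract: sets are grouped into indifference classes from best to worst, and each element is ranked by lexicographically comparing how many sets of each class contain it. How the paper applies it to features and objects, and the anti-lex-cel counterpart, are not shown; the example data and the function name are hypothetical.

```python
def lex_cel(elements, ranked_classes):
    """Rank elements from a ranking over sets (best class first).

    ranked_classes: list of indifference classes, each a list of sets.
    Returns a list of tiers (best first); elements in a tier are tied.
    """
    def profile(x):
        # Number of sets containing x in each class, best class first.
        return tuple(sum(1 for s in cls if x in s) for cls in ranked_classes)

    tiers = {}
    for x in elements:
        tiers.setdefault(profile(x), []).append(x)
    # Tuples compare lexicographically; a larger profile is better.
    return [sorted(tiers[p]) for p in sorted(tiers, reverse=True)]


# Hypothetical ranking over subsets of {a, b, c}.
ranked = [
    [{"a", "b"}],         # best class
    [{"a"}, {"b", "c"}],  # middle class (these two sets are tied)
    [{"b"}, {"c"}],       # worst class
]
ranking = lex_cel(["a", "b", "c"], ranked)
```

Here `b` has profile (1, 1, 1), `a` has (1, 1, 0) and `c` has (0, 1, 1), so the resulting order is `b`, then `a`, then `c`: presence in better classes dominates, and lower classes only break ties.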


Journal Article
TL;DR: In this model, voters submit interaction structures, and the goal is to find an aggregated structure: given a set P of m projects and n partitions of P, the task is to aggregate these n partitions into one aggregated partition.
Abstract: Recently, Jain et al. [IJCAI, 2019] studied the effect of project interactions in participatory budgeting (PB) by assuming an existing partition of the projects into interaction structures, namely a grouping of the projects into substitution and complementarity groups. Motivated by their study, here we take voter preferences to find such interaction structures. In our model, voters submit interaction structures, and the goal is to find an aggregated structure. Formally, given a set P of m projects, and n partitions of P, the task is to aggregate these n partitions into one aggregated partition. We consider this partition aggregation task both for substitution structures and for complementarity structures, studying several aggregation methods for each, including utility-based methods and Condorcet-based methods; we evaluate these methods by analyzing their computational complexity and their behavior with respect to certain relevant axiomatic properties.


Journal Article
TL;DR: This paper uses the generative adversarial imitation learning (GAIL) technique to coordinate the drones’ actions by directly imitating the peer’s demonstrations, and transforms historical observation-action trajectories into belief representations, which are trained in conjunction with the imitation policies.
Abstract: The proliferation of unmanned aerial vehicles (UAVs) has fostered various intelligent services, in which effective coordination plays a significant role in enhancing swarm execution efficiency. However, due to the unreliable communication in the air as well as the heterogeneity in operation mode, it is challenging to achieve highly coordinated actions, particularly in the fully distributed environment with incomplete observations. In this paper, we leverage the generative adversarial imitation learning (GAIL) technique to coordinate the drones’ actions by directly imitating the peer’s demonstrations. In order to characterize the true environment state under local incomplete observations, we transform historical observation-action trajectories into belief representations, which are trained in conjunction with the imitation policies. We also obtain regularized belief representations by correlating the prediction of future states, the trace of historical contexts, and the action-assisted guidance information, which contribute to more accurate imitation policies. We evaluate the proposed algorithm on the drones’ formation control scenario. Evaluation results show its superiority in imitation accuracy, teamwork execution time and energy cost.

Journal ArticleDOI
TL;DR: In this paper, the authors provide lexicographic probabilistic serial (LexiPS) as an extension of the PS mechanism for multi-type resource allocation with lexicographic preferences, and prove that LexiPS satisfies sd-efficiency and sd-envy-freeness.
Abstract: In multi-type resource allocation (MTRA) problems, there are $d \ge 2$ types of items, and n agents who each demand one unit of items of each type and have strict linear preferences over bundles consisting of one item of each type. For MTRAs with indivisible items, our first result is an impossibility theorem that is in direct contrast to the single type ($d=1$) setting: no mechanism, the output of which is always decomposable into a probability distribution over discrete assignments (where no item is split between agents), can satisfy both sd-efficiency and sd-envy-freeness. We show that this impossibility result is circumvented under the natural assumption of lexicographic preferences. We provide lexicographic probabilistic serial (LexiPS) as an extension of the probabilistic serial (PS) mechanism for MTRAs with lexicographic preferences, and prove that LexiPS satisfies sd-efficiency and sd-envy-freeness, retaining the desirable properties of PS. Moreover, LexiPS satisfies sd-weak-strategyproofness when agents are not allowed to misreport their importance orders. For MTRAs with divisible items, we show that the existing multi-type probabilistic serial (MPS) mechanism satisfies the stronger efficiency notion of lexi-efficiency, and is sd-envy-free under strict linear preferences and sd-weak-strategyproof under lexicographic preferences. We also prove that MPS can be characterized both by leximin-optimality and by item-wise ordinal fairness, and the family of eating algorithms which MPS belongs to can be characterized by lexi-efficiency.
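
The probabilistic serial (PS) mechanism that LexiPS and MPS extend is usually described as a simultaneous eating algorithm. The sketch below implements only that classic single-type version (one unit of each item, every agent eating at unit speed for one unit of time), not LexiPS or MPS; the function name and the three-agent example are illustrative assumptions.

```python
from fractions import Fraction


def probabilistic_serial(preferences):
    """Simultaneous eating for a single item type.

    preferences: one strict ranking (best first) per agent, all over the
        same items, each item having unit supply.
    Returns shares[agent][item] as exact fractions.
    """
    items = list(preferences[0])
    remaining = {item: Fraction(1) for item in items}
    shares = [{item: Fraction(0) for item in items} for _ in preferences]
    time_left = Fraction(1)
    while time_left > 0:
        # Each agent eats its favourite item that is not yet exhausted.
        targets = [
            next((i for i in ranking if remaining[i] > 0), None)
            for ranking in preferences
        ]
        eaters = {}
        for t in targets:
            if t is not None:
                eaters[t] = eaters.get(t, 0) + 1
        if not eaters:
            break  # everything has been eaten
        # Advance until some item runs out or the time budget ends.
        step = min([time_left] + [remaining[i] / n for i, n in eaters.items()])
        for agent, t in enumerate(targets):
            if t is not None:
                shares[agent][t] += step
                remaining[t] -= step
        time_left -= step
    return shares


# Hypothetical instance: three agents, three items.
shares = probabilistic_serial([
    ["a", "b", "c"],
    ["a", "c", "b"],
    ["b", "a", "c"],
])
```

In this instance the first two agents split item `a` during the first half of the time while the third eats `b`; each agent ends with total share 1 and each item is fully allocated, which is the random assignment that sd-efficiency and sd-envy-freeness are then stated about.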

Journal Article
TL;DR: In this article, the authors considered the problem of facility location on discrete trees, and proved that the trajectory of the facility is almost contained in the trajectories of the agent and both move in the same direction along the common edges.
Abstract: We address the problem of strategyproof (SP) facility location mechanisms on discrete trees. Our main result is a full characterization of onto and SP mechanisms. In particular, we prove that when a single agent significantly affects the outcome, the trajectory of the facility is almost contained in the trajectory of the agent, and both move in the same direction along the common edges. We show tight relations of our characterization to previous results on discrete lines and on continuous trees. We then derive further implications of the main result for infinite discrete lines.

Journal Article
TL;DR: In this article, a cloud-native multi-agent platform (cloneMAP) is proposed to enable scalability and fault-tolerance in the field of the Internet of Things (IoT).
Abstract: Multi-agent systems (MAS) represent a distributed computing paradigm well suited to tackle today’s challenges in the field of the Internet of Things (IoT). Both share many similarities such as the interconnection of distributed devices and their cooperation. The combination of MAS and IoT would allow the transfer of the experience gained in MAS research to the broader range of IoT applications. The key enabler for utilizing MAS in the IoT is the ability to build large-scale and fault-tolerant MASs since IoT concepts comprise possibly thousands or even millions of devices. However, well-known multi-agent platforms (MAP), e.g., Java Agent DEvelopment Framework (JADE), are not able to deal with these challenges. To this end, we present a cloud-native Multi-Agent Platform (cloneMAP) as a modern MAP based on cloud-computing techniques to enable scalability and fault-tolerance. A microservice architecture is used to implement it in a distributed way utilizing the open-source container orchestration system Kubernetes. Thereby, bottlenecks and single points of failure are conceptually avoided. A comparison with JADE via relevant performance metrics indicates the massively improved scalability. Furthermore, the implementation of a large-scale use case verifies cloneMAP’s suitability for IoT applications. This leads to the conclusion that cloneMAP extends the range of possible MAS applications and enables the integration with IoT concepts.

Journal Article
TL;DR: In this article, the authors formally analyze centralized and decentralized critic approaches to provide a deeper understanding of the implications of critic choice, and empirically compare the centralized and decentralized critic methods over a wide set of environments.
Abstract: Centralized Training for Decentralized Execution, where agents are trained offline using centralized information but execute in a decentralized manner online, has gained popularity in the multi-agent reinforcement learning community. In particular, actor-critic methods with a centralized critic and decentralized actors are a common instance of this idea. However, the implications of using a centralized critic in this context are not fully discussed and understood even though it is the standard choice of many algorithms. We therefore formally analyze centralized and decentralized critic approaches, providing a deeper understanding of the implications of critic choice. Because our theory makes unrealistic assumptions, we also empirically compare the centralized and decentralized critic methods over a wide set of environments to validate our theories and to provide practical advice. We show that there exist misconceptions regarding centralized critics in the current literature and show that the centralized critic design is not strictly beneficial, but rather both centralized and decentralized critics have different pros and cons that should be taken into account by algorithm designers.

Journal Article
TL;DR: In this article, the computational complexity of strategic voting for shortlisting based on the perhaps most basic voting rule in this scenario was studied, and it was shown that in an egalitarian setting, strategic voting may indeed be computationally intractable regardless of the tie-breaking rule.
Abstract: Shortlisting of candidates—selecting a group of “best” candidates—is a special case of multiwinner elections. We provide the first in-depth study of the computational complexity of strategic voting for shortlisting based on the perhaps most basic voting rule in this scenario, $$\ell $$ -Bloc (every voter approves $$\ell $$ candidates). In particular, we investigate the influence of several different group evaluation functions (e.g., egalitarian versus utilitarian) and tie-breaking mechanisms modeling pessimistic and optimistic manipulators. Among other things, we conclude that in an egalitarian setting strategic voting may indeed be computationally intractable regardless of the tie-breaking rule. Altogether, we provide a fairly comprehensive picture of the computational complexity landscape of this scenario.
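As a rough illustration of the voting rule named in this abstract, here is a minimal sketch of shortlisting under ℓ-Bloc: every voter approves exactly ℓ candidates and the k candidates with the most approvals are shortlisted. The function name `l_bloc_shortlist`, the shortlist size `k`, and the alphabetical tie-breaking are assumptions made for this example; the paper itself studies pessimistic and optimistic tie-breaking as well as strategic behaviour, none of which is modelled here.

```python
from collections import Counter


def l_bloc_shortlist(candidates, ballots, ell, k):
    """Shortlist k candidates under l-Bloc (illustrative sketch).

    candidates: iterable of candidate names.
    ballots: one collection of approved candidates per voter.
    ell: number of candidates every voter must approve.
    k: size of the shortlist.
    Ties are broken alphabetically, which is an assumption of this sketch.
    """
    candidates = list(candidates)
    scores = Counter({c: 0 for c in candidates})
    for ballot in ballots:
        approved = set(ballot)
        if len(approved) != ell or not approved <= set(candidates):
            raise ValueError(f"each ballot must approve exactly {ell} known candidates")
        scores.update(approved)

    # Highest approval score first; alphabetical order among equal scores.
    ranked = sorted(candidates, key=lambda c: (-scores[c], c))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical election: four voters, each approving two candidates.
    ballots = [("a", "b"), ("b", "c"), ("c", "a"), ("b", "d")]
    print(l_bloc_shortlist("abcd", ballots, ell=2, k=2))
```

A manipulation question of the kind the paper analyzes would ask whether some voters can change their ballots so that the returned shortlist improves for them under a chosen group evaluation function.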