Open AccessProceedings Article
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid,Mikayel Samvelyan,Christian Schroeder,Gregory Farquhar,Jakob Foerster,Shimon Whiteson +5 more
- pp 4292-4301
TLDR
QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforce that the joint-action value is monotonic in the per- agent values, which allows tractable maximisation of the jointaction-value in off-policy learning.About:
This article is published in International Conference on Machine Learning.The article was published on 2018-07-03 and is currently open access. It has received 505 citations till now. The article focuses on the topics: Reinforcement learning & Monotonic function.read more
Citations
More filters
Posted Content
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid,Mikayel Samvelyan,Christian Schroeder de Witt,Gregory Farquhar,Jakob Foerster,Shimon Whiteson +5 more
TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised end-to-end fashion in simulated or laboratory settings, where global state information is available and communication constraints are lifted.
Book ChapterDOI
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
Proceedings Article
Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward
Peter Sunehag,Guy Lever,Audrunas Gruslys,Wojciech Marian Czarnecki,Vinicius Zambaldi,Max Jaderberg,Marc Lanctot,Nicolas Sonnerat,Joel Z. Leibo,Karl Tuyls,Thore Graepel +10 more
TL;DR: This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
Journal ArticleDOI
Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications
TL;DR: A survey of different approaches to problems related to multiagent deep RL (MADRL) is presented, including nonstationarity, partial observability, continuous state and action spaces, multiagent training schemes, and multiagent transfer learning.
Proceedings Article
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Shariq Iqbal,Fei Sha +1 more
TL;DR: This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep, which enables more effective and scalable learning in complex multi- agent environments, when compared to recent approaches.
References
More filters
Journal ArticleDOI
Grandmaster level in StarCraft II using multi-agent reinforcement learning.
Oriol Vinyals,Igor Babuschkin,Wojciech Marian Czarnecki,Michael Mathieu,Andrew Dudzik,Junyoung Chung,David H. Choi,Richard E. Powell,Timo Ewalds,Petko Georgiev,Junhyuk Oh,Dan Horgan,Manuel Kroiss,Ivo Danihelka,Aja Huang,Laurent Sifre,Trevor Cai,John P. Agapiou,Max Jaderberg,Alexander Vezhnevets,Rémi Leblond,Tobias Pohlen,Valentin Dalibard,David Budden,Yury Sulsky,James Molloy,Tom Le Paine,Caglar Gulcehre,Ziyu Wang,Tobias Pfaff,Yuhuai Wu,Roman Ring,Dani Yogatama,Dario Wünsch,Katrina McKinney,Oliver Smith,Tom Schaul,Timothy P. Lillicrap,Koray Kavukcuoglu,Demis Hassabis,Chris Apps,David Silver +41 more
TL;DR: The agent, AlphaStar, is evaluated, which uses a multi-agent reinforcement learning algorithm and has reached Grandmaster level, ranking among the top 0.2% of human players for the real-time strategy game StarCraft II.
Journal ArticleDOI
The arcade learning environment: an evaluation platform for general agents
TL;DR: The Arcade Learning Environment (ALE) as discussed by the authors is a platform for evaluating the development of general, domain-independent AI technology, which provides an interface to hundreds of Atari 2600 game environments, each one different, interesting, and designed to be a challenge for human players.
Journal ArticleDOI
A Comprehensive Survey of Multiagent Reinforcement Learning
TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
Posted Content
An Overview of Recent Progress in the Study of Distributed Multi-agent Coordination
TL;DR: In this paper, the authors reviewed some main results and progress in distributed multi-agent coordination, focusing on papers published in major control systems and robotics journals since 2006, and proposed several promising research directions along with some open problems that are deemed important for further investigations.
Proceedings Article
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
TL;DR: In this article, an actor-critic method was used to learn multi-agent coordination policies in cooperative and competitive multi-player RL games, where agent populations are able to discover various physical and informational coordination strategies.