Open Access Proceedings Article
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid, Mikayel Samvelyan, Christian Schroeder, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
- pp 4292-4301
TLDR
QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
About:
This article was published in the International Conference on Machine Learning on 2018-07-03 and is currently open access. It has received 505 citations to date. The article focuses on the topics: Reinforcement learning & Monotonic function.
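The monotonicity idea in the TLDR can be illustrated with a small sketch. Assuming the mixing weights are kept non-negative (here via an absolute value) and the hidden nonlinearity is increasing, the joint value Q_tot cannot decrease when any per-agent value increases, so each agent's individual greedy action also maximises Q_tot. Everything below is a hypothetical toy for illustration only: the class name `MonotonicMixer`, the single-linear-layer "hypernetworks", the random weights, and the sizes are assumptions, not the authors' implementation.

```python
import itertools
import numpy as np

def elu(x):
    # Increasing nonlinearity, so it preserves monotonicity.
    return np.where(x > 0, x, np.expm1(np.minimum(x, 0.0)))

class MonotonicMixer:
    """Toy state-conditioned mixer with non-negative mixing weights (illustrative)."""

    def __init__(self, n_agents, state_dim, embed_dim=8, seed=0):
        rng = np.random.default_rng(seed)
        self.n_agents, self.embed_dim = n_agents, embed_dim
        # Assumption: each "hypernetwork" is a single random linear map from the state.
        self.hyper_w1 = rng.normal(size=(state_dim, n_agents * embed_dim))
        self.hyper_b1 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_w2 = rng.normal(size=(state_dim, embed_dim))
        self.hyper_b2 = rng.normal(size=(state_dim, 1))

    def __call__(self, agent_qs, state):
        # abs() keeps the mixing weights non-negative; biases are unconstrained.
        w1 = np.abs(state @ self.hyper_w1).reshape(self.n_agents, self.embed_dim)
        b1 = state @ self.hyper_b1
        hidden = elu(agent_qs @ w1 + b1)
        w2 = np.abs(state @ self.hyper_w2)
        b2 = state @ self.hyper_b2
        return float(hidden @ w2 + b2[0])

rng = np.random.default_rng(1)
n_agents, n_actions, state_dim = 3, 4, 5
mixer = MonotonicMixer(n_agents, state_dim)
state = rng.normal(size=state_dim)
per_agent_q = rng.normal(size=(n_agents, n_actions))  # stand-in for learned per-agent values

def q_tot(joint_action):
    # Pick each agent's value for its chosen action, then mix.
    chosen = per_agent_q[np.arange(n_agents), list(joint_action)]
    return mixer(chosen, state)

# Decentralised greedy choice: every agent maximises its own value.
greedy = tuple(int(a) for a in per_agent_q.argmax(axis=1))
# Brute-force maximum over all joint actions, for comparison.
best = max(q_tot(a) for a in itertools.product(range(n_actions), repeat=n_agents))
assert np.isclose(q_tot(greedy), best)
```

The brute-force search enumerates every joint action, which grows exponentially with the number of agents; under the monotonicity constraint the same maximum is reached with one independent argmax per agent, which is what makes the maximisation tractable.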
Citations
Posted Content
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised end-to-end fashion in simulated or laboratory settings, where global state information is available and communication constraints are lifted.
Book Chapter
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
Proceedings Article
Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward
Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel
TL;DR: This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
Journal Article
Deep Reinforcement Learning for Multiagent Systems: A Review of Challenges, Solutions, and Applications
TL;DR: A survey of different approaches to problems related to multiagent deep RL (MADRL) is presented, including nonstationarity, partial observability, continuous state and action spaces, multiagent training schemes, and multiagent transfer learning.
Proceedings Article
Actor-Attention-Critic for Multi-Agent Reinforcement Learning
Shariq Iqbal, Fei Sha
TL;DR: This work presents an actor-critic algorithm that trains decentralized policies in multi-agent settings, using centrally computed critics that share an attention mechanism which selects relevant information for each agent at every timestep. This enables more effective and scalable learning in complex multi-agent environments when compared to recent approaches.
References
Proceedings Article
QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning.
TL;DR: This article explores value-based solutions for multi-agent reinforcement learning (MARL) tasks in the recently popularized centralized training with decentralized execution (CTDE) regime.
Posted Content
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba
TL;DR: A suite of challenging continuous control tasks (integrated with OpenAI Gym), based on currently existing robotics hardware and following a Multi-Goal Reinforcement Learning (RL) framework, is introduced.
Proceedings Article
Deep decentralized multi-task multi-agent reinforcement learning under partial observability
TL;DR: A decentralized single-task learning approach that is robust to concurrent interactions of teammates is introduced, and an approach for distilling single-task policies into a unified policy that performs well across multiple related tasks, without explicit provision of task identity, is presented.
Proceedings Article
The StarCraft Multi-Agent Challenge
Mikayel Samvelyan, Tabish Rashid, Christian Schroeder de Witt, Gregory Farquhar, Nantas Nardelli, Tim G. J. Rudner, Chia-Man Hung, Philip H. S. Torr, Jakob Foerster, Shimon Whiteson
TL;DR: The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
Journal Article
Collaborative Multiagent Reinforcement Learning by Payoff Propagation
Jelle R. Kok, Nikos Vlassis
TL;DR: This work presents a set of scalable techniques for learning the behavior of a group of agents in a collaborative multiagent setting, using the coordination-graph framework of Guestrin, Koller, and Parr (2002a). It introduces different model-free reinforcement-learning techniques, collectively called Sparse Cooperative Q-learning, which approximate the global action-value function based on the topology of a coordination graph.