Open Access · Proceedings Article
Value-Decomposition Networks For Cooperative Multi-Agent Learning Based On Team Reward
Peter Sunehag, Guy Lever, Audrunas Gruslys, Wojciech Marian Czarnecki, Vinicius Zambaldi, Max Jaderberg, Marc Lanctot, Nicolas Sonnerat, Joel Z. Leibo, Karl Tuyls, Thore Graepel
pp. 2085–2087
TL;DR: This work addresses the problem of cooperative multi-agent reinforcement learning with a single joint reward signal by training individual agents with a novel value-decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
Abstract:
We study the problem of cooperative multi-agent reinforcement learning with a single joint reward signal. This class of learning problems is difficult because of the often large combined action and observation spaces. In the fully centralized and decentralized approaches, we find the problem of spurious rewards and a phenomenon we call the "lazy agent" problem, which arises due to partial observability. We address these problems by training individual agents with a novel value-decomposition network architecture, which learns to decompose the team value function into agent-wise value functions.
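The abstract's core idea, decomposing the team value into a sum of agent-wise values, can be sketched numerically. The per-agent Q-values below are made-up numbers for illustration, not from the paper; the sketch only shows the additivity assumption Q_tot(s, a) = Σ_i Q_i(o_i, a_i) and why it permits decentralized greedy action selection.

```python
import numpy as np

# Hypothetical per-agent Q-values for one observation (2 agents, 3 actions each).
q_agent = np.array([
    [1.0, 2.0, 0.5],   # agent 1's Q-values over its actions
    [0.2, 1.5, 3.0],   # agent 2's Q-values over its actions
])

def vdn_joint_q(q_agent, joint_action):
    # VDN's additivity assumption: the team value is the sum of
    # agent-wise values for the chosen joint action.
    return sum(q[a] for q, a in zip(q_agent, joint_action))

# Decentralized execution: each agent greedily maximizes its own Q-value.
# Because the joint value is a sum, this also maximizes the team value.
greedy = tuple(int(np.argmax(q)) for q in q_agent)
print(greedy)                         # (1, 2)
print(vdn_joint_q(q_agent, greedy))   # 5.0
```

In the actual architecture each Q_i is a network trained by backpropagating the joint TD error through the sum; the sketch replaces those networks with fixed arrays to isolate the decomposition itself.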
Citations
Posted Content
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning
Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson
TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised end-to-end fashion in simulated or laboratory settings, where global state information is available and communication constraints are lifted.
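QMIX generalizes the additive decomposition to any monotonic mixing of agent Q-values, with mixing weights produced by a state-conditioned hypernetwork. The sketch below is a simplification under assumed values: a single linear mixing layer with hand-picked weights stands in for the hypernetwork, and monotonicity is enforced the way QMIX does it, by taking absolute values of the weights so that ∂Q_tot/∂Q_i ≥ 0.

```python
import numpy as np

# Hypothetical raw mixing weights (in QMIX these come from a hypernetwork
# conditioned on the global state, available during centralized training).
raw_w = np.array([-0.5, 1.2])
w = np.abs(raw_w)   # non-negative weights guarantee dQ_tot/dQ_i >= 0
b = 0.1             # state-dependent bias, here a fixed constant

def qmix_joint_q(q_values):
    # Monotonic mixing: a non-negative weighted combination means each
    # agent's greedy action on its own Q-value is also greedy for Q_tot
    # (the Individual-Global-Max consistency the TL;DR refers to).
    return float(w @ q_values + b)

print(qmix_joint_q(np.array([2.0, 3.0])))  # 4.7
```

Raising any single agent's Q-value can never lower the mixed joint value, which is exactly the property that lets decentralized argmax execution remain consistent with the centralized value function.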
Book ChapterDOI
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
Journal ArticleDOI
A survey and critique of multiagent deep reinforcement learning
TL;DR: In this paper, the authors provide a clear overview of current multi-agent deep reinforcement learning (MDRL) literature, and provide general guidelines to new practitioners in the area: describing lessons learned from MDRL works, pointing to recent benchmarks, and outlining open avenues of research.
Journal ArticleDOI
Multi-agent deep reinforcement learning: a survey
Sven Gronauer, Klaus Diepold
TL;DR: This article provides an overview of current developments in the field of multi-agent deep reinforcement learning, focusing primarily on recent literature that combines deep reinforcement learning methods with a multi-agent scenario.
Proceedings Article
QPLEX: Duplex Dueling Multi-Agent Q-Learning
TL;DR: A novel MARL approach, called duplex dueling multi-agent Q-learning (QPLEX), which takes a duplex dueling network architecture to factorize the joint value function and encodes the IGM principle into the neural network architecture, thus enabling efficient value function learning.
References
Journal ArticleDOI
A Comprehensive Survey of Multiagent Reinforcement Learning
TL;DR: The benefits and challenges of MARL are described along with some of the problem domains where the MARL techniques have been applied, and an outlook for the field is provided.
Journal ArticleDOI
Cooperative Multi-Agent Learning: The State of the Art
Liviu Panait, Sean Luke
TL;DR: This survey attempts to draw from multi-agent learning work in a spectrum of areas, including RL, evolutionary computation, game theory, complex systems, agent modeling, and robotics, and finds that this broad view leads to a division of the work into two categories.
Proceedings Article
The dynamics of reinforcement learning in cooperative multiagent systems
Caroline Claus, Craig Boutilier
TL;DR: This work distinguishes reinforcement learners that are unaware of (or ignore) the presence of other agents from those that explicitly attempt to learn the value of joint actions and the strategies of their counterparts, and proposes alternative optimistic exploration strategies that increase the likelihood of convergence to an optimal equilibrium.
Proceedings Article
The complexity of decentralized control of Markov decision processes
TL;DR: In this paper, the authors considered the problem of planning for distributed agents with partial state information from a decision-theoretic perspective, and provided mathematical evidence corresponding to the intuition that decentralized planning problems cannot easily be reduced to centralized problems and solved exactly using established techniques.
Book
A Concise Introduction to Decentralized POMDPs
TL;DR: This book introduces multiagent planning under uncertainty as formalized by decentralized partially observable Markov decision processes (Dec-POMDPs).