Open Access Posted Content

Mean Field Multi-Agent Reinforcement Learning

TL;DR
Mean field Q-learning and mean field Actor-Critic algorithms are proposed, and the Ising model is solved via model-free reinforcement learning for the first time. The learning of each individual agent's optimal policy depends on the dynamics of the population, while the dynamics of the population change according to the collective patterns of the individual policies.
Abstract
Existing multi-agent reinforcement learning methods are typically limited to a small number of agents. When the number of agents grows large, learning becomes intractable due to the curse of dimensionality and the exponential growth of agent interactions. In this paper, we present Mean Field Reinforcement Learning, where the interactions within the population of agents are approximated by those between a single agent and the average effect of the overall population or neighboring agents. The interplay between the two entities is mutually reinforcing: the learning of the individual agent's optimal policy depends on the dynamics of the population, while the dynamics of the population change according to the collective patterns of the individual policies. We develop practical mean field Q-learning and mean field Actor-Critic algorithms and analyze the convergence of the solution to a Nash equilibrium. Experiments on Gaussian squeeze, the Ising model, and battle games demonstrate the learning effectiveness of our mean field approaches. In addition, we report the first result of solving the Ising model via model-free reinforcement learning methods.
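As a minimal sketch of the mean field Q-learning idea described above (a tabular toy under assumed notation, not the authors' implementation; the hyperparameters, the action-binning scheme, and every name below are illustrative):

import numpy as np

# Illustrative sketch of mean field Q-learning (MF-Q), not the authors' code.
# Each agent keeps Q(s, a, a_bar): its own action a plus the (binned) mean
# action a_bar of its neighbours, instead of the exponential joint action.
n_states, n_actions = 20, 5
n_bins = n_actions                   # discretisation bins for the mean action
alpha, gamma, beta = 0.1, 0.95, 1.0  # learning rate, discount, Boltzmann temperature

Q = np.zeros((n_states, n_actions, n_bins))

def boltzmann_policy(s, a_bar):
    """Softmax over Q(s, ., a_bar); higher beta means greedier play."""
    logits = beta * Q[s, :, a_bar]
    p = np.exp(logits - logits.max())
    return p / p.sum()

def mean_action_bin(neighbour_actions):
    """Bin the neighbours' mean action index so Q stays tabular."""
    return int(round(float(np.mean(neighbour_actions))))

def mfq_update(s, a, a_bar, r, s2, a_bar2):
    """One backup: bootstrap with the Boltzmann-weighted value at s2."""
    v2 = boltzmann_policy(s2, a_bar2) @ Q[s2, :, a_bar2]
    Q[s, a, a_bar] += alpha * (r + gamma * v2 - Q[s, a, a_bar])

In the full algorithm every agent runs this backup in parallel, and the empirical mean action and the Boltzmann policies are re-computed from each other until consistent; the sketch shows only the single-agent update.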


Citations
Posted Content

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

TL;DR: In this article, the authors propose a value-based method that can train decentralised policies in a centralised, end-to-end fashion, exploiting simulated or laboratory settings in which global state information is available and communication constraints are lifted.
Book Chapter

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

TL;DR: This chapter reviews theoretical results on MARL algorithms, mainly within two representative frameworks, Markov/stochastic games and extensive-form games, organised by the types of tasks they address: fully cooperative, fully competitive, and a mix of the two.
Proceedings Article

QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning

TL;DR: QMIX employs a network that estimates joint action-values as a complex non-linear combination of per-agent values that condition only on local observations, and structurally enforces that the joint action-value is monotonic in the per-agent values, which allows tractable maximisation of the joint action-value in off-policy learning.
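The tractable-maximisation claim rests on a simple monotonicity argument; as a sketch under assumed notation (not quoted from the paper): if the mixing network guarantees $\partial Q_{tot} / \partial Q_i \ge 0$ for every agent $i$, then the global argmax over the joint action decomposes into independent per-agent argmaxes,

$\arg\max_{\mathbf{u}} Q_{tot}(\boldsymbol{\tau}, \mathbf{u}) = \big( \arg\max_{u_1} Q_1(\tau_1, u_1), \ldots, \arg\max_{u_n} Q_n(\tau_n, u_n) \big),$

so greedy action selection costs $n \cdot |U|$ evaluations rather than $|U|^n$.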
Proceedings Article

QTRAN: Learning to Factorize with Transformation for Cooperative Multi-Agent Reinforcement Learning.

TL;DR: This article explores value-based solutions for multi-agent reinforcement learning (MARL) tasks under the recently popularized centralized training with decentralized execution (CTDE) regime.
Proceedings Article

The StarCraft Multi-Agent Challenge

TL;DR: The StarCraft Multi-Agent Challenge (SMAC), based on the popular real-time strategy game StarCraft II, is proposed as a benchmark problem, and an open-source deep multi-agent RL framework including state-of-the-art algorithms is released.
References
Book

Phase Transitions and Critical Phenomena

TL;DR: The field of phase transitions and critical phenomena continues to be an active research area, producing a steady stream of interesting and fruitful results; the major aim of this serial is to provide review articles that can serve as standard references for researchers in the field.
Book

Introductory functional analysis with applications

TL;DR: This book presents the spectral theory of linear operators on normed spaces, including bounded self-adjoint linear operators, their spectra, and applications in quantum mechanics.
Book Chapter

Markov games as a framework for multi-agent reinforcement learning

TL;DR: A Q-learning-like algorithm for finding optimal policies is presented and demonstrated on a simple two-player game in which the optimal policy is probabilistic.
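The algorithm summarised here is presumably Littman's minimax-Q; as a sketch under assumed notation (not quoted from the chapter), the max of ordinary Q-learning is replaced by a maximin over the opponent's action $o$:

$V(s) = \max_{\pi \in \Delta(A)} \min_{o \in O} \sum_{a \in A} \pi(a)\, Q(s, a, o), \qquad Q(s, a, o) \leftarrow (1 - \alpha)\, Q(s, a, o) + \alpha \big[ r + \gamma V(s') \big].$

The inner maximin is a small linear program, which is why the learned optimal policy can be probabilistic (a mixed strategy), as in the two-player game mentioned above.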
Journal Article

Stochastic Games

TL;DR: In a stochastic game the play proceeds by steps from position to position, according to transition probabilities controlled jointly by the two players; the expected total gain or loss is bounded by M, which depends on the game's N² + N matrices.