Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

Open Access, Posted Content

- 07 Jun 2017 -

TL;DR: This work presents an adaptation of actor-critic methods that considers the action policies of other agents and successfully learns policies that require complex multi-agent coordination.

Abstract:

We explore deep reinforcement learning methods for multi-agent domains. We begin by analyzing the difficulty of traditional algorithms in the multi-agent case: Q-learning is challenged by an inherent non-stationarity of the environment, while policy gradient suffers from a variance that increases as the number of agents grows. We then present an adaptation of actor-critic methods that considers action policies of other agents and is able to successfully learn policies that require complex multi-agent coordination. Additionally, we introduce a training regimen utilizing an ensemble of policies for each agent that leads to more robust multi-agent policies. We show the strength of our approach compared to existing methods in cooperative as well as competitive scenarios, where agent populations are able to discover various physical and informational coordination strategies.
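
To make the abstract's central idea concrete, the following is a minimal sketch of an actor-critic layout in which each agent's critic takes the other agents' actions into account. It is an illustration under stated assumptions, not the authors' implementation: the linear actors and critics, the names `Agent`, `act`, and `centralized_q`, and all dimensions are hypothetical. Each actor sees only its own observation, while each critic scores the joint observations and actions of all agents.

```python
import numpy as np

rng = np.random.default_rng(0)

N_AGENTS, OBS_DIM, ACT_DIM = 3, 4, 2  # hypothetical sizes, chosen for illustration


class Agent:
    """Linear actor plus a linear critic over the joint observation-action vector."""

    def __init__(self):
        # Actor weights: own observation -> own action.
        self.actor_w = rng.normal(size=(ACT_DIM, OBS_DIM))
        # Critic weights: all observations and all actions -> scalar value.
        joint_dim = N_AGENTS * (OBS_DIM + ACT_DIM)
        self.critic_w = rng.normal(size=joint_dim)

    def act(self, own_obs):
        # Decentralized: uses only this agent's own observation.
        return np.tanh(self.actor_w @ own_obs)

    def centralized_q(self, all_obs, all_actions):
        # Centralized: conditions on every agent's observation and action.
        joint = np.concatenate([*all_obs, *all_actions])
        return float(self.critic_w @ joint)


agents = [Agent() for _ in range(N_AGENTS)]
observations = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
actions = [agent.act(obs) for agent, obs in zip(agents, observations)]
q_values = [agent.centralized_q(observations, actions) for agent in agents]
```

Because every critic receives the other agents' actions as input, a change in another agent's behaviour appears to the critic as a changed input rather than as unexplained drift in the environment. This is one way to read how such a design addresses the non-stationarity the abstract mentions.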

Citations

Posted Content

Deep Reinforcement Learning: An Overview

Yuxi Li

- 25 Jan 2017 -

arXiv: Learning

TL;DR: This work discusses core RL elements, including value function (in particular, Deep Q-Network (DQN)), policy, reward, model, planning, and exploration, as well as important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.

Book Chapter

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Kaiqing Zhang, +2 more

- 29 Apr 2021 -

arXiv: Learning

TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.

Posted Content

Counterfactual Multi-Agent Policy Gradients

Jakob Foerster, +4 more

- 24 May 2017 -

arXiv: Artificial Intelligence

TL;DR: This work proposes a new multi-agent actor-critic method called counterfactual multi-agent (COMA) policy gradients, which uses a centralised critic to estimate the Q-function and decentralised actors to optimise the agents' policies.

Posted Content

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

Nguyen Cong Luong, +6 more

- 18 Oct 2018 -

arXiv: Networking and Internet Architect...

TL;DR: In this paper, a comprehensive literature review on applications of deep reinforcement learning in communications and networking is presented, which includes dynamic network access, data rate control, wireless caching, data offloading, network security, and connectivity preservation.

Posted Content

Mean Field Multi-Agent Reinforcement Learning

Yaodong Yang, +5 more

- 15 Feb 2018 -

arXiv: Multiagent Systems

TL;DR: In this paper, mean field Q-learning and mean field Actor-Critic algorithms are proposed and used to solve the Ising model via model-free reinforcement learning methods. The learning of each individual agent's optimal policy depends on the dynamics of the population, while those dynamics change according to the collective patterns of the individual policies.

References

Journal Article

Generative Adversarial Nets

Ian Goodfellow, +7 more

TL;DR: A new framework for estimating generative models via an adversarial process, in which two models are simultaneously trained: a generative model G that captures the data distribution and a discriminative model D that estimates the probability that a sample came from the training data rather than G.

Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Journal Article

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Ronald J. Williams

- 01 May 1992 -

Machine Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units. These algorithms are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.

Related Papers (5)

Continuous control with deep reinforcement learning

Timothy P. Lillicrap, +7 more

- 09 Sep 2015 -

arXiv: Learning

References

Generative Adversarial Nets

Reinforcement Learning: An Introduction

Human-level control through deep reinforcement learning

Mastering the game of Go with deep neural networks and tree search

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

Related Papers (5)

Human-level control through deep reinforcement learning

Continuous control with deep reinforcement learning

Reinforcement Learning: An Introduction

Asynchronous methods for deep reinforcement learning

Proximal Policy Optimization Algorithms