Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Open Access, Posted Content

- 03 Mar 2016 -

TLDR

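This paper proposes a scalable end-to-end approach to learning approximate Nash equilibria in large-scale games of imperfect information without prior domain knowledge. The method, Neural Fictitious Self-Play (NFSP), combines fictitious self-play with deep reinforcement learning and is evaluated on Leduc poker and Limit Texas Holdem.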

Abstract:

Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
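
To make the fictitious-play idea behind the abstract concrete, the following is a minimal sketch of classical tabular fictitious play on rock-paper-scissors, a small two-player zero-sum game. It is not NFSP and not the authors' implementation: there is no neural network, no reinforcement learning, and no imperfect information. Each player simply best-responds to the opponent's empirical average strategy, and the averages drift toward the uniform Nash equilibrium. The names `fictitious_play`, `exploitability`, and `RPS` are hypothetical and chosen only for illustration.

```python
import numpy as np


def fictitious_play(payoff, iterations=20000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff when row plays i and column
    plays j; the column player receives the negative of it.
    Returns the empirical average strategies of both players.
    """
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary first move for each player (an assumption of this sketch).
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        row_avg = row_counts / row_counts.sum()
        col_avg = col_counts / col_counts.sum()
        # Each player best-responds to the opponent's average strategy.
        row_best = np.argmax(payoff @ col_avg)
        col_best = np.argmin(row_avg @ payoff)
        row_counts[row_best] += 1
        col_counts[col_best] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains; zero at a Nash equilibrium."""
    return np.max(payoff @ col_strategy) - np.min(row_strategy @ payoff)


# Rock-paper-scissors payoffs for the row player.
RPS = np.array([[0.0, -1.0, 1.0],
                [1.0, 0.0, -1.0],
                [-1.0, 1.0, 0.0]])

row_avg, col_avg = fictitious_play(RPS)
print("row average strategy:", row_avg)
print("column average strategy:", col_avg)
print("exploitability:", exploitability(RPS, row_avg, col_avg))
```

Per the abstract, NFSP combines this kind of self-play with deep reinforcement learning; the exact best response and the explicit action counts used here are only feasible because the example game is tiny.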

Citations

Journal Article

Mastering the game of Go without human knowledge

David Silver, +16 more

- 19 Oct 2017 -

Nature

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

Journal Article

A brief survey of deep reinforcement learning

Kai Arulkumaran, +3 more

- 09 Nov 2017 -

arXiv: Learning

TL;DR: This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.

Journal Article

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

Nguyen Cong Luong, +6 more

- 14 May 2019 -

IEEE Communications Surveys and Tutorial...

TL;DR: This paper presents a comprehensive literature review on applications of deep reinforcement learning (DRL) in communications and networking, and presents applications of DRL for traffic routing, resource sharing, and data collection.

Book Chapter

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

Kaiqing Zhang, +2 more

- 29 Apr 2021 -

arXiv: Learning

TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.

References

Posted Content

Continuous control with deep reinforcement learning

Timothy P. Lillicrap, +7 more

- 09 Sep 2015 -

arXiv: Learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.

Book

The theory of learning in games

Drew Fudenberg, +1 more

TL;DR: Fudenberg and Levine develop an alternative explanation of equilibrium: it arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time.

Journal Article

Some studies in machine learning using the game of checkers

A. L. Samuel

- 01 Jul 1959 -

IBM Journal of Research and Development

TL;DR: In this article, two machine learning procedures have been investigated in some detail using the game of checkers, and enough work has been done to verify the fact that a computer can be programmed so that it will lear...

Related Papers (5)

Proximal Policy Optimization Algorithms

John Schulman, +4 more

- 20 Jul 2017 -

arXiv: Learning

Asynchronous methods for deep reinforcement learning

Volodymyr Mnih, +7 more
