Open Access, Posted Content

Deep Reinforcement Learning from Self-Play in Imperfect-Information Games

Johannes Heinrich, +1 more
03 Mar 2016
TLDR
This paper proposes a scalable end-to-end approach to learning approximate Nash equilibria in large-scale games of imperfect information, without prior domain knowledge.
Abstract
Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Holdem, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
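
To make the idea of fictitious play concrete, here is a minimal sketch of the classical (tabular) version in a two-player zero-sum matrix game. It is not NFSP itself: the abstract describes replacing the exact best response and the exact average strategy with neural networks trained by reinforcement and supervised learning, whereas this sketch computes both exactly. The function names, the rock-paper-scissors payoff matrix, and the iteration count are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def fictitious_play(payoff, iterations=20000):
    """Classical fictitious play in a two-player zero-sum matrix game.

    payoff[i, j] is the row player's payoff; the column player gets -payoff[i, j].
    Returns the empirical (average) strategies of both players.
    """
    payoff = np.asarray(payoff, dtype=float)
    n_rows, n_cols = payoff.shape
    row_counts = np.zeros(n_rows)
    col_counts = np.zeros(n_cols)
    # Arbitrary first actions so that the averages are well defined.
    row_counts[0] += 1
    col_counts[0] += 1
    for _ in range(iterations):
        # Each player best-responds to the opponent's empirical action counts.
        row_action = np.argmax(payoff @ col_counts)
        col_action = np.argmin(row_counts @ payoff)
        row_counts[row_action] += 1
        col_counts[col_action] += 1
    return row_counts / row_counts.sum(), col_counts / col_counts.sum()


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response gains; zero exactly at a Nash equilibrium."""
    payoff = np.asarray(payoff, dtype=float)
    best_row_value = np.max(payoff @ col_strategy)
    best_col_value = np.min(row_strategy @ payoff)
    return best_row_value - best_col_value


# Rock-paper-scissors (illustrative): the unique equilibrium is uniform play.
RPS = np.array([[0, -1, 1],
                [1, 0, -1],
                [-1, 1, 0]])

row_avg, col_avg = fictitious_play(RPS)
gap = exploitability(RPS, row_avg, col_avg)
```

In this sketch the average strategies, not the individual best responses, are what approach equilibrium, which is why the abstract's method needs a component that learns the average of past behaviour alongside the reinforcement learner.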


Citations
Journal Article

Mastering the game of Go without human knowledge

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Journal Article

A brief survey of deep reinforcement learning

TL;DR: This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.
Journal Article

Applications of Deep Reinforcement Learning in Communications and Networking: A Survey

TL;DR: This paper presents a comprehensive literature review on applications of deep reinforcement learning (DRL) in communications and networking, and presents applications of DRL for traffic routing, resource sharing, and data collection.
Book Chapter

Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms

TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
References
Journal Article

Human-level control through deep reinforcement learning

TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Journal Article

Mastering the game of Go with deep neural networks and tree search

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Posted Content

Continuous control with deep reinforcement learning

TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Book

The theory of learning in games

TL;DR: Fudenberg and Levine develop an alternative explanation: that equilibrium arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time.
Journal Article

Some studies in machine learning using the game of checkers

TL;DR: In this article, two machine learning procedures have been investigated in some detail using the game of checkers, and enough work has been done to verify the fact that a computer can be programmed so that it will lear...