Open Access, Posted Content
Deep Reinforcement Learning from Self-Play in Imperfect-Information Games
Johannes Heinrich, David Silver
TL;DR: This paper proposes a scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge, aimed at large-scale games of imperfect information.
Abstract:
Many real-world applications can be described as large-scale games of imperfect information. To deal with these challenging domains, prior work has focused on computing Nash equilibria in a handcrafted abstraction of the domain. In this paper we introduce the first scalable end-to-end approach to learning approximate Nash equilibria without prior domain knowledge. Our method combines fictitious self-play with deep reinforcement learning. When applied to Leduc poker, Neural Fictitious Self-Play (NFSP) approached a Nash equilibrium, whereas common reinforcement learning methods diverged. In Limit Texas Hold'em, a poker game of real-world scale, NFSP learnt a strategy that approached the performance of state-of-the-art, superhuman algorithms based on significant domain expertise.
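The abstract names fictitious self-play as one of the two ingredients of NFSP. As a rough illustration of that ingredient only, the sketch below runs classical tabular fictitious play on rock-paper-scissors: each player repeatedly best-responds to the opponent's empirical action frequencies, and in a two-player zero-sum game those frequencies are known to approach a Nash equilibrium. This is not NFSP itself and not the authors' code; there is no neural network and no imperfect information here. The function names (`best_response`, `fictitious_play`, `exploitability`), the iteration count, and the choice of game are all assumptions made for the example.

```python
def best_response(payoff_rows, opponent_counts):
    """Return the action index with the highest payoff against the
    opponent's empirical action counts (ties go to the lowest index)."""
    values = [sum(p * c for p, c in zip(row, opponent_counts))
              for row in payoff_rows]
    return max(range(len(values)), key=values.__getitem__)


def normalize(counts):
    """Turn action counts into an empirical mixed strategy."""
    total = sum(counts)
    return [c / total for c in counts]


def fictitious_play(payoff, iterations=20000):
    """Classical fictitious play for a two-player zero-sum matrix game.

    payoff[i][j] is the row player's payoff; the column player
    receives -payoff[i][j].
    """
    n_rows, n_cols = len(payoff), len(payoff[0])
    # Payoff matrix seen from the column player's side.
    col_payoff = [[-payoff[i][j] for i in range(n_rows)]
                  for j in range(n_cols)]
    # Arbitrary initial belief: each player has played action 0 once.
    row_counts = [1] + [0] * (n_rows - 1)
    col_counts = [1] + [0] * (n_cols - 1)
    for _ in range(iterations):
        # Both players best-respond simultaneously to the other's history.
        row_action = best_response(payoff, col_counts)
        col_action = best_response(col_payoff, row_counts)
        row_counts[row_action] += 1
        col_counts[col_action] += 1
    return normalize(row_counts), normalize(col_counts)


def exploitability(payoff, row_strategy, col_strategy):
    """Sum of both players' best-response payoffs against the other's
    strategy; it is zero exactly at a Nash equilibrium of a zero-sum game."""
    n_rows, n_cols = len(payoff), len(payoff[0])
    best_row = max(sum(payoff[i][j] * col_strategy[j] for j in range(n_cols))
                   for i in range(n_rows))
    best_col = max(sum(-payoff[i][j] * row_strategy[i] for i in range(n_rows))
                   for j in range(n_cols))
    return best_row + best_col


# Rock-paper-scissors payoffs for the row player (rows/cols: rock, paper, scissors).
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]

if __name__ == "__main__":
    row, col = fictitious_play(RPS)
    print("row average strategy:", row)
    print("column average strategy:", col)
    print("exploitability:", exploitability(RPS, row, col))
```

The average strategies should drift toward the uniform mix over the three actions, and the exploitability toward zero. Per the abstract, NFSP combines this kind of self-play with deep reinforcement learning so that the approach can scale beyond small tabular games.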
Citations
Journal Article
Mastering the game of Go without human knowledge
David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy P. Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis
TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.
Journal Article
A brief survey of deep reinforcement learning
TL;DR: This survey will cover central algorithms in deep RL, including the deep Q-network (DQN), trust region policy optimization (TRPO), and asynchronous advantage actor critic, and highlight the unique advantages of deep neural networks, focusing on visual understanding via RL.
Journal Article
Applications of Deep Reinforcement Learning in Communications and Networking: A Survey
Nguyen Cong Luong, Dinh Thai Hoang, Shimin Gong, Dusit Niyato, Ping Wang, Ying-Chang Liang, Dong In Kim
TL;DR: This paper presents a comprehensive literature review of deep reinforcement learning (DRL) in communications and networking, covering applications of DRL to traffic routing, resource sharing, and data collection.
Posted Content
Dota 2 with Large Scale Deep Reinforcement Learning
Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemyslaw Debiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Christopher Hesse, Rafal Jozefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique Ponde de Oliveira Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, Susan Zhang
TL;DR: By defeating the Dota 2 world champion (Team OG), OpenAI Five demonstrates that self-play reinforcement learning can achieve superhuman performance on a difficult task.
Book Chapter
Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms
TL;DR: This chapter reviews the theoretical results of MARL algorithms mainly within two representative frameworks, Markov/stochastic games and extensive-form games, in accordance with the types of tasks they address, i.e., fully cooperative, fully competitive, and a mix of the two.
References
Journal Article
Human-level control through deep reinforcement learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
Journal Article
Mastering the game of Go with deep neural networks and tree search
David Silver, Aja Huang, Chris J. Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, Sander Dieleman, Dominik Grewe, John Nham, Nal Kalchbrenner, Ilya Sutskever, Timothy P. Lillicrap, Madeleine Leach, Koray Kavukcuoglu, Thore Graepel, Demis Hassabis
TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Posted Content
Continuous control with deep reinforcement learning
Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
TL;DR: This work presents an actor-critic, model-free algorithm based on the deterministic policy gradient that can operate over continuous action spaces, and demonstrates that for many of the tasks the algorithm can learn policies end-to-end: directly from raw pixel inputs.
Book
The theory of learning in games
Drew Fudenberg, David K. Levine
TL;DR: Fudenberg and Levine develop an alternative explanation: that equilibrium arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time.
Journal Article
Some studies in machine learning using the game of checkers
TL;DR: In this article, two machine learning procedures have been investigated in some detail using the game of checkers, and enough work has been done to verify the fact that a computer can be programmed so that it will lear...