Topic

Computer Go

About: Computer Go is a research topic. Over the lifetime, 267 publications have been published within this topic receiving 23873 citations. The topic is also known as: Computer Baduk & Computer Weiqi.

...read moreread less

Papers published on a yearly basis

Papers

PDF

Open Access

More filters

Journal Article•DOI•

Mastering the game of Go with deep neural networks and tree search

[...]

David Silver¹, Aja Huang¹, Chris J. Maddison¹, Arthur Guez¹, Laurent Sifre¹, George van den Driessche¹, Julian Schrittwieser¹, Ioannis Antonoglou¹, Veda Panneershelvam¹, Marc Lanctot¹, Sander Dieleman¹, Dominik Grewe¹, John Nham¹, Nal Kalchbrenner¹, Ilya Sutskever¹, Timothy P. Lillicrap¹, Madeleine Leach¹, Koray Kavukcuoglu¹, Thore Graepel¹, Demis Hassabis¹ - Show less +16 more•Institutions (1)

Google¹

28 Jan 2016-Nature

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0.5, the first time that a computer program has defeated a human professional player in the full-sized game of Go.

...read moreread less

Abstract: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of stateof-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

...read moreread less

14,377 citations

Journal Article•DOI•

Mastering the game of Go without human knowledge

[...]

David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy P. Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel, Demis Hassabis - Show less +13 more

19 Oct 2017-Nature

TL;DR: An algorithm based solely on reinforcement learning is introduced, without human data, guidance or domain knowledge beyond game rules, that achieves superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo.

...read moreread less

Abstract: A long-standing goal of artificial intelligence is an algorithm that learns, tabula rasa, superhuman proficiency in challenging domains. Recently, AlphaGo became the first program to defeat a world champion in the game of Go. The tree search in AlphaGo evaluated positions and selected moves using deep neural networks. These neural networks were trained by supervised learning from human expert moves, and by reinforcement learning from self-play. Here we introduce an algorithm based solely on reinforcement learning, without human data, guidance or domain knowledge beyond game rules. AlphaGo becomes its own teacher: a neural network is trained to predict AlphaGo’s own move selections and also the winner of AlphaGo’s games. This neural network improves the strength of the tree search, resulting in higher quality move selection and stronger self-play in the next iteration. Starting tabula rasa, our new program AlphaGo Zero achieved superhuman performance, winning 100–0 against the previously published, champion-defeating AlphaGo. Starting from zero knowledge and without human data, AlphaGo Zero was able to teach itself to play Go and to develop novel strategies that provide new insights into the oldest of games. To beat world champions at the game of Go, the computer program AlphaGo has relied largely on supervised learning from millions of human expert moves. David Silver and colleagues have now produced a system called AlphaGo Zero, which is based purely on reinforcement learning and learns solely from self-play. Starting from random moves, it can reach superhuman level in just a couple of days of training and five million games of self-play, and can now beat all previous versions of AlphaGo. Because the machine independently discovers the same fundamental principles of the game that took humans millennia to conceptualize, the work suggests that such principles have some universal character, beyond human bias.

...read moreread less

7,818 citations

Journal Article•DOI•

A Survey of Monte Carlo Tree Search Methods

[...]

Cameron Browne¹, Edward J. Powley², Daniel Whitehouse², Simon M. Lucas³, Peter I. Cowling², Philipp Rohlfshagen³, S. Tavener¹, Diego Perez³, Spyridon Samothrakis³, Simon Colton¹ - Show less +6 more•Institutions (3)

Imperial College London¹, University of Bradford², University of Essex³

03 Feb 2012-IEEE Transactions on Computational Intelligence and AI in Games

TL;DR: A survey of the literature to date of Monte Carlo tree search, intended to provide a snapshot of the state of the art after the first five years of MCTS research, outlines the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and nongame domains.

...read moreread less

Abstract: Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.

...read moreread less

2,682 citations

Journal Article•DOI•

Monte-Carlo tree search and rapid action value estimation in computer Go

[...]

Sylvain Gelly¹, David Silver²•Institutions (2)

University of Paris-Sud¹, University College London²

01 Jul 2011-Artificial Intelligence

TL;DR: The Monte-Carlo revolution in computer Go is surveyed, the key ideas that led to the success of MoGo and subsequent Go programs are outlined, and for the first time a comprehensive description, in theory and in practice, of this extended framework for Monte- Carlo tree search is provided.

...read moreread less

375 citations

Journal Article•DOI•

Computing “elo ratings” of move patterns in the game of go

[...]

Rémi Coulom

15 Jun 2007

TL;DR: A new Bayesian technique for supervised learning of move patterns from game records, based on a generalization of Elo ratings, which outperforms most previous pattern-learning algorithms, both in terms of mean log-evidence, and prediction rate.

...read moreread less

Abstract: Move patterns are an essential method to incorporate domain knowledge into Go-playing programs. This paper presents a new Bayesian technique for supervised learning of such patterns from game records, based on a generalization of Elo ratings. Each sample move in the training data is considered as a victory of a team of pattern features. Elo ratings of individual pattern features are computed from these victories, and can be used in previously unseen positions to compute a probability distribution over legal moves. In this approach, several pattern features may be combined, without an exponential cost in the number of features. Despite a very small number of training games (652), this algorithm outperforms most previous pattern-learning algorithms, both in terms of mean log-evidence (−2.69), and prediction rate (34.9%). A 19x19 Monte-Carlo program improved with these patterns reached the level of the strongest classical programs.

...read moreread less

316 citations

Collapse

Network Information

Performance

Metrics

267

Papers

31,204

Citations

No. of papers in the topic in previous years
Year	Papers
2021	3
2020	5
2019	4
2018	14
2017	15
2016	17

Computer Go

Papers published on a yearly basis

Papers

Trending Questions (2)

Network Information

Related Topics (5)

Performance

Metrics