PACHI: State of the Art Open Source Go Program

doi:10.1007/978-3-642-31866-5_3

Book Chapter•DOI•

Petr Baudiš¹, Jean-loup Gailly¹•Institutions (1)

Charles University in Prague¹

20 Nov 2011-pp 24-38

TL;DR: A state of the art implementation of the Monte Carlo Tree Search algorithm for the game of Go and three notable original improvements: an adaptive time control algorithm, dynamic komi, and the usage of the criticality statistic are described.

Abstract: We present a state of the art implementation of the Monte Carlo Tree Search algorithm for the game of Go. Our Pachi software is currently one of the strongest open source Go programs, competing at the top level with other programs and playing evenly against advanced human players. We describe our implementation and choice of published algorithms as well as three notable original improvements: (1) an adaptive time control algorithm, (2) dynamic komi, and (3) the usage of the criticality statistic. We also present new methods to achieve efficient scaling both in terms of multiple threads and multiple machines in a cluster.

Citations

Journal Article•DOI•

Mastering the game of Go with deep neural networks and tree search

David Silver¹, Aja Huang¹, Chris J. Maddison¹, Arthur Guez¹, Laurent Sifre¹, George van den Driessche¹, Julian Schrittwieser¹, Ioannis Antonoglou¹, Veda Panneershelvam¹, Marc Lanctot¹, Sander Dieleman¹, Dominik Grewe¹, John Nham¹, Nal Kalchbrenner¹, Ilya Sutskever¹, Timothy P. Lillicrap¹, Madeleine Leach¹, Koray Kavukcuoglu¹, Thore Graepel¹, Demis Hassabis¹•Institutions (1)

Google¹

28 Jan 2016-Nature

TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.

Abstract: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

14,377 citations

Journal Article•DOI•

Building machines that learn and think like people.

Brenden M. Lake¹, Tomer Ullman², Joshua B. Tenenbaum², Samuel J. Gershman³•Institutions (3)

New York University¹, Massachusetts Institute of Technology², Harvard University³

01 Jan 2017-Behavioral and Brain Sciences

TL;DR: In this article, a review of recent progress in cognitive science suggests that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it.

Abstract: Recent progress in artificial intelligence has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats that of humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it. Specifically, we argue that these machines should (1) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (2) ground learning in intuitive theories of physics and psychology to support and enrich the knowledge that is learned; and (3) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes toward these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

2,010 citations

Journal Article•

Parallel Monte-Carlo tree search

Guillaume Chaslot¹, Mark H. M. Winands¹, H. Jaap van den Herik¹•Institutions (1)

Maastricht University¹

01 Jan 2008-Lecture Notes in Computer Science

TL;DR: This paper discusses three parallelization methods for Monte-Carlo Tree Search (MCTS): leaf parallelization, root parallelization, and tree parallelization. To be effective, tree parallelization requires two techniques: adequate handling of local mutexes and virtual loss.

Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search method that started a revolution in the field of Computer Go. Parallelizing MCTS is an important way to increase the strength of any Go program. In this article, we discuss three parallelization methods for MCTS: leaf parallelization, root parallelization, and tree parallelization. To be effective, tree parallelization requires two techniques: adequate handling of (1) local mutexes and (2) virtual loss. Experiments in 13×13 Go reveal that in the program Mango root parallelization may lead to the best results for a specific time setting and specific program parameters. However, as soon as the selection mechanism is able to handle more adequately the balance of exploitation and exploration, tree parallelization should have attention too and could become a second choice for parallelizing MCTS. Preliminary experiments on the smaller 9×9 board provide promising prospects for tree parallelization.
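
The virtual-loss technique named in this abstract can be sketched as follows. This is a minimal single-threaded illustration under stated assumptions, not the paper's or Mango's implementation: the `Node` class and both function names are hypothetical, results are counted from one player's perspective for brevity, and a real multi-threaded search would also need locks or atomic updates.

```python
class Node:
    """Hypothetical tree node holding simulation statistics."""

    def __init__(self, wins=0.0, visits=0):
        self.wins = wins
        self.visits = visits
        self.virtual_losses = 0  # in-flight simulations counted as losses

    def win_rate(self):
        # Pending simulations are treated as lost games, which lowers the
        # apparent value of the node while another thread is exploring it.
        n = self.visits + self.virtual_losses
        return self.wins / n if n else 0.0


def apply_virtual_loss(path):
    """Mark every node on the descent path as having one pending loss."""
    for node in path:
        node.virtual_losses += 1


def backpropagate(path, result):
    """Replace the pending loss with the real result (1.0 win, 0.0 loss)."""
    for node in path:
        node.virtual_losses -= 1
        node.visits += 1
        node.wins += result
```

While a simulation is in flight, the node's win rate drops, so other threads running the selection step are nudged toward different branches; once the real result arrives the temporary penalty disappears.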

198 citations

Proceedings Article•

Training Deep Convolutional Neural Networks to Play Go

Christopher Clark¹, Amos Storkey²•Institutions (2)

Allen Institute for Artificial Intelligence¹, University of Edinburgh²

06 Jul 2015

TL;DR: In this paper, the authors train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players and achieve state-of-the-art performance.

Abstract: Mastering the game of Go has remained a longstanding challenge to the field of AI. Modern computer Go programs rely on processing millions of possible future positions to play well, but intuitively a stronger and more 'humanlike' way to play the game would be to rely on pattern recognition rather than brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to 'hard code' symmetries that are expected to exist in the target function, and demonstrate in an ablation study they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction systems have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go and win some games against state of the art Go playing program Fuego while using a fraction of the play time.

149 citations

Posted Content•

Move Evaluation in Go Using Deep Convolutional Neural Networks

Chris J. Maddison¹, Aja Huang², Ilya Sutskever², David Silver²•Institutions (2)

University of Toronto¹, Google²

20 Dec 2014-arXiv: Learning

TL;DR: A large 12-layer convolutional neural network is trained by supervised learning from a database of human professional games that beats the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move.

Abstract: The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function. In this paper we investigate whether deep convolutional networks can be used to directly represent and learn this knowledge. We train a large 12-layer convolutional neural network by supervised learning from a database of human professional games. The network correctly predicts the expert move in 55% of positions, equalling the accuracy of a 6 dan human player. When the trained convolutional network was used directly to play games of Go, without any search, it beat the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move.

120 citations

Cites background from "PACHI: State of the Art Open Source..."

...…to bias the search towards more promising states in both the search tree and during rollouts (Coulom, 2007; Gelly & Silver, 2011; Enzenberger et al., 2010; Huang et al., 2011), and it is widely believed that this knowledge is the major bottleneck towards further progress (Huang & Müller, 2013)....
[...]
...…output, without any search, it equalled the performance of state-of-the-art Monte-Carlo search programs (such as Pachi) that are given 10,000 rollouts per move (i.e., programs that combine handcrafted or shallow prior knowledge with a search that simulates two million positions), and the…...
[...]

References

Journal Article•DOI•

Progressive Strategies for Monte-Carlo Tree Search

Guillaume M. J-B. Chaslot¹, Mark H. M. Winands¹, H. Jaap van den Herik¹, Jos W. H. M. Uiterwijk¹, Bruno Bouzy²•Institutions (2)

Maastricht University¹, University of Paris²

01 Nov 2008-New Mathematics and Natural Computation

TL;DR: Two progressive strategies for MCTS are introduced, called progressive bias and progressive unpruning, which enable the use of relatively time-expensive heuristic knowledge without speed reduction.

Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search guided by the results of Monte-Carlo simulations. In this article, we introduce two progressive strategies for MCTS, called progressive bias and progressive unpruning. They enable the use of relatively time-expensive heuristic knowledge without speed reduction. Progressive bias directs the search according to heuristic knowledge. Progressive unpruning first reduces the branching factor, and then increases it gradually again. Experiments assess that the two progressive strategies significantly improve the level of our Go program Mango. Moreover, we see that the combination of both strategies performs even better on larger board sizes.
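
As a rough illustration of progressive bias, the sketch below adds a heuristic term to a UCT-style selection score and lets that term fade as a node gathers visits. The decay `heuristic / (visits + 1)` is one commonly cited form and is used here as an assumption; the function name, parameters, and the constant `c` are hypothetical and not taken from Mango or Pachi.

```python
import math


def progressive_bias_score(value, visits, parent_visits, heuristic, c=1.0):
    """Selection score for a child that has been visited at least once.

    value         -- mean simulation result of the child
    visits        -- number of simulations through the child (>= 1)
    parent_visits -- number of simulations through the parent (>= 1)
    heuristic     -- domain-knowledge score for the move (assumed given)
    """
    exploration = c * math.sqrt(math.log(parent_visits) / visits)
    # The heuristic dominates for rarely visited nodes and fades as the
    # simulation statistics become more reliable.
    bias = heuristic / (visits + 1)
    return value + exploration + bias
```

With `heuristic=0` the score reduces to the plain UCT-style value plus exploration term, so the bias only reorders moves while their visit counts are still small.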

458 citations

"PACHI: State of the Art Open Source..." refers background in this paper

...(This is similar to the progressive bias [7], but not equivalent....
[...]

Modification of UCT with Patterns in Monte-Carlo Go

Sylvain Gelly, Yizao Wang, Olivier Teytaud

01 Jan 2006

TL;DR: MoGo is the first computer Go program to use UCT, the extension of the UCB1 multi-armed bandit algorithm to minimax tree search. Together with intelligent random simulation with patterns, this has made it a top-level Go program on $9\times9$ and $13\times13$ boards.

Abstract: Algorithm UCB1 for multi-armed bandit problem has already been extended to Algorithm UCT (Upper bound Confidence for Tree) which works for minimax tree search. We have developed a Monte-Carlo Go program, MoGo, which is the first computer Go program using UCT. We explain our modification of UCT for Go application and also the intelligent random simulation with patterns which has improved significantly the performance of MoGo. UCT combined with pruning techniques for large Go board is discussed, as well as parallelization of UCT. MoGo is now a top level Go program on $9\times9$ and $13\times13$ Go boards.
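
For orientation, the standard UCB1 rule that UCT applies at each tree node can be sketched as below. This shows only the textbook bandit formula, not MoGo's modified version; the function name and the list-based interface are assumptions made for the example.

```python
import math


def ucb1_select(wins, visits, c=math.sqrt(2)):
    """Return the index of the child with the highest UCB1 score.

    wins[i] and visits[i] are the accumulated reward and visit count of
    child i. Unvisited children are tried first.
    """
    total = sum(visits)
    best, best_score = None, float("-inf")
    for i, (w, n) in enumerate(zip(wins, visits)):
        if n == 0:
            return i  # every child is sampled once before scores are compared
        # Mean reward plus an exploration bonus that shrinks with visits.
        score = w / n + c * math.sqrt(math.log(total) / n)
        if score > best_score:
            best, best_score = i, score
    return best
```

UCT applies this choice recursively from the root, treating each node's children as the arms of a separate bandit problem.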

316 citations

Modification of UCT with Patterns in Monte-Carlo Go

Sylvain Gelly, Yizao Wang, Rémi Munos, Olivier Teytaud

01 Jan 2006

TL;DR: MoGo, a Monte-Carlo Go program and the first computer Go program to use UCT, is presented. The paper explains the modification of UCT for Go and the intelligent random simulation with patterns that significantly improved MoGo's performance.

276 citations

"PACHI: State of the Art Open Source..." refers methods or result in this paper

...We use the Mogo-like rule-based policy [12] that puts emphasis on localized sequences and matching of 3×3 “shape” board patterns....
[...]
...• Points neighboring the last two moves are (with p = 1) matched for 3×3 board patterns centered at these points similar to patterns presented in [12], extended with information on “in atari” status of stones....
[...]

Journal Article•

Parallel Monte-Carlo tree search

Guillaume Chaslot¹, Mark H. M. Winands¹, H. Jaap van den Herik¹•Institutions (1)

Maastricht University¹

01 Jan 2008-Lecture Notes in Computer Science

198 citations

Journal Article•DOI•

Fuego—An Open-Source Framework for Board Games and Go Engine Based on Monte Carlo Tree Search

Markus Enzenberger, Martin Müller¹, Broderick Arneson¹, Richard B. Segal²•Institutions (2)

University of Alberta¹, IBM²

14 Oct 2010-IEEE Transactions on Computational Intelligence and AI in Games

TL;DR: An overview of the development and current state of the FUEGO project is given, which describes the reusable components of the software framework and specific algorithms used in the Go engine.

Abstract: FUEGO is both an open-source software framework and a state-of-the-art program that plays the game of Go. The framework supports developing game engines for full-information two-player board games, and is used successfully in a substantial number of projects. The FUEGO Go program became the first program to win a game against a top professional player in 9 × 9 Go. It has won a number of strong tournaments against other programs, and is competitive for 19 × 19 as well. This paper gives an overview of the development and current state of the FUEGO project. It describes the reusable components of the software framework and specific algorithms used in the Go engine.

183 citations

"PACHI: State of the Art Open Source..." refers background or methods in this paper

...1 [9] with four Pachi threads, fixed 20,000 playouts per move....
[...]
...7 Simulated moves played closer to the node are given higher weight as in Fuego [9]....
[...]
...performing both the tree search and simulations in parallel on a shared tree and performing lock-free tree updates [9]....
[...]