Book Chapter

PACHI: State of the Art Open Source Go Program

20 Nov 2011-pp 24-38
TL;DR: This chapter describes a state-of-the-art implementation of the Monte Carlo Tree Search algorithm for the game of Go, together with three notable original improvements: an adaptive time control algorithm, dynamic komi, and the use of the criticality statistic.
Abstract: We present a state of the art implementation of the Monte Carlo Tree Search algorithm for the game of Go. Our Pachi software is currently one of the strongest open source Go programs, competing at the top level with other programs and playing evenly against advanced human players. We describe our implementation and choice of published algorithms as well as three notable original improvements: (1) an adaptive time control algorithm, (2) dynamic komi, and (3) the usage of the criticality statistic. We also present new methods to achieve efficient scaling both in terms of multiple threads and multiple machines in a cluster.
Citations
Journal Article
28 Jan 2016-Nature
TL;DR: Using this search algorithm, the program AlphaGo achieved a 99.8% winning rate against other Go programs and defeated the human European Go champion by 5 games to 0, the first time that a computer program has defeated a human professional player in the full-sized game of Go.
Abstract: The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses ‘value networks’ to evaluate board positions and ‘policy networks’ to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.

14,377 citations

Journal Article
TL;DR: In this article, a review of recent progress in cognitive science suggests that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it.
Abstract: Recent progress in artificial intelligence has renewed interest in building systems that learn and think like people. Many advances have come from using deep neural networks trained end-to-end in tasks such as object recognition, video games, and board games, achieving performance that equals or even beats that of humans in some respects. Despite their biological inspiration and performance achievements, these systems differ from human intelligence in crucial ways. We review progress in cognitive science suggesting that truly human-like learning and thinking machines will have to reach beyond current engineering trends in both what they learn and how they learn it. Specifically, we argue that these machines should (1) build causal models of the world that support explanation and understanding, rather than merely solving pattern recognition problems; (2) ground learning in intuitive theories of physics and psychology to support and enrich the knowledge that is learned; and (3) harness compositionality and learning-to-learn to rapidly acquire and generalize knowledge to new tasks and situations. We suggest concrete challenges and promising routes toward these goals that can combine the strengths of recent neural network advances with more structured cognitive models.

2,010 citations

Journal Article
TL;DR: In this paper, three parallelization methods for Monte-Carlo Tree Search (MCTS) are discussed: leaf parallelization, root parallelization and tree parallelization (tree parallelization requires two techniques: adequately handling of local mutexes and virtual loss).
Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search method that started a revolution in the field of Computer Go. Parallelizing MCTS is an important way to increase the strength of any Go program. In this article, we discuss three parallelization methods for MCTS: leaf parallelization, root parallelization, and tree parallelization. To be effective tree parallelization requires two techniques: adequately handling of (1) local mutexes and (2) virtual loss. Experiments in 13×13 Go reveal that in the program Mango root parallelization may lead to the best results for a specific time setting and specific program parameters. However, as soon as the selection mechanism is able to handle more adequately the balance of exploitation and exploration, tree parallelization should have attention too and could become a second choice for parallelizing MCTS. Preliminary experiments on the smaller 9×9 board provide promising prospects for tree parallelization.

198 citations
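
The two techniques named in the abstract above (local mutexes and virtual loss) can be illustrated with a minimal sketch. This is not code from Mango or Pachi: the `Node` class, the function names, and the single-perspective win count are assumptions made for illustration only. A thread adds a provisional lost playout to every node on its path before simulating, so that other threads see a lower win rate there and tend to explore elsewhere; the provisional visit is replaced by the real result on backup.

```python
import threading


class Node:
    """Hypothetical tree node; each node carries its own (local) mutex."""

    def __init__(self):
        self.visits = 0
        self.wins = 0.0
        self.lock = threading.Lock()  # guards this node only, not the whole tree

    def win_rate(self):
        return self.wins / self.visits if self.visits else 0.0


def apply_virtual_loss(path, n_virtual=1):
    """Before the playout: count n_virtual extra visits with no wins."""
    for node in path:
        with node.lock:
            node.visits += n_virtual


def backup(path, result, n_virtual=1):
    """After the playout: swap the virtual visits for one real visit.

    For brevity, result is credited from a single player's perspective;
    a real engine would flip it at alternating tree levels.
    """
    for node in path:
        with node.lock:
            node.visits += 1 - n_virtual
            node.wins += result
```

While the playout is running, `win_rate()` on the affected nodes is temporarily lower than it will be after `backup`, which is the effect that spreads threads across the shared tree.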

Proceedings Article
06 Jul 2015
TL;DR: In this paper, the authors train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players and achieve state-of-the-art performance.
Abstract: Mastering the game of Go has remained a longstanding challenge to the field of AI. Modern computer Go programs rely on processing millions of possible future positions to play well, but intuitively a stronger and more 'humanlike' way to play the game would be to rely on pattern recognition rather than brute force computation. Following this sentiment, we train deep convolutional neural networks to play Go by training them to predict the moves made by expert Go players. To solve this problem we introduce a number of novel techniques, including a method of tying weights in the network to 'hard code' symmetries that are expected to exist in the target function, and demonstrate in an ablation study they considerably improve performance. Our final networks are able to achieve move prediction accuracies of 41.1% and 44.4% on two different Go datasets, surpassing previous state of the art on this task by significant margins. Additionally, while previous move prediction systems have not yielded strong Go playing programs, we show that the networks trained in this work acquired high levels of skill. Our convolutional neural networks can consistently defeat the well known Go program GNU Go and win some games against state of the art Go playing program Fuego while using a fraction of the play time.

149 citations

Posted Content
TL;DR: A large 12-layer convolutional neural network is trained by supervised learning from a database of human professional games that beats the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move.
Abstract: The game of Go is more challenging than other board games, due to the difficulty of constructing a position or move evaluation function. In this paper we investigate whether deep convolutional networks can be used to directly represent and learn this knowledge. We train a large 12-layer convolutional neural network by supervised learning from a database of human professional games. The network correctly predicts the expert move in 55% of positions, equalling the accuracy of a 6 dan human player. When the trained convolutional network was used directly to play games of Go, without any search, it beat the traditional search program GnuGo in 97% of games, and matched the performance of a state-of-the-art Monte-Carlo tree search that simulates a million positions per move.

120 citations


Cites background from "PACHI: State of the Art Open Source..."

  • ...…to bias the search towards more promising states in both the search tree and during rollouts (Coulom, 2007; Gelly & Silver, 2011; Enzenberger et al., 2010; Huang et al., 2011), and it is widely believed that this knowledge is the major bottleneck towards further progress (Huang & Müller, 2013)....

  • ...…output, without any search, it equalled the performance of state-of-the-art Monte-Carlo search programs (such as Pachi) that are given 10,000 rollouts per move (i.e., programs that combine handcrafted or shallow prior knowledge with a search that simulates two million positions), and the…...

References
Journal Article
TL;DR: Two progressive strategies for MCTS are introduced, called progressive bias and progressive unpruning, which enable the use of relatively time-expensive heuristic knowledge without speed reduction.
Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search guided by the results of Monte-Carlo simulations. In this article, we introduce two progressive strategies for MCTS, called progressive bias and progressive unpruning. They enable the use of relatively time-expensive heuristic knowledge without speed reduction. Progressive bias directs the search according to heuristic knowledge. Progressive unpruning first reduces the branching factor, and then increases it gradually again. Experiments assess that the two progressive strategies significantly improve the level of our Go program Mango. Moreover, we see that the combination of both strategies performs even better on larger board sizes.

458 citations
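
As a rough illustration of progressive bias as summarized above, the sketch below adds a heuristic term to a UCT-style score and lets it fade as a move accumulates visits. The decay `heuristic / (visits + 1)`, the constant `c`, and all function names are assumptions chosen for illustration; they are not taken from Mango or Pachi, and the excerpt from the Pachi paper below notes that its own scheme is similar but not equivalent.

```python
import math


def bias_term(heuristic, visits):
    """Heuristic bonus that shrinks as the move gathers real statistics."""
    return heuristic / (visits + 1)


def biased_score(wins, visits, parent_visits, heuristic, c=0.7):
    """UCT-style score plus the decaying heuristic bonus.

    Assumes visits >= 1 and parent_visits >= 1; handling of unvisited
    moves is left out of this sketch.
    """
    exploit = wins / visits
    explore = c * math.sqrt(math.log(parent_visits) / visits)
    return exploit + explore + bias_term(heuristic, visits)
```

With few visits the heuristic dominates the ordering of moves; after many visits the score is determined almost entirely by the simulation results.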


"PACHI: State of the Art Open Source..." refers background in this paper

  • ...(This is similar to the progressive bias [7], but not equivalent....

01 Jan 2006
TL;DR: MoGo is the first computer Go program using UCT, the extension of the UCB1 multi-armed bandit algorithm to tree search; intelligent random simulation with patterns has significantly improved its performance, and it is now a top-level Go program on $9\times9$ and $13\times13$ Go boards.
Abstract: Algorithm UCB1 for multi-armed bandit problem has already been extended to Algorithm UCT (Upper bound Confidence for Tree) which works for minimax tree search. We have developed a Monte-Carlo Go program, MoGo, which is the first computer Go program using UCT. We explain our modification of UCT for Go application and also the intelligent random simulation with patterns which has improved significantly the performance of MoGo. UCT combined with pruning techniques for large Go board is discussed, as well as parallelization of UCT. MoGo is now a top level Go program on $9\times9$ and $13\times13$ Go boards.

316 citations

01 Jan 2006
TL;DR: A Monte-Carlo Go program, MoGo, which is the first computer Go program using UCT, is developed, and the modification of UCT for Go application is explained and also the intelligent random simulation with patterns which has improved significantly the performance of MoGo.
Abstract: Algorithm UCB1 for multi-armed bandit problem has already been extended to Algorithm UCT (Upper bound Confidence for Tree) which works for minimax tree search. We have developed a Monte-Carlo Go program, MoGo, which is the first computer Go program using UCT. We explain our modification of UCT for Go application and also the intelligent random simulation with patterns which has improved significantly the performance of MoGo. UCT combined with pruning techniques for large Go board is discussed, as well as parallelization of UCT. MoGo is now a top level Go program on $9\times9$ and $13\times13$ Go boards.

276 citations
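
For readers unfamiliar with the bandit rule that UCT applies at each tree node, here is a minimal sketch of standard UCB1 selection. It is a textbook form, not MoGo's modified version; the `stats` layout (move mapped to a `(wins, visits)` pair) and the function names are assumptions for illustration.

```python
import math


def ucb1(wins, visits, parent_visits, c=math.sqrt(2)):
    """Mean reward plus an exploration bonus that shrinks with visits."""
    if visits == 0:
        return math.inf  # unvisited moves are tried first
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)


def select_child(stats, c=math.sqrt(2)):
    """Pick the move with the highest UCB1 value.

    stats maps each move to a (wins, visits) pair.
    """
    parent_visits = sum(visits for _, visits in stats.values())
    return max(stats, key=lambda move: ucb1(*stats[move], parent_visits, c))
```

For example, `select_child({"a": (9, 10), "b": (0, 0)})` returns `"b"` because unvisited moves receive an infinite score in this sketch.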


"PACHI: State of the Art Open Source..." refers methods or result in this paper

  • ...We use the Mogo-like rule-based policy [12] that puts emphasis on localized sequences and matching of 3×3 “shape” board patterns....

  • ...• Points neighboring the last two moves are (with p = 1) matched for 3×3 board patterns centered at these points similar to patterns presented in [12], extended with information on “in atari” status of stones....

Journal Article
TL;DR: In this paper, three parallelization methods for Monte-Carlo Tree Search (MCTS) are discussed: leaf parallelization, root parallelization and tree parallelization (tree parallelization requires two techniques: adequately handling of local mutexes and virtual loss).
Abstract: Monte-Carlo Tree Search (MCTS) is a new best-first search method that started a revolution in the field of Computer Go. Parallelizing MCTS is an important way to increase the strength of any Go program. In this article, we discuss three parallelization methods for MCTS: leaf parallelization, root parallelization, and tree parallelization. To be effective tree parallelization requires two techniques: adequately handling of (1) local mutexes and (2) virtual loss. Experiments in 13×13 Go reveal that in the program Mango root parallelization may lead to the best results for a specific time setting and specific program parameters. However, as soon as the selection mechanism is able to handle more adequately the balance of exploitation and exploration, tree parallelization should have attention too and could become a second choice for parallelizing MCTS. Preliminary experiments on the smaller 9×9 board provide promising prospects for tree parallelization.

198 citations

Journal Article
TL;DR: An overview of the development and current state of the FUEGO project is given, which describes the reusable components of the software framework and specific algorithms used in the Go engine.
Abstract: FUEGO is both an open-source software framework and a state-of-the-art program that plays the game of Go. The framework supports developing game engines for full-information two-player board games, and is used successfully in a substantial number of projects. The FUEGO Go program became the first program to win a game against a top professional player in 9 × 9 Go. It has won a number of strong tournaments against other programs, and is competitive for 19 × 19 as well. This paper gives an overview of the development and current state of the FUEGO project. It describes the reusable components of the software framework and specific algorithms used in the Go engine.

183 citations


"PACHI: State of the Art Open Source..." refers background or methods in this paper

  • ...1 [9] with four Pachi threads, fixed 20,000 playouts per move....

  • ...7 Simulated moves played closer to the node are given higher weight as in Fuego [9]....

  • ...performing both the tree search and simulations in parallel on a shared tree and performing lock-free tree updates [9]....
