
Showing papers presented at "Advances in Computer Games in 2011"


Book ChapterDOI
20 Nov 2011
TL;DR: A state of the art implementation of the Monte Carlo Tree Search algorithm for the game of Go and three notable original improvements: an adaptive time control algorithm, dynamic komi, and the usage of the criticality statistic are described.
Abstract: We present a state of the art implementation of the Monte Carlo Tree Search algorithm for the game of Go. Our Pachi software is currently one of the strongest open source Go programs, competing at the top level with other programs and playing evenly against advanced human players. We describe our implementation and choice of published algorithms as well as three notable original improvements: (1) an adaptive time control algorithm, (2) dynamic komi, and (3) the usage of the criticality statistic. We also present new methods to achieve efficient scaling both in terms of multiple threads and multiple machines in a cluster.
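Pachi builds on the standard UCT selection rule (plus RAVE and the paper's own additions, which are not shown). As a hedged illustration of that underlying rule only — the node layout and function names below are hypothetical, not Pachi's actual code:

```python
import math

def uct_score(wins, visits, parent_visits, c=1.4):
    # UCB1 applied to trees: exploitation term plus exploration bonus.
    # Unvisited children get infinite priority so each is tried once.
    if visits == 0:
        return float("inf")
    return wins / visits + c * math.sqrt(math.log(parent_visits) / visits)

def select_child(children, parent_visits):
    # Descend to the child maximizing the UCT score.
    return max(children, key=lambda ch: uct_score(ch["wins"], ch["visits"], parent_visits))
```

In a full MCTS loop this selection step is followed by expansion, a random playout, and backpropagation of the result up the visited path.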

79 citations


Book ChapterDOI
20 Nov 2011
TL;DR: The CLOP principle, which stands for Confident Local OPtimization, is a new approach to local regression that overcomes all problems in a straightforward and efficient way and outperforms all other tested algorithms.
Abstract: Artificial intelligence in games often leads to the problem of parameter tuning. Some heuristics may have coefficients, and they should be tuned to maximize the win rate of the program. A possible approach is to build local quadratic models of the win rate as a function of program parameters. Many local regression algorithms have already been proposed for this task, but they are usually not sufficiently robust to deal automatically and efficiently with very noisy outputs and non-negative Hessians. The CLOP principle, which stands for Confident Local OPtimization, is a new approach to local regression that overcomes all these problems in a straightforward and efficient way. CLOP discards samples of which the estimated value is confidently inferior to the mean of all samples. Experiments demonstrate that, when the function to be optimized is smooth, this method outperforms all other tested algorithms.
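The discarding rule at the heart of CLOP can be illustrated with a toy filter: drop samples whose outcome falls confidently below the mean of all samples. The threshold used here (mean minus z standard errors of the raw outcomes) is an illustrative assumption; the published algorithm applies its confidence test to regression estimates, not raw outcomes:

```python
import statistics

def clop_style_filter(samples, z=2.0):
    # samples: list of (params, outcome) pairs; outcomes are noisy scores.
    # Keep only samples not confidently inferior to the overall mean.
    outcomes = [y for _, y in samples]
    mean = statistics.mean(outcomes)
    stderr = statistics.pstdev(outcomes) / len(outcomes) ** 0.5
    threshold = mean - z * stderr
    return [(x, y) for x, y in samples if y >= threshold]
```

The surviving samples would then feed the local quadratic regression; discarding the confidently-bad tail is what keeps the fitted Hessian well behaved under noise.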

37 citations


Book ChapterDOI
20 Nov 2011
TL;DR: A method is described for solving 6×5 boards of Breakthrough, a race-based board game usually played on an 8×8 board, based on (1) race patterns and (2) an extension of (JLPNS).
Abstract: Breakthrough is a recent race-based board game usually played on an 8×8 board. We describe a method to solve 6×5 boards based on (1) race patterns and (2) an extension of (JLPNS).

30 citations


Book ChapterDOI
20 Nov 2011
TL;DR: In this article, a technique called Playout Search was proposed to improve the reliability of playouts in Monte-Carlo Tree Search (MCTS) for Chinese Checkers and Focus games.
Abstract: Over the past few years, Monte-Carlo Tree Search (MCTS) has become a popular search technique for playing multi-player games. In this paper we propose a technique called Playout Search. This enhancement allows the use of small searches in the playout phase of MCTS in order to improve the reliability of the playouts. We investigate max^n, Paranoid, and BRS for Playout Search and analyze their performance in two deterministic perfect-information multi-player games: Focus and Chinese Checkers. The experimental results show that Playout Search significantly increases the quality of the playouts in both games. However, it slows down the speed of the playouts, which outweighs the benefit of better playouts if the thinking time for the players is small. When the players are given a sufficient amount of thinking time, Playout Search employing Paranoid search is a significant improvement in the 4-player variant of Focus and the 3-player variant of Chinese Checkers.

19 citations


Book ChapterDOI
20 Nov 2011
TL;DR: The distribution of the model’s Intrinsic Performance Ratings can therefore be used to compare populations that have limited interaction, such as between players in a national chess federation and FIDE, and ascertain relative drift in their respective rating systems.
Abstract: This paper studies the population of chess players and the distribution of their performances measured by Elo ratings and by computer analysis of moves. Evidence that ratings have remained stable since the inception of the Elo system in the 1970s is given in three forms: (1) by showing that the population of strong players fits a straightforward logistic-curve model without inflation, (2) by plotting players’ average error against the FIDE category of tournaments over time, and (3) by skill parameters from a model that employs computer analysis keeping a nearly constant relation to Elo rating across that time. The distribution of the model’s Intrinsic Performance Ratings can therefore be used to compare populations that have limited interaction, such as between players in a national chess federation and FIDE, and ascertain relative drift in their respective rating systems.

19 citations


Book ChapterDOI
20 Nov 2011
TL;DR: Results in Othello, Havannah, and Go show that Accelerated UCT is not only more effective than previous approaches but also improves the strength of Fuego, which is one of the best computer Go programs.
Abstract: Monte-Carlo Tree Search (MCTS) is a successful approach for improving the performance of game-playing programs. This paper presents the Accelerated UCT algorithm, which overcomes a weakness of MCTS caused by deceptive structures which often appear in game tree search. It consists in using a new backup operator that assigns higher weights to recently visited actions, and lower weights to actions that have not been visited for a long time. Results in Othello, Havannah, and Go show that Accelerated UCT is not only more effective than previous approaches but also improves the strength of Fuego, which is one of the best computer Go programs.
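The idea of the new backup operator — weighting recently visited actions more heavily than long-unvisited ones — can be sketched as a recency-decayed average. The `decay ** age` weighting below is a hypothetical stand-in, not the paper's exact operator:

```python
def accelerated_backup(child_stats, decay=0.9):
    # child_stats: list of (value, age) pairs, where age counts how many
    # backups have occurred since that child was last visited.
    # Recent children (small age) dominate the backed-up value.
    num = sum(v * decay ** age for v, age in child_stats)
    den = sum(decay ** age for _, age in child_stats)
    return num / den if den else 0.0
```

With `decay = 1.0` this degenerates to the plain averaging backup of standard UCT; smaller decay values let the search escape deceptive lines faster by discounting stale statistics.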

17 citations


Book ChapterDOI
20 Nov 2011
TL;DR: This paper proposes a variation of EXP3 to exploit the fact that a solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions.
Abstract: Finding an approximation of a Nash equilibrium in matrix games is an important topic that reaches beyond the strict application to matrix games. A bandit algorithm commonly used to approximate a Nash equilibrium is EXP3 [3]. However, the solution to many problems is often sparse, yet EXP3 inherently fails to exploit this property. To the best knowledge of the authors, there exists only an offline truncation, proposed by [9], to handle this issue. In this paper, we propose a variation of EXP3 to exploit the fact that a solution is sparse by dynamically removing arms; the resulting algorithm empirically performs better than previous versions. We apply the resulting algorithm to an MCTS program for the Urban Rivals card game.
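A minimal EXP3 update step, paired with a naive arm-removal rule to suggest how sparsity could be exploited. The threshold-based removal shown here is an assumption for illustration; the paper's actual criterion differs:

```python
import math

def exp3_step(weights, gamma, arm, reward):
    # One EXP3 update: mix the weight distribution with uniform
    # exploration, then boost the played arm by its importance-weighted
    # reward. Returns the probability with which `arm` was played.
    k = len(weights)
    total = sum(weights.values())
    probs = {a: (1 - gamma) * w / total + gamma / k for a, w in weights.items()}
    weights[arm] *= math.exp(gamma * reward / (k * probs[arm]))
    return probs[arm]

def prune_arms(weights, eps=1e-3):
    # Dynamically drop arms whose normalized weight fell below eps,
    # exploiting sparsity of the equilibrium (illustrative rule only).
    total = sum(weights.values())
    return {a: w for a, w in weights.items() if w / total >= eps}
```

Removing near-zero arms shrinks `k`, which concentrates both the exploration mass and the remaining updates on the support of the sparse equilibrium.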

12 citations


Book ChapterDOI
20 Nov 2011
TL;DR: This article shows how the performance of a Monte-Carlo Tree Search (MCTS) player for Havannah can be improved by guiding the search in the playout and selection steps of MCTS and by initializing the visit and win counts of new nodes based on pattern knowledge.
Abstract: This article shows how the performance of a Monte-Carlo Tree Search (MCTS) player for Havannah can be improved by guiding the search in the playout and selection steps of MCTS. To improve the playout step of the MCTS algorithm, we used two techniques to direct the simulations, Last-Good-Reply (LGR) and N-grams. Experiments reveal that LGR gives a significant improvement, although it depends on which LGR variant is used. Using N-grams to guide the playouts also achieves a significant increase in the winning percentage. Combining N-grams with LGR leads to a small additional improvement. To enhance the selection step of the MCTS algorithm, we initialize the visit and win counts of the new nodes based on pattern knowledge. By biasing the selection towards joint/neighbor moves, local connections, and edge/corner connections, a significant improvement in the performance is obtained. Experiments show that the best overall performance is obtained when combining the visit-and-win-count initialization with LGR and N-grams. In the best case, a winning percentage of 77.5% can be achieved against the default MCTS program.
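The Last-Good-Reply policy can be sketched in a few lines. The forget-on-loss behavior shown is the LGRF variant from the literature; all names here are illustrative:

```python
class LastGoodReply:
    # LGR playout policy sketch: remember the reply that last worked
    # against each opponent move; during playouts, play the stored reply
    # if legal, otherwise fall back to the default playout policy.
    def __init__(self):
        self.reply = {}  # opponent move -> last winning reply

    def record(self, opponent_move, my_reply, won):
        if won:
            self.reply[opponent_move] = my_reply
        else:
            # forgetting variant: drop replies that just lost
            self.reply.pop(opponent_move, None)

    def suggest(self, opponent_move, legal_moves, fallback):
        r = self.reply.get(opponent_move)
        return r if r in legal_moves else fallback(legal_moves)
```

An N-gram policy generalizes the same idea from single opponent moves to short move sequences as the lookup key.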

12 citations


Book ChapterDOI
20 Nov 2011
TL;DR: An MCTS program for EinStein Würfelt Nicht! is described, including its basic structure, strengths, and weaknesses, with the idea of comparing it to existing mini-max based programs and comparing the MCTS version to a pure MC version.
Abstract: EinStein Würfelt Nicht! is a game that has elements of strategy, tactics, and chance. Reasonable evaluation functions can be found for this game and, indeed, there are some strong mini-max based programs for EinStein Würfelt Nicht! We have constructed an MCTS program to play this game. We describe its basic structure and its strengths and weaknesses with the idea of comparing it to existing mini-max based programs and comparing the MCTS version to a pure MC version.

12 citations


Book ChapterDOI
20 Nov 2011
TL;DR: It is shown that the objective function has multiple local minima and the global minimum point indicates reasonable feature values, and the function is continuous with a practically computable numerical accuracy.
Abstract: The landscape of an objective function for supervised learning of evaluation functions is numerically investigated for a limited number of feature variables. Despite the importance of such learning methods, the properties of the objective function are still not well known because of its complicated dependence on millions of tree-search values. This paper shows that the objective function has multiple local minima and the global minimum point indicates reasonable feature values. Moreover, the function is continuous with a practically computable numerical accuracy. However, the function has non-partially differentiable points on the critical boundaries. It is shown that an existing iterative method is able to minimize the function from random initial values with great stability, but it may end up at an unreasonable local minimum point if the initial random values are far from the desired values. Furthermore, the obtained minimum points are shown to form a funnel structure.

11 citations


Book ChapterDOI
20 Nov 2011
TL;DR: This paper presents the Dynamic Graph Reliability (DGR) optimization problem and the game Go-Moku as examples and demonstrates remarkable flexibility in polynomial reduction, such that many interesting practical problems can be elegantly modeled as QIPs.
Abstract: Quantified linear programs (QLPs) are linear programs with mathematical variables being either existentially or universally quantified. The integer variant (Quantified linear integer program, QIP) is PSPACE-complete, and can be interpreted as a two-person zero-sum game. Additionally, it demonstrates remarkable flexibility in polynomial reduction, such that many interesting practical problems can be elegantly modeled as QIPs. Indeed, the PSPACE-completeness guarantees that all PSPACE-complete problems such as games like Othello, Go-Moku, and Amazons, can be described with the help of QIPs, with only moderate overhead. In this paper, we present the Dynamic Graph Reliability (DGR) optimization problem and the game Go-Moku as examples.

Book ChapterDOI
20 Nov 2011
TL;DR: The study showed that behavior capture is a viable alternative to existing manual scripting methods and that HMMs produced the most highly ranked variation with respect to overall believability.
Abstract: We propose a method of generating natural-looking behaviors for virtual characters using a data-driven method called behavior capture. We describe the techniques for capturing trainer-generated traces, for generalizing these traces, and for using the traces to generate behaviors during game-play. Hidden Markov Models (HMMs) are used as one of the generalization techniques for behavior generation. We compared our proposed method to other existing methods by creating a scene with a set of six variations in a computer game, each using a different method for behavior generation, including our proposed method. We conducted a study in which participants watched the variations and ranked them according to a set of criteria for evaluating behaviors. The study showed that behavior capture is a viable alternative to existing manual scripting methods and that HMMs produced the most highly ranked variation with respect to overall believability.

Book ChapterDOI
20 Nov 2011
TL;DR: This paper applies temporal difference (TD) learning to Connect6, and successfully uses TD(0) to improve the strength of a Connect6 program, NCTU6, which has a convincing performance in removing winning/losing moves via threat-space search in TD learning.
Abstract: In this paper, we apply temporal difference (TD) learning to Connect6, and successfully use TD(0) to improve the strength of a Connect6 program, NCTU6. The program won several computer Connect6 tournaments and also many man-machine Connect6 tournaments from 2006 to 2011. From our experiments, the best improved version of TD learning achieves about a 58% win rate against the original NCTU6 program. This paper discusses three implementation issues that improve the program. The program has a convincing performance in removing winning/losing moves via threat-space search in TD learning.
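The TD(0) rule the paper applies is the standard one; here is a tabular sketch (NCTU6 itself updates evaluation-function weights rather than a table, and the names below are illustrative):

```python
def td0_update(value, state, next_state, reward, alpha=0.1, gamma=1.0):
    # One TD(0) step on a tabular value function stored in a dict.
    # In game learning, reward is usually 0 until the terminal position,
    # where it is the game outcome.
    v_s = value.get(state, 0.0)
    v_next = value.get(next_state, 0.0)
    value[state] = v_s + alpha * (reward + gamma * v_next - v_s)
    return value[state]
```

Applied along the positions of a self-play game, each position's value is nudged toward the value of its successor, so outcome information flows backward from the end of the game.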

Book ChapterDOI
20 Nov 2011
TL;DR: Games are complex pieces of software that give life to animated virtual worlds, and game developers must carefully balance quality against efficiency.
Abstract: Games are complex pieces of software which give life to animated virtual worlds. Game developers carefully search for the difficult balance between quality and efficiency in their games.

Book ChapterDOI
20 Nov 2011
TL;DR: This paper discusses gradients of search values with a parameter vector θ in an evaluation function and shows when the min-max value is partially differentiable and how the substitution may introduce errors.
Abstract: This paper discusses gradients of search values with a parameter vector θ in an evaluation function. Recent learning methods for evaluation functions in computer shogi are based on minimization of an objective function with search results. The gradients of the evaluation function at the leaf position of a principal variation (PV) are used to make an easy substitution of the gradients of the search result. By analyzing the variations of the min-max value, we show (1) when the min-max value is partially differentiable and (2) how the substitution may introduce errors. Experiments on a shogi program with about a million parameters show how frequently such errors occur, as well as how effective the substitutions for parameter tuning are in practice.

Book ChapterDOI
20 Nov 2011
TL;DR: By holding strategies fixed across each training iteration, it is shown how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.
Abstract: Using the bluffing dice game Dudo as a challenge domain, we abstract information sets by an imperfect recall of actions. Even with such abstraction, the standard Counterfactual Regret Minimization (CFR) algorithm proves impractical for Dudo, since the number of recursive visits to the same abstracted information sets increases exponentially with the depth of the game graph. By holding strategies fixed across each training iteration, we show how CFR training iterations may be transformed from an exponential-time recursive algorithm into a polynomial-time dynamic-programming algorithm, making computation of an approximate Nash equilibrium for the full 2-player game of Dudo possible for the first time.
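The strategy-update rule inside CFR at each information set is regret matching; here is a sketch of that standard building block (the paper's actual contribution, the dynamic-programming reformulation, is not shown):

```python
def regret_matching(regrets):
    # Regret matching: play each action in proportion to its positive
    # cumulative counterfactual regret; fall back to uniform when no
    # regret is positive.
    positives = {a: max(r, 0.0) for a, r in regrets.items()}
    total = sum(positives.values())
    if total == 0:
        n = len(regrets)
        return {a: 1.0 / n for a in regrets}
    return {a: p / total for a, p in positives.items()}
```

CFR iterates this rule at every information set while accumulating regrets; the average of the resulting strategies converges toward a Nash equilibrium in two-player zero-sum games.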

Book ChapterDOI
20 Nov 2011
TL;DR: An artificial player playing the Voronoi game is described and a new set of experimental results shows that a player using MCTS and geometrical knowledge outperforms a player without knowledge.
Abstract: Monte-Carlo Tree Search (MCTS) is a powerful tool in games with a finite branching factor. The paper describes an artificial player playing the Voronoi game, a game with an infinite branching factor. First, it shows how to use MCTS on a discretization of the Voronoi game, and the effects of enhancements such as RAVE and Gaussian processes (GP). Then a set of experimental results shows that MCTS with UCB+RAVE or with UCB+GP are good first solutions for playing the Voronoi game without domain-dependent knowledge. Moreover, the paper shows how the playing level can be greatly improved by using geometrical knowledge about Voronoi diagrams. The balance of diagrams is the key concept. A new set of experimental results shows that a player using MCTS and geometrical knowledge outperforms a player without knowledge.

Book ChapterDOI
20 Nov 2011
TL;DR: This work investigates the use of Meta-Monte-Carlo-Tree-Search, for building a huge 7x7 opening book, and reports the twenty wins that were obtained recently in7x7 Go against pros.
Abstract: Solving board games is a hard task, in particular for games in which classical tools such as alpha-beta and proof-number-search are somehow weak. In particular, Go is not solved (in any sense of solving, even the weakest) beyond 6x6. We here investigate the use of Meta-Monte-Carlo-Tree-Search, for building a huge 7x7 opening book. In particular, we report the twenty wins (out of twenty games) that were obtained recently in 7x7 Go against pros; we also show that in one of the games, with no human error, the pro might have won.

Book ChapterDOI
20 Nov 2011
TL;DR: From the results of the experiments, the general structure of good move groups and the parameters to use for enhancing the playing strength are arrived at.
Abstract: The UCT (Upper Confidence Bounds applied to Trees) algorithm has allowed for significant improvements in a number of games, most notably the game of Go. Move groups is a modification that greatly reduces the branching factor at the cost of increased search depth and as such may be used to enhance the performance of UCT. From the results of the experiments, we arrive at the general structure of good move groups and the parameters to use for enhancing the playing strength.

Book ChapterDOI
20 Nov 2011
TL;DR: This paper improves upon the training method that was used in the previous approach for the two backgammon variants popular in Greece and neighboring countries, Plakoto and Fevga, and shows that the proposed methods result both in faster learning as well as better performance.
Abstract: Palamedes is an ongoing project for building expert playing bots that can play backgammon variants. As in all successful modern backgammon programs, it is based on neural networks trained using temporal difference learning. This paper improves upon the training method that we used in our previous approach for the two backgammon variants popular in Greece and neighboring countries, Plakoto and Fevga. We show that the proposed methods result both in faster learning as well as better performance. We also present insights into the selection of the features in our experiments that can be useful to temporal difference learning in other games as well.

Book ChapterDOI
20 Nov 2011
TL;DR: In a game, pure Monte Carlo search with parameter T means that for each feasible move T random games are generated and the move with the best average score is played.
Abstract: In a game, pure Monte Carlo search with parameter T means that for each feasible move T random games are generated. The move with the best average score is played. We call a game “Monte Carlo perfect” when this straightforward procedure converges to perfect play for each position, when T goes to infinity. Many popular games like Go, Hex, and Amazons are NOT Monte Carlo perfect.
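The procedure defined above translates almost directly into code; `play_random_game` is a hypothetical callback that finishes a game at random and scores it for the moving player:

```python
import random

def pure_monte_carlo(position, legal_moves, play_random_game, T, rng=random):
    # Pure Monte Carlo search with parameter T: for each feasible move,
    # generate T random games and play the move with the best average score.
    best_move, best_avg = None, float("-inf")
    for move in legal_moves(position):
        avg = sum(play_random_game(position, move, rng) for _ in range(T)) / T
        if avg > best_avg:
            best_move, best_avg = move, avg
    return best_move
```

"Monte Carlo perfect" then means exactly that this function converges to a perfect move in every position as T goes to infinity; the point of the paper is that Go, Hex, and Amazons do not have this property.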

Book ChapterDOI
20 Nov 2011
TL;DR: This paper uses an adaptive resolution R to enhance the min-max search with alpha-beta pruning technique, and shows that the value returned by the modified algorithm, called Negascout-with-resolution, differs from that of the original version by at most R.
Abstract: In this paper, we use an adaptive resolution R to enhance the min-max search with alpha-beta pruning technique, and show that the value returned by the modified algorithm, called Negascout-with-resolution, differs from that of the original version by at most R. Guidelines are given to explain how the resolution should be chosen to obtain the best possible outcome. Our experimental results demonstrate that Negascout-with-resolution yields a significant performance improvement over the original algorithm on the domains of random trees and real game trees in Chinese chess.
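For orientation, here is plain Negascout (principal variation search) without the paper's resolution R; the relaxed windows that bound the returned value's error by R are omitted, and the tree representation is hypothetical:

```python
def negascout(node, depth, alpha, beta, evaluate, children):
    # Standard Negascout: search the first child with a full window,
    # the rest with null windows, re-searching on fail-high.
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    b = beta
    for i, child in enumerate(kids):
        score = -negascout(child, depth - 1, -b, -alpha, evaluate, children)
        if alpha < score < beta and i > 0:
            # null-window search failed high: re-search with full window
            score = -negascout(child, depth - 1, -beta, -alpha, evaluate, children)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
        b = alpha + 1  # null window for the next child
    return alpha
```

Leaf values are from the perspective of the player to move at that leaf, as usual in the negamax convention.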

Book ChapterDOI
20 Nov 2011
TL;DR: It is proved that optimal play by both players leads to a draw in variants of Connect-Four played on an infinite board by introducing never-losing strategies for both players.
Abstract: In this paper, we present the newly obtained solution for variants of Connect-Four played on an infinite board. We proved this result by introducing never-losing strategies for both players. The strategies consist of a combination of paving patterns, which are follow-up, follow-in-CUP, and a few others. By employing the strategies, both players can block their opponents from achieving the winning condition. This means that optimal play by both players leads to a draw in these games.

Book ChapterDOI
20 Nov 2011
TL;DR: The key idea is to examine how critical certain positions are to White in achieving the win, and an algorithm is defined to help analyse uniqueness in endgame positions objectively.
Abstract: Some 50,000 Win Studies in Chess challenge White to find an effectively unique route to a win. Judging the impact of less than absolute uniqueness requires both technical analysis and artistic judgment. Here, for the first time, an algorithm is defined to help analyse uniqueness in endgame positions objectively. The key idea is to examine how critical certain positions are to White in achieving the win. The algorithm uses sub-n-man endgame tables (EGTs) for both Chess and relevant, adjacent variants of Chess. It challenges authors of EGT generators to generalise them to create EGTs for these chess variants. It has already proved efficient and effective in an implementation for Starchess, itself a variant of chess. The approach also addresses a number of similar questions arising in endgame theory, games, and compositions.

Book ChapterDOI
20 Nov 2011
TL;DR: Current technologies and systems do not tap into the full potential of affective approaches in games, yet affect in games can be harnessed as a supportive and easy-to-use input method.
Abstract: Natural game input devices, such as Microsoft’s Kinect or Sony’s Playstation Move, have become increasingly popular and allow a direct mapping of player performance in regard to actions in the game world. Games have been developed that enable players to interact with their avatars and other game objects via gestures and/or voice input. However, current technologies and systems do not tap into the full potential of affective approaches. Affect in games can be harnessed as a supportive and easy-to-use input method.

Book ChapterDOI
Jiao Wang, Shiyuan Li, Jitong Chen, Xin Wei, Huizhan Lv, Xinhe Xu
20 Nov 2011
TL;DR: The results of the experiments show that the use of 4*4-Patterns can improve MCTS in 19*19 Go to some extent, in particular when supported by 4* 4-Pattern libraries generated by Bayesian learning.
Abstract: The paper proposes a new model of pattern, namely the 4*4-Pattern, to improve MCTS (Monte-Carlo Tree Search) in computer Go. A 4*4-Pattern provides a larger coverage space and more essential information than the original 3*3-Pattern. Nevertheless the latter is currently widely used. Due to the lack of central symmetry, applying a 4*4-Pattern poses greater challenges than a 3*3-Pattern. Many details of a 4*4-Pattern implementation are presented, including classification, multiple matching, coding sequences, and fast lookup. Additionally, Bayesian 4*4-Pattern learning is introduced, and 4*4-Pattern libraries are automatically generated from a vast amount of professional game records. The results of our experiments show that the use of 4*4-Patterns can improve MCTS in 19*19 Go to some extent, in particular when supported by 4*4-Pattern libraries generated by Bayesian learning.

Book ChapterDOI
20 Nov 2011
TL;DR: In Go and Hex, the effect of a blunder is examined at various stages of a game for each fixed move number to determine the expected blunder cost at that point.
Abstract: In Go and Hex, we examine the effect of a blunder — here, a random move — at various stages of a game. For each fixed move number, we run a self-play tournament to determine the expected blunder cost at that point.
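The experimental setup can be sketched as a difference of win rates with and without an injected random move; `play_game` is a hypothetical self-play harness:

```python
import random

def blunder_cost(play_game, move_number, trials, rng=random):
    # Estimate the expected cost of a blunder (a random move) injected at
    # a fixed move number: compare first-player win rates over self-play
    # games with and without the injected move.
    # play_game(blunder_at, rng) returns 1 if the first player wins.
    base = sum(play_game(None, rng) for _ in range(trials)) / trials
    blundered = sum(play_game(move_number, rng) for _ in range(trials)) / trials
    return base - blundered
```

Sweeping `move_number` over the course of the game yields the blunder-cost profile the paper reports for Go and Hex.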

Book ChapterDOI
20 Nov 2011
TL;DR: This paper investigates the effects that various time-management strategies have on the playing strength in Go, including semi-dynamic strategies that decide about time allocation for each search before it is started, and dynamic strategies that influence the duration of each move search while it is already running.
Abstract: The dominant approach for programs playing the game of Go is nowadays Monte-Carlo Tree Search (MCTS). While MCTS allows for fine-grained time control, little has been published on time management for MCTS programs under tournament conditions. This paper investigates the effects that various time-management strategies have on the playing strength in Go. We consider strategies taken from the literature as well as newly proposed and improved ones. We investigate both semi-dynamic strategies that decide about time allocation for each search before it is started, and dynamic strategies that influence the duration of each move search while it is already running. In our experiments, two domain-independent enhanced strategies, EARLY-C and CLOSE-N, are tested; each of them provides a significant improvement over the state of the art.

Book ChapterDOI
20 Nov 2011
TL;DR: A new approach is developed that computes approximate equilibrium strategies in Jotto, a popular word game of imperfect information, using a novel strategy representation called oracular form: instead of representing a strategy explicitly, the player appeals to an oracle that quickly outputs a sample move from the strategy’s distribution.
Abstract: We develop a new approach that computes approximate equilibrium strategies in Jotto, a popular word game. Jotto is an extremely large two-player game of imperfect information; its game tree has many orders of magnitude more states than games previously studied, including no-limit Texas Hold’em. To address the fact that the game is so large, we propose a novel strategy representation called oracular form, in which we do not explicitly represent a strategy, but rather appeal to an oracle that quickly outputs a sample move from the strategy’s distribution. Our overall approach is based on an extension of the fictitious play algorithm to this oracular setting. We demonstrate the superiority of our computed strategies over the strategies computed by a benchmark algorithm, both in terms of head-to-head and worst-case performance.