Journal ArticleDOI

GIB: imperfect information in a computationally challenging game

01 Jan 2001-Journal of Artificial Intelligence Research (AI Access Foundation)-Vol. 14, Iss: 1, pp 303-358
TL;DR: GIB, the program being described, involves five separate technical advances: partition search, the practical application of Monte Carlo techniques to realistic problems, a focus on achievable sets to solve problems inherent in the Monte Carlo approach, an extension of alpha-beta pruning from total orders to arbitrary distributive lattices, and the use of squeaky wheel optimization to find approximately optimal solutions to cardplay problems.
Abstract: This paper investigates the problems arising in the construction of a program to play the game of contract bridge. These problems include both the difficulty of solving the game's perfect information variant, and techniques needed to address the fact that bridge is not, in fact, a perfect information game. GIB, the program being described, involves five separate technical advances: partition search, the practical application of Monte Carlo techniques to realistic problems, a focus on achievable sets to solve problems inherent in the Monte Carlo approach, an extension of alpha-beta pruning from total orders to arbitrary distributive lattices, and the use of squeaky wheel optimization to find approximately optimal solutions to cardplay problems. GIB is currently believed to be of approximately expert caliber, and is currently the strongest computer bridge program in the world.
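The cardplay method summarized above rests on determinization: sample complete deals consistent with everything the player has seen, solve each with a perfect-information search, and vote over candidate moves. The Python sketch below illustrates only that Monte Carlo skeleton; `sample_deal` and `perfect_info_value` are illustrative placeholders rather than GIB's partition-search solver, and the achievable-set machinery the paper adds to repair the approach's known defects is omitted.

```python
from collections import defaultdict

def monte_carlo_move(candidate_moves, known_constraints,
                     sample_deal, perfect_info_value, n_samples=100):
    """Pick a move by averaging perfect-information results over sampled deals.

    sample_deal(known_constraints) -> a full deal consistent with what we know
    perfect_info_value(deal, move) -> payoff of `move` if the deal were open
    Both callbacks are placeholders standing in for GIB's double-dummy solver.
    """
    totals = defaultdict(float)
    for _ in range(n_samples):
        deal = sample_deal(known_constraints)        # determinize the hidden hands
        for move in candidate_moves:
            totals[move] += perfect_info_value(deal, move)
    # choose the move with the best average over all sampled worlds
    return max(candidate_moves, key=lambda m: totals[m])
```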


Citations
Journal ArticleDOI
TL;DR: A survey of the literature to date on Monte Carlo tree search, intended to provide a snapshot of the state of the art after the first five years of MCTS research; it outlines the core algorithm's derivation, imparts some structure on the many variations and enhancements that have been proposed, and summarizes the results from the key game and nongame domains.
Abstract: Monte Carlo tree search (MCTS) is a recently proposed search method that combines the precision of tree search with the generality of random sampling. It has received considerable interest due to its spectacular success in the difficult problem of computer Go, but has also proved beneficial in a range of other domains. This paper is a survey of the literature to date, intended to provide a snapshot of the state of the art after the first five years of MCTS research. We outline the core algorithm's derivation, impart some structure on the many variations and enhancements that have been proposed, and summarize the results from the key game and nongame domains to which MCTS methods have been applied. A number of open research questions indicate that the field is ripe for future work.
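As a rough companion to the survey's description of the core algorithm, the sketch below gives a minimal UCT-style MCTS loop (selection, expansion, simulation, backpropagation). The `state` interface (`legal_moves`, `play`, `is_terminal`, `result`) is an assumption for illustration, rewards are treated from a single maximizing player's point of view, and none of the surveyed enhancements are included.

```python
import math, random

class Node:
    def __init__(self, state, parent=None, move=None):
        self.state, self.parent, self.move = state, parent, move
        self.children, self.visits, self.value = [], 0, 0.0
        self.untried = list(state.legal_moves())

def uct_search(root_state, iterations=1000, c=1.4):
    root = Node(root_state)
    for _ in range(iterations):
        node = root
        # 1. Selection: descend while fully expanded, maximizing UCB1
        while not node.untried and node.children:
            node = max(node.children, key=lambda ch: ch.value / ch.visits
                       + c * math.sqrt(math.log(node.visits) / ch.visits))
        # 2. Expansion: add one untried child
        if node.untried:
            move = node.untried.pop()
            node = Node(node.state.play(move), parent=node, move=move)
            node.parent.children.append(node)
        # 3. Simulation: random playout to a terminal state
        state = node.state
        while not state.is_terminal():
            state = state.play(random.choice(list(state.legal_moves())))
        reward = state.result()
        # 4. Backpropagation
        while node is not None:
            node.visits += 1
            node.value += reward
            node = node.parent
    return max(root.children, key=lambda ch: ch.visits).move
```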

2,682 citations


Cites background from "GIB: imperfect information in a com..."

  • ...Hence we aim to improve the reader’s understanding of how MCTS can be applied to new research questions and problem domains....


Journal ArticleDOI
TL;DR: The design considerations and architecture of the poker program Poki are described; the program is capable of playing reasonably strong poker, but considerable research remains before it can play at a world-class level.

299 citations


Cites background from "GIB: imperfect information in a com..."

  • ...When testing the new betting strategy in online games, it was much less successful against reasonably strong human opposition, who were able to adapt quickly....


Posted Content
TL;DR: This work discusses deep reinforcement learning in an overview style, focusing on contemporary work and its historical contexts, with background on artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), along with resources.
Abstract: We discuss deep reinforcement learning in an overview style. We draw a big picture, filled with details. We discuss six core elements, six important mechanisms, and twelve applications, focusing on contemporary work, and in historical contexts. We start with background of artificial intelligence, machine learning, deep learning, and reinforcement learning (RL), with resources. Next we discuss RL core elements, including value function, policy, reward, model, exploration vs. exploitation, and representation. Then we discuss important mechanisms for RL, including attention and memory, unsupervised learning, hierarchical RL, multi-agent RL, relational RL, and learning to learn. After that, we discuss RL applications, including games, robotics, natural language processing (NLP), computer vision, finance, business management, healthcare, education, energy, transportation, computer systems, and science, engineering, and art. Finally we summarize briefly, discuss challenges and opportunities, and close with an epilogue.

239 citations

Journal ArticleDOI
TL;DR: Three new information set MCTS (ISMCTS) algorithms are presented which handle different sources of hidden information and uncertainty in games; instead of searching minimax trees of game states, the ISMCTS algorithms search trees of information sets, more directly analyzing the true structure of the game.
Abstract: Monte Carlo tree search (MCTS) is an AI technique that has been successfully applied to many deterministic games of perfect information. This paper investigates the application of MCTS methods to games with hidden information and uncertainty. In particular, three new information set MCTS (ISMCTS) algorithms are presented which handle different sources of hidden information and uncertainty in games. Instead of searching minimax trees of game states, the ISMCTS algorithms search trees of information sets, more directly analyzing the true structure of the game. These algorithms are tested in three domains with different characteristics, and it is demonstrated that our new algorithms outperform existing approaches to handling hidden information and uncertainty in games.
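To make the information-set idea concrete, here is a heavily simplified, single-observer ISMCTS-style sketch: one shared tree keyed by move histories, a fresh determinization sampled each iteration, and UCB computed with availability counts. The `sample_determinization` callback and the state methods are assumptions for illustration, rewards are taken from a single maximizing player's perspective, and the paper's multiple-observer and partially observable move variants are not covered.

```python
import math, random
from collections import defaultdict

def so_ismcts(info_set, sample_determinization, iterations=1000, c=0.7):
    # stats maps a move history to [visit count, total reward, availability count]
    stats = defaultdict(lambda: [0, 0.0, 0])
    for _ in range(iterations):
        state = sample_determinization(info_set)   # one concrete world per iteration
        history, path = (), []
        # selection / expansion over the moves legal in *this* determinization
        while not state.is_terminal():
            moves = list(state.legal_moves())
            for m in moves:                        # these children were available here
                stats[history + (m,)][2] += 1
            untried = [m for m in moves if stats[history + (m,)][0] == 0]
            if untried:
                move = random.choice(untried)      # expansion
            else:
                def ucb(m):
                    v, r, a = stats[history + (m,)]
                    return r / v + c * math.sqrt(math.log(a) / v)
                move = max(moves, key=ucb)         # selection
            history += (move,)
            path.append(history)
            state = state.play(move)
            if stats[history][0] == 0:             # just expanded: stop and roll out
                break
        # simulation: random playout in the sampled world
        while not state.is_terminal():
            state = state.play(random.choice(list(state.legal_moves())))
        reward = state.result()
        # backpropagation along the visited histories
        for h in path:
            stats[h][0] += 1
            stats[h][1] += reward
    best = max((h for h in stats if len(h) == 1), key=lambda h: stats[h][0])
    return best[0]
```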

169 citations


Cites background from "GIB: imperfect information in a com..."

  • ...Ginsberg’s Intelligent Bridge Player (GIB) system [3] applies determinization to create an AI player for the card game Bridge which plays at the level of human experts....


  • ...In many games, the number of states within an information set can be large: for example, there are 52! ≈ 8 × 10^67 possible orderings of a standard deck of cards, each of which may have a corresponding state in the initial information set of a card game....


Proceedings Article
09 Aug 2003
TL;DR: This paper shows how to tweak existing protocols to make manipulation hard, while leaving much of the original nature of the protocol intact, and produces the first results in voting settings where manipulation is in a higher complexity class than NP (presuming PSPACE ≠ NP).
Abstract: Voting is a general method for preference aggregation in multiagent settings, but seminal results have shown that all (nondictatorial) voting protocols are manipulable. One could try to avoid manipulation by using voting protocols where determining a beneficial manipulation is hard computationally. A number of recent papers study the complexity of manipulating existing protocols. This paper is the first work to take the next step of designing new protocols that are especially hard to manipulate. Rather than designing these new protocols from scratch, we instead show how to tweak existing protocols to make manipulation hard, while leaving much of the original nature of the protocol intact. The tweak studied consists of adding one elimination preround to the election. Surprisingly, this extremely simple and universal tweak makes typical protocols hard to manipulate! The protocols become NP-hard, #P-hard, or PSPACE-hard to manipulate, depending on whether the schedule of the preround is determined before the votes are collected, after the votes are collected, or the scheduling and the vote collecting are interleaved, respectively. We prove general sufficient conditions on the protocols for this tweak to introduce the hardness, and show that the most common voting protocols satisfy those conditions. These are the first results in voting settings where manipulation is in a higher complexity class than NP (presuming PSPACE ≠ NP).
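As a toy illustration of the preround tweak (not the paper's formal construction), the sketch below pairs candidates at random, keeps only the pairwise winners, and then runs plain plurality over the survivors. Votes are assumed to be complete rankings, and the question of when the pairing schedule is fixed, which drives the NP-hard / #P-hard / PSPACE-hard distinction in the paper, is not modelled.

```python
import random

def plurality_winner(candidates, votes):
    """Plain plurality: each vote is a full ranking; most first-place votes wins."""
    counts = {c: 0 for c in candidates}
    for ranking in votes:
        top = next(c for c in ranking if c in candidates)
        counts[top] += 1
    return max(candidates, key=lambda c: counts[c])

def plurality_with_preround(candidates, votes, rng=random):
    """Add one elimination preround: pair candidates up at random, keep only the
    pairwise winners (by majority of voters preferring one to the other), then
    run the base protocol on the survivors."""
    order = list(candidates)
    rng.shuffle(order)                      # the preround schedule
    survivors = []
    while len(order) >= 2:
        a, b = order.pop(), order.pop()
        prefer_a = sum(1 for r in votes if r.index(a) < r.index(b))
        survivors.append(a if prefer_a * 2 >= len(votes) else b)
    survivors.extend(order)                 # an odd candidate out gets a bye
    return plurality_winner(survivors, votes)
```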

164 citations


Additional excerpts

  • ...The only voting protocol for which CONSTRUCTIVE-MANIPULATION is known to be NP-hard is the STV protocol [Bartholdi and Orlin, 1991]....


References
Book
01 Jan 1978
TL;DR: In this book, the authors give two definitions of lattices and describe how to describe them, covering polynomials, identities, and inequalities, as well as lattice geometry.
Abstract: I First Concepts.- 1 Two Definitions of Lattices.- 2 How to Describe Lattices.- 3 Some Algebraic Concepts.- 4 Polynomials, Identities, and Inequalities.- 5 Free Lattices.- 6 Special Elements.- Further Topics and References.- Problems.- II Distributive Lattices.- 1 Characterization and Representation Theorems.- 2 Polynomials and Freeness.- 3 Congruence Relations.- 4 Boolean Algebras.- 5 Topological Representation.- 6 Pseudocomplementation.- Further Topics and References.- Problems.- III Congruences and Ideals.- 1 Weak Projectivity and Congruences.- 2 Distributive, Standard, and Neutral Elements.- 3 Distributive, Standard, and Neutral Ideals.- 4 Structure Theorems.- Further Topics and References.- Problems.- IV Modular and Semimodular Lattices.- 1 Modular Lattices.- 2 Semimodular Lattices.- 3 Geometric Lattices.- 4 Partition Lattices.- 5 Complemented Modular Lattices.- Further Topics and References.- Problems.- V Varieties of Lattices.- 1 Characterizations of Varieties.- 2 The Lattice of Varieties of Lattices.- 3 Finding Equational Bases.- 4 The Amalgamation Property.- Further Topics and References.- Problems.- VI Free Products.- 1 Free Products of Lattices.- 2 The Structure of Free Lattices.- 3 Reduced Free Products.- 4 Hopfian Lattices.- Further Topics and References.- Problems.- Concluding Remarks.- Table of Notation.- A Retrospective.- 1 Major Advances.- 2 Notes on Chapter I.- 3 Notes on Chapter II.- 4 Notes on Chapter III.- 5 Notes on Chapter IV.- 6 Notes on Chapter V.- 7 Notes on Chapter VI.- 8 Lattices and Universal Algebras.- B Distributive Lattices and Duality by B. Davey, H. Priestley.- 1 Introduction.- 2 Basic Duality.- 3 Distributive Lattices with Additional Operations.- 4 Distributive Lattices with V-preserving Operators, and Beyond.- 5 The Natural Perspective.- 6 Congruence Properties.- 7 Freeness, Coproducts, and Injectivity.- C Congruence Lattices by G. Gratzer, E. T. Schmidt.- 1 The Finite Case.- 2 The General Case.- 3 Complete Congruences.- D Continuous Geometry by F. Wehrung.- 1 The von Neumann Coordinatization Theorem.- 2 Continuous Geometries and Related Topics.- E Projective Lattice Geometries by M. Greferath, S. Schmidt.- 1 Background.- 2 A Unified Approach to Lattice Geometry.- 3 Residuated Maps.- F Varieties of Lattices by P. Jipsen, H. Rose.- 1 The Lattice A.- 2 Generating Sets of Varieties.- 3 Equational Bases.- 4 Amalgamation and Absolute Retracts.- 5 Congruence Varieties.- G Free Lattices by R. Freese.- 1 Whitman's Solution; Basic Results.- 2 Classical Results.- 3 Covers in Free Lattices.- 4 Semisingular Elements and Tschantz's Theorem.- 5 Applications and Related Areas.- H Formal Concept Analysis by B. Ganter and R. Wille.- 1 Formal Contexts and Concept Lattices.- 2 Applications.- 3 Sublattices and Quotient Lattices.- 4 Subdirect Products and Tensor Products.- 5 Lattice Properties.- New Bibliography.

2,294 citations

Journal ArticleDOI
TL;DR: In this paper, a rule-based system for computer-aided circuit analysis, called EL, is presented. The system is written in a rule language called ARS, and its rules are implemented by ARS as pattern-directed invocation demons monitoring an associative data base.

805 citations

Journal Article
TL;DR: The technique developed is a variant of dependency-directed backtracking that uses only polynomial space while still providing useful control information and retaining the completeness guarantees provided by earlier approaches.
Abstract: Because of their occasional need to return to shallow points in a search tree, existing backtracking methods can sometimes erase meaningful progress toward solving a search problem. In this paper, we present a method by which backtrack points can be moved deeper in the search space, thereby avoiding this difficulty. The technique developed is a variant of dependency-directed backtracking that uses only polynomial space while still providing useful control information and retaining the completeness guarantees provided by earlier approaches.
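The sketch below is not the paper's dynamic-backtracking variant; it is a plain chronological backtracking solver that only illustrates the polynomial-space idea of capping the number of stored nogoods. The `conflicts` callback and the `max_nogoods` bound are assumptions for illustration.

```python
def solve_csp(variables, domains, conflicts, max_nogoods=None):
    """Chronological backtracking with a bounded nogood cache.

    conflicts(assignment) -> True if the partial assignment already violates a
    constraint (problem-specific placeholder). Capping the cache mimics the
    polynomial-space bookkeeping of dynamic backtracking and its successors,
    without the backjumping itself.
    """
    if max_nogoods is None:
        max_nogoods = len(variables) ** 2          # a polynomial bound
    nogoods = []                                   # forbidden partial assignments

    def violates_nogood(assignment):
        return any(all(assignment.get(v) == val for v, val in ng) for ng in nogoods)

    def backtrack(i, assignment):
        if i == len(variables):
            return dict(assignment)                # all variables assigned
        var = variables[i]
        for val in domains[var]:
            assignment[var] = val
            if not conflicts(assignment) and not violates_nogood(assignment):
                result = backtrack(i + 1, assignment)
                if result is not None:
                    return result
            del assignment[var]
        # every value failed: the current prefix is a nogood; keep only the
        # most recent max_nogoods of them to stay within polynomial space
        nogood = tuple(assignment.items())
        if nogood:
            if len(nogoods) >= max_nogoods:
                nogoods.pop(0)
            nogoods.append(nogood)
        return None

    return backtrack(0, {})
```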

525 citations


"GIB: imperfect information in a com..." refers background in this paper

  • ...This problem has been substantially addressed in the work on dynamic backtracking (Ginsberg, 1993) and its successors such as relsat (Bayardo & Miranker, 1996), where polynomial limits are placed on the number of nogoods being maintained....


Journal ArticleDOI
TL;DR: In this paper, a greedy algorithm is used to construct a solution which is then analyzed to find the trouble spots, i.e., those elements that, if improved, are likely to improve the objective function score.
Abstract: We describe a general approach to optimization which we term "Squeaky Wheel" Optimization (SWO). In SWO, a greedy algorithm is used to construct a solution which is then analyzed to find the trouble spots, i.e., those elements, that, if improved, are likely to improve the objective function score. The results of the analysis are used to generate new priorities that determine the order in which the greedy algorithm constructs the next solution. This Construct/Analyze/Prioritize cycle continues until some limit is reached, or an acceptable solution is found. SWO can be viewed as operating on two search spaces: solutions and prioritizations. Successive solutions are only indirectly related, via the re-prioritization that results from analyzing the prior solution. Similarly, successive prioritizations are generated by constructing and analyzing solutions. This "coupled search" has some interesting properties, which we discuss. We report encouraging experimental results on two domains, scheduling problems that arise in fiber-optic cable manufacturing, and graph coloring problems. The fact that these domains are very different supports our claim that SWO is a general technique for optimization.
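A minimal sketch of the Construct/Analyze/Prioritize cycle described above. The `construct` and `analyze` callbacks are problem-specific placeholders (for graph coloring, `analyze` might return the vertices involved in color clashes), and the promotion rule shown is just one simple choice.

```python
import random

def squeaky_wheel(elements, construct, analyze, iterations=50, bump=None, seed=0):
    """Squeaky Wheel Optimization: Construct / Analyze / Prioritize loop.

    construct(priority_order) -> (solution, score)   greedy builder
    analyze(solution)         -> collection of 'trouble' elements to promote
    Both callbacks are problem-specific placeholders; lower score is better.
    """
    rng = random.Random(seed)
    if bump is None:
        bump = max(1, len(elements) // 4)          # how far squeaky wheels jump
    order = list(elements)
    rng.shuffle(order)
    best_solution, best_score = None, float('inf')
    for _ in range(iterations):
        solution, score = construct(order)         # Construct
        if score < best_score:
            best_solution, best_score = solution, score
        trouble = analyze(solution)                # Analyze
        # Prioritize: move trouble spots earlier so the greedy builder
        # handles them first on the next pass.
        for el in list(order):
            if el in trouble:
                i = order.index(el)
                order.insert(max(0, i - bump), order.pop(i))
    return best_solution, best_score
```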

241 citations

01 Jan 1999

173 citations


"GIB: imperfect information in a com..." refers methods in this paper

  • ...Lind-Nielsen, J. (2000)....


  • ...In order to make this inference as efficient as possible, the disjunctions themselves were represented as binary decision diagrams, or bdd’s (Lind-Nielsen, 2000)....


  • ...There are a variety of public domain implementations of bdd's available, and we used one provided by Lind-Nielsen (Lind-Nielsen, 2000). The resulting implementation solves small endings (perhaps 16 cards left in total) quickly, but for larger endings, the running times come to be dominated by the bdd computations; this is hardly surprising, since the size of individual bdds can be exponential in the size of S (the number of possible distributions of the unseen cards)....
