Journal ArticleDOI

Learning to bid in bridge

01 Jun 2006 - Machine Learning (Kluwer Academic Publishers) - Vol. 63, Iss. 3, pp. 287-327
TL;DR: A new decision-making algorithm that allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information is presented.
Abstract: Bridge bidding is considered to be one of the most difficult problems for game-playing programs. It involves four agents rather than two, including a cooperative agent. In addition, the partial observability of the game makes it impossible to predict the outcome of each action. In this paper we present a new decision-making algorithm that is capable of overcoming these problems. The algorithm allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information. The paper also presents a learning framework that uses the above decision-making algorithm for co-training of partners. The agents refine their selection strategies during training and continuously exchange their refined strategies. The refinement is based on inductive learning applied to examples accumulated for classes of states with conflicting actions. The algorithm was empirically evaluated on a set of bridge deals. The pair of agents that co-trained significantly improved their bidding performance to a level surpassing that of the current state-of-the-art bidding algorithm.
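
The model-based Monte Carlo idea in the abstract can be illustrated with a minimal sketch (an illustration of the general technique, not the paper's algorithm). Hidden deals are drawn by a caller-supplied sampler that is assumed to respect the models of the partner and opponents, and each candidate bid is scored by its average payoff over the samples; `sample_deal` and `score_bid` are hypothetical stand-ins.

```python
import random

def monte_carlo_bid(candidate_bids, sample_deal, score_bid, num_samples=100):
    """Pick the bid with the best average score over sampled deals."""
    totals = {bid: 0.0 for bid in candidate_bids}
    for _ in range(num_samples):
        # Each sample is a completed assignment of the hidden hands, drawn
        # to be consistent with the auction so far under the agent models.
        deal = sample_deal()
        for bid in candidate_bids:
            totals[bid] += score_bid(bid, deal)
    return max(candidate_bids, key=lambda bid: totals[bid] / num_samples)

# Toy usage with stand-in sampler and scorer (both made up for the example):
random.seed(0)
bids = ["pass", "1NT", "2C"]
print(monte_carlo_bid(bids,
                      sample_deal=lambda: random.random(),
                      score_bid=lambda bid, deal: deal if bid == "1NT" else 0.4))
```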


Citations
Book ChapterDOI
01 Jan 2012
TL;DR: Without modification, the basic reinforcement learning algorithms are rarely sufficient for high-level gameplay, so the additional ideas, ways of inserting domain knowledge, and implementation decisions necessary for scaling up are discussed.
Abstract: Reinforcement learning and games have a long and mutually beneficial common history. On one side, games are rich and challenging domains for testing reinforcement learning algorithms. On the other side, in several games the best computer players use reinforcement learning. The chapter begins with a selection of games and notable reinforcement learning implementations. Without any modifications, the basic reinforcement learning algorithms are rarely sufficient for high-level gameplay, so it is essential to discuss the additional ideas, ways of inserting domain knowledge, and implementation decisions that are necessary for scaling up. These are reviewed in sufficient detail to understand their potential and their limitations. The second part of the chapter lists challenges for reinforcement learning in games, together with a review of proposed solution methods. While this listing has a game-centric viewpoint, and some of the items are specific to games (like opponent modelling), a large portion of this overview can provide insight for other kinds of applications, too. In the third part we review how reinforcement learning can be useful in game development and find its way into commercial computer games. Finally, we provide pointers for more in-depth reviews of specific games and solution approaches.

47 citations

01 Jan 2007
TL;DR: The most important achievements of this field are highlighted and some important recent advances in game-playing applications are summarized.
Abstract: Game-playing applications offer various challenges for machine learning, including opening book learning, learning of evaluation functions, player modeling, and others. In this paper, we briefly highlight the most important achievements of this field and summarize some important recent advances.

26 citations


Cites background from "Learning to bid in bridge"

  • ...Traditionally, the field is concerned with learning in strategy games such as tic-tac-toe [19], checkers [22, 23], backgammon [34], chess [3, 5, 10, 21], Go [28], Othello [7], poker [4], or bridge [2]....

    [...]

Proceedings ArticleDOI
01 Apr 2007
TL;DR: It is shown that a special form of a neural network, called a self-organizing map (SOM), can be used to effectively bid no trump hands and is an ideal mechanism for modeling the imprecise and ambiguous nature of the game.
Abstract: Multiplayer games with imperfect information, such as Bridge, are especially challenging for game theory researchers. Although several algorithmic techniques have been successfully applied to the card play phase of the game, bidding requires a much different approach. We have shown that a special form of a neural network, called a self-organizing map (SOM), can be used to effectively bid no trump hands. The characteristic boundary that forms between resulting neighboring nodes in a SOM is an ideal mechanism for modeling the imprecise and ambiguous nature of the game.
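
As a rough illustration of the SOM technique named above (not the authors' actual system), the sketch below trains a small one-dimensional self-organizing map on toy hand features with NumPy; the feature choice, map size, and decay schedules are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one feature vector per hand, e.g. scaled (high-card points,
# longest-suit length). Purely illustrative.
hands = rng.random((500, 2))

# One-dimensional SOM with 10 nodes.
weights = rng.random((10, 2))

def train_som(weights, data, epochs=20, lr0=0.5, radius0=3.0):
    n_nodes = len(weights)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                    # decaying learning rate
        radius = max(radius0 * (1 - epoch / epochs), 0.5)  # shrinking neighborhood
        for x in data:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            dist = np.abs(np.arange(n_nodes) - bmu)
            influence = np.exp(-(dist ** 2) / (2 * radius ** 2))
            weights += lr * influence[:, None] * (x - weights)    # pull nodes toward x
    return weights

weights = train_som(weights, hands)
# Neighboring nodes now cover similar hand types; each node could be labeled
# with the bid most often correct for the hands that map to it.
print(np.round(weights, 2))
```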

19 citations


Cites background from "Learning to bid in bridge"

  • ...As with other attempts at using machine learning for Bridge bidding [8], the first step is to produce training examples....

    [...]

Posted Content
TL;DR: In this paper, a deep reinforcement learning model is proposed that learns to bid in bridge automatically from raw card data, without the aid of human domain knowledge.
Abstract: Bridge is among the zero-sum games for which artificial intelligence has not yet outperformed expert human players. The main difficulty lies in the bidding phase of bridge, which requires cooperative decision making under partial information. Existing artificial intelligence systems for bridge bidding rely on and are thus restricted by human-designed bidding systems or features. In this work, we propose a pioneering bridge bidding system without the aid of human domain knowledge. The system is based on a novel deep reinforcement learning model, which extracts sophisticated features and learns to bid automatically based on raw card data. The model includes an upper-confidence-bound algorithm and additional techniques to achieve a balance between exploration and exploitation. Our experiments validate the promising performance of our proposed model. In particular, the model advances from having no knowledge about bidding to achieving superior performance when compared with a champion-winning computer bridge program that implements a human-designed bidding system.
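
The exploration/exploitation balance mentioned in the abstract is the job of the upper-confidence-bound component. The sketch below shows generic UCB1 action selection on a toy bandit over candidate bids; it illustrates the rule itself, and the reward model is an assumption, not the paper's setup.

```python
import math
import random

def ucb1_choose(counts, total_rewards, t, c=2.0):
    """UCB1: pick the action maximizing mean reward plus an exploration bonus."""
    for a, n in enumerate(counts):
        if n == 0:
            return a  # try every action at least once
    return max(range(len(counts)),
               key=lambda a: total_rewards[a] / counts[a]
                             + math.sqrt(c * math.log(t) / counts[a]))

# Toy bandit: three candidate bids with hidden mean payoffs.
random.seed(1)
means = [0.2, 0.5, 0.4]
counts, rewards = [0, 0, 0], [0.0, 0.0, 0.0]
for t in range(1, 2001):
    a = ucb1_choose(counts, rewards, t)
    counts[a] += 1
    rewards[a] += random.gauss(means[a], 0.1)
print(counts)  # most pulls should concentrate on the best action (index 1)
```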

17 citations

Proceedings Article
01 Jan 2015
TL;DR: A novel learning framework is proposed to let a computer program learn its own bidding decisions, and the resulting system is found to perform competitively with the champion computer bridge program that mimics human bidding decisions.
Abstract: Contract bridge is an example of an incomplete-information game for which computers typically do not perform better than expert human bridge players. In particular, the typical bidding decisions of human bridge players are difficult to mimic with a computer program, and thus automatic bridge bidding remains a challenging research problem. Currently, the possibility of automatic bidding without mimicking human players has not been fully studied. In this work, we take the initiative to study such a possibility for the specific problem of bidding without competition. We propose a novel learning framework to let a computer program learn its own bidding decisions. The framework transforms the bidding problem into a learning problem, and then solves the problem with a carefully designed model that consists of cost-sensitive classifiers and upper-confidence-bound algorithms. We validate the proposed model and find that it performs competitively with the champion computer bridge program that mimics human bidding decisions.
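
The cost-sensitive part of the framework can be illustrated independently of how the classifiers are trained: given estimated probabilities over which bid is best for a hand, and a cost matrix over bids, the decision rule picks the bid with the lowest expected cost. The sketch below is a generic formulation with made-up numbers, not the paper's exact model.

```python
import numpy as np

def min_expected_cost_bid(class_probs, cost_matrix):
    """Cost-sensitive decision: choose the bid whose expected cost is lowest.

    class_probs[j]    -- estimated P(bid j is best | hand)
    cost_matrix[i, j] -- cost of choosing bid i when bid j was best
                         (e.g. score lost relative to the optimal contract)
    """
    expected = cost_matrix @ class_probs  # expected cost of each candidate bid
    return int(np.argmin(expected))

probs = np.array([0.1, 0.6, 0.3])
costs = np.array([[0.0, 2.0, 5.0],
                  [1.0, 0.0, 3.0],
                  [4.0, 2.0, 0.0]])
print(min_expected_cost_bid(probs, costs))  # -> 1
```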

13 citations


Cites methods from "Learning to bid in bridge"

  • ...For the first initiative toward allowing the machine to learn its own bidding system automatically, we use settings similar to existing works (Amit and Markovitch 2006; DeLooze and Downey 2007), and study the sub-problem of bidding without competition....

    [...]

  • ...By first constructing a decision network from a bidding system, Amit and Markovitch (2006) propose a Monte Carlo Sampling approach to decision making in the presence of conflicting bids....

    [...]

References
Journal ArticleDOI
TL;DR: In this article, two machine learning procedures are investigated in some detail using the game of checkers, and enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.
Abstract: Two machine-learning procedures have been investigated in some detail using the game of checkers. Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.

2,845 citations

Journal ArticleDOI
TL;DR: Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.
Abstract: Two machine-learning procedures have been investigated in some detail using the game of checkers. Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program. Furthermore, it can learn to do this in a remarkably short period of time (8 or 10 hours of machine-playing time) when given only the rules of the game, a sense of direction, and a redundant and incomplete list of parameters which are thought to have something to do with the game, but whose correct signs and relative weights are unknown and unspecified. The principles of machine learning verified by these experiments are, of course, applicable to many other situations.

1,191 citations

Journal ArticleDOI
TL;DR: The design considerations and architecture of the poker program Poki are described; Poki is capable of playing reasonably strong poker, but considerable research remains to be done before it can play at a world-class level.

299 citations


"Learning to bid in bridge" refers methods in this paper

  • ...The group developed two main poker systems, Poki (Billings et al., 1998, 1999, 2002) and PsOpti (Billings et al., 2003)....

    [...]

  • ...Such an approach was taken by Poki (Billings et al., 2002)....

    [...]

  • ...The reason why we could not use Poki’s method is the difference in the size of the space of hidden information....

    [...]

  • ...Poki evaluates hands by computing their expected return....

    [...]

Proceedings Article
09 Aug 2003
TL;DR: The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed, and linear programming solutions to the abstracted game are used to create substantially improved poker-playing programs.
Abstract: The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed. Several abstraction techniques are combined to represent the game of 2-player Texas Hold'em, having size O(10^18), using closely related models each having size O(10^7). Despite the reduction in size by a factor of 100 billion, the resulting models retain the key properties and structure of the real game. Linear programming solutions to the abstracted game are used to create substantially improved poker-playing programs, able to defeat strong human players and be competitive against world-class opponents.
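
To make the "linear programming solutions" step concrete, the sketch below computes a maximin (game-theoretic optimal) strategy for a tiny zero-sum matrix game with scipy.optimize.linprog. PsOpti itself solves far larger, sequence-form LPs over the abstracted game; this is only the textbook matrix-game version of the same idea.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(A):
    """Row player's maximin strategy for the zero-sum matrix game A.

    Maximize v subject to A^T x >= v, sum(x) = 1, x >= 0,
    with variables x (mixed strategy) and v (game value).
    """
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                               # linprog minimizes, so minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])  # v - (A^T x)_j <= 0 for each column j
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                          # sum(x) = 1 (v excluded)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Sanity check on rock-paper-scissors: optimal play is uniform, value 0.
A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
strategy, value = solve_matrix_game(A)
print(np.round(strategy, 3), round(value, 3))
```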

237 citations


"Learning to bid in bridge" refers background or methods in this paper

  • ...For larger spaces, several works extend the above approach by generalizing over states (Billings et al., 2003; Sen & Arora, 1997; Donkers, 2003)....

    [...]

  • ...The group developed two main poker systems, Poki (Billings et al., 1998, 1999, 2002) and PsOpti (Billings et al., 2003)....

    [...]

  • ...Such an approach was applied by Shi and Littman (2001), and by Billings et al. (2003), to the domain of poker....

    [...]

  • ...Some of the above works include algorithms for learning opponent agent models on the basis of their past behavior (Billings et al., 2003; Davidson et al., 2000; Sen and Arora, 1997; Bruce et al., 2002; Markovitch & Reger, 2005)....

    [...]

  • ...PsOpti performs a full expansion of the game tree but reduces the size of the set of possible states by partitioning the set into equivalence classes....

    [...]

Journal ArticleDOI
TL;DR: A description is provided of EPAM-III, a theory in the form of a computer program for simulating human verbal learning, along with a summary of the empirical evidence for its validity and comparisons with related architectures such as PANDEMONIUM-like systems and dataflow nets.

205 citations


"Learning to bid in bridge" refers methods in this paper

  • ...Our method of representing and learning a selection strategy by means of a decision net is related to early work on discrimination nets (Feigenbaum & Simon, 1984)....

    [...]