Journal ArticleDOI

Learning to bid in bridge

01 Jun 2006 - Machine Learning (Kluwer Academic Publishers) - Vol. 63, Iss. 3, pp. 287-327
TL;DR: A new decision-making algorithm that allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information is presented.
Abstract: Bridge bidding is considered to be one of the most difficult problems for game-playing programs. It involves four agents rather than two, including a cooperative agent. In addition, the partial observability of the game makes it impossible to predict the outcome of each action. In this paper we present a new decision-making algorithm that is capable of overcoming these problems. The algorithm allows models to be used for both opponent agents and partners, while utilizing a novel model-based Monte Carlo sampling method to overcome the problem of hidden information. The paper also presents a learning framework that uses the above decision-making algorithm for co-training of partners. The agents refine their selection strategies during training and continuously exchange their refined strategies. The refinement is based on inductive learning applied to examples accumulated for classes of states with conflicting actions. The algorithm was empirically evaluated on a set of bridge deals. The pair of agents that co-trained significantly improved their bidding performance to a level surpassing that of the current state-of-the-art bidding algorithm.
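
The model-based Monte Carlo idea in the abstract can be illustrated with a minimal sketch (an illustration of the general technique, not the paper's algorithm). Hidden deals are drawn by a caller-supplied sampler that is assumed to respect the models of the partner and opponents, and each candidate bid is scored by its average payoff over the samples; `sample_deal` and `score_bid` are hypothetical stand-ins.

```python
import random

def monte_carlo_bid(candidate_bids, sample_deal, score_bid, num_samples=100):
    """Pick the bid with the best average score over sampled deals."""
    totals = {bid: 0.0 for bid in candidate_bids}
    for _ in range(num_samples):
        # Each sample is a completed assignment of the hidden hands, drawn
        # to be consistent with the auction so far under the agent models.
        deal = sample_deal()
        for bid in candidate_bids:
            totals[bid] += score_bid(bid, deal)
    return max(candidate_bids, key=lambda bid: totals[bid] / num_samples)

# Toy usage with stand-in sampler and scorer (both made up for the example):
random.seed(0)
bids = ["pass", "1NT", "2C"]
print(monte_carlo_bid(bids,
                      sample_deal=lambda: random.random(),
                      score_bid=lambda bid, deal: deal if bid == "1NT" else 0.4))
```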


Citations
Book ChapterDOI
01 Jan 2012
TL;DR: Without modification, the basic reinforcement learning algorithms are rarely sufficient for high-level gameplay, so the additional ideas, ways of inserting domain knowledge, and implementation decisions necessary for scaling up are discussed.
Abstract: Reinforcement learning and games have a long and mutually beneficial common history. On one side, games are rich and challenging domains for testing reinforcement learning algorithms. On the other side, in several games the best computer players use reinforcement learning. The chapter begins with a selection of games and notable reinforcement learning implementations. Without any modifications, the basic reinforcement learning algorithms are rarely sufficient for high-level gameplay, so it is essential to discuss the additional ideas, ways of inserting domain knowledge, and implementation decisions that are necessary for scaling up. These are reviewed in sufficient detail to understand their potential and their limitations. The second part of the chapter lists challenges for reinforcement learning in games, together with a review of proposed solution methods. While this listing has a game-centric viewpoint, and some of the items are specific to games (like opponent modelling), a large portion of this overview can provide insight for other kinds of applications, too. In the third part we review how reinforcement learning can be useful in game development and find its way into commercial computer games. Finally, we provide pointers for more in-depth reviews of specific games and solution approaches.

47 citations

01 Jan 2007
TL;DR: The most important achievements of this field are highlighted and some important recent advances in game-playing applications are summarized.
Abstract: Game-playing applications offer various challenges for machine learning, including opening book learning, learning of evaluation functions, player modeling, and others. In this paper, we briefly highlight the most important achievements of this field and summarize some important recent advances.

26 citations


Cites background from "Learning to bid in bridge"

  • ...Traditionally, the field is concerned with learning in strategy games such as tic-tac-toe [19], checkers [22, 23], backgammon [34], chess [3, 5, 10, 21], Go [28], Othello [7], poker [4], or bridge [2]....

    [...]

Proceedings ArticleDOI
01 Apr 2007
TL;DR: It is shown that a special form of a neural network, called a self-organizing map (SOM), can be used to effectively bid no trump hands and is an ideal mechanism for modeling the imprecise and ambiguous nature of the game.
Abstract: Multiplayer games with imperfect information, such as Bridge, are especially challenging for game theory researchers. Although several algorithmic techniques have been successfully applied to the card play phase of the game, bidding requires a much different approach. We have shown that a special form of a neural network, called a self-organizing map (SOM), can be used to effectively bid no trump hands. The characteristic boundary that forms between resulting neighboring nodes in a SOM is an ideal mechanism for modeling the imprecise and ambiguous nature of the game.
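
As a rough illustration of the SOM technique named above (not the authors' actual system), the sketch below trains a small one-dimensional self-organizing map on toy hand features with NumPy; the feature choice, map size, and decay schedules are assumptions made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: one feature vector per hand, e.g. scaled (high-card points,
# longest-suit length). Purely illustrative.
hands = rng.random((500, 2))

# One-dimensional SOM with 10 nodes.
weights = rng.random((10, 2))

def train_som(weights, data, epochs=20, lr0=0.5, radius0=3.0):
    n_nodes = len(weights)
    for epoch in range(epochs):
        lr = lr0 * (1 - epoch / epochs)                    # decaying learning rate
        radius = max(radius0 * (1 - epoch / epochs), 0.5)  # shrinking neighborhood
        for x in data:
            bmu = np.argmin(np.linalg.norm(weights - x, axis=1))  # best-matching unit
            dist = np.abs(np.arange(n_nodes) - bmu)
            influence = np.exp(-(dist ** 2) / (2 * radius ** 2))
            weights += lr * influence[:, None] * (x - weights)    # pull nodes toward x
    return weights

weights = train_som(weights, hands)
# Neighboring nodes now cover similar hand types; each node could be labeled
# with the bid most often correct for the hands that map to it.
print(np.round(weights, 2))
```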

19 citations


Cites background from "Learning to bid in bridge"

  • ...As with other attempts at using machine learning for Bridge bidding [8], the first step is to produce training examples....

    [...]

Posted Content
TL;DR: In this paper, a deep reinforcement learning model is proposed that learns to bid in bridge automatically from raw card data, without the aid of human domain knowledge.
Abstract: Bridge is among the zero-sum games for which artificial intelligence has not yet outperformed expert human players. The main difficulty lies in the bidding phase of bridge, which requires cooperative decision making under partial information. Existing artificial intelligence systems for bridge bidding rely on and are thus restricted by human-designed bidding systems or features. In this work, we propose a pioneering bridge bidding system without the aid of human domain knowledge. The system is based on a novel deep reinforcement learning model, which extracts sophisticated features and learns to bid automatically based on raw card data. The model includes an upper-confidence-bound algorithm and additional techniques to achieve a balance between exploration and exploitation. Our experiments validate the promising performance of our proposed model. In particular, the model advances from having no knowledge about bidding to achieving superior performance when compared with a champion-winning computer bridge program that implements a human-designed bidding system.
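
The exploration/exploitation balance mentioned in the abstract is the job of the upper-confidence-bound component. The sketch below shows generic UCB1 action selection on a toy bandit over candidate bids; it illustrates the rule itself, and the reward model is an assumption, not the paper's setup.

```python
import math
import random

def ucb1_choose(counts, total_rewards, t, c=2.0):
    """UCB1: pick the action maximizing mean reward plus an exploration bonus."""
    for a, n in enumerate(counts):
        if n == 0:
            return a  # try every action at least once
    return max(range(len(counts)),
               key=lambda a: total_rewards[a] / counts[a]
                             + math.sqrt(c * math.log(t) / counts[a]))

# Toy bandit: three candidate bids with hidden mean payoffs.
random.seed(1)
means = [0.2, 0.5, 0.4]
counts, rewards = [0, 0, 0], [0.0, 0.0, 0.0]
for t in range(1, 2001):
    a = ucb1_choose(counts, rewards, t)
    counts[a] += 1
    rewards[a] += random.gauss(means[a], 0.1)
print(counts)  # most pulls should concentrate on the best action (index 1)
```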

17 citations

Proceedings Article
01 Jan 2015
TL;DR: A novel learning framework is proposed to let a computer program learn its own bidding decisions, and the resulting system is found to perform competitively with the champion computer bridge program that mimics human bidding decisions.
Abstract: Contract bridge is an example of an incomplete-information game for which computers typically do not perform better than expert human bridge players. In particular, the typical bidding decisions of human bridge players are difficult to mimic with a computer program, and thus automatic bridge bidding remains a challenging research problem. Currently, the possibility of automatic bidding without mimicking human players has not been fully studied. In this work, we take the initiative to study such a possibility for the specific problem of bidding without competition. We propose a novel learning framework to let a computer program learn its own bidding decisions. The framework transforms the bidding problem into a learning problem, and then solves the problem with a carefully designed model that consists of cost-sensitive classifiers and upper-confidence-bound algorithms. We validate the proposed model and find that it performs competitively with the champion computer bridge program that mimics human bidding decisions.
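
The cost-sensitive part of the framework can be illustrated independently of how the classifiers are trained: given estimated probabilities over which bid is best for a hand, and a cost matrix over bids, the decision rule picks the bid with the lowest expected cost. The sketch below is a generic formulation with made-up numbers, not the paper's exact model.

```python
import numpy as np

def min_expected_cost_bid(class_probs, cost_matrix):
    """Cost-sensitive decision: choose the bid whose expected cost is lowest.

    class_probs[j]    -- estimated P(bid j is best | hand)
    cost_matrix[i, j] -- cost of choosing bid i when bid j was best
                         (e.g. score lost relative to the optimal contract)
    """
    expected = cost_matrix @ class_probs  # expected cost of each candidate bid
    return int(np.argmin(expected))

probs = np.array([0.1, 0.6, 0.3])
costs = np.array([[0.0, 2.0, 5.0],
                  [1.0, 0.0, 3.0],
                  [4.0, 2.0, 0.0]])
print(min_expected_cost_bid(probs, costs))  # -> 1
```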

13 citations


Cites methods from "Learning to bid in bridge"

  • ...For the first initiative toward allowing the machine to learn its own bidding system automatically, we use settings similar to existing works (Amit and Markovitch 2006; DeLooze and Downey 2007), and study the sub-problem of bidding without competition....

    [...]

  • ...By first constructing a decision network from a bidding system, Amit and Markovitch (2006) propose a Monte Carlo Sampling approach to decision making in the presence of conflicting bids....

    [...]

References
Journal ArticleDOI
TL;DR: In this article, two machine learning procedures are investigated in some detail using the game of checkers, and enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.
Abstract: Two machine-learning procedures have been investigated in some detail using the game of checkers. Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.

2,845 citations

Journal ArticleDOI
TL;DR: Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program.
Abstract: Two machine-learning procedures have been investigated in some detail using the game of checkers. Enough work has been done to verify the fact that a computer can be programmed so that it will learn to play a better game of checkers than can be played by the person who wrote the program. Furthermore, it can learn to do this in a remarkably short period of time (8 or 10 hours of machine-playing time) when given only the rules of the game, a sense of direction, and a redundant and incomplete list of parameters which are thought to have something to do with the game, but whose correct signs and relative weights are unknown and unspecified. The principles of machine learning verified by these experiments are, of course, applicable to many other situations.

1,191 citations

Journal ArticleDOI
TL;DR: The design considerations and architecture of the poker program Poki are described; Poki is capable of playing reasonably strong poker, but considerable research remains to be done before it can play at a world-class level.

299 citations


"Learning to bid in bridge" refers methods in this paper

  • ...The group developed two main poker systems, Poki (Billings et al., 1998, 1999, 2002) and PsOpti (Billings et al., 2003)....

    [...]

  • ...Such an approach was taken by Poki (Billings et al., 2002)....

    [...]

  • ...The reason why we could not use Poki’s method is the difference in the size of the space of hidden information....

    [...]

  • ...Poki evaluates hands by computing their expected return....

    [...]

Proceedings Article
09 Aug 2003
TL;DR: The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed, and linear programming solutions to the abstracted game are used to create substantially improved poker-playing programs.
Abstract: The computation of the first complete approximations of game-theoretic optimal strategies for full-scale poker is addressed. Several abstraction techniques are combined to represent the game of 2-player Texas Hold'em, having size O(10^18), using closely related models each having size O(10^7). Despite the reduction in size by a factor of 100 billion, the resulting models retain the key properties and structure of the real game. Linear programming solutions to the abstracted game are used to create substantially improved poker-playing programs, able to defeat strong human players and be competitive against world-class opponents.
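
To make the "linear programming solutions" step concrete, the sketch below computes a maximin (game-theoretic optimal) strategy for a tiny zero-sum matrix game with scipy.optimize.linprog. PsOpti itself solves far larger, sequence-form LPs over the abstracted game; this is only the textbook matrix-game version of the same idea.

```python
import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(A):
    """Row player's maximin strategy for the zero-sum matrix game A.

    Maximize v subject to A^T x >= v, sum(x) = 1, x >= 0,
    with variables x (mixed strategy) and v (game value).
    """
    m, n = A.shape
    c = np.zeros(m + 1)
    c[-1] = -1.0                               # linprog minimizes, so minimize -v
    A_ub = np.hstack([-A.T, np.ones((n, 1))])  # v - (A^T x)_j <= 0 for each column j
    b_ub = np.zeros(n)
    A_eq = np.ones((1, m + 1))
    A_eq[0, -1] = 0.0                          # sum(x) = 1 (v excluded)
    b_eq = np.array([1.0])
    bounds = [(0, None)] * m + [(None, None)]  # x >= 0, v free
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[:m], res.x[-1]

# Sanity check on rock-paper-scissors: optimal play is uniform, value 0.
A = np.array([[0, -1, 1], [1, 0, -1], [-1, 1, 0]], dtype=float)
strategy, value = solve_matrix_game(A)
print(np.round(strategy, 3), round(value, 3))
```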

237 citations


"Learning to bid in bridge" refers background or methods in this paper

  • ...For larger spaces, several works extend the above approach by generalizing over states (Billings et al., 2003; Sen & Arora, 1997; Donkers, 2003)....

    [...]

  • ...The group developed two main poker systems, Poki (Billings et al., 1998, 1999, 2002) and PsOpti (Billings et al., 2003)....

    [...]

  • ...Such an approach was applied by Shi and Littman (2001), and by Billings et al. (2003), to the domain of poker....

    [...]

  • ...Some of the above works include algorithms for learning opponent agent models on the basis of their past behavior (Billings et al., 2003; Davidson et al., 2000; Sen and Arora, 1997; Bruce et al., 2002; Markovitch & Reger, 2005)....

    [...]

  • ...PsOpti performs a full expansion of the game tree but reduces the size of the set of possible states by partitioning the set into equivalence classes....

    [...]

Journal ArticleDOI
TL;DR: A description is provided of EPAM-III, a theory in the form of a computer program for simulating human verbal learning, along with a summary of the empirical evidence for its validity and comparisons with related architectures such as PANDEMONIUM-like systems and dataflow nets.

205 citations


"Learning to bid in bridge" refers methods in this paper

  • ...Our method of representing and learning a selection strategy by means of a decision net is related to early work on discrimination nets (Feigenbaum & Simon, 1984)....

    [...]