Showing papers on "Stochastic game" published in 1995

Open Access

Journal Article•DOI•

Word-of-Mouth Communication and Social Learning

Glenn Ellison¹, Drew Fudenberg²•Institutions (2)

Massachusetts Institute of Technology¹, Harvard University²

01 Feb 1995-Quarterly Journal of Economics

TL;DR: This article studied the effect of word-of-mouth communication on the behavior of a population of identical players in a stochastic decision environment and found that the structure of the communication process determines whether all agents end up making identical choices, with less communication making this conformity more likely.

Abstract: This paper studies the way that word-of-mouth communication aggregates the information of individual agents. We find that the structure of the communication process determines whether all agents end up making identical choices, with less communication making this conformity more likely. Despite the players' naive decision rules and the stochastic decision environment, word-of-mouth communication may lead all players to adopt the action that is on average superior. These socially efficient outcomes tend to occur when each agent samples only a few others.

I. INTRODUCTION

Economic agents must often make decisions without knowing the costs and benefits of the possible choices. Given the frequency with which such situations arise, it is understandable that agents often choose not to perform studies or experiments, but instead rely on whatever information they have obtained via casual word-of-mouth communication. Reliance on this sort of easily obtained information appears to be common in circumstances ranging from consumers choosing restaurants or auto mechanics to business managers evaluating alternative organizational structures. This paper studies two related environments in arguing that individuals' reliance on word-of-mouth communication has interesting implications for their aggregate behavior. First, motivated by the diffusion of new technologies, we consider a choice between two competing products with unequal qualities or payoffs, and show that the structure of communication is important in determining whether the population as a whole is likely to learn to use the superior product. Second, we consider a choice between two products or practices that are equally good, and ask whether consumers are likely to "herd" onto a single choice, or whether "diversity" will obtain even in the long run. We explore the implications of word-of-mouth communication in a simple nonstrategic environment. There is a large population of identical players, each of whom repeatedly chooses between two possible actions. Each player's payoff is determined by his own

903 citations

Proceedings Article•DOI•

Gambling in a rigged casino: The adversarial multi-armed bandit problem

Peter Auer¹, Nicolò Cesa-Bianchi¹, Yoav Freund¹, Robert E. Schapire¹•Institutions (1)

University of California, Santa Cruz¹

23 Oct 1995

TL;DR: A solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs is given.

Abstract: In the multi-armed bandit problem, a gambler must decide which arm of K non-identical slot machines to play in a sequence of trials so as to maximize his reward. This classical problem has received much attention because of the simple model it provides of the trade-off between exploration (trying out each arm to find the best one) and exploitation (playing the arm believed to give the best payoff). Past solutions for the bandit problem have almost always relied on assumptions about the statistics of the slot machines. In this work, we make no statistical assumptions whatsoever about the nature of the process generating the payoffs of the slot machines. We give a solution to the bandit problem in which an adversary, rather than a well-behaved stochastic process, has complete control over the payoffs. In a sequence of T plays, we prove that the expected per-round payoff of our algorithm approaches that of the best arm at the rate O(T^(-1/3)), and we give an improved rate of convergence when the best arm has fairly low payoff. We also consider a setting in which the player has a team of "experts" advising him on which arm to play; here, we give a strategy that will guarantee expected payoff close to that of the best expert. Finally, we apply our result to the problem of learning to play an unknown repeated matrix game against an all-powerful adversary.

807 citations
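
The exploration/exploitation idea in the abstract above can be illustrated with an exponential-weights bandit strategy of the kind commonly called Exp3. This is a minimal sketch, not the paper's exact algorithm or parameterization: the function names, the mixing parameter `gamma`, and the `reward_fn` callback are all assumptions introduced here for illustration. No statistical assumption is made about payoffs beyond their lying in [0, 1].

```python
import math
import random


def exp3_probabilities(weights, gamma):
    """Mix the normalized weights with uniform exploration (rate gamma)."""
    total = sum(weights)
    k = len(weights)
    return [(1.0 - gamma) * w / total + gamma / k for w in weights]


def run_exp3(reward_fn, num_arms, num_rounds, gamma=0.1, seed=0):
    """Play num_rounds rounds; reward_fn(t, arm) must return a payoff in [0, 1].

    reward_fn is a hypothetical callback standing in for the adversary.
    Returns (total_reward, final_weights).
    """
    rng = random.Random(seed)
    weights = [1.0] * num_arms
    total_reward = 0.0
    for t in range(num_rounds):
        probs = exp3_probabilities(weights, gamma)
        arm = rng.choices(range(num_arms), weights=probs)[0]
        reward = reward_fn(t, arm)  # only the chosen arm's payoff is observed
        total_reward += reward
        # Dividing by the selection probability keeps the estimate unbiased.
        estimate = reward / probs[arm]
        weights[arm] *= math.exp(gamma * estimate / num_arms)
        # Rescale so the largest weight is 1; probabilities are unchanged
        # and the weights cannot overflow on long runs.
        top = max(weights)
        weights = [w / top for w in weights]
    return total_reward, weights
```

For example, `run_exp3(lambda t, arm: 1.0 if arm == 2 else 0.0, num_arms=3, num_rounds=2000)` should concentrate its weight on arm 2, while the `gamma / k` term keeps sampling the other arms occasionally in case the adversary changes the payoffs.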

Patent•

Method and apparatus for playing a betting game including incorporating side betting which may be selected by a game player

Marvin A. Ornstein, Richard B. Hanbicki

22 Sep 1995

TL;DR: In this article, a chip receptacle is provided at each player's location of a blackjack table for accepting the side bet, and a player's key operated display selects a predetermined number of consecutive wins.

Abstract: A betting apparatus incorporated into a game of chance enabling a player to make a side bet. A chip receptacle is provided at each player's location of a blackjack table for accepting the side bet. A player's key operated display selects a predetermined number of consecutive wins. A microprocessor cooperating with a sensor identifies the denomination of one or more chips placed in the chip receptacle and, together with a number of consecutive wins selected by the player, displays a payoff amount for a selected number of consecutive wins. The hands are played following conventional rules. The betting receptacle cover seals the chips after completion of a betting phase, under control of the dealer, and signals the beginning of a new game. Each player's location is provided with a Loss button, operated by the dealer when a player loses. A Push button may be provided for each player position when that player has a hand equal in value to a dealer's hand to indicate a tie. The microprocessor adds one to the consecutive win count display when a player wins a game, each time the dealer's game button is operated. When the number of consecutive wins displayed equals the number of consecutive wins selected, an audio/visual alarm indicates a win. Other embodiments incorporate the betting apparatus in all casino games, including table games, slot machines and video games. The chip receptacle may be substituted by a coin receptacle in slot machine and video games.

305 citations

Journal Article•DOI•

How to Cope with Noise In the Iterated Prisoner's Dilemma

Jianzhong Wu¹, Robert Axelrod²•Institutions (2)

Chinese Academy of Sciences¹, University of Michigan²

01 Mar 1995-Journal of Conflict Resolution

TL;DR: This article showed that a contrite version of tit-for-tat is even more effective at quickly restoring mutual cooperation without the risk of exploitation when the other players have adapted to noise.

Abstract: Noise in the form of random errors in implementing a choice is a common problem in real-world interactions. Recent research has identified three approaches to coping with noise: adding generosity to a reciprocating strategy; adding contrition to a reciprocating strategy; and using an entirely different strategy, Pavlov, based on the idea of switching choice whenever the previous payoff was low. Tournament studies, ecological simulation, and theoretical analysis demonstrate (1) a generous version of tit-for-tat is a highly effective strategy when the players it meets have not adapted to noise; (2) if the other players have adapted to noise, a contrite version of tit-for-tat is even more effective at quickly restoring mutual cooperation without the risk of exploitation; and (3) Pavlov is not robust.

298 citations
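
The strategies named in the abstract above can be tried out in a small noisy iterated prisoner's dilemma. This is a rough sketch under stated assumptions: the payoff values (5, 3, 1, 0) are the conventional ones and are not given in the abstract, the generosity level is arbitrary, Pavlov is implemented as "repeat the last move if the last payoff was at least 3, otherwise switch", and contrite tit-for-tat is omitted. All function names are hypothetical.

```python
import random

# Assumed conventional payoffs: temptation 5, reward 3, punishment 1, sucker 0.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own, other, payoffs, rng):
    """Cooperate first, then copy the opponent's last implemented move."""
    return "C" if not other else other[-1]


def make_generous_tft(generosity=0.1):
    """Tit-for-tat that forgives a defection with probability `generosity`."""
    def generous_tft(own, other, payoffs, rng):
        if other and other[-1] == "D" and rng.random() >= generosity:
            return "D"
        return "C"
    return generous_tft


def pavlov(own, other, payoffs, rng):
    """Keep the last move after a high payoff, switch after a low one."""
    if not own:
        return "C"
    if payoffs[-1] >= 3:
        return own[-1]
    return "D" if own[-1] == "C" else "C"


def flip(move):
    return "D" if move == "C" else "C"


def play(strategy_a, strategy_b, rounds=200, noise=0.0, seed=0):
    """Return total scores; each intended move is flipped with prob. noise."""
    rng = random.Random(seed)
    hist_a, hist_b, pay_a, pay_b = [], [], [], []
    for _ in range(rounds):
        move_a = strategy_a(hist_a, hist_b, pay_a, rng)
        move_b = strategy_b(hist_b, hist_a, pay_b, rng)
        if rng.random() < noise:  # implementation error for player A
            move_a = flip(move_a)
        if rng.random() < noise:  # implementation error for player B
            move_b = flip(move_b)
        score_a, score_b = PAYOFF[(move_a, move_b)]
        hist_a.append(move_a)
        hist_b.append(move_b)
        pay_a.append(score_a)
        pay_b.append(score_b)
    return sum(pay_a), sum(pay_b)
```

With `noise=0.0`, any pairing of these strategies cooperates throughout. Raising `noise` (for example `play(tit_for_tat, tit_for_tat, noise=0.05)`) lets a single error echo between two plain tit-for-tat players, which is the problem the generous and contrite variants are meant to address.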

Journal Article•DOI•

Zero-sum stochastic differential games and backward equations

Said Hamadène, J. P. Lepeltier

10 Mar 1995-Systems & Control Letters

TL;DR: In this article, the existence of a saddle point in the bounded case is obtained if the Isaacs' condition holds, and this technique is also a very simple approach for finding an optimal strategy in the case of controlled diffusions.

266 citations

Journal Article•DOI•

A Folk Theorem for Stochastic Games

Prajit K. Dutta¹•Institutions (1)

University of Wisconsin-Madison¹

01 Jun 1995-Journal of Economic Theory

TL;DR: In this paper, a folk theorem for stochastic games is presented, which subsumes a number of results obtained earlier and applies to a wide range of games studied in the economics literature.

256 citations

Journal Article•DOI•

Lagrangians of physics and the game of Fisher-information transfer

B. Roy Frieden¹, Bernard H. Soffer²•Institutions (2)

University of Arizona¹, HRL Laboratories²

01 Sep 1995-Physical Review E

TL;DR: The EPI approach provides an understanding of the relationship between measurement and physical law and derives the Lagrangian that implies both the Klein-Gordon equation and the Dirac equation of quantum mechanics.

Abstract: The Lagrangians of physics arise out of a mathematical game between a "smart" measurer and nature (personified by a demon). Each contestant wants to maximize his level of Fisher information I. The game is zero sum, by conservation of information in the closed system. The payoff of the game introduces a variational principle, extreme physical information (EPI), which fixes both the Lagrangian and the physical constant of each scenario. The EPI approach provides an understanding of the relationship between measurement and physical law. EPI also defines a prescription for constructing Lagrangians. The prior knowledge required for this purpose is a rule of symmetry or conservation that implies a unitary transformation for which I remains invariant. As an example, when applied to the smart measurement of the space-time coordinate of a particle, the symmetry used is that between position-time space and momentum-energy space. Then the unitary transformation is the Fourier one, and EPI derives the following: the equivalence of energy, momentum, and mass; the constancy of Planck's parameter h; and the Lagrangian that implies both the Klein-Gordon equation and the Dirac equation of quantum mechanics.

237 citations

Journal Article•DOI•

Games with Unique, Mixed Strategy Equilibria: An Experimental Study

Jack Ochs¹•Institutions (1)

University of Pittsburgh¹

01 Jul 1995-Games and Economic Behavior

TL;DR: In this paper, the results of an experiment studying the choices of subjects playing mixed extensions of three variants of simple 2 × 2 non-constant sum, strictly competitive games of the same form (Matching Pennies) are presented.

216 citations

Journal Article•DOI•

Risk dominance and coordination failures in static games

Paul G. Straub¹•Institutions (1)

University of Rio Grande¹

01 Dec 1995-The Quarterly Review of Economics and Finance

TL;DR: In this article, the authors compare the use of Harsanyi and Selten's risk dominance and payoff dominance as equilibrium selection criteria, and present experimental evidence that suggests the existence of a payoff dominated risk dominant equilibrium is a necessary and sufficient condition for coordination failure.

186 citations

Journal Article•DOI•

Revisiting Dynamic Duopoly with Consumer Switching Costs

A. Jorge Padilla¹•Institutions (1)

CEMFI¹

01 Dec 1995-Journal of Economic Theory

TL;DR: In this paper, the degree of collusiveness of a market with consumer switching costs is analyzed in an infinite-horizon model of duopolistic competition, where firms compete for the demand for a homogeneous good by setting prices simultaneously in each period.

153 citations

Journal Article•DOI•

An evolutionary approach to pre-play communication

Yong-Gwan Kim, Joel Sobel

01 Sep 1995-Econometrica

TL;DR: In this paper, the authors characterize the set of strategies that are stable with respect to a stochastic dynamic adaptive process in a finite two-player game played by a population of players.

Abstract: We add a round of pre-play communication to a finite two-player game played by a population of players. Pre-play communication is cheap talk in the sense that it does not directly enter the payoffs. The paper characterizes the set of strategies that are stable with respect to a stochastic dynamic adaptive process. Periodically players have an opportunity to change their strategy with a strategy that is more successful against the current population. Any strategy that weakly improves upon the current poorest performer in the population enters with positive probability. When there is no conflict of interest between the players, only the efficient outcome is stable with respect to these dynamics. For general games the set of stable payoffs is typically large. Every efficient payoff recurs infinitely often.

Journal Article•DOI•

Coordination in market entry games with symmetric players

James A. Sundali, Amnon Rapoport, Darryl A. Seale

01 Nov 1995-Organizational Behavior and Human Decision Processes

TL;DR: In this article, the results of two experiments designed to study tacit coordination in a class of market entry games with linear payoff functions, binary decisions, and zero entry costs are reported, in which each of n = 20 players must decide on each trial whether or not to enter a market whose capacity is public knowledge.

Journal Article•DOI•

Learning in extensive-form games I. Self-confirming equilibria

Drew Fudenberg¹, David M. Kreps²,³•Institutions (3)

Harvard University¹, Tel Aviv University², Stanford University³

01 Jan 1995-Games and Economic Behavior

TL;DR: In this paper, a group of individuals repeatedly play a fixed extensive-form game, using past play to forecast future actions, and each player maximizes his own immediate expected payoff, believing that others' play corresponds to the historical frequencies of past play.

Journal Article•DOI•

Perfect Equilibria in a Negotiation Model

Lutz-Alexander Busch, Quan Wen

01 May 1995-Econometrica

TL;DR: In this article, it was shown that the negotiation game can have multiple perfect equilibria, including inefficient ones, provided that players are sufficiently patient, and the length of delay depends only on the payoff structure of the disagreement game and not on the discount factor.

Abstract: Rubinstein's alternating-offers bargaining model is enriched by assuming that players' payoffs in disagreement periods are determined by a normal form game. It is shown that such a model can have multiple perfect equilibria, including inefficient ones, provided that players are sufficiently patient. Delay is possible even though there is perfect information and the players are fully rational. The length of delay depends only on the payoff structure of the disagreement game and not on the discount factor. Not all feasible and individually rational payoffs of the disagreement game can be supported as average disagreement payoffs. Indeed, some negotiation games have a unique perfect equilibrium with immediate agreement.

Journal Article•DOI•

A Subexponential Randomized Algorithm for the Simple Stochastic Game Problem

W. Ludwig¹•Institutions (1)

University of Wisconsin System¹

15 Feb 1995-Information & Computation

TL;DR: This work describes a randomized algorithm for the simple stochastic game problem that requires 2^O(√n) expected operations for games with n vertices and is the first subexponential time algorithm for this problem.

Abstract: We describe a randomized algorithm for the simple stochastic game problem that requires 2^O(√n) expected operations for games with n vertices. This is the first subexponential time algorithm for this problem.

Journal Article•DOI•

Order independent equilibria

Benny Moldovanu, Eyal Winter

01 Apr 1995-Games and Economic Behavior

TL;DR: In this paper, order independent equilibria (OIE) were introduced for noncooperative games of coalition formation, based on an underlying game in coalitional form, and a strategy profile is an OIE if, for any specification of first movers in the sequential game, it remains an equilibrium and leads to the same payoff.

Journal Article•

Limited horizon forecast in repeated alternate games

Philippe Jehiel

01 Dec 1995-Journal of Economic Theory

TL;DR: In this article, the authors characterized hyperstable (n₁, n₂)-solutions for repeated alternating-move 2 × 2 games with finite action spaces, where the memory capacity of the players has no influence on the set of solutions.

Journal Article•DOI•

Limited Horizon Forecast in Repeated Alternate Games

Philippe Jehiel

01 Dec 1995-Journal of Economic Theory

TL;DR: In this article, the authors show that an (n₁, n₂)-equilibrium solution always exists and is cyclical, and the memory capacity of the players has no influence on the set of solutions.

Journal Article•DOI•

Bayesian Learning in Repeated Games

J. S. Jordan¹•Institutions (1)

University of Minnesota¹

01 Apr 1995-Games and Economic Behavior

TL;DR: In this paper, the authors studied repeated games with a finite number of players, a fixed number of actions, discounted payoffs, and perfect recall, and the players' initial expectations were given by a common prior distribution over player types, a type being a discount rate and payoff matrix.

Book•

New trends in dynamic games and applications

Geert Jan Olsder

01 Jan 1995

TL;DR: This volume collects papers on dynamic games, among them a linear pursuit-evasion game with a state constraint for a highly maneuverable evader and a turnpike theory for infinite-horizon open-loop differential games with decoupled controls.

Abstract:
I. Minimax control: Expected Values, Feared Values, and Partial Information Optimal Control; H∞-Control of Nonlinear Singularly Perturbed Systems and Invariant Manifolds; A Hybrid (Differential-Stochastic) Zero-Sum Game with Fast Stochastic Part; H∞-Control of Markovian Jump Systems and Solutions to Associated Piecewise-Deterministic Differential Games; The Big Match on the Integers.
II. Pursuit evasion: Synthesis of Optimal Strategies for Differential Games by Neural Networks; A Linear Pursuit-Evasion Game with a State Constraint for a Highly Maneuverable Evader; Three-Dimensional Air Combat: Numerical Solution of Complex Differential Games; Control of Informational Sets in a Pursuit Problem; Decision Support System for Medium Range Aerial Duels Combining Elements of Pursuit-Evasion Game Solutions with AI Techniques; Optimal Selection of Observation Times in a Costly Information Game; Pursuit Games with Costly Information: Application to the ASW Helicopter Versus Submarine Game; Linear Avoidance in the Case of Interaction of Controlled Objects Groups.
III. Solution methods: Convergence of Discrete Schemes for Discontinuous Value Functions of Pursuit-Evasion Games; Undiscounted Zero Sum Differential Games with Stopping Times; Guarantee Result in Differential Games with Terminal Payoff.
IV. Nonzero-sum games, theory: Lyapunov Iterations for Solving Coupled Algebraic Riccati Equations of Nash Differential Games and the Algebraic Riccati Equation of Zero-Sum Games; A Turnpike Theory for Infinite Horizon Open-Loop Differential Games with Decoupled Controls; Team-Optimal Closed-Loop Stackelberg Strategies for Discrete-Time Descriptor Systems; On Independence of Irrelevant Alternatives and Dynamic Programming in Dynamic Bargaining Games; The Shapley Value for Differential Games.
V. Nonzero-sum games, applications: Dynamic Game Theory and Management Strategy; Endogenous Growth as a Dynamic Game; Searching for Degenerate Dynamics in Animal Conflict Game Models involving Sexual Reproduction.

Journal Article•DOI•

Stability by Mutation in Evolutionary Games

Immanuel M. Bomze, Reinhard Bürger

01 Nov 1995-Games and Economic Behavior

TL;DR: In this article, the authors introduce a dynamical model of mutation in evolutionary games, in which all possible mixtures of n pure strategies are admitted, and the case of n = 2 pure strategies is investigated in detail.

Journal Article•DOI•

Endogenous Timing in Two-Player Games: A Counterexample

Rabah Amir¹•Institutions (1)

Congress of Racial Equality¹

01 May 1995-Games and Economic Behavior

TL;DR: In this paper, it was shown that monotonicity of the best-response functions in a two-player game is not sufficient to derive predictions about the order of moves.

Journal Article•DOI•

Two-person ratio efficiency games

John J. Rousseau, John H. Semple

01 Mar 1995-Management Science

TL;DR: In this article, it is shown that a class of two-person games with ratio payoff functions can be solved using equivalent primal-dual linear programming formulations, and that the game's solution may be used to conduct the efficiency evaluation currently done by the CCR ratio model of Data Envelopment Analysis (DEA).

Abstract: This paper demonstrates that a class of two-person games with ratio payoff functions can be solved using equivalent primal-dual linear programming formulations. The game's solution contains specialized information which may be used to conduct the efficiency evaluation currently done by the CCR ratio model of Data Envelopment Analysis (DEA). Consequently, a rigorous connection between DEA's CCR model and the theory of games is established. Interpretations of these new solutions are discussed in the context of current ongoing applications.

Journal Article•DOI•

Cooperation and effective computability

Luca Anderlini, Hamid Sabourian

01 Nov 1995-Econometrica

TL;DR: In this article, the authors consider the undiscounted repeated game obtained by the infinite repetition of such a two-player stage game and show that if supergame strategies are restricted to be computable within Church's thesis, the only pair of payoffs which survives any computable tremble with sufficiently large support is the Pareto-efficient pair.

Abstract: A common interest game is a game in which there exists a unique pair of payoffs which strictly Pareto-dominates all other payoffs. We consider the undiscounted repeated game obtained by the infinite repetition of such a two-player stage game. We show that if supergame strategies are restricted to be computable within Church's thesis, the only pair of payoffs which survives any computable tremble with sufficiently large support is the Pareto-efficient pair. The result is driven by the ability of the players to use the early stages of the game to communicate their intention to play cooperatively in the future.

Journal Article•DOI•

Zero-sum Markov games and worst-case optimal control of queueing systems

Eitan Altman¹, Arie Hordijk²•Institutions (2)

French Institute for Research in Computer Science and Automation¹, Leiden University²

01 Sep 1995-Queueing Systems

TL;DR: A survey of stochastic games in queues, where both tools and applications are considered, and the structural properties of best policies of the controller, worst-case policies of nature, and of the value function are illustrated.

Abstract: Zero-sum stochastic games model situations where two persons, called players, control some dynamic system, and both have opposite objectives. One player wishes typically to minimize a cost which has to be paid to the other player. Such a game may also be used to model problems with a single controller who has only partial information on the system: the dynamic of the system may depend on some parameter that is unknown to the controller, and may vary in time in an unpredictable way. A worst-case criterion may be considered, where the unknown parameter is assumed to be chosen by “nature” (called player 1), and the objective of the controller (player 2) is then to design a policy that guarantees the best performance under worst-case behaviour of nature. The purpose of this paper is to present a survey of stochastic games in queues, where both tools and applications are considered. The first part is devoted to the tools. We present some existing tools for solving finite horizon and infinite horizon discounted Markov games with unbounded cost, and develop new ones that are typically applicable in queueing problems. We then present some new tools and theory of expected average cost stochastic games with unbounded cost. In the second part of the paper we present a survey on existing results on worst-case control of queues, and illustrate the structural properties of best policies of the controller, worst-case policies of nature, and of the value function. Using the theory developed in the first part of the paper, we extend some of the above results, which were known to hold for finite horizon costs or for the discounted cost, to the expected average cost.

Journal Article•DOI•

An Efficient Approach for Pricing Spread Options

Neil D. Pearson¹•Institutions (1)

University of Illinois at Urbana–Champaign¹

31 Aug 1995-Journal of Derivatives

TL;DR: In this article, an analytic expression for the integral of the option payoffs over the conditional density is available, and the remaining integration amounts to valuing the payoff function given by the results of the first integration.

Abstract: Spread options are options whose payoff is based on the difference in the prices of two underlying assets. The price of a spread option is the (discounted) double integral of the option payoffs over the risk-neutral joint distribution of the terminal prices of the two underlying assets. Analytic expressions for the values of spread puts and calls in a Black-Scholes environment are not known, and various numerical algorithms must be used. This article presents an accurate and efficient approach for pricing European-style spread options on equities, foreign currencies, and commodities. The key to the approach is to recognize that the joint density of the terminal prices of the underlying assets can be factored into the product of univariate marginal and conditional densities, and that an analytic expression for the integral of the option payoffs over the conditional density is available. The remaining integration amounts to valuing the payoff function given by the results of the first integration. This payoff function is approximated by a portfolio of ordinary puts and calls, and valued accordingly. The approach is more accurate than existing bivariate binomial schemes, and fast enough for practical applications. It also allows for accurate and efficient computation of the partial derivatives of the option price, i.e., the Greek letter risks.
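
The factorization described in the abstract above can be written out as follows. This is only a notational sketch: the symbols r (risk-free rate), T (expiry), g (option payoff), f (risk-neutral joint density of the terminal prices), and the choice to condition the first price on the second are assumptions introduced here, not taken from the article.

```latex
\begin{align*}
V &= e^{-rT} \int_0^\infty \int_0^\infty g(s_1, s_2)\, f(s_1, s_2)\, ds_1\, ds_2, \\
f(s_1, s_2) &= f_2(s_2)\, f_{1\mid 2}(s_1 \mid s_2), \\
V &= e^{-rT} \int_0^\infty
     \underbrace{\left[ \int_0^\infty g(s_1, s_2)\, f_{1\mid 2}(s_1 \mid s_2)\, ds_1 \right]}_{h(s_2)}
     f_2(s_2)\, ds_2 .
\end{align*}
```

Here h(s_2) plays the role of the payoff function that, per the abstract, results from the first integration and is then approximated by a portfolio of ordinary puts and calls.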

Book Chapter•DOI•

The Complexity of Mean Payoff Games

Uri Zwick¹, Mike Paterson²•Institutions (2)

Tel Aviv University¹, University of Warwick²

24 Aug 1995

TL;DR: A pseudopolynomial time algorithm for the solution of mean payoff games, a family of perfect information games introduced by Ehrenfeucht and Mycielski, the decision problem for which is in NP ∩ co-NP.

Abstract: We study the complexity of finding the values and optimal strategies of mean payoff games, a family of perfect information games introduced by Ehrenfeucht and Mycielski. We describe a pseudopolynomial time algorithm for the solution of such games, the decision problem for which is in NP ∩ co-NP. Finally, we describe a polynomial reduction from mean payoff games to the simple stochastic games studied by Condon. These games are also known to be in NP ∩ co-NP, but no polynomial or pseudo-polynomial time algorithm is known for them.
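
To make the object of study concrete, the sketch below computes finite-horizon values of a mean payoff game by backward recursion: the value of a k-move game from each vertex, where one player maximizes and the other minimizes the total edge weight. Dividing by k approximates the long-run mean payoff for large k. Whether this matches the paper's algorithm in detail is an assumption; the graph encoding (`edges`, `owner`) and the function name are hypothetical.

```python
def k_step_values(edges, owner, k):
    """Optimal total payoff of the k-move game started at each vertex.

    edges[v] is a list of (successor, weight) pairs (every vertex needs at
    least one outgoing edge); owner[v] is "max" or "min" and says which
    player chooses the move at v.
    """
    values = [0] * len(edges)  # zero moves left: nothing is collected
    for _ in range(k):
        new_values = []
        for v, out_edges in enumerate(edges):
            candidates = [weight + values[succ] for succ, weight in out_edges]
            pick = max if owner[v] == "max" else min
            new_values.append(pick(candidates))
        values = new_values
    return values


# Vertex 0 (maximizer) may loop on itself for weight 1 or move to vertex 1
# for weight 2; vertex 1 (minimizer) must return to vertex 0 for weight 4.
edges = [[(0, 1), (1, 2)], [(0, 4)]]
owner = ["max", "min"]
values = k_step_values(edges, owner, 100)
print([value / 100 for value in values])
```

Each round of the recursion costs time proportional to the number of edges, so the total work grows with the horizon k; the weights enter only through additions and comparisons.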

Journal Article•DOI•

Equilibrium solutions for multiobjective bimatrix games incorporating fuzzy goals

Ichiro Nishizaki¹, Masatoshi Sakawa²•Institutions (2)

Setsunan University¹, Hiroshima University²

01 Aug 1995-Journal of Optimization Theory and Applications

TL;DR: In this paper, equilibrium solutions in terms of the degree of attainment of a fuzzy goal are examined for games in fuzzy and multiobjective environments, and a fuzzy goal for a payoff and the corresponding equilibrium solution are defined.

Abstract: Equilibrium solutions in terms of the degree of attainment of a fuzzy goal for games in fuzzy and multiobjective environments are examined. We introduce a fuzzy goal for a payoff in order to incorporate ambiguity of human judgments and assume that a player tries to maximize his degree of attainment of the fuzzy goal. A fuzzy goal for a payoff and the equilibrium solution with respect to the degree of attainment of a fuzzy goal are defined. Two basic methods, one by weighting coefficients and the other by a minimum component, are employed to aggregate multiple fuzzy goals. When the membership functions are linear, computational methods for the equilibrium solutions are developed. It is shown that the equilibrium solutions are equal to the optimal solutions of mathematical programming problems in both cases. The relations between the equilibrium solutions for multiobjective bimatrix games incorporating fuzzy goals and the Pareto-optimal equilibrium solutions are considered.

Journal Article•DOI•

The nonexistence of symmetric equilibria in anonymous games with compact action spaces

Kali P. Rath¹, Yeneng Sun², Shinji Yamashige³•Institutions (3)

University of Notre Dame¹, National University of Singapore², University of Toronto³

01 Jan 1995-Journal of Mathematical Economics

TL;DR: In an anonymous game, the payoff of a player depends upon the player's own action and the action distribution of all the players; it is shown that if the set of actions is finite, or countably infinite and compact, then there is a symmetric equilibrium distribution.

Proceedings Article•DOI•

Efficient algorithms for learning to play repeated games against computationally bounded adversaries

Yoav Freund¹, Michael Kearns¹, Yishay Mansour², Dana Ron², Ronitt Rubinfeld², Robert E. Schapire²•Institutions (2)

Bell Labs¹, Alcatel-Lucent²

23 Oct 1995

TL;DR: This work examines games and adversaries for which the learning algorithm's past actions may strongly affect the adversary's future willingness to "cooperate" (that is, permit high payoff), and therefore require carefully planned actions on the part of the learning algorithms.

Abstract: We examine the problem of learning to play various games optimally against resource-bounded adversaries, with an explicit emphasis on the computational efficiency of the learning algorithm. We are especially interested in providing efficient algorithms for games other than penny-matching (in which payoff is received for matching the adversary's action in the current round), and for adversaries other than the classically studied finite automata. In particular, we examine games and adversaries for which the learning algorithm's past actions may strongly affect the adversary's future willingness to "cooperate" (that is, permit high payoff), and therefore require carefully planned actions on the part of the learning algorithm. For example, in the game we call contract, both sides play 0 or 1 on each round, but our side receives payoff only if we play 1 in synchrony with the adversary; unlike penny-matching, playing 0 in synchrony with the adversary pays nothing. The name of the game is derived from the example of signing a contract, which becomes valid only if both parties sign (play 1).
