
Showing papers on "Stochastic game" published in 2004


Book ChapterDOI
01 Jan 2004
TL;DR: Game theory is a powerful tool for analyzing situations in which the decisions of multiple agents affect each agent's payoff; as such, it deals with interactive optimization problems, and its developments include games with imperfect information and auctions.
Abstract: Game theory (hereafter GT) is a powerful tool for analyzing situations in which the decisions of multiple agents affect each agent’s payoff. As such, GT deals with interactive optimization problems. While many economists in the past few centuries have worked on what can be considered game-theoretic models, John von Neumann and Oskar Morgenstern are formally credited as the fathers of modern game theory. Their classic book “Theory of Games and Economic Behavior”, von Neumann and Morgenstern (1944), summarizes the basic concepts existing at that time. GT has since enjoyed an explosion of developments, including the concept of equilibrium by Nash (1950), games with imperfect information by Kuhn (1953), cooperative games by Aumann (1959) and Shubik (1962) and auctions by Vickrey (1961), to name just a few. Citing Shubik (2002), “In the 50s ... game theory was looked upon as a curiosum not to be taken seriously by any behavioral scientist. By the late 1980s, game theory in the new industrial organization has taken over ... game theory has proved its success in many disciplines.”

691 citations


Journal ArticleDOI
TL;DR: The authors compare a binary-choice Trust game with a structurally identical, binary-choice Risky Dictator game offering a good or a bad outcome, and elicit individuals' minimum acceptable probabilities (MAPs) of getting the good outcome such that they would prefer the gamble to the sure payoff.
Abstract: Using experiments, we examine whether the decision to trust a stranger in a one-shot interaction is equivalent to taking a risky bet, or if a trust decision entails an additional risk premium to balance the costs of trust betrayal. We compare a binary-choice Trust game with a structurally identical, binary-choice Risky Dictator game offering a good or a bad outcome. We elicit individuals’ minimum acceptable probabilities (MAPs) of getting the good outcome such that they would prefer the gamble to the sure payoff. First movers state higher MAPs in the Trust game than in situations where nature determines the outcome.

569 citations


Journal ArticleDOI
TL;DR: This paper analyzes a model of rational word-of-mouth learning, in which successive generations of agents make once-and-for-all choices between two alternatives, and investigates a range of biased sampling rules, such as those that over-represent popular or successful choices, and which ones favor global convergence towards efficiency.

444 citations


Journal ArticleDOI
TL;DR: This paper identifies games in which equilibria are approximately optimal in the sense that no other outcome achieves a significantly larger total payoff to the players—games in which optimization by individuals approximately optimizes the social good, in spite of the lack of coordination between players.

341 citations


Proceedings ArticleDOI
11 Jan 2004
TL;DR: The existence of optimal pure memoryless strategies, together with the polynomial-time solution for the one-player case, implies that the quantitative two-player stochastic parity game problem is in NP ∩ co-NP, which generalizes a result of Condon for stochastic games with reachability objectives.
Abstract: We study perfect-information stochastic parity games. These are two-player nonterminating games which are played on a graph with turn-based probabilistic transitions. A play results in an infinite path and the conflicting goals of the two players are ω-regular path properties, formalized as parity winning conditions. The qualitative solution of such a game amounts to computing the set of vertices from which a player has a strategy to win with probability 1 (or with positive probability). The quantitative solution amounts to computing the value of the game in every vertex, i.e., the highest probability with which a player can guarantee satisfaction of his own objective in a play that starts from the vertex. For the important special case of one-player stochastic parity games (parity Markov decision processes) we give polynomial-time algorithms both for the qualitative and the quantitative solution. The running time of the qualitative solution is O(d · m^{3/2}) for graphs with m edges and d priorities. The quantitative solution is based on a linear-programming formulation. For the two-player case, we establish the existence of optimal pure memoryless strategies. This has several important ramifications. First, it implies that the values of the games are rational. This is in contrast to the concurrent stochastic parity games of de Alfaro et al.; there, values are in general algebraic numbers, optimal strategies do not exist, and ε-optimal strategies have to be mixed and with infinite memory. Second, the existence of optimal pure memoryless strategies together with the polynomial-time solution for the one-player case implies that the quantitative two-player stochastic parity game problem is in NP ∩ co-NP. This generalizes a result of Condon for stochastic games with reachability objectives. It also constitutes an exponential improvement over the best previous algorithm, which is based on a doubly exponential procedure of de Alfaro and Majumdar for concurrent stochastic parity games and provides only ε-approximations of the values.
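The notion of a game value in the abstract above can be illustrated on the simpler reachability objective that it mentions. The following is a minimal sketch, not the paper's algorithm (which relies on linear programming and on the existence of pure memoryless strategies): it runs plain value iteration on a tiny turn-based stochastic game. The graph, the vertex names, and the uniform-probability rule for random vertices are assumptions made for illustration.

```python
def reachability_values(kind, successors, target, sweeps=100):
    """Approximate, from below, the probability of reaching `target`.

    kind[v] is "max", "min", or "avg". An "avg" vertex moves to each of its
    successors with equal probability (an assumption made for brevity).
    Every vertex must have at least one successor.
    """
    values = {v: 1.0 if v in target else 0.0 for v in kind}
    for _ in range(sweeps):
        updated = {}
        for v in kind:
            succ = [values[w] for w in successors[v]]
            if v in target:
                updated[v] = 1.0                  # target already reached
            elif kind[v] == "max":
                updated[v] = max(succ)            # maximizer picks the best successor
            elif kind[v] == "min":
                updated[v] = min(succ)            # minimizer picks the worst successor
            else:
                updated[v] = sum(succ) / len(succ)  # uniform random move
        values = updated
    return values

# Hypothetical six-vertex game; "goal" is the target, "sink" is a losing trap.
kind = {"s": "max", "r1": "avg", "r2": "avg", "m": "min", "goal": "max", "sink": "max"}
successors = {
    "s": ["r1", "r2"],
    "r1": ["goal", "sink"],
    "r2": ["goal", "m"],
    "m": ["r1", "goal"],
    "goal": ["goal"],
    "sink": ["sink"],
}
values = reachability_values(kind, successors, target={"goal"})
```

In this example the maximizer at `s` prefers `r2`, and the resulting values are rational numbers, in line with the rationality property stated in the abstract for turn-based games.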

196 citations


01 Feb 2004
TL;DR: This paper studies subgame perfect equilibria of leadership games for the mixed extension of a finite game, where the leader commits to a mixed strategy, and shows that in a generic two-player game the leader payoff is unique and at least as large as any Nash payoff in the original simultaneous game.
Abstract: A basic model of commitment is to convert a game in strategic form into a “leadership game” where one player commits to a strategy to which the other player chooses a best response, with payoffs as in the original game. This paper studies subgame perfect equilibria of such leadership games for the mixed extension of a finite game, where the leader commits to a mixed strategy. In a generic two-player game, the leader payoff is unique and at least as large as any Nash payoff in the original simultaneous game. In non-generic two-player games, which are completely analyzed, the leader payoffs may form an interval, which as a set of payoffs is never worse than the Nash payoffs for the player who has the commitment power. Furthermore, the set of payoffs to the leader is also at least as good as the set of correlated equilibrium payoffs. These observations no longer hold in leadership games with three or more players. The possible payoffs to the follower are shown to be arbitrary compared to the simultaneous game or the game where the players switch their roles of leader and follower. Curiously, the follower payoff is not so arbitrary in typical

168 citations


Journal ArticleDOI
TL;DR: A formal model is created in which the actors develop a mental model of the value of stage-setting actions as a complex problem-solving task is repeated, and partial knowledge, either of particular states in the problem space or inefficient and circuitous routines through the space, is shown to be quite valuable.
Abstract: Many organizational actions need not have any immediate or direct payoff consequence but set the stage for subsequent actions that bring the organization toward some actual payoff. Learning in such settings poses the challenge of credit assignment (Minsky 1961), that is, how to assign credit for the overall outcome of a sequence of actions to each of the antecedent actions. To explore the process of learning in such contexts, we create a formal model in which the actors develop a mental model of the value of stage-setting actions as a complex problem-solving task is repeated. Partial knowledge, either of particular states in the problem space or inefficient and circuitous routines through the space, is shown to be quite valuable. Because of the interdependence of intelligent action when a sequence of actions must be identified, however, organizational knowledge is relatively fragile. As a consequence, while turnover may stimulate search and have largely benign implications in less interdependent task settings, it is very destructive of the organization's near-term performance when the learning problem requires a complementarity among the actors' knowledge.

120 citations


Proceedings Article
01 Dec 2004
TL;DR: A new set of criteria for learning algorithms in multi-agent systems is proposed, one that is more stringent and better justified than previously proposed criteria, and it is shown that the algorithm almost universally outperforms previous learning algorithms.
Abstract: We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm's payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close to optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms.

117 citations


Journal ArticleDOI
TL;DR: In this paper, the authors report on two sets of large-scale financial markets experiments that were designed to test the central proposition of modern asset pricing theory, namely, that risk premia are solely determined by covariance with aggregate risk.
Abstract: We report on two sets of large-scale financial markets experiments that were designed to test the central proposition of modern asset pricing theory, namely, that risk premia are solely determined by covariance with aggregate risk. We analyze the pricing within the framework suggested by two theoretical models, namely, the (general) Arrow and Debreu's complete-markets model, and the (more specific) Sharpe-Lintner-Mossin Capital Asset Pricing Model (CAPM). Completeness of the asset payoff structure justifies the former; the small (albeit non-negligible) risks justify the latter. We observe swift convergence towards price patterns predicted in the Arrow and Debreu and CAPM models. This observation is significant, because subjects always lack the information to deliberately set asset prices using either model. In the first set of experiments, however, equilibration is not always robust, with markets temporarily veering away. We conjecture that this reflects our failure to control subjects' beliefs about the temporal independence of the payouts. Confirming this conjecture, the anomaly disappears in a second set of experiments, where states were drawn without replacement. We formally test whether CAPM and Arrow-Debreu equilibrium can be used to predict price movements in our experiments and confirm the hypothesis. When multiplying the subject payout tenfold (in real terms), to US$500 on average for a 3-h experiment, the results are unaltered, except for an increase in the recorded risk premia.

88 citations


Patent
30 Jun 2004
TL;DR: In this paper, a player selects one or more of the player-selectable game elements and the progressive game payoff is determined based on his or her selection, and a display displays a randomly selected outcome of the wagering game in response to receiving the wager amount from the player.
Abstract: A new type of progressive game can be used in conjunction with wagering games. A gaming terminal is capable of playing a progressive game that is triggered during or after the typical wagering game that is played at the gaming terminal. The gaming terminal includes an input device for receiving inputs from a player during the wagering game. Such inputs include a wager amount. A display displays a randomly selected outcome of the wagering game in response to receiving the wager amount from the player. In response to the progressive game being triggered, the display then displays a plurality of player-selectable game elements. The player selects one or more of the player-selectable game elements. The progressive game payoff is determined based on his or her selection.

86 citations


Proceedings ArticleDOI
07 Jul 2004
TL;DR: This talk will survey two graphical models for compactly representing single-shot, finite-action games in which a large number of agents contend for scarce resources, and presents algorithms for computing both symmetric and arbitrary equilibria of AGGs.
Abstract: Action-graph games (AGGs) are a fully expressive game representation which can compactly express both strict and context-specific independence between players' utility functions. Actions are represented as nodes in a graph G, and the payoff to an agent who chose the action s depends only on the numbers of other agents who chose actions connected to s. We present algorithms for computing both symmetric and arbitrary equilibria of AGGs using a continuation method. We analyze the worst-case cost of computing the Jacobian of the payoff function, the exponential-time bottleneck step, and in all cases achieve exponential speedup. When the in-degree of G is bounded by a constant and the game is symmetric, the Jacobian can be computed in polynomial time.
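The counting structure described in the abstract above, where a payoff depends only on how many other agents chose neighboring actions, can be sketched as follows. This is an illustration of the representation only, not the authors' continuation-method algorithm; the action graph, the `congestion_utility` function, and all names are hypothetical.

```python
from collections import Counter

# Hypothetical action graph: neighbors[s] lists the actions whose counts
# influence the payoff of an agent who plays s.
neighbors = {
    "a": ["a", "b"],
    "b": ["b"],
    "c": ["b", "c"],
}

def neighbor_counts(i, profile):
    """Numbers of OTHER agents on each action connected to agent i's action."""
    s = profile[i]
    others = Counter(a for j, a in enumerate(profile) if j != i)
    return tuple(others[n] for n in neighbors[s])

def payoff(i, profile, utility):
    """Payoff to agent i, computed from its action and the neighbor counts only."""
    return utility(profile[i], neighbor_counts(i, profile))

def congestion_utility(action, counts):
    """Assumed utility: payoff falls with the number of agents on neighboring actions."""
    return -sum(counts)

profile = ["a", "b", "b", "c"]
payoffs = [payoff(i, profile, congestion_utility) for i in range(len(profile))]
```

Because `utility` only ever sees the tuple of counts, its table size depends on the in-degree of the graph rather than on the full action profile, which is the source of the compactness the abstract describes.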

Journal ArticleDOI
TL;DR: In this paper, the payoff distribution procedure of a subgame consistent solution in stochastic differential games with transferable payoffs is derived analytically under different optimality principles.
Abstract: Subgame consistency is a fundamental element in the solution of cooperative stochastic differential games. In particular, it ensures that: (i) the extension of the solution policy to a later starting time and to any possible state brought about by the prior optimal behavior of the players would remain optimal; (ii) all players do not have incentive to deviate from the initial plan. In this paper, we develop a mechanism for the derivation of the payoff distribution procedures of subgame consistent solutions in stochastic differential games with transferable payoffs. The payoff distribution procedure of the subgame consistent solution can be identified analytically under different optimality principles. Demonstration of the use of the technique for specific optimality principles is shown with an explicitly solvable game. For the first time, analytically tractable solutions of cooperative stochastic differential games with subgame consistency are derived.

Journal ArticleDOI
TL;DR: In this article, the authors deal with some methodological aspects related to the discretization of a class of integro-differential equations modelling the evolution of the probability distribution over the microscopic state of a large system of interacting individuals.
Abstract: This paper deals with some methodological aspects related to the discretization of a class of integro-differential equations modelling the evolution of the probability distribution over the microscopic state of a large system of interacting individuals. The microscopic state includes both mechanical and socio-biological variables. The discretization of the microscopic state generates a class of dynamical systems defining the evolution of the densities of the discretized state. In general, this yields a system of partial differential equations replacing the continuous integro-differential equation. As an example, a specific application is discussed, which refers to modelling in the field of social dynamics. The derivation of the evolution equation needs the development of a stochastic game theory.

Posted Content
TL;DR: A noncooperative game is introduced, where the retailers decide on their order quantities individually, and it is shown that the set of payoff vectors resulting from strong Nash equilibria corresponds to the core of the associated cooperative game.

Journal ArticleDOI
TL;DR: In this paper, a game-theoretic approach is introduced to derive an economically and environmentally optimal status of an industrial ecosystem under uncertainty, where the possible conflicts of the profit and sustainability objectives of the member entities in the ecosystem are resolved.
Abstract: A well-designed and operated industrial ecological system should be able to utilize effectively the generated wastes from one member as the feed to another member. Nevertheless, due to heavy interactions among the member entities, particularly with various uncertainties, the coordinative material and energy reuse is a very complex task. In this paper, the issues of optimal operation of an industrial ecosystem under uncertainty are addressed. A game theory based approach is then introduced to derive an economically and environmentally optimal status of an industrial ecosystem. The effectiveness of the approach is demonstrated by tackling a case study problem, where the Nash Equilibrium for the profit payoff and sustainability payoff of the member entities is identified. The possible conflicts of the profit and sustainability objectives of the member entities in the ecosystem are resolved.

Journal ArticleDOI
TL;DR: In this paper, the authors specify a dynamic model in which agents adjust their decisions toward higher payoffs, subject to normal error, and this process generates a probability distribution of players' decisions that evolves over time according to the Fokker-Planck equation.
Abstract: We specify a dynamic model in which agents adjust their decisions toward higher payoffs, subject to normal error. This process generates a probability distribution of players' decisions that evolves over time according to the Fokker-Planck equation. The dynamic process is stable for all potential games, a class of payoff structures that includes several widely studied games. In equilibrium, the distributions that determine expected payoffs correspond to the distributions that arise from the logit function applied to those expected payoffs. This "logit equilibrium" forms a stochastic generalization of the Nash equilibrium and provides a possible explanation of anomalous laboratory data.
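The fixed-point property stated in the abstract above, that equilibrium choice distributions equal the logit function applied to the expected payoffs they generate, can be sketched for a finite game. This is a simplified finite-action analogue under stated assumptions, not the paper's Fokker-Planck model: the symmetric 2×2 payoff matrix, the `precision` parameter, and the damped iteration are all choices made for illustration.

```python
import math

def logit_response(expected_payoffs, precision):
    """Logit (softmax) choice probabilities for a list of expected payoffs."""
    top = max(expected_payoffs)  # subtract the maximum for numerical stability
    weights = [math.exp(precision * (u - top)) for u in expected_payoffs]
    total = sum(weights)
    return [w / total for w in weights]

def expected_payoffs(payoff_matrix, p):
    """Expected payoff of each action against an opponent who mixes according to p."""
    return [sum(row[b] * p[b] for b in range(len(p))) for row in payoff_matrix]

def logit_equilibrium(payoff_matrix, precision, steps=5000, damping=0.1):
    """Damped fixed-point iteration p <- (1 - d) * p + d * logit(payoffs(p))."""
    n = len(payoff_matrix)
    p = [1.0 / n] * n
    for _ in range(steps):
        target = logit_response(expected_payoffs(payoff_matrix, p), precision)
        p = [(1 - damping) * x + damping * t for x, t in zip(p, target)]
    return p

# Hypothetical symmetric coordination game: rows are own actions, columns the opponent's.
coordination = [[2.0, 0.0], [0.0, 1.0]]
p_star = logit_equilibrium(coordination, precision=1.0)
```

With `precision=0.0` choices are pure noise and the result is uniform; larger values of `precision` move the fixed point toward a Nash equilibrium. The damped iteration is not guaranteed to converge for every game.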

Journal ArticleDOI
TL;DR: It is shown that a large class of games are best-response equivalent to identical interest games, but are not potential games, and some existing potential game arguments can be extended.

Journal ArticleDOI
TL;DR: The paper gives a (randomized) strategy for the second player in the one-round Voronoi game that always guarantees him a payoff of at least ½ + α, for a constant α > 0 and every large enough n.
Abstract: In the one-round Voronoi game, the first player chooses an n-point set W in a square Q, and then the second player places another n-point set B into Q. The payoff for the second player is the fraction of the area of Q occupied by the regions of the points of B in the Voronoi diagram of W ∪ B. We give a (randomized) strategy for the second player that always guarantees him a payoff of at least ½ + α, for a constant α > 0 and every large enough n. This contrasts with the one-dimensional situation, with Q=[0,1], where the first player can always win more than ½.

Proceedings ArticleDOI
J. Cai, U. Pooch
26 Apr 2004
TL;DR: Analysis and experimental results show a routing protocol with the consideration of the incentives of individual nodes stimulates cooperation and improves network lifetime without significantly diminishing the performance of the whole network.
Abstract: Summary form only given. In wireless mobile ad hoc networks (MANET), energy is a scarce resource. Though cooperation is the basis of network services, due to the limited energy reserve of each node, there is no guarantee any given protocols would be followed by nodes managed by different authorities. Instead of treating the selfish nodes as a security concern and trying to eliminate them, we propose a novel way to encourage cooperative works - rewarding service providers according to their contributions. Nodes in a MANET can form coalitions to reduce aggregate transmission power on each hop along a route. The payment of each node in a coalition is determined by using Shapley Value, a well-known concept in game theory for allocating payoff for each member in a cooperative coalition. We present the contribution reward routing protocol with Shapley Value (CAP-SV). It achieves the objective of truthfulness. The performance of CAP-SV is studied by simulations using ns-2. Analysis and experimental results show a routing protocol with the consideration of the incentives of individual nodes stimulates cooperation and improves network lifetime without significantly diminishing the performance of the whole network.
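The Shapley value mentioned in the abstract above averages each member's marginal contribution over all orders in which a coalition can form. The sketch below shows that textbook computation only; it is not the CAP-SV protocol, and the three nodes and their `savings` figures are hypothetical.

```python
from itertools import permutations

def shapley_values(players, value):
    """Average marginal contribution of each player over all join orders."""
    totals = {p: 0.0 for p in players}
    orders = list(permutations(players))
    for order in orders:
        coalition = frozenset()
        for p in order:
            joined = coalition | {p}
            totals[p] += value(joined) - value(coalition)  # marginal contribution of p
            coalition = joined
    return {p: t / len(orders) for p, t in totals.items()}

# Hypothetical transmission-power savings achieved by each coalition of nodes.
savings = {
    frozenset(): 0.0,
    frozenset("A"): 0.0,
    frozenset("B"): 0.0,
    frozenset("C"): 0.0,
    frozenset("AB"): 4.0,
    frozenset("AC"): 2.0,
    frozenset("BC"): 2.0,
    frozenset("ABC"): 6.0,
}
shares = shapley_values(["A", "B", "C"], lambda s: savings[s])
```

The shares always sum to the value of the full coalition, so the whole saving is distributed. Enumerating all orders grows factorially with the number of players, so this direct form only suits small coalitions.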

Journal ArticleDOI
TL;DR: The set of equilibria with espionage is characterized as a subset of the set of correlated equilibria, which turns out to depend on the difference between the Stackelberg equilibrium payoffs and the SPE payoffs.

Proceedings ArticleDOI
26 Sep 2004
TL;DR: In this article, an integrated system approach based on game theory for automotive electrical power and energy management systems is presented, where the game players are individual power sources, and the strategies of players are their alternating states.
Abstract: In this paper, we present an integrated system approach based on game theory for automotive electrical power and energy management systems. We apply this approach to a case study fuel cell hybrid electric vehicle (FC-HEV) by using the Simulink-based simulator ADVISOR. The case study fuel cell vehicle is rated at 80 kW peak and 25 kW average propulsion power, and consists of a fuel cell stack, a battery pack, an ultra capacitor, and two 35 kW induction motors. In this control strategy, the game 'players' are the individual power sources, and the 'strategies' of players are their alternating states. The objective of the players is to maximize their payoff, where the payoff is a function of the powertrain efficiency and vehicle performance. Simulation results indicate that the optimal power and energy management strategy can improve both fuel economy and performance.

Journal ArticleDOI
TL;DR: In this paper, a comparison between hierarchical contests and single-stage contests is made, and a condition is given that characterizes whether and when the aggregate equilibrium payoff of contestants is higher in the single-stage contest, and when a single-stage contest is more likely to award the prize to the contestant who values it most highly.

Book ChapterDOI
22 Aug 2004
TL;DR: In this paper, the authors consider infinite antagonistic games over finite graphs and present conditions that, whenever satisfied by the payoff mapping, assure for both players positional (memoryless) optimal strategies.
Abstract: We consider infinite antagonistic games over finite graphs. We present conditions that, whenever satisfied by the payoff mapping, assure for both players positional (memoryless) optimal strategies. To verify the robustness of our conditions we show that all popular payoff mappings, such as mean payoff, discounted, parity as well as several other payoffs satisfy them.

Journal ArticleDOI
TL;DR: It is shown that a population with a dynamic learning rate can gain an increased average payoff in transient phases and can also exploit external noise, leading the system away from the Nash equilibrium, in a resonancelike fashion.
Abstract: We introduce an extension of the usual replicator dynamics to adaptive learning rates. We show that a population with a dynamic learning rate can gain an increased average payoff in transient phases and can also exploit external noise, leading the system away from the Nash equilibrium, in a resonancelike fashion. The payoff versus noise curve resembles the signal to noise ratio curve in stochastic resonance. Seen in this broad context, we introduce another mechanism that exploits fluctuations in order to improve properties of the system. Such a mechanism could be of particular interest in economic systems.
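For reference, the usual replicator dynamics that the abstract above extends can be sketched as a discrete-time update. The abstract does not specify the adaptive learning-rate rule, so the rate here is a fixed hypothetical parameter `eta`; the Hawk-Dove payoff matrix and the Euler discretization are also assumptions made for illustration.

```python
def replicator_step(x, payoff_matrix, eta=0.1):
    """One Euler step of dx_i/dt = eta * x_i * (f_i - average fitness)."""
    n = len(x)
    fitness = [sum(payoff_matrix[i][j] * x[j] for j in range(n)) for i in range(n)]
    average = sum(x[i] * fitness[i] for i in range(n))
    return [x[i] + eta * x[i] * (fitness[i] - average) for i in range(n)]

def simulate(x0, payoff_matrix, steps=2000, eta=0.1):
    """Iterate the replicator update from the initial population shares x0."""
    x = list(x0)
    for _ in range(steps):
        x = replicator_step(x, payoff_matrix, eta)
    return x

# Hypothetical Hawk-Dove game with resource value 2 and fight cost 4.
hawk_dove = [[-1.0, 2.0], [0.0, 1.0]]
shares = simulate([0.1, 0.9], hawk_dove)
```

With a constant rate and no noise, the population in this example settles at the mixed Nash equilibrium with equal shares of hawks and doves; the abstract concerns what changes when the rate adapts and noise is present.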

Journal ArticleDOI
TL;DR: In this article, a new approach to play games quantum mechanically is proposed, where two players who perform measurements in an EPR-type setting are considered, and payoff relations are defined as functions of correlations, without reference to classical or quantum mechanics.
Abstract: A new approach to play games quantum mechanically is proposed. We consider two players who perform measurements in an EPR-type setting. The payoff relations are defined as functions of correlations, i.e. without reference to classical or quantum mechanics. Classical bi-matrix games are reproduced if the input states are classical and perfectly anti-correlated, that is, for a classical correlation game. However, for a quantum correlation game, with an entangled singlet state as input, qualitatively different solutions are obtained. For example, the Prisoners' Dilemma acquires a Nash equilibrium if both players apply a mixed strategy. It appears to be conceptually impossible to reproduce the properties of quantum correlation games within the framework of classical games.

Journal ArticleDOI
TL;DR: In this article, the authors show that information provision about the other player's play increases coordination when there are messages, but otherwise has no effect in the 2×2 stag hunt games, where the safe choice always yields the same payoff and information about payoffs does not always identify the other player's action.

Journal ArticleDOI
TL;DR: In this paper, it was shown that if the informed player also controls the transitions, the game has a value, whereas if the uninformed player controls the transitions, the max-min value as well as the min-max value exist, but they may differ.
Abstract: We study stochastic games with incomplete information on one side, in which the transition is controlled by one of the players. We prove that if the informed player also controls the transitions, the game has a value, whereas if the uninformed player controls the transitions, the max-min value as well as the min-max value exist, but they may differ. We discuss the structure of the optimal strategies, and provide extensions to the case of incomplete information on both sides.

Journal ArticleDOI
TL;DR: In this paper, it was shown that every two-player nonzero-sum stopping game in discrete time admits an ε-equilibrium in randomized strategies for every ε > 0.
Abstract: We prove that every two-player nonzero-sum stopping game in discrete time admits an ε-equilibrium in randomized strategies for every ε > 0. We use a stochastic variation of Ramsey’s theorem, which enables us to reduce the problem to that of studying properties of ε-equilibria in a simple class of stochastic games with finite state space.

Journal ArticleDOI
TL;DR: In this article, the authors introduce the efficient learning equilibrium (ELE), a normative approach to learning in noncooperative settings, where the learning algorithms themselves are required to be in equilibrium.

Journal ArticleDOI
TL;DR: It is demonstrated that aversion to complexity may provide a justification for the competitive outcome in a decentralised market game if complexity costs of implementing strategies enter players’ preferences together with the standard payoff in the game.