Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published within this topic, receiving 202,664 citations.


Papers
Journal Article
TL;DR: It is demonstrated experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities' superiority to the conventional team game utility.
Abstract: We consider the problem of designing (perhaps massively distributed) collectives of computational processes to maximize a provided "world utility" function. We consider this problem when the behavior of each process in the collective can be cast as striving to maximize its own payoff utility function. For such cases the central design issue is how to initialize/update those payoff utility functions of the individual processes so as to induce behavior of the entire collective having good values of the world utility. Traditional "team game" approaches to this problem simply assign to each process the world utility as its payoff utility function. In previous work we used the "Collective Intelligence" (COIN) framework to derive a better choice of payoff utility functions, one that results in world utility performance up to orders of magnitude superior to that ensuing from the use of the team game utility. In this paper, we extend these results using a novel mathematical framework. Under that new framework we review the derivation of the general class of payoff utility functions that both (i) are easy for the individual processes to try to maximize, and (ii) have the property that if good values of them are achieved, then we are assured a high value of world utility. These are the "Aristocrat Utility" and a new variant of the "Wonderful Life Utility" that was introduced in the previous COIN work. We demonstrate experimentally that using these new utility functions can result in significantly improved performance over that of previously investigated COIN payoff utilities, over and above those previous utilities' superiority to the conventional team game utility. These results also illustrate the substantial superiority of these payoff functions to perhaps the most natural version of the economics technique of "endogenizing externalities."
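The abstract gives no formulas, so the following is only a minimal sketch of the contrast it draws between the team game utility and a difference-style payoff. It assumes the commonly cited basic form of the Wonderful Life Utility (world utility minus the world utility with one agent's action clamped to a fixed value), not the new variant the paper introduces. The toy world utility, the `clamp` value, and all function names are hypothetical.

```python
from typing import Callable, List, Optional, Sequence

# An action is the index of the resource an agent picks; None means "clamped/absent".
Action = Optional[int]
WorldUtility = Callable[[Sequence[Action]], float]


def team_game_utility(world_utility: WorldUtility,
                      actions: Sequence[Action],
                      agent: int) -> float:
    """Team game: every agent is simply paid the world utility."""
    return world_utility(actions)


def wonderful_life_utility(world_utility: WorldUtility,
                           actions: Sequence[Action],
                           agent: int,
                           clamp: Action = None) -> float:
    """Assumed basic form: G(actions) - G(actions with `agent` clamped)."""
    counterfactual: List[Action] = list(actions)
    counterfactual[agent] = clamp
    return world_utility(actions) - world_utility(counterfactual)


def toy_world_utility(actions: Sequence[Action]) -> float:
    """Hypothetical world utility: number of distinct resources covered."""
    return float(len({a for a in actions if a is not None}))


joint_actions: List[Action] = [0, 0, 1]  # agents 0 and 1 share a resource
for agent in range(len(joint_actions)):
    print(agent,
          team_game_utility(toy_world_utility, joint_actions, agent),
          wonderful_life_utility(toy_world_utility, joint_actions, agent))
```

In this toy setting the team game utility pays every agent the same amount, while the difference-style payoff is nonzero only for the agent whose action actually changes the world utility, which is one way to read the abstract's requirement that payoffs be easy for individual processes to maximize.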

232 citations

Journal Article
TL;DR: In this article, the authors report experiments designed to yield insight into the nature and robustness of reciprocal motives in the ultimatum game, where the first mover proposes a division of a fixed sum of money and the second mover either accepts this proposal or vetoes it.
Abstract: The most widely applied models in economics and game theory are based on the assumption of "self-regarding preferences," which are characterized by an exclusive concern about one's own material payoff. Models of self-regarding preferences capture behavior quite well in many contexts, including double auctions as in Smith (1982) and Davis and Holt (1993), one-sided auctions with independent private values as in Cox and Oaxaca (1996), procurement contracting as in Cox et al. (1996), and search as in Cox and Oaxaca (1989, 2000), Harrison and Morgan (1990), and Cason and Friedman (2003). But there is now a large body of literature that reports systematic inconsistencies with the implications of the self-regarding preferences model. These replicable patterns of behavior are observed in experimental games involving decisions about the division of material payoffs among the participants. One explanation for the observed behavior that has received considerable attention is reciprocity. We report experiments designed to yield insight into the nature and robustness of reciprocal motives. By observing decisions in a group of related experiments we are able to discriminate between behavior motivated by reciprocity and behavior motivated by nonreciprocal other-regarding preferences over outcomes. Some treatments introduce the possibility of behavior motivated by positive reciprocity, whereas other treatments introduce the possibility of negatively reciprocal motivation. By "positive reciprocity" we mean a motivation to adopt a generous action that benefits someone else because that person's intentional behavior was perceived to be beneficial to oneself within the decision context of the experiment. Similarly, by "negative reciprocity" we mean a motivation to adopt a costly action that harms someone else because that person's intentional behavior was perceived to be harmful to oneself within the decision context of the experiment.
Perhaps the most familiar experiment in the reciprocity literature is the ultimatum game. In this game, the first mover proposes a division of a fixed sum of money and the second mover either accepts this proposal or vetoes it. In the event of a veto, both players get a money payoff of zero. The self-regarding preferences model predicts extremely unequal payoffs for this game, with the first mover offering the second mover the smallest feasible positive amount of money and the second mover accepting this offer. However, observed behavior in the ultimatum game contrasts sharply with these predictions. Under a wide variety of conditions, first movers in ultimatum games tend to propose relatively equal splits. This has been observed by Güth et al. (1982), Hoffman and Spitzer (1985), Hoffman et al. (1994), and Bornstein and Yaniv (1998). First movers may make generous proposals in ultimatum games because they have inequality-averse other-regarding preferences, as suggested by Fehr and Schmidt (1999) and Bolton and Ockenfels (2000), or altruistic other-regarding preferences, as suggested by Cox and Sadiraj (2005). Alternatively, first movers may make generous proposals because they are afraid that second movers will veto lopsided proposals. Second movers may veto such proposals because of inequality-averse preferences over outcomes or because of negative reciprocity. The implications for modeling behavior are different if the behavior is motivated by preferences over outcomes that are unconditional on perceived intentions than if it is motivated by negative reciprocity or fear of negative reciprocity. To discriminate among alternative motivations, we use a triadic experimental design that includes a mini-ultimatum game, which we call the Punishment mini-ultimatum game (Punishment-MUG), and two dictator control treatments.
Additional insight into the nature of alternative motives is gained from a systematic comparison of our data with data from the different experimental design of Falk et al. …
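The self-regarding prediction described in this abstract can be illustrated with a small backward-induction sketch. It is not the authors' experimental design: it assumes a pie of whole money units, a responder who accepts exactly the strictly positive offers, and hypothetical function names.

```python
from typing import Tuple


def responder_accepts(offer: int) -> bool:
    """Assumed self-regarding responder: accept any strictly positive offer."""
    return offer > 0


def self_regarding_proposal(pie: int) -> Tuple[int, int]:
    """Return (offer, proposer_payoff) that maximizes the proposer's own payoff.

    The proposer anticipates the responder's rule and tries every feasible
    whole-unit offer from 0 to `pie`.
    """
    best_offer, best_payoff = 0, 0
    for offer in range(pie + 1):
        # A veto leaves both players with zero.
        payoff = pie - offer if responder_accepts(offer) else 0
        if payoff > best_payoff:
            best_offer, best_payoff = offer, payoff
    return best_offer, best_payoff


offer, proposer_payoff = self_regarding_proposal(10)
print(f"offer={offer}, proposer keeps={proposer_payoff}")
```

With a pie of 10 units, the search settles on the smallest positive offer, which is the extremely unequal split that the abstract says observed behavior contradicts.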

230 citations

Journal Article
TL;DR: This work introduces a learning rule in which behavior is conditional on whether a player experiences an error of the first or second type, which implements Nash equilibrium behavior in any game with generic payoffs and at least one pure Nash equilibrium.

227 citations

Journal Article
TL;DR: This work models cooperation in wireless networks through a game theoretical algorithm derived from a novel concept from coalitional game theory that enables the users to self-organize into independent disjoint coalitions, and the resulting clustered network structure is characterized through novel stability notions.
Abstract: Cooperation in wireless networks allows single antenna devices to improve their performance by forming virtual multiple antenna systems. However, performing a distributed and fair cooperation constitutes a major challenge. In this work, we model cooperation in wireless networks through a game theoretical algorithm derived from a novel concept from coalitional game theory. A simple and distributed merge-and-split algorithm is constructed to form coalition groups among single antenna devices and to allow them to maximize their utilities in terms of rate while accounting for the cost of cooperation in terms of power. The proposed algorithm enables the users to self-organize into independent disjoint coalitions and the resulting clustered network structure is characterized through novel stability notions. In addition, we prove the convergence of the algorithm and we investigate how the network structure changes when different fairness criteria are chosen for apportioning the coalition worth among its members. Simulation results show that the proposed algorithm can improve the individual user's payoff by up to 40.42% as well as efficiently cope with the mobility of the distributed users.

226 citations

Journal Article
TL;DR: This work shows that under stochastic evolutionary game theory, where agents make mistakes when judging the payoffs and strategies of others, natural selection favors fairness, and that across a range of parameters the average strategy matches the observed behavior.
Abstract: Classical economic models assume that people are fully rational and selfish, while experiments often point to different conclusions. A canonical example is the Ultimatum Game: one player proposes a division of a sum of money between herself and a second player, who either accepts or rejects. Based on rational self-interest, responders should accept any nonzero offer and proposers should offer the smallest possible amount. Traditional, deterministic models of evolutionary game theory agree: in the one-shot anonymous Ultimatum Game, natural selection favors low offers and demands. Experiments instead show a preference for fairness: often responders reject low offers and proposers make higher offers than needed to avoid rejection. Here we show that using stochastic evolutionary game theory, where agents make mistakes when judging the payoffs and strategies of others, natural selection favors fairness. Across a range of parameters, the average strategy matches the observed behavior: proposers offer between 30% and 50%, and responders demand between 25% and 40%. Rejecting low offers increases relative payoff in pairwise competition between two strategies and is favored when selection is sufficiently weak. Offering more than you demand increases payoff when many strategies are present simultaneously and is favored when mutation is sufficiently high. We also perform a behavioral experiment and find empirical support for these theoretical findings: uncertainty about the success of others is associated with higher demands and offers; and inconsistency in the behavior of others is associated with higher offers but not predictive of demands. In an uncertain world, fairness finishes first.
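The payoff structure behind the offers and demands mentioned in this abstract can be sketched as follows. The abstract does not spell out the model, so this assumes a common textbook parameterization: a strategy is a pair (p, q), where p is the fraction offered as proposer and q is the minimum fraction demanded as responder, the pie is normalized to 1, and each player takes each role half the time. The stochastic selection and mutation dynamics that the paper studies are not modeled here, and all names are hypothetical.

```python
from typing import Tuple

Strategy = Tuple[float, float]  # (p, q): offer as proposer, minimum demand as responder


def ultimatum_payoff(me: Strategy, other: Strategy) -> float:
    """Expected payoff to `me` against `other`, averaging over both roles."""
    p_me, q_me = me
    p_other, q_other = other
    # As proposer: keep 1 - p_me if the offer meets the other's demand, else 0.
    as_proposer = 1.0 - p_me if p_me >= q_other else 0.0
    # As responder: receive p_other if it meets my own demand, else 0.
    as_responder = p_other if p_other >= q_me else 0.0
    return 0.5 * (as_proposer + as_responder)


fairish: Strategy = (0.4, 0.3)  # offers 40%, rejects anything below 30%
stingy: Strategy = (0.1, 0.1)   # offers 10%, accepts 10% or more

print("fairish vs stingy:", ultimatum_payoff(fairish, stingy))
print("stingy vs fairish:", ultimatum_payoff(stingy, fairish))
```

In this pairing the strategy that rejects the low offer ends up with the higher payoff of the two, which is consistent with the abstract's remark that rejecting low offers increases relative payoff in pairwise competition.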

223 citations


Network Information
Related Topics (5)
Markov chain
51.9K papers, 1.3M citations
81% related
Incentive
41.5K papers, 1M citations
81% related
Heuristics
32.1K papers, 956.5K citations
80% related
Linear programming
32.1K papers, 920.3K citations
79% related
Empirical research
51.3K papers, 1.9M citations
78% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  364
2022  738
2021  462
2020  512
2019  460
2018  483