Showing papers on "Stochastic game" published in 1998


Journal Article
TL;DR: In this article, a simplified prisoner's game is studied on a square lattice, where the players interacting with their neighbors can follow two strategies: to cooperate or to defect unconditionally, and the players updated in random sequence have a chance to adopt one of the neighboring strategies with a probability depending on the payoff difference.
Abstract: A simplified prisoner's game is studied on a square lattice when the players interacting with their neighbors can follow two strategies: to cooperate $(C)$ or to defect $(D)$ unconditionally. The players updated in random sequence have a chance to adopt one of the neighboring strategies with a probability depending on the payoff difference. Using Monte Carlo simulations and dynamical cluster techniques, we study the density $c$ of cooperators in the stationary state. This system exhibits a continuous transition between the two absorbing states when varying the value of temptation to defect. In the limits $c \to 0$ and $c \to 1$ we have observed critical transitions belonging to the universality class of directed percolation.

1,323 citations
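
The update rule described in this abstract can be sketched as a small Monte Carlo simulation. The sketch assumes a particular payoff parametrization (1 for mutual cooperation, temptation `b` for defecting against a cooperator, 0 otherwise) and a Fermi-type adoption probability with noise `K`. Neither is stated in the abstract, and the lattice size, function names, and parameter values are illustrative only.

```python
import math
import random

NEIGHBORS = ((1, 0), (-1, 0), (0, 1), (0, -1))

def payoff(grid, x, y, b):
    """Total payoff of site (x, y) against its four nearest neighbors.

    Assumed payoffs: C vs C earns 1, D vs C earns b, everything else 0.
    """
    n = len(grid)
    me = grid[x][y]
    total = 0.0
    for dx, dy in NEIGHBORS:
        other = grid[(x + dx) % n][(y + dy) % n]  # periodic boundaries
        if other == "C":
            total += 1.0 if me == "C" else b
    return total

def step(grid, b, K, rng):
    """Pick a random site and let it possibly adopt a random neighbor's strategy."""
    n = len(grid)
    x, y = rng.randrange(n), rng.randrange(n)
    dx, dy = rng.choice(NEIGHBORS)
    nx, ny = (x + dx) % n, (y + dy) % n
    if grid[x][y] == grid[nx][ny]:
        return
    diff = payoff(grid, x, y, b) - payoff(grid, nx, ny, b)
    # Assumed Fermi rule: adoption is more likely when the neighbor earns more.
    if rng.random() < 1.0 / (1.0 + math.exp(diff / K)):
        grid[x][y] = grid[nx][ny]

def cooperator_density(n=20, b=1.5, K=0.1, sweeps=50, seed=0):
    """Density of cooperators after `sweeps` Monte Carlo sweeps from a random start."""
    rng = random.Random(seed)
    grid = [[rng.choice("CD") for _ in range(n)] for _ in range(n)]
    for _ in range(sweeps * n * n):
        step(grid, b, K, rng)
    return sum(row.count("C") for row in grid) / (n * n)
```

Calling `cooperator_density` for several values of `b` gives a rough picture of how the stationary density depends on the temptation to defect. A study of the critical behavior would need much larger lattices and longer runs than these defaults.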


Journal Article
TL;DR: It is proved that in general a quantum strategy is always at least as good as a classical one, and furthermore that when both players use quantum strategies there need not be any equilibrium, but if both are allowed mixed quantum strategies there must be.
Abstract: We consider game theory from the perspective of quantum algorithms. Strategies in classical game theory are either pure (deterministic) or mixed (probabilistic). We introduce these basic ideas in the context of a simple example, closely related to the traditional Matching Pennies game. While not every two-person zero-sum finite game has an equilibrium in the set of pure strategies, von Neumann showed that there is always an equilibrium at which each player follows a mixed strategy. A mixed strategy deviating from the equilibrium strategy cannot increase a player's expected payoff. We show, however, that in our example a player who implements a quantum strategy can increase his expected payoff, and explain the relation to efficient quantum algorithms. We prove that in general a quantum strategy is always at least as good as a classical one, and furthermore that when both players use quantum strategies there need not be any equilibrium, but if both are allowed mixed quantum strategies there must be.

748 citations
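
The abstract does not spell out its example, so the following is only a hedged illustration of how a quantum move can beat a classical mixed strategy in a coin-flipping game of the Matching Pennies type. It assumes the quantum player applies a Hadamard gate before and after the classical player's move. The gate choice and all names are assumptions made for this sketch.

```python
import math

R = 1 / math.sqrt(2)
H = [[R, R], [R, -R]]          # Hadamard gate (assumed quantum move)
FLIP = [[0.0, 1.0], [1.0, 0.0]]  # classical move: flip the coin
KEEP = [[1.0, 0.0], [0.0, 1.0]]  # classical move: leave the coin alone

def apply(gate, state):
    """Multiply a 2x2 matrix by a 2-component state vector."""
    return [gate[0][0] * state[0] + gate[0][1] * state[1],
            gate[1][0] * state[0] + gate[1][1] * state[1]]

def prob_heads(classical_move):
    """Probability of heads after quantum move, classical move, quantum move."""
    state = [1.0, 0.0]                    # coin starts heads up
    state = apply(H, state)               # quantum player moves first
    state = apply(classical_move, state)  # classical player flips or keeps
    state = apply(H, state)               # quantum player moves last
    return state[0] ** 2
```

Under these assumptions `prob_heads(FLIP)` and `prob_heads(KEEP)` both equal 1 up to floating-point rounding, so no classical mixture of the two moves changes the outcome. Two classical players using optimal mixed strategies would each win with probability 1/2.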


Journal Article
TL;DR: In this article, the authors investigate several properties of the minority game and give an analytical expression of $\sigma^2/N$ in the $N \ll 2^M$ region. They also study the influence of identical players on their gain and on the system's performance.
Abstract: We investigate further several properties of the minority game we have recently introduced. We explain the origin of the phase transition and give an analytical expression of $\sigma^2/N$ in the $N \ll 2^M$ region. The ability of the players to learn a given payoff is also analyzed, and we show that the Darwinian evolution process tends to a self-organized state; in particular, the lifetime distribution is a power law with exponent −2. Furthermore, we study the influence of identical players on their gain and on the system's performance. Finally, we show that large brains always take advantage of small brains.

383 citations
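
The quantity $\sigma^2/N$ can be estimated numerically. The sketch below assumes the commonly used formulation of the minority game, which the abstract does not restate: an odd number `N` of agents, each holding `S` random lookup-table strategies over the last `M` outcomes, and always playing the strategy with the highest virtual score. Ties are broken by taking the first strategy, a simplification. All names and defaults are illustrative.

```python
import random

def minority_game(N=101, M=3, S=2, rounds=2000, seed=0):
    """Return sigma^2 / N for one run of a basic minority game (N must be odd)."""
    rng = random.Random(seed)
    P = 2 ** M  # number of distinct M-bit histories
    # Each strategy is a lookup table: history index -> action in {-1, +1}.
    strategies = [[[rng.choice((-1, 1)) for _ in range(P)] for _ in range(S)]
                  for _ in range(N)]
    scores = [[0] * S for _ in range(N)]
    history = rng.randrange(P)
    attendance = []
    for _ in range(rounds):
        actions = []
        for i in range(N):
            best = scores[i].index(max(scores[i]))  # first best-scoring strategy
            actions.append(strategies[i][best][history])
        A = sum(actions)
        attendance.append(A)
        minority = -1 if A > 0 else 1  # N is odd, so A is never 0
        # Reward every strategy that would have chosen the minority side.
        for i in range(N):
            for s in range(S):
                if strategies[i][s][history] == minority:
                    scores[i][s] += 1
        # Shift the winning side into the M-bit history.
        history = ((history << 1) | (1 if minority == 1 else 0)) % P
    mean = sum(attendance) / rounds
    variance = sum((a - mean) ** 2 for a in attendance) / rounds
    return variance / N
```

Running the function over a range of `M` at fixed `N`, and averaging over seeds, is one way to explore how the fluctuations depend on the ratio $2^M/N$.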


Journal Article
TL;DR: In this paper, the authors draw together theoretical and experimental evidence from game theory, evolutionary psychology, and experimental economics to develop a reciprocity framework for understanding the persistence of cooperative outcomes in the face of contrary individual incentives.
Abstract: I. INTRODUCTION Theorists have long studied the fundamental problem that cooperative, socially efficient outcomes generally cannot be supported as equilibria in finite games. The puzzle is the occurrence of cooperative behavior in the absence of immediate incentives to cooperate. For example, in two-person bargaining experiments, where noncooperative behavior does not result in efficient outcomes, we observe more cooperative behavior and greater efficiency than such environments are expected to produce. Similarly, in public good experiments with groups varying in size from four to 100 people, the participants tend to achieve much higher payoff levels than predicted by noncooperative theory. Moreover, examples of cooperative behavior achieved by decentralized means have a long history in the human experience. Anthropological and archaeological evidence suggest that sharing behavior is ubiquitous in tribal cultures that lack markets, monetary systems, or other means of storing and redistributing wealth (see, e.g., Cosmides and Tooby [1987; 1989]; Isaac [1978]; Kaplan and Hill [1985]; Tooby and DeVore [1987]; Trivers [1971]). In this paper we draw together theoretical and experimental evidence from game theory, evolutionary psychology, and experimental economics to develop a reciprocity framework for understanding the persistence of cooperative outcomes in the face of contrary individual incentives. The theory of repeated games with discounting or infinite time horizons allows for cooperative solutions, but does not yield conditions for predicting them (Fudenberg and Tirole [1993]). Recent research in evolutionary psychology (Cosmides and Tooby [1987; 1989; 1992]) suggests that humans may be evolutionarily predisposed to engage in social exchange using mental algorithms that identify and punish cheaters.
Finally, a considerable body of research in experimental economics now identifies a number of environmental and institutional factors that promote cooperation even in the face of contrary individual incentives (Davis and Holt [1993]; Isaac and Walker [1988a,b; 1991]; Isaac, Walker and Thomas [1984]; Isaac, Walker and Williams [1991]). Moreover, these experimental results indicate that trust and trustworthiness play a much greater role than the evolutionary psychologists' punish-cheaters model would suggest. We hypothesize that humans' abilities to read one another's minds (Baron-Cohen [1995]) in social situations facilitate reciprocity. II. REPEATED GAMES Repeated-game theory offers two explanations of cooperation based on self-interest: self-enforcing equilibria and reputations. Self-enforcing equilibria are based on the idea that players can credibly punish noncooperative defections. The nagging problem with self-enforcing cooperative equilibria is that there are many equilibria in such games with cooperation being only one possibility. Experiments demonstrating that subjects cooperate in games with repeated play and relatively short finite horizons (Selten and Stoecker [1986]; Rapoport [1987]) suggest reputations are important in games with incomplete information (Kreps et al. [1982]). The idea is that if players are uncertain about other players' types, then the possibility emerges that players will mimic (develop a reputation as) a type different from their own. In circumstances where cooperation is mutually beneficial players have an incentive to mimic cooperative behavior. In the examples given by Kreps et al. [1982], players rationally compute strategies based on (utility or payoff) type uncertainty. They cooperate from the beginning until near the end of the game, and then defect.
This is not, however, the pattern observed in experiments, where it is common for cooperation to develop out of repeated interactions; also, defection near the end is often not observed. The strength of the theory is that it is based on individual (but longer run) self-interest, and is parsimonious. …

306 citations


Journal Article
TL;DR: In this paper, each Blackwell game is associated with a family of perfect information games, and the (mixed strategy) determinacy of the former is shown to follow from the (pure strategy) determinacy of the latter, so that Blackwell games with Borel measurable payoff functions are determined.
Abstract: Games of infinite length and perfect information have been studied for many years. There are numerous determinacy results for these games, and there is a wide body of work on consequences of their determinacy. Except for games with very special payoff functions, games of infinite length and imperfect information have been little studied. In 1969, David Blackwell [1] introduced a class of such games and proved a determinacy theorem for a subclass. During the intervening time, there has not been much progress in proving the determinacy of Blackwell's games. Orkin [17] extended Blackwell's result to a slightly wider class. Blackwell [2] found a new proof of his own result. Maitra and Sudderth [9, 10] improved Blackwell's result in a different direction from that of Orkin and also generalized to the case of stochastic games. Recently Vervoort [18] has obtained a substantial improvement. Nevertheless, almost all the basic questions have remained open. In this paper we associate with each Blackwell game a family of perfect information games, and we show that the (mixed strategy) determinacy of the former follows from the (pure strategy) determinacy of the latter. The complexity of the payoff function for the Blackwell game is approximately the same as the complexity of the payoff sets for the perfect information games. In particular, this means that the determinacy of Blackwell games with Borel measurable payoff functions follows from the known determinacy of perfect information games with Borel payoff sets.

298 citations


Journal Article
TL;DR: In this article, the authors examined the effect of the available information in a 12-player market entry game and found that information concerning other players' payoff increases the number of entrants.

194 citations


Posted Content
TL;DR: In this paper, a dynamic model of endogenous coalition formation in cooperative games with transferable utility is presented, in which boundedly rational players decide at each time step which of the existing coalitions to join and demand a payoff.
Abstract: This paper presents a dynamic model of endogenous coalition formation in cooperative games with transferable utility. The players are boundedly rational. At each time step, a player decides which of the existing coalitions to join, and demands a payoff. These decisions are determined by a (non-cooperative) best-reply rule, given the coalition structure and allocation in the previous period. We show that absorbing states of the process exist if the game is essential. Further, if the players are allowed to experiment with myopically suboptimal strategies whenever there are potential gains from trade, an isomorphism between the set of absorbing states of the process and the set of core allocations can be established, and the process converges to one of these states with probability one whenever the core is non-empty. This result holds independently of the form of the characteristic function.

177 citations


Journal Article
TL;DR: In this article, a class of non-cooperative market entry games with symmetric players, complete information, zero entry costs, and several randomly presented values of the market capacity is studied experimentally.
Abstract: Coordination behavior is studied experimentally in a class of noncooperative market entry games featuring symmetric players, complete information, zero entry costs, and several randomly presented values of the market capacity. Once the market capacity becomes publicly known, each player must decide privately whether to enter the market and receive a payoff, which increases linearly in the difference between the market capacity and the number of entrants, or stay out. Payoffs for staying out are either positive, giving rise to the domain of gains, or negative, giving rise to the domain of losses. The major findings are substantial individual differences that do not diminish with practice, aggregate behavior that is organized extremely well in both the domains of gains and losses by the Nash equilibrium solution, and variations in the population action strategies with repeated play of the stage game that are accounted for by a variant of an adaptive learning model due to Roth and Erev (1995).

129 citations
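
The payoff structure described in this abstract can be made concrete with a short sketch. It assumes one specific linear form that the abstract does not give: staying out pays `v`, and entering pays `v + r * (c - m)`, where `c` is the market capacity, `m` the number of entrants, and `r > 0` a slope. The parameter names and the shared baseline `v` are hypothetical.

```python
def entry_payoff(v, r, c, m):
    """Payoff to an entrant when m players (including this one) enter."""
    return v + r * (c - m)

def pure_equilibrium_entrant_counts(n, v, r, c):
    """Entrant counts m at which no player gains by switching unilaterally."""
    stable = []
    for m in range(n + 1):
        # An entrant must not prefer the stay-out payoff v.
        entrant_content = m == 0 or entry_payoff(v, r, c, m) >= v
        # A player staying out must not prefer to become entrant number m + 1.
        outsider_content = m == n or entry_payoff(v, r, c, m + 1) <= v
        if entrant_content and outsider_content:
            stable.append(m)
    return stable
```

With 20 players, `v = 1`, `r = 2`, and capacity `c = 7`, the function returns `[6, 7]`: under this assumed form, the pure-strategy equilibria have either `c - 1` or `c` entrants.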


Posted Content
TL;DR: This paper surveys the evidence from public goods experiments in which the Nash equilibrium lies in the interior of the set of feasible contributions, a design that helps assess whether the persistence of contributions in standard experiments is merely a boundary effect, with residual noise keeping contributions from reaching the Nash equilibrium.
Abstract: Introduction The standard public goods experiment involves linear payoffs in which the unique Nash equilibrium is at the lower boundary, i.e. full free riding. Contributions in these experiments tend to decline toward the Nash equilibrium in most treatments, but contributions persist even after as many as 60 rounds. This observation raises the question of whether the persistence of contributions is merely a boundary condition, with residual noise keeping contributions from reaching the Nash equilibrium. In other experimental environments, behavior shows a tendency to differ from boundary equilibria, but is reasonably close to interior predictions (see Smith and Walker, 1993, for examples). One way to address this issue in public goods experiments is to modify the standard linear payoff structure so that the Nash equilibrium is located in the interior of the set of feasible contributions. This paper surveys the evidence from interior-Nash public goods experiments. In some papers, the internal equilibrium is a dominant strategy and in others it is not. The designs also differ in terms of where the equilibrium is located relative to the upper and lower boundaries of the decision space. These relatively new designs are important because they can be used to evaluate the effects of treatment variables (for example, endowments, group size, and information) when the data are not being pulled toward the boundary. In addition, moving the equilibria to the center of the set of feasible contributions tends to reduce or neutralize any bias due to decision errors. Before considering the interior Nash designs, it is useful to review the standard linear structure that produces a boundary equilibrium. An individual $i$ who contributes $x_i$ to the public good out of an endowment of $E$ units thereby consumes $E - x_i$ units of the private good.
If the marginal value of the private good is a constant, v, and the individual's marginal value of total contributions to the public good is also a constant, m, then the individual's earnings are given by:
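
The excerpt breaks off before the formula. Under the definitions just given, and assuming a group of $n$ individuals whose total contribution is $\sum_j x_j$ (notation not in the original), the standard linear form would read as follows. This is a reconstruction from the stated assumptions, not a quotation from the paper.

$$
\pi_i = v\,(E - x_i) + m \sum_{j=1}^{n} x_j ,
\qquad
\frac{\partial \pi_i}{\partial x_i} = m - v .
$$

When $m < v$ the derivative is negative, so each individual's best response is $x_i = 0$, which matches the full free riding boundary equilibrium described at the start of the abstract.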

111 citations


Journal Article
TL;DR: In this article, the authors consider a k-player sequential bargaining model in which both the cake size and the identity of the proposer are determined by a stochastic process.
Abstract: We consider a k-player sequential bargaining model in which both the cake size and the identity of the proposer are determined by a stochastic process. For the case where the cake is a simplex (of random size) and the players share a common discount factor, we establish the existence of a unique stationary subgame perfect payoff which is efficient and characterize the conditions under which agreement is delayed. We also investigate how the equilibrium payoffs depend on the order in which the players move and on the correlation between the identity of the proposer and the cake size.

106 citations


Journal Article
TL;DR: The general ERC model is described, and it is shown that it predicts many of the key phenomena observed in the experiment, as well as standard game theoretic concepts.

Posted Content
01 Jan 1998
TL;DR: In The Theory of Learning in Games, Fudenberg and Levine develop an alternative explanation: equilibrium arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time.
Abstract: In economics, most noncooperative game theory has focused on equilibrium in games, especially Nash equilibrium and its refinements. The traditional explanation for when and why equilibrium arises is that it results from analysis and introspection by the players in a situation where the rules of the game, the rationality of the players, and the players' payoff functions are all common knowledge. Both conceptually and empirically, this theory has many problems. In The Theory of Learning in Games Drew Fudenberg and David Levine develop an alternative explanation that equilibrium arises as the long-run outcome of a process in which less than fully rational players grope for optimality over time. The models they explore provide a foundation for equilibrium theory and suggest useful ways for economists to evaluate and modify traditional equilibrium concepts.

Posted Content
TL;DR: In this paper, the authors investigate the role of decision costs and rewards on the accuracy of economic decisions and find that steep payoff schedules are necessary to optimize behavior only in cases where subjects must search out an optimal strategy rather than being able to deduce it from information provided.
Abstract: This paper reports on three laboratory experiments designed to investigate the roles of decision costs and rewards on the accuracy of economic decisions. The experimental vehicle is a purchase decision employing the Becker-DeGroot-Marschak (BDM) mechanism. The first experiment verifies the incentive-compatibility of the BDM in a pure induced-value setting; the second tests its performance under different information regimes and payoff schedules; the third addresses the role of feedback information. Steep payoff schedules are found to be necessary to optimizing behavior only in cases where subjects must search out an optimal strategy rather than being able to deduce it from information provided.
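
The incentive-compatibility property mentioned in this abstract can be checked numerically in a toy setting. The sketch assumes the usual purchase version of the BDM mechanism: the subject states a bid, a price is drawn at random, and the subject buys at the drawn price whenever it does not exceed the bid. The uniform price grid and the function names are assumptions of this sketch, not details from the paper.

```python
def expected_bdm_payoff(value, bid, prices):
    """Expected payoff when the random price is uniform over `prices`.

    The subject buys at the drawn price whenever it does not exceed the bid,
    earning `value - price`, and earns 0 otherwise.
    """
    gains = [value - p for p in prices if p <= bid]
    return sum(gains) / len(prices)

def best_bids(value, prices):
    """All candidate bids (taken from the price grid) that maximize expected payoff."""
    payoffs = {b: expected_bdm_payoff(value, b, prices) for b in prices}
    top = max(payoffs.values())
    return [b for b, u in payoffs.items() if u == top]
```

For an induced value of 6.5 and integer prices 0 through 10, `best_bids(6.5, list(range(11)))` returns `[6]`, the grid point just below the true value. Bidding higher adds only purchases at a loss, and bidding lower forgoes purchases at a gain.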

Journal Article
TL;DR: The optimality of multidimensional perceptual categorization performance with unequal base rates and payoffs was examined and the hypothesis that base-rate information and payoff information are combined independently was tested.
Abstract: The optimality of multidimensional perceptual categorization performance with unequal base rates and payoffs was examined. In Experiment 1, observers learned simultaneously the category structures and base rates or payoffs. Observers showed conservative cutoff placement when payoffs were unequal and extreme cutoff placement when base rates were unequal. In Experiment 2, observers were trained on the category structures before the base-rate or payoff manipulation. Simultaneous base-rate and payoff manipulations tested the hypothesis that base-rate information and payoff information are combined independently. Observers showed (a) small suboptimalities in base-rate and payoff estimation, (b) no qualitative differences across base-rate and payoff conditions, and (c) support for the hypothesis that base-rate and payoff information is combined independently. Implications for current theories of base-rate and payoff learning are discussed.

Journal Article
TL;DR: It is proved that a two-person, zero-sum stochastic game with arbitrary state and action spaces, a finitely additive law of motion and a bounded Borel measurable payoff has a value.
Abstract: We prove that a two-person, zero-sum stochastic game with arbitrary state and action spaces, a finitely additive law of motion and a bounded Borel measurable payoff has a value.

Journal Article
TL;DR: In this paper, the existence and uniqueness of the solution to backward-forward stochastic differential equations under weaker monotonicity assumptions than those of Hu and Peng (1990) were studied.

Journal Article
TL;DR: In this paper, it was shown that if both players are allowed to leave the negotiation after a rejection, in which case they obtain a payoff of zero, then there exists a continuum of subgame-perfect equilibrium outcomes, including some which involve significant delay.
Abstract: In this note we show that if in the standard Rubinstein model both players are allowed to leave the negotiation after a rejection, in which case they obtain a payoff of zero, then there exists a continuum of subgame-perfect equilibrium outcomes, including some which involve significant delay. We also fully characterize the case in which, upon quitting, the players can take an outside option of positive value.

Journal Article
TL;DR: The main results imply that in two-person repeated games, the set of equilibrium payoffs of a sequence of such games, G(n), n = 1, 2, ..., converges as n goes to infinity to the individually rational and feasible payoffs of the one-shot game, whenever the bound on one of the two automata sizes is polynomial or subexponential in n.
Abstract: The paper studies the implications of bounding the complexity of the strategies players may select, on the set of equilibrium payoffs in repeated games. The complexity of a strategy is measured by the size of the minimal automaton that can implement it. A finite automaton has a finite number of states and an initial state. It prescribes the action to be taken as a function of the current state and a transition function changing the state of the automaton as a function of its current state and the present actions of the other players. The size of an automaton is its number of states. The main results imply in particular that in two-person repeated games, the set of equilibrium payoffs of a sequence of such games, G(n), n = 1, 2, ..., converges as n goes to infinity to the individually rational and feasible payoffs of the one-shot game, whenever the bound on one of the two automata sizes is polynomial or subexponential in n and both the length of the game and the bounds of the automata sizes are at least n. A special case of such result justifies cooperation in the finitely repeated prisoner's dilemma, without departure from strict utility maximization or complete information, but under the assumption that there are bounds (possibly very large) to the complexity of the strategies that the players may use.
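
The automaton representation described in this abstract maps directly onto a small data structure. The sketch below is a minimal illustration for two players. The class name, the state labels, and the two example strategies (tit-for-tat and always-defect in a prisoner's dilemma) are choices made here, not taken from the paper.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Automaton:
    initial: str      # initial state
    action: dict      # state -> own action ("C" or "D")
    transition: dict  # (state, opponent's action) -> next state

    @property
    def size(self):
        """Complexity measure used in the abstract: the number of states."""
        return len(self.action)

def play(a, b, rounds):
    """Run two automata against each other and return the list of action pairs."""
    state_a, state_b = a.initial, b.initial
    path = []
    for _ in range(rounds):
        act_a, act_b = a.action[state_a], b.action[state_b]
        path.append((act_a, act_b))
        # Each automaton moves on its own state and the other player's action.
        state_a = a.transition[(state_a, act_b)]
        state_b = b.transition[(state_b, act_a)]
    return path

TIT_FOR_TAT = Automaton(
    initial="c",
    action={"c": "C", "d": "D"},
    transition={("c", "C"): "c", ("c", "D"): "d",
                ("d", "C"): "c", ("d", "D"): "d"},
)
ALWAYS_DEFECT = Automaton(
    initial="d",
    action={"d": "D"},
    transition={("d", "C"): "d", ("d", "D"): "d"},
)
```

Here `TIT_FOR_TAT.size` is 2 and `ALWAYS_DEFECT.size` is 1. A strategy that counts rounds in order to defect only in the last period of a long game needs a number of states that grows with the length of the game, which is the kind of complexity the size bounds restrict.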

Journal Article
TL;DR: The Shapley value and the Banzhaf value are well-known solution concepts for cooperative games with transferable utilities (TU-games), and both have been axiomatized in various ways.
Abstract: A cooperative game with transferable utilities (or simply a TU-game) describes a situation in which players can obtain certain payoffs by cooperation. A solution concept for these games is a function which assigns to every such game a distribution of payoffs over the players in the game. Famous solution concepts for TU-games are the Shapley value and the Banzhaf value. Both solution concepts have been axiomatized in various ways.
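
Both solution concepts named in this abstract can be computed by brute force for small games. The sketch below uses the standard definitions: the Shapley value weights each marginal contribution by coalition size, and the non-normalized Banzhaf value weights all coalitions equally. The function names and the three-player majority game used as an example are illustrative choices.

```python
from itertools import combinations
from math import factorial

def shapley_and_banzhaf(players, v):
    """Return (shapley, banzhaf) dicts for the characteristic function v.

    `v` maps a frozenset of players to that coalition's worth.
    The Banzhaf value computed here is the non-normalized version.
    """
    n = len(players)
    shapley, banzhaf = {}, {}
    for i in players:
        others = [p for p in players if p != i]
        sh = bz = 0.0
        for k in range(n):
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                marginal = v(s | {i}) - v(s)  # contribution of i to coalition s
                sh += weight * marginal
                bz += marginal / 2 ** (n - 1)
        shapley[i], banzhaf[i] = sh, bz
    return shapley, banzhaf

def majority(coalition):
    """Three-player simple majority game: a coalition wins with two or more members."""
    return 1.0 if len(coalition) >= 2 else 0.0
```

For `shapley_and_banzhaf(["a", "b", "c"], majority)`, each player's Shapley value is 1/3 and each player's Banzhaf value is 1/2. The Shapley values sum to the worth of the grand coalition, while the non-normalized Banzhaf values need not.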

Journal Article
TL;DR: In this paper, it was shown that continuous-time fictitious play converges uniformly at rate $t^{-1}$ in any finite two-person zero-sum game with payoff and strategy.

Journal Article
TL;DR: In this paper, a general model of non-cooperating agents exploiting a renewable resource is considered, and it is shown that there exists a continuum of Markov-perfect Nash equilibria (MPNE).
Abstract: A general model of non-cooperating agents exploiting a renewable resource is considered. Assuming that the resource is sufficiently productive we prove that there exists a continuum of Markov-perfect Nash equilibria (MPNE). Although these equilibria lead to over-exploitation, one can approximate the efficient solution by MPNE both in the state space and the payoff space. Furthermore, we derive a necessary and sufficient condition for maximal exploitation of the resource to qualify as an MPNE. This condition is satisfied if there are sufficiently many players, or if the players are sufficiently impatient, or if the capacity of each player is sufficiently high.

Journal Article
TL;DR: In this article, the statistical properties of optimal mixed strategies of large matrix games with random payoff matrices were investigated and analytical expressions for the value of the game and the distribution of strategy strengths were derived.
Abstract: Matrix games constitute a fundamental problem of game theory and describe a situation of two players with completely conflicting interests. We show how methods from statistical mechanics can be used to investigate the statistical properties of optimal mixed strategies of large matrix games with random payoff matrices and derive analytical expressions for the value of the game and the distribution of strategy strengths. In particular the fraction of pure strategies not contributing to the optimal mixed strategy of a player is calculated. Both independently distributed as well as correlated elements of the payoff matrix are considered and the results are compared with numerical simulations.

Posted Content
Susan Athey
TL;DR: In this paper, necessary and sufficient conditions for monotone comparative statics predictions in several classes of stochastic optimization problems are developed. The paper considers two main classes of assumptions on primitives: single crossing properties and log-supermodularity.
Abstract: This paper develops necessary and sufficient conditions for monotone comparative statics predictions in several classes of stochastic optimization problems. The results are formulated so as to highlight the tradeoffs between assumptions about payoff functions and assumptions about probability distributions: they characterize "minimal sufficient conditions" on a pair of functions (for example, a utility function and a probability distribution) so that the expected utility satisfies necessary and sufficient conditions for comparative statics predictions. The paper considers two main classes of assumptions on primitives: single crossing properties and log-supermodularity. Single crossing properties arise naturally in portfolio investment problems and auction games. Log-supermodularity is closely related to several commonly studied economic properties, including decreasing absolute risk aversion, affiliation of random variables, and the monotone likelihood ratio property. The results are used to extend the existing literature on investment problems and games of incomplete information, including auction games and pricing games.

Journal Article
TL;DR: In this paper, the authors show that two probability distributions are strategically close if and only if they assign similar ex ante probability to all events and it is approximate common knowledge that they assign similar conditional probabilities to all players.

Journal Article
TL;DR: In the cases studied here, with an increasing number of actions, cooperation is increasingly likely to become advantageous compared with pure self-interest, but self-interest can achieve all that cooperation could achieve in a nonnegligible fraction of cases.
Abstract: The relative merits of cooperation and self-interest in an ensemble of strategic interactions can be investigated by using finite random games. In finite random games, finitely many players have finite numbers of actions and independently and identically distributed (iid) random payoffs with continuous distribution functions. In each realization, players are shown the values of all payoffs and then choose their strategies simultaneously. Noncooperative self-interest is modeled by Nash equilibrium (NE). Cooperation is advantageous when a NE is Pareto-inefficient. In ordinal games, the numerical value of the payoff function gives each player’s ordinal ranking of payoffs. For a fixed number of players, as the number of actions of any player increases, the conditional probability that a pure strategic profile is not pure Pareto-optimal, given that it is a pure NE, apparently increases, but is bounded above strictly below 1. In games with transferable utility, the numerical payoff values may be averaged across actions (so that mixed NEs are meaningful) and added across players. In simulations of two-player games when both players have small, equal numbers of actions, as the number of actions increases, the probability that a NE (pure and mixed) attains the cooperative maximum declines rapidly; the gain from cooperation relative to the Nash high value decreases; and the gain from cooperation relative to the Nash low value rises dramatically. In the cases studied here, with an increasing number of actions, cooperation is increasingly likely to become advantageous compared with pure self-interest, but self-interest can achieve all that cooperation could achieve in a nonnegligible fraction of cases. These results can be interpreted in terms of cooperation in societies and mutualism in biology.
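
The pure-strategy part of this setup is easy to reproduce in a small experiment. The sketch below draws two-player games with iid uniform payoffs, finds the pure Nash equilibria, and tests whether a profile is Pareto-optimal among pure profiles. The uniform distribution, the restriction to two players, and all function names are assumptions of this sketch.

```python
import itertools
import random

def random_game(n_actions, rng):
    """Two-player game with iid uniform payoffs: profile -> (u1, u2)."""
    return {(a, b): (rng.random(), rng.random())
            for a in range(n_actions) for b in range(n_actions)}

def pure_nash(game, n_actions):
    """Profiles where neither player gains by a unilateral deviation."""
    result = []
    for a, b in itertools.product(range(n_actions), repeat=2):
        u1, u2 = game[(a, b)]
        row_ok = all(game[(x, b)][0] <= u1 for x in range(n_actions))
        col_ok = all(game[(a, y)][1] <= u2 for y in range(n_actions))
        if row_ok and col_ok:
            result.append((a, b))
    return result

def is_pure_pareto_optimal(game, profile):
    """True if no pure profile is weakly better for both players and strictly better for one."""
    u1, u2 = game[profile]
    return not any(w1 >= u1 and w2 >= u2 and (w1 > u1 or w2 > u2)
                   for w1, w2 in game.values())
```

Generating many games with `random_game` and counting how often a profile returned by `pure_nash` fails `is_pure_pareto_optimal` gives a Monte Carlo estimate of the conditional probability discussed in the abstract, for a given number of actions.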

Journal Article
TL;DR: In this paper, the authors examine self-enforcing bidder coordination in auctions in which bidding is costly, and identify conditions on the distribution of valuations that guarantee multiple equilibria.

Journal Article
TL;DR: An alternative proof to a result of Mertens and Parthasarathy, stating that every n-player discounted stochastic game with general setup, and with a norm-continuous transition, has a subgame perfect equilibrium.
Abstract: We give an alternative proof to a result of Mertens and Parthasarathy, stating that every n-player discounted stochastic game with general setup, and with a norm-continuous transition, has a subgame perfect equilibrium.

Patent
30 Mar 1998
TL;DR: In this article, the authors present a wagering game that utilizes random events and their associated values, including a set of higher/lower hitting and standing rules in which a participant's successive event values are compared to determine the success or failure of a strategic decision.
Abstract: The invention comprises a wagering game that utilizes random events and their associated values. The teachings include a set of higher/lower hitting and standing rules in which a participant's successive event values are compared to determine the success or failure of a strategic decision. As a table game vs. a house dealer, the overall player's objective in a preferred embodiment is not to bust while achieving more hits than the dealer who plays by a fixed set of rules. Variations include a solitaire version, different payoff criteria and schedules, different definitions of what constitutes a successful hit, versions with a guaranteed-winner bonus round, and the introduction of jokers which may be helpful and/or harmful to the player's hand.

Journal Article
TL;DR: In game theory, four dynamic processes converging towards an equilibrium are distinguished and ordered by agents' decreasing cognitive capacities: the eductive process, epistemic learning, behavioral learning, and the evolutionary process.
Abstract: In game theory, four dynamic processes converging towards an equilibrium are distinguished and ordered by way of agents' decreasing cognitive capacities. In the eductive process, each player has enough information to simulate perfectly the others' behavior and gets immediately to the equilibrium. In epistemic learning, each player updates his beliefs about others' future strategies, with regard to their sequentially observed actions. In behavioral learning, each player modifies his own strategies according to the observed payoffs obtained from his past actions. In the evolutionary process, each agent has a fixed strategy and reproduces in proportion to the utilities obtained through stochastic interactions. All along the spectrum, longer term dynamics makes up for weaker rationality, and physical relations substitute for mental interactions. Convergence, if any, is towards an always stronger equilibrium notion and selection of an equilibrium state becomes more sensitive to context and history. The processes can be mixed if associated to different periods, agents or mechanisms and deepened if obtained by formal reasoning principles.

Journal Article
TL;DR: In this paper, the authors examined games played by a single large player and a large number of opponents who are small, but not anonymous, and they showed that the equilibrium set converges to that of the game where there is a continuum of small players.