Showing papers on "Stochastic game" published in 2013


Journal ArticleDOI
TL;DR: This paper proposes a Stackelberg game between utility companies and end-users to maximize the revenue of each utility company and the payoff of each user, derives analytical results for the Stackelberg equilibrium of the game, and proves that a unique solution exists.
Abstract: Demand Response Management (DRM) is a key component in the smart grid to effectively reduce power generation costs and user bills. However, it has been an open issue to address the DRM problem in a network of multiple utility companies and consumers where every entity is concerned about maximizing its own benefit. In this paper, we propose a Stackelberg game between utility companies and end-users to maximize the revenue of each utility company and the payoff of each user. We derive analytical results for the Stackelberg equilibrium of the game and prove that a unique solution exists. We develop a distributed algorithm which converges to the equilibrium with only local information available for both utility companies and end-users. Though DRM helps to facilitate the reliability of power supply, the smart grid can be susceptible to privacy and security issues because of communication links between the utility companies and the consumers. We study the impact of an attacker who can manipulate the price information from the utility companies. We also propose a scheme based on the concept of shared reserve power to improve the grid reliability and ensure its dependability.

705 citations


Journal ArticleDOI
TL;DR: It is shown that in reasonably large populations, so-called zero-determinant strategies can act as catalysts for the evolution of cooperation, similar to tit-for-tat, but that they are not the stable outcome of natural selection.
Abstract: Iterated games are a fundamental component of economic and evolutionary game theory. They describe situations where two players interact repeatedly and have the ability to use conditional strategies that depend on the outcome of previous interactions, thus allowing for reciprocation. Recently, a new class of strategies has been proposed, so-called “zero-determinant” strategies. These strategies enforce a fixed linear relationship between one’s own payoff and that of the other player. A subset of those strategies allows “extortioners” to ensure that any increase in one player’s own payoff exceeds that of the other player by a fixed percentage. Here, we analyze the evolutionary performance of this new class of strategies. We show that in reasonably large populations, they can act as catalysts for the evolution of cooperation, similar to tit-for-tat, but that they are not the stable outcome of natural selection. In very small populations, however, extortioners hold their ground. Extortion strategies do particularly well in coevolutionary arms races between two distinct populations. Significantly, they benefit the population that evolves at the slower rate, an example of the so-called “Red King” effect. This may affect the evolution of interactions between host species and their endosymbionts.
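The "fixed linear relationship" enforced by an extortionate zero-determinant strategy can be checked numerically. The sketch below is an illustration only, not the paper's code: it assumes the conventional prisoner's dilemma payoffs (R, S, T, P) = (3, 0, 5, 1), an extortion factor chi = 3 and a scaling constant phi = 1/26, none of which come from the abstract, and the function names are hypothetical. It builds the memory-one strategy, plays it against an arbitrary memory-one opponent, and compares the long-run payoffs.

```python
# Assumed payoff values (conventional choice, not taken from the abstract).
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def extortion_strategy(chi, phi):
    """Cooperation probabilities after outcomes (CC, CD, DC, DD), own move first.

    Built so that (p1 - 1, p2 - 1, p3, p4) = phi * [(own - P) - chi * (other - P)].
    """
    own = (R, S, T, P)
    other = (R, T, S, P)
    shifted = [phi * ((a - P) - chi * (b - P)) for a, b in zip(own, other)]
    p = [1.0 + shifted[0], 1.0 + shifted[1], shifted[2], shifted[3]]
    if not all(0.0 <= x <= 1.0 for x in p):
        raise ValueError("chi and phi do not give valid probabilities")
    return p


def long_run_payoffs(p, q, rounds=20000):
    """Average payoffs of X (playing p) and Y (playing q) in the stationary state.

    The stationary distribution is approximated by power iteration, which
    assumes the opponent's probabilities q lie strictly between 0 and 1.
    """
    # Y sees the outcomes CD and DC with the roles swapped.
    q_seen = [q[0], q[2], q[1], q[3]]
    v = [0.25] * 4  # distribution over (CC, CD, DC, DD) from X's point of view
    for _ in range(rounds):
        nxt = [0.0] * 4
        for s in range(4):
            x, y = p[s], q_seen[s]
            nxt[0] += v[s] * x * y
            nxt[1] += v[s] * x * (1 - y)
            nxt[2] += v[s] * (1 - x) * y
            nxt[3] += v[s] * (1 - x) * (1 - y)
        v = nxt
    payoff_x = sum(w * pay for w, pay in zip(v, (R, S, T, P)))
    payoff_y = sum(w * pay for w, pay in zip(v, (R, T, S, P)))
    return payoff_x, payoff_y


if __name__ == "__main__":
    chi = 3.0
    p = extortion_strategy(chi, phi=1.0 / 26.0)
    sx, sy = long_run_payoffs(p, q=[0.7, 0.2, 0.6, 0.3])
    # The two printed numbers should agree up to numerical error.
    print(sx - P, chi * (sy - P))
```

Changing the opponent vector `q` moves both payoffs, but the surplus of the extortioner over P should stay chi times the opponent's surplus, which is the sense in which the relationship is enforced unilaterally.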

265 citations


Journal ArticleDOI
TL;DR: Using stochastic evolutionary game theory, where agents make mistakes when judging the payoffs and strategies of others, natural selection favors fairness, and across a range of parameters, the average strategy matches the observed behavior.
Abstract: Classical economic models assume that people are fully rational and selfish, while experiments often point to different conclusions. A canonical example is the Ultimatum Game: one player proposes a division of a sum of money between herself and a second player, who either accepts or rejects. Based on rational self-interest, responders should accept any nonzero offer and proposers should offer the smallest possible amount. Traditional, deterministic models of evolutionary game theory agree: in the one-shot anonymous Ultimatum Game, natural selection favors low offers and demands. Experiments instead show a preference for fairness: often responders reject low offers and proposers make higher offers than needed to avoid rejection. Here we show that using stochastic evolutionary game theory, where agents make mistakes when judging the payoffs and strategies of others, natural selection favors fairness. Across a range of parameters, the average strategy matches the observed behavior: proposers offer between 30% and 50%, and responders demand between 25% and 40%. Rejecting low offers increases relative payoff in pairwise competition between two strategies and is favored when selection is sufficiently weak. Offering more than you demand increases payoff when many strategies are present simultaneously and is favored when mutation is sufficiently high. We also perform a behavioral experiment and find empirical support for these theoretical findings: uncertainty about the success of others is associated with higher demands and offers; and inconsistency in the behavior of others is associated with higher offers but not predictive of demands. In an uncertain world, fairness finishes first.

223 citations


Proceedings ArticleDOI
01 Dec 2013
TL;DR: A finite horizon, zero-sum, nonstationary stochastic game approach is employed to minimize the worst-case control and detection cost, and an optimal control policy for switching between control-cost optimal and secure (but cost-suboptimal) controllers in presence of replay attacks is obtained.
Abstract: The existing tradeoff between control system performance and the detection rate for replay attacks highlights the need to provide an optimal control policy that balances the security overhead with control cost. We employ a finite horizon, zero-sum, nonstationary stochastic game approach to minimize the worst-case control and detection cost, and obtain an optimal control policy for switching between control-cost optimal (but nonsecure) and secure (but cost-suboptimal) controllers in presence of replay attacks. To formulate the game, we quantify game parameters using knowledge of the system dynamics, controller design and utilized statistical detector. We show that the optimal strategy for the system exists, and present a suboptimal algorithm used to calculate the system's strategy by combining robust game techniques and a finite horizon stationary stochastic game algorithm. Our approach can be generalized for any system with multiple finite cost, time-invariant linear controllers/estimators/intrusion detectors.

173 citations


Journal ArticleDOI
TL;DR: It is shown that zero-determinant strategies with an informational advantage over other players that allows them to recognize each other can be evolutionarily stable (and able to exploit other players), however, such an advantage is bound to be short-lived as opposing strategies evolve to counteract the recognition.
Abstract: In iterated Prisoner’s Dilemma games, zero-determinant strategies are able to define the opponent’s payoff regardless of the opponent’s strategy. Here the authors show that zero-determinant strategies are not evolutionarily stable in adapting populations, and instead evolve into non-coercive strategies.

163 citations


Book ChapterDOI
16 Mar 2013
TL;DR: The tool is based on the probabilistic model checker PRISM, benefiting from its existing user interface and simulator, whilst adding novel model checking algorithms for stochastic games, as well as functionality to synthesise optimal player strategies.
Abstract: We present PRISM-games, a model checker for stochastic multi-player games, which supports modelling, automated verification and strategy synthesis for probabilistic systems with competitive or cooperative behaviour. Models are described in a probabilistic extension of the Reactive Modules language and properties are expressed using rPATL, which extends the well-known logic ATL with operators to reason about probabilities, various reward-based measures, quantitative properties and precise bounds. The tool is based on the probabilistic model checker PRISM, benefiting from its existing user interface and simulator, whilst adding novel model checking algorithms for stochastic games, as well as functionality to synthesise optimal player strategies, explore or export them, and verify other properties under the specified strategy.

148 citations


Journal ArticleDOI
TL;DR: In this paper, a general and tractable framework for comparative static results in aggregative games is provided, and sufficient conditions are given under which positive shocks to individual players increase their own actions and have monotone effects on the aggregate.

140 citations


Journal ArticleDOI
Fangwen Fu1, Ulas C. Kozat1
TL;DR: It is proved that there exists one Nash equilibrium in the conjectural prices that is efficient, i.e., the sum-utility is maximized, so the NO has the incentive to compute the equilibrium point and feed it back to SPs.
Abstract: We propose a new framework for wireless network virtualization. In this framework, service providers (SPs) and the network operator (NO) are decoupled from each other: The NO is solely responsible for spectrum management, and SPs are responsible for quality-of-service (QoS) management for their own users. SPs compete for the shared wireless resources to satisfy their distinct service objectives and constraints. We model the dynamic interactions among SPs and the NO as a stochastic game. SPs bid for the resources via dynamically announcing their value functions. The game is regulated by the NO through: 1) sum-utility optimization under rate region constraints; 2) enforcement of Vickrey-Clarke-Groves (VCG) mechanism for pricing the instantaneous rate consumption; and 3) declaration of conjectured prices for future resource consumption. We prove that there exists one Nash equilibrium in the conjectural prices that is efficient, i.e., the sum-utility is maximized. Thus, the NO has the incentive to compute the equilibrium point and feedback to SPs. Given the conjectural prices and the VCG mechanism, we also show that SPs must reveal their truthful value functions at each step to maximize their long-term utilities. As another major contribution, we develop an online learning algorithm that allows the SPs to update the value functions and the NO to update the conjectural prices iteratively. Thus, the proposed framework can deal with unknown dynamics in traffic characteristics and channel conditions. We present simulation results to show the convergence to the Nash equilibrium prices under various dynamic traffic and channel conditions.

139 citations


Journal ArticleDOI
TL;DR: Using the semi-tensor product method, this paper investigates the algebraic formulation and strategy optimization for a class of evolutionary networked games with the "myopic best response adjustment" rule, and presents a number of new results.

125 citations


Journal ArticleDOI
TL;DR: This paper develops exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks and develops efficient and exact algorithms based on them and demonstrates that they deliver significant speedups over the Monte Carlo approach.
Abstract: The Shapley value--probably the most important normative payoff division scheme in coalitional games--has recently been advocated as a useful measure of centrality in networks. However, although this approach has a variety of real-world applications (including social and organisational networks, biological networks and communication networks), its computational properties have not been widely studied. To date, the only practicable approach to compute Shapley value-based centrality has been via Monte Carlo simulations which are computationally expensive and not guaranteed to give an exact answer. Against this background, this paper presents the first study of the computational aspects of the Shapley value for network centralities. Specifically, we develop exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks and develop efficient (polynomial time) and exact algorithms based on them. We empirically evaluate these algorithms on two real-life examples (an infrastructure network representing the topology of the Western States Power Grid and a collaboration network from the field of astrophysics) and demonstrate that they deliver significant speedups over the Monte Carlo approach. For instance, in the case of unweighted networks our algorithms are able to return the exact solution about 1600 times faster than the Monte Carlo approximation, even if we allow for a generous 10% error margin for the latter method.
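To make the contrast between an exact formula and enumeration concrete, here is a small sketch for one simple coalitional game on a graph. The game is an assumption chosen for illustration and is not specified in the abstract: the value of a coalition is the number of nodes that belong to it or are adjacent to it. For that game a node's Shapley value reduces to a sum of 1 / (1 + degree) over the node and its neighbours, because a node u is credited to whichever member of its closed neighbourhood arrives first in a random ordering. The function names and the toy graph are hypothetical; the brute-force routine exists only to check the closed form.

```python
from itertools import permutations
from math import factorial


def fringe_value(graph, coalition):
    """Number of nodes that are in the coalition or adjacent to one of its members."""
    covered = set(coalition)
    for node in coalition:
        covered.update(graph[node])
    return len(covered)


def shapley_closed_form(graph):
    """Exact Shapley values for the fringe game, one pass over each neighbourhood."""
    return {
        v: sum(1.0 / (1 + len(graph[u])) for u in [v, *graph[v]])
        for v in graph
    }


def shapley_brute_force(graph):
    """Exact Shapley values by averaging marginal contributions over all orderings."""
    nodes = list(graph)
    totals = dict.fromkeys(nodes, 0.0)
    for order in permutations(nodes):
        members, previous = [], 0
        for node in order:
            members.append(node)
            current = fringe_value(graph, members)
            totals[node] += current - previous
            previous = current
    n_orders = factorial(len(nodes))
    return {v: total / n_orders for v, total in totals.items()}


if __name__ == "__main__":
    # Undirected toy graph given as symmetric adjacency lists (hypothetical data).
    toy = {
        "a": ["b", "c"],
        "b": ["a", "c"],
        "c": ["a", "b", "d"],
        "d": ["c", "e"],
        "e": ["d"],
    }
    print(shapley_closed_form(toy))
    print(shapley_brute_force(toy))
```

The enumeration visits n! orderings, so it is only usable on toy graphs, whereas the closed form touches each edge a constant number of times. Monte Carlo sampling of orderings sits between the two but returns only an estimate.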

117 citations


Journal ArticleDOI
01 Nov 2013-PLOS ONE
TL;DR: Adaptive dynamics is used to study when evolution leads to extortion and when it leads to compliance in the iterated prisoner’s dilemma and shows that cooperation is most abundant in large populations, in which case average payoffs approach the social optimum.
Abstract: Direct reciprocity is a mechanism for the evolution of cooperation. For the iterated prisoner’s dilemma, a new class of strategies has recently been described, the so-called zero-determinant strategies. Using such a strategy, a player can unilaterally enforce a linear relationship between his own payoff and the co-player’s payoff. In particular the player may act in such a way that it becomes optimal for the co-player to cooperate unconditionally. In this way, a player can manipulate and extort his co-player, thereby ensuring that the own payoff never falls below the co-player’s payoff. However, using a compliant strategy instead, a player can also ensure that his own payoff never exceeds the co-player’s payoff. Here, we use adaptive dynamics to study when evolution leads to extortion and when it leads to compliance. We find a remarkable cyclic dynamics: in sufficiently large populations, extortioners play a transient role, helping the population to move from selfish strategies to compliance. Compliant strategies, however, can be subverted by altruists, which in turn give rise to selfish strategies. Whether cooperative strategies are favored in the long run critically depends on the size of the population; we show that cooperation is most abundant in large populations, in which case average payoffs approach the social optimum. Our results are not restricted to the case of the prisoners dilemma, but can be extended to other social dilemmas, such as the snowdrift game. Iterated social dilemmas in large populations do not lead to the evolution of strategies that aim to dominate their co-player. Instead, generosity succeeds.

Journal ArticleDOI
TL;DR: In this paper, the authors investigate social interaction effects in two experimental games that represent two broad classes of strategic situations: a coordination game that possesses multiple equilibria in material payoffs and a cooperation game that has only one equilibrium in material payoffs.
Abstract: It is a long-standing and fundamental problem of the social sciences to understand whether and in what way humans are influenced by the behavior exhibited by the members of the social group to which they belong. We speak of a "social interaction effect" if an individual changes his or her behavior as a function of his or her respective group members' behavior. Social interaction effects are economically important because they may be present in many decision domains. From a theoretical viewpoint there are at least two potentially important sources for social interaction effects even in otherwise identical environments; both are studied in this paper. A social interaction effect can occur if the game that people play in their group has multiple equilibria which arise from the material payoff structure of the game. Behavior across groups can be different simply because different groups coordinate on different equilibria of the same game. A second and less straightforward source of social interaction effects concerns those interactions that operate via non-material psychological payoffs, such as conformism, social approval, fairness, reciprocity, or guilt aversion. These motives can induce players to adapt their behavior to that of others, even if the material payoff structure does not provide any incentive to do so. The identification of social interaction effects requires several problems to be overcome (Akerlof 1997; Manski 1993, 2000): (1) identifying the reference group for which social interaction effects are sought to be established, (2) circumventing the problem of self-selection of group members by investigating randomly composed groups, (3) controlling correlated effects that affect all group members in a similar way, and (4) controlling contextual effects such as exogenous social background characteristics of group members.
In this paper we present the design of an experiment that circumvents these problems and therefore allows us to study the behavioral logic of social interactions. We argue that the experimental laboratory provides the researcher with a valuable tool to study social interactions because it guarantees more control than any other available data source (Falk and Heckman 2009). The ideal data set would observe the same individual at the same time in different groups or neighborhoods, which are identical--apart from different neighbors. Obviously, this is impossible in the field. In contrast, it is possible to come very close to this "counterfactual state" in the laboratory. In our experiment, we are able to observe decisions of the same subject at the same time in two economically identical environments. The only reason to behave differently in these two environments is the presence of social interactions, that is, the fact that a person is systematically and differentially affected by the behavior of his neighbors in the two environments. Our within-subjects two-group design circumvents the above-mentioned identification problems. Using the terminology of Manski (2000), in our study reference groups are well-defined; the setup avoids self-selection; subjects make simultaneous decisions in two economically identical environments, which controls for correlated effects, including experience; the decision problem is abstractly framed and decisions are taken anonymously, which avoids contextual effects. Moreover, our laboratory approach has the added advantage of eliminating measurement errors. We investigate social interaction effects in two experimental games, which represent the two broad classes of strategic situations mentioned above--a coordination game that possesses multiple equilibria in material payoffs and a cooperation game which has only one equilibrium in material payoffs. 
The coordination game we study is a version of the "minimum-effort game" (Van Huyck, Battalio, and Beil 1990; see Camerer 2003, chapter 7; Devetag and Ortmann 2007 for overviews). …

Journal ArticleDOI
TL;DR: In this paper, the authors use zero-sum Markov games to model the interactions subject to underlying uncertainties of real-world events and actions, and show how the defender can use deception as a defense mechanism.
Abstract: Electricity grids are critical infrastructures. They are credible targets of active (e.g., terrorist) attacks since their disruption may lead to sizable losses economically and in human lives. It is thus crucial to develop decision support that can guide administrators in deploying defense resources for system security and reliability. Prior work on the defense of critical infrastructures has typically used static or Stackelberg games. These approaches view network interdictions as one-time events. However, infrastructure protection is also a continual process in which the defender and attacker interact to produce dynamic states affecting their best actions, as witnessed in the continual attack and defense of transmission networks in Colombia and Yemen. In this paper, we use zero-sum Markov games to model these interactions subject to underlying uncertainties of real-world events and actions. We solve equilibrium mixed strategies of the players that maximize their respective minimum payoffs with a time-decayed metric. We also show how the defender can use deception as a defense mechanism. Using results for a 5-bus system, a WECC 9-bus system, and an IEEE standard 14-bus system, we illustrate that our game model can provide useful insights. We also contrast our results with those of static games, and quantify the gain in defender payoff due to misinformation of the attacker.
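The phrase "mixed strategies that maximize their respective minimum payoffs" can be illustrated on the smallest possible case. The sketch below solves a single 2x2 zero-sum matrix game for the row player's maximin strategy. It is not the paper's model or algorithm: a Markov game has many states, discounting and transitions, and the function name and payoff numbers here are hypothetical. It only shows the one-state building block.

```python
def solve_2x2_zero_sum(a, b, c, d):
    """Maximin solution for the row player of the payoff matrix [[a, b], [c, d]].

    Entries are payoffs to the row player (the column player receives the negative).
    Returns (probability of playing row 1, value of the game).
    """
    row_mins = (min(a, b), min(c, d))
    col_maxs = (max(a, c), max(b, d))
    maximin, minimax = max(row_mins), min(col_maxs)
    if maximin == minimax:
        # A pure saddle point exists, so no randomization is needed.
        prob_row1 = 1.0 if row_mins[0] >= row_mins[1] else 0.0
        return prob_row1, float(maximin)
    # Without a saddle point, the optimal mix makes the column player
    # indifferent between the two columns.
    denom = a - b - c + d
    prob_row1 = (d - c) / denom
    value = (a * d - b * c) / denom
    return prob_row1, value


if __name__ == "__main__":
    # Hypothetical defender-versus-attacker payoffs, defender as row player.
    prob, value = solve_2x2_zero_sum(2, -1, -1, 1)
    print(prob, value)
```

In a multi-state setting, a step of this kind would be repeated per state inside a value-iteration loop, with each matrix entry replaced by the immediate payoff plus the discounted expected value of the next state. Larger action sets need a linear program rather than this closed form.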

Journal ArticleDOI
TL;DR: A theoretical framework in which to cast the source identification problem is introduced and the ultimate achievable performance of the forensic analysis in the presence of an adversary aiming at deceiving it is derived.
Abstract: We introduce a theoretical framework in which to cast the source identification problem. Thanks to the adoption of a game-theoretic approach, the proposed framework permits us to derive the ultimate achievable performance of the forensic analysis in the presence of an adversary aiming at deceiving it. The asymptotic Nash equilibrium of the source identification game is derived under an assumption on the resources on which the forensic analyst may rely. The payoff at the equilibrium is analyzed, deriving the conditions under which a successful forensic analysis is possible and the error exponent of the false-negative error probability in such a case. The difficulty of deriving a closed-form solution for general instances of the game is alleviated by the introduction of an efficient numerical procedure for the derivation of the optimum attacking strategy. The numerical analysis is applied to a case study to show the kind of information it can provide.

Journal ArticleDOI
TL;DR: This paper models a DDoS attack as a one-shot, non-cooperative, zero-sum game, incorporating in the model a richer set of options available to the attacker compared to what has been previously achieved.

01 Jan 2013
TL;DR: A new characteristic of a security, its “information sensitivity” (IS), is introduced; it is a sufficient statistic for expected utility maximization and a pricing factor if agents have a linear reference point utility function.
Abstract: In this paper we introduce a new characteristic of a security, its “information sensitivity” (IS). This measure has two components, the first component measures a security’s expected monetary loss in low payoff states relative to its price (“tail risks”) and the other component measures the expected monetary profit in high payoff states. We apply this measure in different illustrative applications. (i) IS captures the incentive of an agent to produce information about the payoff of the security. (ii) We use IS to solve an optimal security design problem and show that it is optimal for a buyer to purchase debt when he faces a seller who can acquire information and there is never information acquisition in equilibrium. Even if information cost is zero the optimal debt contract makes the seller indifferent between acquiring and not acquiring information. (iii) We use IS to formalize the notion that it is easier to buy than to sell a security. (iv) IS can explain the optimality of securitization. (v) IS is a sufficient statistic for expected utility maximization and a pricing factor if agents have a linear reference point utility function.

Proceedings ArticleDOI
12 Dec 2013
TL;DR: It is found that Minimax-Q learning is more suitable for an aggressive environment than Nash-Q while Friend-or-foe Q-learning can provide the best solution under distributed mobile ad hoc networking scenarios in which the centralized control can hardly be available.
Abstract: We introduce Competing Mobile Network Game (CMNG), a stochastic game played by cognitive radio networks that compete for dominating an open spectrum access. Differentiated from existing approaches, we incorporate both communicator and jamming nodes to form a network for friendly coalition, integrate antijamming and jamming subgames into a stochastic framework, and apply Q-learning techniques to solve for an optimal channel access strategy. We empirically evaluate our Q-learning based strategies and find that Minimax-Q learning is more suitable for an aggressive environment than Nash-Q while Friend-or-foe Q-learning can provide the best solution under distributed mobile ad hoc networking scenarios in which the centralized control can hardly be available.

Journal ArticleDOI
TL;DR: This work finds necessary conditions for the existence of a mean field equilibrium in stochastic dynamic games that exhibit strategic complementarities between players, and shows that there exist a “largest” and a “smallest” equilibrium among all those where the equilibrium strategy used by a player is nondecreasing.
Abstract: We study a class of stochastic dynamic games that exhibit strategic complementarities between players; formally, in the games we consider, the payoff of a player has increasing differences between her own state and the empirical distribution of the states of other players. Such games can be used to model a diverse set of applications, including network security models, recommender systems, and dynamic search in markets. Stochastic games are generally difficult to analyze, and these difficulties are only exacerbated when the number of players is large (as might be the case in the preceding examples). We consider an approximation methodology called mean field equilibrium to study these games. In such an equilibrium, each player reacts to only the long-run average state of other players. We find necessary conditions for the existence of a mean field equilibrium in such games. Furthermore, as a simple consequence of this existence theorem, we obtain several natural monotonicity properties. We show that there exist a “largest” and a “smallest” equilibrium among all those where the equilibrium strategy used by a player is nondecreasing, and we also show that players converge to each of these equilibria via natural myopic learning dynamics; as we argue, these dynamics are more reasonable than the standard best-response dynamics. We also provide sensitivity results, where we quantify how the equilibria of such games move in response to changes in parameters of the game (for example, the introduction of incentives to players).

Journal ArticleDOI
TL;DR: Cox et al. investigate whether social dilemmas are more serious when related to under-provision or over-appropriation in comparable environments, constructing three pairs of provision and appropriation games in which the two games within each pair are payoff equivalent.
Abstract: [Author Affiliation] James C Cox, Experimental Economics Center and Department of Economics, 14 Marietta Street NW, Andrew Young School of Policy Studies, Georgia State University, Atlanta, GA 30303, USA; E-mail: jccox@gsu.edu; corresponding author. Elinor Ostrom, Founder of the Vincent and Elinor Ostrom Workshop in Political Theory and Policy Analysis, Indiana University, Bloomington, IN 47405, USA, and the Center for the Study of Institutional Diversity, Arizona State University, Tempe, AZ 85287, USA. Deceased. Vjollca Sadiraj, Experimental Economics Center and Department of Economics, 14 Marietta Street NW, Andrew Young School of Policy Studies, Georgia State University, Atlanta, GA 30303, USA; E-mail: vsadiraj@gsu.edu. James M Walker, The Vincent and Elinor Ostrom Workshop in Political Theory and Policy Analysis, Department of Economics, Indiana University, Wylie Hall 105, Bloomington, IN 47405, USA; E-mail: walkerj@indiana.edu. [Acknowledgment] Helpful comments and suggestions were provided by an anonymous referee. Financial support was provided by the National Science Foundation (grant numbers SES-0849590 and SES-0849551). [Introduction] Social dilemmas characterize settings in which a divergence exists between expected outcomes from individuals pursuing strategies based on narrow self-interest versus groups pursuing strategies based on the interests of the group as a whole. A large literature in several disciplines studies specific manifestations of social dilemma situations (Axelrod 1981; Gautschi 2000; Marshall 2004; Heibing, Yu, and Rauhut 2011). Two prominent areas in the economics literature are public goods games and trust games. These are typically surplus creation games in which the central question is whether free riding or absence of trust leads to an opportunity cost that a potential surplus is not created nor provided for a group.
For example, in the one-period voluntary contributions public good game reported by Walker and Halloran (2004, table 2, p. 240), decision makers on average failed to create 47% of the feasible surplus. In the investment (or trust) game reported by Berg, Dickhaut, and McCabe (1995, figure 2, p. 130), decision makers on average failed to create 48% of the feasible surplus. The ultimatum game is a well-known surplus destruction game. In a typical game, the entire surplus available to the two players is destroyed if the responder rejects a proposed split. In the seminal ultimatum game study reported by Guth, Schmittberger, and Schwarze (1982, tables 4, 5, p. 375), 10% of the feasible surplus was destroyed by "inexperienced" subjects. This figure increased to 29% with experienced subjects. Another well-known example of a surplus destruction game is appropriation from a common-pool resource. Walker, Gardner, and Ostrom (1990, table II, p. 208) report data for a multiple-decision-round setting where players, on average, over-appropriated to the point of destroying the entire available surplus from the common pool, consistent with the outcome referred to as "the tragedy of the commons." An open empirical question is whether social dilemmas are more serious when related to under-provision or over-appropriation in comparable environments. In the field, and most prior laboratory studies, critical differences exist in the opportunity sets that make direct comparisons between provision and appropriation social dilemmas infeasible. We address the question by constructing three pairs of provision and appropriation games. The two games within each pair are payoff equivalent. In the appropriation game in a payoff-equivalent pair, the value of the total endowment is (i) strictly greater than the value of the total endowment in the provision game but (ii) equal to the maximum attainable total payoff in the provision game. …

Posted Content
TL;DR: The author shows that mean-variance portfolio theory can be applied to streams of payoffs, such as dividends following an initial investment, in place of one-period returns; this description is especially useful when returns are not independent over time and investors have non-marketed income.
Abstract: Mean-variance portfolio theory can apply to the streams of payoffs such as dividends following an initial investment, in place of one-period returns. This description is especially useful when returns are not independent over time and investors have non-marketed income. Investors hedge their outside income streams, and then their optimal payoff is split between an indexed perpetuity - the risk-free payoff - and a long-run mean-variance efficient payoff. "Long-run" moments sum over time as well as states of nature. In equilibrium, long-run expected returns vary with long-run market betas and outside- income betas. State-variable hedges do not appear in optimal payoffs or this equilibrium.

Journal ArticleDOI
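TL;DR: In this article, the author studies a standard tax competition game in which governments have relative payoff concerns or imitate policies, and shows that, independently of the number of jurisdictions, an evolutionarily stable tax policy coincides with the competitive outcome of a tax competition game with infinitely many players.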
Abstract: Rather than about their absolute payoffs, governments in fiscal competition often seem to care about their performance relative to other governments. Moreover, they often appear to mimic policies observed elsewhere. I study such behavior in a standard tax competition game. Both with relative payoff concerns and for imitative policies, evolutionary stability for games with finitely many players is the appropriate solution concept. Independently of the number of jurisdictions involved, an evolutionarily stable tax policy coincides with the competitive outcome of a tax competition game with infinitely many players. It, thus, involves drastic efficiency losses.

Book ChapterDOI
26 Aug 2013
TL;DR: It is shown that finding a winning strategy is PSPACE-hard in general and undecidable for deterministic strategies, and it is proved that optimal strategies, if they exist, may require infinite memory and randomisation.
Abstract: We study two-player stochastic games, where the goal of one player is to satisfy a formula given as a positive Boolean combination of expected total reward objectives and the behaviour of the second player is adversarial. Such games are important for modelling, synthesis and verification of open systems with stochastic behaviour. We show that finding a winning strategy is PSPACE-hard in general and undecidable for deterministic strategies. We also prove that optimal strategies, if they exist, may require infinite memory and randomisation. However, when restricted to disjunctions of objectives only, memoryless deterministic strategies suffice, and the problem of deciding whether a winning strategy exists is NP-complete. We also present algorithms to approximate the Pareto sets of achievable objectives for the class of stopping games.

Journal ArticleDOI
TL;DR: In this paper, repeated Bayesian games with communication and observable actions are studied, where the players' privately known payoffs evolve according to an irreducible Markov chain whose transitions are independent across players.
Abstract: We study repeated Bayesian games with communication and observable actions in which the players' privately known payoffs evolve according to an irreducible Markov chain whose transitions are independent across players. Our main result implies that, generically, any Pareto-efficient payoff vector above a stationary minmax value can be approximated arbitrarily closely in a perfect Bayesian equilibrium as the discount factor goes to 1. As an intermediate step, we construct an approximately efficient dynamic mechanism for long finite horizons without assuming transferable utility.

Proceedings ArticleDOI
01 Jun 2013
TL;DR: Bounds are given on the ε-rank of a real matrix A, defined for any ε > 0 as the minimum rank over matrices that approximate every entry of A to within an additive ε.
Abstract: We study the ε-rank of a real matrix A, defined for any ε > 0 as the minimum rank over matrices that approximate every entry of A to within an additive ε. This parameter is connected to other notions of approximate rank and is motivated by problems from various topics including communication complexity, combinatorial optimization, game theory, computational geometry and learning theory. Here we give bounds on the ε-rank and use them for algorithmic applications. Our main algorithmic results are (a) polynomial-time additive approximation schemes for Nash equilibria for 2-player games when the payoff matrices are positive semidefinite or have logarithmic rank and (b) an additive PTAS for the densest subgraph problem for similar classes of weighted graphs. We use combinatorial, geometric and spectral techniques; our main new tool is an algorithm for efficiently covering a convex body with translates of another convex body.
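The verbal definition of the ε-rank given in this abstract can be written as a formula. This is only a restatement of that sentence; the symbols B, m and n (the approximating matrix and the dimensions of A) are notation assumed here, not taken from the paper.

```latex
\[
\operatorname{rank}_{\varepsilon}(A)
  \;=\; \min \bigl\{\, \operatorname{rank}(B) \;:\;
  B \in \mathbb{R}^{m \times n},\
  \max_{i,j} \lvert A_{ij} - B_{ij} \rvert \le \varepsilon \,\bigr\},
  \qquad \varepsilon > 0 .
\]
```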

Journal ArticleDOI
TL;DR: In this article, a new characterization of the class of convex combinations of the Shapley value and the equal division solution is presented for TU-games.
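As an illustration of the solution class named in this entry, the following minimal Python sketch computes the Shapley value, the equal division solution, and a convex combination of the two for a small TU-game. The three-player game, the weight `alpha`, and all function names are hypothetical choices made for this example; the sketch does not reproduce the paper's characterization.

```python
from itertools import permutations
from math import factorial


def shapley_value(players, v):
    """Average marginal contribution of each player over all orderings."""
    phi = {i: 0.0 for i in players}
    for order in permutations(players):
        coalition = frozenset()
        for i in order:
            with_i = coalition | {i}
            phi[i] += v[with_i] - v[coalition]
            coalition = with_i
    n_orders = factorial(len(players))
    return {i: phi[i] / n_orders for i in players}


def equal_division(players, v):
    """Split the worth of the grand coalition equally."""
    share = v[frozenset(players)] / len(players)
    return {i: share for i in players}


def convex_combination(players, v, alpha):
    """alpha * Shapley value + (1 - alpha) * equal division, 0 <= alpha <= 1."""
    sh = shapley_value(players, v)
    ed = equal_division(players, v)
    return {i: alpha * sh[i] + (1 - alpha) * ed[i] for i in players}


# Hypothetical 3-player game: player 1 is needed for any two-player surplus.
players = (1, 2, 3)
v = {
    frozenset(): 0.0,
    frozenset({1}): 0.0, frozenset({2}): 0.0, frozenset({3}): 0.0,
    frozenset({1, 2}): 60.0, frozenset({1, 3}): 60.0, frozenset({2, 3}): 0.0,
    frozenset({1, 2, 3}): 120.0,
}

sh = shapley_value(players, v)
mixed = convex_combination(players, v, alpha=0.5)
```

Setting `alpha=1` recovers the Shapley value and `alpha=0` the equal division solution, so the single parameter spans the whole class.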

Journal ArticleDOI
TL;DR: In this article, the authors give an explicit representation of the lowest cost strategy (or "cost-efficient" strategy) to achieve a given payoff distribution, and highlight the connections between cost-efficiency and dependence (copulas).
Abstract: In this paper, we give an explicit representation of the lowest cost strategy (or "cost-efficient" strategy) to achieve a given payoff distribution. For any inefficient strategy, we are able to construct financial derivatives which dominate in the sense of first-order or second-order stochastic dominance. We highlight the connections between cost-efficiency and dependence (copulas). This allows us to extend the theory to deal with state-dependent constraints to better reflect real-world preferences. We show in particular that path-dependent strategies (although inefficient in the Black-Scholes setting) may become optimal in the presence of state-dependent constraints.

Journal ArticleDOI
TL;DR: This paper introduces a concept of uncertain bimatrix game within the framework of uncertainty theory, and three solution concepts of uncertain equilibrium strategies as well as their existence theorem are proposed.
Abstract: In real-world games, the players often lack information about the other players' (or even their own) payoffs. Assuming that all entries of the payoff matrices are uncertain variables, this paper introduces the concept of an uncertain bimatrix game. Within the framework of uncertainty theory, three solution concepts of uncertain equilibrium strategies, together with their existence theorem, are proposed. Furthermore, a necessary and sufficient condition is presented for finding the uncertain equilibrium strategies. Finally, an example is provided to illustrate the usefulness of the theory developed in this paper.

Journal ArticleDOI
TL;DR: A Bayesian coalitional game model for cooperative packet delivery is extended to a dynamic game model in which a Nash-stable coalitional structure is obtained in each subgame. Another solution concept, the Bayesian core, is also considered; it guarantees that no mobile node has an incentive to leave the grand coalition.
Abstract: Cooperative packet delivery can improve the data delivery performance in wireless networks by exploiting the mobility of the nodes, especially in networks with intermittent connectivity, high delay and error rates such as wireless mobile delay-tolerant networks (DTNs). For such a network, we study the problem of rational coalition formation among mobile nodes to cooperatively deliver packets to other mobile nodes in a coalition. Such coalitions are formed by mobile nodes which can be either well behaved or misbehaving in the sense that the well-behaved nodes always help each other for packet delivery, while the misbehaving nodes act selfishly and may not help the other nodes. A Bayesian coalitional game model is developed to analyze the behavior of mobile nodes in coalition formation in the presence of this uncertainty of node behavior (i.e., type). Given the beliefs about the other mobile nodes' types, each mobile node makes a decision to form a coalition, and thus the coalitions in the network vary dynamically. A solution concept called Nash-stability is considered to find a stable coalitional structure in this coalitional game with incomplete information. We present a distributed algorithm and a discrete-time Markov chain (DTMC) model to find the Nash-stable coalitional structures. We also consider another solution concept, namely, the Bayesian core, which guarantees that no mobile node has an incentive to leave the grand coalition. The Bayesian game model is extended to a dynamic game model for which we propose a method for each mobile node to update its beliefs about other mobile nodes' types when the coalitional game is played repeatedly. The performance evaluation results show that, for this dynamic Bayesian coalitional game, a Nash-stable coalitional structure is obtained in each subgame. Also, the actual payoff of each mobile node is close to that when all the information is completely known. In addition, the payoffs of the mobile nodes will be at least as high as those when they act alone (i.e., the mobile nodes do not form coalitions).

Posted Content
TL;DR: For the iterated Prisoner's Dilemma, there exist Markov strategies which solve the problem when attention is restricted to the long-term average payoff, and these good strategies effectively stabilize cooperative behavior.
Abstract: For the iterated Prisoner's Dilemma, there exist Markov strategies which solve the problem when we restrict attention to the long-term average payoff. When used by both players, these assure the cooperative payoff for each of them. Neither player can benefit by moving unilaterally to any other strategy, i.e., these are Nash equilibria. In addition, if a player uses instead an alternative which decreases the opponent's payoff below the cooperative level, then his own payoff is decreased as well. Thus, if we limit attention to the long-term payoff, these good strategies effectively stabilize cooperative behavior. We characterize these good strategies and analyze their role in evolutionary dynamics.
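To make the notion of a long-run average payoff between Markov (memory-one) strategies concrete, here is a minimal Python sketch that propagates the distribution over the four outcomes CC, CD, DC, DD and averages the stage payoffs over many rounds. The payoff values (R, S, T, P) = (3, 0, 5, 1), the number of rounds, and the use of tit-for-tat and always-defect are assumptions chosen for illustration; they are not claimed to be the "good strategies" characterized in the paper.

```python
# Conventional Prisoner's Dilemma payoffs (assumed values, not from the paper).
R, S, T, P = 3.0, 0.0, 5.0, 1.0

# Outcomes are indexed from player X's view: 0=CC, 1=CD, 2=DC, 3=DD.
# Player Y sees CD and DC swapped.
SWAP = (0, 2, 1, 3)


def step(dist, p, q):
    """Advance the outcome distribution by one round.

    p[s] / q[s] is the probability that X / Y cooperates after outcome s,
    each indexed from that player's own point of view.
    """
    nxt = [0.0, 0.0, 0.0, 0.0]
    for s, mass in enumerate(dist):
        px = p[s]
        qy = q[SWAP[s]]
        nxt[0] += mass * px * qy
        nxt[1] += mass * px * (1 - qy)
        nxt[2] += mass * (1 - px) * qy
        nxt[3] += mass * (1 - px) * (1 - qy)
    return nxt


def average_payoffs(p, q, first, rounds=10_000):
    """Mean per-round payoffs (X, Y), given the first-round outcome distribution."""
    dist = list(first)
    total = [0.0, 0.0, 0.0, 0.0]
    for _ in range(rounds):
        for s in range(4):
            total[s] += dist[s]
        dist = step(dist, p, q)
    mean = [t / rounds for t in total]
    x = sum(m * u for m, u in zip(mean, (R, S, T, P)))
    y = sum(m * u for m, u in zip(mean, (R, T, S, P)))
    return x, y


TFT = (1.0, 0.0, 1.0, 0.0)    # cooperate iff the opponent cooperated last round
ALLD = (0.0, 0.0, 0.0, 0.0)   # always defect

# Both play tit-for-tat and open with mutual cooperation.
both_tft = average_payoffs(TFT, TFT, first=(1.0, 0.0, 0.0, 0.0))
# X switches to always-defect; the first round is then DC.
defector = average_payoffs(ALLD, TFT, first=(0.0, 0.0, 1.0, 0.0))
```

In this toy setting mutual tit-for-tat yields the cooperative payoff for both players, while the unilateral switch to always-defect lowers both players' long-run averages to roughly the mutual-defection payoff, which mirrors the qualitative statement in the abstract.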

Journal ArticleDOI
TL;DR: In this paper, the authors prove an equilibrium existence theorem for games with discontinuous payoffs and convex and compact strategy spaces, which generalizes the classical result of Reny (1999), as well as the recent paper of McLennan, Monteiro, and Tourky (2011).
Abstract: In this note, we prove an equilibrium existence theorem for games with discontinuous payoffs and convex and compact strategy spaces. It generalizes the classical result of Reny (1999), as well as the recent paper of McLennan, Monteiro, and Tourky (2011). Our conditions are simple and easy to verify. Importantly, examples of spatial location models show that our conditions allow for economically meaningful payoff discontinuities that are not covered by other conditions in the literature.