JSFP converges to a pure Nash equilibrium in congestion games, or equivalently in finite potential games, when players use some inertia in their decisions, both with and without exponential discounting of the historical data.
Abstract:
We consider multi-player repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. In these "large-scale" games, players are inherently faced with limitations in both their observational and computational capabilities. Accordingly, players in large-scale games need to make their decisions using algorithms that accommodate limitations in information gathering and processing. This disqualifies some of the well-known decision making models such as "Fictitious Play" (FP), in which each player must monitor the individual actions of every other player and must optimize over a high dimensional probability space. We will show that Joint Strategy Fictitious Play (JSFP), a close variant of FP, alleviates both the informational and computational burden of FP. Furthermore, we introduce JSFP with inertia, i.e., a probabilistic reluctance to change strategies, and establish the convergence to a pure Nash equilibrium in all generalized ordinal potential games in both cases of averaged or exponentially discounted historical data. We illustrate JSFP with inertia on the specific class of congestion games, a subset of generalized ordinal potential games. In particular, we illustrate the main results on a distributed traffic routing problem and derive tolling procedures that can lead to optimized total traffic congestion.
TL;DR: This work extends existing learning algorithms to accommodate restricted action sets caused by the limitations of agent capabilities and group based decision making, and introduces a new class of games called sometimes weakly acyclic games for time-varying objective functions and action sets, and provides distributed algorithms for convergence to an equilibrium.
TL;DR: In this article, a game-theoretical approach is proposed to solve the problem of autonomous vehicle-target assignment, where a group of vehicles are expected to optimally assign themselves to a set of targets.
TL;DR: The goal is to use these behavioral models as a prescriptive control approach in distributed multi-agent systems where the guaranteed limiting behavior would represent a desirable operating condition.
TL;DR: It is shown that with the proposed games, global optimization is achieved with local information; specifically, the local altruistic game maximizes the network throughput and the local congestion game minimizes the network collision level.
TL;DR: A systematic methodology for designing local agent objective functions that guarantees an equivalence between the resulting Nash equilibria and the optimizers of the system-level objective, and that ensures the resulting game possesses an inherent structure that can be exploited in distributed learning, e.g., that of a potential game.
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
TL;DR: Intended as a graduate-level text and a general professional reference, this book presents the methods of discrete choice analysis and their applications in the modeling of transportation systems, including a complete travel demand model system in Chapter 11.
TL;DR: In this book the authors investigate the nonlinear dynamics of the self-regulation of social and economic behavior, and of the closely related interactions among species in ecological communities.
TL;DR: In this paper, the authors present a general formulation of non-cooperative finite games: N-Person nonzero-sum games, Pursuit-Evasion games, and Stackelberg Equilibria of infinite dynamic games.
TL;DR: In this article, the authors present the first textbook that fully explains the neuro-dynamic programming/reinforcement learning methodology, which is a recent breakthrough in the practical application of neural networks and dynamic programming to complex problems of planning, optimal decision making, and intelligent control.
Q1. What contributions have the authors mentioned in the paper "Joint strategy fictitious play with inertia for potential games"?
The authors consider multi-player repeated games involving a large number of players with large strategy spaces and enmeshed utility structures. The authors will show that Joint Strategy Fictitious Play (JSFP), a close variant of FP, alleviates both the informational and computational burden of FP. Furthermore, the authors introduce JSFP with inertia, i.e., a probabilistic reluctance to change strategies, and establish the convergence to a pure Nash equilibrium in all generalized ordinal potential games in both cases of averaged or exponentially discounted historical data. The authors illustrate JSFP with inertia on the specific class of congestion games, a subset of generalized ordinal potential games. In particular, the authors illustrate the main results on a distributed traffic routing problem and derive tolling procedures that can lead to optimized total traffic congestion.
Q2. What is the probability distribution of regret matching?
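In regret matching, once player $P_i$ computes his average regret $R_i^{\bar{a}_i}(t)$ for each action $\bar{a}_i \in A_i$, he chooses an action $a_i(t)$, $t > 1$, according to the probability distribution $p_i(t)$ defined as $p_i^{\bar{a}_i}(t) = \frac{[R_i^{\bar{a}_i}(t)]^+}{\sum_{\tilde{a}_i \in A_i} [R_i^{\tilde{a}_i}(t)]^+}$ for any $\bar{a}_i \in A_i$, provided that the denominator above is positive; otherwise, $p_i(t)$ is the uniform distribution over $A_i$. Here $[x]^+ = \max\{x, 0\}$.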
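The rule described above can be sketched in a few lines of Python. This is a minimal illustration, not code from the paper: the function name `regret_matching_distribution` and the list-based representation of the regrets are assumptions made for the example.

```python
from typing import List, Sequence


def regret_matching_distribution(avg_regret: Sequence[float]) -> List[float]:
    """Map one player's average regrets (one per action) to action probabilities.

    Hypothetical helper: each action gets probability proportional to the
    positive part of its regret. If no regret is positive, the denominator
    is zero and the uniform distribution is returned instead.
    """
    positive = [max(r, 0.0) for r in avg_regret]
    total = sum(positive)
    if total <= 0.0:
        n = len(avg_regret)
        return [1.0 / n] * n
    return [p / total for p in positive]


# Two positive regrets: probabilities are proportional to 0.5 and 1.5.
print(regret_matching_distribution([0.5, -0.2, 1.5]))  # [0.25, 0.0, 0.75]
# No positive regret: fall back to the uniform distribution.
print(regret_matching_distribution([-1.0, 0.0]))  # [0.5, 0.5]
```

Actions with negative regret receive zero probability, so the player only ever moves toward actions that would have done better on average.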
Q3. What is the proof of Proposition 3.1?
Theorem 3.1: In any finite generalized ordinal potential game in which no player is indifferent between distinct strategies as in Assumption 2.2, the action profiles generated by a fading memory JSFP with inertia process satisfying Assumption 2.1 converge to a pure Nash equilibrium almost surely.
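To make the process named in the theorem concrete, here is a rough Python sketch of JSFP with inertia on a toy congestion game. Everything in it is an illustrative assumption rather than the paper's specification: the game (each player's cost is the number of players on the same road), the inertia rule (a player whose current action is not a best response switches with a fixed probability `alpha`), and the names `jsfp_with_inertia`, `is_pure_nash`, and `rho` (the fading-memory weight; `None` selects plain averaging).

```python
import random
from typing import Callable, List, Optional, Tuple

Profile = Tuple[int, ...]


def congestion_utility(i: int, profile: Profile) -> float:
    """Toy game (assumption): cost equals the number of players on player i's road."""
    return -float(sum(1 for a in profile if a == profile[i]))


def is_pure_nash(profile: Profile, n_actions: int,
                 utility: Callable[[int, Profile], float] = congestion_utility) -> bool:
    """Return True if no player can gain by a unilateral deviation."""
    for i in range(len(profile)):
        current = utility(i, profile)
        for alt in range(n_actions):
            deviation = profile[:i] + (alt,) + profile[i + 1:]
            if utility(i, deviation) > current:
                return False
    return True


def jsfp_with_inertia(initial: Profile, n_actions: int, steps: int,
                      alpha: float = 0.5, rho: Optional[float] = None, seed: int = 0,
                      utility: Callable[[int, Profile], float] = congestion_utility) -> Profile:
    """Run the sketch for `steps` stages and return the final action profile."""
    rng = random.Random(seed)
    n = len(initial)
    profile = tuple(initial)

    def hypothetical(i: int, prof: Profile) -> List[float]:
        # Utility player i would get from each action, others' actions held fixed.
        return [utility(i, prof[:i] + (a,) + prof[i + 1:]) for a in range(n_actions)]

    # V[i][a]: player i's running estimate of the utility of action a.
    V = [hypothetical(i, profile) for i in range(n)]
    for t in range(1, steps + 1):
        chosen = []
        for i in range(n):
            best = max(V[i])
            best_set = [a for a in range(n_actions) if V[i][a] == best]
            # Inertia: keep the current action if it is already a best response,
            # or otherwise with probability 1 - alpha. Float ties are compared exactly.
            if profile[i] in best_set or rng.random() >= alpha:
                chosen.append(profile[i])
            else:
                chosen.append(rng.choice(best_set))
        profile = tuple(chosen)
        # Averaged data uses weight 1/(t+1); fading memory uses the constant rho.
        weight = rho if rho is not None else 1.0 / (t + 1)
        for i in range(n):
            observed = hypothetical(i, profile)
            V[i] = [(1 - weight) * v + weight * u for v, u in zip(V[i], observed)]
    return profile


final = jsfp_with_inertia((0, 0), n_actions=2, steps=200, rho=0.2)
print(final, is_pure_nash(final, 2))
```

In this two-player, two-road game the pure Nash equilibria are the profiles where the players use different roads. Each player only needs its own estimates `V[i]`, not the individual empirical frequencies of every other player.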
Q4. What is the expected utility of a player’s action in a JSFP game?
When written in this form, JSFP appears to have a computational burden for each player that is even higher than that of FP, since tracking the empirical frequencies $z_{-i}(t) \in \Delta(A_{-i})$ of the joint actions of the other players is more demanding for player $P_i$ than tracking the empirical frequencies of the actions of the other players individually, where $\Delta(A_{-i})$ denotes the set of probability distributions on a finite set $A_{-i}$.
Q5. What is the total congestion experienced by all drivers on the network?
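The total congestion experienced by all drivers on the network is $T_c(a) = \sum_{r \in R} \sigma_r(a)\, c_r(\sigma_r(a))$, where $\sigma_r(a)$ is the number of drivers using road $r$ and $c_r(\cdot)$ is the congestion on road $r$. Define a new congestion game where each driver’s utility takes the form $U_i(a) = -\sum_{r \in a_i} \left[ c_r(\sigma_r(a)) + t_r(\sigma_r(a)) \right]$, where $t_r(\cdot)$ is the toll imposed on road $r$, which is a function of the number of users of road $r$.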
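As a small numerical illustration of these two quantities, the following Python sketch computes the total congestion of a routing profile and a driver's utility once tolls are added. The road names, congestion functions, and toll functions are made-up placeholders, and the toll used here is not the scheme of Proposition 4.1, which this excerpt does not state.

```python
from collections import Counter
from typing import Callable, Dict, Sequence, Tuple

Route = Tuple[str, ...]  # a route is a tuple of distinct road names


def road_counts(profile: Sequence[Route]) -> Counter:
    """Count how many drivers use each road under the given routing profile."""
    counts: Counter = Counter()
    for route in profile:
        counts.update(route)
    return counts


def total_congestion(profile: Sequence[Route],
                     congestion: Dict[str, Callable[[int], float]]) -> float:
    """Sum over roads of (number of users) times (per-user congestion)."""
    counts = road_counts(profile)
    return sum(k * congestion[r](k) for r, k in counts.items())


def tolled_utility(i: int, profile: Sequence[Route],
                   congestion: Dict[str, Callable[[int], float]],
                   toll: Dict[str, Callable[[int], float]]) -> float:
    """Driver i's utility: minus the congestion plus toll on each road of its route."""
    counts = road_counts(profile)
    return -sum(congestion[r](counts[r]) + toll[r](counts[r]) for r in profile[i])


# Hypothetical example: three drivers, two single-road routes.
profile = (("top",), ("top",), ("bottom",))
congestion = {"top": lambda k: float(k), "bottom": lambda k: 2.0}
toll = {"top": lambda k: float(k - 1), "bottom": lambda k: 0.0}  # placeholder tolls

print(total_congestion(profile, congestion))          # 6.0
print(tolled_utility(0, profile, congestion, toll))   # -3.0
print(tolled_utility(2, profile, congestion, toll))   # -2.0
```

Because each toll depends only on the number of users of its own road, the tolled game is still a congestion game.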
Q6. What is the resulting congestion game with tolls?
When the tolling scheme set forth in Proposition 4.1 is applied to the congestion game example considered previously, the resulting congestion game with tolls is a potential game in which no player is indifferent between distinct strategies.
Q7. What is the average utility of a player if they had used the same actions?
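Substituting (4) into (5) results in $V_i^{\bar{a}_i}(t) = \frac{1}{t} \sum_{\tau=0}^{t-1} U_i(\bar{a}_i, a_{-i}(\tau))$, which is the average utility player $P_i$ would have received if action $\bar{a}_i$ had been chosen at every stage up to time $t-1$ and other players used the same actions.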
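The average described here can be computed either directly from the history of the other players' joint actions or incrementally, one stage at a time, which is what keeps the bookkeeping light. The Python sketch below shows both, under stated assumptions: the function names, the toy utility, and the fading-memory weight `rho` are illustrative and not taken from the paper.

```python
from typing import Callable, Hashable, Sequence, Tuple

Action = Hashable
Others = Tuple[Action, ...]


def average_hypothetical_utility(utility: Callable[[Action, Others], float],
                                 candidate: Action,
                                 others_history: Sequence[Others]) -> float:
    """Average utility of always playing `candidate` against the observed history."""
    return sum(utility(candidate, a) for a in others_history) / len(others_history)


def update_running_average(v_prev: float, t: int, new_utility: float) -> float:
    """Incremental form: the average over t stages extended to t + 1 stages."""
    return (t * v_prev + new_utility) / (t + 1)


def update_fading_memory(v_prev: float, rho: float, new_utility: float) -> float:
    """Exponentially discounted variant (assumed form) with constant weight rho."""
    return (1.0 - rho) * v_prev + rho * new_utility


def toy_utility(candidate: Action, others: Others) -> float:
    """Hypothetical cost: one plus the number of other players on the same road."""
    return -float(1 + sum(1 for a in others if a == candidate))


history = [("A", "A"), ("A", "B"), ("A", "A"), ("B", "B")]
direct = average_hypothetical_utility(toy_utility, "A", history)

# Same quantity built up stage by stage, without storing the history.
running = 0.0
for t, others in enumerate(history):
    running = update_running_average(running, t, toy_utility("A", others))

print(direct, running)
```

The incremental update needs only the previous estimate and the utility the candidate action would have earned in the latest stage, so a player never has to track the joint empirical frequencies explicitly.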