Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published on this topic, receiving 202,664 citations.


Papers
Journal Article
TL;DR: It is proved that all stable stationary points of the algorithm are Nash equilibria for the game and it is shown that the algorithm always converges to a desirable solution.
Abstract: A multi-person discrete game where the payoff after each play is stochastic is considered. The distribution of the random payoff is unknown to the players and further none of the players know the strategies or the actual moves of other players. A learning algorithm for the game based on a decentralized team of learning automata is presented. It is proved that all stable stationary points of the algorithm are Nash equilibria for the game. Two special cases of the game are also discussed, namely, game with common payoff and the relaxation labelling problem. The former has applications such as pattern recognition and the latter is a problem widely studied in computer vision. For the two special cases it is shown that the algorithm always converges to a desirable solution.
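The decentralized setting described in this abstract can be sketched in a few lines. The abstract does not name its update rule, so the sketch below assumes the classical linear reward-inaction scheme; the function names, the `step` parameter, and the `sample_payoffs` callback are all hypothetical, and payoffs are assumed to be scaled to [0, 1]. Each automaton sees only its own action and its own payoff, never the other players' moves.

```python
import random

def lri_update(probs, action, reward, step=0.1):
    """Linear reward-inaction update for one automaton (assumed rule).

    probs: current action probabilities; action: index just played;
    reward: this player's payoff, assumed scaled to [0, 1].
    """
    return [
        p + step * reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(probs)
    ]

def play_round(team, sample_payoffs, step=0.1, rng=random):
    """One play of the game for a team of automata.

    team: list of probability vectors, one per player.
    sample_payoffs: hypothetical callback mapping the joint action to a
    list of (possibly random) payoffs, one per player.
    """
    # Each player samples independently from its own distribution.
    actions = [rng.choices(range(len(p)), weights=p)[0] for p in team]
    rewards = sample_payoffs(actions)
    # Each player updates from its own action and payoff only.
    new_team = [lri_update(p, a, r, step) for p, a, r in zip(team, actions, rewards)]
    return new_team, actions
```

A zero payoff leaves the probabilities unchanged (the "inaction" part), and every update keeps the vector a valid probability distribution.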

316 citations

Journal Article
TL;DR: It is shown how the network adaptation dynamics favors the emergence of cooperators with the highest payoff, and these "leaders" are shown to sustain the global cooperative steady state.
Abstract: Cooperative behavior among a group of agents is studied assuming adaptive interactions. Each agent plays a Prisoner's Dilemma game with its local neighbors, collects an aggregate payoff, and imitates the strategy of its best neighbor. Agents may punish or reward their neighbors by removing or sustaining the interactions, according to their satisfaction level and strategy played. An agent may dismiss an interaction, and the corresponding neighbor is replaced by another randomly chosen agent, introducing diversity and evolution to the network structure. We perform an extensive numerical and analytical study, extending results in M. G. Zimmermann, V. M. Eguiluz, and M. San Miguel, Phys. Rev. E 69, 065102(R) (2004). We show that the system typically reaches either a full-defective state or a highly cooperative steady state. The latter equilibrium solution is composed mostly of cooperative agents, with a minor population of defectors that exploit the cooperators. It is shown how the network adaptation dynamics favors the emergence of cooperators with the highest payoff. These "leaders" are shown to sustain the global cooperative steady state. We also find that the average payoff of defectors is larger than the average payoff of cooperators. Whenever "leaders" are perturbed (e.g., by addition of noise), an unstable situation arises and global cascades with oscillations between the nearly full defection network and the fully cooperative outcome are observed.
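The play-then-imitate step from this abstract can be illustrated with a minimal sketch. The payoff values are an assumption (mutual cooperation pays 1, a defector exploiting a cooperator earns `b`, everything else pays 0), the function names are hypothetical, and the network rewiring that the paper studies is deliberately left out.

```python
def payoff(me, other, b=1.5):
    """Payoff to `me` against `other`; the values are an assumed parameterization."""
    if me == "C":
        return 1.0 if other == "C" else 0.0
    return b if other == "C" else 0.0

def imitation_step(neighbors, strategy, b=1.5):
    """One synchronous round on a fixed network (no rewiring).

    neighbors: dict node -> list of adjacent nodes.
    strategy: dict node -> "C" (cooperate) or "D" (defect).
    """
    # Aggregate payoff from playing every local neighbor once.
    total = {
        v: sum(payoff(strategy[v], strategy[u], b) for u in neighbors[v])
        for v in neighbors
    }
    new_strategy = {}
    for v in neighbors:
        # Compare against self first, so ties keep the current strategy.
        best = max([v] + list(neighbors[v]), key=lambda u: total[u])
        new_strategy[v] = strategy[best]
    return new_strategy, total
```

On a three-node line with a defector in the middle, both cooperators copy the defector after one round, a small instance of the full-defective outcome the abstract mentions.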

313 citations

Journal Article
TL;DR: In this paper, the authors extend the exploration of the dynamics of spatial evolutionary games in three distinct but related ways: deterministic versus stochastic rules, discrete versus continuous time, and different geometries of interaction in regular and random spatial arrays.
Abstract: We extend our exploration of the dynamics of spatial evolutionary games [Nowak & May 1992, 1993] in three distinct but related ways. We analyse, first, deterministic versus stochastic rules; second, discrete versus continuous time (see Hubermann & Glance [1993]); and, third, different geometries of interaction in regular and random spatial arrays. We show that spatial effects can change some of the intuitive concepts in evolutionary game theory: (i) equilibria among strategies are no longer necessarily characterised by equal average payoffs; (ii) the strategy with the higher average payoff can steadily converge towards extinction; (iii) strategies can become extinct even though their basic reproductive rate (at very low frequencies) is larger than one. The equilibrium properties of spatial games are instead determined by “local relative payoffs.” We characterise the conditions for coexistence between cooperators and defectors in the spatial prisoner’s dilemma game. We find that cooperation can be maintain...

312 citations

Journal Article
TL;DR: The proposed stationary policy in the anti-jamming game is shown to achieve much better performance than the policy obtained from myopic learning, which only maximizes each stage's payoff, and a random defense strategy, since it successfully accommodates the environment dynamics and the strategic behavior of the cognitive attackers.
Abstract: Various spectrum management schemes have been proposed in recent years to improve the spectrum utilization in cognitive radio networks. However, few of them have considered the existence of cognitive attackers who can adapt their attacking strategy to the time-varying spectrum environment and the secondary users' strategy. In this paper, we investigate the security mechanism when secondary users are facing the jamming attack, and propose a stochastic game framework for anti-jamming defense. At each stage of the game, secondary users observe the spectrum availability, the channel quality, and the attackers' strategy from the status of jammed channels. According to this observation, they will decide how many channels they should reserve for transmitting control and data messages and how to switch between the different channels. Using the minimax-Q learning, secondary users can gradually learn the optimal policy, which maximizes the expected sum of discounted payoffs defined as the spectrum-efficient throughput. The proposed stationary policy in the anti-jamming game is shown to achieve much better performance than the policy obtained from myopic learning, which only maximizes each stage's payoff, and a random defense strategy, since it successfully accommodates the environment dynamics and the strategic behavior of the cognitive attackers.

310 citations

Journal Article
TL;DR: The number of steps in a finite game is related to the least positive eigenvalue of the Laplace operator of the graph to show that the finiteness of the game and the terminating configuration are independent of the moves made.
Abstract: We analyse the following (solitaire) game: each node of a graph contains a pile of chips, and a move consists of selecting a node with at least as many chips on it as its degree, and letting it send one chip to each of its neighbors. The game terminates if there is no such node. We show that the finiteness of the game and the terminating configuration are independent of the moves made. If the number of chips is less than the number of edges, the game is always finite. If the number of chips is at least the number of edges, the game can be infinite for an appropriately chosen initial configuration. If the number of chips is more than twice the number of edges minus the number of nodes, then the game is always infinite. The independence of the finiteness and the terminating position follows from simple but powerful ‘exchange properties’ of the sequences of legal moves, and from some general results on ‘antimatroids with repetition’, i.e. languages having these exchange properties. We relate the number of steps in a finite game to the least positive eigenvalue of the Laplace operator of the graph.
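The move rule in this abstract is simple enough to simulate directly. The sketch below assumes a simple undirected graph given as an adjacency dictionary, skips isolated nodes, and adds a `max_steps` cap so that infinite games return instead of looping; the function name, the `pick` parameter, and the cap are all hypothetical choices, not part of the paper.

```python
def fire_until_stable(adjacency, chips, pick=min, max_steps=10_000):
    """Play the chip-firing game until no node can fire or the cap is hit.

    adjacency: dict node -> list of neighbors (simple undirected graph).
    chips: dict node -> number of chips on that node.
    pick: chooses which ready node fires next (e.g. min or max).
    Returns (final configuration, steps taken, whether the game terminated).
    """
    chips = dict(chips)  # work on a copy
    steps = 0
    while steps < max_steps:
        # A node may fire if it holds at least as many chips as its degree.
        ready = [v for v in adjacency
                 if adjacency[v] and chips[v] >= len(adjacency[v])]
        if not ready:
            return chips, steps, True
        v = pick(ready)
        chips[v] -= len(adjacency[v])
        for u in adjacency[v]:
            chips[u] += 1  # one chip to each neighbor
        steps += 1
    return chips, steps, False

# Path a-b-c-d has 3 edges; 2 chips is fewer than 3, so the game is finite.
path = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}
start = {"a": 1, "b": 0, "c": 0, "d": 1}
print(fire_until_stable(path, start, pick=min))
print(fire_until_stable(path, start, pick=max))
```

Both calls reach the same final configuration in the same number of steps, consistent with the abstract's claim that the outcome does not depend on the order of moves. On a triangle (3 edges) with chips 2, 1, 0, the run hits the cap, an instance of an infinite game when the chip count equals the edge count.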

310 citations


Network Information
Related Topics (5)
Markov chain: 51.9K papers, 1.3M citations, 81% related
Incentive: 41.5K papers, 1M citations, 81% related
Heuristics: 32.1K papers, 956.5K citations, 80% related
Linear programming: 32.1K papers, 920.3K citations, 79% related
Empirical research: 51.3K papers, 1.9M citations, 78% related
Performance Metrics
No. of papers in the topic in previous years
Year  Papers
2023  364
2022  738
2021  462
2020  512
2019  460
2018  483