Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published within this topic, receiving 202,664 citations.


Papers
Journal ArticleDOI
TL;DR: This paper linearly decomposes the per-SP Markov decision process to simplify decision making at each SP and derives an online scheme based on deep reinforcement learning to approach the optimal abstract control policies.
Abstract: With cellular networks becoming increasingly agile, a major challenge lies in how to support diverse services for mobile users (MUs) over a common physical network infrastructure. Network slicing is a promising solution to tailor the network to match such service requests. This paper considers a system with radio access network (RAN)-only slicing, where the physical infrastructure is split into slices providing computation and communication functionalities. A limited number of channels are auctioned across scheduling slots to MUs of multiple service providers (SPs) (i.e., the tenants). Each SP behaves selfishly to maximize the expected long-term payoff from the competition with other SPs for the orchestration of channels, which provides its MUs with the opportunities to access the computation and communication slices. This problem is modelled as a stochastic game, in which the decision making of an SP depends on the global network dynamics as well as the joint control policy of all SPs. To approximate the Nash equilibrium solutions, we first construct an abstract stochastic game with the local conjectures of channel auction among the SPs. We then linearly decompose the per-SP Markov decision process to simplify the decision making at an SP and derive an online scheme based on deep reinforcement learning to approach the optimal abstract control policies. Numerical experiments show significant performance gains from our scheme.

117 citations

Proceedings Article
01 Dec 2004
TL;DR: A new set of criteria for learning algorithms in multi-agent systems is proposed, one that is more stringent and better justified than previously proposed criteria, and a novel algorithm that meets these criteria is shown to almost universally outperform previous learning algorithms.
Abstract: We propose a new set of criteria for learning algorithms in multi-agent systems, one that is more stringent and (we argue) better justified than previously proposed criteria. Our criteria, which apply most straightforwardly in repeated games with average rewards, consist of three requirements: (a) against a specified class of opponents (this class is a parameter of the criterion) the algorithm yield a payoff that approaches the payoff of the best response, (b) against other opponents the algorithm's payoff at least approach (and possibly exceed) the security level payoff (or maximin value), and (c) subject to these requirements, the algorithm achieve a close to optimal payoff in self-play. We furthermore require that these average payoffs be achieved quickly. We then present a novel algorithm, and show that it meets these new criteria for a particular parameter class, the class of stationary opponents. Finally, we show that the algorithm is effective not only in theory, but also empirically. Using a recently introduced comprehensive game theoretic test suite, we show that the algorithm almost universally outperforms previous learning algorithms.

117 citations
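
The two benchmark payoffs named in criteria (a) and (b) of the entry above can be illustrated on a small matrix game. The following is a minimal sketch, not the paper's algorithm: the function names and the matching-pennies payoffs are assumptions made here for illustration, and the security level is computed over pure strategies only (the mixed-strategy maximin value would generally require a linear program).

```python
from typing import Sequence, Tuple


def best_response(payoffs: Sequence[Sequence[float]],
                  opponent_mix: Sequence[float]) -> Tuple[int, float]:
    """Best action and its expected payoff against a stationary opponent.

    payoffs[i][j] is the row player's payoff when it plays i and the
    opponent plays j; opponent_mix[j] is the fixed probability of j.
    """
    expected = [sum(p * q for p, q in zip(row, opponent_mix)) for row in payoffs]
    best = max(range(len(expected)), key=expected.__getitem__)
    return best, expected[best]


def pure_security_level(payoffs: Sequence[Sequence[float]]) -> float:
    """Largest payoff the row player can guarantee with a single pure action."""
    return max(min(row) for row in payoffs)


# Hypothetical example: matching pennies from the row player's view.
MATCHING_PENNIES = [[1.0, -1.0], [-1.0, 1.0]]
action, value = best_response(MATCHING_PENNIES, [0.75, 0.25])
guaranteed = pure_security_level(MATCHING_PENNIES)
```

Against an opponent that plays its first action three times out of four, the best-response payoff exceeds what any pure action can guarantee, which is the gap criterion (a) asks a learner to close.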

Journal ArticleDOI
TL;DR: This paper develops exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks, builds efficient exact algorithms on them, and demonstrates that these algorithms deliver significant speedups over the Monte Carlo approach.
Abstract: The Shapley value, probably the most important normative payoff division scheme in coalitional games, has recently been advocated as a useful measure of centrality in networks. However, although this approach has a variety of real-world applications (including social and organisational networks, biological networks and communication networks), its computational properties have not been widely studied. To date, the only practicable approach to compute Shapley value-based centrality has been via Monte Carlo simulations, which are computationally expensive and not guaranteed to give an exact answer. Against this background, this paper presents the first study of the computational aspects of the Shapley value for network centralities. Specifically, we develop exact analytical formulae for Shapley value-based centrality in both weighted and unweighted networks and develop efficient (polynomial time) and exact algorithms based on them. We empirically evaluate these algorithms on two real-life examples (an infrastructure network representing the topology of the Western States Power Grid and a collaboration network from the field of astrophysics) and demonstrate that they deliver significant speedups over the Monte Carlo approach. For instance, in the case of unweighted networks our algorithms are able to return the exact solution about 1600 times faster than the Monte Carlo approximation, even if we allow for a generous 10% error margin for the latter method.

117 citations
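
The contrast drawn in the entry above between Monte Carlo estimation and exact formulae can be sketched on a toy graph. The game used here is an assumption chosen for illustration, not necessarily one of the paper's: a coalition's value is the number of nodes it contains or neighbours. For that game, each node is first covered by whichever member of its closed neighbourhood arrives first in a random ordering, so a node v has Shapley value equal to the sum of 1 / (1 + deg(u)) over u in v's closed neighbourhood. The graph and all function names are hypothetical.

```python
import random
from fractions import Fraction
from itertools import permutations

# Hypothetical toy graph: the path 0 - 1 - 2, as an adjacency map.
GRAPH = {0: {1}, 1: {0, 2}, 2: {1}}


def fringe_value(coalition, graph):
    """Number of nodes that are in the coalition or adjacent to it."""
    covered = set(coalition)
    for node in coalition:
        covered |= graph[node]
    return len(covered)


def marginals(order, graph):
    """Marginal contribution of each node when nodes join in the given order."""
    members, previous, result = set(), 0, {}
    for node in order:
        members.add(node)
        current = fringe_value(members, graph)
        result[node] = current - previous
        previous = current
    return result


def exact_shapley(graph):
    """Brute force: average marginal contributions over all orderings."""
    nodes = sorted(graph)
    totals = {n: Fraction(0) for n in nodes}
    count = 0
    for order in permutations(nodes):
        count += 1
        for node, gain in marginals(order, graph).items():
            totals[node] += gain
    return {n: totals[n] / count for n in nodes}


def closed_form_shapley(graph):
    """Polynomial-time formula for the fringe game assumed above."""
    return {v: sum(Fraction(1, 1 + len(graph[u])) for u in graph[v] | {v})
            for v in graph}


def monte_carlo_shapley(graph, samples, seed=0):
    """Estimate by sampling random orderings; approximate, never exact."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    totals = {n: 0.0 for n in nodes}
    for _ in range(samples):
        order = nodes[:]
        rng.shuffle(order)
        for node, gain in marginals(order, graph).items():
            totals[node] += gain
    return {n: totals[n] / samples for n in nodes}
```

Brute-force enumeration grows factorially with the number of nodes and sampling only approximates the answer, whereas the closed form needs a single pass over each node's neighbourhood.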

Journal ArticleDOI
TL;DR: In this paper, the authors test game-theoretic predictions using field data from the Swedish lowest unique positive integer (LUPI) game, in which players pick positive integers and whoever chose the lowest unique number wins a fixed prize.
Abstract: Game theory is usually difficult to test precisely in the field because predictions typically depend sensitively on features that are not controlled or observed. We conduct one such test using field data from the Swedish lowest unique positive integer (LUPI) game. In the LUPI game, players pick positive integers and whoever chose the lowest unique number wins a fixed prize. Theoretical equilibrium predictions are derived assuming Poisson-distributed uncertainty about the number of players, and tested using both field and laboratory data. The field and lab data show similar patterns. Despite various deviations from equilibrium, there is a surprising degree of convergence toward equilibrium. Some of the deviations from equilibrium can be rationalized by a cognitive hierarchy model.

116 citations
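
The winning rule of the LUPI game described in the entry above is simple enough to state as code. This is a minimal sketch of the rule only, not of the paper's equilibrium analysis; the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Optional


def lupi_winning_number(choices: Iterable[int]) -> Optional[int]:
    """Return the lowest number picked by exactly one player, or None.

    None means every number was chosen by at least two players,
    so the round has no winner.
    """
    counts = Counter(choices)
    unique = [number for number, count in counts.items() if count == 1]
    return min(unique) if unique else None
```

For the picks 1, 1, 2, 3, 3 the winning number is 2: the number 1 is lower but was chosen twice, so it is not unique.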

Journal ArticleDOI
TL;DR: In this article, sufficient conditions are given for large replica games without side payments to have non-empty approximate cores for all sufficiently large replications; the conditions are superadditivity, a boundedness condition, and convexity of the payoff sets.

116 citations


Network Information
Related Topics (5)
Markov chain: 51.9K papers, 1.3M citations, 81% related
Incentive: 41.5K papers, 1M citations, 81% related
Heuristics: 32.1K papers, 956.5K citations, 80% related
Linear programming: 32.1K papers, 920.3K citations, 79% related
Empirical research: 51.3K papers, 1.9M citations, 78% related
Performance Metrics
No. of papers in the topic in previous years
Year    Papers
2023    364
2022    738
2021    462
2020    512
2019    460
2018    483