Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published on this topic, receiving 202,664 citations.


Papers
Journal ArticleDOI
TL;DR: A class of decentralized tracking-type games is considered for large-population multi-agent systems (MAS) described by stochastic discrete-time auto-regressive models with exogenous inputs, in which the agents are coupled through their individual dynamics and performance indexes by terms involving the unknown population state average (PSA).

54 citations
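
For orientation, here is a minimal sketch of what a tracking-type performance index coupled through the population state average can look like, written in LaTeX; the symbols (y_i, u_i, Φ, r) are illustrative assumptions and are not taken from the paper:

```latex
% Illustrative only: a tracking-type cost for agent i, coupled to the other
% agents through the (unknown) population state average y^{(N)}.
J_i(u_i) = \limsup_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1}
  \mathbb{E}\!\left[ \bigl( y_i(t) - \Phi\bigl( y^{(N)}(t) \bigr) \bigr)^2 + r\, u_i^2(t) \right],
\qquad
y^{(N)}(t) = \frac{1}{N} \sum_{j=1}^{N} y_j(t).
```

Because no single agent observes y^{(N)} directly, decentralized schemes of this kind typically have each agent form an estimate of the population state average and track that estimate instead.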

Journal ArticleDOI
TL;DR: It is shown that both problems are undecidable for multi-exit RMDPs but decidable for 1-RMDPs and 1-RSSGs, and that more general model-checking problems with respect to linear-time temporal properties are undecidable even for a fixed property.
Abstract: We introduce Recursive Markov Decision Processes (RMDPs) and Recursive Simple Stochastic Games (RSSGs), which are classes of (finitely presented) countable-state MDPs and zero-sum turn-based (perfect information) stochastic games. They extend standard finite-state MDPs and stochastic games with a recursion feature. We study the decidability and computational complexity of these games under termination objectives for the two players: one player's goal is to maximize the probability of termination at a given exit, while the other player's goal is to minimize this probability. In the quantitative termination problems, given an RMDP (or RSSG) and probability p, we wish to decide whether the value of such a termination game is at least p (or at most p); in the qualitative termination problem we wish to decide whether the value is 1. The important 1-exit subclasses of these models, 1-RMDPs and 1-RSSGs, correspond in a precise sense to controlled and game versions of classic stochastic models, including multitype Branching Processes and Stochastic Context-Free Grammars, where the objective of the players is to maximize or minimize the probability of termination (extinction). We provide a number of upper and lower bounds for qualitative and quantitative termination problems for RMDPs and RSSGs. We show both problems are undecidable for multi-exit RMDPs, but are decidable for 1-RMDPs and 1-RSSGs. Specifically, the quantitative termination problem is decidable in PSPACE for both 1-RMDPs and 1-RSSGs, and is at least as hard as the square root sum problem, a well-known open problem in numerical computation. We show that the qualitative termination problem for 1-RMDPs (i.e., a controlled version of branching processes) can be solved in polynomial time both for maximizing and minimizing 1-RMDPs. The qualitative problem for 1-RSSGs is in NP ∩ coNP, and is at least as hard as the quantitative termination problem for Condon's finite-state simple stochastic games, whose complexity remains a well known open problem. Finally, we show that even for 1-RMDPs, more general (qualitative and quantitative) model-checking problems with respect to linear-time temporal properties are undecidable even for a fixed property.

54 citations
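
As a concrete companion to the finite-state case mentioned at the end of the abstract (Condon's simple stochastic games), below is a minimal value-iteration sketch for termination (reachability) probabilities; the data layout and function names are assumptions made for illustration, not code from the paper:

```python
# Minimal sketch: value iteration for termination probabilities in a
# finite-state simple stochastic game. MAX picks successors to maximize the
# probability of reaching `target`, MIN picks successors to minimize it, and
# average nodes move to a uniformly random successor. Iterating from 0
# converges (in the limit) to the least fixed point, which is the game value.
def ssg_values(max_nodes, min_nodes, avg_nodes, succ, target, n_iter=10000):
    nodes = set(max_nodes) | set(min_nodes) | set(avg_nodes) | {target}
    val = {v: 0.0 for v in nodes}
    val[target] = 1.0  # terminating at the target exit counts as value 1
    for _ in range(n_iter):
        new = dict(val)
        for v in max_nodes:
            new[v] = max(val[w] for w in succ[v])
        for v in min_nodes:
            new[v] = min(val[w] for w in succ[v])
        for v in avg_nodes:
            new[v] = sum(val[w] for w in succ[v]) / len(succ[v])
        val = new
    return val

# Tiny example: MAX node 'm' chooses between a fair-coin node 'c'
# (which reaches the target 't' or a dead end 'd') and the dead end itself.
values = ssg_values(max_nodes={'m'}, min_nodes=set(), avg_nodes={'c', 'd'},
                    succ={'m': ['c', 'd'], 'c': ['t', 'd'], 'd': ['d']},
                    target='t')
print(values['m'])  # converges to 0.5
```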

Journal ArticleDOI
TL;DR: This paper presents a theoretical framework for generalization in coevolutionary learning, the first in which generalization is defined and analyzed rigorously, and shows that a small sample of test strategies can be used to estimate the generalization performance.
Abstract: Coevolutionary learning involves a training process where training samples are instances of solutions that interact strategically to guide the evolutionary (learning) process. One main research issue is with the generalization performance, i.e., the search for solutions (e.g., input-output mappings) that best predict the required output for any new input that has not been seen during the evolutionary process. However, there is currently no such framework for determining the generalization performance in coevolutionary learning even though the notion of generalization is well-understood in machine learning. In this paper, we introduce a theoretical framework to address this research issue. We present the framework in terms of game-playing although our results are more general. Here, a strategy's generalization performance is its average performance against all test strategies. Given that the true value may not be determined by solving analytically a closed-form formula and is computationally prohibitive, we propose an estimation procedure that computes the average performance against a small sample of random test strategies instead. We perform a mathematical analysis to provide a statistical claim on the accuracy of our estimation procedure, which can be further improved by performing a second estimation on the variance of the random variable. For game-playing, it is well-known that one is more interested in the generalization performance against a biased and diverse sample of "good" test strategies. We introduce a simple approach to obtain such a test sample through the multiple partial enumerative search of the strategy space that does not require human expertise and is generally applicable to a wide range of domains. We illustrate the generalization framework on the coevolutionary learning of the iterated prisoner's dilemma (IPD) games. We investigate two definitions of generalization performance for the IPD game based on different performance criteria, e.g., in terms of the number of wins based on individual outcomes and in terms of average payoff. We show that a small sample of test strategies can be used to estimate the generalization performance. We also show that the generalization performance using a biased and diverse set of "good" test strategies is lower compared to the unbiased case for the IPD game. This is the first time that generalization is defined and analyzed rigorously in coevolutionary learning. The framework allows the evaluation of the generalization performance of any coevolutionary learning system quantitatively.

54 citations
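
To make the estimation idea concrete, here is a minimal sketch (not the authors' code) of estimating a strategy's generalization performance under the average-payoff criterion by sampling random memory-one IPD test strategies; the strategy encoding and payoff values are standard but assumed here for illustration:

```python
# Sketch of the estimation procedure described in the abstract: approximate a
# strategy's generalization performance (its average performance over all test
# strategies) by its average performance against a small random sample.
import random

def ipd_payoff(move_a, move_b):
    # Standard prisoner's dilemma payoffs to player A ('C' = cooperate, 'D' = defect).
    table = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}
    return table[(move_a, move_b)]

def play_ipd(strat_a, strat_b, rounds=100):
    # Memory-one strategies: a dict mapping the opponent's previous move
    # (None on the first round) to this player's next move.
    last_seen_a = last_seen_b = None
    total = 0
    for _ in range(rounds):
        a, b = strat_a[last_seen_a], strat_b[last_seen_b]
        total += ipd_payoff(a, b)
        last_seen_a, last_seen_b = b, a
    return total / rounds  # average payoff per round to player A

def random_strategy():
    return {key: random.choice('CD') for key in (None, 'C', 'D')}

def estimate_generalization(strategy, n_test=50):
    # Average payoff against n_test randomly sampled test strategies.
    return sum(play_ipd(strategy, random_strategy()) for _ in range(n_test)) / n_test

tit_for_tat = {None: 'C', 'C': 'C', 'D': 'D'}
print(estimate_generalization(tit_for_tat))
```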

Journal ArticleDOI
TL;DR: In this article, a two-player zero-sum stochastic differential game in which the players have asymmetric information on the random payoff is investigated; the authors prove that the game has a value and characterize it in terms of dual viscosity solutions of a second-order Hamilton-Jacobi equation.
Abstract: We investigate a two-player zero-sum stochastic differential game in which the players have an asymmetric information on the random payoff. We prove that the game has a value and characterize this value in terms of dual viscosity solutions of some second order Hamilton-Jacobi equation.

54 citations
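
For background only: in the symmetric-information case, the value of a zero-sum stochastic differential game is classically characterized as the viscosity solution of a second-order Hamilton-Jacobi-Isaacs equation of the generic shape below (the data b, σ, ℓ and the control sets U, V are assumptions for illustration); the paper's contribution is a dual-viscosity-solution characterization for the asymmetric-information case:

```latex
% Generic second-order Hamilton-Jacobi-Isaacs equation (illustrative form):
\partial_t V(t,x)
  + \tfrac{1}{2}\operatorname{Tr}\!\bigl( \sigma\sigma^{\top}(t,x)\, D_x^2 V(t,x) \bigr)
  + H\bigl( t, x, D_x V(t,x) \bigr) = 0,
\qquad
H(t,x,p) = \inf_{u \in U} \sup_{v \in V}
  \bigl\{ b(t,x,u,v) \cdot p + \ell(t,x,u,v) \bigr\}.
```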

Journal ArticleDOI
TL;DR: In this article, the results of a previous paper on an "airport cost game" are generalized to an "airport profit game": a fee schedule is obtained by subtracting the payoff vector from the vector of revenues, and it is proved that the fee schedule corresponding to the nucleolus is independent of the revenue vector.
Abstract: This paper represents a generalization of a previous paper on an “airport cost game” to the case of an “airport profit game”. A fee schedule in the airport profit game is obtained by subtracting the payoff vector from the vector of revenues. It is proved that the fee schedule corresponding to the nucleolus is independent of the revenue vector.

54 citations
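
Stated as a formula, the fee schedule described above is simply the following (with r the revenue vector and x the payoff vector allocated by a solution concept such as the nucleolus; the indexing is assumed for illustration):

```latex
% Fee schedule: each player's fee is its revenue minus its allocated payoff.
f_i = r_i - x_i, \qquad i = 1, \dots, n.
```

The result quoted above is that when x is the nucleolus of the airport profit game, the resulting fee schedule f does not change as the revenue vector r varies.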


Network Information
Related Topics (5)
Markov chain: 51.9K papers, 1.3M citations (81% related)
Incentive: 41.5K papers, 1M citations (81% related)
Heuristics: 32.1K papers, 956.5K citations (80% related)
Linear programming: 32.1K papers, 920.3K citations (79% related)
Empirical research: 51.3K papers, 1.9M citations (78% related)
Performance Metrics
No. of papers in the topic in previous years:

Year    Papers
2023    364
2022    738
2021    462
2020    512
2019    460
2018    483