Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published within this topic, receiving 202,664 citations.


Papers
Posted Content
TL;DR: This paper shows that, assuming the Exponential Time Hypothesis for PPAD, computing an $\epsilon$-approximate Nash equilibrium in a two-player game requires quasi-polynomial time, and that computing one in a game with n players requires $2^{\Omega(n)}$ oracle queries to the payoff tensors.
Abstract: We prove that there exists a constant $\epsilon>0$ such that, assuming the Exponential Time Hypothesis for PPAD, computing an $\epsilon$-approximate Nash equilibrium in a two-player ($n \times n$) game requires quasi-polynomial time, $n^{\log^{1-o(1)} n}$. This matches (up to the $o(1)$ term) the algorithm of Lipton, Markakis, and Mehta [LMM03]. Our proof relies on a variety of techniques from the study of probabilistically checkable proofs (PCP); this is the first time that such ideas are used for a reduction between problems inside PPAD. En route, we also prove new hardness results for computing Nash equilibria in games with many players. In particular, we show that computing an $\epsilon$-approximate Nash equilibrium in a game with n players requires $2^{\Omega(n)}$ oracle queries to the payoff tensors. This resolves an open problem posed by Hart and Nisan [HN13], Babichenko [Bab14], and Chen et al. [CCT15]. In fact, our results for n-player games are stronger: they hold with respect to the $(\epsilon,\delta)$-WeakNash relaxation recently introduced by Babichenko et al. [BPR16].

50 citations

Journal Article
TL;DR: This work investigates the class of values that satisfy efficiency, symmetry, and weak monotonicity, and shows that this class coincides with the class of egalitarian Shapley values.

50 citations

Book Chapter
27 Jun 2005
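TL;DR: This work derives a simple and new forecasting strategy with regret of order at most $\sqrt{Q^{*}\ln N}+M \ln N$, where M is the largest absolute value of any payoff, Q* is the sum of squared payoffs of the best action, and N is the number of actions. It also devises a refined analysis of the weighted majority forecaster, which yields bounds of the same flavour.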
Abstract: This work studies external regret in sequential prediction games with arbitrary payoffs (nonnegative or non-positive). External regret measures the difference between the payoff obtained by the forecasting strategy and the payoff of the best action. We focus on two important parameters: M, the largest absolute value of any payoff, and Q*, the sum of squared payoffs of the best action. Given these parameters we derive first a simple and new forecasting strategy with regret at most order of $\sqrt{Q^{*}({\rm ln}N)}+M {\rm ln} N$, where N is the number of actions. We extend the results to the case where the parameters are unknown and derive similar bounds. We then devise a refined analysis of the weighted majority forecaster, which yields bounds of the same flavour. The proof techniques we develop are finally applied to the adversarial multi-armed bandit setting, and we prove bounds on the performance of an online algorithm in the case where there is no lower bound on the probability of each action.
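As an illustration of the external regret defined in this abstract, the following is a minimal sketch of a standard exponentially weighted forecaster run on signed payoffs. It is not the strategy derived in the paper: the function name `exp_weights_regret` and the fixed learning rate `eta` are assumptions made here for illustration only.

```python
import math


def exp_weights_regret(payoffs, eta):
    """Return the external regret of an exponentially weighted forecaster.

    payoffs[t][i] is the payoff of action i at round t (may be negative).
    eta is a fixed learning rate (an assumption of this sketch).
    """
    n_actions = len(payoffs[0])
    cumulative = [0.0] * n_actions  # cumulative payoff of each action so far
    forecaster_payoff = 0.0         # expected payoff collected by the forecaster

    for round_payoffs in payoffs:
        # Weights grow with past cumulative payoff; subtracting the maximum
        # keeps the exponentials numerically stable.
        top = max(cumulative)
        weights = [math.exp(eta * (c - top)) for c in cumulative]
        total = sum(weights)
        probs = [w / total for w in weights]

        # Expected payoff of the randomized prediction in this round.
        forecaster_payoff += sum(p * x for p, x in zip(probs, round_payoffs))
        cumulative = [c + x for c, x in zip(cumulative, round_payoffs)]

    # External regret: payoff of the best single action minus forecaster payoff.
    return max(cumulative) - forecaster_payoff
```

For example, `exp_weights_regret([[1.0, -1.0]] * 50, 0.5)` measures how much the forecaster loses against always playing the first action over 50 rounds.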

50 citations

Journal Article
TL;DR: This paper showed that an attractive outside option enhances cooperation in the prisoner's dilemma game; that subjects' tendency to avoid losses leads them to cooperate if the payoff for mutual defection is negative, but makes them stick to mutual defection if its payoff is positive; and that subjects use probabilistic start and end-effect behavior.
Abstract: Experiments in which subjects simultaneously play several finite two-person prisoner's dilemma supergames with and without an outside option reveal that: (i) an attractive outside option enhances cooperation in the prisoner's dilemma game, (ii) if the payoff for mutual defection is negative, subjects' tendency to avoid losses leads them to cooperate; while this tendency makes them stick to mutual defection if its payoff is positive, (iii) subjects use probabilistic start and end-effect behavior.

50 citations

Posted Content
TL;DR: In this paper, the authors present experimental results on humans playing a route choice game in a computer laboratory, which allow one to study decision behavior in repeated games beyond the Prisoner's Dilemma.
Abstract: In many social dilemmas, individuals tend to generate a situation with low payoffs instead of a system optimum (tragedy of the commons). Is the routing of traffic a similar problem? In order to address this question, we present experimental results on humans playing a route choice game in a computer laboratory, which allow one to study decision behavior in repeated games beyond the Prisoner's Dilemma. We will focus on whether individuals manage to find a cooperative and fair solution compatible with the system-optimal road usage. We find that individuals tend towards a user equilibrium with equal travel times in the beginning. However, after many iterations, they often establish a coherent oscillatory behavior, as taking turns performs better than applying pure or mixed strategies. The resulting behavior is fair and compatible with system-optimal road usage. In spite of the complex dynamics leading to coordinated oscillations, we have identified mathematical relationships quantifying the observed transition process. Our main experimental discoveries for 2- and 4-person games can be explained with a novel reinforcement learning model for an arbitrary number of persons, which is based on past experience and trial-and-error behavior. Gains in the average payoff seem to be an important driving force for the innovation of time-dependent response patterns, i.e. the evolution of more complex strategies. Our findings are relevant for decision support systems and routing in traffic or data networks.

50 citations


Network Information
Related Topics (5)
Markov chain: 51.9K papers, 1.3M citations, 81% related
Incentive: 41.5K papers, 1M citations, 81% related
Heuristics: 32.1K papers, 956.5K citations, 80% related
Linear programming: 32.1K papers, 920.3K citations, 79% related
Empirical research: 51.3K papers, 1.9M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year    Papers
2023    364
2022    738
2021    462
2020    512
2019    460
2018    483