Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published within this topic, receiving 202,664 citations.


Papers
Journal Article (DOI)
TL;DR: In this paper, the authors consider a differential game in which the joint choices of the two players influence the variance, but not the mean, of the one-dimensional state variable and show that a pure strategy perfect equilibrium in stationary Markov strategies (ME) exists and has the property that patient players choose to play it safe when sufficiently ahead and to take risks when sufficiently behind.
Abstract: We consider a differential game in which the joint choices of the two players influence the variance, but not the mean, of the one-dimensional state variable. We show that a pure strategy perfect equilibrium in stationary Markov strategies (ME) exists and has the property that patient players choose to play it safe when sufficiently ahead and to take risks when sufficiently behind. We also provide a simple condition that implies both players choose risky strategies when neither one is too far ahead, a situation that ensures a dominant player emerges "quickly."
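The model can be written compactly; the following is a sketch in notation of our own choosing (the paper's symbols may differ). The state is a controlled diffusion with zero drift,

\[
dx_t = \sigma(a_t, b_t)\, dW_t ,
\]

where $x_t$ is the one-dimensional state, $a_t$ and $b_t$ are the players' current actions, $W_t$ is a standard Brownian motion, and the actions enter only through the volatility $\sigma$, never through a drift term. "Playing it safe" then corresponds to actions with small $\sigma$, and "taking risks" to actions with large $\sigma$.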

52 citations

Journal Article (DOI)
TL;DR: If the players use a learning algorithm of the reward-penalty type, then with a proper choice of certain parameters in the algorithm the mixed strategies of both players can, in expectation, be made arbitrarily close to optimal strategies.
Abstract: This paper extends recent results [Lakshmivarahan and Narendra, Math. Oper. Res., 6 (1981), pp. 379–386] on two-person zero-sum sequential games in which the players use learning algorithms to update their strategies. It is assumed that neither player knows (i) the set of strategies available to the other player or (ii) the mixed strategy used by the other player or its pure realization at any stage. The outcome of the game depends on chance and the game is played sequentially. The distribution of the random outcome as a function of the pair of pure strategies chosen by the players is also unknown to them. It is shown that if the players use a learning algorithm of the reward-penalty type, then with a proper choice of certain parameters in the algorithm the mixed strategies of both players can, in expectation, be made arbitrarily close to optimal strategies.
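As background, a standard linear reward-penalty (L_R-P) learning automaton illustrates the kind of update the abstract describes. The sketch below is ours, not the paper's exact algorithm: the 2x2 win-probability matrix M, the learning rates a and b, and the iteration count are all hypothetical, and, matching assumptions (i) and (ii), neither automaton ever observes M or the other's strategy.

import numpy as np

def lr_p_update(p, action, reward, a=0.01, b=0.001):
    # Linear reward-penalty (L_R-P) update of the mixed strategy p.
    r = len(p)
    if reward:
        q = (1 - a) * p                               # shrink every probability...
        q[action] = p[action] + a * (1 - p[action])   # ...and boost the rewarded action
    else:
        q = b / (r - 1) + (1 - b) * p                 # spread mass over the other actions
        q[action] = (1 - b) * p[action]               # penalize the chosen action
    return q

# Hypothetical stage game: M[i, j] = probability that player 1 "wins"
# under the pure-strategy pair (i, j). Neither automaton is told M.
M = np.array([[0.7, 0.3],
              [0.4, 0.6]])

rng = np.random.default_rng(0)
p1 = np.full(2, 0.5)                       # player 1's mixed strategy
p2 = np.full(2, 0.5)                       # player 2's mixed strategy
for _ in range(200_000):
    i = rng.choice(2, p=p1)                # pure realizations, drawn privately
    j = rng.choice(2, p=p2)
    win1 = rng.random() < M[i, j]          # chance outcome of the stage
    p1 = lr_p_update(p1, i, win1)          # zero-sum: one side's reward is
    p2 = lr_p_update(p2, j, not win1)      # the other's penalty
print(p1, p2)

The paper's convergence result concerns precisely how parameters such as a and b must be chosen; with the penalty rate small relative to the reward rate, the strategies settle near the optimal mixed strategies (for this matrix, (1/3, 2/3) for player 1 and (1/2, 1/2) for player 2).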

52 citations

Journal Article (DOI)
TL;DR: In this paper, a two-person zero-sum discounted stochastic game with a finite state space is considered and two convergent algorithms for arriving at minimax strategies for the players and the value of the game are presented.
Abstract: In this paper, a two-person zero-sum discounted stochastic game with a finite state space is considered. The movement of the game from state to state is jointly controlled by the two players, with a finite number of alternatives available to each player in each state. We present two convergent algorithms for arriving at minimax strategies for the players and the value of the game. The two algorithms are compared with respect to computational efficiency. Finally, a possible extension to nonzero-sum stochastic games is suggested.
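The abstract does not reproduce the two algorithms; the classical convergent scheme for this model, on which such algorithms build, is Shapley's successive approximation: iterate on the state values, solving a matrix game at each state. A minimal sketch, with assumed names (R, P, gamma) and the matrix-game step done by linear programming:

import numpy as np
from scipy.optimize import linprog

def solve_matrix_game(A):
    # Value and optimal row strategy of the zero-sum matrix game A
    # (row player maximizes), via the standard LP formulation.
    m, n = A.shape
    shift = A.min()
    B = A - shift + 1.0                    # make all entries positive
    # minimize sum(x) subject to B^T x >= 1, x >= 0; value(B) = 1 / sum(x)
    res = linprog(c=np.ones(m), A_ub=-B.T, b_ub=-np.ones(n),
                  bounds=[(0, None)] * m, method="highs")
    v = 1.0 / res.x.sum()
    return v + shift - 1.0, res.x * v

def shapley_iteration(R, P, gamma=0.9, tol=1e-8):
    # R[s]: m x n stage-payoff matrix in state s
    # P[s]: m x n x S array, P[s][i, j, s'] = transition probability to s'
    # Iterate v <- val(R[s] + gamma * E[v(next state)]) in every state.
    S = len(R)
    v = np.zeros(S)
    while True:
        v_new = np.array([solve_matrix_game(R[s] + gamma * P[s] @ v)[0]
                          for s in range(S)])
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new

Because the Shapley operator is a contraction with modulus gamma, the iteration converges geometrically to the value vector from any starting point, and minimax strategies can be read off from the final stage matrix games.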

52 citations

Journal Article (DOI)
TL;DR: In this paper, the authors show that two probability distributions are strategically close if and only if they assign similar ex ante probabilities to all events and it is approximate common knowledge among the players that they assign similar conditional probabilities.

52 citations

Book Chapter (DOI)
01 Jan 2003
TL;DR: In this chapter, the authors consider Markov decision processes (MDPs) with finite state and action spaces under several criteria: total discounted expected reward, average expected reward, and more sensitive optimality criteria including the Blackwell optimality criterion.
Abstract: In this chapter we study Markov decision processes (MDPs) with finite state and action spaces. This is the classical theory developed since the end of the fifties. We consider finite and infinite horizon models. For the finite horizon model the utility function of the total expected reward is commonly used. For the infinite horizon the utility function is less obvious. We consider several criteria: total discounted expected reward, average expected reward and more sensitive optimality criteria including the Blackwell optimality criterion. We end with a variety of other subjects.
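Of the criteria listed, the total discounted expected reward is the simplest to make concrete. Below is a minimal value-iteration sketch for a finite discounted MDP; the array shapes and the toy numbers are assumptions of this example, not taken from the chapter.

import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8):
    # P: (A, S, S) transition matrices, P[a, s, s'] = Pr(s' | s, a)
    # R: (A, S) expected one-step rewards
    # Returns the optimal value vector and a greedy stationary policy.
    v = np.zeros(P.shape[1])
    while True:
        Q = R + gamma * (P @ v)            # Q[a, s]: one-step lookahead value
        v_new = Q.max(axis=0)              # Bellman optimality update
        if np.max(np.abs(v_new - v)) < tol:
            return v_new, Q.argmax(axis=0)
        v = v_new

# Two-state, two-action toy MDP (hypothetical numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.0, 1.0]]])
R = np.array([[1.0, 0.0],
              [2.0, -1.0]])
v_opt, policy = value_iteration(P, R)

For the average-reward and Blackwell criteria, plain discounted value iteration does not directly apply; those call for the more sensitive analysis the chapter develops.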

52 citations


Network Information
Related Topics (5)

Topic                 Papers   Citations   Relatedness
Markov chain          51.9K    1.3M        81%
Incentive             41.5K    1M          81%
Heuristics            32.1K    956.5K      80%
Linear programming    32.1K    920.3K      79%
Empirical research    51.3K    1.9M        78%
Performance Metrics
No. of papers in the topic in previous years

Year    Papers
2023    364
2022    738
2021    462
2020    512
2019    460
2018    483