Showing papers on "Stochastic game" published in 1970


Journal ArticleDOI
TL;DR: In this paper, the stochastic games of Shapley with infinite state and action spaces are considered, and it is shown that both players have an optimal strategy under certain conditions.
Abstract: In this paper, we consider the stochastic games of Shapley, when the state and action spaces are all infinite. We prove that, under certain conditions, the stochastic game has a value and that both players have optimal strategies.

91 citations


Journal ArticleDOI
TL;DR: This paper considers a differential game of finite duration T for the system (1.1), (1.2) with payoff (1.7), where y = y(t) and z = z(t) are the control functions of the two players.

34 citations


Proceedings ArticleDOI
01 Dec 1970
TL;DR: It is proved that sequential reproductive plans are robust in the sense that, in environments with a well-defined measure of payoff, any such plan will acquire payoff at a rate which asymptotically approaches the optimum.
Abstract: This paper presents a general formalism for defining problems of adaptation. Within this formalism a class of algorithms, the sequential reproductive plans, is defined. It is then proved that sequential reproductive plans are robust in the sense that, in environments with a well-defined measure of payoff (utility, performance), any such plan will acquire payoff at a rate which asymptotically approaches the optimum.

29 citations


Journal ArticleDOI
TL;DR: A new approach of perturbing linear programs is used to prove convergence of the computational technique and to bound the rate of convergence.
Abstract: This paper develops a method for finding the minimax solutions to a game whose payoff function is of the form X^T A Y / X^T B Y, where X, Y are mixed-strategy vectors and A, B > 0 are matrices. Such a payoff function arises in stochastic games, economic models of an expanding economy, and some nonzero-sum game formulations. The game is transformed into a linear program with a parameter in the constraint set. By successive solutions of this program with appropriate values of the parameter, the value and optimal strategies of the game can be approximated to any desired degree of accuracy. A new approach of perturbing linear programs is used to prove convergence of the computational technique and to bound the rate of convergence.
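
The parametric idea in this abstract can be illustrated with a minimal sketch. It rests on the standard fact that the value v of the ratio game satisfies val(A - vB) = 0, and that val(A - λB) decreases in λ when B > 0. The sketch is not the paper's perturbed linear program: it assumes 2x2 matrices so the matrix-game value has a closed form, and it swaps in plain bisection on the parameter. The function names `value_2x2` and `ratio_game_value` are hypothetical.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists, so the value is that entry.
        return maximin
    # No pure saddle point: both players mix, and this closed form applies.
    return (a * d - b * c) / (a + d - b - c)


def ratio_game_value(A, B, tol=1e-10):
    """Approximate the value of the game with payoff X^T A Y / X^T B Y.

    Assumes A and B are 2x2 with strictly positive entries (illustration only).
    Bisection looks for the parameter lam at which val(A - lam * B) = 0.
    """
    entries_a = [x for row in A for x in row]
    entries_b = [x for row in B for x in row]
    lo = min(entries_a) / max(entries_b)  # val(A - lo * B) >= 0
    hi = max(entries_a) / min(entries_b)  # val(A - hi * B) <= 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        shifted = [[A[i][j] - mid * B[i][j] for j in range(2)] for i in range(2)]
        if value_2x2(shifted) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2
```

When every entry of B equals 1 the denominator is identically 1, so the result should coincide with the ordinary matrix-game value of A; this gives a quick sanity check.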

28 citations


Journal ArticleDOI
TL;DR: In this article, the conditions under which the parties are willing to delegate cooperative decision making to a group were determined, based on the assumption that each party knows his own and his opponents' utility functions, subjective probability distributions, available strategies and payoff functions.
Abstract: In many situations involving interaction among parties, the individual welfare of each party can be improved by cooperation. This requires a cooperative decision, which specifies the course of action to be followed by each party, and a rule for sharing the total payoff resulting from that decision. In general, the total payoff will depend on the value of a random variable that is unknown at the time of the decision making. This study determines the conditions under which the parties are willing to delegate cooperative decision making to a group. In arriving at its optimal decision, this group does not take into account the division of the total payoff among the parties. Furthermore, the parties are willing to form this group independent of the set of alternative payoff functions. The results are based on the assumption of complete information; that is, each party knows his own and his opponents' utility functions, subjective probability distributions, available strategies, and payoff functions. (Author)

16 citations


Journal ArticleDOI
TL;DR: In this paper, the authors report the results of using a specifically designed research methodology to obtain and generalize a quantitative explanation of human behavior in multi-person, interactive games, where the payoff to one individual resulting from the selection of a particular strategy depends on the strategy selections of the other participants.
Abstract: Most so-called "theories" and "explanations" in the behavioral sciences tend to be formulated in qualitative terms which are often ill-defined. Hence, consequences can seldom be rigorously deduced from them, and those consequences that are extracted can seldom be conclusively tested. This has made it possible for conflicting explanations of the same type of behavior to live side by side for many years; for example, Gestalt and Associationalist theories of perception; Freudian, Adlerian, Jungian, and many other varieties of psychoanalytic theories; and a large number of learning theories. Because most psychological theories are not testable in any conclusive way they seem never to die, and only seldom even to fade away. A more important consequence is that theory development is not nearly as cumulative in the behavioral sciences as it is in other sciences; convergence and generalization of behavioral theories are relatively rare phenomena. This paper reports the results of using a specifically designed research methodology to obtain and generalize a quantitative explanation of human behavior in multi-person, interactive games. The games are interactive in the sense that the payoff to one individual resulting from the selection of a particular strategy depends on the strategy selections of the other participants. The payoffs are monetary and the games are structured so that the monetary preferences of outcomes for each individual create a conflict of interest among the participants. The only permitted form of interaction between the participants, and thus the only way the conflict can be resolved, is through the selection of strategies during the course of the game. The objective of the research program reported herein was to identify, quantify, and relate the properties of the conflict environment, the history of the conflict, and the characteristics of the individual which explain his actions at a particular time in the game.

11 citations


Journal ArticleDOI
TL;DR: In this article, a linear differential game whose payoff is determined by a convex function of the state vector at a given terminal time is investigated, and an explicit expression for the greatest lower bound of the value of the game is obtained.
Abstract: A linear differential game whose payoff is determined by a convex function of the state vector at a given terminal time is investigated. An explicit expression for the greatest lower bound of the value of the game is obtained. A simple condition which guarantees the existence of the value of the game is derived by a geometric approach. Under this condition, instead of solving the Bellman equation, the value of the game can be calculated by maximizing a function on the unit sphere of the output space whose dimension is usually much less than that of the state space. The method of synthesizing the optimal strategies is also derived. Another game called a minimax energy game whose payoff is given by the difference between the consumed energies of both players is treated briefly, and an extension of the concept of controllability is discussed.

8 citations


Journal ArticleDOI
TL;DR: This article states the authors' earlier result on the existence of a value of a noisy duel and presents a detailed discussion of the structure of ε-good strategies and criteria for the existence of good first-shot times.
Abstract: A noisy duel is a zero-sum, two-person game with the following structure. Each player has bullets which he can fire at any time in [0, 1]. If Player i fires at time t, he hits with probability P_i(t). The functions P_i are continuous and nondecreasing with P_i(0) = 0 and P_i(1) = 1. The number of bullets each player possesses at any time and the functions P_i are known to both. The payoff is 1 to the sole survivor, otherwise 0. This article states the authors' earlier result on the existence of a value of a noisy duel and presents a detailed discussion of the structure of ε-good strategies and criteria for the existence of good first-shot times. We present tables of values and shooting times for noisy duels, which, in some cases, can be used to trace the play of the game. An additional table illustrates how large an arsenal is necessary to overcome the effects of an opponent's superior accuracy.
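
As a rough illustration of a first-shot time, the sketch below handles only the classical case of one bullet per player. It assumes the zero-sum payoff to Player 1 is +1 if he is the sole survivor, -1 if Player 2 is, and 0 otherwise. Under those assumptions the well-known first-shot time t* satisfies P_1(t*) + P_2(t*) = 1 and the value is P_1(t*) - P_2(t*). The function names are hypothetical, and nothing here reproduces the article's multi-bullet tables.

```python
def first_shot_time(p1, p2, tol=1e-12):
    """Bisection for the time t* in [0, 1] at which p1(t*) + p2(t*) = 1.

    p1 and p2 are accuracy functions, assumed continuous and nondecreasing
    with p(0) = 0 and p(1) = 1, so the sum crosses 1 somewhere in [0, 1].
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p1(mid) + p2(mid) < 1.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def one_bullet_duel(p1, p2):
    """Return (t*, value to Player 1) for the one-bullet noisy duel."""
    t = first_shot_time(p1, p2)
    return t, p1(t) - p2(t)
```

For example, `one_bullet_duel(lambda t: t, lambda t: t * t)` describes a duel in which Player 1 is the more accurate shot at every interior time, so the value to Player 1 comes out positive.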

5 citations


21 Dec 1970
TL;DR: In this article, the authors considered a discrete N-person game where the players choose their strategies separately and independently, and the payoff outcomes can be ranked by each player according to their desirability to that player.
Abstract: The paper considers a discrete N-person game theory where the players choose their strategies separately and independently. Payoff 'values' can be of a very general nature and need not be numbers. However, the totality of payoff outcomes (N-dimensional), corresponding to the possible combinations of strategies, can be ranked by each player according to their desirability to that player. A largest level of desirability (associated with one or more outcomes o_i) occurs for the i-th player such that he can assure, with probability at least a given value α_i, that an outcome with at least this desirability level is obtained, and this can be done simultaneously for all the players. This game theory is of a median nature when all the α_i are chosen to be 1/2. A method is given for determining o_i and an optimum (mixed) strategy for every player. Practical aspects of applying this percentile game theory are examined. Application effort can be substantially reduced when the players have relative desirability functions for ranking the outcomes. Some elementary types of relative desirability functions are introduced. (Author)

3 citations


Eilon Solan
01 Jan 1970
TL;DR: This book presents a course on the theory of stochastic games; the author assumes no more than a basic undergraduate curriculum and illustrates the theory with numerous examples and exercises, with solutions available online.
Abstract: Stochastic games have an element of chance: the state of the next round is determined probabilistically depending upon players' actions and the current state. Successful players need to balance the need for short-term payoffs while ensuring future opportunities remain high. The various techniques needed to analyze these often highly non-trivial games are a showcase of attractive mathematics, including methods from probability, differential equations, algebra, and combinatorics. This book presents a course on the theory of stochastic games going from the basics through to topics of modern research, focusing on conceptual clarity over complete generality. Each of its chapters introduces a new mathematical tool – including contracting mappings, semi-algebraic sets, infinite orbits, and Ramsey's theorem, among others – before discussing the game-theoretic results they can be used to obtain. The author assumes no more than a basic undergraduate curriculum and illustrates the theory with numerous examples and exercises, with solutions available online.

2 citations


Journal ArticleDOI
TL;DR: In this paper, a class of two-player zero-sum games is discussed in which a play starts at a given state and terminates when a state in a given set of states is reached.
Abstract: This paper contains a discussion of a class of two-player, zero-sum games. The state of the game is determined by a given set of difference equations. A play starts at a given state, and terminates when a state in given set of states is reached. The terminal stage is not prescribed. One player seeks to minimize while the other player seeks to maximize a payoff which is additive over the stages of the play. Optimality is expressed by a saddle-point condition. Optimal strategy pairs are those satisfying the saddle-point condition; the corresponding value of the payoff is called the value of the game. Necessary conditions are derived under certain assumptions on the value of the game and under the assumption of locally directional convexity of the stage-to-stage reachable set of states. These conditions are applied to simple examples.

01 Oct 1970
TL;DR: In this article, the authors considered a two-person zero-sum stochastic game with a finite number of positions or states, where the movement of the game from state to state is jointly controlled by the players depending on their choices of strategies from a finite number of alternatives in each state available to each of them.
Abstract: The paper discusses a two-person zero-sum stochastic game with a finite number of positions or states. The movement of the game from state to state is jointly controlled by the players depending on their choices of strategies from a finite number of alternatives in each state available to each of them. Since no stop probability is introduced, the game is of non-terminating type. Available literature on this type of game is limited to those with Markovian reward structure. In the present work, semi-Markovian reward structure is considered and the equations that the value must satisfy are obtained. Convergent algorithms for the cases of finite and infinite transitions, with and without discounting in each case are presented. Possible extensions to consider time horizon and nonzero-sum situations are indicated. The present work could be considered as a generalization of semi-Markovian decision process developed by Jewell to game theoretic context. Also, the present model generalizes the non-terminating stochastic game of Hoffman and Karp to semi-Markovian reward structure. (Author)
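
To make the kind of value equations mentioned here concrete, the following is a minimal sketch of the simpler Markovian, discounted case only; it is not the semi-Markovian algorithm of the paper. It applies Shapley-style value iteration, v(s) ← val[r(s, i, j) + β Σ_t p(t | s, i, j) v(t)], and assumes two actions per player in every state so that each stage game is 2x2 and has a closed-form value. The names `value_2x2` and `shapley_iteration` and the data layout are hypothetical.

```python
def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists.
        return maximin
    # Otherwise both players mix and the closed form applies.
    return (a * d - b * c) / (a + d - b - c)


def shapley_iteration(rewards, transitions, beta, sweeps=200):
    """Value iteration for a discounted zero-sum stochastic game.

    rewards[s][i][j]        : stage payoff to the maximizer in state s
    transitions[s][i][j][t] : probability of moving from state s to state t
    beta                    : discount factor, assumed to satisfy 0 <= beta < 1
    """
    n = len(rewards)
    v = [0.0] * n
    for _ in range(sweeps):
        new_v = []
        for s in range(n):
            # Auxiliary matrix game: stage payoff plus discounted continuation.
            stage = [
                [
                    rewards[s][i][j]
                    + beta * sum(transitions[s][i][j][t] * v[t] for t in range(n))
                    for j in range(2)
                ]
                for i in range(2)
            ]
            new_v.append(value_2x2(stage))
        v = new_v
    return v
```

With a single state the fixed point reduces to val(R) / (1 - β), which is a convenient check on the iteration.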

Journal ArticleDOI
G.R Mon
TL;DR: In this paper, a stochastic finite state game with Markovian state transition process is considered, and the players attempt to maximize the expected value of a payoff function by choosing an optimal pair of strategies.

11 Jun 1970
TL;DR: In this paper, the authors considered discrete two-player game theory where the players behave competitively and choose their strategies separately and independently, and the payoffs can be ranked according to increasing desirability level.
Abstract: Considered is discrete two-person game theory where the players behave competitively and choose their strategies separately and independently. Payoffs can be of a very general nature and need not be numbers. Within each matrix, the payoffs can be ranked according to increasing desirability level. This is done separately by each player and these rankings are not necessarily related. Separately, player i (i = 1, 2) selects and applies a percentile criterion 100a_i to each matrix. A largest desirability level P_i(a_i) occurs in the matrix for player i such that, when acting protectively, he can assure with probability at least a_i that his payoff is at least this desirable (to him). Also, a smallest desirability level P'_j(a_i), according to the ranking by player i, occurs in the matrix for the other player (designated as j) such that player i, when acting vindictively, can assure with probability at least a_i that the payoff to player j has at most this desirability (to player i). An a_i-optimum solution occurs for player i when he can be simultaneously a_i-protective and a_i-vindictive. Median game theory occurs when a_1 = a_2 = 1/2. Percentile game theory occurs when more, or less, assurance is desired than occurs for the median case. (Author)
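
One way to read the protective criterion is sketched below, under stated assumptions rather than as the report's own procedure. For each candidate desirability level, replace the row player's matrix by a 0/1 indicator of "at least this desirable"; the value of that indicator game is the largest probability the player can assure against any opponent play. The largest level whose indicator game has value at least the chosen assurance probability (written `alpha` in the code) is then the assured level. The sketch assumes 2x2 matrices and integer ranks where a higher rank means more desirable; `value_2x2` and `assured_level` are hypothetical names.

```python
from fractions import Fraction


def value_2x2(m):
    """Value of a 2x2 zero-sum matrix game in which the row player maximizes."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))
    minimax = min(max(a, c), max(b, d))
    if maximin == minimax:
        # A pure saddle point exists.
        return maximin
    # Otherwise both players mix and the closed form applies.
    return (a * d - b * c) / (a + d - b - c)


def assured_level(ranks, alpha):
    """Largest rank the row player can assure with probability >= alpha.

    ranks : 2x2 matrix of the row player's desirability ranks (higher = better)
    alpha : assurance probability in [0, 1]; 1/2 gives the median case
    """
    alpha = Fraction(alpha)
    levels = sorted({r for row in ranks for r in row}, reverse=True)
    for level in levels:
        # Indicator game: payoff 1 when the outcome is at least `level`.
        indicator = [[Fraction(int(r >= level)) for r in row] for row in ranks]
        if value_2x2(indicator) >= alpha:
            return level
    # The lowest level yields an all-ones indicator with value 1, so for
    # alpha <= 1 the loop always returns; this line is only a fallback.
    return levels[-1]
```

Exact rational arithmetic is used so that the comparison against `alpha` is not affected by rounding, for instance when the indicator game has value exactly 1/2 in the median case.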

01 Jan 1970
TL;DR: In this article, a two-player non-zero-sum game with a threat option was investigated and it was found that those players who were likely to carry out their threats were those who won the most concessions from the other.
Abstract: A two-person nonzero-sum game which provides one player with a threat option is experimentally investigated in this study. In the game, both players have a dominating strategy choice but the “natural” outcome of the game, defined as the intersection of dominating strategy choices, gives one player his largest payoff and the other player his next to smallest. However, the “dissatisfied” player (the one who does not receive his largest payoff at the natural outcome) can, by switching his strategy choice, reduce the other’s payoffs but only at a cost to himself. The dissatisfied player’s ability to lower the other’s payoffs constitutes a “threat.” It was found that in repeated trials of play of this game, those players who were likely to carry out their threats were those who won the most concessions from the other. The results of this study suggest that a threat-appeasement, punishment-capitulation interaction develops between the players. That is, the existence of the threat option for one

Journal ArticleDOI
H. Mine, K. Yamada, S. Osaki
TL;DR: The terminating stochastic game is a nonstationary Markov chain with rewards in which the concern is the transient behavior before absorption, and a new concept of rewards is introduced.
Abstract: This paper describes a stochastic game in which the play terminates in a finite number of steps with probability 1. The game is called a terminating stochastic game. When the play terminates at any step, the play is regarded as having reached an absorbing state in the Markov chain under consideration. Hence, the terminating stochastic game is a nonstationary Markov chain with rewards in which our concern is the transient behavior before absorption. In particular, when one of the players is a dummy, the stochastic game reduces to a Markovian decision process of a special type. This paper discusses such games. We introduce a new concept of rewards and formulate three problems arising in the games by linear programming. Finally, numerical examples are presented.
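
The "transient behavior before absorption" can be illustrated with a small sketch, assuming both players' strategies are fixed and stationary so that play follows an ordinary absorbing Markov chain. It is not the paper's linear programming formulation and does not use its new reward concept. The expected total reward v collected before absorption then solves v = r + Qv, where Q holds the transition probabilities among transient states and r the expected one-step rewards. The function name and data layout are hypothetical.

```python
def total_reward_before_absorption(q, r, sweeps=1000):
    """Expected total reward collected before absorption, per starting state.

    q[s][t] : probability of moving from transient state s to transient state t
              (each row may sum to less than 1; the remainder is the
              probability of absorption on that step)
    r[s]    : expected one-step reward in transient state s
    Iterates v <- r + Q v, which converges when absorption is certain.
    """
    n = len(r)
    v = [0.0] * n
    for _ in range(sweeps):
        v = [r[s] + sum(q[s][t] * v[t] for t in range(n)) for s in range(n)]
    return v
```

For a single transient state that continues with probability 0.5 and pays 1 per step, the iteration approaches 1 / (1 - 0.5), the mean of the geometric number of steps played.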