Topic

Stochastic game

About: Stochastic game is a research topic. Over the lifetime of the topic, 9,493 publications have been published, receiving 202,664 citations.


Papers
Journal Article · DOI
TL;DR: In this paper, the authors study an infinite horizon game in which pairs of players connected in a network are randomly matched to bargain and show that all equilibria are payoff equivalent.
Abstract: We study an infinite horizon game in which pairs of players connected in a network are randomly matched to bargain. Players who reach agreement are replaced by new players at the same positions in the network. We show that all equilibria are payoff equivalent. The payoffs and the set of agreement links converge as players become patient. Several new concepts (mutually estranged sets, partners, and shortage ratios) provide insights into the relative strengths of the positions in the network. We develop a procedure to determine the limit equilibrium payoffs for all players. Characterizations of equitable and nondiscriminatory networks are also obtained. (JEL C78, D85)
Competitive equilibrium theory assumes large and anonymous markets in which every buyer can trade with every seller. Underlying these assumptions are standard goods and services that may be traded at low transaction costs by agents who are not in specific relationships with one another. However, in many markets goods and services are heterogeneous (e.g., cars, apartments) or need to be tailored to particular needs (e.g., manufacturing inputs, technical support). Furthermore, trading opportunities may depend on transportation costs, social relationships, information, advertising, trust, technological compatibility, joint business opportunities, free trade agreements, etc. In such cases it is natural to model the market using a network, where only pairs of connected agents may engage in exchange. New theories are needed to explore the influence of the network structure on market outcomes. Many questions arise: How does an agent's position in the network determine his bargaining power and the local prices he faces? Who trades with whom and on what terms? When are prices uniform in the network? One possible conjecture is that an agent's bargaining power is determined by his (relative) number of connections in the network. However, this simple theory is implausible.
Consider the network of four sellers (located at the top nodes) and nine buyers (located at the bottom nodes) depicted in Figure 1. Assume that each seller supplies one unit of a homogeneous indivisible good, each buyer demands one unit of the good, and all buyers have identical values for the good. The buyer located in the middle has the largest number of links in the network, as he is connected to each

134 citations

Proceedings Article
21 Feb 2015
TL;DR: In this article, the problem of computationally and sample efficient learning in stochastic combinatorial semi-bandits was studied and a UCB-like algorithm for solving the problem was presented.
Abstract: A stochastic combinatorial semi-bandit is an online learning problem where at each step a learning agent chooses a subset of ground items subject to constraints, and then observes stochastic weights of these items and receives their sum as a payoff. In this paper, we close the problem of computationally and sample efficient learning in stochastic combinatorial semi-bandits. In particular, we analyze a UCB-like algorithm for solving the problem, which is known to be computationally efficient; and prove $O(KL(1/\Delta)\log n)$ and $O(\sqrt{KLn\log n})$ upper bounds on its n-step regret, where L is the number of ground items, K is the maximum number of chosen items, and $\Delta$ is the gap between the expected returns of the optimal and best suboptimal solutions. The gap-dependent bound is tight up to a constant factor and the gap-free bound is tight up to a polylogarithmic factor.

134 citations
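
The semi-bandit setting described in the entry above can be illustrated with a minimal sketch. It assumes the simplest possible constraint (any K of the L items may be chosen), Bernoulli item weights, and a confidence radius of sqrt(1.5 ln t / T_i); the function name `comb_ucb_top_k` and all parameter values are hypothetical and are not taken from the paper.

```python
import math
import random


def comb_ucb_top_k(means, k, n_steps, seed=0):
    """UCB-style learner for a semi-bandit whose feasible sets are all k-subsets.

    `means` holds the (unknown to the learner) Bernoulli mean of each item.
    Returns per-item pull counts, per-item mean estimates, and total payoff.
    """
    rng = random.Random(seed)
    n_items = len(means)
    counts = [0] * n_items
    estimates = [0.0] * n_items
    total_payoff = 0.0

    for t in range(1, n_steps + 1):
        # Optimistic index per item; items never observed get +inf so they are tried first.
        ucb = [
            estimates[i] + math.sqrt(1.5 * math.log(t) / counts[i])
            if counts[i] > 0
            else math.inf
            for i in range(n_items)
        ]
        # With a sum payoff over k-subsets, the optimistic solution is the k largest indices.
        chosen = sorted(range(n_items), key=lambda i: ucb[i], reverse=True)[:k]

        # Semi-bandit feedback: the weight of every chosen item is observed individually.
        for i in chosen:
            weight = 1.0 if rng.random() < means[i] else 0.0
            counts[i] += 1
            estimates[i] += (weight - estimates[i]) / counts[i]  # running mean
            total_payoff += weight

    return counts, estimates, total_payoff


if __name__ == "__main__":
    counts, estimates, total = comb_ucb_top_k([0.9, 0.8, 0.2, 0.1], k=2, n_steps=2000)
    print(counts, [round(e, 2) for e in estimates], total)
```

For richer constraints (paths, matchings, matroid bases), the `sorted(...)[:k]` line would be replaced by a combinatorial optimizer applied to the optimistic indices.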

Journal Article · DOI
TL;DR: In this article, the authors show that any feasible, individually rational payoffs of an infinitely repeated game can arise as subgame perfect equilibrium payoffs if the discount factor is close enough to one even if mixed strategies are not observable and public randomizations are not available.

133 citations

Book Chapter · DOI
01 Oct 2003
TL;DR: It is shown that if each player has a reachability objective, that is, if the goal for each player i is to visit some subset of the states, then there exists an ε-Nash equilibrium in memoryless strategies for every ε > 0; however, exact Nash equilibria need not exist.
Abstract: We study infinite stochastic games played by n players on a finite graph with goals given by sets of infinite traces. The games are stochastic (each player simultaneously and independently chooses an action at each round, and the next state is determined by a probability distribution depending on the current state and the chosen actions), infinite (the game continues for an infinite number of rounds), nonzero sum (the players’ goals are not necessarily conflicting), and undiscounted. We show that if each player has a reachability objective, that is, if the goal for each player i is to visit some subset R_i of the states, then there exists an ε-Nash equilibrium in memoryless strategies, for every ε > 0. However, exact Nash equilibria need not exist. We study the complexity of finding such Nash equilibria, and show that the payoff of some ε-Nash equilibrium in memoryless strategies can be ε-approximated in NP.

133 citations
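
The game model in the entry above (simultaneous moves, probabilistic transitions, memoryless strategies, reachability goals) can be sketched as a small Monte Carlo simulation. Everything here is an illustrative assumption rather than the paper's construction: the toy two-player game, the names `sample` and `estimate_reach`, and the finite `horizon` used to approximate an infinite play.

```python
import random


def sample(dist, rng):
    """Draw one outcome from a dict mapping outcomes to probabilities."""
    outcomes = list(dist)
    return rng.choices(outcomes, weights=[dist[o] for o in outcomes])[0]


def estimate_reach(transition, strategies, targets, start, horizon, runs, seed=0):
    """Estimate, for each player i, the probability of visiting the target set targets[i].

    Each strategy is memoryless: a dict from state to a distribution over actions.
    States without a strategy entry are treated as absorbing.
    """
    rng = random.Random(seed)
    hits = [0] * len(strategies)
    for _ in range(runs):
        state = start
        visited = {state}
        for _ in range(horizon):
            if state not in strategies[0]:  # absorbing state: no moves defined
                break
            # All players choose simultaneously and independently.
            joint = tuple(sample(s[state], rng) for s in strategies)
            # The next state depends on the current state and the joint action.
            state = sample(transition[(state, joint)], rng)
            visited.add(state)
        for i, target in enumerate(targets):
            if visited & target:
                hits[i] += 1
    return [h / runs for h in hits]


# Hypothetical toy game: from "s", matching actions lead to "g1";
# mismatched actions lead to "g2" or back to "s" with equal probability.
TRANSITION = {
    ("s", ("a", "a")): {"g1": 1.0},
    ("s", ("b", "b")): {"g1": 1.0},
    ("s", ("a", "b")): {"g2": 0.5, "s": 0.5},
    ("s", ("b", "a")): {"g2": 0.5, "s": 0.5},
}
TARGETS = [{"g1"}, {"g2"}]             # reachability objective of each player
UNIFORM = {"s": {"a": 0.5, "b": 0.5}}  # memoryless mixed strategy
ALWAYS_A = {"s": {"a": 1.0}}           # memoryless pure strategy

if __name__ == "__main__":
    print(estimate_reach(TRANSITION, [UNIFORM, UNIFORM], TARGETS, "s", 50, 2000))
```

Checking whether a strategy profile is an ε-Nash equilibrium would additionally require comparing each player's estimate against the best value obtainable by a unilateral deviation, which this sketch does not attempt.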

Journal Article · DOI
TL;DR: The simulation results show that by deploying the proposed best-response learning algorithm, the wireless users can significantly improve their own bidding strategies and, hence, their performance in terms of both the application quality and the incurred cost for the used resources.
Abstract: In this paper, we model the various users in a wireless network (e.g., cognitive radio network) as a collection of selfish autonomous agents that strategically interact to acquire dynamically available spectrum opportunities. Our main focus is on developing solutions for wireless users to successfully compete with each other for the limited and time-varying spectrum opportunities, given experienced dynamics in the wireless network. To analyze the interactions among users given the environment disturbance, we propose a stochastic game framework for modeling how the competition among users for spectrum opportunities evolves over time. At each stage of the stochastic game, a central spectrum moderator (CSM) auctions the available resources, and the users strategically bid for the required resources. The joint bid actions affect the resource allocation and, hence, the rewards and future strategies of all users. Based on the observed resource allocations and corresponding rewards, we propose a best-response learning algorithm that can be deployed by wireless users to improve their bidding policy at each stage. The simulation results show that by deploying the proposed best-response learning algorithm, the wireless users can significantly improve their own bidding strategies and, hence, their performance in terms of both the application quality and the incurred cost for the used resources.

132 citations


Network Information
Related Topics (5)
Markov chain: 51.9K papers, 1.3M citations, 81% related
Incentive: 41.5K papers, 1M citations, 81% related
Heuristics: 32.1K papers, 956.5K citations, 80% related
Linear programming: 32.1K papers, 920.3K citations, 79% related
Empirical research: 51.3K papers, 1.9M citations, 78% related
Performance
Metrics
No. of papers in the topic in previous years
Year  Papers
2023  364
2022  738
2021  462
2020  512
2019  460
2018  483