Topic

Stochastic game

About: Stochastic game is a research topic. Over its lifetime, 9,493 publications have been published within this topic, receiving 202,664 citations.


Papers


Journal Article•DOI•

Bargaining in Stationary Networks


Mihai Manea¹•Institutions (1)

Massachusetts Institute of Technology¹

01 Aug 2011-The American Economic Review

TL;DR: In this paper, the authors study an infinite horizon game in which pairs of players connected in a network are randomly matched to bargain and show that all equilibria are payoff equivalent.


Abstract: We study an infinite horizon game in which pairs of players connected in a network are randomly matched to bargain. Players who reach agreement are replaced by new players at the same positions in the network. We show that all equilibria are payoff equivalent. The payoffs and the set of agreement links converge as players become patient. Several new concepts (mutually estranged sets, partners, and shortage ratios) provide insights into the relative strengths of the positions in the network. We develop a procedure to determine the limit equilibrium payoffs for all players. Characterizations of equitable and nondiscriminatory networks are also obtained. (JEL C78, D85)

Competitive equilibrium theory assumes large and anonymous markets in which every buyer can trade with every seller. Underlying these assumptions are standard goods and services that may be traded at low transaction costs by agents who are not in specific relationships with one another. However, in many markets goods and services are heterogeneous (e.g., cars, apartments) or need to be tailored to particular needs (e.g., manufacturing inputs, technical support). Furthermore, trading opportunities may depend on transportation costs, social relationships, information, advertising, trust, technological compatibility, joint business opportunities, free trade agreements, etc. In such cases it is natural to model the market using a network, where only pairs of connected agents may engage in exchange. New theories are needed to explore the influence of the network structure on market outcomes. Many questions arise: How does an agent's position in the network determine his bargaining power and the local prices he faces? Who trades with whom and on what terms? When are prices uniform in the network? One possible conjecture is that an agent's bargaining power is determined by his (relative) number of connections in the network. However, this simple theory is implausible.
Consider the network of four sellers (located at the top nodes) and nine buyers (located at the bottom nodes) depicted in Figure 1. Assume that each seller supplies one unit of a homogeneous indivisible good, each buyer demands one unit of the good, and all buyers have identical values for the good. The buyer located in the middle has the largest number of links in the network, as he is connected to each


134 citations

Proceedings Article•

Tight Regret Bounds for Stochastic Combinatorial Semi-Bandits


Branislav Kveton¹, Zheng Wen², Azin Ashkan, Csaba Szepesvári³•Institutions (3)

Adobe Systems¹, Yahoo!², University of Alberta³

21 Feb 2015

TL;DR: In this article, the problem of computationally and sample efficient learning in stochastic combinatorial semi-bandits was studied and a UCB-like algorithm for solving the problem was presented.


Abstract: A stochastic combinatorial semi-bandit is an online learning problem where at each step a learning agent chooses a subset of ground items subject to constraints, and then observes stochastic weights of these items and receives their sum as a payoff. In this paper, we close the problem of computationally and sample efficient learning in stochastic combinatorial semi-bandits. In particular, we analyze a UCB-like algorithm for solving the problem, which is known to be computationally efficient; and prove $O(KL(1/\Delta)\log n)$ and $O(\sqrt{KLn\log n})$ upper bounds on its $n$-step regret, where $L$ is the number of ground items, $K$ is the maximum number of chosen items, and $\Delta$ is the gap between the expected returns of the optimal and best suboptimal solutions. The gap-dependent bound is tight up to a constant factor and the gap-free bound is tight up to a polylogarithmic factor.
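The setting described in this abstract can be illustrated with a minimal sketch of a UCB-style learner. This is not the paper's algorithm or analysis: the function name `comb_ucb_top_k`, the Bernoulli item weights, the "choose any K items" constraint, and the confidence radius sqrt(1.5 log t / T) are all assumptions made for illustration.

```python
import math
import random


def comb_ucb_top_k(means, K, n_steps, seed=0):
    """UCB-style learner for a top-K semi-bandit (hypothetical helper).

    means   : true expected weight of each of the L ground items
              (assumed Bernoulli; unknown to the learner).
    K       : number of items chosen at each step (assumed constraint).
    Returns : (observation counts per item, cumulative pseudo-regret).
    """
    rng = random.Random(seed)
    L = len(means)
    counts = [0] * L        # how often each item's weight was observed
    estimates = [0.0] * L   # running mean of observed weights
    best = sum(sorted(means, reverse=True)[:K])  # optimal expected payoff
    regret = 0.0

    for t in range(1, n_steps + 1):
        # Optimistic index per item; unobserved items are tried first.
        ucb = [
            math.inf if counts[e] == 0
            else estimates[e] + math.sqrt(1.5 * math.log(t) / counts[e])
            for e in range(L)
        ]
        # For a top-K constraint the best subset is the K largest indices.
        chosen = sorted(range(L), key=lambda e: ucb[e], reverse=True)[:K]
        regret += best - sum(means[e] for e in chosen)

        # Semi-bandit feedback: the weight of every chosen item is observed.
        for e in chosen:
            w = 1.0 if rng.random() < means[e] else 0.0
            counts[e] += 1
            estimates[e] += (w - estimates[e]) / counts[e]

    return counts, regret


counts, regret = comb_ucb_top_k([0.9, 0.8, 0.2, 0.1], K=2, n_steps=2000)
print(counts, regret)
```

In this toy instance L = 4 and K = 2. For richer constraints (paths, matchings), the `sorted(...)[:K]` step would be replaced by a combinatorial optimizer run on the optimistic indices.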


134 citations

Journal Article•DOI•

On the dispensability of public randomization in discounted repeated games


Drew Fudenberg¹, Eric Maskin²•Institutions (2)

Massachusetts Institute of Technology¹, Harvard University²

01 Apr 1991-Journal of Economic Theory

TL;DR: In this article, the authors show that any feasible, individually rational payoffs of an infinitely repeated game can arise as subgame perfect equilibrium payoffs if the discount factor is close enough to one even if mixed strategies are not observable and public randomizations are not available.


133 citations

Book Chapter•DOI•

On Nash equilibria in stochastic games


Krishnendu Chatterjee¹, Rupak Majumdar¹, Marcin Jurdziński²•Institutions (2)

University of California¹, University of Warwick²

01 Oct 2003

TL;DR: It is shown that if each player has a reachability objective, that is, if the goal for each player i is to visit some subset of the states, then there exists an ε-Nash equilibrium in memoryless strategies for every ε > 0; however, exact Nash equilibria need not exist.


Abstract: We study infinite stochastic games played by n players on a finite graph with goals given by sets of infinite traces. The games are stochastic (each player simultaneously and independently chooses an action at each round, and the next state is determined by a probability distribution depending on the current state and the chosen actions), infinite (the game continues for an infinite number of rounds), nonzero sum (the players' goals are not necessarily conflicting), and undiscounted. We show that if each player has a reachability objective, that is, if the goal for each player i is to visit some subset R_i of the states, then there exists an ε-Nash equilibrium in memoryless strategies, for every ε > 0. However, exact Nash equilibria need not exist. We study the complexity of finding such Nash equilibria, and show that the payoff of some ε-Nash equilibrium in memoryless strategies can be ε-approximated in NP.
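To make the game model in this abstract concrete, the following sketch evaluates a fixed profile of memoryless strategies: it builds the induced Markov chain and approximates the probability of reaching a target set by value iteration. It does not search for equilibria. The function names, the dictionary encoding of the game, and the three-state example are hypothetical and chosen only for illustration.

```python
from itertools import product


def induced_chain(states, delta, strategies):
    """Markov chain obtained when every player fixes a memoryless strategy.

    delta[s][joint_action] -> {next_state: probability}; states missing
    from delta are treated as absorbing (assumed encoding).
    strategies[i][s]       -> {action: probability} for player i.
    """
    chain = {}
    for s in states:
        if s not in delta:
            chain[s] = {s: 1.0}  # absorbing state: self-loop
            continue
        row = {}
        per_player = [strategy[s].items() for strategy in strategies]
        for joint in product(*per_player):
            actions = tuple(a for a, _ in joint)
            weight = 1.0
            for _, p in joint:  # players randomize independently
                weight *= p
            for nxt, q in delta[s][actions].items():
                row[nxt] = row.get(nxt, 0.0) + weight * q
        chain[s] = row
    return chain


def reach_probability(chain, target, start, iterations=200):
    """Approximate the probability of ever visiting `target` from `start`."""
    x = {s: (1.0 if s in target else 0.0) for s in chain}
    for _ in range(iterations):
        x = {
            s: 1.0 if s in target
            else sum(p * x[t] for t, p in chain[s].items())
            for s in chain
        }
    return x[start]


# Toy two-player game: matching actions in s0 reach "goal" with
# probability 0.5 (otherwise stay in s0); mismatched actions lead to "sink".
states = ["s0", "goal", "sink"]
delta = {"s0": {
    ("a", "a"): {"goal": 0.5, "s0": 0.5},
    ("b", "b"): {"goal": 0.5, "s0": 0.5},
    ("a", "b"): {"sink": 1.0},
    ("b", "a"): {"sink": 1.0},
}}
uniform = {"s0": {"a": 0.5, "b": 0.5}}
chain = induced_chain(states, delta, [uniform, uniform])
p = reach_probability(chain, {"goal"}, "s0")
print(round(p, 6))  # 0.333333
```

With both players mixing uniformly, the reach probability x from s0 satisfies x = 0.25 + 0.25·x, so x = 1/3. Checking whether a profile is an ε-Nash equilibrium would additionally require comparing each player's value against their best deviation, which this sketch omits.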


133 citations

Journal Article•DOI•

Learning to Compete for Resources in Wireless Stochastic Games


Fangwen Fu¹, M. van der Schaar¹•Institutions (1)

University of California, Los Angeles¹

01 May 2009-IEEE Transactions on Vehicular Technology

TL;DR: The simulation results show that by deploying the proposed best-response learning algorithm, the wireless users can significantly improve their own bidding strategies and, hence, their performance in terms of both the application quality and the incurred cost for the used resources.


Abstract: In this paper, we model the various users in a wireless network (e.g., cognitive radio network) as a collection of selfish autonomous agents that strategically interact to acquire dynamically available spectrum opportunities. Our main focus is on developing solutions for wireless users to successfully compete with each other for the limited and time-varying spectrum opportunities, given experienced dynamics in the wireless network. To analyze the interactions among users given the environment disturbance, we propose a stochastic game framework for modeling how the competition among users for spectrum opportunities evolves over time. At each stage of the stochastic game, a central spectrum moderator (CSM) auctions the available resources, and the users strategically bid for the required resources. The joint bid actions affect the resource allocation and, hence, the rewards and future strategies of all users. Based on the observed resource allocations and corresponding rewards, we propose a best-response learning algorithm that can be deployed by wireless users to improve their bidding policy at each stage. The simulation results show that by deploying the proposed best-response learning algorithm, the wireless users can significantly improve their own bidding strategies and, hence, their performance in terms of both the application quality and the incurred cost for the used resources.


132 citations


Network Information

Performance

Metrics

Papers: 10,612

Citations: 226,366

No. of papers in the topic in previous years
Year	Papers
2023	364
2022	738
2021	462
2020	512
2019	460
2018	483
