Utility Based Q-learning to Maintain Cooperation in Prisoner's Dilemma Games

doi:10.1109/IAT.2007.104

Proceedings Article

Koichi Moriyama

- pp 146-152

TLDR

This work derives a theorem on how many times mutual cooperation must occur to make the Q-function of cooperation larger than that of defection, and a corollary on how much utility is necessary to achieve this with one-shot mutual cooperation.

Abstract:

This work deals with Q-learning in a multiagent environment. There are many multiagent Q-learning methods, and most of them aim to converge to a Nash equilibrium, which is not desirable in games like the prisoner's dilemma (PD). However, normal Q-learning agents that choose actions stochastically to avoid local optima may arrive at mutual cooperation in PD. Although such mutual cooperation usually occurs only as an isolated event, it can be maintained if the Q-function of cooperation becomes larger than that of defection after the cooperation. This work derives a theorem on how many times mutual cooperation is needed to make the Q-function of cooperation larger than that of defection. In addition, from the perspective of the author's previous works, which distinguish utilities from rewards and use utilities for learning in PD, this work also derives a corollary on how much utility is necessary to make the Q-function of cooperation larger after one-shot mutual cooperation.
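The question the abstract poses can be explored numerically. The sketch below is not the paper's theorem or its proof; it assumes a single-state (stateless) Q-learning update, Q(a) += alpha * (r + gamma * max Q - Q(a)), and simply counts how many consecutive mutual cooperations are needed before Q(C) exceeds Q(D). The function names, the starting Q-values, and the reward and parameter values are all illustrative assumptions.

```python
def q_update(q, action, reward, alpha, gamma):
    """One stateless Q-learning step (assumed form, not taken from the paper)."""
    target = reward + gamma * max(q.values())
    q[action] += alpha * (target - q[action])


def cooperations_needed(q_c, q_d, reward_cc, alpha, gamma, max_steps=10_000):
    """Count consecutive mutual cooperations until Q(C) > Q(D).

    q_c, q_d  : Q-values of cooperation and defection before the run (assumed)
    reward_cc : reward (or utility) received for mutual cooperation
    Returns None if Q(C) never exceeds Q(D) within max_steps.
    """
    q = {"C": q_c, "D": q_d}
    for n in range(1, max_steps + 1):
        # Only Q(C) is updated, because the agent keeps cooperating.
        q_update(q, "C", reward_cc, alpha, gamma)
        if q["C"] > q["D"]:
            return n
    return None


# Hypothetical setting: Q(D) = 2 and Q(C) = 0 before cooperation starts,
# with alpha = gamma = 0.5.
print(cooperations_needed(0.0, 2.0, reward_cc=3.0, alpha=0.5, gamma=0.5))  # 2
# A larger utility for mutual cooperation makes a single cooperation enough.
print(cooperations_needed(0.0, 2.0, reward_cc=3.5, alpha=0.5, gamma=0.5))  # 1
```

In this toy setting, raising the value assigned to mutual cooperation from 3.0 to 3.5 reduces the required number of cooperations from two to one, which is the kind of relationship between utility and one-shot cooperation that the corollary addresses.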

Citations

Journal Article

Algorithms, machine learning, and collusion

Ulrich Schwalbe

- 01 Dec 2018 -

Journal of Competition Law and Economics

TL;DR: Whether self-learning price-setting algorithms can coordinate their pricing behavior to achieve a collusive outcome that maximizes the joint profits of the firms using them is discussed.

Posted Content

Algorithmic Pricing and Competition: Empirical Evidence from the German Retail Gasoline Market

Stephanie Assad, +3 more

- 01 Aug 2020 -

Research Papers in Economics

TL;DR: In this article, the authors investigate the impact of AI adoption on outcomes linked to competition in the German retail gasoline market and show that adoption increases margins by 9%, but only in non-monopoly markets.

Proceedings Article

Learning-Rate Adjusting Q-Learning for Prisoner's Dilemma Games

Koichi Moriyama

TL;DR: This paper introduces a new Q-learning method called the learning-rate adjusting Q-learning, or LRA-Q, which deals with the learning rate directly and aims to converge to a Nash equilibrium.

Proceedings Article

Maintaining cooperation in homogeneous multi-agent system

Jianye Hao, +1 more

TL;DR: It is shown that the system can maintain a certain level of cooperation even though the agents are individually rational, and a mathematical model is developed to analyze the dynamics resulting from the learning framework.

Multi-agent learning in complex environments

Chao Yu

TL;DR: Experimental results show that agents using the proposed approaches can learn efficient coordinated behaviors in domains of different sizes, and a collective multi-agent learning framework is proposed to study the impact of agent local collective learning on the emergence of social norms in a number of different settings in terms of agent heterogeneities and topological varieties.

References

Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Book

The Evolution of Cooperation

Robert Axelrod, +1 more

TL;DR: In this paper, a model based on the concept of an evolutionarily stable strategy in the context of the Prisoner's Dilemma game was developed for cooperation in organisms, and the results of a computer tournament showed how cooperation based on reciprocity can get started in an asocial world, can thrive while interacting with a wide range of other strategies, and can resist invasion once fully established.

Journal Article

Technical Note Q-Learning

Chris Watkins, +1 more

- 01 May 1992 -

Machine Learning

TL;DR: In this article, it is shown that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action values are represented discretely.

Technical Note:Q-Learning

C. J. C. H. Watkins

Book Chapter

Markov games as a framework for multi-agent reinforcement learning

Michael L. Littman

TL;DR: A Q-learning-like algorithm for finding optimal policies and its application to a simple two-player game in which the optimal policy is probabilistic is demonstrated.

Related Papers (5)

Learning and Cooperation in Sequential Games

Annapurna Valluri

- 01 Sep 2006 -

Adaptive Behavior

IEEE Signal Processing Magazine

Emergence of cooperation supported by communication in a one-shot 2×2 game

Nash-reinforcement learning (N-RL) for developing coordination strategies in non-transferable utility games

Agent-based simulation of coalition formation in cooperative games

Distributed No-Regret Learning in Multiagent Systems: Challenges and Recent Developments