Convergence analysis on approximate reinforcement learning

doi:10.1007/978-3-540-76719-0_12

Book Chapter

Jinsong Leng, +2 more

- pp 85-91

TLDR

The aim of this paper is to propose a methodology for analysing performance so that a set of optimal parameter values can be selected adaptively in the TD(λ) learning algorithm.

Abstract:

Temporal difference (TD) learning is a form of approximate reinforcement learning that uses incremental learning updates. For large, stochastic and dynamic systems, however, how to analyse the convergence and sensitivity of TD algorithms remains an open question, because a suitable methodology is lacking. Moreover, analysing convergence and parameter sensitivity is very expensive, because the analysis metrics can be obtained only by running experiments with different parameter values. In this paper, we utilise the TD(λ) learning control algorithm with a linear function approximation technique known as tile coding to help soccer agents learn optimal control processes. The aim of this paper is to propose a methodology for analysing performance so that a set of optimal parameter values can be selected adaptively in the TD(λ) learning algorithm.
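
As a rough illustration of the setup the abstract describes, the sketch below runs TD(λ) value prediction with tile-coded linear features on a toy one-dimensional random walk, then sweeps the step size and trace decay by rerunning the experiment for each setting. It is not the authors' SoccerBots control experiment or their proposed methodology. The task, the helper names (`tile_features`, `run_td_lambda`, `sweep`), the accumulating-trace update and all constants are assumptions chosen for brevity.

```python
import math
import random

N_TILINGS, N_TILES = 4, 8                 # assumed sizes, chosen for illustration
N_WEIGHTS = N_TILINGS * (N_TILES + 1)     # one extra tile per tiling absorbs the offset


def tile_features(x):
    """Indices of the active tiles (one per tiling) for a state x in [0, 1)."""
    active = []
    for t in range(N_TILINGS):
        offset = t / (N_TILINGS * N_TILES)    # shift each tiling by a fraction of a tile
        idx = int((x + offset) * N_TILES)     # stays within 0..N_TILES for x in [0, 1)
        active.append(t * (N_TILES + 1) + idx)
    return active


def value(w, x):
    """Linear value estimate: the sum of the weights of the active tiles."""
    return sum(w[i] for i in tile_features(x))


def run_td_lambda(alpha, lam, gamma=1.0, episodes=500, seed=0):
    """TD(lambda) prediction with accumulating traces on a toy random walk.

    The walk starts at 0.5, moves by a uniform step in [-0.2, 0.2], and ends
    with reward 1 when it leaves on the right and reward 0 on the left.
    """
    rng = random.Random(seed)
    w = [0.0] * N_WEIGHTS
    step = alpha / N_TILINGS                  # scale the step size by the active-feature count
    for _ in range(episodes):
        z = [0.0] * N_WEIGHTS                 # eligibility traces, reset each episode
        x = 0.5
        while True:
            x_next = x + rng.uniform(-0.2, 0.2)
            done = x_next < 0.0 or x_next >= 1.0
            reward = 1.0 if x_next >= 1.0 else 0.0
            v_next = 0.0 if done else value(w, x_next)
            delta = reward + gamma * v_next - value(w, x)   # TD error
            z = [gamma * lam * zi for zi in z]              # decay every trace
            for i in tile_features(x):
                z[i] += 1.0                                 # gradient is 1 on active tiles
            for i in range(N_WEIGHTS):
                w[i] += step * delta * z[i]
            if done:
                break
            x = x_next
    return w


def rms_error(w, points=20):
    """RMS gap to v(x) = x, which only approximates the true value of this walk."""
    xs = [(k + 0.5) / points for k in range(points)]
    return math.sqrt(sum((value(w, x) - x) * (value(w, x) - x) for x in xs) / points)


def sweep(alphas, lambdas):
    """Brute-force sensitivity analysis: one full experiment per (alpha, lambda) pair."""
    return {(a, l): rms_error(run_td_lambda(a, l)) for a in alphas for l in lambdas}
```

Calling `sweep([0.05, 0.1, 0.2], [0.0, 0.5, 0.9])` returns one error value per (α, λ) pair, which is the kind of exhaustive sensitivity data the abstract describes as costly to collect.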

Citations

Journal Article

Aggregate Reinforcement Learning for multi-agent territory division

Mohamed K. Gunady, +2 more

- 01 Sep 2014 -

Engineering Applications of Artificial I...

TL;DR: This paper targets the problem of territory division in the children's game of Hide-and-Seek as a test-bed for a hierarchical learning scheme using Reinforcement Learning (RL), and proposes a revised version of the standard Q-learning update rule to cope with multiple seekers.

Journal Article

An Efficient Computing of Correlated Equilibrium for Cooperative Q-Learning-Based Multi-Robot Planning

Arup Kumar Sadhu, +1 more

- 01 Aug 2020 -

IEEE Transactions on Systems, Man, and C...

TL;DR: Introduces a novel approach that adapts the composite rewards of all agents in a single table in the joint state-action space during learning, and uses these rewards to compute the correlated equilibrium (CE) efficiently during the planning phase.

Journal Article

Experimental analysis of eligibility traces strategies in temporal difference learning

Jinsong Leng, +2 more

- 01 Dec 2008 -

International Journal of Knowledge Engin...

TL;DR: The Sarsa(λ) learning control algorithm is applied to a large, stochastic and dynamic simulation environment called SoccerBots, and the underlying mechanism of eligibility traces with a function approximation technique known as tile coding is investigated.

Proceedings Article

Reinforcement learning generalization using state aggregation with a maze-solving problem

Mohamed K. Gunady, +1 more

TL;DR: This paper proposes a generalization technique using 'state aggregation', applies it to Q-learning, and shows how to aggregate similar states together.

Book Chapter

Temporal difference learning and simulated annealing for optimal control: a case study

Jinsong Leng, +2 more

TL;DR: A modified Sarsa(λ) control algorithm is presented that samples actions in conjunction with a simulated annealing technique, demonstrating that the quality of convergence is significantly improved by the simulated annealing approach.

References

Book

Reinforcement Learning: An Introduction

Richard S. Sutton, +1 more

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.

Book

Dynamic Programming

Richard Ernest Bellman

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.

Book

Introduction to Reinforcement Learning

Richard S. Sutton, +1 more

TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.

Journal Article

Intelligent Agents: Theory and Practice

Michael Wooldridge, +1 more

- 01 Jun 1995 -

Knowledge Engineering Review

TL;DR: Agent theory is concerned with the question of what an agent is, and with the use of mathematical formalisms for representing and reasoning about the properties of agents; agent architectures can be thought of as software engineering models of agents; and agent languages are software systems for programming and experimenting with agents.

Learning from delayed rewards

Chris Watkins

Related Papers (5)

Reinforcement learning for linear continuous-time systems: an incremental learning approach

Tao Bian, +1 more

- 25 Feb 2019 -

IEEE/CAA Journal of Automatica Sinica

The Improvement of Convergence Rate in n-Queen Problem Using Reinforcement learning

Lim Soo-Yeon, +3 more

- 01 Feb 2005 -

Journal of The Korean Institute of Intel...

IEEE Transactions on Evolutionary Comput...

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning

Adaptive step-size for online temporal difference learning

Learning Adaptive Differential Evolution Algorithm From Optimization Experiences by Policy Gradient