Book Chapter

Convergence analysis on approximate reinforcement learning

Abstract
Temporal difference (TD) learning is a form of approximate reinforcement learning that uses incremental learning updates. For large, stochastic and dynamic systems, however, how to analyse the convergence and sensitivity of TD algorithms is still an open question, because a suitable methodology is lacking. Moreover, analysing convergence and parameter sensitivity is very expensive, since such analysis metrics can be obtained only by running experiments with different parameter values. In this paper, we utilise the TD(λ) learning control algorithm with a linear function approximation technique known as tile coding to help a soccer agent learn the optimal control processes. The aim of this paper is to propose a methodology for analysing performance so that a set of optimal parameter values for the TD(λ) learning algorithm can be selected adaptively.
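As a rough illustration of the ingredients named in the abstract, the sketch below runs TD(λ) with accumulating eligibility traces over a tile-coded linear value function and then sweeps the step size and λ to compare final errors. It is not the chapter's method: it assumes a simple one-dimensional random walk instead of the soccer environment, it does value prediction rather than control, and all names (`tile_indices`, `td_lambda`, `sweep`) and default parameter values are hypothetical.

```python
import random


def tile_indices(x, n_tilings=4, n_tiles=8):
    """Return one active feature index per tiling for a position x in [0, 1]."""
    active = []
    for t in range(n_tilings):
        # Each tiling is shifted by a fraction of one tile width.
        offset = t / (n_tilings * n_tiles)
        idx = min(int((x + offset) * n_tiles), n_tiles)
        # n_tiles + 1 tiles per tiling so the shifted grid still covers [0, 1].
        active.append(t * (n_tiles + 1) + idx)
    return active


def td_lambda(alpha, lam, episodes=200, n_states=19, gamma=1.0, seed=0,
              n_tilings=4, n_tiles=8):
    """Run TD(lambda) prediction on a random walk; return the final RMS error.

    Assumed task: states 1..n_states, terminals at 0 and n_states + 1,
    reward 1 on reaching the right terminal, 0 otherwise.
    """
    rng = random.Random(seed)
    w = [0.0] * (n_tilings * (n_tiles + 1))
    step = alpha / n_tilings  # spread the step size over the active tiles

    def features(s):
        return tile_indices(s / (n_states + 1), n_tilings, n_tiles)

    def value(s):
        return sum(w[i] for i in features(s))

    for _ in range(episodes):
        z = [0.0] * len(w)  # eligibility traces, reset every episode
        s = (n_states + 1) // 2
        while 0 < s < n_states + 1:
            s_next = s + rng.choice((-1, 1))
            terminal = s_next in (0, n_states + 1)
            reward = 1.0 if s_next == n_states + 1 else 0.0
            target = reward + (0.0 if terminal else gamma * value(s_next))
            delta = target - value(s)  # TD error
            # Accumulating traces: decay all, then add 1 for each active tile.
            z = [gamma * lam * e for e in z]
            for i in features(s):
                z[i] += 1.0
            w = [wi + step * delta * e for wi, e in zip(w, z)]
            s = s_next

    # For this walk the true value of state s is s / (n_states + 1).
    sq = sum((value(s) - s / (n_states + 1)) ** 2
             for s in range(1, n_states + 1))
    return (sq / n_states) ** 0.5


def sweep(alphas, lambdas):
    """Run one experiment per (alpha, lambda) pair and collect the errors."""
    return {(a, l): td_lambda(a, l) for a in alphas for l in lambdas}


if __name__ == "__main__":
    results = sweep([0.05, 0.1, 0.2], [0.0, 0.5, 0.9])
    for (a, l), err in sorted(results.items()):
        print(f"alpha={a:<5} lambda={l:<4} rms_error={err:.3f}")
```

Each cell of the sweep is a full learning run, which is the cost the abstract refers to when it says such metrics are obtained only by running experiments with different parameter values. A single seed is used here for brevity; a more careful comparison would average over several seeds.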


Citations
Journal Article

Aggregate Reinforcement Learning for multi-agent territory division

TL;DR: This paper targets the problem of territory division in the children's game of Hide-and-Seek as a test-bed for a hierarchical learning scheme using Reinforcement Learning (RL), and proposes a revised version of the standard Q-learning update rule to cope with multiple seekers.
Journal Article

An Efficient Computing of Correlated Equilibrium for Cooperative $Q$-Learning-Based Multi-Robot Planning

TL;DR: A novel approach is introduced that adapts the composite rewards of all the agents in one single table in joint state-action space during learning, and uses these rewards to compute the correlated equilibrium (CE) efficiently during the planning phases.
Journal Article

Experimental analysis of eligibility traces strategies in temporal difference learning

TL;DR: The Sarsa(λ) learning control algorithm is applied in a large, stochastic and dynamic simulation environment called SoccerBots, and the underlying mechanism of eligibility traces with an approximation function known as tile coding is investigated.
Proceedings Article

Reinforcement learning generalization using state aggregation with a maze-solving problem

TL;DR: This paper proposes a generalization technique using 'state aggregation', applies it to Q-learning, and shows how to aggregate similar states together.
Book Chapter

Temporal difference learning and simulated annealing for optimal control: a case study

TL;DR: A modified Sarsa(λ) control algorithm that samples actions in conjunction with a simulated annealing technique is presented, demonstrating that the quality of convergence is significantly improved by the simulated annealing approach.