Open Access Journal Article (DOI)

Policy Derivation Methods for Critic-Only Reinforcement Learning in Continuous Action Spaces

TLDR
Several variants of the policy-derivation algorithm are introduced and compared on two continuous state-action benchmarks: double pendulum swing-up and 3D mountain car.
About
This article was published in IFAC-PapersOnLine on 2016-01-01 and is open access. It has received 8 citations to date. The article focuses on the topics Q-learning and reinforcement learning.
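In critic-only reinforcement learning, no explicit actor is learned; the policy is derived from the critic, typically by maximizing the Q-function over the action at each state. A minimal sketch of this standard derivation, using a hypothetical toy critic rather than a learned one:

```python
import numpy as np

def derive_policy(q_func, actions):
    """Greedy policy derived from a critic: pick the candidate action
    maximizing Q(s, a) over a discretized action set.

    q_func  : callable (state, action) -> value; the learned critic
              (here replaced by an illustrative toy function)
    actions : 1-D array of candidate actions covering the continuous range
    """
    def policy(state):
        values = np.array([q_func(state, a) for a in actions])
        return actions[np.argmax(values)]
    return policy

# Toy critic whose maximizer is a = -state (illustrative only, not learned)
q = lambda s, a: -(a + s) ** 2

pi = derive_policy(q, np.linspace(-1.0, 1.0, 21))
print(pi(0.5))  # grid point closest to the true maximizer, about -0.5
```

The coarseness of the action grid limits the quality of such a policy in continuous action spaces, which is the problem the paper's refinement and approximation variants address.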


Citations
Journal Article (DOI)

Optimal and Autonomous Control Using Reinforcement Learning: A Survey

TL;DR: Q-learning and the integral RL algorithm are discussed as core algorithms for discrete-time (DT) and continuous-time (CT) systems, respectively, and a new direction of off-policy RL for both CT and DT systems is presented.
Journal Article (DOI)

Experience selection in deep reinforcement learning for control

TL;DR: This work proposes guidelines for using prior knowledge about the characteristics of the control problem at hand to choose an appropriate experience replay strategy, and investigates different proxies for the immediate and long-term utility of experiences.
Proceedings Article (DOI)

Symbolic method for deriving policy in reinforcement learning

TL;DR: A novel method based on genetic programming is proposed to construct a symbolic function that serves as a proxy to the value function and from which a continuous policy is derived; the resulting policy outperforms the standard policy-derivation method.
Journal Article (DOI)

Knowledge-based reinforcement learning controller with fuzzy-rule network: experimental validation

TL;DR: A model-free controller for a general class of output-feedback nonlinear discrete-time systems is established using actor-critic networks and reinforcement learning with human knowledge encoded as IF–THEN rules.
Journal Article (DOI)

Policy derivation methods for critic-only reinforcement learning in continuous spaces

TL;DR: Policy derivation methods are proposed which alleviate these problems by means of action-space refinement, continuous approximation, and post-processing of the V-function using symbolic regression.
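Of the techniques named in this TL;DR, action-space refinement is the most mechanical: rather than committing to one coarse action grid, the search interval is repeatedly re-discretized around the current best action. A minimal sketch of that idea, under the assumption that refinement means iterative zooming (the function and toy critic below are illustrative, not the paper's implementation):

```python
import numpy as np

def refined_argmax(q_func, state, lo, hi, n=11, iters=4):
    """Approximate argmax_a Q(state, a) over [lo, hi] by iterative
    grid refinement: discretize, pick the best grid point, then
    shrink the interval to one grid spacing around it and repeat.
    """
    for _ in range(iters):
        grid = np.linspace(lo, hi, n)
        best = grid[np.argmax([q_func(state, a) for a in grid])]
        width = (hi - lo) / (n - 1)          # current grid spacing
        lo, hi = best - width, best + width  # zoom in around the best action
    return best

# Toy critic with maximizer a* = 0.33 for every state (illustrative)
q = lambda s, a: -(a - 0.33) ** 2

a_star = refined_argmax(q, state=0.0, lo=-1.0, hi=1.0)
```

With each iteration the interval shrinks by roughly a factor of (n - 1) / 2, so a few iterations locate the maximizer far more precisely than a single grid of the same total size.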
References
Journal Article (DOI)

Experiments with reinforcement learning in problems with continuous state and action spaces

TL;DR: This article proposes a simple and modular technique that can be used to implement function approximators with nonuniform degrees of resolution so that the value function can be represented with higher accuracy in important regions of the state and action spaces.
Journal Article (DOI)

Reinforcement Learning with Factored States and Actions

TL;DR: A novel method is presented for approximating the value function and selecting good actions in Markov decision processes with large state and action spaces; it is shown that a product-of-experts approximation can be used to solve large problems.
Journal Article (DOI)

Detection and Diagnosis of Oscillation in Control Loops

TL;DR: Further opportunities for oscillation detection in the off-line analysis of ensembles of data from control loops are examined, and operational signatures that indicate the cause of an oscillation are presented.
Proceedings Article (DOI)

Autonomous transfer for reinforcement learning

TL;DR: This paper introduces Modeling Approximate State Transitions by Exploiting Regression (MASTER), a method for automatically learning a mapping from one task to another through an agent's experience and demonstrates that such learned relationships can significantly improve the speed of a reinforcement learning algorithm in a series of Mountain Car tasks.
Journal Article (DOI)

Approximate dynamic programming with a fuzzy parameterization

TL;DR: This work shows that fuzzy Q-iteration is consistent, i.e., that it asymptotically obtains the optimal solution as the approximation accuracy increases, and proves that the asynchronous algorithm converges at least as fast as the synchronous one.