Open Access Posted Content

Inter-Level Cooperation in Hierarchical Reinforcement Learning

TLDR
It is hypothesized that improved cooperation between the internal agents of a hierarchy can simplify the credit assignment problem from the perspective of the high-level policies, thereby leading to significant training improvements in situations where intricate sets of action primitives must be performed to improve performance.
Abstract
Hierarchical models for deep reinforcement learning (RL) have emerged as powerful methods for generating meaningful control strategies in difficult long time horizon tasks. Training of these hierarchical models, however, continues to suffer from instabilities that limit their applicability. In this paper, we address instabilities that arise from the concurrent optimization of goal-assignment and goal-achievement policies. Drawing connections between this concurrent optimization scheme and communication and cooperation in multi-agent RL, we redefine the standard optimization procedure to explicitly promote cooperation between these disparate tasks. Our method is demonstrated to achieve superior results to existing techniques in a set of difficult long time horizon tasks, and expands the scope of tasks solvable by hierarchical reinforcement learning. Videos of the results are available at: this https URL.
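To make the concurrent two-level optimization concrete, here is a minimal Python sketch of a goal-conditioned rollout, assuming a manager (goal-assignment) policy that emits a goal every goal_horizon steps and a worker (goal-achievement) policy rewarded for reaching it. The coop_weight term, which feeds the worker's goal-reaching return back into the manager's reward, is a hypothetical illustration of inter-level cooperation rather than the paper's exact objective; manager, worker, and env are assumed interfaces.

# Minimal two-level goal-conditioned rollout. The cooperation term is a
# hypothetical illustration, not the paper's exact method.
import numpy as np

def worker_reward(state, goal):
    # Intrinsic reward: negative distance between the state and the goal.
    return -np.linalg.norm(state - goal)

def rollout(env, manager, worker, goal_horizon=10, coop_weight=0.5):
    state = env.reset()
    manager_transitions, worker_transitions = [], []
    done = False
    while not done:
        goal = manager.sample_goal(state)        # high-level action
        env_return, intrinsic_return = 0.0, 0.0
        start_state = state
        for _ in range(goal_horizon):
            action = worker.act(state, goal)     # low-level action
            next_state, r, done, _ = env.step(action)
            intrinsic = worker_reward(next_state, goal)
            worker_transitions.append((state, goal, action, intrinsic, next_state))
            env_return += r
            intrinsic_return += intrinsic
            state = next_state
            if done:
                break
        # Cooperative credit: the manager is rewarded both for task progress
        # and for proposing goals the worker can actually achieve.
        manager_reward = env_return + coop_weight * intrinsic_return
        manager_transitions.append((start_state, goal, manager_reward, state))
    return manager_transitions, worker_transitions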


Citations
Proceedings Article

Demonstration-Bootstrapped Autonomous Practicing via Multi-Task Reinforcement Learning

TL;DR: This work proposes a reinforcement learning system that bootstraps multi-task RL with prior data to enable continuous autonomous practicing, minimizing the number of manual resets needed while learning temporally extended behaviors.
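As an illustration of how such a system might minimize resets, the sketch below picks the next task whose start condition is already satisfied by the current state and falls back to a manual reset only when no task applies. The task objects, their start_condition predicates, and the per-task success_rate field are illustrative assumptions, not the paper's implementation.

# Hypothetical sketch of reset-minimizing task sequencing.
def practice(env, tasks, policies, episodes=100):
    state = env.reset()
    resets = 1
    for _ in range(episodes):
        runnable = [t for t in tasks if t.start_condition(state)]
        if not runnable:
            state = env.reset()          # unavoidable manual reset
            resets += 1
            runnable = [t for t in tasks if t.start_condition(state)]
        task = min(runnable, key=lambda t: t.success_rate)  # practice weakest
        state = policies[task.name].run_episode(env, state)
    return resets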
Journal Article

Learning Pneumatic Non-Prehensile Manipulation With a Mobile Blower

TL;DR: This work investigates pneumatic non-prehensile manipulation as a means of efficiently moving scattered objects into a target receptacle. It introduces a multi-frequency version of the spatial action maps framework and shows that its simulation-trained policies transfer well to a real environment and generalize to novel objects.
Journal Article

HiSaRL: A Hierarchical Framework for Safe Reinforcement Learning

TL;DR: A two-level hierarchical framework for safe reinforcement learning in complex environments, comprising a learning-based controller and a corresponding neural Lyapunov function that characterizes the controller's stability property.
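For context, a neural Lyapunov function V_theta paired with a controller pi is typically trained to satisfy the standard discrete-time stability conditions below on a region D around the equilibrium x*; this is the textbook formulation, not necessarily HiSaRL's exact construction.

% Standard discrete-time Lyapunov conditions for a learned certificate
% V_\theta on a region \mathcal{D} around the equilibrium x^*.
V_\theta(x^*) = 0, \qquad
V_\theta(x) > 0 \quad \forall x \in \mathcal{D} \setminus \{x^*\}, \qquad
V_\theta\big(f(x, \pi(x))\big) - V_\theta(x) < 0 \quad \forall x \in \mathcal{D} \setminus \{x^*\}.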
Posted Content

Active Hierarchical Imitation and Reinforcement Learning

TL;DR: Experimental results showed that DAgger and a reward-based active learning method achieve better performance while reducing the physical and mental effort required of humans during training.
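Below is a minimal sketch of the DAgger loop, with a simple uncertainty-gated query standing in for the paper's reward-based criterion; the act_with_uncertainty interface and the threshold are illustrative assumptions. In each iteration the learner is rolled out, the expert labels a subset of visited states, and the learner is retrained on the aggregated dataset.

# DAgger sketch with an active (uncertainty-gated) expert query.
def dagger(env, learner, expert, iterations=10, threshold=0.3):
    dataset = []
    for _ in range(iterations):
        state = env.reset()
        done = False
        while not done:
            action, uncertainty = learner.act_with_uncertainty(state)
            if uncertainty > threshold:
                dataset.append((state, expert.label(state)))  # active query
            state, _, done, _ = env.step(action)
        learner.fit(dataset)  # retrain on the aggregated dataset
    return learner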
Proceedings Article

Learning in Bi-level Markov Games

TL;DR: Experimental results show that the proposed bi-level Markov game (BMG) framework, which breaks the simultaneity of standard multi-agent reinforcement learning, achieves better performance and lower variance.
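The bi-level structure resembles a generic Stackelberg formulation, in which a leader's policy is optimized while anticipating the follower's best response, removing the simultaneous-update coupling; the objective below illustrates that idea and is not necessarily BMG's exact loss.

% Generic Stackelberg (bi-level) objective for a leader policy \pi_1 and
% a follower policy \pi_2 that best-responds to the leader.
\max_{\pi_1} \; J_1\big(\pi_1, \pi_2^*(\pi_1)\big)
\quad \text{s.t.} \quad
\pi_2^*(\pi_1) \in \arg\max_{\pi_2} \, J_2(\pi_1, \pi_2).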
References
Journal Article

Human-level control through deep reinforcement learning

TL;DR: This work bridges the divide between high-dimensional sensory inputs and actions, resulting in the first artificial agent that is capable of learning to excel at a diverse array of challenging tasks.
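The method combines Q-learning with a deep network, experience replay, and a periodically synchronized target network. Below is a common PyTorch-style sketch of the target computation and loss; the network modules and the replay batch are assumed, and a Huber (smooth L1) loss plays the role of the paper's error clipping.

# Sketch of the DQN target and loss: a frozen target network supplies
# bootstrapped values for sampled replay transitions.
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    states, actions, rewards, next_states, dones = batch
    q = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        # Max over actions with the slowly-updated target network.
        next_q = target_net(next_states).max(dim=1).values
        target = rewards + gamma * (1.0 - dones) * next_q
    return F.smooth_l1_loss(q, target)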
Posted Content

Proximal Policy Optimization Algorithms

TL;DR: A new family of policy gradient methods for reinforcement learning, which alternate between sampling data through interaction with the environment, and optimizing a "surrogate" objective function using stochastic gradient ascent, are proposed.
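The clipped surrogate at the heart of PPO fits in a few lines; this sketch assumes precomputed advantage estimates (e.g., via GAE) and stored old-policy log-probabilities.

# PPO clipped surrogate: the probability ratio between new and old
# policies is clipped so a single update cannot move the policy too far.
import torch

def ppo_clip_loss(log_probs_new, log_probs_old, advantages, eps=0.2):
    ratio = torch.exp(log_probs_new - log_probs_old)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - eps, 1.0 + eps) * advantages
    # Maximize the surrogate, so return its negation as a loss.
    return -torch.min(unclipped, clipped).mean()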
Journal Article

Technical Note: Q-Learning

TL;DR: This paper presents and proves in detail a convergence theorem for Q-learning based on that outlined in Watkins (1989), showing that Q-learning converges to the optimum action-values with probability 1 so long as all actions are repeatedly sampled in all states and the action-values are represented discretely.
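The rule the theorem covers is the one-step tabular update; a minimal sketch, assuming Q is a 2-D array indexed by discrete states and actions:

# Tabular Q-learning update: under the theorem's conditions (all
# state-action pairs sampled infinitely often, suitable step sizes,
# discrete representation), Q converges to the optimum with probability 1.
import numpy as np

def q_update(Q, s, a, r, s_next, alpha=0.1, gamma=0.99):
    # One-step bootstrapped target using the greedy value at s_next.
    td_target = r + gamma * np.max(Q[s_next])
    Q[s, a] += alpha * (td_target - Q[s, a])
    return Q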
Journal Article

Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning

TL;DR: This article presents a general class of associative reinforcement learning algorithms for connectionist networks containing stochastic units that are shown to make weight adjustments in a direction that lies along the gradient of expected reinforcement in both immediate-reinforcement tasks and certain limited forms of delayed-reinforcement tasks, and they do this without explicitly computing gradient estimates.
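The resulting estimator weights each log-probability gradient by the return that followed, optionally shifted by a baseline to reduce variance without biasing the estimate; a minimal sketch:

# REINFORCE loss: minimizing this performs stochastic gradient ascent
# on expected return.
import torch

def reinforce_loss(log_probs, returns, baseline=0.0):
    # log_probs: log pi(a_t | s_t) for one episode; returns: G_t per step.
    advantages = returns - baseline
    return -(log_probs * advantages).mean()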
Proceedings Article

Policy Gradient Methods for Reinforcement Learning with Function Approximation

TL;DR: This paper proves for the first time that a version of policy iteration with arbitrary differentiable function approximation is convergent to a locally optimal policy.
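The theorem expresses the performance gradient in a form that avoids differentiating the on-policy state distribution d^pi, which is what makes compatible function approximation tractable:

% Policy gradient theorem (Sutton et al., 2000).
\nabla_\theta J(\theta)
= \sum_s d^{\pi}(s) \sum_a \nabla_\theta \pi_\theta(a \mid s)\, Q^{\pi}(s,a)
= \mathbb{E}_{\pi}\!\left[\nabla_\theta \log \pi_\theta(a \mid s)\, Q^{\pi}(s,a)\right].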