Open Access Proceedings Article

Structured Control Nets for Deep Reinforcement Learning

TLDR
This work proposes a new neural network architecture for the policy network representation that is simple yet effective, and demonstrates much improved performance for locomotion tasks by emulating the biological central pattern generators as the nonlinear part of the architecture.
Abstract
In recent years, Deep Reinforcement Learning has made impressive advances in solving several important benchmark problems for sequential decision making. Many control applications use a generic multilayer perceptron (MLP) for non-vision parts of the policy network. In this work, we propose a new neural network architecture for the policy network representation that is simple yet effective. The proposed Structured Control Net (SCN) splits the generic MLP into two separate sub-modules: a nonlinear control module and a linear control module. Intuitively, the nonlinear control is for forward-looking and global control, while the linear control stabilizes the local dynamics around the residual of global control. We hypothesize that this will bring together the benefits of both linear and nonlinear policies: improve training sample efficiency, final episodic reward, and generalization of learned policy, while requiring a smaller network and being generally applicable to different training methods. We validated our hypothesis with competitive results on simulations from OpenAI MuJoCo, Roboschool, Atari, and a custom 2D urban driving environment, with various ablation and generalization tests, trained with multiple black-box and policy gradient training methods. The proposed architecture has the potential to improve upon broader control tasks by incorporating problem specific priors into the architecture. As a case study, we demonstrate much improved performance for locomotion tasks by emulating the biological central pattern generators (CPGs) as the nonlinear part of the architecture.
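To make the two-module split concrete, here is a minimal PyTorch sketch of the SCN idea as the abstract describes it: the action is the sum of a linear term and a small nonlinear MLP. The class name, layer sizes, and tanh activations are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of a Structured Control Net policy (illustrative, not the paper's code).
import torch
import torch.nn as nn

class StructuredControlNet(nn.Module):
    """Policy that sums a linear control module and a nonlinear MLP module."""

    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        # Linear module: stabilizes local dynamics, u_linear = K s + b
        self.linear = nn.Linear(obs_dim, act_dim)
        # Nonlinear module: small MLP for global, forward-looking control
        self.nonlinear = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, s: torch.Tensor) -> torch.Tensor:
        # Final action is the sum of the two control terms
        return self.linear(s) + self.nonlinear(s)

# Example: dimensions matching MuJoCo HalfCheetah (17 observations, 6 actions)
policy = StructuredControlNet(obs_dim=17, act_dim=6)
action = policy(torch.randn(1, 17))
```

Summing the two outputs keeps the parameter count small relative to a monolithic MLP of comparable capacity, which is part of the abstract's argument for improved sample efficiency.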



Citations
Posted Content

Residual Reinforcement Learning for Robot Control.

TL;DR: This paper studies how to solve difficult control problems in the real world by decomposing them into a part that is solved efficiently by conventional feedback control methods, and the residual which is solved with RL.
Proceedings Article

Residual Reinforcement Learning for Robot Control

TL;DR: In this paper, reinforcement learning (RL) is used to learn continuous robot controllers from interactions with the environment, even for problems involving friction and contacts, by decomposing each problem into a part that conventional feedback control solves efficiently and a residual that is solved with RL.
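The residual decomposition is straightforward to sketch in code. Below is a hedged Python illustration in which a hand-written PD controller stands in for the conventional feedback part; the gains, state layout, and helper names are hypothetical, not taken from the paper.

```python
# Illustrative residual RL control law: u = u_conventional(s) + u_residual(s).
# The PD controller and state layout here are assumptions for the sketch.
import numpy as np

def pd_controller(state: np.ndarray, target: np.ndarray,
                  kp: float = 1.0, kd: float = 0.1) -> np.ndarray:
    """Conventional feedback part: PD control toward a target position."""
    n = len(target)
    pos, vel = state[:n], state[n:2 * n]
    return kp * (target - pos) - kd * vel

def residual_action(state, target, residual_policy):
    """Total command = conventional controller output + learned residual."""
    return pd_controller(state, target) + residual_policy(state)

# Example use with a placeholder residual policy (e.g., an untrained network)
state = np.zeros(6)                      # [positions (3), velocities (3)]
target = np.array([1.0, 0.0, 0.5])
u = residual_action(state, target, residual_policy=lambda s: np.zeros(3))
```

The appeal of this split is that the conventional controller provides a reasonable baseline behavior from the start, so the learned residual only has to correct for the effects (friction, contacts) the hand-designed controller handles poorly.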
Proceedings Article

Learn-to-Recover: Retrofitting UAVs with Reinforcement Learning-Assisted Flight Control Under Cyber-Physical Attacks

TL;DR: The results show that the generic fault-tolerant control strategy via reinforcement learning can effectively tolerate different types of attacks/faults and maintain the vehicle's position, outperforming the other two methods evaluated.
Posted Content

Investigating Generalisation in Continuous Deep Reinforcement Learning

TL;DR: It is shown that, if generalisation is the goal, the common practice of evaluating algorithms based on their training performance leads to wrong conclusions about algorithm choice; a new benchmark and a thorough empirical evaluation of the generalisation challenges facing state-of-the-art Deep RL methods are provided.
Journal Article

Learning Natural Locomotion Behaviors for Humanoid Robots Using Human Bias

TL;DR: This letter presents a new learning framework that leverages the knowledge from imitation learning, deep reinforcement learning, and control theories to achieve human-style locomotion that is natural, dynamic, and robust for humanoids.