scispace - formally typeset
Open AccessPosted Content

Visual Foresight: Model-Based Deep Reinforcement Learning for Vision-Based Robotic Control.

Reads0
Chats0
TLDR
It is demonstrated that visual MPC can generalize to never-before-seen objects---both rigid and deformable---and solve a range of user-defined object manipulation tasks using the same model.
Abstract
Deep reinforcement learning (RL) algorithms can learn complex robotic skills from raw sensory inputs, but have yet to achieve the kind of broad generalization and applicability demonstrated by deep learning methods in supervised domains. We present a deep RL method that is practical for real-world robotics tasks, such as robotic manipulation, and generalizes effectively to never-before-seen tasks and objects. In these settings, ground truth reward signals are typically unavailable, and we therefore propose a self-supervised model-based approach, where a predictive model learns to directly predict the future from raw sensory readings, such as camera images. At test time, we explore three distinct goal specification methods: designated pixels, where a user specifies desired object manipulation tasks by selecting particular pixels in an image and corresponding goal positions, goal images, where the desired goal state is specified with an image, and image classifiers, which define spaces of goal states. Our deep predictive models are trained using data collected autonomously and continuously by a robot interacting with hundreds of objects, without human supervision. We demonstrate that visual MPC can generalize to never-before-seen objects---both rigid and deformable---and solve a range of user-defined object manipulation tasks using the same model.

read more

Citations
More filters
Posted Content

Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems

TL;DR: This tutorial article aims to provide the reader with the conceptual tools needed to get started on research on offline reinforcement learning algorithms: reinforcementlearning algorithms that utilize previously collected data, without additional online data collection.
Posted Content

When to Trust Your Model: Model-Based Policy Optimization

TL;DR: This paper first formulate and analyze a model-based reinforcement learning algorithm with a guarantee of monotonic improvement at each step, and demonstrates that a simple procedure of using short model-generated rollouts branched from real data has the benefits of more complicated model- based algorithms without the usual pitfalls.
Proceedings Article

MOPO: Model-based Offline Policy Optimization

TL;DR: Model-based offline policy optimization (MOPO) as discussed by the authors proposes to modify the existing model-based RL methods by applying them with rewards artificially penalized by the uncertainty of the dynamics and theoretically shows that the algorithm maximizes a lower bound of the policy's return under the true MDP.
Posted Content

Learning Latent Dynamics for Planning from Pixels

TL;DR: In this article, the Deep Planning Network (PlaNet) learns the environment dynamics from images and chooses actions through fast online planning in latent space, which achieves state-of-the-art performance on continuous control tasks with contact dynamics, partial observability and sparse rewards.
Posted Content

Model-Based Reinforcement Learning for Atari

TL;DR: SimPLe as discussed by the authors is a model-based deep RL algorithm based on video prediction models, which can solve Atari games with fewer interactions than model-free methods and outperforms state-of-the-art RL algorithms by over an order of magnitude.
References
More filters
Proceedings Article

Adam: A Method for Stochastic Optimization

TL;DR: This work introduces Adam, an algorithm for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimates of lower-order moments, and provides a regret bound on the convergence rate that is comparable to the best known results under the online convex optimization framework.
Posted Content

Adam: A Method for Stochastic Optimization

TL;DR: In this article, the adaptive estimates of lower-order moments are used for first-order gradient-based optimization of stochastic objective functions, based on adaptive estimate of lowerorder moments.
Posted Content

Playing Atari with Deep Reinforcement Learning

TL;DR: This work presents the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning, which outperforms all previous approaches on six of the games and surpasses a human expert on three of them.
Proceedings Article

Model-agnostic meta-learning for fast adaptation of deep networks

TL;DR: An algorithm for meta-learning that is model-agnostic, in the sense that it is compatible with any model trained with gradient descent and applicable to a variety of different learning problems, including classification, regression, and reinforcement learning is proposed.
Posted Content

Convolutional LSTM Network: A Machine Learning Approach for Precipitation Nowcasting

TL;DR: This paper proposes the convolutional LSTM (ConvLSTM) and uses it to build an end-to-end trainable model for the precipitation nowcasting problem and shows that it captures spatiotemporal correlations better and consistently outperforms FC-L STM and the state-of-the-art operational ROVER algorithm.
Related Papers (5)