Open Access · Proceedings Article

Probabilistic Differential Dynamic Programming

TLDR
Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
Abstract
We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal trajectory in Gaussian belief spaces. Different from typical gradient-based policy search methods, PDDP does not require a policy parameterization and learns a locally optimal, time-varying control policy. We demonstrate the effectiveness and efficiency of the proposed algorithm using two nontrivial tasks. Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
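The abstract names two ingredients: a GP model of the unknown dynamics and a DDP-style backward pass around a nominal trajectory. The sketch below is not the authors' algorithm (PDDP propagates full Gaussian belief states and differentiates the GP predictions analytically); it is a minimal illustration, assuming a squared-exponential kernel, in which the GP mean dynamics are linearized by finite differences and fed into a standard iLQR-style backward pass. All function and variable names are ours.

```python
# Minimal sketch of the two ingredients PDDP combines: a GP dynamics model and a
# DDP/iLQR-style backward pass around a nominal trajectory. NOT the authors'
# implementation: PDDP propagates Gaussian belief states and differentiates the
# GP analytically; here the GP mean is linearized by finite differences and the
# predictive uncertainty is ignored, purely for illustration.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between row-wise input sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

class GPDynamics:
    """GP regression from (state, control) pairs to next-state deltas."""
    def __init__(self, X, U, X_next, noise_var=1e-4):
        self.Z = np.hstack([X, U])                    # training inputs
        self.Y = X_next - X                           # predict state change
        K = rbf_kernel(self.Z, self.Z) + noise_var * np.eye(len(self.Z))
        self.K_inv_Y = np.linalg.solve(K, self.Y)

    def predict_mean(self, x, u):
        z = np.hstack([x, u])[None, :]
        k = rbf_kernel(z, self.Z)
        return x + (k @ self.K_inv_Y).ravel()         # predictive mean of next state

def ilqr_backward_pass(gp, xs, us, Q, R, eps=1e-4):
    """One backward pass: finite-difference linearization of the GP mean dynamics,
    then the standard recursion for feedforward terms and feedback gains,
    assuming quadratic stage costs 0.5 x'Qx + 0.5 u'Ru and terminal cost 0.5 x'Qx."""
    n, m = xs.shape[1], us.shape[1]
    Vx, Vxx = Q @ xs[-1], Q
    ks, Ks = [], []
    for t in reversed(range(len(us))):
        x, u = xs[t], us[t]
        f0 = gp.predict_mean(x, u)
        A = np.column_stack([(gp.predict_mean(x + eps * np.eye(n)[i], u) - f0) / eps
                             for i in range(n)])
        B = np.column_stack([(gp.predict_mean(x, u + eps * np.eye(m)[i]) - f0) / eps
                             for i in range(m)])
        Qx  = Q @ x + A.T @ Vx
        Qu  = R @ u + B.T @ Vx
        Qxx = Q + A.T @ Vxx @ A
        Quu = R + B.T @ Vxx @ B
        Qux = B.T @ Vxx @ A
        k = -np.linalg.solve(Quu, Qu)                 # feedforward term
        K = -np.linalg.solve(Quu, Qux)                # time-varying feedback gain
        Vx  = Qx + K.T @ Quu @ k + K.T @ Qu + Qux.T @ k
        Vxx = Qxx + K.T @ Quu @ K + K.T @ Qux + Qux.T @ K
        ks.append(k); Ks.append(K)
    return ks[::-1], Ks[::-1]
```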


Citations
Posted Content

Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images

TL;DR: In this article, a deep generative model, belonging to the family of variational autoencoders, is used to generate image trajectories from a latent space in which the dynamics is constrained to be locally linear.
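As a rough illustration of the "locally linear latent dynamics" idea summarized in this TL;DR, the sketch below encodes an image into a latent state z and predicts the next latent state with matrices A(z), B(z) produced by a small network. It is not the E2C model itself (which is a variational autoencoder trained with reconstruction and KL terms); layer sizes and names are placeholders.

```python
# Illustrative sketch of a locally linear latent transition: the next latent state
# is z_{t+1} = A(z_t) z_t + B(z_t) u_t + o(z_t), with A, B, o output by a network.
# This shows only the transition structure, not the full variational model.
import torch
import torch.nn as nn

class LocallyLinearLatentDynamics(nn.Module):
    def __init__(self, image_dim=64 * 64, latent_dim=8, control_dim=2):
        super().__init__()
        self.latent_dim, self.control_dim = latent_dim, control_dim
        self.encoder = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, image_dim))
        # network that outputs the locally linear system matrices A(z), B(z), o(z)
        out = latent_dim * latent_dim + latent_dim * control_dim + latent_dim
        self.transition = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                        nn.Linear(128, out))

    def forward(self, image, control):
        z = self.encoder(image)                                # latent state z_t
        params = self.transition(z)
        n, m = self.latent_dim, self.control_dim
        A = params[:, :n * n].reshape(-1, n, n)                # A(z_t)
        B = params[:, n * n:n * n + n * m].reshape(-1, n, m)   # B(z_t)
        o = params[:, n * n + n * m:]                          # offset o(z_t)
        z_next = (A @ z.unsqueeze(-1)).squeeze(-1) \
               + (B @ control.unsqueeze(-1)).squeeze(-1) + o   # locally linear step
        return self.decoder(z_next), z_next                    # predicted next image
```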
Proceedings Article

Safe Learning of Quadrotor Dynamics Using Barrier Certificates

TL;DR: This paper presents a data-driven approach based on Gaussian processes that learns models of quadrotors operating in partially unknown environments and expands the barrier-certified safe region using an adaptive sampling scheme.
Proceedings Article

One-shot learning of manipulation skills with online dynamics adaptation and neural network priors

TL;DR: In this paper, a model-based reinforcement learning algorithm is developed that combines prior knowledge from previous tasks with online adaptation of the dynamics model, enabling highly sample-efficient learning even in regimes where estimating the true dynamics is very difficult.
Posted Content

From Pixels to Torques: Policy Learning with Deep Dynamical Models

TL;DR: In this paper, a deep dynamical model is proposed that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space.
References
Book

Gaussian Processes for Machine Learning

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
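For readers unfamiliar with the book's subject, the following minimal sketch shows GP regression in its simplest form: the predictive mean and variance at test inputs under a squared-exponential kernel with Gaussian observation noise. Hyperparameters are fixed by hand here; in practice they are learned by maximizing the marginal likelihood, as the book describes.

```python
# Minimal GP regression sketch: posterior mean and variance under a
# squared-exponential kernel with fixed hyperparameters.
import numpy as np

def sqexp(A, B, ell=1.0, sf2=1.0):
    d2 = (A[:, None] - B[None, :]) ** 2
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def gp_posterior(X, y, X_star, noise_var=0.01):
    """Return posterior mean and per-point variance at test inputs X_star."""
    K = sqexp(X, X) + noise_var * np.eye(len(X))
    K_s = sqexp(X_star, X)
    K_ss = sqexp(X_star, X_star)
    alpha = np.linalg.solve(K, y)
    mean = K_s @ alpha
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

# toy 1-D example
X = np.linspace(-3, 3, 20)
y = np.sin(X) + 0.1 * np.random.randn(20)
mu, var = gp_posterior(X, y, np.linspace(-3, 3, 100))
```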
Proceedings Article

Sparse Gaussian Processes using Pseudo-inputs

TL;DR: It is shown that this new Gaussian process (GP) regression model can match full GP performance with a small number M of pseudo-inputs, i.e. very sparse solutions, and it significantly outperforms other approaches in this regime.
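A rough sketch of the pseudo-input idea: predictions are built from M inducing inputs so the cost scales as O(NM^2) rather than O(N^3). The code below follows FITC-style predictive equations but simply subsamples the pseudo-inputs, whereas the paper optimizes their locations jointly with the kernel hyperparameters; names and defaults are ours.

```python
# Sketch of GP regression with M pseudo-inputs (inducing inputs) in the spirit of
# the FITC/SPGP approximation. Pseudo-input locations are assumed given here.
import numpy as np

def sqexp(A, B, ell=1.0, sf2=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def fitc_predict(X, y, X_pseudo, X_star, noise_var=0.01, jitter=1e-8):
    Kmm = sqexp(X_pseudo, X_pseudo) + jitter * np.eye(len(X_pseudo))
    Knm = sqexp(X, X_pseudo)
    Ksm = sqexp(X_star, X_pseudo)
    Kmm_inv = np.linalg.inv(Kmm)
    # FITC diagonal correction: exact prior variance minus Nystrom approximation
    Qnn_diag = np.sum(Knm @ Kmm_inv * Knm, axis=1)
    lam = sqexp(X, X).diagonal() - Qnn_diag + noise_var      # per-point noise
    Qm = Kmm + Knm.T @ (Knm / lam[:, None])
    Qm_inv = np.linalg.inv(Qm)
    mean = Ksm @ Qm_inv @ Knm.T @ (y / lam)
    var = (sqexp(X_star, X_star).diagonal()
           - np.sum(Ksm @ (Kmm_inv - Qm_inv) * Ksm, axis=1)
           + noise_var)
    return mean, var
```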
Proceedings Article

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

TL;DR: PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.
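The key point of this TL;DR is that model uncertainty is carried into long-term planning rather than relying on a point estimate of the dynamics. PILCO does this with analytic moment matching of Gaussian state distributions; the sketch below uses a Monte Carlo particle rollout as a simpler stand-in for the same idea. The `model`, `policy`, and `cost` callables are assumptions, not part of PILCO's interface.

```python
# Evaluate a policy under a probabilistic dynamics model so that model
# uncertainty feeds into the long-term cost. `model(X, U)` is assumed to return
# the predictive mean and variance of the next states for a batch of particles.
import numpy as np

def expected_cost(model, policy, cost, x0, horizon, n_particles=100, rng=None):
    rng = np.random.default_rng(rng)
    X = np.tile(x0, (n_particles, 1))               # particle set approximating p(x_t)
    total = 0.0
    for _ in range(horizon):
        U = np.array([policy(x) for x in X])
        mean, var = model(X, U)                     # predictive distribution of x_{t+1}
        X = mean + np.sqrt(var) * rng.standard_normal(mean.shape)  # sample next states
        total += np.mean([cost(x) for x in X])      # Monte Carlo expected stage cost
    return total
```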
Journal Article

Sparse on-line Gaussian processes

TL;DR: An approach for sparse representations of Gaussian process (GP) models (which are Bayesian kernel machines) is developed to overcome their limitations for large data sets, based on combining a Bayesian on-line algorithm with the sequential construction of a relevant subsample of the data that fully specifies the prediction of the GP model.
Proceedings Article

A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems

TL;DR: Todorov et al. present an iterative linear-quadratic-Gaussian (iLQG) method for locally optimal feedback control of nonlinear stochastic systems subject to control constraints.
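As a minimal illustration of the ingredients named in this TL;DR (time-varying locally optimal feedback under control constraints), the sketch below runs a finite-horizon Riccati recursion for given local linearizations and handles constraints by simply clamping the controls in the forward pass. The actual iLQG method treats constraints and noise more carefully; everything here is an assumed, simplified stand-in.

```python
# Backward Riccati recursion for given local linearizations A_t, B_t and quadratic
# costs, followed by a forward rollout with element-wise control clamping.
import numpy as np

def lqg_gains(A_list, B_list, Q, R, Qf):
    """Gains K_t for u_t = -K_t x_t under cost sum(x'Qx + u'Ru) + terminal x'Qf x."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

def rollout_clamped(A_list, B_list, gains, x0, u_min, u_max):
    """Apply the feedback law forward in time, clamping controls to their bounds."""
    x, xs, us = x0, [x0], []
    for A, B, K in zip(A_list, B_list, gains):
        u = np.clip(-K @ x, u_min, u_max)
        x = A @ x + B @ u
        xs.append(x); us.append(u)
    return np.array(xs), np.array(us)
```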