Open Access · Proceedings Article

Probabilistic Differential Dynamic Programming

TLDR
Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
Abstract
We present a data-driven, probabilistic trajectory optimization framework for systems with unknown dynamics, called Probabilistic Differential Dynamic Programming (PDDP). PDDP takes into account uncertainty explicitly for dynamics models using Gaussian processes (GPs). Based on the second-order local approximation of the value function, PDDP performs Dynamic Programming around a nominal trajectory in Gaussian belief spaces. Different from typical gradient-based policy search methods, PDDP does not require a policy parameterization and learns a locally optimal, time-varying control policy. We demonstrate the effectiveness and efficiency of the proposed algorithm using two nontrivial tasks. Compared with the classical DDP and a state-of-the-art GP-based policy search method, PDDP offers a superior combination of data-efficiency, learning speed, and applicability.
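The abstract names two ingredients: a GP model of the unknown dynamics and a DDP-style backward pass around a nominal trajectory. The sketch below is not the authors' algorithm (PDDP propagates full Gaussian belief states and differentiates the GP predictions analytically); it is a minimal illustration, assuming a squared-exponential kernel, in which the GP mean dynamics are linearized by finite differences and fed into a standard iLQR-style backward pass. All function and variable names are ours.

```python
# Minimal sketch of the two ingredients PDDP combines: a GP dynamics model and a
# DDP/iLQR-style backward pass around a nominal trajectory. NOT the authors'
# implementation: PDDP propagates Gaussian belief states and differentiates the
# GP analytically; here the GP mean is linearized by finite differences and the
# predictive uncertainty is ignored, purely for illustration.
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0, signal_var=1.0):
    """Squared-exponential kernel between row-wise input sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return signal_var * np.exp(-0.5 * d2 / lengthscale**2)

class GPDynamics:
    """GP regression from (state, control) pairs to next-state deltas."""
    def __init__(self, X, U, X_next, noise_var=1e-4):
        self.Z = np.hstack([X, U])                    # training inputs
        self.Y = X_next - X                           # predict state change
        K = rbf_kernel(self.Z, self.Z) + noise_var * np.eye(len(self.Z))
        self.K_inv_Y = np.linalg.solve(K, self.Y)

    def predict_mean(self, x, u):
        z = np.hstack([x, u])[None, :]
        k = rbf_kernel(z, self.Z)
        return x + (k @ self.K_inv_Y).ravel()         # predictive mean of next state

def ilqr_backward_pass(gp, xs, us, Q, R, eps=1e-4):
    """One backward pass: finite-difference linearization of the GP mean dynamics,
    then the standard recursion for feedforward terms and feedback gains,
    assuming quadratic stage costs 0.5 x'Qx + 0.5 u'Ru and terminal cost 0.5 x'Qx."""
    n, m = xs.shape[1], us.shape[1]
    Vx, Vxx = Q @ xs[-1], Q
    ks, Ks = [], []
    for t in reversed(range(len(us))):
        x, u = xs[t], us[t]
        f0 = gp.predict_mean(x, u)
        A = np.column_stack([(gp.predict_mean(x + eps * np.eye(n)[i], u) - f0) / eps
                             for i in range(n)])
        B = np.column_stack([(gp.predict_mean(x, u + eps * np.eye(m)[i]) - f0) / eps
                             for i in range(m)])
        Qx  = Q @ x + A.T @ Vx
        Qu  = R @ u + B.T @ Vx
        Qxx = Q + A.T @ Vxx @ A
        Quu = R + B.T @ Vxx @ B
        Qux = B.T @ Vxx @ A
        k = -np.linalg.solve(Quu, Qu)                 # feedforward term
        K = -np.linalg.solve(Quu, Qux)                # time-varying feedback gain
        Vx  = Qx + K.T @ Quu @ k + K.T @ Qu + Qux.T @ k
        Vxx = Qxx + K.T @ Quu @ K + K.T @ Qux + Qux.T @ K
        ks.append(k); Ks.append(K)
    return ks[::-1], Ks[::-1]
```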


Citations
Posted Content

Embed to Control: A Locally Linear Latent Dynamics Model for Control from Raw Images

TL;DR: In this article, a deep generative model, belonging to the family of variational autoencoders, is used to generate image trajectories from a latent space in which the dynamics is constrained to be locally linear.
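As a rough illustration of the "locally linear latent dynamics" idea summarized in this TL;DR, the sketch below encodes an image into a latent state z and predicts the next latent state with matrices A(z), B(z) produced by a small network. It is not the E2C model itself (which is a variational autoencoder trained with reconstruction and KL terms); layer sizes and names are placeholders.

```python
# Illustrative sketch of a locally linear latent transition: the next latent state
# is z_{t+1} = A(z_t) z_t + B(z_t) u_t + o(z_t), with A, B, o output by a network.
# This shows only the transition structure, not the full variational model.
import torch
import torch.nn as nn

class LocallyLinearLatentDynamics(nn.Module):
    def __init__(self, image_dim=64 * 64, latent_dim=8, control_dim=2):
        super().__init__()
        self.latent_dim, self.control_dim = latent_dim, control_dim
        self.encoder = nn.Sequential(nn.Linear(image_dim, 256), nn.ReLU(),
                                     nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                     nn.Linear(256, image_dim))
        # network that outputs the locally linear system matrices A(z), B(z), o(z)
        out = latent_dim * latent_dim + latent_dim * control_dim + latent_dim
        self.transition = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                                        nn.Linear(128, out))

    def forward(self, image, control):
        z = self.encoder(image)                                # latent state z_t
        params = self.transition(z)
        n, m = self.latent_dim, self.control_dim
        A = params[:, :n * n].reshape(-1, n, n)                # A(z_t)
        B = params[:, n * n:n * n + n * m].reshape(-1, n, m)   # B(z_t)
        o = params[:, n * n + n * m:]                          # offset o(z_t)
        z_next = (A @ z.unsqueeze(-1)).squeeze(-1) \
               + (B @ control.unsqueeze(-1)).squeeze(-1) + o   # locally linear step
        return self.decoder(z_next), z_next                    # predicted next image
```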
Proceedings Article

Safe Learning of Quadrotor Dynamics Using Barrier Certificates

TL;DR: This paper presents a data-driven approach based on Gaussian processes that learns models of quadrotors operating in partially unknown environments and expands the barrier-certified safe region using an adaptive sampling scheme.
Proceedings Article

One-shot learning of manipulation skills with online dynamics adaptation and neural network priors

TL;DR: In this paper, a model-based reinforcement learning algorithm is developed that combines prior knowledge from previous tasks with online adaptation of the dynamics model, enabling highly sample-efficient learning even in regimes where estimating the true dynamics is very difficult.
Posted Content

From Pixels to Torques: Policy Learning with Deep Dynamical Models

TL;DR: In this paper, a deep dynamical model is proposed that uses deep auto-encoders to learn a low-dimensional embedding of images jointly with a predictive model in this low-dimensional feature space.
References
Book

Gaussian Processes for Machine Learning

TL;DR: The treatment is comprehensive and self-contained, targeted at researchers and students in machine learning and applied statistics, and deals with the supervised learning problem for both regression and classification.
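For readers unfamiliar with the book's subject, the following minimal sketch shows GP regression in its simplest form: the predictive mean and variance at test inputs under a squared-exponential kernel with Gaussian observation noise. Hyperparameters are fixed by hand here; in practice they are learned by maximizing the marginal likelihood, as the book describes.

```python
# Minimal GP regression sketch: posterior mean and variance under a
# squared-exponential kernel with fixed hyperparameters.
import numpy as np

def sqexp(A, B, ell=1.0, sf2=1.0):
    d2 = (A[:, None] - B[None, :]) ** 2
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def gp_posterior(X, y, X_star, noise_var=0.01):
    """Return posterior mean and per-point variance at test inputs X_star."""
    K = sqexp(X, X) + noise_var * np.eye(len(X))
    K_s = sqexp(X_star, X)
    K_ss = sqexp(X_star, X_star)
    alpha = np.linalg.solve(K, y)
    mean = K_s @ alpha
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, np.diag(cov)

# toy 1-D example
X = np.linspace(-3, 3, 20)
y = np.sin(X) + 0.1 * np.random.randn(20)
mu, var = gp_posterior(X, y, np.linspace(-3, 3, 100))
```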
Proceedings Article

Sparse Gaussian Processes using Pseudo-inputs

TL;DR: It is shown that this new Gaussian process (GP) regression model can match full GP performance with a small number M of pseudo-inputs, i.e. very sparse solutions, and it significantly outperforms other approaches in this regime.
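A rough sketch of the pseudo-input idea: predictions are built from M inducing inputs so the cost scales as O(NM^2) rather than O(N^3). The code below follows FITC-style predictive equations but simply subsamples the pseudo-inputs, whereas the paper optimizes their locations jointly with the kernel hyperparameters; names and defaults are ours.

```python
# Sketch of GP regression with M pseudo-inputs (inducing inputs) in the spirit of
# the FITC/SPGP approximation. Pseudo-input locations are assumed given here.
import numpy as np

def sqexp(A, B, ell=1.0, sf2=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return sf2 * np.exp(-0.5 * d2 / ell**2)

def fitc_predict(X, y, X_pseudo, X_star, noise_var=0.01, jitter=1e-8):
    Kmm = sqexp(X_pseudo, X_pseudo) + jitter * np.eye(len(X_pseudo))
    Knm = sqexp(X, X_pseudo)
    Ksm = sqexp(X_star, X_pseudo)
    Kmm_inv = np.linalg.inv(Kmm)
    # FITC diagonal correction: exact prior variance minus Nystrom approximation
    Qnn_diag = np.sum(Knm @ Kmm_inv * Knm, axis=1)
    lam = sqexp(X, X).diagonal() - Qnn_diag + noise_var      # per-point noise
    Qm = Kmm + Knm.T @ (Knm / lam[:, None])
    Qm_inv = np.linalg.inv(Qm)
    mean = Ksm @ Qm_inv @ Knm.T @ (y / lam)
    var = (sqexp(X_star, X_star).diagonal()
           - np.sum(Ksm @ (Kmm_inv - Qm_inv) * Ksm, axis=1)
           + noise_var)
    return mean, var
```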
Proceedings Article

PILCO: A Model-Based and Data-Efficient Approach to Policy Search

TL;DR: PILCO reduces model bias, one of the key problems of model-based reinforcement learning, in a principled way by learning a probabilistic dynamics model and explicitly incorporating model uncertainty into long-term planning.
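The key point of this TL;DR is that model uncertainty is carried into long-term planning rather than relying on a point estimate of the dynamics. PILCO does this with analytic moment matching of Gaussian state distributions; the sketch below uses a Monte Carlo particle rollout as a simpler stand-in for the same idea. The `model`, `policy`, and `cost` callables are assumptions, not part of PILCO's interface.

```python
# Evaluate a policy under a probabilistic dynamics model so that model
# uncertainty feeds into the long-term cost. `model(X, U)` is assumed to return
# the predictive mean and variance of the next states for a batch of particles.
import numpy as np

def expected_cost(model, policy, cost, x0, horizon, n_particles=100, rng=None):
    rng = np.random.default_rng(rng)
    X = np.tile(x0, (n_particles, 1))               # particle set approximating p(x_t)
    total = 0.0
    for _ in range(horizon):
        U = np.array([policy(x) for x in X])
        mean, var = model(X, U)                     # predictive distribution of x_{t+1}
        X = mean + np.sqrt(var) * rng.standard_normal(mean.shape)  # sample next states
        total += np.mean([cost(x) for x in X])      # Monte Carlo expected stage cost
    return total
```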
Journal Article

Sparse on-line Gaussian processes

TL;DR: An approach for sparse representations of Gaussian process (GP) models (which are Bayesian kernel machines) is developed to overcome their limitations for large data sets, based on combining a Bayesian on-line algorithm with the sequential construction of a relevant subsample of the data that fully specifies the prediction of the GP model.
Proceedings Article

A generalized iterative LQG method for locally-optimal feedback control of constrained nonlinear stochastic systems

TL;DR: Todorov et al. present an iterative linear-quadratic-Gaussian (iLQG) method for locally optimal feedback control of nonlinear stochastic systems subject to control constraints.
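As a minimal illustration of the ingredients named in this TL;DR (time-varying locally optimal feedback under control constraints), the sketch below runs a finite-horizon Riccati recursion for given local linearizations and handles constraints by simply clamping the controls in the forward pass. The actual iLQG method treats constraints and noise more carefully; everything here is an assumed, simplified stand-in.

```python
# Backward Riccati recursion for given local linearizations A_t, B_t and quadratic
# costs, followed by a forward rollout with element-wise control clamping.
import numpy as np

def lqg_gains(A_list, B_list, Q, R, Qf):
    """Gains K_t for u_t = -K_t x_t under cost sum(x'Qx + u'Ru) + terminal x'Qf x."""
    P = Qf
    gains = []
    for A, B in zip(reversed(A_list), reversed(B_list)):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P = Q + A.T @ P @ (A - B @ K)
        gains.append(K)
    return gains[::-1]

def rollout_clamped(A_list, B_list, gains, x0, u_min, u_max):
    """Apply the feedback law forward in time, clamping controls to their bounds."""
    x, xs, us = x0, [x0], []
    for A, B, K in zip(A_list, B_list, gains):
        u = np.clip(-K @ x, u_min, u_max)
        x = A @ x + B @ u
        xs.append(x); us.append(u)
    return np.array(xs), np.array(us)
```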