Author

Angela P. Schoellig

Other affiliations: Max Planck Society, ETH Zurich, Aarhus University
Bio: Angela P. Schoellig is an academic researcher from the University of Toronto. The author has contributed to research in the topics of Control theory and Computer science, has an h-index of 26, and has co-authored 148 publications receiving 3692 citations. Previous affiliations of Angela P. Schoellig include the Max Planck Society and ETH Zurich.

Papers published on a yearly basis

Papers
Proceedings Article
01 Jan 2017
TL;DR: In this paper, the authors present a learning algorithm that explicitly considers safety, defined in terms of stability guarantees, and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates.
Abstract: Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety, defined in terms of stability guarantees. Specifically, we extend control-theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.
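
The core certification step admits a compact sketch: a state is certified safe only if, with high probability under the Gaussian process model, the Lyapunov function decreases along the predicted dynamics. The Python sketch below is illustrative only; the function names, the state-space discretization, and the Lipschitz-scaled confidence margin are assumptions standing in for the paper's construction, not the authors' implementation.

```python
import numpy as np

def certify_safe_set(states, V, gp_mean, gp_std, lipschitz_V,
                     beta=2.0, c_max=1.0):
    """Illustrative safe-set certification (hypothetical names throughout).

    states      : (N, d) array of discretized states
    V           : Lyapunov candidate, V(x) -> scalar >= 0
    gp_mean     : GP dynamics model mean, x -> predicted next state
    gp_std      : GP predictive standard deviation, x -> scalar
    lipschitz_V : Lipschitz constant of V, used to bound the GP error's
                  effect on V
    beta        : confidence scaling (larger = more cautious)
    c_max       : level-set bound restricting the candidate region
    """
    safe = np.zeros(len(states), dtype=bool)
    for i, x in enumerate(states):
        v_now = V(x)
        # High-confidence upper bound on V at the predicted next state:
        # mean prediction plus a Lipschitz-scaled uncertainty margin.
        v_next_ucb = V(gp_mean(x)) + lipschitz_V * beta * gp_std(x)
        # Certified safe only if V provably decreases inside the level set.
        safe[i] = (v_now <= c_max) and (v_next_ucb < v_now)
    return safe
```

As more data is collected, the GP uncertainty gp_std shrinks, the confidence margin tightens, and the certified region can grow, which mirrors the paper's claim that safe data collection expands the safe region of the state space.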

453 citations

Proceedings ArticleDOI
24 Dec 2012
TL;DR: An algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds using sequential convex programming that approximates non-convex constraints by using convex ones is presented.
Abstract: This paper presents an algorithm that generates collision-free trajectories in three dimensions for multiple vehicles within seconds. The problem is cast as a non-convex optimization problem, which is iteratively solved using sequential convex programming that approximates non-convex constraints by using convex ones. The method generates trajectories that account for simple dynamics constraints and is thus independent of the vehicle's type. An extensive a posteriori vehicle-specific feasibility check is included in the algorithm. The algorithm is applied to a quadrocopter fleet. Experimental results are shown.
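
The convexification at the heart of the method replaces the non-convex pairwise separation constraint ||p_i - p_j|| >= R with its linearization around the previous iterate, which is affine and hence convex. The sketch below shows one such iteration for two vehicles in 3D using cvxpy; the horizon, the smoothness cost standing in for the simple dynamics constraints, and all names are illustrative assumptions, not the paper's formulation.

```python
import numpy as np
import cvxpy as cp

def scp_iteration(p1_prev, p2_prev, start1, goal1, start2, goal2, R=0.5):
    """One convexified trajectory problem, linearized around the
    previous iterate (hypothetical sketch, two vehicles only)."""
    T = p1_prev.shape[0]
    p1 = cp.Variable((T, 3))          # vehicle 1 positions over the horizon
    p2 = cp.Variable((T, 3))          # vehicle 2 positions over the horizon
    # Smoothness surrogate for simple dynamics: penalize step-to-step motion.
    cost = (cp.sum_squares(cp.diff(p1, axis=0))
            + cp.sum_squares(cp.diff(p2, axis=0)))
    cons = [p1[0] == start1, p1[-1] == goal1,
            p2[0] == start2, p2[-1] == goal2]
    for t in range(T):
        d = p1_prev[t] - p2_prev[t]
        n = d / np.linalg.norm(d)     # separating direction from last iterate
        # Linearized (affine, hence convex) stand-in for ||p1 - p2|| >= R.
        cons.append(n @ (p1[t] - p2[t]) >= R)
    cp.Problem(cp.Minimize(cost), cons).solve()
    return p1.value, p2.value
```

The full algorithm re-solves this problem, re-linearizing the separation constraints around each new solution until the trajectories converge, and then runs the vehicle-specific feasibility check described in the abstract.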

301 citations

Proceedings ArticleDOI
16 May 2016
TL;DR: In this paper, a recently developed safe optimization algorithm, SafeOpt, is applied for the first time to automatic controller parameter tuning. Starting from an initial, low-performance controller, the method models the underlying performance measure as a Gaussian process and explores only new controller parameters whose performance lies above a safe performance threshold with high probability.
Abstract: One of the most fundamental problems when designing controllers for dynamic systems is the tuning of the controller parameters. Typically, a model of the system is used to obtain an initial controller, but ultimately the controller parameters must be tuned manually on the real system to achieve the best performance. To avoid this manual tuning step, methods from machine learning, such as Bayesian optimization, have been used. However, as these methods evaluate different controller parameters on the real system, safety-critical system failures may happen. In this paper, we overcome this problem by applying, for the first time, a recently developed safe optimization algorithm, SafeOpt, to the problem of automatic controller parameter tuning. Given an initial, low-performance controller, SafeOpt automatically optimizes the parameters of a control law while guaranteeing safety. It models the underlying performance measure as a Gaussian process and only explores new controller parameters whose performance lies above a safe performance threshold with high probability. Experimental results on a quadrotor vehicle indicate that the proposed method enables fast, automatic, and safe optimization of controller parameters without human intervention.
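
The safe-exploration rule is: model the performance measure as a Gaussian process and evaluate only parameters whose GP lower confidence bound stays above the safety threshold. The loop below is a simplified sketch of that rule built on scikit-learn rather than the authors' SafeOpt implementation; it omits SafeOpt's distinction between expander and maximizer candidates, and the threshold, confidence scaling, and candidate grid are illustrative assumptions.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def safe_tune(evaluate, candidates, theta0, J_min, beta=2.0, n_iters=20):
    """Simplified safe tuning loop (hypothetical sketch, not SafeOpt).

    evaluate   : runs the controller and returns a performance measure
    candidates : (N, d) grid of candidate controller parameters
    theta0     : initial parameters known to be safe
    J_min      : minimum acceptable (safe) performance
    """
    X, y = [theta0], [evaluate(theta0)]        # start from a safe controller
    gp = GaussianProcessRegressor()
    for _ in range(n_iters):
        gp.fit(np.array(X), np.array(y))
        mu, sigma = gp.predict(candidates, return_std=True)
        lower, upper = mu - beta * sigma, mu + beta * sigma
        safe = lower >= J_min                  # safe with high probability
        if not safe.any():
            break
        # Among the safe parameters, evaluate the most promising one.
        idx = int(np.argmax(np.where(safe, upper, -np.inf)))
        theta = candidates[idx]
        X.append(theta)
        y.append(evaluate(theta))
    return X[int(np.argmax(y))]                # best safely evaluated params
```

Because every evaluated parameter set already carried a high-probability safety certificate before it was tried on the vehicle, the tuning proceeds without human intervention and without risking the kind of failures described in the abstract.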

267 citations

Posted Content
TL;DR: This paper presents a learning algorithm that explicitly considers safety, defined in terms of stability guarantees, and extends control-theoretic results on Lyapunov stability verification and shows how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates.
Abstract: Reinforcement learning is a powerful paradigm for learning optimal policies from experimental data. However, to find optimal policies, most reinforcement learning algorithms explore all possible actions, which may be harmful for real-world systems. As a consequence, learning algorithms are rarely applied on safety-critical systems in the real world. In this paper, we present a learning algorithm that explicitly considers safety, defined in terms of stability guarantees. Specifically, we extend control-theoretic results on Lyapunov stability verification and show how to use statistical models of the dynamics to obtain high-performance control policies with provable stability certificates. Moreover, under additional regularity assumptions in terms of a Gaussian process prior, we prove that one can effectively and safely collect data in order to learn about the dynamics and thus both improve control performance and expand the safe region of the state space. In our experiments, we show how the resulting algorithm can safely optimize a neural network policy on a simulated inverted pendulum, without the pendulum ever falling down.

263 citations

Journal ArticleDOI
TL;DR: The architecture of the Arena is described with a focus on system robustness and on its capability as a dual-purpose research and demonstration platform.

214 citations


Cited by
Christopher M. Bishop
01 Jan 2006
TL;DR: This book covers probability distributions, linear models for regression and classification, neural networks, kernel methods, graphical models, mixture models and EM, approximate inference, sampling methods, continuous latent variables, sequential data, and combining models in the context of machine learning.
Abstract: Probability Distributions.- Linear Models for Regression.- Linear Models for Classification.- Neural Networks.- Kernel Methods.- Sparse Kernel Machines.- Graphical Models.- Mixture Models and EM.- Approximate Inference.- Sampling Methods.- Continuous Latent Variables.- Sequential Data.- Combining Models.

10,141 citations

01 Jan 2016
Table of Integrals, Series, and Products

4,085 citations

Book ChapterDOI
11 Dec 2012

1,704 citations