scispace - formally typeset
Search or ask a question
Author

Jonas Buchli

Bio: Jonas Buchli is an academic researcher from ETH Zurich. The author has contributed to research in topics: Robot & Optimal control. The author has an hindex of 46, co-authored 164 publications receiving 7375 citations. Previous affiliations of Jonas Buchli include Istituto Italiano di Tecnologia & University of Southern California.


Papers
More filters
Journal Article
TL;DR: The framework of stochastic optimal control with path integrals is used to derive a novel approach to RL with parameterized policies to demonstrate interesting similarities with previous RL research in the framework of probability matching and provides intuition why the slightly heuristically motivated probability matching approach can actually perform well.
Abstract: With the goal to generate more scalable algorithms with higher efficiency and fewer open parameters, reinforcement learning (RL) has recently moved towards combining classical techniques from optimal control and dynamic programming with modern learning techniques from statistical estimation theory. In this vein, this paper suggests to use the framework of stochastic optimal control with path integrals to derive a novel approach to RL with parameterized policies. While solidly grounded in value function estimation and optimal control based on the stochastic Hamilton-Jacobi-Bellman (HJB) equations, policy improvements can be transformed into an approximation problem of a path integral which has no open algorithmic parameters other than the exploration noise. The resulting algorithm can be conceived of as model-based, semi-model-based, or even model free, depending on how the learning problem is structured. The update equations have no danger of numerical instabilities as neither matrix inversions nor gradient learning rates are required. Our new algorithm demonstrates interesting similarities with previous RL research in the framework of probability matching and provides intuition why the slightly heuristically motivated probability matching approach can actually perform well. Empirical evaluations demonstrate significant performance improvements over gradient-based policy learning and scalability to high-dimensional control problems. Finally, a learning experiment on a simulated 12 degree-of-freedom robot dog illustrates the functionality of our algorithm in a complex robot learning scenario. We believe that Policy Improvement with Path Integrals (PI2) offers currently one of the most efficient, numerically robust, and easy to implement algorithms for RL based on trajectory roll-outs.

520 citations

01 Jan 2010
TL;DR: In this paper, the authors correct a mistake in the derivation of the generalized path integral control in lemma 2 and show that the term b in equation (20) should not appear at all.
Abstract: In this erratum we correct a mistake in the derivation of the generalized path integral control in lemma 2. More precisely, we show that the term b in equation (20) should not appear at all. This mistake does not affect any of the results presented in this paper, as the b term always dropped out in all of our applications. The changes are:1 1. Equation Z (τ i) = S (τ i) + λ(N−i)l 2 log (2πdt), in page (3144) should change to:

430 citations

Journal ArticleDOI
31 Oct 2016
TL;DR: In this paper, the authors review the methods of digital fabrication with concrete, including 3D printing, under the encompassing term of digital concrete, identifying major challenges for concrete technology within this field.
Abstract: Digital fabrication has been termed the “third industrial revolution” in recent years, and promises to revolutionize the construction industry with the potential of freeform architecture, less material waste, reduced construction costs, and increased worker safety Digital fabrication techniques and cementitious materials have only intersected in a significant way within recent years In this letter, we review the methods of digital fabrication with concrete, including 3D printing, under the encompassing term “digital concrete”, identifying major challenges for concrete technology within this field We additionally provide an analysis of layered extrusion, the most popular digital fabrication technique in concrete technology, identifying the importance of hydration control in its implementation

413 citations

Journal ArticleDOI
06 Feb 2018
TL;DR: A single trajectory optimization formulation for legged locomotion that automatically determines the gait sequence, step timings, footholds, swing-leg motions, and six-dimensional body motion over nonflat terrain, without any additional modules is presented.
Abstract: We present a single trajectory optimization formulation for legged locomotion that automatically determines the gait sequence, step timings, footholds, swing-leg motions, and six-dimensional body motion over nonflat terrain, without any additional modules. Our phase-based parameterization of feet motion and forces allows to optimize over the discrete gait sequence using only continuous decision variables. The system is represented using a simplified centroidal dynamics model that is influenced by the feet's location and forces. We explicitly enforce friction cone constraints, depending on the shape of the terrain. The nonlinear programming problem solver generates highly dynamic motion plans with full flight phases for a variety of legged systems with arbitrary morphologies in an efficient manner. We validate the feasibility of the generated plans in simulation and on the real quadruped robot ANYmal. Additionally, the entire solver software TOWR , which used to generate these motions is made freely available.

381 citations

Journal ArticleDOI
TL;DR: A learning rule for oscillators which adapts their frequency to the frequency of any periodic or pseudo-periodic input signal, which is easily generalizable to a large class of oscillators, from phase oscillators to relaxation oscillators and strange attractors with a generic learning rule.

344 citations


Cited by
More filters
Journal ArticleDOI
01 Apr 1988-Nature
TL;DR: In this paper, a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) is presented.
Abstract: Deposits of clastic carbonate-dominated (calciclastic) sedimentary slope systems in the rock record have been identified mostly as linearly-consistent carbonate apron deposits, even though most ancient clastic carbonate slope deposits fit the submarine fan systems better. Calciclastic submarine fans are consequently rarely described and are poorly understood. Subsequently, very little is known especially in mud-dominated calciclastic submarine fan systems. Presented in this study are a sedimentological core and petrographic characterisation of samples from eleven boreholes from the Lower Carboniferous of Bowland Basin (Northwest England) that reveals a >250 m thick calciturbidite complex deposited in a calciclastic submarine fan setting. Seven facies are recognised from core and thin section characterisation and are grouped into three carbonate turbidite sequences. They include: 1) Calciturbidites, comprising mostly of highto low-density, wavy-laminated bioclast-rich facies; 2) low-density densite mudstones which are characterised by planar laminated and unlaminated muddominated facies; and 3) Calcidebrites which are muddy or hyper-concentrated debrisflow deposits occurring as poorly-sorted, chaotic, mud-supported floatstones. These

9,929 citations

Journal ArticleDOI
06 Jun 1986-JAMA
TL;DR: The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or her own research.
Abstract: I have developed "tennis elbow" from lugging this book around the past four weeks, but it is worth the pain, the effort, and the aspirin. It is also worth the (relatively speaking) bargain price. Including appendixes, this book contains 894 pages of text. The entire panorama of the neural sciences is surveyed and examined, and it is comprehensive in its scope, from genomes to social behaviors. The editors explicitly state that the book is designed as "an introductory text for students of biology, behavior, and medicine," but it is hard to imagine any audience, interested in any fragment of neuroscience at any level of sophistication, that would not enjoy this book. The editors have done a masterful job of weaving together the biologic, the behavioral, and the clinical sciences into a single tapestry in which everyone from the molecular biologist to the practicing psychiatrist can find and appreciate his or

7,563 citations

Journal ArticleDOI
TL;DR: In this paper, the authors describe the rules of the ring, the ring population, and the need to get off the ring in order to measure the movement of a cyclic clock.
Abstract: 1980 Preface * 1999 Preface * 1999 Acknowledgements * Introduction * 1 Circular Logic * 2 Phase Singularities (Screwy Results of Circular Logic) * 3 The Rules of the Ring * 4 Ring Populations * 5 Getting Off the Ring * 6 Attracting Cycles and Isochrons * 7 Measuring the Trajectories of a Circadian Clock * 8 Populations of Attractor Cycle Oscillators * 9 Excitable Kinetics and Excitable Media * 10 The Varieties of Phaseless Experience: In Which the Geometrical Orderliness of Rhythmic Organization Breaks Down in Diverse Ways * 11 The Firefly Machine 12 Energy Metabolism in Cells * 13 The Malonic Acid Reagent ('Sodium Geometrate') * 14 Electrical Rhythmicity and Excitability in Cell Membranes * 15 The Aggregation of Slime Mold Amoebae * 16 Numerical Organizing Centers * 17 Electrical Singular Filaments in the Heart Wall * 18 Pattern Formation in the Fungi * 19 Circadian Rhythms in General * 20 The Circadian Clocks of Insect Eclosion * 21 The Flower of Kalanchoe * 22 The Cell Mitotic Cycle * 23 The Female Cycle * References * Index of Names * Index of Subjects

3,424 citations

Book ChapterDOI
01 Jan 1998
TL;DR: In this paper, the authors explore questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties, using diffusion processes as a model of a Markov process with continuous sample paths.
Abstract: We explore in this chapter questions of existence and uniqueness for solutions to stochastic differential equations and offer a study of their properties. This endeavor is really a study of diffusion processes. Loosely speaking, the term diffusion is attributed to a Markov process which has continuous sample paths and can be characterized in terms of its infinitesimal generator.

2,446 citations

Journal ArticleDOI
TL;DR: This article attempts to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots by highlighting both key challenges in robot reinforcement learning as well as notable successes.
Abstract: Reinforcement learning offers to robotics a framework and set of tools for the design of sophisticated and hard-to-engineer behaviors. Conversely, the challenges of robotic problems provide both inspiration, impact, and validation for developments in reinforcement learning. The relationship between disciplines has sufficient promise to be likened to that between physics and mathematics. In this article, we attempt to strengthen the links between the two research communities by providing a survey of work in reinforcement learning for behavior generation in robots. We highlight both key challenges in robot reinforcement learning as well as notable successes. We discuss how contributions tamed the complexity of the domain and study the role of algorithms, representations, and prior knowledge in achieving these successes. As a result, a particular focus of our paper lies on the choice between model-based and model-free as well as between value-function-based and policy-search methods. By analyzing a simple problem in some detail we demonstrate how reinforcement learning approaches may be profitably applied, and we note throughout open questions and the tremendous potential for future research.

2,391 citations