Author

Gabriel A. D. Lopes

Bio: Gabriel A. D. Lopes is an academic researcher from Delft University of Technology. The author has contributed to research in topics: Reinforcement learning & Visual servoing. The author has an h-index of 15, co-authored 57 publications receiving 1366 citations. Previous affiliations of Gabriel A. D. Lopes include Instituto Superior Técnico & University of Michigan.


Papers
Journal Article
01 Nov 2012
TL;DR: The workings of the natural gradient, which has made its way into many actor-critic algorithms over the past few years, are described, and a review of several standard and natural actor-critic algorithms is given.
Abstract: Policy-gradient-based actor-critic algorithms are amongst the most popular algorithms in the reinforcement learning framework. Their advantage of being able to search for optimal policies using low-variance gradient estimates has made them useful in several real-life applications, such as robotics, power control, and finance. Although general surveys on reinforcement learning techniques already exist, no survey is specifically dedicated to actor-critic algorithms. This paper, therefore, describes the state of the art of actor-critic algorithms, with a focus on methods that can work in an online setting and use function approximation in order to deal with continuous state and action spaces. After starting with a discussion on the concepts of reinforcement learning and the origins of actor-critic algorithms, this paper describes the workings of the natural gradient, which has made its way into many actor-critic algorithms over the past few years. A review of several standard and natural actor-critic algorithms is given, and the paper concludes with an overview of application areas and a discussion on open issues.
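To make the surveyed idea concrete, below is a minimal one-step actor-critic sketch in Python: a tabular softmax actor updated along the (vanilla, non-natural) policy gradient and a TD(0) critic, run on a hypothetical chain MDP. The environment, step sizes, and episode handling are illustrative assumptions, not details from the paper.

```python
import numpy as np

n_states, n_actions = 5, 2
theta = np.zeros((n_states, n_actions))  # actor: softmax policy parameters
w = np.zeros(n_states)                   # critic: state-value estimates
alpha, beta, gamma = 0.05, 0.2, 0.95     # actor step, critic step, discount

def policy(s):
    prefs = theta[s] - theta[s].max()    # numerically stabilized softmax
    p = np.exp(prefs)
    return p / p.sum()

def step(s, a):
    # Hypothetical chain MDP: action 1 moves right, action 0 moves left;
    # reward 1 on reaching the right end, which ends the episode.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1), s2 == n_states - 1

rng = np.random.default_rng(0)
s = 0
for _ in range(20000):
    p = policy(s)
    a = int(rng.choice(n_actions, p=p))
    s2, r, done = step(s, a)
    delta = r + (0.0 if done else gamma * w[s2]) - w[s]  # TD error (critic)
    w[s] += beta * delta                                 # critic update, TD(0)
    grad_log = -p                                        # grad of log pi(a|s)
    grad_log[a] += 1.0                                   # = one_hot(a) - p
    theta[s] += alpha * delta * grad_log                 # actor update
    s = 0 if done else s2
print("P(right) per state:", np.round([policy(i)[1] for i in range(n_states)], 2))
```

A natural actor-critic, as reviewed in the paper, would additionally precondition the actor update with the inverse Fisher information matrix of the policy, making the update invariant to the policy parameterization.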

764 citations

Proceedings Article
06 Jul 2004
TL;DR: This paper presents a system for gait adaptation in the RHex series of hexapedal robots that renders this arduous process nearly autonomous by recourse to a modified version of Nelder-Mead descent.
Abstract: Gait parameter adaptation on a physical robot is an error-prone, tedious and time-consuming process. In this paper we present a system for gait adaptation in our RHex series of hexapedal robots that renders this arduous process nearly autonomous. The robot adapts its gait parameters by recourse to a modified version of Nelder-Mead descent, while managing its self-experiments and measuring the outcome by visual servoing within a partially engineered environment. The resulting performance gains extend considerably beyond what we have managed with hand tuning. For example, the best hand-tuned alternating tripod gaits never exceeded 0.8 m/s nor achieved specific resistance below 2.0. In contrast, Nelder-Mead based tuning has yielded alternating tripod gaits at 2.7 m/s (well over 5 body lengths per second) and reduced specific resistance to 0.6 while requiring little human intervention at low and moderate speeds. Comparable gains have been achieved on the much larger ruggedized version of this machine.
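A hedged sketch of the tuning loop, using SciPy's standard Nelder-Mead implementation (the paper uses a modified variant). run_trial, the parameter names, and the synthetic cost surface are hypothetical stand-ins for the robot's visually servoed self-experiments:

```python
import numpy as np
from scipy.optimize import minimize

def run_trial(params):
    """Hypothetical stand-in for one physical self-experiment.

    params = [sweep_angle, duty_factor, leg_offset] (made-up names); on the
    robot this would command a run and score it by visual servoing. Here a
    noisy synthetic bowl stands in for measured specific resistance.
    """
    target = np.array([0.8, 0.45, 0.1])      # assumed best gait setting
    noise = np.random.normal(0.0, 0.02)      # real trials are noisy too
    return float(np.sum((np.asarray(params) - target) ** 2) + 0.6 + noise)

x0 = [0.5, 0.5, 0.0]                         # hand-tuned starting gait
result = minimize(run_trial, x0, method="Nelder-Mead",
                  options={"xatol": 1e-3, "fatol": 1e-3, "maxiter": 200})
print("tuned gait parameters:", result.x)
print("estimated specific resistance:", result.fun)
```

Nelder-Mead is attractive here because it is derivative-free: each simplex operation maps directly to one more trial run on the physical robot.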

151 citations

Journal Article
TL;DR: This paper considers optimal output synchronization of heterogeneous linear multi-agent systems and shows that this optimal distributed approach implicitly solves the output regulation equations without explicitly solving them.

128 citations

Journal Article
TL;DR: A comprehensive review of the current learning and adaptive control methodologies that have been adapted specifically to PH systems, highlighting the changes from the general setting due to the PH model, followed by a detailed presentation of the respective control algorithms.
Abstract: Port-Hamiltonian (PH) theory is a relatively recent but well-established modeling framework for nonlinear physical systems. Due to the emphasis on the physical structure and modular framework, PH modeling has become a prime focus in system theory. This has led to considerable research interest in the control of PH systems, resulting in numerous nonlinear control techniques. General nonlinear control methodologies are classified in a spectrum from model-based to model-free, where adaptation and learning typically lie close to the model-free end of the range. Various articles and monographs have provided a detailed overview of model-based control techniques on PH models, but no survey is specifically dedicated to the learning and adaptive control methods that can benefit from the PH structure. To this end, we provide a comprehensive review of the current learning and adaptive control methodologies that have been adapted specifically to PH systems. After establishing the required theoretical background, we elaborate on various general machine learning, iterative learning, and adaptive control techniques and their application to PH systems. For each method we highlight the changes from the general setting due to the PH model, followed by a detailed presentation of the respective control algorithm. In general, the advantages of using PH models in learning and adaptive controllers are: i) prior knowledge in the form of a PH model speeds up the learning; ii) in some instances, new stability or convergence guarantees are obtained by having a PH model; iii) the resulting control laws can be interpreted in the context of physical systems. We conclude the paper with notes on open research issues.
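For readers unfamiliar with the PH form the survey builds on, the sketch below simulates a mass-spring-damper written as x_dot = (J - R) grad H(x) + g u with simple damping-injection output feedback u = -kd * y. All numerical values and the gain kd are illustrative assumptions, not taken from the paper:

```python
import numpy as np

m, k, d = 1.0, 2.0, 0.3              # mass, spring stiffness, damping
J = np.array([[0.0, 1.0],
              [-1.0, 0.0]])          # interconnection matrix (skew-symmetric)
R = np.array([[0.0, 0.0],
              [0.0, d]])             # dissipation matrix (positive semi-definite)
g = np.array([0.0, 1.0])             # input enters through the momentum

def grad_H(x):
    q, p = x                         # state: position q, momentum p
    return np.array([k * q, p / m])  # gradient of H(x) = k*q**2/2 + p**2/(2*m)

def f(x, u):
    return (J - R) @ grad_H(x) + g * u   # port-Hamiltonian dynamics

kd, dt = 0.5, 1e-3                   # illustrative damping-injection gain
x = np.array([1.0, 0.0])             # start with the spring stretched
for _ in range(20000):               # simple forward-Euler simulation
    y = g @ grad_H(x)                # passive output (here: velocity)
    x = x + dt * f(x, -kd * y)       # output feedback u = -kd * y
print("final state (near the origin):", x)
```

The structural advantages listed in the abstract show up even in this toy example: the Hamiltonian H is a ready-made Lyapunov candidate, and the feedback has a physical reading as added damping.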

60 citations

Journal Article
TL;DR: In this paper, a sampling approach is proposed to estimate the domain of attraction (DoA) of nonlinear systems in real time; it is validated by approximating the DoAs of stable equilibria in several systems.
Abstract: Most stabilizing controllers designed for nonlinear systems are valid only within a specific region of the state space, called the domain of attraction (DoA). Computation of the DoA is usually costly and time-consuming. This paper proposes a computationally efficient sampling approach to estimate the DoAs of nonlinear systems in real time. The method is validated by approximating the DoAs of stable equilibria in several nonlinear systems. In addition, it is implemented for the passivity-based learning controller designed for a second-order dynamical system. Simulation and experimental results show that, in all cases studied, the proposed sampling technique quickly estimates the DoAs, corroborating its suitability for real-time applications.
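As a rough illustration of the sampling idea (not the paper's exact algorithm), the sketch below labels sampled initial states of a damped pendulum as inside the estimated DoA when a short simulated rollout converges to the origin. The dynamics, sampling box, and thresholds are assumptions for the sake of the example:

```python
import numpy as np

def f(x):
    theta, omega = x                 # damped pendulum (illustrative system)
    return np.array([omega, -np.sin(theta) - 0.2 * omega])

def converges(x0, dt=0.01, t_max=30.0, tol=1e-2):
    """Roll the dynamics forward; True if the state reaches the origin."""
    x = np.array(x0, dtype=float)
    for _ in range(int(t_max / dt)):  # forward-Euler rollout
        x = x + dt * f(x)
        if np.linalg.norm(x) < tol:   # converged to the stable equilibrium
            return True
    return False

rng = np.random.default_rng(1)
samples = rng.uniform(low=[-np.pi, -3.0], high=[np.pi, 3.0], size=(500, 2))
inside = np.array([converges(x0) for x0 in samples])   # DoA membership labels
print(f"fraction of sampled box labeled inside the DoA: {inside.mean():.2f}")
```

Because each sample is an independent short rollout, the labeling is trivially parallelizable, which is what makes sampling-based DoA estimates attractive for real-time use.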

54 citations


Cited by
Journal Article
TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.

14,635 citations

Journal Article
TL;DR: This survey reviews recent trends in video-based human capture and analysis and discusses open problems for future research toward automatic visual analysis of human movement.

2,738 citations

Book
01 Jan 2018

2,291 citations

Posted Content
TL;DR: This work discusses core RL elements, including the value function (in particular, the Deep Q-Network, DQN), policy, reward, model, planning, and exploration, and important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn.
Abstract: We give an overview of recent exciting achievements of deep reinforcement learning (RL). We discuss six core elements, six important mechanisms, and twelve applications. We start with the background of machine learning, deep learning and reinforcement learning. Next we discuss core RL elements, including the value function, in particular the Deep Q-Network (DQN), policy, reward, model, planning, and exploration. After that, we discuss important mechanisms for RL, including attention and memory, unsupervised learning, transfer learning, multi-agent RL, hierarchical RL, and learning to learn. Then we discuss various applications of RL, including games, in particular AlphaGo, robotics, natural language processing (including dialogue systems, machine translation, and text generation), computer vision, neural architecture design, business management, finance, healthcare, Industry 4.0, smart grid, intelligent transportation systems, and computer systems. We mention topics not reviewed yet, and list a collection of RL resources. After presenting a brief summary, we close with discussions. Please see Deep Reinforcement Learning, arXiv:1810.06339, for a significant update.
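Since the value function, with DQN as its canonical deep instance, is the survey's first core element, here is a minimal tabular Q-learning sketch that uses the same bootstrapped TD target DQN optimizes with a neural network. The toy chain environment and hyperparameters are illustrative assumptions:

```python
import numpy as np

n_states, n_actions = 6, 2
Q = np.zeros((n_states, n_actions))          # tabular action-value function
alpha, gamma, eps = 0.1, 0.9, 0.1            # step size, discount, exploration
rng = np.random.default_rng(0)

def step(s, a):
    # Toy chain: action 1 moves right, 0 moves left; reward at the right end.
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == n_states - 1), s2 == n_states - 1

s = 0
for _ in range(10000):
    # epsilon-greedy action selection
    a = int(rng.integers(n_actions)) if rng.random() < eps else int(Q[s].argmax())
    s2, r, done = step(s, a)
    target = r if done else r + gamma * Q[s2].max()   # the TD target; DQN
    Q[s, a] += alpha * (target - Q[s, a])             # fits it with a network
    s = 0 if done else s2
print(np.round(Q, 2))
```

DQN replaces the table with a deep network and adds experience replay and a target network to keep the same update stable at scale.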

935 citations