
Showing papers by "Robert Babuska published in 2019"


Journal ArticleDOI
TL;DR: Two reinforcement learning (RL) based compensation methods are introduced to compensate for unmodeled aberrations in industrial robotic manipulators; they show a considerable performance improvement compared to PD, MPC, and ILC.

74 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this article, the authors proposed a novel learning architecture capable of navigating an agent to a target given by an image, and extended the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance.
Abstract: Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, the application of deep RL to visual navigation with realistic environments is a challenging task. We propose a novel learning architecture capable of navigating an agent, e.g. a mobile robot, to a target given by an image. To achieve this, we have extended the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance. We propose three additional auxiliary tasks: predicting the segmentation of the observation image and of the target image and predicting the depth-map. These tasks enable the use of supervised learning to pre-train a major part of the network and to reduce the number of training steps substantially. The training performance has been further improved by increasing the environment complexity gradually over time. An efficient neural network structure is proposed, which is capable of learning for multiple targets in multiple environments. Our method navigates in continuous state spaces and, on the AI2-THOR environment simulator, surpasses the performance of state-of-the-art goal-oriented visual navigation methods from the literature.
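The auxiliary-task idea above amounts to adding supervised loss terms to the A2C objective. A minimal sketch, assuming illustrative weights `w_seg` and `w_depth` (the paper's actual loss weighting is not given here):

```python
def total_loss(a2c_loss, seg_obs_loss, seg_target_loss, depth_loss,
               w_seg=0.5, w_depth=0.5):
    """Combine the batched A2C loss with the three auxiliary losses:
    segmentation of the observation image, segmentation of the target
    image, and depth-map prediction. The weights are hypothetical
    placeholders, not values from the paper."""
    return (a2c_loss
            + w_seg * (seg_obs_loss + seg_target_loss)
            + w_depth * depth_loss)
```

Because the auxiliary targets (segmentation, depth) are available from the simulator, these heads can be pre-trained with ordinary supervised learning before RL fine-tuning begins.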

44 citations


Journal ArticleDOI
TL;DR: It is concluded that self-sensing in closed-loop control of Joule-heated TCPMs is feasible and may facilitate further deployment of such actuators in applications where low cost and weight are critical.
Abstract: The twisted and coiled polymer muscle (TCPM) has two major benefits: low weight and low cost. Therefore, this new type of actuator is increasingly used in robotic applications where these ...

19 citations


Posted Content
TL;DR: This paper introduces three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation.
Abstract: Reinforcement learning algorithms can be used to optimally solve dynamic decision-making and control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering no insight into the mappings learned, and they require significant trial and error tuning of their meta-parameters. In this paper, we propose a new approach to constructing smooth value functions by means of symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions not only yield well-performing policies, but also are compact, human-readable and mathematically tractable. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with alternative approaches using neural networks shows that our method constructs well-performing value functions with substantially fewer parameters.
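For reference, the fixed point that all three off-line methods target is the Bellman optimality equation. A minimal numeric sketch on a tabular MDP (the paper instead represents V by an analytic expression found with symbolic regression, but the equation being solved is the same):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Classical numeric value iteration: V <- max_a [R + gamma * P V].

    P: (A, S, S) transition probabilities, R: (A, S) rewards.
    Iterates the Bellman backup until successive value functions
    differ by less than tol."""
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        Q = R + gamma * (P @ V)      # action-values, shape (A, S)
        V_new = Q.max(axis=0)        # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```

Symbolic value iteration replaces the tabular `V` with a candidate analytic expression whose parameters and structure are found by symbolic regression, yielding a compact, human-readable solution.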

14 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A novel change detection method based on the similarity of local visual features distinguishes important stable regions of the scene from the regions that are changing; it substantially improves the accuracy of robot localization compared to the baseline localization method without change detection.
Abstract: Autonomous mobile robots are becoming increasingly important in many industrial and domestic environments. Dealing with unforeseen situations is a difficult problem that must be tackled in order to move closer to the ultimate goal of life-long autonomy. In computer vision-based methods employed on mobile robots, such as localization or navigation, one of the major issues is the dynamics of the scenes. The autonomous operation of the robot may become unreliable if the changes that are common in dynamic environments are not detected and managed. Moving chairs, opening and closing doors or windows, replacing objects on the desks and other changes make many conventional methods fail. To deal with that, we present a novel method for change detection based on the similarity of local visual features. The core idea of the algorithm is to distinguish important stable regions of the scene from the regions that are changing. To evaluate the change detection algorithm, we have designed a simple visual localization framework based on feature matching and we have performed a series of real-world localization experiments. The results have shown that the change detection method substantially improves the accuracy of the robot localization, compared to using the baseline localization method without change detection.
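The core idea, separating stable from changing regions by local-feature similarity, can be sketched as a nearest-neighbour test on feature descriptors. A minimal sketch, assuming L2-normalised descriptors and an illustrative distance threshold (not taken from the paper):

```python
import numpy as np

def stable_features(desc_map, desc_obs, max_dist=0.3):
    """Split reference-map features into stable vs. changed.

    desc_map: (N, D) descriptors of features in the stored map.
    desc_obs: (M, D) descriptors extracted from the current observation.
    A map feature is marked stable if some observed descriptor lies
    within max_dist (hypothetical threshold) of it; otherwise the
    region it belongs to is treated as changed and excluded from
    localization. Returns a boolean mask over the map features."""
    # pairwise Euclidean distances between map and observed descriptors
    d = np.linalg.norm(desc_map[:, None, :] - desc_obs[None, :, :], axis=2)
    return d.min(axis=1) < max_dist
```

Feature matching for localization would then be restricted to the features where the mask is `True`, so moved chairs, opened doors, or replaced desk objects no longer contribute spurious matches.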

3 citations


Proceedings ArticleDOI
13 Jul 2019
TL;DR: A family of new approaches to constructing smooth approximators for RL by means of genetic programming, and more specifically symbolic regression, is discussed; it is shown how to construct process models and value functions represented by parsimonious analytic expressions using state-of-the-art algorithms such as Single Node Genetic Programming and Multi-Gene Genetic Programming.
Abstract: Reinforcement Learning (RL) algorithms can be used to optimally solve dynamic decision-making and control problems. With continuous-valued state and input variables, RL algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering no insight into the mappings learnt, and they require significant trial and error tuning of their meta-parameters. In addition, results obtained with deep neural networks suffer from a lack of reproducibility. In this talk, we discuss a family of new approaches to constructing smooth approximators for RL by means of genetic programming and more specifically by symbolic regression. We show how to construct process models and value functions represented by parsimonious analytic expressions using state-of-the-art algorithms, such as Single Node Genetic Programming and Multi-Gene Genetic Programming. We will include examples of nonlinear control problems that can be successfully solved by reinforcement learning with symbolic regression and illustrate some of the challenges this exciting field of research is currently facing.
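In this setting, genetic programming needs a fitness measure for candidate analytic value functions. A common choice, sketched here under stated assumptions, is the mean squared Bellman residual over sample states; `V`, `step`, and `reward` are hypothetical stand-ins for an evolved expression tree, the known state-transition model, and the reward function:

```python
def bellman_residual(V, states, step, reward, gamma=0.95):
    """Fitness of a candidate analytic value function V (a Python
    callable standing in for an expression evolved by genetic
    programming): mean squared Bellman residual on sample states,
    assuming a deterministic transition model `step`."""
    err = 0.0
    for x in states:
        target = reward(x) + gamma * V(step(x))
        err += (V(x) - target) ** 2
    return err / len(states)
```

An exact solution of the Bellman equation drives this residual to zero, so minimising it over a population of parsimonious expressions is one way the symbolic methods above can be scored.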

1 citation

