
Showing papers by "Robert Babuska published in 2019"


Journal ArticleDOI
TL;DR: Two reinforcement learning (RL) based compensation methods are introduced to compensate for unmodeled aberrations in industrial robotic manipulators; they show a considerable performance improvement compared to PD, MPC, and ILC.

74 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: In this article, the authors proposed a novel learning architecture capable of navigating an agent to a target given by an image, and extended the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance.
Abstract: Deep reinforcement learning (RL) has been successfully applied to a variety of game-like environments. However, the application of deep RL to visual navigation with realistic environments is a challenging task. We propose a novel learning architecture capable of navigating an agent, e.g. a mobile robot, to a target given by an image. To achieve this, we have extended the batched A2C algorithm with auxiliary tasks designed to improve visual navigation performance. We propose three additional auxiliary tasks: predicting the segmentation of the observation image and of the target image and predicting the depth-map. These tasks enable the use of supervised learning to pre-train a major part of the network and to reduce the number of training steps substantially. The training performance has been further improved by increasing the environment complexity gradually over time. An efficient neural network structure is proposed, which is capable of learning for multiple targets in multiple environments. Our method navigates in continuous state spaces and, on the AI2-THOR environment simulator, surpasses the performance of state-of-the-art goal-oriented visual navigation methods from the literature.
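The auxiliary-task idea above amounts to adding supervised loss terms to the A2C objective. A minimal sketch, assuming illustrative weights `w_seg` and `w_depth` (the paper's actual loss weighting is not given here):

```python
def total_loss(a2c_loss, seg_obs_loss, seg_target_loss, depth_loss,
               w_seg=0.5, w_depth=0.5):
    """Combine the batched A2C loss with the three auxiliary losses:
    segmentation of the observation image, segmentation of the target
    image, and depth-map prediction. The weights are hypothetical
    placeholders, not values from the paper."""
    return (a2c_loss
            + w_seg * (seg_obs_loss + seg_target_loss)
            + w_depth * depth_loss)
```

Because the auxiliary targets (segmentation, depth) are available from the simulator, these heads can be pre-trained with ordinary supervised learning before RL fine-tuning begins.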

44 citations


Journal ArticleDOI
TL;DR: It is concluded that self-sensing in closed-loop control of Joule-heated TCPMs is feasible and may facilitate further deployment of such actuators in applications where low cost and weight are critical.
Abstract: The twisted and coiled polymer muscle (TCPM) has two major benefits: low weight and low cost. Therefore, this new type of actuator is increasingly used in robotic applications where these ...

19 citations


Posted Content
TL;DR: This paper introduces three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation.
Abstract: Reinforcement learning algorithms can be used to optimally solve dynamic decision-making and control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering no insight into the mappings learned, and they require significant trial and error tuning of their meta-parameters. In this paper, we propose a new approach to constructing smooth value functions by means of symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions not only yield well-performing policies, but also are compact, human-readable and mathematically tractable. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with alternative approaches using neural networks shows that our method constructs well-performing value functions with substantially fewer parameters.
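For reference, the fixed point that all three off-line methods target is the Bellman optimality equation. A minimal numeric sketch on a tabular MDP (the paper instead represents V by an analytic expression found with symbolic regression, but the equation being solved is the same):

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Classical numeric value iteration: V <- max_a [R + gamma * P V].

    P: (A, S, S) transition probabilities, R: (A, S) rewards.
    Iterates the Bellman backup until successive value functions
    differ by less than tol."""
    A, S, _ = P.shape
    V = np.zeros(S)
    while True:
        Q = R + gamma * (P @ V)      # action-values, shape (A, S)
        V_new = Q.max(axis=0)        # greedy backup over actions
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
```

Symbolic value iteration replaces the tabular `V` with a candidate analytic expression whose parameters and structure are found by symbolic regression, yielding a compact, human-readable solution.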

14 citations


Proceedings ArticleDOI
01 Sep 2019
TL;DR: A novel change detection method based on the similarity of local visual features distinguishes important stable regions of the scene from the regions that are changing; it substantially improves the accuracy of robot localization compared to the baseline localization method without change detection.
Abstract: Autonomous mobile robots are becoming increasingly important in many industrial and domestic environments. Dealing with unforeseen situations is a difficult problem that must be tackled in order to move closer to the ultimate goal of life-long autonomy. In computer vision-based methods employed on mobile robots, such as localization or navigation, one of the major issues is the dynamics of the scenes. The autonomous operation of the robot may become unreliable if the changes that are common in dynamic environments are not detected and managed. Moving chairs, opening and closing doors or windows, replacing objects on the desks and other changes make many conventional methods fail. To deal with that, we present a novel method for change detection based on the similarity of local visual features. The core idea of the algorithm is to distinguish important stable regions of the scene from the regions that are changing. To evaluate the change detection algorithm, we have designed a simple visual localization framework based on feature matching and we have performed a series of real-world localization experiments. The results have shown that the change detection method substantially improves the accuracy of the robot localization, compared to using the baseline localization method without change detection.
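The core idea, separating stable from changing regions by local-feature similarity, can be sketched as a nearest-neighbour test on feature descriptors. A minimal sketch, assuming L2-normalised descriptors and an illustrative distance threshold (not taken from the paper):

```python
import numpy as np

def stable_features(desc_map, desc_obs, max_dist=0.3):
    """Split reference-map features into stable vs. changed.

    desc_map: (N, D) descriptors of features in the stored map.
    desc_obs: (M, D) descriptors extracted from the current observation.
    A map feature is marked stable if some observed descriptor lies
    within max_dist (hypothetical threshold) of it; otherwise the
    region it belongs to is treated as changed and excluded from
    localization. Returns a boolean mask over the map features."""
    # pairwise Euclidean distances between map and observed descriptors
    d = np.linalg.norm(desc_map[:, None, :] - desc_obs[None, :, :], axis=2)
    return d.min(axis=1) < max_dist
```

Feature matching for localization would then be restricted to the features where the mask is `True`, so moved chairs, opened doors, or replaced desk objects no longer contribute spurious matches.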

3 citations


Proceedings ArticleDOI
13 Jul 2019
TL;DR: A family of new approaches to constructing smooth approximators for RL by means of genetic programming, and more specifically symbolic regression, is discussed; it is shown how to construct process models and value functions represented by parsimonious analytic expressions using state-of-the-art algorithms such as Single Node Genetic Programming and Multi-Gene Genetic Programming.
Abstract: Reinforcement Learning (RL) algorithms can be used to optimally solve dynamic decision-making and control problems. With continuous-valued state and input variables, RL algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering no insight into the mappings learnt, and they require significant trial and error tuning of their meta-parameters. In addition, results obtained with deep neural networks suffer from a lack of reproducibility. In this talk, we discuss a family of new approaches to constructing smooth approximators for RL by means of genetic programming and more specifically by symbolic regression. We show how to construct process models and value functions represented by parsimonious analytic expressions using state-of-the-art algorithms, such as Single Node Genetic Programming and Multi-Gene Genetic Programming. We will include examples of nonlinear control problems that can be successfully solved by reinforcement learning with symbolic regression and illustrate some of the challenges this exciting field of research is currently facing.
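In this setting, genetic programming needs a fitness measure for candidate analytic value functions. A common choice, sketched here under stated assumptions, is the mean squared Bellman residual over sample states; `V`, `step`, and `reward` are hypothetical stand-ins for an evolved expression tree, the known state-transition model, and the reward function:

```python
def bellman_residual(V, states, step, reward, gamma=0.95):
    """Fitness of a candidate analytic value function V (a Python
    callable standing in for an expression evolved by genetic
    programming): mean squared Bellman residual on sample states,
    assuming a deterministic transition model `step`."""
    err = 0.0
    for x in states:
        target = reward(x) + gamma * V(step(x))
        err += (V(x) - target) ** 2
    return err / len(states)
```

An exact solution of the Bellman equation drives this residual to zero, so minimising it over a population of parsimonious expressions is one way the symbolic methods above can be scored.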

1 citation

