
Showing papers by "Robert Babuska published in 2021"


Journal ArticleDOI
TL;DR: A novel method for change detection based on weighted local visual features to distinguish the valuable information in stable regions of the scene from the potentially misleading information in the regions that are changing is presented.

15 citations
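The weighting idea in the TL;DR above can be sketched as follows; the function name and the stability-weighting scheme are illustrative assumptions, not the paper's implementation:

```python
def weighted_change_score(feature_dists, stability_weights):
    """Sketch of change detection with weighted local visual features:
    distances between matched local features are weighted by how stable
    each region of the scene has been, so potentially misleading
    changing regions contribute less to the overall change score."""
    total_w = sum(stability_weights)
    return sum(d * w for d, w in zip(feature_dists, stability_weights)) / total_w
```

A region with zero stability weight is ignored entirely, which is the intended effect of down-weighting changing regions.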


Journal ArticleDOI
23 Mar 2021
TL;DR: In this paper, a DRL-based visual navigation policy is fine-tuned on images collected from real-world environments and applied on a mobile robot in a real office environment, which reached a 0.3-meter neighbourhood of the goal in more than 86.7% of cases.
Abstract: Visual navigation is essential for many applications in robotics, from manipulation, through mobile robotics to automated driving. Deep reinforcement learning (DRL) provides an elegant map-free approach integrating image processing, localization, and planning in one module, which can be trained and therefore optimized for a given environment. However, to date, DRL-based visual navigation was validated exclusively in simulation, where the simulator provides information that is not available in the real world, e.g., the robot's position or segmentation masks. This precludes the use of the learned policy on a real robot. Therefore, we present a novel approach that enables a direct deployment of the trained policy on real robots. We have designed a new powerful simulator capable of domain randomization. To facilitate the training, we propose visual auxiliary tasks and a tailored reward scheme. The policy is fine-tuned on images collected from real-world environments. We have evaluated the method on a mobile robot in a real office environment. The training took approximately 30 hours on a single GPU. In 30 navigation experiments, the robot reached a 0.3-meter neighbourhood of the goal in more than 86.7% of cases. This result makes the proposed method directly applicable to tasks like mobile manipulation.

15 citations
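The domain randomization that the simulator in this paper performs could look roughly like the following sketch; the parameter names and ranges are assumptions for illustration only:

```python
import random

def randomize_domain(rng):
    """Domain-randomization sketch: sample visual parameters anew for
    each training episode so the policy learned in simulation
    generalizes to real-world images (hypothetical parameters)."""
    return {
        "light_intensity": rng.uniform(0.3, 1.5),
        "texture_id": rng.randrange(50),
        "camera_noise_std": rng.uniform(0.0, 0.05),
    }
```

Sampling from a seeded generator keeps training runs reproducible while still exposing the policy to a wide range of visual conditions.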


Journal ArticleDOI
TL;DR: This paper considers a multi-objective symbolic regression method that optimizes models with respect to their training error and the measure of how well they comply with the desired physical properties and proposes an extension to the existing algorithm that helps generate a diverse set of high-quality models.
Abstract: Virtually all dynamic system control methods benefit from the availability of an accurate mathematical model of the system. This includes also methods like reinforcement learning, which can be vastly sped up and made safer by using a dynamic system model. However, obtaining a sufficient amount of informative data for constructing dynamic models can be difficult. Consequently, standard data-driven model learning techniques using small data sets that do not cover all important properties of the system yield models that are partly incorrect, for instance, in terms of their steady-state characteristics or local behavior. However, often some knowledge about the desired physical properties of the model is available. Incorporating this knowledge into the model learning process can compensate for data insufficiency, and several symbolic regression approaches that do so have recently been proposed. In this paper, we consider a multi-objective symbolic regression method that optimizes models with respect to their training error and the measure of how well they comply with the desired physical properties. We propose an extension to the existing algorithm that helps generate a diverse set of high-quality models. Further, we propose a method for selecting a single final model out of the pool of candidate output models. We experimentally demonstrate the approach on three real systems: the TurtleBot 2 mobile robot, the Parrot Bebop 2 drone and the magnetic manipulation system. The results show that the proposed model-learning algorithm yields accurate models that are physically justified. The improvement in terms of the model’s compliance with prior knowledge over the models obtained when no prior knowledge was involved in the learning process is of several orders of magnitude.

12 citations
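The two objectives that the multi-objective search trades off can be sketched as below; the monotonicity constraint and all names are illustrative assumptions, not the paper's actual prior-knowledge measures:

```python
import numpy as np

def physics_penalty(model, xs):
    """Hypothetical prior-knowledge constraint: the model output should
    be non-decreasing in its input (e.g., a desired monotone
    steady-state characteristic). Violations are summed."""
    ys = model(xs)
    violations = np.maximum(0.0, -np.diff(ys))  # penalize any decrease
    return float(np.sum(violations))

def objectives(model, x_train, y_train, x_check):
    """Two objectives, both minimized: training error and the measure
    of non-compliance with the desired physical properties."""
    err = float(np.mean((model(x_train) - y_train) ** 2))
    return err, physics_penalty(model, x_check)

def dominates(a, b):
    """Pareto dominance for minimization: a is no worse in every
    objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

A Pareto front over these two objectives is what yields the "diverse set of high-quality models" from which a single final model is then selected.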


Journal ArticleDOI
02 Sep 2021
TL;DR: An architecture aiming at getting the best out of the two worlds, by combining RL and classical strategies so that each one deals with the right portion of the assembly problem, is proposed, which can learn to insert an object in a frame within a few minutes of real‐world training.
Abstract: Adapting to uncertainties is essential yet challenging for robots while conducting assembly tasks in real‐world scenarios. Reinforcement learning (RL) methods provide a promising solution for these cases. However, training robots with RL can be a data‐extensive, time‐consuming, and potentially unsafe process. In contrast, classical control strategies can have near‐optimal performance without training and be certifiably safe. However, this is achieved at the cost of assuming that the environment is known up to small uncertainties. Herein, an architecture aiming at getting the best out of the two worlds, by combining RL and classical strategies so that each one deals with the right portion of the assembly problem, is proposed. A time‐varying weighted sum combines a recurrent RL method with a nominal strategy. The output serves as the reference for a task space impedance controller. The proposed approach can learn to insert an object in a frame within a few minutes of real‐world training. A success rate of 94% in the presence of considerable uncertainties is observed. Furthermore, the approach is robust to changes in the experimental setup and task, even when no retrain is performed. For example, the same policy achieves a success rate of 85% when the object properties change.

10 citations
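The time-varying weighted sum of the recurrent RL policy and the nominal strategy could be sketched as follows; the linear ramp schedule is an assumption, not the paper's actual weighting:

```python
def blend_weight(t, t_ramp=10.0):
    """Hypothetical schedule: trust the nominal controller at first,
    then shift weight to the learned RL policy as time t grows."""
    return min(1.0, t / t_ramp)

def reference(t, u_nominal, u_rl, t_ramp=10.0):
    """Time-varying weighted sum of the nominal strategy's action and
    the RL action; the result would serve as the reference for the
    task space impedance controller."""
    w = blend_weight(t, t_ramp)
    return [(1.0 - w) * n + w * r for n, r in zip(u_nominal, u_rl)]
```

At t = 0 the output equals the nominal action, and after the ramp it equals the RL action, so each component handles its portion of the assembly problem.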


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new approach to construct smooth value functions in the form of analytic expressions by using symbolic regression, which is shown to yield well-performing policies and is easy to plug into other algorithms.
Abstract: Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.

4 citations
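The Bellman backup underlying symbolic value iteration can be sketched as a tabular sweep; in the paper's scheme the resulting targets would be fitted by symbolic regression into an analytic expression rather than stored in a table (the chain example below is illustrative):

```python
def bellman_backup(V, f, r, states, actions, gamma=0.95):
    """One value-iteration sweep using a state-transition model f and
    reward r: V(s) <- max_a [ r(s,a) + gamma * V(f(s,a)) ]."""
    V_new = {}
    for s in states:
        V_new[s] = max(r(s, a) + gamma * V.get(f(s, a), 0.0) for a in actions)
    return V_new
```

Iterating this backup to convergence, with each V_new replaced by a symbolic regression fit, is the symbolic value iteration variant; fixing the maximizing action instead gives policy iteration.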


Posted Content
TL;DR: In this paper, a deep reinforcement learning approach is proposed to design an autonomous landing controller for inclined surfaces using the proximal policy optimization (PPO) algorithm with sparse rewards and a tailored curriculum learning approach.
Abstract: Landing a quadrotor on an inclined surface is a challenging manoeuvre. The final state of any inclined landing trajectory is not an equilibrium, which precludes the use of most conventional control methods. We propose a deep reinforcement learning approach to design an autonomous landing controller for inclined surfaces. Using the proximal policy optimization (PPO) algorithm with sparse rewards and a tailored curriculum learning approach, a robust policy can be trained in simulation in less than 90 minutes on a standard laptop. The policy then directly runs on a real Crazyflie 2.1 quadrotor and successfully performs real inclined landings in a flying arena. A single policy evaluation takes approximately 2.5 ms, which makes it suitable for a future embedded implementation on the quadrotor.

4 citations
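The combination of sparse rewards and curriculum learning could be sketched as follows; the stage count, tolerances, and linear schedule are assumptions for illustration, not the paper's values:

```python
def curriculum_tolerance(stage, start=0.5, final=0.05, n_stages=5):
    """Hypothetical curriculum: shrink the acceptable landing error
    linearly over training stages, so the sparse success reward is
    reachable early in training and demanding later on."""
    frac = min(stage, n_stages) / n_stages
    return start + frac * (final - start)

def sparse_reward(landing_error, tolerance):
    """Sparse reward: 1.0 only when the quadrotor ends within the
    current tolerance of the target pose on the inclined surface."""
    return 1.0 if landing_error <= tolerance else 0.0
```

Without such a curriculum, a sparse reward on a non-equilibrium final state is rarely observed early in training, which is what makes the tailored schedule necessary.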


Journal ArticleDOI
TL;DR: In this paper, the authors compare five sample-selection methods, including a novel method using the model prediction error, and show that informed sample selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
Abstract: Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions prevail in the training set and outweigh scarcer samples that also capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Therefore, effective methods need to be employed to select informative training samples from the continuous data stream collected by the robot. Existing literature does not give any guidelines as to which of the available sample-selection methods are suitable for such a task. In this paper, we compare five sample-selection methods, including a novel method using the model prediction error. We integrate these methods into a model learning framework based on symbolic regression, which allows for learning accurate models in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression is able to build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.

3 citations
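The prediction-error-based sample selection can be sketched as a simple streaming filter; the threshold and interface are illustrative assumptions:

```python
def select_samples(stream, predict, budget, threshold=0.1):
    """Hypothetical informed selection: keep a streamed sample only
    when the current model's prediction error on it exceeds a
    threshold, until the training-set budget is filled. Samples the
    model already predicts well (e.g., repetitive motions) are
    discarded as uninformative."""
    selected = []
    for x, y in stream:
        if len(selected) >= budget:
            break
        if abs(predict(x) - y) > threshold:
            selected.append((x, y))
    return selected
```

This is the "informed" behavior that the paper finds to outperform sequential or random selection: the budget (as small as 24 samples in the experiments) is spent on surprising data.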


Journal ArticleDOI
30 Jun 2021
TL;DR: In this paper, the authors develop a multi-modal 2D object detector and propose deterministic and stochastic sensor-aware feature fusion strategies to address changing lighting conditions and asymmetric sensor degradation in object detection.
Abstract: Deep neural networks designed for vision tasks are often prone to failure when they encounter environmental conditions not covered by the training data. Single-modal strategies are insufficient when the sensor fails to acquire information due to malfunction or its design limitations. Multi-sensor configurations are known to provide redundancy, increase reliability, and are crucial in achieving robustness against asymmetric sensor failures. To address the issue of changing lighting conditions and asymmetric sensor degradation in object detection, we develop a multi-modal 2D object detector, and propose deterministic and stochastic sensor-aware feature fusion strategies. The proposed fusion mechanisms are driven by the estimated sensor measurement reliability values/weights. Reliable object detection in harsh lighting conditions is essential for applications such as self-driving vehicles and human-robot interaction. We also propose a new “r-blended” hybrid depth modality for RGB-D sensors. Through extensive experimentation, we show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset, and obtain promising results on the SUNRGB-D dataset. We additionally record a new RGB-Infra indoor dataset, namely L515-Indoors, and demonstrate that the proposed object detection methodologies are highly effective for a variety of lighting conditions.

1 citation
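The deterministic reliability-driven fusion can be sketched as a weighted combination of per-modality feature maps; in the paper the reliability weights are estimated, while the fixed weights below are an illustrative assumption:

```python
import numpy as np

def fuse_features(features, reliabilities):
    """Deterministic sensor-aware fusion sketch: combine per-modality
    feature maps with weights derived from estimated sensor
    reliability, normalized to sum to one. A failed or degraded
    sensor with near-zero reliability contributes almost nothing."""
    w = np.asarray(reliabilities, dtype=float)
    w = w / w.sum()
    stacked = np.stack([np.asarray(f, dtype=float) for f in features])
    return np.tensordot(w, stacked, axes=1)
```

Driving these weights from measurement reliability is what gives robustness to asymmetric sensor failures, e.g., down-weighting RGB in darkness in favor of the thermal or depth modality.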


Posted Content
TL;DR: In this article, a geometry-based grasping method for vine tomatoes is proposed, which relies on a computer-vision pipeline to identify the required geometric features of the tomatoes and of the truss stem.
Abstract: We propose a geometry-based grasping method for vine tomatoes. It relies on a computer-vision pipeline to identify the required geometric features of the tomatoes and of the truss stem. The grasping method then uses a geometric model of the robotic hand and the truss to determine a suitable grasping location on the stem. This approach allows for grasping tomato trusses without requiring delicate contact sensors or complex mechanistic models and under minimal risk of damaging the tomatoes. Lab experiments were conducted to validate the proposed methods, using an RGB-D camera and a low-cost robotic manipulator. The success rate was 83% to 92%, depending on the type of truss.
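The geometric grasp-location selection can be sketched as follows; the point representation, clearance margin, and selection rule are illustrative assumptions rather than the paper's model:

```python
def grasp_point(stem_points, tomato_centers, min_clearance=0.03):
    """Geometric sketch: pick the stem point farthest from all tomato
    centers, rejecting candidates closer than a clearance margin, so
    the gripper can close on the stem with minimal risk of damaging
    the tomatoes. Points are 2D (x, y) tuples in meters."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    best, best_clear = None, -1.0
    for s in stem_points:
        clearance = min(dist(s, c) for c in tomato_centers)
        if clearance >= min_clearance and clearance > best_clear:
            best, best_clear = s, clearance
    return best
```

Because the decision uses only geometry recovered by the vision pipeline, no contact sensing or mechanistic truss model is needed, matching the approach described in the abstract.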