
Showing papers by "Robert Babuska published in 2021"


Journal ArticleDOI
TL;DR: A novel method for change detection based on weighted local visual features to distinguish the valuable information in stable regions of the scene from the potentially misleading information in the regions that are changing is presented.

15 citations
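The weighting idea in the TL;DR above can be sketched as follows; the function name and the stability-weighting scheme are illustrative assumptions, not the paper's implementation:

```python
def weighted_change_score(feature_dists, stability_weights):
    """Sketch of change detection with weighted local visual features:
    distances between matched local features are weighted by how stable
    each region of the scene has been, so potentially misleading
    changing regions contribute less to the overall change score."""
    total_w = sum(stability_weights)
    return sum(d * w for d, w in zip(feature_dists, stability_weights)) / total_w
```

A region with zero stability weight is ignored entirely, which is the intended effect of down-weighting changing regions.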


Journal ArticleDOI
23 Mar 2021
TL;DR: In this paper, a DRL-based visual navigation policy is fine-tuned on images collected from real-world environments and applied on a mobile robot in a real office environment, which reached a 0.3-meter neighbourhood of the goal in more than 86.7% of cases.
Abstract: Visual navigation is essential for many applications in robotics, from manipulation, through mobile robotics to automated driving. Deep reinforcement learning (DRL) provides an elegant map-free approach integrating image processing, localization, and planning in one module, which can be trained and therefore optimized for a given environment. However, to date, DRL-based visual navigation was validated exclusively in simulation, where the simulator provides information that is not available in the real world, e.g., the robot's position or segmentation masks. This precludes the use of the learned policy on a real robot. Therefore, we present a novel approach that enables a direct deployment of the trained policy on real robots. We have designed a new powerful simulator capable of domain randomization. To facilitate the training, we propose visual auxiliary tasks and a tailored reward scheme. The policy is fine-tuned on images collected from real-world environments. We have evaluated the method on a mobile robot in a real office environment. The training took approximately 30 hours on a single GPU. In 30 navigation experiments, the robot reached a 0.3-meter neighbourhood of the goal in more than 86.7% of cases. This result makes the proposed method directly applicable to tasks like mobile manipulation.

15 citations
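The domain randomization that the simulator in this paper performs could look roughly like the following sketch; the parameter names and ranges are assumptions for illustration only:

```python
import random

def randomize_domain(rng):
    """Domain-randomization sketch: sample visual parameters anew for
    each training episode so the policy learned in simulation
    generalizes to real-world images (hypothetical parameters)."""
    return {
        "light_intensity": rng.uniform(0.3, 1.5),
        "texture_id": rng.randrange(50),
        "camera_noise_std": rng.uniform(0.0, 0.05),
    }
```

Sampling from a seeded generator keeps training runs reproducible while still exposing the policy to a wide range of visual conditions.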


Journal ArticleDOI
TL;DR: This paper considers a multi-objective symbolic regression method that optimizes models with respect to their training error and the measure of how well they comply with the desired physical properties and proposes an extension to the existing algorithm that helps generate a diverse set of high-quality models.
Abstract: Virtually all dynamic system control methods benefit from the availability of an accurate mathematical model of the system. This includes also methods like reinforcement learning, which can be vastly sped up and made safer by using a dynamic system model. However, obtaining a sufficient amount of informative data for constructing dynamic models can be difficult. Consequently, standard data-driven model learning techniques using small data sets that do not cover all important properties of the system yield models that are partly incorrect, for instance, in terms of their steady-state characteristics or local behavior. However, often some knowledge about the desired physical properties of the model is available. Incorporating this knowledge into the model learning process can compensate for data insufficiency, and several symbolic regression approaches that do so have recently been proposed. In this paper, we consider a multi-objective symbolic regression method that optimizes models with respect to their training error and the measure of how well they comply with the desired physical properties. We propose an extension to the existing algorithm that helps generate a diverse set of high-quality models. Further, we propose a method for selecting a single final model out of the pool of candidate output models. We experimentally demonstrate the approach on three real systems: the TurtleBot 2 mobile robot, the Parrot Bebop 2 drone and the magnetic manipulation system. The results show that the proposed model-learning algorithm yields accurate models that are physically justified. The improvement in terms of the model’s compliance with prior knowledge over the models obtained when no prior knowledge was involved in the learning process is of several orders of magnitude.

12 citations
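The two objectives that the multi-objective search trades off can be sketched as below; the monotonicity constraint and all names are illustrative assumptions, not the paper's actual prior-knowledge measures:

```python
import numpy as np

def physics_penalty(model, xs):
    """Hypothetical prior-knowledge constraint: the model output should
    be non-decreasing in its input (e.g., a desired monotone
    steady-state characteristic). Violations are summed."""
    ys = model(xs)
    violations = np.maximum(0.0, -np.diff(ys))  # penalize any decrease
    return float(np.sum(violations))

def objectives(model, x_train, y_train, x_check):
    """Two objectives, both minimized: training error and the measure
    of non-compliance with the desired physical properties."""
    err = float(np.mean((model(x_train) - y_train) ** 2))
    return err, physics_penalty(model, x_check)

def dominates(a, b):
    """Pareto dominance for minimization: a is no worse in every
    objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))
```

A Pareto front over these two objectives is what yields the "diverse set of high-quality models" from which a single final model is then selected.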


Journal ArticleDOI
02 Sep 2021
TL;DR: An architecture aiming at getting the best out of the two worlds, by combining RL and classical strategies so that each one deals with the right portion of the assembly problem, is proposed, which can learn to insert an object in a frame within a few minutes of real‐world training.
Abstract: Adapting to uncertainties is essential yet challenging for robots while conducting assembly tasks in real‐world scenarios. Reinforcement learning (RL) methods provide a promising solution for these cases. However, training robots with RL can be a data‐extensive, time‐consuming, and potentially unsafe process. In contrast, classical control strategies can have near‐optimal performance without training and be certifiably safe. However, this is achieved at the cost of assuming that the environment is known up to small uncertainties. Herein, an architecture aiming at getting the best out of the two worlds, by combining RL and classical strategies so that each one deals with the right portion of the assembly problem, is proposed. A time‐varying weighted sum combines a recurrent RL method with a nominal strategy. The output serves as the reference for a task space impedance controller. The proposed approach can learn to insert an object in a frame within a few minutes of real‐world training. A success rate of 94% in the presence of considerable uncertainties is observed. Furthermore, the approach is robust to changes in the experimental setup and task, even when no retrain is performed. For example, the same policy achieves a success rate of 85% when the object properties change.

10 citations
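The time-varying weighted sum of the recurrent RL policy and the nominal strategy could be sketched as follows; the linear ramp schedule is an assumption, not the paper's actual weighting:

```python
def blend_weight(t, t_ramp=10.0):
    """Hypothetical schedule: trust the nominal controller at first,
    then shift weight to the learned RL policy as time t grows."""
    return min(1.0, t / t_ramp)

def reference(t, u_nominal, u_rl, t_ramp=10.0):
    """Time-varying weighted sum of the nominal strategy's action and
    the RL action; the result would serve as the reference for the
    task space impedance controller."""
    w = blend_weight(t, t_ramp)
    return [(1.0 - w) * n + w * r for n, r in zip(u_nominal, u_rl)]
```

At t = 0 the output equals the nominal action, and after the ramp it equals the RL action, so each component handles its portion of the assembly problem.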


Journal ArticleDOI
TL;DR: In this paper, the authors propose a new approach to construct smooth value functions in the form of analytic expressions by using symbolic regression, which is shown to yield well-performing policies and is easy to plug into other algorithms.
Abstract: Reinforcement learning algorithms can solve dynamic decision-making and optimal control problems. With continuous-valued state and input variables, reinforcement learning algorithms must rely on function approximators to represent the value function and policy mappings. Commonly used numerical approximators, such as neural networks or basis function expansions, have two main drawbacks: they are black-box models offering little insight into the mappings learned, and they require extensive trial and error tuning of their hyper-parameters. In this paper, we propose a new approach to constructing smooth value functions in the form of analytic expressions by using symbolic regression. We introduce three off-line methods for finding value functions based on a state-transition model: symbolic value iteration, symbolic policy iteration, and a direct solution of the Bellman equation. The methods are illustrated on four nonlinear control problems: velocity control under friction, one-link and two-link pendulum swing-up, and magnetic manipulation. The results show that the value functions yield well-performing policies and are compact, mathematically tractable, and easy to plug into other algorithms. This makes them potentially suitable for further analysis of the closed-loop system. A comparison with an alternative approach using neural networks shows that our method outperforms the neural network-based one.

4 citations
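The Bellman backup underlying symbolic value iteration can be sketched as a tabular sweep; in the paper's scheme the resulting targets would be fitted by symbolic regression into an analytic expression rather than stored in a table (the chain example below is illustrative):

```python
def bellman_backup(V, f, r, states, actions, gamma=0.95):
    """One value-iteration sweep using a state-transition model f and
    reward r: V(s) <- max_a [ r(s,a) + gamma * V(f(s,a)) ]."""
    V_new = {}
    for s in states:
        V_new[s] = max(r(s, a) + gamma * V.get(f(s, a), 0.0) for a in actions)
    return V_new
```

Iterating this backup to convergence, with each V_new replaced by a symbolic regression fit, is the symbolic value iteration variant; fixing the maximizing action instead gives policy iteration.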


Posted Content
TL;DR: In this paper, a deep reinforcement learning approach is proposed to design an autonomous landing controller for inclined surfaces using the proximal policy optimization (PPO) algorithm with sparse rewards and a tailored curriculum learning approach.
Abstract: Landing a quadrotor on an inclined surface is a challenging manoeuvre. The final state of any inclined landing trajectory is not an equilibrium, which precludes the use of most conventional control methods. We propose a deep reinforcement learning approach to design an autonomous landing controller for inclined surfaces. Using the proximal policy optimization (PPO) algorithm with sparse rewards and a tailored curriculum learning approach, a robust policy can be trained in simulation in less than 90 minutes on a standard laptop. The policy then directly runs on a real Crazyflie 2.1 quadrotor and successfully performs real inclined landings in a flying arena. A single policy evaluation takes approximately 2.5 ms, which makes it suitable for a future embedded implementation on the quadrotor.

4 citations
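The combination of sparse rewards and curriculum learning could be sketched as follows; the stage count, tolerances, and linear schedule are assumptions for illustration, not the paper's values:

```python
def curriculum_tolerance(stage, start=0.5, final=0.05, n_stages=5):
    """Hypothetical curriculum: shrink the acceptable landing error
    linearly over training stages, so the sparse success reward is
    reachable early in training and demanding later on."""
    frac = min(stage, n_stages) / n_stages
    return start + frac * (final - start)

def sparse_reward(landing_error, tolerance):
    """Sparse reward: 1.0 only when the quadrotor ends within the
    current tolerance of the target pose on the inclined surface."""
    return 1.0 if landing_error <= tolerance else 0.0
```

Without such a curriculum, a sparse reward on a non-equilibrium final state is rarely observed early in training, which is what makes the tailored schedule necessary.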


Journal ArticleDOI
TL;DR: In this paper, the authors compare five sample-selection methods, including a novel method using the model prediction error, and show that informed sample selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.
Abstract: Continual model learning for nonlinear dynamic systems, such as autonomous robots, presents several challenges. First, it tends to be computationally expensive as the amount of data collected by the robot quickly grows in time. Second, the model accuracy is impaired when data from repetitive motions prevail in the training set and outweigh scarcer samples that also capture interesting properties of the system. It is not known in advance which samples will be useful for model learning. Therefore, effective methods need to be employed to select informative training samples from the continuous data stream collected by the robot. Existing literature does not give any guidelines as to which of the available sample-selection methods are suitable for such a task. In this paper, we compare five sample-selection methods, including a novel method using the model prediction error. We integrate these methods into a model learning framework based on symbolic regression, which allows for learning accurate models in the form of analytic equations. Unlike the currently popular data-hungry deep learning methods, symbolic regression is able to build models even from very small training data sets. We demonstrate the approach on two real robots: the TurtleBot mobile robot and the Parrot Bebop drone. The results show that an accurate model can be constructed even from training sets as small as 24 samples. Informed sample-selection techniques based on prediction error and model variance clearly outperform uninformed methods, such as sequential or random selection.

3 citations
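The prediction-error-based sample selection can be sketched as a simple streaming filter; the threshold and interface are illustrative assumptions:

```python
def select_samples(stream, predict, budget, threshold=0.1):
    """Hypothetical informed selection: keep a streamed sample only
    when the current model's prediction error on it exceeds a
    threshold, until the training-set budget is filled. Samples the
    model already predicts well (e.g., repetitive motions) are
    discarded as uninformative."""
    selected = []
    for x, y in stream:
        if len(selected) >= budget:
            break
        if abs(predict(x) - y) > threshold:
            selected.append((x, y))
    return selected
```

This is the "informed" behavior that the paper finds to outperform sequential or random selection: the budget (as small as 24 samples in the experiments) is spent on surprising data.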


Journal ArticleDOI
30 Jun 2021
TL;DR: In this paper, the authors develop a multi-modal 2D object detector and propose deterministic and stochastic sensor-aware feature fusion strategies to address changing lighting conditions and asymmetric sensor degradation in object detection.
Abstract: Deep neural networks designed for vision tasks are often prone to failure when they encounter environmental conditions not covered by the training data. Single-modal strategies are insufficient when the sensor fails to acquire information due to malfunction or its design limitations. Multi-sensor configurations are known to provide redundancy, increase reliability, and are crucial in achieving robustness against asymmetric sensor failures. To address the issue of changing lighting conditions and asymmetric sensor degradation in object detection, we develop a multi-modal 2D object detector, and propose deterministic and stochastic sensor-aware feature fusion strategies. The proposed fusion mechanisms are driven by the estimated sensor measurement reliability values/weights. Reliable object detection in harsh lighting conditions is essential for applications such as self-driving vehicles and human-robot interaction. We also propose a new “r-blended” hybrid depth modality for RGB-D sensors. Through extensive experimentation, we show that the proposed strategies outperform the existing state-of-the-art methods on the FLIR-Thermal dataset, and obtain promising results on the SUNRGB-D dataset. We additionally record a new RGB-Infra indoor dataset, namely L515-Indoors, and demonstrate that the proposed object detection methodologies are highly effective for a variety of lighting conditions.

1 citation
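The deterministic reliability-driven fusion can be sketched as a weighted combination of per-modality feature maps; in the paper the reliability weights are estimated, while the fixed weights below are an illustrative assumption:

```python
import numpy as np

def fuse_features(features, reliabilities):
    """Deterministic sensor-aware fusion sketch: combine per-modality
    feature maps with weights derived from estimated sensor
    reliability, normalized to sum to one. A failed or degraded
    sensor with near-zero reliability contributes almost nothing."""
    w = np.asarray(reliabilities, dtype=float)
    w = w / w.sum()
    stacked = np.stack([np.asarray(f, dtype=float) for f in features])
    return np.tensordot(w, stacked, axes=1)
```

Driving these weights from measurement reliability is what gives robustness to asymmetric sensor failures, e.g., down-weighting RGB in darkness in favor of the thermal or depth modality.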


Posted Content
TL;DR: In this article, a geometry-based grasping method for vine tomatoes is proposed, which relies on a computer-vision pipeline to identify the required geometric features of the tomatoes and of the truss stem.
Abstract: We propose a geometry-based grasping method for vine tomatoes. It relies on a computer-vision pipeline to identify the required geometric features of the tomatoes and of the truss stem. The grasping method then uses a geometric model of the robotic hand and the truss to determine a suitable grasping location on the stem. This approach allows for grasping tomato trusses without requiring delicate contact sensors or complex mechanistic models and under minimal risk of damaging the tomatoes. Lab experiments were conducted to validate the proposed methods, using an RGB-D camera and a low-cost robotic manipulator. The success rate was 83% to 92%, depending on the type of truss.
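The geometric grasp-location selection can be sketched as follows; the point representation, clearance margin, and selection rule are illustrative assumptions rather than the paper's model:

```python
def grasp_point(stem_points, tomato_centers, min_clearance=0.03):
    """Geometric sketch: pick the stem point farthest from all tomato
    centers, rejecting candidates closer than a clearance margin, so
    the gripper can close on the stem with minimal risk of damaging
    the tomatoes. Points are 2D (x, y) tuples in meters."""
    def dist(p, q):
        return ((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2) ** 0.5
    best, best_clear = None, -1.0
    for s in stem_points:
        clearance = min(dist(s, c) for c in tomato_centers)
        if clearance >= min_clearance and clearance > best_clear:
            best, best_clear = s, clearance
    return best
```

Because the decision uses only geometry recovered by the vision pipeline, no contact sensing or mechanistic truss model is needed, matching the approach described in the abstract.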