
Showing papers on "Robot" published in 2017


Journal ArticleDOI
TL;DR: A detailed survey of ongoing methodologies for soft actuators, highlighting approaches suitable for nanometer- to centimeter-scale robotic applications and covering both the development of new materials and composites and novel implementations that leverage the unique properties of soft materials.
Abstract: This review comprises a detailed survey of ongoing methodologies for soft actuators, highlighting approaches suitable for nanometer- to centimeter-scale robotic applications. Soft robots present a special design challenge in that their actuation and sensing mechanisms are often highly integrated with the robot body and overall functionality. At scales below a centimeter, they belong to an even more special subcategory of robots or devices, in that they often lack on-board power, sensing, computation, and control. Soft, active materials are particularly well suited for this task, responding to a wide range of stimuli, with a number of impressive examples demonstrating large deformations, high motion complexities, and varied multifunctionality. Recent research includes both the development of new materials and composites and novel implementations that leverage the unique properties of soft materials.

897 citations


Journal ArticleDOI
TL;DR: In this paper, the authors focus on a particular type of intrinsically soft, elastomeric robot powered via fluidic pressurization, and present a review of their use in soft robotics.
Abstract: The emerging field of soft robotics makes use of many classes of materials including metals, low glass transition temperature (Tg) plastics, and high Tg elastomers. Depending on the specific design, all of these materials may result in extrinsically soft robots. Organic elastomers, however, have elastic moduli ranging from tens of megapascals down to kilopascals; robots composed of such materials are intrinsically soft: they are always compliant independent of their shape. This class of soft machines has been used to reduce control complexity and manufacturing cost of robots, while enabling sophisticated and novel functionalities, often in direct contact with humans. This review focuses on a particular type of intrinsically soft, elastomeric robot: those powered via fluidic pressurization.

653 citations


Journal ArticleDOI
TL;DR: A simple solution for facial expression recognition that combines a Convolutional Neural Network with specific image pre-processing steps to extract only expression-specific features from a face image, and explores the effect of the presentation order of the samples during training.

639 citations


Proceedings ArticleDOI
Chelsea Finn1, Sergey Levine1
01 May 2017
TL;DR: This work develops a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data and enables a real robot to perform nonprehensile manipulation — pushing objects — and can handle novel objects not seen during training.
Abstract: A key challenge in scaling up robot learning to many skills and environments is removing the need for human supervision, so that robots can collect their own data and improve their own performance without being limited by the cost of requesting human feedback. Model-based reinforcement learning holds the promise of enabling an agent to learn to predict the effects of its actions, which could provide flexible predictive models for a wide range of tasks and environments, without detailed human supervision. We develop a method for combining deep action-conditioned video prediction models with model-predictive control that uses entirely unlabeled training data. Our approach does not require a calibrated camera, an instrumented training set-up, nor precise sensing and actuation. Our results show that our method enables a real robot to perform nonprehensile manipulation — pushing objects — and can handle novel objects not seen during training.
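The control loop described above, pairing a learned action-conditioned predictor with model-predictive control, can be sketched in miniature. This is a hedged illustration, not the authors' method: the linear one-dimensional "model" stands in for their deep video prediction network, and the random-shooting planner, horizon, and cost are illustrative assumptions.

```python
# Toy sketch of action-conditioned prediction + MPC (random shooting).
# predict() is a linear stand-in for a learned deep video predictor.
import random

def predict(state, action):
    # Stand-in learned model: pushing moves the object by the action.
    return state + action

def mpc_action(state, goal, horizon=3, n_samples=500, rng=random.Random(0)):
    """Sample action sequences, roll each out through the model,
    and return the first action of the lowest-cost sequence."""
    best_first, best_cost = 0.0, float("inf")
    for _ in range(n_samples):
        seq = [rng.uniform(-1.0, 1.0) for _ in range(horizon)]
        s, c = state, 0.0
        for a in seq:
            s = predict(s, a)
            c += abs(goal - s)   # accumulate tracking cost along the rollout
        if c < best_cost:
            best_first, best_cost = seq[0], c
    return best_first

# Replan at every step (the essence of MPC): only the first action of
# each plan is executed before predicting again.
state, goal = 0.0, 2.5
for _ in range(10):
    state = predict(state, mpc_action(state, goal))
```

With replanning, the toy state is driven toward the goal even though each individual plan is only a coarse random sample.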

620 citations


Proceedings ArticleDOI
01 Jan 2017
TL;DR: In this paper, a mapless motion planner is proposed by taking the sparse 10-dimensional range findings and the target position with respect to the mobile robot coordinate frame as input and the continuous steering commands as output.
Abstract: We present a learning-based mapless motion planner by taking the sparse 10-dimensional range findings and the target position with respect to the mobile robot coordinate frame as input and the continuous steering commands as output. Traditional motion planners for mobile ground robots with a laser range sensor mostly depend on the obstacle map of the navigation environment where both the highly precise laser sensor and the obstacle map building work of the environment are indispensable. We show that, through an asynchronous deep reinforcement learning method, a mapless motion planner can be trained end-to-end without any manually designed features and prior demonstrations. The trained planner can be directly applied in unseen virtual and real environments. The experiments show that the proposed mapless motion planner can navigate the nonholonomic mobile robot to the desired targets without colliding with any obstacles.

551 citations


Book
01 Jul 2017
TL;DR: In this paper, the authors present common sensor models and practical advice on how to carry out state estimation for rotations and other state variables, including batch estimation, the Bayes filter, sigmapoint and particle filters, robust estimation for outlier rejection and continuous-time trajectory estimation.
Abstract: A key aspect of robotics today is estimating the state, such as position and orientation, of a robot as it moves through the world. Most robots and autonomous vehicles depend on noisy data from sensors such as cameras or laser rangefinders to navigate in a three-dimensional world. This book presents common sensor models and practical advice on how to carry out state estimation for rotations and other state variables. It covers both classical state estimation methods such as the Kalman filter, as well as important modern topics such as batch estimation, the Bayes filter, sigmapoint and particle filters, robust estimation for outlier rejection, and continuous-time trajectory estimation and its connection to Gaussian-process regression. The methods are demonstrated in the context of important applications such as point-cloud alignment, pose-graph relaxation, bundle adjustment, and simultaneous localization and mapping. Students and practitioners of robotics alike will find this a valuable resource.
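As a concrete instance of the classical estimators the book covers before its batch and nonlinear material, a scalar Kalman filter fits in a few lines. The constant-state model and noise variances below are illustrative assumptions, not values from the book.

```python
# Minimal 1-D Kalman filter sketch: predict/update for a scalar state
# assumed constant, observed through noisy measurements.

def kalman_step(x, P, z, q=0.01, r=0.25):
    """One predict/update cycle.

    x, P : prior state estimate and its variance
    z    : new noisy measurement
    q, r : process and measurement noise variances (illustrative)
    """
    # Predict: the state model is constant, so only uncertainty grows.
    P = P + q
    # Update: blend prediction and measurement via the Kalman gain.
    K = P / (P + r)
    x = x + K * (z - x)
    P = (1.0 - K) * P
    return x, P

# Filtering noisy observations of a constant (true value 1.0) pulls the
# estimate toward the truth while its variance shrinks.
x, P = 0.0, 1.0
for z in [1.1, 0.9, 1.05, 0.98, 1.02]:
    x, P = kalman_step(x, P, z)
```

The same predict/update structure generalizes to vector states with matrix covariances, which is where the sigmapoint and particle variants discussed in the book take over for nonlinear models.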

516 citations


Posted Content
TL;DR: A general, model-free approach for reinforcement learning on real robots with sparse rewards, built upon the Deep Deterministic Policy Gradient (DDPG) algorithm to make use of demonstrations; it outperforms plain DDPG and does not require engineered rewards.
Abstract: We propose a general and model-free approach for Reinforcement Learning (RL) on real robotics with sparse rewards. We build upon the Deep Deterministic Policy Gradient (DDPG) algorithm to use demonstrations. Both demonstrations and actual interactions are used to fill a replay buffer and the sampling ratio between demonstrations and transitions is automatically tuned via a prioritized replay mechanism. Typically, carefully engineered shaping rewards are required to enable the agents to efficiently explore on high dimensional control problems such as robotics. They are also required for model-based acceleration methods relying on local solvers such as iLQG (e.g. Guided Policy Search and Normalized Advantage Function). The demonstrations replace the need for carefully engineered rewards, and reduce the exploration problem encountered by classical RL approaches in these domains. Demonstrations are collected by a robot kinesthetically force-controlled by a human demonstrator. Results on four simulated insertion tasks show that DDPG from demonstrations out-performs DDPG, and does not require engineered rewards. Finally, we demonstrate the method on a real robotics task consisting of inserting a clip (flexible object) into a rigid object.
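The central mechanism, a replay buffer that mixes demonstration transitions with the agent's own experience under prioritized sampling, can be sketched as follows. The class name, fixed priorities, and simple priority-proportional scheme are simplified assumptions standing in for the paper's automatically tuned prioritized replay.

```python
# Sketch of a replay buffer mixing demonstrations and agent transitions,
# sampled proportionally to per-item priorities (simplified stand-in
# for the paper's automatically tuned mechanism).
import random

class MixedReplayBuffer:
    def __init__(self, rng=None):
        self.items = []        # list of (transition, is_demo) pairs
        self.priorities = []
        self.rng = rng or random.Random(0)

    def add(self, transition, is_demo, priority=1.0):
        self.items.append((transition, is_demo))
        self.priorities.append(priority)

    def sample(self):
        # Priority-proportional draw over demos and agent data alike.
        total = sum(self.priorities)
        r = self.rng.uniform(0.0, total)
        acc = 0.0
        for item, p in zip(self.items, self.priorities):
            acc += p
            if acc >= r:
                return item
        return self.items[-1]

buf = MixedReplayBuffer()
for t in range(10):
    buf.add(("demo", t), is_demo=True, priority=2.0)   # demos weighted up
for t in range(10):
    buf.add(("agent", t), is_demo=False, priority=1.0)

draws = [buf.sample() for _ in range(3000)]
demo_fraction = sum(1 for _, is_demo in draws if is_demo) / len(draws)
```

With demo priorities twice the agent's, demonstrations make up roughly two-thirds of sampled minibatches; updating the priorities online is what lets the ratio tune itself as training progresses.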

514 citations


Proceedings ArticleDOI
01 May 2017
TL;DR: In this paper, a method to incorporate photo-realistic computer images from a simulation engine to rapidly generate annotated data that can be used for the training of machine learning algorithms is described.
Abstract: Deep learning has rapidly transformed the state-of-the-art algorithms used to address a variety of problems in computer vision and robotics. These breakthroughs have relied upon massive amounts of human-annotated training data. This time-consuming annotation process has begun impeding the progress of these deep learning efforts. This paper describes a method to incorporate photo-realistic computer images from a simulation engine to rapidly generate annotated data that can be used for the training of machine learning algorithms. We demonstrate that a state-of-the-art architecture, which is trained only using these synthetic annotations, performs better than the identical architecture trained on human-annotated real-world data, when tested on the KITTI data set for vehicle detection. By training machine learning algorithms on a rich virtual world, real objects in real scenes can be learned and classified using synthetic data. This approach offers the possibility of accelerating deep learning's application to sensor-based classification problems like those that appear in self-driving cars. The source code and data to train and validate the networks described in this paper are made available for researchers.

489 citations


Journal ArticleDOI
TL;DR: This survey paper reviews, extends, compares, and experimentally evaluates model-based algorithms for real-time collision detection, isolation, and identification that use only proprioceptive sensors, covering the context-independent phases of the collision event pipeline for robots interacting with the environment.
Abstract: Robot assistants and professional coworkers are becoming a commodity in domestic and industrial settings. In order to enable robots to share their workspace with humans and physically interact with them, fast and reliable handling of possible collisions on the entire robot structure is needed, along with control strategies for safe robot reaction. The primary motivation is the prevention or limitation of possible human injury due to physical contacts. In this survey paper, based on our early work on the subject, we review, extend, compare, and evaluate experimentally model-based algorithms for real-time collision detection, isolation, and identification that use only proprioceptive sensors. This covers the context-independent phases of the collision event pipeline for robots interacting with the environment, as in physical human–robot interaction or manipulation tasks. The problem is addressed for rigid robots first and then extended to the presence of joint/transmission flexibility. The basic physically motivated solution has already been applied to numerous robotic systems worldwide, ranging from manipulators and humanoids to flying robots, and even to commercial products.

467 citations


Journal ArticleDOI
15 Sep 2017-Science
TL;DR: A DNA robot is demonstrated that performs a nanomechanical task substantially more sophisticated than previous work; its modular building blocks could allow robots assembled from the same set of components to perform diverse new functions.
Abstract: INTRODUCTION Since the 1980s, the design and synthesis of molecular machines has been identified as a grand challenge for molecular engineering. Robots are an important type of molecular machine that automatically carry out complex nanomechanical tasks. DNA molecules are excellent materials for building molecular robots, because their geometric, thermodynamic, and kinetic properties are well understood and highly programmable. So far, the development of DNA robots has been limited to simple functions. Most DNA robots were designed to perform a single function: walking in a controlled direction. A few demonstrations included a second function combined with walking (for example, picking up nanoparticles or choosing a path at a junction). However, these relatively more complex functions were also more difficult to control, and the complexity of the tasks was limited to what the robot can perform within 3 to 12 steps. In addition, each robot design was tailored for a specific task, complicating efforts to develop new robots that perform new tasks by combining functions and mechanisms. RATIONALE The design and synthesis of molecular robots presents two critical challenges, those of modularity and algorithm simplicity, which have been transformative in other areas of molecular engineering. For example, simple and modular building blocks have been used for scaling up molecular information processing with DNA circuits. As in DNA circuits, simple building blocks for DNA robots could enable more complex nanomechanical tasks, whereas modularity could allow diverse new functions performed by robots using the same set of building blocks. RESULTS We demonstrate a DNA robot that performs a nanomechanical task substantially more sophisticated than previous work. We developed a simple algorithm and three modular building blocks for a DNA robot that performs autonomous cargo sorting. 
The robot explores a two-dimensional testing ground on the surface of DNA origami, picks up multiple cargos of two types that are initially at unordered locations, and delivers each type to a specified destination until all cargo molecules are sorted into two distinct piles. The robot is designed to perform a random walk without any energy supply. Exploiting this feature, a single robot can repeatedly sort multiple cargos. Localization on DNA origami allows for distinct cargo-sorting tasks to take place simultaneously in one test tube or for multiple robots to collectively perform the same task. On average, our robot performed approximately 300 steps while sorting the cargos. The number of steps is one to two orders of magnitude larger than that of previously demonstrated DNA robots performing additional tasks while walking. Using exactly the same robot design, the system could be generalized to multiple types of cargos with arbitrary initial distributions, and to many instances of distinct tasks in parallel, where each task can be assigned a distinct number of robots depending on the difficulty of the task. CONCLUSION Using aptamers, antibodies, or direct conjugation, small chemicals, metal nanoparticles, and proteins could be transported as cargo molecules so that the cargo-sorting DNA robots could have potential applications in autonomous chemical synthesis, in manufacturing responsive molecular devices, and in programmable therapeutics. The building blocks developed in this work could also be used for diverse functions other than cargo sorting. For example, inspired by ant foraging, by adding a new building block for leaving pheromone-like signals on a path, DNA robots could be programmed to find the shortest path and efficiently transport cargo molecules. With simple communication between the robots, they could perform even more sophisticated tasks. 
With more effort in developing modular and collective molecular robots, and with simple and systematic approaches, molecular robots could eventually be easily programmed like macroscopic robots, but working in microscopic environments.

397 citations


Journal ArticleDOI
TL;DR: It is argued that, by employing model-based reinforcement learning, the currently limited adaptability of robotic systems can be expanded, and that model-based reinforcement learning exhibits advantages that make it more applicable to real-life use cases than model-free methods.
Abstract: Reinforcement learning is an appealing approach for allowing robots to learn new tasks. Relevant literature reveals a plethora of methods, but at the same time makes clear the lack of implementations for dealing with real-life challenges. Current expectations raise the demand for adaptable robots. We argue that, by employing model-based reinforcement learning, the currently limited adaptability of robotic systems can be expanded. Moreover, model-based reinforcement learning exhibits advantages that make it more applicable to real-life use cases than model-free methods. Thus, in this survey, model-based methods that have been applied in robotics are covered. We categorize them based on the derivation of an optimal policy, the definition of the return function, the type of the transition model, and the learned task. Finally, we discuss the applicability of model-based reinforcement learning approaches in new applications, taking into consideration the state of the art in both algorithms and hardware.

Posted Content
TL;DR: In this paper, a technique called hindsight experience replay is proposed to learn from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering, which can be combined with an arbitrary off-policy algorithm and may be seen as a form of implicit curriculum.
Abstract: Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards which are sparse and binary and therefore avoid the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient which makes training possible in these challenging environments. We show that our policies trained on a physics simulation can be deployed on a physical robot and successfully complete the task.
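The relabeling trick at the core of Hindsight Experience Replay can be sketched directly. The buffer format, the scalar "states", and the reward function below are simplified assumptions; the "final" relabeling strategy (using the state actually reached at the end of the episode) is one of the strategies the technique admits.

```python
# Sketch of Hindsight Experience Replay (HER): after an episode fails
# to reach its goal, store each transition a second time with the goal
# replaced by a state that *was* achieved, so sparse binary rewards
# still produce learning signal.

def binary_reward(achieved, goal, tol=1e-6):
    # Sparse reward: 0 on success, -1 otherwise.
    return 0.0 if abs(achieved - goal) <= tol else -1.0

def her_relabel(episode, goal):
    """episode: list of (state, action, next_state) tuples.
    Returns transitions under the original goal interleaved with
    'final'-strategy hindsight transitions whose goal is the state
    actually reached at the end of the episode."""
    transitions = []
    achieved_goal = episode[-1][2]          # where we actually ended up
    for s, a, s_next in episode:
        transitions.append((s, a, s_next, goal,
                            binary_reward(s_next, goal)))
        transitions.append((s, a, s_next, achieved_goal,
                            binary_reward(s_next, achieved_goal)))
    return transitions

# A failed episode (goal 5.0 is never reached) still yields one
# zero-reward success signal after relabeling with the reached state.
episode = [(0.0, 1, 1.0), (1.0, 1, 2.0), (2.0, 1, 3.0)]
out = her_relabel(episode, goal=5.0)
```

Every original-goal transition carries reward -1, while the last hindsight transition carries reward 0, which is exactly the implicit-curriculum effect the abstract describes.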

Proceedings ArticleDOI
TL;DR: By randomizing the dynamics of the simulator during training, this paper is able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained.
Abstract: Simulations are attractive environments for training agents as they provide an abundant source of data and alleviate certain safety concerns during the training process. But the behaviours developed by agents in simulation are often specific to the characteristics of the simulator. Due to modeling error, strategies that are successful in simulation may not transfer to their real world counterparts. In this paper, we demonstrate a simple method to bridge this "reality gap". By randomizing the dynamics of the simulator during training, we are able to develop policies that are capable of adapting to very different dynamics, including ones that differ significantly from the dynamics on which the policies were trained. This adaptivity enables the policies to generalize to the dynamics of the real world without any training on the physical system. Our approach is demonstrated on an object pushing task using a robotic arm. Despite being trained exclusively in simulation, our policies are able to maintain a similar level of performance when deployed on a real robot, reliably moving an object to a desired location from random initial configurations. We explore the impact of various design decisions and show that the resulting policies are robust to significant calibration error.
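The randomization step itself is simple to sketch: resample the simulator's physical parameters at the start of every training episode so the policy never overfits one simulator instance. The parameter names and ranges below are illustrative assumptions, not the paper's exact values.

```python
# Sketch of dynamics randomization: per-episode resampling of simulator
# parameters (names and ranges are illustrative, not from the paper).
import random

RANDOMIZATION_RANGES = {
    "object_mass":    (0.1, 2.0),   # kg
    "table_friction": (0.2, 1.0),
    "joint_damping":  (0.01, 0.5),
    "action_delay":   (0, 3),       # timesteps (integer-valued)
}

def sample_dynamics(rng):
    """Draw one full set of dynamics parameters for an episode."""
    params = {}
    for name, (lo, hi) in RANDOMIZATION_RANGES.items():
        if isinstance(lo, int) and isinstance(hi, int):
            params[name] = rng.randint(lo, hi)   # discrete parameter
        else:
            params[name] = rng.uniform(lo, hi)   # continuous parameter
    return params

rng = random.Random(42)
episodes = [sample_dynamics(rng) for _ in range(1000)]
```

Training across the resulting spread of dynamics is what gives the policy the adaptivity the abstract describes, since no single parameter setting is ever allowed to dominate.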

Journal ArticleDOI
TL;DR: A review of state-of-the-art research on soft robots and their application areas can be found in this paper, which describes several innovative techniques along with diverse materials and fabrication methods.
Abstract: Soft robots are often inspired by biological systems and consist of soft materials or are actuated by electrically activated materials. Soft robots have several advantages over conventional robots: safe human-machine interaction, adaptability to wearable devices, a simple gripping system, and so on. Due to these unique features and advantages, soft robots have a considerable range of applications. This article reviews state-of-the-art research on soft robots and their application areas. Actuation systems for soft robots can be categorized and analyzed into three types: variable length tendon, fluidic actuation, and electro-active polymer (EAP). The deformable property of soft robots restricts the use of many conventional rigid sensors such as encoders, strain gauges, or inertial measurement units. Thus, contactless approaches for sensing and/or sensors with low modulus are preferable for soft robots. Sensors made of low-modulus (< 1 MPa) elastomers with channels filled with liquid-phase materials are appropriate for proprioception, which is determined by the degree of curvature. From a control perspective, novel control ideas should be developed because conventional control techniques may be inadequate for soft robots. Several innovative techniques and diverse materials and fabrication methods are described in this review article. In addition, a wide range of soft robots are characterized and analyzed based on the following sub-categories: actuation, sensing, structure, control and electronics, materials, fabrication and system, and applications.

Journal ArticleDOI
TL;DR: In order to extend the semiglobal stability achieved by conventional neural control to global stability, a switching mechanism is integrated into the control design, and the effectiveness of the proposed design is shown through experiments carried out on the Baxter robot.
Abstract: Robots with coordinated dual arms are able to perform more complicated tasks that a single manipulator could hardly achieve. However, more rigorous motion precision is required to guarantee effective cooperation between the dual arms, especially when they grasp a common object. In this case, the internal forces applied on the object must also be considered in addition to the external forces. Therefore, a prescribed tracking performance at both transient and steady states is first specified, and then, a controller is synthesized to rigorously guarantee the specified motion performance. In the presence of unknown dynamics of both the robot arms and the manipulated object, the neural network approximation technique is employed to compensate for uncertainties. In order to extend the semiglobal stability achieved by conventional neural control to global stability, a switching mechanism is integrated into the control design. Effectiveness of the proposed control design has been shown through experiments carried out on the Baxter Robot.

Book
01 Dec 2017
TL;DR: Reviews the use of factor graphs for modeling and solving large-scale inference problems in robotics, and discusses the iSAM class of algorithms that reuse previous computations, re-interpreting incremental matrix factorization methods as operations on graphical models and introducing the Bayes tree in the process.
Abstract: Factor Graphs for Robot Perception reviews the use of factor graphs for the modeling and solving of large-scale inference problems in robotics. Factor graphs are a family of probabilistic graphical models, other examples of which are Bayesian networks and Markov random fields, well known from the statistical modeling and machine learning literature. They provide a powerful abstraction that gives insight into particular inference problems, making it easier to think about and design solutions, and write modular software to perform the actual inference. This book illustrates their use in the simultaneous localization and mapping problem and other important problems associated with deploying robots in the real world. Factor graphs are introduced as an economical representation within which to formulate the different inference problems, setting the stage for the subsequent sections on practical methods to solve them. The book explains the nonlinear optimization techniques for solving arbitrary nonlinear factor graphs, which requires repeatedly solving large sparse linear systems. Factor Graphs for Robot Perception will be of interest to students, researchers and practicing roboticists with an interest in the broad impact factor graphs have had, and continue to have, in robot perception.
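A tiny worked example shows how factor-graph inference bottoms out in sparse least squares, which is the repeated inner step of the nonlinear optimization the book describes. This is a hedged illustration with made-up numbers: a prior factor anchors pose x0, two odometry factors chain x0→x1→x2, and a deliberately inconsistent loop-closure factor ties x2 back to x0; since all factors here are linear-Gaussian, MAP inference is one least-squares solve.

```python
# Toy 1-D pose-graph: each factor contributes one row of a linear
# system over [x0, x1, x2]; MAP inference = linear least squares.

def solve_normal_equations(A, b):
    """Solve (A^T A) x = A^T b for a small dense system by Gaussian
    elimination with partial pivoting (no external libraries)."""
    n = len(A[0])
    AtA = [[sum(A[k][i] * A[k][j] for k in range(len(A))) for j in range(n)]
           for i in range(n)]
    Atb = [sum(A[k][i] * b[k] for k in range(len(A))) for i in range(n)]
    M = [row[:] + [Atb[i]] for i, row in enumerate(AtA)]  # augmented
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        x[i] = (M[i][n] - sum(M[i][j] * x[j]
                              for j in range(i + 1, n))) / M[i][i]
    return x

# Factors (one row each):
#   prior:         x0           = 0
#   odometry:     -x0 + x1      = 1
#   odometry:          -x1 + x2 = 1
#   loop closure: -x0      + x2 = 2.2   (inconsistent on purpose)
A = [[1, 0, 0], [-1, 1, 0], [0, -1, 1], [-1, 0, 1]]
b = [0.0, 1.0, 1.0, 2.2]
x0, x1, x2 = solve_normal_equations(A, b)
```

The solver spreads the 0.2 loop-closure discrepancy evenly across the factors, which is exactly the smoothing behavior pose-graph relaxation exhibits at scale.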

Journal ArticleDOI
TL;DR: By explicitly taking into account the effect of uncertainty, the robot can evaluate motion plans based on how vulnerable they are to disturbances; the results constitute one of the first examples of provably safe and robust control for robotic systems with complex nonlinear dynamics that must plan in real time in environments with complex geometric constraints.
Abstract: We consider the problem of generating motion plans for a robot that are guaranteed to succeed despite uncertainty in the environment, parametric model uncertainty, and disturbances. Furthermore, we...

Journal ArticleDOI
TL;DR: Introduces dielectric elastomer actuators (DEAs), emphasizing the key points of their working principle, key components, and electromechanical modeling approaches, and reviews different DEA-driven soft robots, including wearable/humanoid robots, walking/serpentine robots, flying robots, and swimming robots.
Abstract: Conventional industrial robots with rigid actuation technology have made great progress for humans in the fields of automation assembly and manufacturing. With an increasing number of robots needing to interact with humans and unstructured environments, there is a need for soft robots capable of sustaining large deformation while inducing little pressure or damage when maneuvering through confined spaces. The emergence of soft robotics offers the prospect of applying soft actuators as artificial muscles in robots, replacing traditional rigid actuators. Dielectric elastomer actuators (DEAs) are recognized as one of the most promising soft actuation technologies due to the facts that: i) dielectric elastomers are soft, motion-generating materials that resemble natural human muscle in terms of force, strain (displacement per unit length or area), and actuation pressure/density; ii) dielectric elastomers can produce large voltage-induced deformation. In this survey, we first introduce the so-called DEAs, emphasizing the key points of their working principle, key components, and electromechanical modeling approaches. Then, different DEA-driven soft robots, including wearable/humanoid robots, walking/serpentine robots, flying robots, and swimming robots, are reviewed. Lastly, we summarize the challenges and opportunities for further studies in terms of mechanism design, dynamics modeling, and autonomous control.

Journal ArticleDOI
TL;DR: A novel control scheme is developed for a teleoperation system, combining the radial basis function (RBF) neural networks (NNs) and wave variable technique to simultaneously compensate for the effects caused by communication delays and dynamics uncertainties.
Abstract: In this paper, a novel control scheme is developed for a teleoperation system, combining the radial basis function (RBF) neural networks (NNs) and wave variable technique to simultaneously compensate for the effects caused by communication delays and dynamics uncertainties. The teleoperation system is set up with a TouchX joystick as the master device and a simulated Baxter robot arm as the slave robot. The haptic feedback is provided to the human operator to sense the interaction force between the slave robot and the environment when manipulating the stylus of the joystick. To utilize the workspace of the telerobot as much as possible, a matching process is carried out between the master and the slave based on their kinematics models. The closed loop inverse kinematics (CLIK) method and RBF NN approximation technique are seamlessly integrated in the control design. To overcome the potential instability problem in the presence of delayed communication channels, wave variables and their corrections are effectively embedded into the control system, and Lyapunov-based analysis is performed to theoretically establish the closed-loop stability. Comparative experiments have been conducted for a trajectory tracking task, under the different conditions of various communication delays. Experimental results show that in terms of tracking performance and force reflection, the proposed control approach shows superior performance over the conventional methods.
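The wave-variable encoding at the heart of the delay compensation can be sketched directly. The impedance value and function names below are illustrative, but the transform itself and the power identity it preserves (velocity x force equals half the difference of squared wave amplitudes, which is what makes a delayed channel passive) are standard.

```python
# Sketch of the wave-variable transform used to passivate a delayed
# teleoperation channel: (velocity, force) pairs are encoded into
# forward/backward waves before transmission.
import math

def to_waves(xdot, f, b=1.0):
    """Encode a (velocity, force) pair into forward/backward waves.
    b is the wave impedance (illustrative value)."""
    u = (b * xdot + f) / math.sqrt(2.0 * b)
    v = (b * xdot - f) / math.sqrt(2.0 * b)
    return u, v

def from_waves(u, v, b=1.0):
    """Decode waves back into the (velocity, force) pair."""
    xdot = (u + v) / math.sqrt(2.0 * b)
    f = math.sqrt(b / 2.0) * (u - v)
    return xdot, f

xdot, f, b = 0.7, -1.3, 2.0
u, v = to_waves(xdot, f, b)
xdot2, f2 = from_waves(u, v, b)

# Power flowing through the channel is identical in either coordinates:
power_direct = xdot * f
power_waves = (u * u - v * v) / 2.0
```

Because the channel's stored energy can be expressed purely in wave amplitudes, a constant delay cannot generate energy, which is why the encoding keeps the delayed loop stable.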

Journal ArticleDOI
TL;DR: In this article, a transparent tactile sensitive layer based on single-layer graphene and a photovoltaic cell underneath is used as a building block for energy-autonomous, flexible, and tactile skin.
Abstract: Tactile or electronic skin is needed to provide critical haptic perception to robots and amputees, as well as in wearable electronics for health monitoring and wellness applications. Energy autonomy of skin is a critical feature that would enable better portability and longer operation times. This study shows a novel structure, consisting of a transparent tactile sensitive layer based on single-layer graphene, and a photovoltaic cell underneath as a building block for energy-autonomous, flexible, and tactile skin. Transparency of the touch sensitive layer is considered a key feature to allow the photovoltaic cell to effectively harvest light. Moreover, ultralow power consumed by the sensitive layer (20 nW cm−2) further reduces the photovoltaic area required to drive the tactile skin. In addition to its energy autonomy, the fabricated skin is sensitive to touch, mainly because a transparent polymeric protective layer, spin-coated on the sensor's active area, makes the coplanar capacitor sensitive to touch, detecting minimum pressures of 0.11 kPa with a uniform sensitivity of 4.3 Pa−1 along a broad pressure range. Finally, the tactile skin patches are integrated on a prosthetic hand, and the responses of the sensors for static and dynamic stimuli are evaluated by performing tasks, ranging from simple touching to grabbing of soft objects.

Journal ArticleDOI
TL;DR: In this article, a survey extensively reviews current trends in robot tactile perception of object properties, including shape, surface material and object pose, and the role of touch sensing in combination with other sensing sources is discussed.

Journal ArticleDOI
30 Aug 2017
TL;DR: This work introduces a vacuum-powered soft pneumatic actuator (V-SPA) that leverages a single, shared vacuum power supply and enables complex soft robotic systems with multiple degrees of freedom (DoFs) and diverse functions.
Abstract: We introduce a vacuum-powered soft pneumatic actuator (V-SPA) that leverages a single, shared vacuum power supply and enables complex soft robotic systems with multiple degrees of freedom (DoFs) and diverse functions. In addition to actuation, other utilities enabled by vacuum pressure include gripping and stiffening through granular media jamming, as well as direct suction adhesion to smooth surfaces, for manipulation or vertical fixation. We investigate the performance of the new actuator through direct characterization of a 3-DoF, plug-and-play V-SPA Module built from multiple V-SPAs and demonstrate the integration of different vacuum-enabled capabilities with a continuum-style robot platform outfitted with modular peripheral mechanisms. We show that these different vacuum-powered modules can be combined to achieve a variety of tasks—including multimodal locomotion, object manipulation, and stiffness tuning—to illustrate the utility and viability of vacuum as a singular alternative power source for soft pneumatic robots and not just a peripheral feature in itself. Our results highlight the effectiveness of V-SPAs in providing core soft robot capabilities and facilitating the consolidation of previously disparate subsystems for actuation and various specialized tasks, conducive to improving the compact design efficiency of larger, more complex multifunctional soft robotic systems.

Journal ArticleDOI
TL;DR: This paper presents a comprehensive survey of recent developments in human-centered intelligent robots and discusses the open issues and challenges in the field.
Abstract: Intelligent techniques foster the dissemination of new discoveries and novel technologies that advance the ability of robots to assist and support humans. The human-centered intelligent robot has become an important research field that spans all robot capabilities, including navigation, intelligent control, pattern recognition, and human-robot interaction. This paper presents a comprehensive survey of recent achievements and existing works on human-centered intelligent robots, and discusses the issues and challenges in the field.

Proceedings ArticleDOI
01 Sep 2017
TL;DR: In this article, a recurrent neural network trained with reinforcement learning enables a robot to perform a tight-clearance peg-in-hole task with robustness against positional and angular errors.
Abstract: The high-precision assembly of mechanical parts requires precision that exceeds that of robots. Conventional part-mating methods used in current manufacturing require numerous parameters to be tediously tuned before deployment. We show how a robot can successfully perform a peg-in-hole task with a tight clearance by training a recurrent neural network with reinforcement learning. In addition to reducing manual effort, the proposed method also achieves better fitting performance with a tighter clearance and robustness against positional and angular errors for the peg-in-hole task. The neural network learns to take the optimal action by observing the robot's sensors to estimate the system state. The advantages of the proposed method are validated experimentally on a 7-axis articulated robot arm.
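The core idea — a recurrent network that integrates a history of noisy sensor readings to infer the latent contact state before choosing an insertion action — can be sketched as follows. This is our own minimal illustration with made-up dimensions and an untrained network, not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

class RecurrentPolicy:
    """Tiny recurrent policy: sensor history -> distribution over moves."""
    def __init__(self, obs_dim=12, hidden_dim=32, n_actions=6):
        self.W_in = rng.normal(0, 0.1, (hidden_dim, obs_dim))
        self.W_h = rng.normal(0, 0.1, (hidden_dim, hidden_dim))
        self.W_out = rng.normal(0, 0.1, (n_actions, hidden_dim))

    def act(self, obs_seq):
        h = np.zeros(self.W_h.shape[0])
        for obs in obs_seq:                  # integrate the sensor history
            h = np.tanh(self.W_in @ obs + self.W_h @ h)
        logits = self.W_out @ h
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()                 # softmax over candidate moves
        return rng.choice(len(probs), p=probs), probs

policy = RecurrentPolicy()
obs_seq = rng.normal(size=(50, 12))  # 50 steps of force/torque + pose readings
action, probs = policy.act(obs_seq)  # sampled displacement to command
```

In the paper the weights are trained with reinforcement learning from task reward; here they are random, so only the information flow (history in, action distribution out) is illustrated.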

Journal ArticleDOI
TL;DR: A global overview of deliberation functions in robotics is presented and the main characteristics, design choices and constraints of these functions are discussed.

Proceedings ArticleDOI
Mark Pfeiffer, Michael Schaeuble, Juan Nieto, Roland Siegwart, Cesar Cadena
01 May 2017
TL;DR: In this paper, a target-oriented end-to-end navigation model for a robotic platform is learned from expert demonstrations generated in simulation with an existing motion planner, which can safely navigate the robot through obstacle-cluttered environments to reach the provided targets.
Abstract: Learning from demonstration for motion planning is an ongoing research topic. In this paper we present a model that is able to learn the complex mapping from raw 2D-laser range findings and a target position to the required steering commands for the robot. To the best of our knowledge, this work presents the first approach that learns a target-oriented end-to-end navigation model for a robotic platform. The supervised model training is based on expert demonstrations generated in simulation with an existing motion planner. We demonstrate that the learned navigation model is directly transferable to previously unseen virtual and, more interestingly, real-world environments. It can safely navigate the robot through obstacle-cluttered environments to reach the provided targets. We present an extensive qualitative and quantitative evaluation of the neural network-based motion planner, and compare it to a grid-based global approach, both in simulation and in real-world experiments.
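As a toy stand-in for the learned mapping, the sketch below fits a linear model from (laser scan, target position) inputs to expert (v, ω) commands on synthetic data. The paper trains a deep network on demonstrations from a real motion planner; all dimensions, data, and the least-squares fit here are our assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic "expert" dataset: each input is a laser scan (here 36 beams)
# concatenated with the relative target position; each label is the expert
# planner's (linear, angular) velocity command.
n, beams = 500, 36
X = np.hstack([rng.uniform(0.1, 5.0, (n, beams)),   # range readings [m]
               rng.uniform(-3.0, 3.0, (n, 2))])     # target (x, y), robot frame
true_map = rng.normal(0, 0.1, (beams + 2, 2))
Y = X @ true_map + rng.normal(0, 0.01, (n, 2))      # noisy expert commands

# Supervised fit (least squares stands in for the paper's network training).
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

def steer(scan, target):
    """Map one scan + target to a (v, omega) command."""
    return np.concatenate([scan, target]) @ W

cmd = steer(X[0, :beams], X[0, beams:])
```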

Proceedings ArticleDOI
06 Mar 2017
TL;DR: This work presents a series of algorithms and an accompanying system that enables robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators, demonstrating applicability to a variety of robot controller types.
Abstract: Shared expectations and mutual understanding are critical facets of teamwork. Achieving these in human-robot collaborative contexts can be especially challenging, as humans and robots are unlikely to share a common language to convey intentions, plans, or justifications. Even in cases where human co-workers can inspect a robot's control code, and particularly when statistical methods are used to encode control policies, there is no guarantee that meaningful insights into a robot's behavior can be derived or that a human will be able to efficiently isolate the behaviors relevant to the interaction. We present a series of algorithms and an accompanying system that enables robots to autonomously synthesize policy descriptions and respond to both general and targeted queries by human collaborators. We demonstrate applicability to a variety of robot controller types including those that utilize conditional logic, tabular reinforcement learning, and deep reinforcement learning, synthesizing informative policy descriptions for collaborators and facilitating fault diagnosis by non-experts.

Proceedings ArticleDOI
01 May 2017
TL;DR: The authors decompose neural network policies into task-specific and robot-specific modules, where the task-specific modules are shared across robots and the robot-specific modules are shared across all tasks on that robot, and exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations.
Abstract: Reinforcement learning (RL) can automate a wide variety of robotic skills, but learning each new skill requires considerable real-world data collection and manual representation engineering to design policy classes or features. Using deep reinforcement learning to train general purpose neural network policies alleviates some of the burden of manual representation engineering by using expressive policy classes, but exacerbates the challenge of data collection, since such methods tend to be less efficient than RL with low-dimensional, hand-designed representations. Transfer learning can mitigate this problem by enabling us to transfer information from one skill to another and even from one robot to another. We show that neural network policies can be decomposed into “task-specific” and “robot-specific” modules, where the task-specific modules are shared across robots, and the robot-specific modules are shared across all tasks on that robot. This allows for sharing task information, such as perception, between robots and sharing robot information, such as dynamics and kinematics, between tasks. We exploit this decomposition to train mix-and-match modules that can solve new robot-task combinations that were not seen during training. Using a novel approach to train modular neural networks, we demonstrate the effectiveness of our transfer method for enabling zero-shot generalization with a variety of robots and tasks in simulation for both visual and non-visual tasks.
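The decomposition can be sketched as two dictionaries of modules joined through a shared interface vector. The module shapes, dimensions, and random weights below are our illustrative assumptions, not the paper's trained networks:

```python
import numpy as np

rng = np.random.default_rng(2)

def mlp(sizes):
    """A random-weight stand-in for a trained neural-network module."""
    Ws = [rng.normal(0, 0.1, (a, b)) for a, b in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for W in Ws:
            x = np.tanh(x @ W)
        return x
    return forward

# Task modules map task observations to a shared 8-dim interface vector;
# robot modules map that vector (plus robot state) to joint commands.
task_modules = {"reach": mlp([10, 32, 8]), "push": mlp([10, 32, 8])}
robot_modules = {"3link": mlp([8 + 6, 32, 3]), "4link": mlp([8 + 8, 32, 4])}

def policy(task, robot, task_obs, robot_state):
    z = task_modules[task](task_obs)            # task-specific part
    return robot_modules[robot](np.concatenate([z, robot_state]))

# Mix-and-match: a (task, robot) pair never trained together still
# composes into a runnable policy, which is the basis for the paper's
# zero-shot generalization result.
cmd = policy("push", "3link", rng.normal(size=10), rng.normal(size=6))
```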

Posted Content
TL;DR: The results show that, by making extensive use of off-policy data and replay, it is possible to find control policies that robustly grasp objects and stack them, and they hint that it may soon be feasible to train successful stacking policies by collecting interactions on real robots.
Abstract: Deep learning and reinforcement learning methods have recently been used to solve a variety of problems in continuous control domains. An obvious application of these techniques is dexterous manipulation tasks in robotics which are difficult to solve using traditional control theory or hand-engineered approaches. One example of such a task is to grasp an object and precisely stack it on another. Solving this difficult and practically relevant problem in the real world is an important long-term goal for the field of robotics. Here we take a step towards this goal by examining the problem in simulation and providing models and techniques aimed at solving it. We introduce two extensions to the Deep Deterministic Policy Gradient algorithm (DDPG), a model-free Q-learning based method, which make it significantly more data-efficient and scalable. Our results show that by making extensive use of off-policy data and replay, it is possible to find control policies that robustly grasp objects and stack them. Further, our results hint that it may soon be feasible to train successful stacking policies by collecting interactions on real robots.
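DDPG's data efficiency rests on off-policy experience replay: transitions from any past behavior are stored and resampled to form 1-step TD targets for the critic. A minimal sketch of that machinery follows; the buffer size, placeholder data, and the zero stand-in for the target critic are our assumptions, and the paper's two DDPG extensions are not reproduced here:

```python
import random
from collections import deque

import numpy as np

class ReplayBuffer:
    """Stores (s, a, r, s', done) transitions for off-policy reuse."""
    def __init__(self, capacity=100_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        batch = random.sample(self.buffer, batch_size)
        s, a, r, s2, d = map(np.array, zip(*batch))
        return s, a, r, s2, d

buf = ReplayBuffer()
for t in range(1000):   # placeholder interaction data
    buf.add(np.random.randn(4), np.random.randn(2),
            0.0, np.random.randn(4), False)
s, a, r, s2, d = buf.sample(64)

# Off-policy 1-step TD target, as used in the DDPG critic update:
#   y = r + gamma * Q_target(s2, pi_target(s2))
gamma = 0.99
q_next = np.zeros(64)   # stand-in for the target critic's estimate
y = r + gamma * (1.0 - d) * q_next
```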

Proceedings ArticleDOI
01 Sep 2017
TL;DR: In this paper, a successor-feature-based deep reinforcement learning algorithm is proposed to transfer navigation knowledge from previously mastered navigation tasks to new problem instances, which substantially decreases the required learning time after the first task instance has been solved, making it easily adaptable to changing environments.
Abstract: In this paper we consider the problem of robot navigation in simple maze-like environments where the robot has to rely on its onboard sensors to perform the navigation task. In particular, we are interested in solutions to this problem that do not require localization, mapping or planning. Additionally, we require that our solution can quickly adapt to new situations (e.g., changing navigation goals and environments). To meet these criteria we frame this problem as a sequence of related reinforcement learning tasks. We propose a successor-feature-based deep reinforcement learning algorithm that can learn to transfer knowledge from previously mastered navigation tasks to new problem instances. Our algorithm substantially decreases the required learning time after the first task instance has been solved, which makes it easily adaptable to changing environments. We validate our method in both simulated and real robot experiments with a Robotino and compare it to a set of baseline methods including classical planning-based navigation.
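Successor features factor the action-value function as Q(s, a) = ψ(s, a)·w, where ψ accumulates expected discounted state features under the policy and w encodes the task reward; transferring to a new navigation goal then only requires re-fitting the low-dimensional w. A sketch of that decomposition with illustrative sizes and random values (not the paper's learned quantities):

```python
import numpy as np

rng = np.random.default_rng(3)

# Successor-feature value decomposition: Q(s, a) = psi(s, a) . w.
# In practice psi is learned with deep RL; here it is random and the
# state/action/feature sizes are chosen for illustration.
n_states, n_actions, n_features = 20, 4, 8
psi = rng.normal(size=(n_states, n_actions, n_features))

w_task1 = rng.normal(size=n_features)   # reward weights, first task
Q1 = psi @ w_task1                      # values for task 1, shape (20, 4)

# Transfer to a new goal: psi is reused unchanged and only the
# low-dimensional reward vector w is re-fit, which is why adaptation
# after the first task is fast.
w_task2 = rng.normal(size=n_features)
Q2 = psi @ w_task2

greedy = Q2.argmax(axis=1)              # new policy without relearning psi
```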