Showing papers in "arXiv: Robotics in 2015"


Journal Article
TL;DR: A survival of the fittest strategy that selects the points and keyframes of the reconstruction leads to excellent robustness and generates a compact and trackable map that only grows if the scene content changes, allowing lifelong operation.
Abstract: This paper presents ORB-SLAM, a feature-based monocular SLAM system that operates in real time, in small and large, indoor and outdoor environments. The system is robust to severe motion clutter, allows wide baseline loop closing and relocalization, and includes full automatic initialization. Building on excellent algorithms of recent years, we designed from scratch a novel system that uses the same features for all SLAM tasks: tracking, mapping, relocalization, and loop closing. A survival of the fittest strategy that selects the points and keyframes of the reconstruction leads to excellent robustness and generates a compact and trackable map that only grows if the scene content changes, allowing lifelong operation. We present an exhaustive evaluation in 27 sequences from the most popular datasets. ORB-SLAM achieves unprecedented performance with respect to other state-of-the-art monocular SLAM approaches. For the benefit of the community, we make the source code public.

3,807 citations


Posted Content
TL;DR: It is shown how existing convolutional neural networks can be used to perform lane and vehicle detection while running at frame rates required for a real-time system, lending credence to the hypothesis that deep learning holds promise for autonomous driving.
Abstract: Numerous groups have applied a variety of deep learning techniques to computer vision problems in highway perception scenarios. In this paper, we present a number of empirical evaluations of recent deep learning advances. Computer vision, combined with deep learning, has the potential to bring about a relatively inexpensive, robust solution to autonomous driving. To prepare deep learning for industry uptake and practical applications, neural networks will require large data sets that represent all possible driving environments and scenarios. We collect a large data set of highway data and apply deep learning and computer vision algorithms to problems such as car and lane detection. We show how existing convolutional neural networks (CNNs) can be used to perform lane and vehicle detection while running at frame rates required for a real-time system. Our results lend credence to the hypothesis that deep learning holds promise for autonomous driving.

594 citations


Journal Article
TL;DR: In this article, a preintegrated IMU model is proposed for visual-inertial odometry (VIO) to achieve accurate state estimation in real-time, outperforming state-of-the-art approaches.
Abstract: Current approaches for visual-inertial odometry (VIO) are able to attain highly accurate state estimation via nonlinear optimization. However, real-time optimization quickly becomes infeasible as the trajectory grows over time; this problem is further emphasized by the fact that inertial measurements come at high rate, hence leading to fast growth of the number of variables in the optimization. In this paper, we address this issue by preintegrating inertial measurements between selected keyframes into single relative motion constraints. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group. We formally discuss the generative measurement model as well as the nature of the rotation noise and derive the expression for the maximum a posteriori state estimator. Our theoretical development enables the computation of all necessary Jacobians for the optimization and a-posteriori bias correction in analytic form. The second contribution is to show that the preintegrated IMU model can be seamlessly integrated into a visual-inertial pipeline under the unifying framework of factor graphs. This enables the application of incremental-smoothing algorithms and the use of a structureless model for visual measurements, which avoids optimizing over the 3D points, further accelerating the computation. We perform an extensive evaluation of our monocular VIO pipeline on real and simulated datasets. The results confirm that our modelling effort leads to accurate state estimation in real-time, outperforming state-of-the-art approaches.

370 citations


Posted Content
TL;DR: A tutorial of quaternion algebra, especially suited for the error-state Kalman filter, with the aim of building Visual-Inertial SLAM and odometry systems.
Abstract: A tutorial of quaternion algebra, especially suited for the error-state Kalman filter, with the aim of building Visual-Inertial SLAM and odometry systems.

269 citations


Journal Article
TL;DR: The Yale-CMU-Berkeley (YCB) Object and Model Set is a set of objects and models designed to cover a wide range of aspects of the manipulation problem, including objects of daily life with different shapes, sizes, textures, weight and rigidity.
Abstract: In this paper we present the Yale-CMU-Berkeley (YCB) Object and Model set, intended to be used to facilitate benchmarking in robotic manipulation, prosthetic design and rehabilitation research. The objects in the set are designed to cover a wide range of aspects of the manipulation problem; it includes objects of daily life with different shapes, sizes, textures, weight and rigidity, as well as some widely used manipulation tests. The associated database provides high-resolution RGBD scans, physical properties, and geometric models of the objects for easy incorporation into manipulation and planning software platforms. In addition to describing the objects and models in the set along with how they were chosen and derived, we provide a framework and a number of example task protocols, laying out how the set can be used to quantitatively evaluate a range of manipulation approaches including planning, learning, mechanical design, control, and many others. A comprehensive literature survey on existing benchmarks and object datasets is also presented and their scope and limitations are discussed. The set will be freely distributed to research groups worldwide at a series of tutorials at robotics conferences, and will be otherwise available at a reasonable purchase cost. It is our hope that the ready availability of this set along with the ground laid in terms of protocol templates will enable the community of manipulation researchers to more easily compare approaches as well as continually evolve benchmarking tests as the field matures.

220 citations


Posted Content
TL;DR: It is confirmed that networks trained for semantic place categorization also perform better at (specific) place recognition when faced with severe appearance changes and provide a reference for which networks and layers are optimal for different aspects of the place recognition problem.
Abstract: After the incredible success of deep learning in the computer vision domain, there has been much interest in applying Convolutional Network (ConvNet) features in robotic fields such as visual navigation and SLAM. Unfortunately, there are fundamental differences and challenges involved. Computer vision datasets are very different in character to robotic camera data, real-time performance is essential, and performance priorities can be different. This paper comprehensively evaluates and compares the utility of three state-of-the-art ConvNets on the problems of particular relevance to navigation for robots: viewpoint-invariance and condition-invariance, and for the first time enables real-time place recognition performance using ConvNets with large maps by integrating a variety of existing (locality-sensitive hashing) and novel (semantic search space partitioning) optimization techniques. We present extensive experiments on four real world datasets cultivated to evaluate each of the specific challenges in place recognition. The results demonstrate that speed-ups of two orders of magnitude can be achieved with minimal accuracy degradation, enabling real-time performance. We confirm that networks trained for semantic place categorization also perform better at (specific) place recognition when faced with severe appearance changes and provide a reference for which networks and layers are optimal for different aspects of the place recognition problem.

203 citations


Posted Content
TL;DR: The combination of ILP model based algorithms and the heuristics proves to be highly effective, allowing the computation of 1.x-optimal solutions for problems containing hundreds of robots, densely populated in the environment, often in just seconds.
Abstract: We study the problem of optimal multi-robot path planning on graphs (MPP) over four distinct minimization objectives: the makespan (last arrival time), the maximum (single-robot traveled) distance, the total arrival time, and the total distance. In a related paper, we show that these objectives are distinct and NP-hard to optimize. In this work, we focus on efficient algorithmic solutions for solving these optimal MPP problems. Toward this goal, we first establish a one-to-one solution mapping between MPP and network-flow. Based on this equivalence and integer linear programming (ILP), we design novel and complete algorithms for optimizing over each of the four objectives. In particular, our exact algorithm for computing optimal makespan solutions is the first capable of solving extremely challenging problems with robot-vertex ratio as high as 100%. Then, we further improve the computational performance of these exact algorithms through the introduction of principled heuristics, at the expense of some optimality loss. The combination of ILP model based algorithms and the heuristics proves to be highly effective, allowing the computation of 1.x-optimal solutions for problems containing hundreds of robots, densely populated in the environment, often in just seconds.

190 citations
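
As an illustration of the four objectives named in the abstract above (makespan, maximum distance, total arrival time, total distance), the following minimal Python sketch evaluates them for a given set of time-indexed paths. The function names, the path encoding (one vertex per time step, ending at the robot's goal), and the example vertex labels are assumptions made here for illustration; this is not the paper's ILP-based solver.

```python
from typing import Dict, Hashable, List

# Assumed encoding: path[t] is the vertex a robot occupies at time step t,
# and the last entry is the robot's goal vertex.
Path = List[Hashable]


def arrival_time(path: Path) -> int:
    """Earliest time step from which the robot stays at its final vertex."""
    t = len(path) - 1
    while t > 0 and path[t - 1] == path[-1]:
        t -= 1
    return t


def distance(path: Path) -> int:
    """Number of edge traversals (steps in which the vertex changes)."""
    return sum(1 for a, b in zip(path, path[1:]) if a != b)


def objectives(paths: Dict[str, Path]) -> Dict[str, int]:
    """Evaluate the four MPP objectives for one joint plan."""
    arrivals = [arrival_time(p) for p in paths.values()]
    distances = [distance(p) for p in paths.values()]
    return {
        "makespan": max(arrivals),             # last arrival time
        "max_distance": max(distances),        # longest single-robot distance
        "total_arrival_time": sum(arrivals),
        "total_distance": sum(distances),
    }


# Hypothetical two-robot plan: r1 moves at once, r2 waits and then moves.
example = {
    "r1": ["a", "b", "c", "c"],
    "r2": ["d", "d", "d", "e"],
}
print(objectives(example))
```

In this example r2 arrives last but travels the shorter distance, which shows how the objectives can rank the same plan differently.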


Posted Content
TL;DR: This work systematically analyzes possible cyber security attacks against Raven II, an advanced teleoperated robotic surgery system, and demonstrates the ability to maliciously control a wide range of robot functions, and to completely ignore or override command inputs from the surgeon.
Abstract: Teleoperated robots are playing an increasingly important role in military actions and medical services. In the future, remotely operated surgical robots will likely be used in more scenarios such as battlefields and emergency response. But rapidly growing applications of teleoperated surgery raise the question: what if the computer systems for these robots are attacked, taken over and even turned into weapons? Our work seeks to answer this question by systematically analyzing possible cyber security attacks against Raven II, an advanced teleoperated robotic surgery system. We identify a slew of possible cyber security threats, and experimentally evaluate their scopes and impacts. We demonstrate the ability to maliciously control a wide range of robot functions, and even to completely ignore or override command inputs from the surgeon. We further find that it is possible to abuse the robot's existing emergency stop (E-stop) mechanism to execute efficient (single packet) attacks. We then consider steps to mitigate these identified attacks, and experimentally evaluate the feasibility of applying the existing security solutions against these threats. The broader goal of our paper, however, is to raise awareness and increase understanding of these emerging threats. We anticipate that the majority of attacks against telerobotic surgery will also be relevant to other teleoperated robotic and co-robotic systems.

140 citations


Posted Content
TL;DR: This paper introduces a novel technique for fabricating functional robots using 3D printers: simultaneously depositing photopolymers and a non-curing liquid enables complex, pre-filled fluidic channels to be fabricated.
Abstract: This work introduces a novel technique for fabricating functional robots using 3D printers. Simultaneously depositing photopolymers and a non-curing liquid allows complex, pre-filled fluidic channels to be fabricated. This new printing capability enables complex hydraulically actuated robots and robotic components to be automatically built, with no assembly required. The technique is showcased by printing linear bellows actuators, gear pumps, soft grippers and a hexapod robot, using a commercially-available 3D printer. We detail the steps required to modify the printer and describe the design constraints imposed by this new fabrication approach.

136 citations


Posted Content
TL;DR: This paper addresses the inconsistency of the EKF-based SLAM algorithm that stems from non-observability of the origin and orientation of the global reference frame and proves on the non-linear two-dimensional problem with point landmarks that this type of inconsistency is remedied using the Invariant EKF.
Abstract: In this paper we address the inconsistency of the EKF-based SLAM algorithm that stems from non-observability of the origin and orientation of the global reference frame. We prove on the non-linear two-dimensional problem with point landmarks observed that this type of inconsistency is remedied using the Invariant EKF, a recently introduced variant of the EKF meant to account for the symmetries of the state space. Extensive Monte-Carlo runs illustrate the theoretical results.

105 citations


Posted Content
TL;DR: It is shown how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks.
Abstract: In this paper we focus on the challenging problem of place categorization and semantic mapping on a robot without environment-specific training. Motivated by their ongoing success in various visual recognition tasks, we build our system upon a state-of-the-art convolutional network. We overcome its closed-set limitations by complementing the network with a series of one-vs-all classifiers that can learn to recognize new semantic classes online. Prior domain knowledge is incorporated by embedding the classification system into a Bayesian filter framework that also ensures temporal coherence. We evaluate the classification accuracy of the system on a robot that maps a variety of places on our campus in real-time. We show how semantic information can boost robotic object detection performance and how the semantic map can be used to modulate the robot's behaviour during navigation tasks. The system is made available to the community as a ROS module.

Journal Article
TL;DR: This paper presents a formal hybrid dynamical system model that introduces suitably restricted compositions of these familiar abstractions with the guarantee of consistency analogous to global existence and uniqueness in classical dynamical systems.
Abstract: Rigid bodies, plastic impact, persistent contact, Coulomb friction, and massless limbs are ubiquitous simplifications introduced to reduce the complexity of mechanics models despite the obvious physical inaccuracies that each incurs individually. In concert, it is well known that the interaction of such idealized approximations can lead to conflicting and even paradoxical results. As robotics modeling moves from the consideration of isolated behaviors to the analysis of tasks requiring their composition, a mathematically tractable framework for building models that combine these simple approximations yet achieve reliable results is overdue. In this paper we present a formal hybrid dynamical system model that introduces suitably restricted compositions of these familiar abstractions with the guarantee of consistency analogous to global existence and uniqueness in classical dynamical systems. The hybrid system developed here provides a discontinuous but self-consistent approximation to the continuous (though possibly very stiff and fast) dynamics of a physical robot undergoing intermittent impacts. The modeling choices sacrifice some quantitative numerical efficiencies while maintaining qualitatively correct and analytically tractable results with consistency guarantees promoting their use in formal reasoning about mechanism, feedback control, and behavior design in robots that make and break contact with their environment.

Posted Content
TL;DR: In this article, the authors investigated the functional and social acceptance of a humanoid robot and found that participants conformed more to the robot's answers when their decisions were about functional issues than when they were about social issues.
Abstract: To investigate the functional and social acceptance of a humanoid robot, we carried out an experimental study with 56 adult participants and the iCub robot. Trust in the robot has been considered as a main indicator of acceptance in decision-making tasks characterized by perceptual uncertainty (e.g., evaluating the weight of two objects) and socio-cognitive uncertainty (e.g., evaluating which is the most suitable item in a specific context), and measured by the participants' conformation to the iCub's answers to specific questions. In particular, we were interested in understanding whether specific (i) user-related features (i.e. desire for control), (ii) robot-related features (i.e., attitude towards social influence of robots), and (iii) context-related features (i.e., collaborative vs. competitive scenario), may influence their trust towards the iCub robot. We found that participants conformed more to the iCub's answers when their decisions were about functional issues than when they were about social issues. Moreover, the few participants conforming to the iCub's answers for social issues also conformed less for functional issues. Trust in the robot's functional savvy does not thus seem to be a pre-requisite for trust in its social savvy. Finally, desire for control, attitude towards social influence of robots and type of interaction scenario did not influence the trust in iCub. Results are discussed with relation to methodology of HRI research.

Posted Content
TL;DR: In this paper, the influence of personality traits such as extroversion and negative attitude toward robots on speech and gaze during a cooperative task with a humanoid robot iCub was assessed.
Abstract: Estimating engagement is critical for human-robot interaction. Engagement measures typically rely on the dynamics of the social signals exchanged by the partners, especially speech and gaze. However, the dynamics of these signals are likely to be influenced by individual and social factors, such as personality traits, as it is well documented that they critically influence how two humans interact with each other. Here, we assess the influence of two factors, namely extroversion and negative attitude toward robots, on speech and gaze during a cooperative task, where a human must physically manipulate a robot to assemble an object. We evaluate whether the scores of extroversion and negative attitude towards robots covary with the duration and frequency of gaze and speech cues. The experiments were carried out with the humanoid robot iCub and N=56 adult participants. We found that the more extroverted people are, the more and longer they tend to talk with the robot; and the more people have a negative attitude towards robots, the less they will look at the robot face and the more they will look at the robot hands where the assembly and the contacts occur. Our results confirm and provide evidence that the engagement models classically used in human-robot interaction should take into account attitudes and personality traits.

Journal Article
TL;DR: This symmetric design has enabled the quadrotor to become a simple but powerful vertical takeoff and landing aerial platform popular among the robotics community.
Abstract: This paper describes the synthesis and evaluation of a novel state estimator for a Quadrotor Micro Aerial Vehicle. Dynamic equations which relate acceleration, attitude and the aerodynamic propeller drag are encapsulated in an extended Kalman filter framework for estimating the velocity and the attitude of the quadrotor. It is demonstrated that exploiting the relationship between the body frame accelerations and velocities, due to blade flapping, enables drift-free estimation of lateral and longitudinal components of body frame translational velocity along with improvements to roll and pitch components of body attitude estimations. Real world data sets gathered using a commercial off-the-shelf quadrotor platform, together with ground truth data from a Vicon system, are used to evaluate the effectiveness of the proposed algorithm.

Posted Content
Yinxiao Li, Yonghao Yue, Danfei Xu, Eitan Grinspun, Peter K. Allen
TL;DR: This work presents a novel solution to find an optimal trajectory to move the robotic arm to fold a garment, and demonstrates that the two-arm robot can follow the optimized trajectories, achieving accurate and efficient manipulations of deformable objects.
Abstract: Robotic manipulation of deformable objects remains a challenging task. One such task is folding a garment autonomously. Given start and end folding positions, what is an optimal trajectory to move the robotic arm to fold a garment? Certain trajectories will cause the garment to move, creating wrinkles and gaps; other trajectories will fail altogether. We present a novel solution to find an optimal trajectory that avoids such problematic scenarios. The trajectory is optimized by minimizing a quadratic objective function in an off-line simulator, which includes material properties of the garment and frictional force on the table. The function measures the dissimilarity between a user folded shape and the folded garment in simulation, which is then used as an error measurement to create an optimal trajectory. We demonstrate that our two-arm robot can follow the optimized trajectories, achieving accurate and efficient manipulations of deformable objects.

Posted Content
TL;DR: In this article, it was shown that for a certain selection of tuning parameters and deterministic low-dispersion sampling sequences, PRM is deterministically asymptotically optimal.
Abstract: Probabilistic sampling-based algorithms, such as the probabilistic roadmap (PRM) and the rapidly-exploring random tree (RRT) algorithms, represent one of the most successful approaches to robotic motion planning, due to their strong theoretical properties (in terms of probabilistic completeness or even asymptotic optimality) and remarkable practical performance. Such algorithms are probabilistic in that they compute a path by connecting independently and identically distributed random points in the configuration space. Their randomization aspect, however, makes several tasks challenging, including certification for safety-critical applications and use of offline computation to improve real-time execution. Hence, an important open question is whether similar (or better) theoretical guarantees and practical performance could be obtained by considering deterministic, as opposed to random sampling sequences. The objective of this paper is to provide a rigorous answer to this question. Specifically, we first show that PRM, for a certain selection of tuning parameters and deterministic low-dispersion sampling sequences, is deterministically asymptotically optimal. Second, we characterize the convergence rate, and we find that the factor of sub-optimality can be very explicitly upper-bounded in terms of the l2-dispersion of the sampling sequence and the connection radius of PRM. Third, we show that an asymptotically optimal version of PRM exists with computational and space complexity arbitrarily close to O(n) (the theoretical lower bound), where n is the number of points in the sequence. This is in stark contrast to the O(n log n) complexity results for existing asymptotically-optimal probabilistic planners. Finally, through numerical experiments, we show that planning with deterministic low-dispersion sampling generally provides superior performance in terms of path cost and success rate.
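
To make the idea of replacing random samples with a deterministic sequence concrete, here is a minimal Python sketch that generates points from the Halton sequence and connects those within a fixed radius, as a PRM-style roadmap would. The Halton sequence is used as an assumed stand-in (the abstract does not name a specific sequence), the radius value is arbitrary, obstacle checking is omitted, and all function names are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, ...]


def van_der_corput(index: int, base: int) -> float:
    """Radical inverse of `index` in the given base, a value in [0, 1)."""
    result, denom = 0.0, 1.0
    while index > 0:
        index, digit = divmod(index, base)
        denom *= base
        result += digit / denom
    return result


def halton(n: int, bases: Tuple[int, ...] = (2, 3)) -> List[Point]:
    """First n points of the Halton sequence in the unit cube."""
    return [tuple(van_der_corput(i, b) for b in bases) for i in range(1, n + 1)]


def roadmap_edges(points: List[Point], radius: float) -> List[Tuple[int, int]]:
    """Index pairs of samples within `radius` of each other (no collision checks)."""
    return [
        (i, j)
        for i in range(len(points))
        for j in range(i + 1, len(points))
        if math.dist(points[i], points[j]) <= radius
    ]


samples = halton(16)                      # identical on every run
edges = roadmap_edges(samples, radius=0.35)
print(samples[:3], len(edges))
```

Because the samples are the same on every run, the resulting roadmap can be computed offline and reused, which is one of the practical motivations the abstract gives for deterministic sampling.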

Posted Content
TL;DR: The first fully functional vehicle platform operating in air and underwater with seamless transition between both mediums is presented, built on a bio-inspired locomotion force analysis that combines flight and swimming.
Abstract: Bio-inspired vehicles are currently leading the way in the quest to produce a vehicle capable of flight and underwater navigation. However, a fully functional vehicle has not yet been realized. We present the first fully functional vehicle platform operating in air and underwater with seamless transition between both mediums. These unique capabilities, combined with the hovering, high maneuverability and reliability of multirotor vehicles, result in a disruptive technology for both civil and military applications including air/water search and rescue, inspection, repairs and survey missions among others. The invention was built on a bio-inspired locomotion force analysis that combines flight and swimming. Three main advances in the present work have enabled this invention. The first is the discovery of a seamless transition method between air and underwater. The second is the design of a multi-medium propulsion system capable of efficient operation in air and underwater. The third combines the requirements for lift and thrust for flight (for a given weight) and the requirements for thrust and neutral buoyancy (in water) for swimming. The result is a careful balance between lift, thrust, weight, and neutral buoyancy implemented in the vehicle design. A fully operational prototype demonstrated the flight and underwater navigation capabilities as well as the rapid air/water and water/air transition.

Posted Content
TL;DR: This paper presents a preintegration theory that properly addresses the manifold structure of the rotation group, and shows that the preintegrated IMU model can be seamlessly integrated into a visual-inertial pipeline under the unifying framework of factor graphs.
Abstract: Current approaches for visual-inertial navigation (VIN) are able to attain highly accurate state estimation via nonlinear optimization. However, real-time optimization quickly becomes infeasible as the trajectory grows over time; this problem is further emphasized by the fact that inertial measurements come at high rate, hence leading to fast growth of the number of variables in the optimization. In this paper, we address this issue by preintegrating inertial measurements between selected keyframes into single relative motion constraints. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group. We formally discuss the generative measurement model as well as the nature of the rotation noise and derive the expression for the maximum a posteriori state estimator. Our theoretical development enables the computation of all necessary Jacobians for the optimization and a-posteriori bias correction in analytic form. The second contribution is to show that the preintegrated IMU model can be seamlessly integrated into a visual-inertial pipeline under the unifying framework of factor graphs. This enables the application of incremental-smoothing algorithms and the use of a structureless model for visual measurements, which avoids optimizing over the 3D points, further accelerating the computation. We perform an extensive evaluation of our monocular VIN pipeline on real and simulated datasets. The results confirm that our modelling effort leads to accurate state estimation in real-time, outperforming state-of-the-art approaches.

Posted Content
TL;DR: In this article, the authors formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory.
Abstract: There is a large variety of objects and appliances in human environments, such as stoves, coffee dispensers, juice extractors, and so on. It is challenging for a roboticist to program a robot for each of these object types and for each of their instantiations. In this work, we present a novel approach to manipulation planning based on the idea that many household objects share similarly-operated object parts. We formulate the manipulation planning as a structured prediction problem and design a deep learning model that can handle large noise in the manipulation demonstrations and learns features from three different modalities: point-clouds, language and trajectory. In order to collect a large number of manipulation demonstrations for different objects, we developed a new crowd-sourcing platform called Robobarista. We test our model on our dataset consisting of 116 objects with 249 parts along with 250 language instructions, for which there are 1225 crowd-sourced manipulation demonstrations. We further show that our robot can even manipulate objects it has never seen before.

Proceedings Article
TL;DR: In this article, a closed-loop interactive approach is proposed that performs incremental reconstruction in real-time and gives users online feedback on a surface mesh about quality parameters such as ground sampling distance (GSD) and image redundancy.
Abstract: Automatic reconstruction of 3D models from images using multi-view Structure-from-Motion (SfM) methods has been one of the most fruitful outcomes of computer vision. These advances, combined with the growing popularity of Micro Aerial Vehicles as an autonomous imaging platform, have made 3D vision tools ubiquitous for a large number of Architecture, Engineering and Construction applications among audiences mostly unskilled in computer vision. However, to obtain high-resolution and accurate reconstructions from a large-scale object using SfM, there are many critical constraints on the quality of image data, which often become sources of inaccuracy as the current 3D reconstruction pipelines do not help users determine the fidelity of input data during image acquisition. In this paper, we present and advocate a closed-loop interactive approach that performs incremental reconstruction in real-time and gives users online feedback about quality parameters such as Ground Sampling Distance (GSD), image redundancy, etc., on a surface mesh. We also propose a novel multi-scale camera network design to prevent scene drift caused by incremental map building, and release the first multi-scale image sequence dataset as a benchmark. Further, we evaluate our system on real outdoor scenes, and show that our interactive pipeline combined with a multi-scale camera network approach provides compelling accuracy in multi-view reconstruction tasks when compared against the state-of-the-art methods.

Posted Content
TL;DR: The problem of shared autonomy is formulated as a Partially Observable Markov Decision Process with uncertainty over the user's goal, and maximum entropy inverse optimal control is utilized to estimate a distribution over the user's goal based on the history of inputs.
Abstract: In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal, and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the history of inputs. Ideally, the robot assists the user by solving for an action which minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while utilizing less input. However, when asked to rate each system, users were mixed in their assessment, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
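
The action-selection idea in the abstract above (choose the action that minimizes the expected cost-to-go under the current distribution over goals) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' implementation: the goal belief is supplied directly rather than estimated with maximum entropy inverse optimal control, the cost-to-go for each goal is assumed to be Euclidean distance, every action is assumed to have the same step cost, and all names are hypothetical.

```python
import math
from typing import Dict, Tuple

Vec = Tuple[float, float]


def expected_cost_to_go(state: Vec, belief: Dict[Vec, float]) -> float:
    """Belief-weighted cost-to-go; assumed to be Euclidean distance to each goal."""
    return sum(p * math.dist(state, goal) for goal, p in belief.items())


def assist_action(
    state: Vec,
    belief: Dict[Vec, float],
    actions: Dict[str, Vec],
    step: float = 0.5,
) -> str:
    """Name of the action whose successor state has the lowest expected cost-to-go."""

    def score(name: str) -> float:
        dx, dy = actions[name]
        successor = (state[0] + step * dx, state[1] + step * dy)
        return expected_cost_to_go(successor, belief)

    return min(actions, key=score)


# Hypothetical setup: two candidate goals, with the belief favoring the right one.
belief = {(1.0, 0.0): 0.9, (-1.0, 0.0): 0.1}
actions = {"right": (1.0, 0.0), "left": (-1.0, 0.0), "up": (0.0, 1.0)}
print(assist_action((0.0, 0.0), belief, actions))
```

Averaging per-goal costs in this way lets the robot start helping before the goal is known with certainty, in contrast to first committing to a single predicted goal.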

Posted Content
TL;DR: In this paper, a policy search method is used to learn a set of trajectories for the desired motion skill by using iteratively refitted time-varying linear models, and then unifies these trajectories into a single control policy that can generalize to new situations.
Abstract: Autonomous learning of object manipulation skills can enable robots to acquire rich behavioral repertoires that scale to the variety of objects found in the real world. However, current motion skill learning methods typically restrict the behavior to a compact, low-dimensional representation, limiting its expressiveness and generality. In this paper, we extend a recently developed policy search method \cite{la-lnnpg-14} and use it to learn a range of dynamic manipulation behaviors with highly general policy representations, without using known models or example demonstrations. Our approach learns a set of trajectories for the desired motion skill by using iteratively refitted time-varying linear models, and then unifies these trajectories into a single control policy that can generalize to new situations. To enable this method to run on a real robot, we introduce several improvements that reduce the sample count and automate parameter selection. We show that our method can acquire fast, fluent behaviors after only minutes of interaction time, and can learn robust controllers for complex tasks, including putting together a toy airplane, stacking tight-fitting lego blocks, placing wooden rings onto tight-fitting pegs, inserting a shoe tree into a shoe, and screwing bottle caps onto bottles.

Journal ArticleDOI
TL;DR: In this article, the authors propose a model-free AP framework using Genetic Programming (GP) to derive an optimal Behavior Tree (BT) for an autonomous agent to achieve a given goal in unknown (but fully observable) environments.
Abstract: Definition of an accurate system model for an Automated Planner (AP) is often impractical, especially for real-world problems. Conversely, off-the-shelf planners fail to scale up and are domain dependent. These drawbacks are inherited from conventional transition systems such as Finite State Machines (FSMs) that describe the action-plan execution generated by the AP. On the other hand, Behavior Trees (BTs) represent a valid alternative to FSMs, presenting many advantages in terms of modularity, reactiveness, scalability and domain-independence. In this paper, we propose a model-free AP framework using Genetic Programming (GP) to derive an optimal BT for an autonomous agent to achieve a given goal in unknown (but fully observable) environments. We illustrate the proposed framework with experiments on the open-source Mario AI benchmark, automatically generating BTs that control the game character Mario to complete a level at various levels of difficulty, which include enemies and obstacles.

Posted Content
TL;DR: In this article, the Lagrangian duality is used to assess the quality of a given candidate solution and provide a certificate of optimality for a candidate SLAM solution, and when the duality gap is zero, one can compute a guaranteed optimal solution from the dual problem.
Abstract: State-of-the-art techniques for simultaneous localization and mapping (SLAM) employ iterative nonlinear optimization methods to compute an estimate for robot poses. While these techniques often work well in practice, they do not provide guarantees on the quality of the estimate. This paper shows that Lagrangian duality is a powerful tool to assess the quality of a given candidate solution. Our contribution is threefold. First, we discuss a revised formulation of the SLAM inference problem. We show that this formulation is probabilistically grounded and has the advantage of leading to an optimization problem with quadratic objective. The second contribution is the derivation of the corresponding Lagrangian dual problem. The SLAM dual problem is a (convex) semidefinite program, which can be solved reliably and globally by off-the-shelf solvers. The third contribution is to discuss the relation between the original SLAM problem and its dual. We show that from the dual problem, one can evaluate the quality (i.e., the suboptimality gap) of a candidate SLAM solution, and ultimately provide a certificate of optimality. Moreover, when the duality gap is zero, one can compute a guaranteed optimal SLAM solution from the dual problem, circumventing non-convex optimization. We present extensive (real and simulated) experiments supporting our claims and discuss practical relevance and open problems.
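
The certificate idea in this abstract rests on weak duality, which can be written out in a few lines. The symbols below ($f$ for the SLAM cost, $\hat{x}$ for a candidate estimate, $f^\star$ and $d^\star$ for the primal and dual optimal values) are notation assumed for this note, not taken from the paper.

```latex
% Weak duality: the dual optimum lower-bounds the primal optimum,
% and any candidate solution upper-bounds it.
\[
  d^\star \;\le\; f^\star \;\le\; f(\hat{x})
\]
% Hence the unknown suboptimality of the candidate is bounded by a
% quantity that can be computed once the dual problem is solved:
\[
  0 \;\le\; f(\hat{x}) - f^\star \;\le\; f(\hat{x}) - d^\star
\]
% If f(\hat{x}) = d^\star, the bound is zero and \hat{x} is certified
% globally optimal.
```

Because the dual is a convex semidefinite program, $d^\star$ can be computed reliably even though the original problem is non-convex, which is what makes the bound usable as a check on a candidate estimate.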

Posted Content
TL;DR: This paper proposes a new approach to detecting grasp points on novel objects presented in clutter by using knowledge of the geometry of a good grasp to improve detection and generates a large automatically labeled training set that gives high classification accuracy.
Abstract: This paper proposes a new approach to detecting grasp points on novel objects presented in clutter. The input to our algorithm is a point cloud and the geometric parameters of the robot hand. The output is a set of hand configurations that are expected to be good grasps. Our key idea is to use knowledge of the geometry of a good grasp to improve detection. First, we use a geometrically necessary condition to sample a large set of high-quality grasp hypotheses. We were surprised to find that using simple geometric conditions for detection can result in a relatively high grasp success rate. Second, we use the notion of an antipodal grasp (a standard characterization of a good two-fingered grasp) to help us classify these grasp hypotheses. In particular, we generate a large automatically labeled training set that gives us high classification accuracy. Overall, our method achieves an average grasp success rate of 88% when grasping novel objects presented in isolation and an average success rate of 73% when grasping novel objects presented in dense clutter. This system is available as a ROS package at this http URL
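
The antipodal condition mentioned in this abstract has a compact textbook form: the line joining the two contacts must lie inside both friction cones. The sketch below tests that condition for a single contact pair. It is an illustration only, not the paper's detection or labeling code: the function name, the outward-normal convention, and the Coulomb friction coefficient `mu` are assumptions of this example.

```python
import math


def _angle_between(u, v):
    """Angle in radians between two non-zero 3D vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, dot / (norm_u * norm_v))))


def is_antipodal(p1, n1, p2, n2, mu):
    """Check the two-finger antipodal condition for contacts p1 and p2.

    n1 and n2 are OUTWARD surface normals (an assumed convention).
    The grasp is antipodal when the segment between the contacts lies
    within both friction cones, whose half-angle is atan(mu).
    """
    if tuple(p1) == tuple(p2):
        return False  # coincident contacts do not define a grasp axis
    half_angle = math.atan(mu)
    axis = tuple(b - a for a, b in zip(p1, p2))  # direction p1 -> p2
    reverse_axis = tuple(-c for c in axis)       # direction p2 -> p1
    inward1 = tuple(-c for c in n1)
    inward2 = tuple(-c for c in n2)
    return (_angle_between(axis, inward1) <= half_angle
            and _angle_between(reverse_axis, inward2) <= half_angle)


if __name__ == "__main__":
    # Opposite faces of a box, gripped along the x-axis.
    print(is_antipodal((-1, 0, 0), (-1, 0, 0), (1, 0, 0), (1, 0, 0), mu=0.5))
```

A pair of contacts on opposite, parallel faces passes the test, while a pair whose second normal is perpendicular to the grasp axis fails it for any moderate friction coefficient.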

Posted Content
TL;DR: A novel approach that instead uses geo-tagged panoramas from the Google Street View as a source of global positioning and models the problem of localization as a non-linear least squares estimation in two phases.
Abstract: Accurate metrical localization is one of the central challenges in mobile robotics. Many existing methods aim at localizing after building a map with the robot. In this paper, we present a novel approach that instead uses geotagged panoramas from the Google Street View as a source of global positioning. We model the problem of localization as a non-linear least squares estimation in two phases. The first estimates the 3D position of tracked feature points from short monocular camera sequences. The second computes the rigid body transformation between the Street View panoramas and the estimated points. The only input of this approach is a stream of monocular camera images and odometry estimates. We quantified the accuracy of the method by running the approach on a robotic platform in a parking lot by using visual fiducials as ground truth. Additionally, we applied the approach in the context of personal localization in a real urban scenario by using data from a Google Tango tablet.

Posted Content
TL;DR: In this article, a method for checking and enforcing multi-contact stability based on the zero-tilting moment point (ZMP) is proposed, which is a generalization of ZMP support areas to take into account frictional constraints and multiple non-coplanar contacts.
Abstract: We propose a method for checking and enforcing multi-contact stability based on the Zero-tilting Moment Point (ZMP). The key to our development is the generalization of ZMP support areas to take into account (a) frictional constraints and (b) multiple non-coplanar contacts. We introduce and investigate two kinds of ZMP support areas. First, we characterize and provide a fast geometric construction for the support area generated by valid contact forces, with no other constraint on the robot motion. We call this set the full support area. Next, we consider the control of humanoid robots using the Linear Pendulum Mode (LPM). We observe that the constraints stemming from the LPM induce a shrinking of the support area, even for walking on horizontal floors. We propose an algorithm to compute the new area, which we call pendular support area. We show that, in the LPM, having the ZMP in the pendular support area is a necessary and sufficient condition for contact stability. Based on these developments, we implement a whole-body controller and generate feasible multi-contact motions where an HRP-4 humanoid locomotes in challenging multi-contact scenarios.

Posted Content
TL;DR: In this paper, a shared-control assistance framework is proposed to balance the operator's capabilities and feelings of comfort and control while compensating for a task's difficulty, and experimental results demonstrate that shared assistance mitigates perceived user difficulty and even enables successful performance on previously infeasible tasks.
Abstract: Robot teleoperation systems face a common set of challenges including latency, low-dimensional user commands, and asymmetric control inputs. User control with Brain-Computer Interfaces (BCIs) exacerbates these problems through especially noisy and erratic low-dimensional motion commands due to the difficulty in decoding neural activity. We introduce a general framework to address these challenges through a combination of computer vision, user intent inference, and arbitration between the human input and autonomous control schemes. Adjustable levels of assistance allow the system to balance the operator's capabilities and feelings of comfort and control while compensating for a task's difficulty. We present experimental results demonstrating significant performance improvement using the shared-control assistance framework on adapted rehabilitation benchmarks with two subjects implanted with intracortical brain-computer interfaces controlling a seven degree-of-freedom robotic manipulator as a prosthetic. Our results further indicate that shared assistance mitigates perceived user difficulty and even enables successful performance on previously infeasible tasks. We showcase the extensibility of our architecture with applications to quality-of-life tasks such as opening a door, pouring liquids from containers, and manipulation with novel objects in densely cluttered environments.

Posted Content
TL;DR: In this paper, the authors describe a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion.
Abstract: Many functional elements of human homes and workplaces consist of rigid components which are connected through one or more sliding or rotating linkages. Examples include doors and drawers of cabinets and appliances; laptops; and swivel office chairs. A robotic mobile manipulator would benefit from the ability to acquire kinematic models of such objects from observation. This paper describes a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion. We envision that in the future, a machine newly introduced to an environment could be shown by its human user the articulated objects particular to that environment, inferring from these "visual demonstrations" enough information to actuate each object independently of the user. Our method employs sparse (markerless) feature tracking, motion segmentation, component pose estimation, and articulation learning; it does not require prior object models. Using the method, a robot can observe an object being exercised, infer a kinematic model incorporating rigid, prismatic and revolute joints, then use the model to predict the object's motion from a novel vantage point. We evaluate the method's performance, and compare it to that of a previously published technique, for a variety of household objects.