
Showing papers by "Willow Garage" published in 2011


Proceedings ArticleDOI
06 Nov 2011
TL;DR: This paper proposes a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise, and demonstrates through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations.
Abstract: Feature matching is at the base of many computer vision problems, such as object recognition or structure from motion. Current methods rely on costly descriptors for detection and matching. In this paper, we propose a very fast binary descriptor based on BRIEF, called ORB, which is rotation invariant and resistant to noise. We demonstrate through experiments that ORB is two orders of magnitude faster than SIFT, while performing as well in many situations. The efficiency is tested on several real-world applications, including object detection and patch-tracking on a smart phone.

8,702 citations
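
Because ORB descriptors are binary strings, two of them can be compared with a Hamming distance (XOR followed by a bit count) instead of a floating-point norm. The following is a minimal sketch of brute-force matching under that idea, not the paper's implementation: the function names, the `max_distance` threshold, and the 8-bit toy descriptors are assumptions made for illustration.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors stored as ints."""
    return bin(a ^ b).count("1")


def match_descriptors(query, train, max_distance=64):
    """Brute-force nearest-neighbour matching under Hamming distance.

    Returns a list of (query_index, train_index, distance) tuples.
    `max_distance` is a hypothetical rejection threshold, not a value
    taken from the paper. `train` must not be empty.
    """
    matches = []
    for qi, q in enumerate(query):
        # Find the train descriptor with the fewest differing bits.
        best_ti, best_d = min(
            ((ti, hamming(q, t)) for ti, t in enumerate(train)),
            key=lambda pair: pair[1],
        )
        if best_d <= max_distance:
            matches.append((qi, best_ti, best_d))
    return matches


# Toy 8-bit descriptors; real binary descriptors are much longer.
query = [0b11110000, 0b00001111]
train = [0b00001110, 0b11110001]
matches = match_descriptors(query, train)
```

The XOR-and-count comparison touches only integer operations, which is one reason binary descriptors are cheap to match.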


Proceedings ArticleDOI
09 May 2011
TL;DR: PCL (Point Cloud Library) is presented, an advanced and extensive approach to the subject of 3D perception that contains state-of-the-art algorithms for: filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation.
Abstract: With the advent of new, low-cost 3D sensing hardware such as the Kinect, and continued efforts in advanced point cloud processing, 3D perception gains more and more importance in robotics, as well as other fields. In this paper we present one of our most recent initiatives in the areas of point cloud perception: PCL (Point Cloud Library - http://pointclouds.org). PCL presents an advanced and extensive approach to the subject of 3D perception, and it's meant to provide support for all the common 3D building blocks that applications need. The library contains state-of-the-art algorithms for: filtering, feature estimation, surface reconstruction, registration, model fitting and segmentation. PCL is supported by an international community of robotics and perception researchers. We provide a brief walkthrough of PCL including its algorithmic capabilities and implementation strategies.

4,501 citations
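
To make the "filtering" building block concrete, here is a small sketch of voxel-grid downsampling, a common point cloud filter. It is plain Python and does not use or reproduce PCL's C++ API; the function name `voxel_downsample` and the `leaf_size` parameter are assumptions chosen for this example.

```python
import math
from collections import defaultdict


def voxel_downsample(points, leaf_size):
    """Replace all points that fall in the same cubic voxel by their centroid.

    `points` is an iterable of (x, y, z) tuples; `leaf_size` is the voxel
    edge length in the same units as the coordinates.
    """
    buckets = defaultdict(list)
    for p in points:
        # Integer voxel index along each axis.
        key = tuple(math.floor(c / leaf_size) for c in p)
        buckets[key].append(p)
    # One centroid per occupied voxel.
    return [
        tuple(sum(axis) / len(pts) for axis in zip(*pts))
        for pts in buckets.values()
    ]


cloud = [(0.1, 0.1, 0.1), (0.3, 0.3, 0.3), (1.5, 1.5, 1.5)]
reduced = voxel_downsample(cloud, leaf_size=1.0)
```

The first two points share a voxel and collapse into one centroid, so the three-point cloud shrinks to two points.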


Proceedings ArticleDOI
09 May 2011
TL;DR: g2o, an open-source C++ framework for optimizing graph-based nonlinear error functions, is presented, and it is demonstrated that, while being general, g2o offers a performance comparable to implementations of state-of-the-art approaches for the specific problems.
Abstract: Many popular problems in robotics and computer vision including various types of simultaneous localization and mapping (SLAM) or bundle adjustment (BA) can be phrased as least squares optimization of an error function that can be represented by a graph. This paper describes the general structure of such problems and presents g2o, an open-source C++ framework for optimizing graph-based nonlinear error functions. Our system has been designed to be easily extensible to a wide range of problems and a new problem typically can be specified in a few lines of code. The current implementation provides solutions to several variants of SLAM and BA. We provide evaluations on a wide range of real-world and simulated datasets. The results demonstrate that, while being general, g2o offers a performance comparable to implementations of state-of-the-art approaches for the specific problems.

2,192 citations
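
The idea of a least squares problem represented by a graph can be illustrated with a toy one-dimensional pose graph: nodes are scalar poses, and each edge carries a measured offset between two poses. The sketch below runs Gauss-Newton on that graph in plain Python. It is an illustration only and does not reproduce g2o's C++ API; the edge tuple layout `(i, j, measurement, weight)`, the function names, and the choice to fix pose 0 at the origin are assumptions.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[k]] for k, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (M[r][n] - s) / M[r][r]
    return x


def optimize_pose_graph(num_poses, edges, iterations=3):
    """Gauss-Newton on a 1D pose graph; pose 0 stays fixed at 0."""
    x = [0.0] * num_poses
    n = num_poses - 1  # number of free poses
    for _ in range(iterations):
        H = [[0.0] * n for _ in range(n)]  # approximate Hessian J^T W J
        g = [0.0] * n                      # gradient J^T W e
        for i, j, z, w in edges:
            e = (x[j] - x[i]) - z          # edge error
            jac = ((i, -1.0), (j, 1.0))    # d e / d x_i, d e / d x_j
            for a, ja in jac:
                if a == 0:
                    continue               # fixed pose has no column
                g[a - 1] += ja * w * e
                for c, jc in jac:
                    if c != 0:
                        H[a - 1][c - 1] += ja * w * jc
        dx = solve_linear(H, [-v for v in g])
        for k in range(n):
            x[k + 1] += dx[k]
    return x


# Two odometry edges and one slightly inconsistent loop closure.
edges = [(0, 1, 1.0, 1.0), (1, 2, 1.0, 1.0), (0, 2, 1.8, 1.0)]
poses = optimize_pose_graph(3, edges)
```

Because the toy errors are linear in the poses, the first iteration already reaches the optimum; the loop is kept to mirror how the method is applied to nonlinear problems.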


Proceedings ArticleDOI
09 May 2011
TL;DR: It is experimentally shown that the stochastic nature of STOMP allows it to overcome local minima that gradient-based methods like CHOMP can get stuck in.
Abstract: We present a new approach to motion planning using a stochastic trajectory optimization framework. The approach relies on generating noisy trajectories to explore the space around an initial (possibly infeasible) trajectory, which are then combined to produce an updated trajectory with lower cost. A cost function based on a combination of obstacle and smoothness cost is optimized in each iteration. No gradient information is required for the particular optimization algorithm that we use and so general costs for which derivatives may not be available (e.g. costs corresponding to constraints and motor torques) can be included in the cost function. We demonstrate the approach both in simulation and on a mobile manipulation system for unconstrained and constrained tasks. We experimentally show that the stochastic nature of STOMP allows it to overcome local minima that gradient-based methods like CHOMP can get stuck in.

817 citations
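
The loop described in the abstract (perturb the current trajectory with noise, score each noisy rollout, and combine the rollouts with cost-dependent weights) can be sketched in a few lines. This is a simplified illustration and not the STOMP algorithm itself: it weights whole rollouts rather than individual time steps, uses independent Gaussian noise, and keeps the best trajectory seen so far. The cost terms, the circular obstacle, and all parameter values are assumptions.

```python
import math
import random


def path_cost(y, obstacle_x=0.5, radius=0.2):
    """Obstacle penalty plus smoothness cost for waypoints (k/(n-1), y[k])."""
    n = len(y)
    cost = 0.0
    for k in range(n):
        dist = math.hypot(k / (n - 1) - obstacle_x, y[k])
        cost += 100.0 * max(0.0, radius - dist) ** 2  # penetration penalty
    for k in range(n - 1):
        cost += (y[k + 1] - y[k]) ** 2                # smoothness term
    return cost


def stochastic_optimize(n=21, rollouts=20, iterations=50, sigma=0.05, seed=0):
    rng = random.Random(seed)
    y = [0.0] * n  # straight line through the obstacle (infeasible start)
    best, best_cost = y[:], path_cost(y)
    for _ in range(iterations):
        noises, costs = [], []
        for _ in range(rollouts):
            # Endpoints are never perturbed, so start and goal stay fixed.
            eps = [0.0] + [rng.gauss(0.0, sigma) for _ in range(n - 2)] + [0.0]
            noises.append(eps)
            costs.append(path_cost([a + b for a, b in zip(y, eps)]))
        lo, hi = min(costs), max(costs)
        # Lower-cost rollouts receive exponentially larger weights.
        weights = [math.exp(-10.0 * (c - lo) / (hi - lo + 1e-12)) for c in costs]
        total = sum(weights)
        y = [
            y[k] + sum(w * e[k] for w, e in zip(weights, noises)) / total
            for k in range(n)
        ]
        c = path_cost(y)
        if c < best_cost:
            best, best_cost = y[:], c
    return best, best_cost


initial_cost = path_cost([0.0] * 21)
best_path, final_cost = stochastic_optimize()
```

Note that no derivative of `path_cost` is ever evaluated, which is the property the abstract highlights for costs without available gradients.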


Journal ArticleDOI
TL;DR: A novel robotic grasp controller that allows a sensorized parallel jaw gripper to gently pick up and set down unknown objects once a grasp location has been selected, inspired by the control scheme that humans employ for such actions.
Abstract: We present a novel robotic grasp controller that allows a sensorized parallel jaw gripper to gently pick up and set down unknown objects once a grasp location has been selected. Our approach is inspired by the control scheme that humans employ for such actions, which is known to centrally depend on tactile sensation rather than vision or proprioception. Our controller processes measurements from the gripper's fingertip pressure arrays and hand-mounted accelerometer in real time to generate robotic tactile signals that are designed to mimic human SA-I, FA-I, and FA-II channels. These signals are combined into tactile event cues that drive the transitions between six discrete states in the grasp controller: Close, Load, Lift and Hold, Replace, Unload, and Open. The controller selects an appropriate initial grasping force, detects when an object is slipping from the grasp, increases the grasp force as needed, and judges when to release an object to set it down. We demonstrate the promise of our approach through implementation on the PR2 robotic platform, including grasp testing on a large number of real-world objects.

381 citations
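
The six discrete states named in the abstract lend themselves to a small finite state machine driven by tactile event cues. The sketch below shows one possible shape for such a machine. The state names come from the abstract, but the event names and the strictly sequential transition table are hypothetical and are not taken from the paper.

```python
from enum import Enum


class GraspState(Enum):
    CLOSE = "Close"
    LOAD = "Load"
    LIFT_AND_HOLD = "Lift and Hold"
    REPLACE = "Replace"
    UNLOAD = "Unload"
    OPEN = "Open"


# Hypothetical tactile event cues; the paper's actual cues are not listed here.
TRANSITIONS = {
    (GraspState.CLOSE, "contact_detected"): GraspState.LOAD,
    (GraspState.LOAD, "grip_force_reached"): GraspState.LIFT_AND_HOLD,
    (GraspState.LIFT_AND_HOLD, "place_requested"): GraspState.REPLACE,
    (GraspState.REPLACE, "table_contact_detected"): GraspState.UNLOAD,
    (GraspState.UNLOAD, "grip_force_released"): GraspState.OPEN,
}


def step(state, event):
    """Return the next state; events with no matching transition are ignored."""
    return TRANSITIONS.get((state, event), state)


def run(events, state=GraspState.CLOSE):
    """Feed a sequence of event cues through the machine."""
    for event in events:
        state = step(state, event)
    return state


final_state = run([
    "contact_detected",
    "grip_force_reached",
    "slip_detected",          # no transition here: the state is held
    "place_requested",
    "table_contact_detected",
    "grip_force_released",
])
```

In this sketch a slip event does not change the state; a fuller controller would presumably react to it by raising the grip force while remaining in Lift and Hold.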


Proceedings ArticleDOI
06 Nov 2011
TL;DR: A novel and general optimisation framework for visual SLAM, which scales for both local, highly accurate reconstruction and large-scale motion with long loop closures, and takes a two-level approach that combines accurate pose-point constraints in the primary region of interest with a stabilising periphery of pose-pose soft constraints.
Abstract: We present a novel and general optimisation framework for visual SLAM, which scales for both local, highly accurate reconstruction and large-scale motion with long loop closures. We take a two-level approach that combines accurate pose-point constraints in the primary region of interest with a stabilising periphery of pose-pose soft constraints. Our algorithm automatically builds a suitable connected graph of keyposes and constraints, dynamically selects inner and outer window membership and optimises both simultaneously. We demonstrate in extensive simulation experiments that our method approaches the accuracy of offline bundle adjustment while maintaining constant-time operation, even in the hard case of very loopy monocular camera motion. Furthermore, we present a set of real experiments for various types of visual sensor and motion, including large scale SLAM with both monocular and stereo cameras, loopy local browsing with either monocular or RGB-D cameras, and dense RGB-D object model building.

326 citations


Proceedings ArticleDOI
01 Nov 2011
TL;DR: The Clustered Viewpoint Feature Histogram (CVFH) is described and it is shown that it can be effectively used to recognize objects and 6DOF pose in real environments dealing with partial occlusion, noise and different sensor attributes for training and recognition data.
Abstract: This paper focuses on developing a fast and accurate 3D feature for use in object recognition and pose estimation for rigid objects. More specifically, given a set of CAD models of different objects representing our knowledge of the world - obtained using high-precision scanners that deliver accurate and noiseless data - our goal is to identify and estimate their pose in a real scene obtained by a depth sensor like the Microsoft Kinect. Borrowing ideas from the Viewpoint Feature Histogram (VFH) due to its computational efficiency and recognition performance, we describe the Clustered Viewpoint Feature Histogram (CVFH) and the camera's roll histogram together with our recognition framework to show that it can be effectively used to recognize objects and 6DOF pose in real environments dealing with partial occlusion, noise and different sensor attributes for training and recognition data. We show that CVFH outperforms VFH and present recognition results using the Microsoft Kinect Sensor on an object set of 44 objects.

303 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: A novel interest keypoint extraction method that operates on range images generated from arbitrary 3D point clouds, which explicitly considers the borders of the objects identified by transitions from foreground to background, and a feature descriptor that takes the same information into account.
Abstract: In this paper we address the topic of feature extraction in 3D point cloud data for object recognition and pose identification. We present a novel interest keypoint extraction method that operates on range images generated from arbitrary 3D point clouds, which explicitly considers the borders of the objects identified by transitions from foreground to background. We furthermore present a feature descriptor that takes the same information into account. We have implemented our approach and present rigorous experiments in which we analyze the individual components with respect to their repeatability and matching capabilities and evaluate the usefulness for point feature based object detection methods.

274 citations


Proceedings ArticleDOI
09 May 2011
TL;DR: A new task-level executive system, SMACH, based on hierarchical concurrent state machines, which controls the overall behavior of the system and integrates several new components that are built on top of the PR2's current capabilities.
Abstract: As autonomous personal robots come of age, we expect certain applications to be executed with a high degree of repeatability and robustness. In order to explore these applications and their challenges, we need tools and strategies that allow us to develop them rapidly. Serving drinks (i.e., locating, fetching, and delivering) is one such application with well-defined environments for operation, requirements for human interfacing, and metrics for successful completion. In this paper we present our experiences and results while building an autonomous robotic assistant using the PR2 platform and ROS. The system integrates several new components that are built on top of the PR2's current capabilities. Perception components include dynamic obstacle identification, mechanisms for identifying the refrigerator, types of drinks, and human faces. Planning components include navigation, arm motion planning with goal and path constraints, and grasping modules. One of the main contributions of this paper is a new task-level executive system, SMACH, based on hierarchical concurrent state machines, which controls the overall behavior of the system. We provide in-depth discussions on the solutions that we found in accomplishing our goal, and the implementation strategies that let us achieve them.

263 citations
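
The hierarchical part of a task-level executive can be illustrated with state machines that are themselves usable as states. The following sketch is plain Python and is not SMACH's API: the `StateMachine` class, its `add` method, the state labels, and the drink-fetching steps are all hypothetical, and concurrency is left out entirely.

```python
class StateMachine:
    """A container whose states are callables returning an outcome string.

    A StateMachine is itself callable, so it can be nested inside another
    machine as a single state.
    """

    def __init__(self, outcomes):
        self.outcomes = set(outcomes)
        self.states = {}
        self.initial = None

    def add(self, label, state, transitions):
        if self.initial is None:
            self.initial = label  # the first state added is the entry point
        self.states[label] = (state, transitions)

    def __call__(self, userdata):
        label = self.initial
        while True:
            state, transitions = self.states[label]
            target = transitions[state(userdata)]
            if target in self.outcomes:
                return target     # leave this (sub)machine
            label = target


def make_step(name, outcome="succeeded"):
    """Build a dummy state that logs its name and returns a fixed outcome."""
    def run(userdata):
        userdata["log"].append(name)
        return outcome
    return run


def build_fetch_machine():
    grasp = StateMachine(outcomes=["succeeded", "failed"])
    grasp.add("DETECT_DRINK", make_step("detect_drink"),
              {"succeeded": "PICK_UP", "failed": "failed"})
    grasp.add("PICK_UP", make_step("pick_up"),
              {"succeeded": "succeeded", "failed": "failed"})

    top = StateMachine(outcomes=["done", "aborted"])
    top.add("GO_TO_FRIDGE", make_step("go_to_fridge"),
            {"succeeded": "GRASP", "failed": "aborted"})
    top.add("GRASP", grasp, {"succeeded": "DELIVER", "failed": "aborted"})
    top.add("DELIVER", make_step("deliver"),
            {"succeeded": "done", "failed": "aborted"})
    return top


userdata = {"log": []}
result = build_fetch_machine()(userdata)
```

Mapping every outcome of a sub-machine to a transition in its parent keeps failure handling explicit at each level of the hierarchy.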


Proceedings ArticleDOI
06 Mar 2011
TL;DR: Support is found for the hypothesis that perceptions of robots are influenced by robots showing forethought, the task outcome (success or failure), and showing goal-oriented reactions to those task outcomes.
Abstract: The animation techniques of anticipation and reaction can help create robot behaviors that are human readable such that people can figure out what the robot is doing, reasonably predict what the robot will do next, and ultimately interact with the robot in an effective way. By showing forethought before action and expressing a reaction to the task outcome (success or failure), we prototyped a set of human-robot interaction behaviors. In a 2 (forethought vs. none: between) x 2 (reaction to outcome vs. none: between) x 2 (success vs. failure task outcome: within) experiment, we tested the influences of forethought and reaction upon people's perceptions of the robot and the robot's readability. In this online video prototype experiment (N=273), we have found support for the hypothesis that perceptions of robots are influenced by robots showing forethought, the task outcome (success or failure), and showing goal-oriented reactions to those task outcomes. Implications for theory and design are discussed.

256 citations


Proceedings ArticleDOI
07 May 2011
TL;DR: This work found that the mobile embodiment of the remote worker evoked orientations toward the MRP both as a person and as a machine, leading to formation of new usage norms among remote and local coworkers.
Abstract: As geographically distributed teams become increasingly common, there are more pressing demands for communication work practices and technologies that support distributed collaboration. One set of technologies that are emerging on the commercial market is mobile remote presence (MRP) systems, physically embodied videoconferencing systems that remote workers use to drive through a workplace, communicating with locals there. Our interviews, observations, and survey results from people who had 2-18 months of MRP use showed how remotely-controlled mobility enabled remote workers to live and work with local coworkers almost as if they were physically there. The MRP supported informal communications and connections between distributed coworkers. We also found that the mobile embodiment of the remote worker evoked orientations toward the MRP both as a person and as a machine, leading to formation of new usage norms among remote and local coworkers.

Proceedings ArticleDOI
09 May 2011
TL;DR: This work presents a Reinforcement Learning based approach to acquiring new motor skills from demonstration that allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration.
Abstract: Learning complex motor skills for real world tasks is a hard problem in robotic manipulation that often requires painstaking manual tuning and design by a human expert. In this work, we present a Reinforcement Learning based approach to acquiring new motor skills from demonstration. Our approach allows the robot to learn fine manipulation skills and significantly improve its success rate and skill level starting from a possibly coarse demonstration. Our approach aims to incorporate task domain knowledge, where appropriate, by working in a space consistent with the constraints of a specific task. In addition, we also present an approach to using sensor feedback to learn a predictive model of the task outcome. This allows our system to learn the proprioceptive sensor feedback needed to monitor subsequent executions of the task online and abort execution in the event of predicted failure. We illustrate our approach using two example tasks executed with the PR2 dual-arm robot: a straight and accurate pool stroke and a box flipping task using two chopsticks as tools.

01 Jan 2011
TL;DR: A C++ framework for performing the optimization of nonlinear least squares problems that can be embedded as a graph or in a hyper-graph, where g2o stands for General (Hyper) Graph Optimization.
Abstract: In this document we describe a C++ framework for performing the optimization of nonlinear least squares problems that can be embedded as a graph or in a hyper-graph. A hyper-graph is an extension of a graph where an edge can connect multiple nodes and not only two. Several problems in robotics and in computer vision require finding the optimum of an error function with respect to a set of parameters. Examples include popular applications like SLAM and bundle adjustment. In the literature, many approaches have been proposed to address this class of problems. The naive implementation of standard methods, like Levenberg-Marquardt or Gauss-Newton, can lead to acceptable results for most applications when the correct parameterization is chosen. However, to achieve the maximum performance, substantial effort might be required. g2o stands for General (Hyper) Graph Optimization. The purposes of this framework are the following:

Proceedings ArticleDOI
06 Mar 2011
TL;DR: This study focuses on mobile remote presence (MRP) systems as used by a population who could potentially benefit from more social connectivity and communication with remote people: older adults.
Abstract: While much of human-robot interaction research focuses upon people interacting with autonomous robots, there is also much to be gained from exploring human interpersonal interaction through robots. The current study focuses on mobile remote presence (MRP) systems as used by a population who could potentially benefit from more social connectivity and communication with remote people - older adults. Communication technologies are important for ensuring safety, independence, and social support for older adults, thereby potentially improving their quality of life and maintaining their independence [24]. However, before such technologies would be accepted and used by older adults, it is critical to understand their perceptions of the benefits, concerns, and adoption criteria for MRP systems. As such, we conducted a needs assessment with twelve volunteer participants (ages 63-88), who were given first-hand experience with both meeting a visitor via the MRP system and driving the MRP system to visit that person. The older adult participants identified benefits such as being able to see and be seen via the MRP system, reducing travel costs and hassles, and reducing social isolation. Among the concerns identified were etiquette of using the MRP, personal privacy, and overuse of the system. Some new use-cases were identified that have not yet been explored in prior work, for example, going to museums, attending live performances, and visiting friends who are hospitalized. The older adults in the current study preferred to operate the MRP themselves, rather than to be visited by others operating the MRP system. More findings are discussed in terms of their implications for design.

Proceedings ArticleDOI
01 Nov 2011
TL;DR: A system for detecting and tracking people from image and depth sensors on board a mobile robot that combines an ensemble of detectors in a unified framework, is efficient, and has the potential to incorporate multiple sensor inputs.
Abstract: The goal of personal robotics is to create machines that help us with the tasks of daily living, co-habiting with us in our homes and offices. These robots must interact with people on a daily basis, navigating with and around people, and approaching people to serve them. To enable this coexistence, personal robots must be able to detect and track people in their environment. Excellent progress has been made in the vision community in detecting people outdoors, in surveillance scenarios, in Internet images, or in specific scenarios such as video game play in living rooms. The indoor robot perception problem differs, however, in that the platform is moving, the subjects are frequently occluded or truncated by the field-of-view, there is large scale variation, the subjects take on a wider range of poses than pedestrians, and computation must take place in near real time. In this paper, we describe a system for detecting and tracking people from image and depth sensors on board a mobile robot. To cope with the challenges of indoor mobile perception, our system combines an ensemble of detectors in a unified framework, is efficient, and has the potential to incorporate multiple sensor inputs. The performance of our algorithm surpasses other approaches on two challenging data sets, including a new robot-based data set.

Proceedings ArticleDOI
09 May 2011
TL;DR: This work presents an approach for navigation in hybrid maps consisting of a topological graph overlaid with local occupancy grids, built on top of a graph SLAM system, which can be efficiently optimized even for very large environments.
Abstract: We present an approach for navigation in hybrid maps consisting of a topological graph overlaid with local occupancy grids. The topological graph is built on top of a graph SLAM system, which can be efficiently optimized even for very large environments. The novel feature of our system is that it navigates locally using local metric maps, while the overall plan is formed on the topological graph. Unlike many current SLAM methods, we never reconstruct a full occupancy grid of the environment for localization or path planning. We show that our method generates near-optimal plans, and deals gracefully with changes to the map.
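
Forming the overall plan on a topological graph amounts to a shortest-path search over nodes and weighted edges, with local metric navigation handling each edge. The sketch below uses Dijkstra's algorithm for the global step. It is an illustration under assumptions and not the paper's planner: the graph layout (a dict of neighbour-to-cost dicts), the node names, and the edge costs are invented for the example.

```python
import heapq


def plan_on_topological_graph(graph, start, goal):
    """Dijkstra search; returns (cost, [nodes]) or None if goal is unreachable.

    `graph` maps each node to a dict {neighbour: traversal_cost}.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbour, weight in graph.get(node, {}).items():
            if neighbour not in visited:
                heapq.heappush(queue, (cost + weight, neighbour, path + [neighbour]))
    return None


topo = {
    "A": {"B": 2.0, "C": 1.0},
    "B": {"A": 2.0, "D": 2.0},
    "C": {"A": 1.0, "D": 5.0},
    "D": {"B": 2.0, "C": 5.0},
}
plan = plan_on_topological_graph(topo, "A", "D")

# A change to the map (the B-D link becomes impassable) only requires
# editing the graph and replanning; no global grid has to be rebuilt.
del topo["B"]["D"], topo["D"]["B"]
replan = plan_on_topological_graph(topo, "A", "D")
```

Each edge of the returned node sequence would then be handed to a local planner working on the corresponding local occupancy grid.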

Journal ArticleDOI
TL;DR: This paper presents a tactile perception strategy that allows a mobile robot with tactile sensors in its gripper to measure a generic set of tactile features while manipulating an object, and proposes a switching velocity-force controller that grasps an object safely and reveals, at the same time, its deformation properties.
Abstract: Tactile information is valuable in determining properties of objects that are inaccessible from visual perception. In this paper, we present a tactile perception strategy that allows a mobile robot with tactile sensors in its gripper to measure a generic set of tactile features while manipulating an object. We propose a switching velocity-force controller that grasps an object safely and reveals, at the same time, its deformation properties. By gently rolling the object, the robot can extract additional information about the contents of the object. As an application, we show that a robot can use these features to distinguish the internal state of bottles and cans, purely from tactile sensing, from a small training set. The robot can distinguish open from closed bottles and cans and full ones from empty ones. We also show how the high-frequency component in tactile information can be used to detect movement inside a container, e.g., in order to detect the presence of liquid. To prove that this is a hard recognition problem, we also conducted a comparative study with 17 human test subjects. The recognition rates of the human subjects were comparable with that of the robot.

Journal ArticleDOI
TL;DR: This work presents two methods for generating time-extended coordination solutions—solutions where more than one task is assigned to each agent—for domains with intra-path constraints and compares these approaches with a range of single task allocation approaches in a simulated disaster response domain.
Abstract: Many applications require teams of robots to cooperatively execute tasks. Among these domains are those in which successful coordination must respect intra-path constraints, which are constraints that occur on the paths of agents and affect route planning. This work focuses on multi-agent coordination for disaster response with intra-path precedence constraints, a compelling application that is not well addressed by current coordination methods. In this domain a group of fire truck agents attempt to address fires spread throughout a city in the wake of a large-scale disaster. The disaster has also caused many city roads to be blocked by impassable debris, which can be cleared by bulldozer robots. A high-quality coordination solution must determine not only a task allocation but also what routes the fire trucks should take given the intra-path precedence constraints and which bulldozers should be assigned to clear debris along those routes. This work presents two methods for generating time-extended coordination solutions--solutions where more than one task is assigned to each agent--for domains with intra-path constraints. Our first approach uses tiered auctions and two heuristic techniques, clustering and opportunistic path planning, to perform a bounded search of possible time-extended schedules and allocations. Our second method uses a centralized, non-heuristic, genetic algorithm-based approach that provides higher quality solutions but at substantially greater computational cost. We compare our time-extended approaches with a range of single task allocation approaches in a simulated disaster response domain.

Proceedings ArticleDOI
27 Jun 2011
TL;DR: A new method to model the spatial distribution of oriented local features on an object is presented, which is used to infer object pose given small sets of observed local features.
Abstract: The success of personal service robotics hinges upon reliable manipulation of everyday household objects, such as dishes, bottles, containers, and furniture. In order to accurately manipulate such objects, robots need to know objects’ full 6-DOF pose, which is made difficult by clutter and occlusions. Many household objects have regular structure that can be used to effectively guess object pose given an observation of just a small patch on the object. In this paper, we present a new method to model the spatial distribution of oriented local features on an object, which we use to infer object pose given small sets of observed local features. The orientation distribution for local features is given by a mixture of Binghams on the hypersphere of unit quaternions, while the local feature distribution for position given orientation is given by a locally-weighted (Quaternion kernel) likelihood. Experiments on 3D point cloud data of cluttered and uncluttered scenes generated from a structured light stereo image sensor validate our approach.

Proceedings ArticleDOI
Andreas Paepcke, Bianca Soto, Leila Takayama, Frank Koenig, Blaise Gassend
16 Oct 2011
TL;DR: This work prototyped and empirically evaluated the effect of sidetone to help operators self-regulate their speaking loudness, and found that engaging in more social tasks and more intellectually demanding tasks influenced how loudly people spoke.
Abstract: In our field deployments of mobile remote presence (MRP) systems in offices, we observed that remote operators of MRPs often unintentionally spoke too loudly. This disrupted their local co-workers, who happened to be within earshot of the MRP system. To address this issue, we prototyped and empirically evaluated the effect of sidetone to help operators self-regulate their speaking loudness. Sidetone is the intentional, attenuated feedback of speakers' voices to their ears while they are using a telecommunication device. In a 3-level (no sidetone vs. low sidetone vs. high sidetone) within-participants pair of experiments, people interacted with a confederate through an MRP system. The first experiment involved MRP operators using headsets with boom microphones (N=20). The second experiment involved MRP operators using loudspeakers and desktop microphones (N=14). While we detected the effects of the sidetone manipulation in our audio-visual context, the effect was attenuated in comparison to earlier audio-only studies. We hypothesize that the strong visual component of our MRP system interferes with the sidetone effect. We also found that engaging in more social tasks (e.g., a getting-to-know-you activity) and more intellectually demanding tasks (e.g., a creativity exercise) influenced how loudly people spoke. This suggests that testing such sidetone effects in the typical read-aloud setting is insufficient for generalizing to more interactive communication tasks. We conclude that MRP application support must reach beyond the time-honored audio-only technologies to solve the problem of excessive speaker loudness.

Proceedings ArticleDOI
09 May 2011
TL;DR: Assisted teleoperation helped people avoid obstacles but also increased the time to complete an obstacle course; when human-oriented dimensions were evaluated, gaming experience and locus of control affected the speed of completing the course.
Abstract: As mobile remote presence (MRP) systems become more pervasive in everyday environments such as office spaces, it is important for operators to navigate through remote locations without running into obstacles. Human-populated environments frequently change (e.g., doors open and close, furniture is moved around) and mobile remote presence systems must be able to adapt to such changes and to avoid running into obstacles. As such, we implemented an assisted teleoperation feature for an MRP system and evaluated its effectiveness with a controlled user study, focusing on both the system-oriented dimensions (e.g., autonomous assistance vs. no assistance) and human-oriented dimensions (e.g., gaming experience, locus of control, and spatial cognitive abilities) (N=24). In a systems-only analysis, we found that the assisted teleoperation helped people avoid obstacles. However, assisted teleoperation also increased time to complete an obstacle course. When human-oriented dimensions were evaluated, gaming experience and locus of control affected speed of completing the course. Implications for future research and design are discussed.

Proceedings ArticleDOI
09 May 2011
TL;DR: This work presents a framework that combines two approaches to grasp planning based on perceived sensor data of an object, aiming to find consensus on how the object should be grasped by using the information from each object representation according to their confidence levels.
Abstract: Grasp planning based on perceived sensor data of an object can be performed in different ways, depending on the chosen semantic interpretation of the sensed data. For example, if the object can be recognized and a complete 3D model is available, a different planning tool can be selected compared to the situation in which only the raw sensed data, such as a single point cloud, is available. Instead of choosing between these options, we present a framework that combines them, aiming to find consensus on how the object should be grasped by using the information from each object representation according to their confidence levels. We show that this method is robust to common errors in perception, such as incorrect object recognition, while also taking into account potential grasp execution errors due to imperfect robot calibration. We illustrate this method on the PR2 robot by grasping objects common in human environments.

Proceedings ArticleDOI
09 May 2011
TL;DR: The results demonstrate the effectiveness of the planner in efficiently navigating cluttered spaces; the method generates consistent, low-cost motion trajectories, and guarantees the search is complete with bounds on the suboptimality of the solution.
Abstract: In this paper, we present a search-based motion planning algorithm for manipulation that handles the high dimensionality of the problem and minimizes the limitations associated with employing a strict set of pre-defined actions. Our approach employs a set of adaptive motion primitives comprised of static motions with variable dimensionality and on-the-fly motions generated by two analytical solvers. This method results in a slimmer, multi-dimensional lattice and offers the ability to satisfy goal constraints with precision. To validate our approach, we used a 7DOF manipulator to perform experiments on a real mobile manipulation platform (Willow Garage's PR2). Our results demonstrate the effectiveness of the planner in efficiently navigating cluttered spaces; the method generates consistent, low-cost motion trajectories, and guarantees the search is complete with bounds on the suboptimality of the solution.
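
A search that is complete and comes with a bound on suboptimality can be illustrated with weighted A*, where inflating a consistent heuristic by a factor epsilon yields a solution costing at most epsilon times the optimum. The abstract does not say which search algorithm the planner uses, so weighted A* is an assumption here, and the 4-connected grid stands in for the paper's lattice of motion primitives purely for illustration.

```python
import heapq


def weighted_astar(grid, start, goal, epsilon=1.0):
    """Weighted A* on a 4-connected grid (0 = free, 1 = blocked).

    Returns the cost of the path found, or None if the goal is unreachable.
    With epsilon >= 1 and the Manhattan heuristic, the returned cost is at
    most epsilon times the optimal cost.
    """
    rows, cols = len(grid), len(grid[0])

    def h(cell):
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    g = {start: 0}
    open_heap = [(epsilon * h(start), start)]
    closed = set()
    while open_heap:
        _, cell = heapq.heappop(open_heap)
        if cell == goal:
            return g[cell]
        if cell in closed:
            continue  # stale heap entry
        closed.add(cell)
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if not (0 <= nr < rows and 0 <= nc < cols):
                continue
            if grid[nr][nc] == 1 or nxt in closed:
                continue
            new_g = g[cell] + 1
            if new_g < g.get(nxt, float("inf")):
                g[nxt] = new_g
                # Inflated heuristic biases the search toward the goal.
                heapq.heappush(open_heap, (new_g + epsilon * h(nxt), nxt))
    return None


grid = [
    [0, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
optimal = weighted_astar(grid, (0, 0), (2, 0), epsilon=1.0)
inflated = weighted_astar(grid, (0, 0), (2, 0), epsilon=3.0)
```

Larger values of epsilon typically expand fewer states at the price of a looser bound, which is the usual trade-off when planning in high-dimensional spaces.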

01 Jul 2011
TL;DR: This paper presents a decision-theoretic approach to problems that require accurate placement of a robot relative to an object of known shape, such as grasping for assembly or tool use.
Abstract: This paper presents a decision-theoretic approach to problems that require accurate placement of a robot relative to an object of known shape, such as grasping for assembly or tool use. The decision process is applied to a robot hand with tactile sensors, to localize the object on a table and ultimately achieve a target placement by selecting among a parameterized set of grasping and information-gathering trajectories. The process is demonstrated in simulation and on a real robot. This work has been previously presented in Hsiao et al. (Workshop on Algorithmic Foundations of Robotics (WAFR), 2008; Robotics Science and Systems (RSS), 2010) and Hsiao (Relatively robust grasping, Ph.D. thesis, Massachusetts Institute of Technology, 2009).

Patent
11 Apr 2011
Abstract: Systems and methods related to construction, configuration, and utilization of humanoid robotic systems and aspects thereof are described. A system may include a mobile base, a spine structure, a body structure, and at least one robotic arm, each of which is movably configured to have significant human-scale capabilities in prescribed environments. The one or more robotic arms may be rotatably coupled to the body structure, which may be mechanically associated with the mobile base, which is preferably configured for holonomic or semi-holonomic motion through human scale travel pathways that are ADA compliant. Aspects of the one or more arms may be counterbalanced with one or more spring-based counterbalancing mechanisms which facilitate backdriveability and payload features.

Journal ArticleDOI
TL;DR: A decision-theoretic approach is presented for problems that require accurate placement of a robot relative to an object of known shape, such as grasping for assembly or tool use; a target placement is achieved by selecting among a parameterized set of grasping and information-gathering trajectories.
Abstract: This paper presents a decision-theoretic approach to problems that require accurate placement of a robot relative to an object of known shape, such as grasping for assembly or tool use. The decision process is applied to a robot hand with tactile sensors, to localize the object on a table and ultimately achieve a target placement by selecting among a parameterized set of grasping and information-gathering trajectories. The process is demonstrated in simulation and on a real robot. This work has been previously presented in Hsiao et al. (Workshop on Algorithmic Foundations of Robotics (WAFR), 2008; Robotics Science and Systems (RSS), 2010) and Hsiao (Relatively robust grasping, Ph.D. thesis, Massachusetts Institute of Technology, 2009).

Proceedings ArticleDOI
09 May 2011
TL;DR: This work presents a planning and control approach to navigation of a humanoid robot while pushing a cart and shows how immediate information about the environment can be integrated into this approach to achieve safer navigation in the presence of dynamic obstacles.
Abstract: Robust navigation in cluttered environments has been well addressed for mobile robotic platforms, but the problem of navigating with a moveable object like a cart has not been widely examined. In this work, we present a planning and control approach to navigation of a humanoid robot while pushing a cart. We show how immediate information about the environment can be integrated into this approach to achieve safer navigation in the presence of dynamic obstacles. We demonstrate the robustness of our approach through long-running experiments with the PR2 mobile manipulation robot in a typical indoor office environment, where the robot faced narrow and high-traffic passageways with very limited clearance.

Proceedings ArticleDOI
09 May 2011
TL;DR: This paper presents the implementation of an architecture that is able to combine a multitude of 2D/3D object recognition and pose estimation techniques in parallel as dynamically loadable plugins, ReIn (REcognition INfrastructure), and introduces two new classifiers designed for robot perception needs.
Abstract: A robust robot perception system intended to enable object manipulation needs to be able to accurately identify objects and their pose at high speeds. Since objects vary considerably in surface properties, rigidity and articulation, no single detector or object estimation method has been shown to provide reliable detection across object types to date. This indicates the need for an architecture that is able to quickly swap detectors, pose estimators, and filters, or to run them in parallel or serial and combine their results, preferably without any code modifications at all. In this paper, we present our implementation of such an infrastructure, ReIn (REcognition INfrastructure), to answer these needs. ReIn is able to combine a multitude of 2D/3D object recognition and pose estimation techniques in parallel as dynamically loadable plugins. It also provides an extremely efficient data passing architecture, and offers the possibility to change the parameters and initial settings of these techniques during their execution. In the course of this work we introduce two new classifiers designed for robot perception needs: BiGGPy (Binarized Gradient Grid Pyramids) for scalable 2D classification and VFH (Viewpoint Feature Histograms) for 3D classification and pose. We then show how these two classifiers can be easily combined using ReIn to solve object recognition and pose identification problems.

Proceedings ArticleDOI
05 Dec 2011
TL;DR: A novel multi-resolution approach to efficiently mapping 3D environments that models the environment as a hierarchy of probabilistic 3D maps, in which each submap is updated and transformed individually.
Abstract: In this paper, we present a novel multi-resolution approach to efficiently mapping 3D environments. Our representation models the environment as a hierarchy of probabilistic 3D maps, in which each submap is updated and transformed individually. In addition to the formal description of the approach, we present an implementation for tabletop manipulation tasks and an information-driven exploration algorithm for autonomously building a hierarchical map from sensor data. We evaluate our approach using real-world as well as simulated data. The results demonstrate that our method is able to efficiently represent 3D environments at high levels of detail. Compared to a monolithic approach, our maps can be generated significantly faster while requiring significantly less memory.
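A submap that is updated probabilistically and moved independently of other submaps can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's implementation: the `Submap` class, the log-odds update with clamping, the sensor-model probabilities, and the translation-only transform are all choices made here for brevity.

```python
import math


def logit(p):
    """Convert a probability to log-odds."""
    return math.log(p / (1.0 - p))


class Submap:
    """Hypothetical sparse voxel grid with its own origin and resolution."""

    def __init__(self, origin, resolution, p_hit=0.7, p_miss=0.4,
                 clamp=(-2.0, 3.5)):
        self.origin = origin            # translation of the submap in the world frame
        self.resolution = resolution    # voxel edge length
        self.l_hit = logit(p_hit)       # log-odds increment for an occupied reading
        self.l_miss = logit(p_miss)     # log-odds decrement for a free reading
        self.l_min, self.l_max = clamp  # clamping keeps cells able to change later
        self.cells = {}                 # voxel index -> log-odds

    def voxel(self, point):
        """Map a world-frame point to a voxel index in this submap."""
        return tuple(
            math.floor((p - o) / self.resolution)
            for p, o in zip(point, self.origin)
        )

    def update(self, point, hit):
        """Fuse one measurement into the voxel containing the point."""
        key = self.voxel(point)
        value = self.cells.get(key, 0.0) + (self.l_hit if hit else self.l_miss)
        self.cells[key] = max(self.l_min, min(self.l_max, value))

    def probability(self, point):
        """Occupancy probability at a point; 0.5 means unknown."""
        value = self.cells.get(self.voxel(point), 0.0)
        return 1.0 - 1.0 / (1.0 + math.exp(value))

    def move(self, delta):
        """Shift the whole submap by changing only its origin."""
        self.origin = tuple(o + d for o, d in zip(self.origin, delta))


# A fine-resolution submap for an object on a table (illustrative values).
cup = Submap(origin=(0.0, 0.0, 0.0), resolution=0.5)
cup.update((1.25, 0.25, 0.25), hit=True)
cup.move((1.0, 0.0, 0.0))  # the object was pushed; no voxels are rewritten
```

Because moving a submap only changes its origin, relocating an object costs constant time regardless of how many voxels it contains, and each submap can use a resolution suited to its contents.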

Proceedings ArticleDOI
06 Mar 2011
TL;DR: This paper argues that depth and context can improve frontal face detection, in turn improving the ability of robots to interact with humans, and supports this claim with encouraging preliminary experimental results.
Abstract: The information available to a robot through a variety of sensors and contextual awareness is rich and unique. In this paper, we have argued that depth and context can improve frontal face detection, in turn improving the ability of robots to interact with humans, and supported this claim with encouraging preliminary experimental results. As future work, we will attempt to apply the same concepts to the much more difficult problem of detecting faces in profile, further expanding the population with which a robot can interact.