Real-time Perception-Guided Motion Planning for a Personal Robot
Radu Bogdan Rusu, Ioan Alexandru Șucan, Brian Gerkey, Sachin Chitta, Michael Beetz, Lydia E. Kavraki

Intelligent Autonomous Systems, Department of Computer Science, Technische Universität München, Germany
{rusu, beetz}@cs.tum.edu
Department of Computer Science, Rice University, USA
{isucan, kavraki}@rice.edu
Willow Garage, USA
{gerkey, sachinc}@willowgarage.com
Abstract: This paper presents significant steps towards the online integration of 3D perception and manipulation for personal robotics applications. We propose a modular and distributed architecture, which seamlessly integrates the creation of 3D maps for collision detection and semantic annotations, with a real-time motion replanning framework. To validate our system, we present results obtained during a comprehensive mobile manipulation scenario, which includes the fusion of the above components with a higher level executive.
I. INTRODUCTION
In this paper we report on our investigations into the
realization of a basic manipulation layer for a personal
robot performing pick-and-place tasks in environments where
humans live and work. The overall goal of our research
effort is to develop a framework that enables a mobile manipulation system to enter a previously unknown room
and clean a table together with a person. The objects to
be picked up are easy to grasp, rigid everyday objects such
as cups, soda cans, tea boxes, bowls, etc., but they are not
known to the robot beforehand.
This seemingly trivial manipulation task poses serious
challenges for the state of the art in autonomous mobile
manipulation. These challenges include the perception of object constellations on the table, the geometric reconstruction of the unknown objects to be picked up from partial sensor views, the treatment of the rest of the scene as dynamic obstacles, in particular the person reaching into the operating space of the robot, the planning and seamless execution of planned motions when reaching for objects in the presence of moving obstacles, and the determination of appropriate grasps based on partial and possibly inaccurate geometric reconstruction of the objects to be picked up.
Manipulation systems that aim to solve tasks such as
the one presented above require the successful operation of
many complex components, such as 3D perception, motion
planning and more. Furthermore, these components must
function robustly and reliably over extended periods of time
on real hardware. These constraints may explain why relatively few systems are intended to address the domain that
we study [1]–[6]. Due to their complexity, complete systems
have not been fully realized yet, and the research problem
remains open. Our work describes a manipulation system with emphasis on robustness and software re-usability, in addition to safety and functionality.
Fig. 1. Snapshot of the PR2 robot used during the experiments.
Our long term goal is to provide robots with safe and
goal-directed real-time manipulation behavior. The robots
should be capable of performing well while using realistic
sensors and making few assumptions about the environment
setup. We aim to produce a system that is safe and robust, can run autonomously, and can handle failures and situations it has not encountered before. Reliability and flexibility are not only achieved through the capabilities of the individual components (perception, planning, etc.) but also through the task management that synchronizes and parameterizes the activities in dynamic, context-specific ways and detects and locally recovers from failures. We use no scripting for any of the experiments: all our code runs online and the robot makes all its decisions in real-time. To this end, we report
on the experiments carried out using our perception and
planning approach on the mobile manipulation system shown
in Figure 1, based on the Personal Robot 2 (PR2).
Our research contributes to the state-of-the-art in autonomous mobile manipulation in the following ways:
- 3D scene perception for manipulation: We propose a 3D scene perception component that computes:
  - A dynamic obstacle map, which is represented by a voxel-based collision map for 3D motion planning.
  - A labeled manipulation scene, which provides a classification of 3D points into various semantic categories such as walls and tables.
- Sampling-based motion (re-)planning for reaching: We propose a sampling-based motion (re-)planning module that operates on abstract robot specifications and uses the obstacle map provided by the 3D scene perception for collision detection. The motion planning component uses the Open Motion Planning Library (OMPL), which provides leading-edge motion planning algorithms and tools for optimizing and smoothing motion trajectories (more information in Section IV).

Fig. 2. The overall architecture of our system using ROS. Boxes with continuous lines define nodes. Boxes with dashed lines delimit groups of nodes with related functionality. Communication between nodes is represented by arrows. (The diagram shows a Task Layer with the task executive and the base and arm executives; a Functional Layer with navigation, 3D perception, and manipulation nodes, including the grid planner, Dynamic Obstacle Map, Scene Interpreter, sampling-based motion (re)planning, and partial-view grasp planner; and a Basic Control Layer with the PR2 hardware controllers.)
All software components necessary to perform the tasks discussed in this work are modular so they can be reused individually, and are available as Open Source.¹ Because of their modularity, they can be used for robots other than the PR2 as long as a respective robot description is available.
II. SYSTEM ARCHITECTURE
A. Hardware
Our experimental system is an alpha prototype of the PR2 mobile manipulation platform (Figure 1). The PR2 comprises an omni-directional wheeled base, telescoping spine, two force-controlled 7-DOF arms and an actuated sensor head. Each arm has a 1-DOF gripper attached to it. The robot can negotiate ADA-compliant² wheelchair-accessible environments, and its manipulation workspace is similar to that of an average-height adult.
¹ Details on retrieving, building, and running all code described in this paper are at http://pr.willowgarage.com/wiki/tabletop_manipulation.
² http://www.ada.gov/
The sensor head comprises a Hokuyo UTM-30 planar laser
range-finder on a tilt stage, and a Videre stereo camera on a
pan-tilt stage. The laser is tilted up and down continuously,
providing a 3D view of the area in front of the robot. The
resulting point clouds are the input to our perception system, which in turn drives our manipulation system. We can, but currently do not, make use of data from the stereo camera. A
second Hokuyo UTM-30 laser sensor, attached to the base,
is used for navigation.
The PR2 carries multiple computers, connected by a gigabit LAN. The current computing configuration is four dual-core 2.6 GHz machines running Linux. One of the four machines is modified to run a real-time kernel, providing a guaranteed 1 kHz control loop, via EtherCAT,³ over the robot's motors and encoders.
B. Software
A system as complex as the PR2 is driven by a number of subsystems that need to be able to communicate easily with each other: hardware drivers, controllers, visual perception, motion planning, higher level control, etc. The computational needs of these components virtually necessitate the use of a distributed computing environment. To accommodate these needs, we use ROS,⁴ which provides a communication framework that hides the complexities of transferring data between processes, regardless of whether the processes run on the same machine or not.
³ http://www.ethercat.org/
⁴ ROS (Robot Operating System - http://ros.sourceforge.net) is an Open Source project to build a meta-operating system for mobile manipulation systems.

Processes using ROS are called nodes. Nodes communicate primarily in a publish/subscribe fashion, using common data formats on channels called topics. Nodes can also call
services provided by other nodes, in a manner akin to remote
procedure calls. Abstracting communication between running
processes in this way, through topics and services, allows for
the implementation of a modular system. Modules can be
written in a variety of languages and can execute anywhere in
the network. In addition to the basic communication system,
ROS offers an extensive suite of tools for process control,
system inspection, and data visualization.
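To make the topic and service abstractions concrete, the following is a minimal sketch of a ROS 1 node written with rospy; the node, topic, and message names are illustrative only and are not the actual interfaces used by our system.

import rospy
from std_msgs.msg import String

def on_message(msg):
    # Callback invoked whenever another node publishes on the topic.
    rospy.loginfo("received: %s", msg.data)

def main():
    rospy.init_node("example_listener")
    # Subscribe to a topic; any node anywhere on the network may publish to it.
    rospy.Subscriber("chatter", String, on_message)
    # Publish a status message on another topic at a fixed rate.
    pub = rospy.Publisher("status", String, queue_size=10)
    rate = rospy.Rate(1)  # 1 Hz
    while not rospy.is_shutdown():
        pub.publish(String(data="alive"))
        rate.sleep()

if __name__ == "__main__":
    main()

Service calls follow the same pattern, with rospy.Service on the provider side and rospy.ServiceProxy on the caller side.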
The software architecture of our mobile manipulation
framework is illustrated in Figure 2. The figure shows the
complete system, but the focus of this paper is on the
components depicted in color: 3D perception and motion
planning. The architecture can be viewed as a 3-tiered one.
The lowest layer provides the PR2 hardware controllers. The
middle layer contains the functional modules including the
ones discussed in this paper. The top layer is represented
by the task executive that manages the processes at the
lower layers. This is where the tasks that the robot performs
are specified. In particular, the task executive parameterizes
and synchronizes processes such as perception, navigation,
reaching and grasping. Its purpose is also to provide context-specific control, responsiveness to sensory events, and failure detection, analysis, and recovery.
The 3D perception system consists of two main nodes
which build two separate maps: a dynamic obstacle map
used for collision detection in 3D while performing arm
movements, and a scene interpreter which aggregates the acquired point cloud dataset with semantic annotations (e.g., class labels such as: floor, walls, tables, objects on table [7]).
Motion planning requiring 3D collision avoidance (for
example when moving one of the arms) is done using
sampling-based algorithms [8], [9]. From an architectural
standpoint, sampling-based motion planning relies on a set
of libraries, following a modular design: actual motion
planning, collision checking and robot modeling are kept
separate. These libraries are used together in a node that
monitors the current robot state and the dynamic obstacle
map, and provides the interface (service calls, topics) to
perform motion (re-)planning.
The modular structure of our framework allows seamlessly swapping nodes that provide the same interface. This is an option we make frequent use of during development, and it enables the advancement of the platform.
III. 3D PERCEPTION
To be able to interact with its environment, a robotic
system must first be able to perceive it, and it must do so
accurately and detect pertinent changes as they occur. This
perception component is fundamental to both localization
and motion planning.
The input to the 3D perception module is a set of 2D laser
scan messages, acquired from the tilting Hokuyo laser sensor
installed on the head of the PR2 robot. Figure 3 presents a
detailed version of the 3D perception pipeline used for the
experiments in this paper.
Fig. 3. Detailed diagram of the 3D perception pipeline presented in Figure 2. The double arrow messages represent service requests served by the nodes. (2D laser scans (LaserScan) pass through the laser scan filtering toolbox, consisting of the shadow pixels filter and the robot self filter, to produce a 3D Cartesian filtered point cloud (PointCloud). This cloud feeds the Dynamic Obstacle Map, which publishes a 3D collision map (CollisionMap) and serves the record_static_map and subtract_object requests, and the Scene Interpreter, which publishes tables and objects (Table/ObjectOnTable) and serves table_object_detector requests.)
Before projecting each 2D laser scan into a 3D Cartesian frame as a point cloud, a set of preprocessing filters is applied in order to clean the data of spurious measurements, as well as to remove points from the scan which are sampled from the robot body. The latter is achieved by treating the robot's body parts as convex hulls and checking the intersection of the laser scans with the hulls, while the shadow pixels filter removes erroneous points by comparing the angles formed between the lines created by subsequent scan measurements and the viewpoint against a statistical threshold. These filters are part of a more general laser scan filtering toolbox.⁵ After the data is partially cleaned, we project all the remaining points into a global 3D coordinate frame. This constitutes the input of the two main nodes: the Dynamic Obstacle Map, and the Scene Interpreter.

⁵ The toolbox constitutes a separate component that we work on as part of the ROS project, and falls outside the scope of this paper.
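As an illustration of the shadow-point test described above, the following is a minimal sketch of such a filter for a single planar scan; the scan representation and the 10 degree threshold are assumptions made for the example, not the values or implementation used in the actual filtering toolbox.

import numpy as np

def shadow_filter(ranges, angles, min_angle_deg=10.0):
    # ranges, angles: numpy arrays with the polar coordinates of one planar scan.
    # Returns a boolean mask of points to keep.
    xs = ranges * np.cos(angles)
    ys = ranges * np.sin(angles)
    keep = np.ones(len(ranges), dtype=bool)
    for i in range(len(ranges) - 1):
        ray = np.array([xs[i], ys[i]])                          # viewpoint -> point i
        seg = np.array([xs[i + 1] - xs[i], ys[i + 1] - ys[i]])  # point i -> point i+1
        denom = np.linalg.norm(ray) * np.linalg.norm(seg)
        if denom == 0.0:
            continue
        cos_angle = abs(np.dot(ray, seg)) / denom
        angle = np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))
        if angle < min_angle_deg:
            # The segment between consecutive measurements is nearly collinear with
            # the viewing ray, which indicates a mixed measurement at a depth
            # discontinuity; drop the farther of the two points.
            keep[i if ranges[i] > ranges[i + 1] else i + 1] = False
    return keep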
To formulate the perception computational models, we introduce the following notations:
- p_i is a 3D point with {x_i, y_i, z_i} geometric coordinates; p_i^x, p_i^y, and p_i^z refer to the individual x, y, and z dimensions of the point;
- n_i is a surface normal estimate at point p_i having a {nx_i, ny_i, nz_i} direction;
- P^n = {p_1, p_2, ...} is a set of nD points (also represented as P for simplicity).
A. Dynamic Obstacle Map
The Dynamic Obstacle Map is comprised of a set of geometric primitives created from the point cloud P. This representation is updated in real-time for every new set of points, and its main usage is to provide support for collision avoidance (i.e., reactive motion execution) for motion planning.
The node implementing the Dynamic Obstacle Map takes any set of points P that is available from the sensor and inserts it into the map, which can generally be represented in multiple formats, using: spheres, boxes, points, or triangle meshes. The latter is not recommended for situations where the data is extremely sparse, as incremental model updates become more difficult. In general, the node discretizes the space into smaller, uniform parts (called leaves for the remainder of this article), with a certain user-defined resolution, and publishes them in one of the above mentioned formats on the ROS network to the collision checker (see Section IV).
If the robot is to be allowed to operate and manipulate
at high speeds, the laser tilt unit must be actuated as fast
as possible, to account for the environment dynamics. This
however has the undesired effect that the total number of
points per sweep is diminished considerably, and small or
fast moving obstacles will not be present in the resultant
map. To compensate for these types of situations, the node
keeps a temporal sliding window (i.e., data queue) for all data
received, and computes the final obstacle map by integrating
all the data together.
A deciding factor in the average computation time per map is the number of data leaves that are requested by the motion planner, that is, the resolution of the world. Because there is no good fixed resolution that best accounts for all possible scenarios, the Dynamic Obstacle Map architecture is designed to be parameterizable on the fly. Furthermore, the task executive can specify what the space of interest around the robot is, that is, the space that needs to be accounted for in the map. Taken together, these two constraints lead to significant reductions in the computation required to create the map. Algorithm 1 outlines the basic steps that are used to construct the map in real-time.
Algorithm 1 Dynamic Obstacle Map creation
  b_r, r_r // robot bounds and resolution
  P = {p_1 ... p_n} // set of 3D points
  t_min += 1, t_max += 1 // increment sliding window time
  if ¬(∃ L) // no spatial decomposition exists (first P)
    L ← F(b_r, r_r) // create an empty list L of 3D leaves
  for all p_i ∈ P do
    if (-b_r ≤ p_i ≤ b_r) // check if inside bounds
      estimate (l_xyz ∈ L, p_i ∈ l_xyz) // find the right leaf for p_i
      add p_i to l_xyz
  for all l_xyz ∈ L do
    if (p_j^t < t_min ∨ p_j^t > t_max, p_j ∈ l_xyz) // check if outside sliding window
      remove p_j from l_xyz
  M ← F(L) // create the final map from the set of leaves
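The following is a minimal Python sketch in the spirit of Algorithm 1: a voxel (leaf) map with a temporal sliding window over the inserted points. The class name, bounds handling, and two-second window are illustrative assumptions, not the implementation of the actual node.

import time
from collections import defaultdict

class DynamicObstacleMap:
    def __init__(self, bounds, resolution, window_sec=2.0):
        # bounds: ((xmin, ymin, zmin), (xmax, ymax, zmax)); resolution: leaf edge length (m)
        self.bounds = bounds
        self.res = resolution
        self.window = window_sec
        self.leaves = defaultdict(list)  # leaf index -> list of (timestamp, point)

    def _leaf_index(self, p):
        lo, _ = self.bounds
        return tuple(int((p[k] - lo[k]) // self.res) for k in range(3))

    def insert(self, points, stamp=None):
        # Insert a new set of 3D points (e.g., one projected laser scan).
        stamp = time.time() if stamp is None else stamp
        lo, hi = self.bounds
        for p in points:
            if all(lo[k] <= p[k] <= hi[k] for k in range(3)):
                self.leaves[self._leaf_index(p)].append((stamp, p))
        self._prune(stamp)

    def _prune(self, now):
        # Drop points that fell out of the temporal sliding window.
        for idx in list(self.leaves):
            kept = [(t, p) for (t, p) in self.leaves[idx] if now - t <= self.window]
            if kept:
                self.leaves[idx] = kept
            else:
                del self.leaves[idx]

    def occupied_leaves(self):
        # Return occupied leaf centers, e.g., to publish as box primitives.
        lo, _ = self.bounds
        return [tuple(lo[k] + (i[k] + 0.5) * self.res for k in range(3))
                for i in self.leaves]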
In addition to the periodical map creation, the node supports two important service calls, underlined with a double arrow in Figure 3. The record_static_map service call triggers a special mode in the node which records a complete sweep map as a static fixed reference map, that all recorded data will be merged with until the next service call. This translates into: M_f = M ∪ M_r, where M represents the map created from the current set of leaves, M_r is the static fixed reference map recorded at the service call, and M_f is the final 3D dynamic obstacle map. Having a static reference map is beneficial during longer manipulation scenarios, where the estimated collision map may contain holes due to occlusions caused by the robot's arms at certain positions in time. This may in turn negatively affect the motion planning algorithms, which in the absence of pertinent solutions will attempt to use this occluded space, even though it has a certain probability of still being occupied. By recording reference maps and combining them, this problem can be alleviated. Furthermore, the Scene Interpreter can decompose regions of interest in the reference map, such as the table for example, and remove everything else (such as the objects sitting on top of it), as shown in Section III-B. Thus, the reference maps can be updated only with those parts of the environment which have a higher probability of being fixed over subsequent pick and place operations.
Fig. 4. Left: a dynamic obstacle map representation M_f superimposed on the point cloud P; right: the dynamic obstacle map M_f alone. Note that the object held in the gripper is part of the point cloud P, but has been removed from the M_f map using the subtract_object service call.
The second service call, subtract_object, introduces a negative collision map created from a given object that can be continuously subtracted from the computed maps, that is: M_f = (M ∪ M_r) ∖ M_o, where M_o represents the negative map created from the given object. The object is given as either a list of leaves or an oriented 3D bounding box. A special use of this service call is to subtract objects currently being manipulated with the gripper from the collision map (see Figure 4). To achieve this, the node subscribes to the ROS network and queries the current position of the end-effector, and uses the given oriented object bounds to subtract the data. Note that the object present in the gripper in the left part of Figure 4 is incomplete in P, because parts of it are removed by the robot self filter at the laser scan level, due to the convex hulls of the gripper being much larger than the actual gripper itself.
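To illustrate how the two service calls compose with the periodic map, here is a small extension of the earlier DynamicObstacleMap sketch; the method names mirror the service names, but the interface of the real node differs (it is exposed through ROS services rather than Python methods).

class ReferencedObstacleMap(DynamicObstacleMap):
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.reference = set()  # leaf indices of the static reference map (M_r)
        self.negative = set()   # leaf indices covered by the grasped object (M_o)

    def record_static_map(self):
        # Freeze the current sweep as the fixed reference map M_r.
        self.reference = set(self.leaves)

    def subtract_object(self, object_leaf_indices):
        # Remember the leaves inside the object's oriented bounds, to be
        # continuously subtracted from the published maps.
        self.negative = set(object_leaf_indices)

    def final_map(self):
        # M_f = (M ∪ M_r) \ M_o, expressed over leaf indices.
        return (set(self.leaves) | self.reference) - self.negative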
B. Scene Interpreter
The second actor of the 3D perception pipeline is the
Scene Interpreter node, which continuously segments and
classifies surfaces by fitting planar models to them, in order
to abstract sensor data and make it more informative.
In general, the planar classification can be performed in two different ways: i) using a set of heuristic rules, which make use of the fact that the robot's Z_r axis is parallel to the Z_w axis of the world, and label points based on the geometrical properties of the planar surfaces they belong to; ii) using machine learning classifiers which can be trained on simple sets of features extracted from segmented planar regions, and the resultant models are used to annotate the planes with the desired classes. For the purpose of the experiments in this paper, we are mostly interested in pick and place operations, and thus the most important areas that we need to detect are tables and other horizontal planes which can support objects. Therefore there is no need to create a feature set and train a model, as it suffices to construct a single heuristic rule for determining all major horizontal planar areas in the world that the robot could operate on. The vertical bounds of the search are given by the physical constraints of the robot arms, i.e., what is the lowest and highest place where the robot could pick an object from. Figure 5 presents a more general scene interpretation for planar areas segmented from a P point cloud dataset.
Fig. 5. A scene with point annotations labeled by the Scene Interpreter: floor (dark red), ceiling (dark blue), walls (yellow), and tables (light blue).
The node has two operation modes: i) a continuous mode
where for each acquired P set of points, it labels all major
planar areas; and ii) a service mode, where the task executive
requests a list of tables and objects sitting on them in view.
Algorithm 2 presents a brief description of the computational
steps invoked for finding tables and objects on them when
the node is running in the service mode, and partial results
are presented in Figure 6.
Algorithm 2 Find tables and objects
  b_zmin, b_zmax // min/max reachable arm positions on z, i.e., table bounds
  P = {p_1 ... p_n} // set of 3D points
  P_b = {p_i | b_zmin ≤ p_i^z ≤ b_zmax} // subset of all 3D points within table bounds
  for all p_i ∈ P_b do
    estimate (n_i from P^k) // estimate surface normal from nearest neighbors
    if (α = n_i × Z ≈ 0) // check if the normal is parallel to the Z axis
      P_z ← p_i // add p_i to the P_z set
  estimate (C = {P_z^1 ... P_z^n}, P_z^i ∈ P_z) // break P_z into Euclidean clusters
  for all c_i = P_z^i ∈ C do
    estimate ({a, b, c, d}, a·p_i^x + b·p_i^y + c·p_i^z + d = 0, p_i ∈ c_i) // find the best plane fit using sample consensus
    estimate (a_min, a_max) // find the min/max bounds of the planar area
    M ← F(c_i) // add the table parameters to the final map
  for all p_i ∈ P do
    if (a_min^x ≤ p_i^x ≤ a_max^x, a_min^y ≤ p_i^y ≤ a_max^y) // within bounds?
      P_o ← p_i // add p_i to the P_o set
  estimate (O = {P_o^1 ... P_o^n}, P_o^i ∈ P_o) // break P_o into Euclidean clusters
  for all o_i ∈ O do
    M ← F(o_i) // add the object parameters to the final map
The resultant map M contains a set of tables with their planar equations and bounds, and a set of object models O = {o_1 ... o_n}. These object models represent compact partial-view representations of the objects supported by tables in the real world. Their usage is twofold: i) on one hand they provide the exact goal positions for the gripper; and ii) they support the subtract_object service call of the Dynamic Obstacle Map, once the object is picked up.
Fig. 6. The extraction of a table surface and the individual object clusters supported by it, from a partial view.
IV. MOTION PLANNING
Computing safe motions for a robotic system is a basic requirement for achieving any useful task; the motion planning component is what computes these motions. To perform planning that is fast enough for the PR2, different planners are used depending on the task.
A. Grid-based motion planning for navigation
When moving the robot base, without moving the arms,
we employ a grid-based global planner and a local controller
designed for efficient motion in tight spaces. Given a goal by
the task-level executive, the planner computes the gradient of
the global navigation function [10], determining an optimal
X-Y (ignoring yaw) kinematic path. This kinematic path is
passed to a trajectory rollout controller [11], which accounts
for the robot’s exact shape and dynamic constraints in
computing desired X, Y, and yaw velocities.
Both the planner and the controller consult a 2D obstacle
map constructed by projecting obstacle data from all available sources (in our case the horizontal laser and tilting laser)
onto the X-Y plane. This projection produces conservative
navigation behavior: the robot will never choose an unsafe
path, at the cost of excluding some safe paths, such as
moving the base partially underneath a table. In future work,
we will relax the latter constraint by planning for the base
and arm together, possibly by employing the sampling-based
techniques described in the next section.
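For intuition on the grid-based approach, the sketch below computes a cost-to-goal navigation function over a 2D occupancy grid with a Dijkstra-style wavefront expansion and extracts a path by following its gradient; it illustrates the general technique only, and is not the planner of [10] or the trajectory rollout controller of [11].

import heapq
import numpy as np

def navigation_function(occupancy, goal):
    # Compute cost-to-goal over a 2D grid via Dijkstra expansion.
    # occupancy: 2D bool array (True = obstacle); goal: (row, col).
    cost = np.full(occupancy.shape, np.inf)
    cost[goal] = 0.0
    heap = [(0.0, goal)]
    while heap:
        c, (r, col) = heapq.heappop(heap)
        if c > cost[r, col]:
            continue
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, col + dc
            if (0 <= nr < occupancy.shape[0] and 0 <= nc < occupancy.shape[1]
                    and not occupancy[nr, nc] and c + 1.0 < cost[nr, nc]):
                cost[nr, nc] = c + 1.0
                heapq.heappush(heap, (c + 1.0, (nr, nc)))
    return cost

def gradient_path(cost, start):
    # Follow the steepest descent of the navigation function to the goal.
    path = [start]
    while cost[path[-1]] > 0.0:
        r, c = path[-1]
        nbrs = [(r + dr, c + dc) for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                if 0 <= r + dr < cost.shape[0] and 0 <= c + dc < cost.shape[1]]
        best = min(nbrs, key=lambda rc: cost[rc])
        if cost[best] >= cost[path[-1]]:
            break  # no descent direction: the start is unreachable from the goal
        path.append(best)
    return path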
B. Sampling-based motion planning for reaching
When the robot is supposed to move its arm in order to
reach an object, a motion planner capable of planning in a
7-dimensional state space is needed. The dynamic obstacle
map received from the 3D perception pipeline is to be used
for collision avoidance. The task executive specifies a goal
position / goal region as input. The goal can be specified in
one of two ways:
- explicit: the goal positions are specified for each of the joints in the arm;
- implicit: constraints on the goal position are imposed. The motion planner is to find a path that takes the robot to a state that satisfies these constraints.
The output of the motion planner is a kinematic path that takes the arm from its current position to the goal while avoiding collisions.
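As a small illustration of an implicit goal, the sketch below wraps a goal region as a predicate over joint states; the end-effector position check and its tolerance are assumptions made for the example, not the constraint representation used by our planning node.

import numpy as np

def make_position_goal(forward_kinematics, target_xyz, tolerance=0.02):
    # forward_kinematics: callable mapping a joint-state vector to the
    # end-effector position (a 3-vector). Returns a goal predicate.
    target = np.asarray(target_xyz)

    def satisfies_goal(joint_state):
        end_effector = np.asarray(forward_kinematics(joint_state))
        return np.linalg.norm(end_effector - target) <= tolerance

    return satisfies_goal

# A sampling-based planner would evaluate satisfies_goal(x) on newly added
# states x and stop once a state satisfying the constraint is reached.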
In addition to constraints specified for implicit goals, it is
also possible that constraints for all states along the solution

References
- Planning Algorithms: Introductory Material (monograph).
- Probabilistic roadmaps for path planning in high-dimensional configuration spaces (journal article).
- Planning Algorithms (book).
- RRT-connect: An efficient approach to single-query path planning (conference paper).
- Randomized kinodynamic planning (journal article).
As shown, both for laser sweeps with a period of 1 or 2 seconds, all components manage to finish their computations in real-time, that is before the next state update.