Integrated task and motion planning in belief space
Leslie Pack Kaelbling and Tomás Lozano-Pérez
CSAIL, MIT, Cambridge, MA 02139
{lpk, tlp}@csail.mit.edu
Abstract
We describe an integrated strategy for planning, perception, state-estimation and action in
complex mobile manipulation domains based on planning in the belief space of probability
distributions over states using hierarchical goal regression (pre-image back-chaining). We develop a
vocabulary of logical expressions that describe sets of belief states, which are goals and subgoals
in the planning process. We show that a relatively small set of symbolic operators can give rise
to task-oriented perception in support of the manipulation goals. An implementation of this
method is demonstrated in simulation and on a real PR2 robot, showing robust, flexible solution
of mobile manipulation problems with multiple objects and substantial uncertainty.
1 Introduction
As robots become more capable of sophisticated sensing, navigation, and manipulation, we would
like them to carry out increasingly complex tasks autonomously. A robot that helps in a
household must select actions over the scale of hours or days, considering abstract features such as the
desires of the occupants of the house, as well as detailed geometric models that support locating
and manipulating objects. The complexity of such tasks derives from very long time horizons,
large numbers of objects to be considered and manipulated, and fundamental uncertainty about
properties and locations of those objects. Specifying control policies directly, in the form of tables
or state machines, becomes intractable as the size and variability of the domain increase. However,
good control decisions can be made with a compact specification by using a model of the world
dynamics to do on-line planning and execution.
The uncertainty in problems of this kind is pervasive and fundamental: it is, in general,
impossible to sense sufficiently to remove all of the important uncertainty. The robot will not know the
contents of all of the containers in a house, or where someone left their car keys, or the owner’s
preference for dinner. In order to behave effectively in the face of such uncertainty, the robot must
explicitly take actions to gain information: look in a cupboard, remove an occluding object, or ask
someone a question.
We have developed an approach to integrated task and motion planning that combines
geometric and symbolic representations in an aggressively hierarchical planning architecture, called
hpn (for hierarchical planning in the now), which we summarize in section 1.2 and discuss in
detail in [Kaelbling and Lozano-Pérez, 2011, 2012a]. The hierarchical decomposition allows efficient
solution of problems with very long horizons; the symbolic representations support abstraction in
complex domains with large numbers of objects and are integrated effectively with the detailed
geometric models that support motion planning. In this paper, we extend the hpn approach to

Figure 1: PR2 robot manipulating objects
handle two types of uncertainty: future-state uncertainty about what the outcome of an action
will be, and current-state uncertainty about what the current state actually is. Future-state
uncertainty is handled by planning in approximate deterministic models, performing careful execution
monitoring, and replanning when necessary. Current-state uncertainty is handled by planning in
belief space: the space of probability distributions over possible underlying world states. Explicitly
modeling the robot’s uncertainty during planning enables the selection of actions based both on
their ability to gather information and their ability to change the state of the world.
This paper describes a tightly integrated approach, weaving together perception, estimation,
geometric reasoning, symbolic task planning, and control to generate behavior in a real robot that
robustly achieves tasks in complex, uncertain domains. We have formulated this method in the
context of a mobile manipulator doing household tasks, such as the one shown in figure 1, but
it can be applied much more broadly. Problems of surveillance or locating objects of interest are
naturally formulated in this framework, as are more complex operational tasks, such as disaster
relief or managing logistical operations in a dynamic uncertain domain.
1.1 Handling uncertainty
The decision-theoretic optimal approach to planning in domains with probabilistic dynamics is to
make a conditional plan, in the form of a tree or policy, supplying an action to take in response
to any possible outcome of a preceding action. Figure 2(a) illustrates one strategy, which consists
of building a tree starting at the current world state s, branching on the robot’s choice of actions
a and then, for each action, branching on the probability of each resulting state s'. To select
actions, one grows the tree out to some depth k, then evaluates it from the bottom up using the
“expectimax” rule. A static evaluation function assigns a value ρ_0 to each leaf node. Then, the
value of a probabilistic node is computed as the expected value of its children, and the value of an
action-choice node is computed as the maximum of the values of its children. The policy consists
of selecting the maximizing action at any action-choice node that is reached.

[Figure 2(a), probabilistic policy tree: action-choice nodes over actions a1, a2, a3 alternate with outcome nodes over observations o1, o2, o3, expanding states s to s' to s''; leaf nodes carry static values ρ_0, and backed-up values ρ_1, ρ_2 are computed by expectation at outcome nodes and maximum at action-choice nodes.]
[Figure 2(b), pomdp estimation and control architecture: the environment emits observations; a state estimator combines actions and observations into a belief state; a controller maps the belief state to the next action, which is applied to the environment.]
Figure 2: Models of decision-making in uncertain domains.
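To make the expectimax rule concrete, here is a minimal Python sketch of the textbook computation (our illustration, not code from the system described here); the actions, outcomes, and rho_0 callables are hypothetical stand-ins for a domain model and static evaluation function.

```python
def expectimax(s, k, actions, outcomes, rho_0):
    """Value and best action for state s with lookahead depth k.
    actions(s) yields available actions; outcomes(s, a) yields (prob, s') pairs;
    rho_0(s) is the static evaluation assigned to leaf nodes."""
    if k == 0:
        return rho_0(s), None
    best_value, best_action = float('-inf'), None
    for a in actions(s):
        # Probabilistic node: expected value over the resulting states s'.
        value = sum(p * expectimax(s2, k - 1, actions, outcomes, rho_0)[0]
                    for p, s2 in outcomes(s, a))
        # Action-choice node: take the maximum over the children.
        if value > best_value:
            best_value, best_action = value, a
    return best_value, best_action
```

The policy is then to execute the maximizing action at each action-choice node that is actually reached, re-running the computation from the resulting state.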
The process of constructing and evaluating a tree of this kind can be prohibitively expensive;
but there have been recent advances in effective sample-based approaches [Gelly and Silver, 2008].
For efficiency and robustness, our approach to action selection is to construct a deterministic
approximation of the dynamics, use the approximate dynamics to build a sequential non-branching
plan, execute the plan while perceptually monitoring the world for deviations from the expected
outcomes of the actions, and replan when deviations occur. This method has worked well in control
applications [Mayne and Michalska, 1990] as well as symbolic planning domains [Yoon et al., 2007].
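The shape of this determinize-plan-monitor-replan loop is sketched below; the determinize, plan, execute, and observe hooks are hypothetical placeholders for domain-specific components, not the interfaces of our implementation.

```python
def plan_execute_monitor(s, goal, determinize, plan, execute, observe):
    """Plan with a deterministic approximation; monitor execution; replan on deviation."""
    model = determinize()                  # deterministic approximation of the dynamics
    while not goal(s):
        steps = plan(s, goal, model)       # sequential, non-branching plan
        if steps is None:
            raise RuntimeError('no plan found from the current state')
        for a in steps:
            expected = model(s, a)         # outcome predicted by the approximate model
            execute(a)
            s = observe()                  # perceptually monitor the actual outcome
            if s != expected:              # deviation from the expected outcome:
                break                      #   discard the rest of the plan and replan
```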
Replanning approaches that work in the state space address uncertainty in the future outcomes
of actions, but not uncertainty about the current world state. Current-state uncertainty is
unavoidable in most real-world applications. The optimal solution to such problems is found in the
formulation of partially observable Markov decision processes (pomdps) [Smallwood and Sondik,
1973] and involves planning in the belief space rather than the underlying state space of the domain.
The belief space is the space of probability distributions over underlying world states. A controller
in such a problem is divided into two components, as shown in figure 2(b). The state estimator
is a process (such as a Bayesian filter or a Kalman filter), which maintains a current distribution
over underlying world states, conditioned on the history of actions and observations the system has
made. The controller is a policy that maps belief states to actions: it can be computed off-line and
stored in a representation that allows efficient execution, or it can itself be a program that does
significant computation on-line to select an appropriate action for the current belief state.
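For example, a generic discrete Bayes filter of the kind that could play the state-estimator role can be written in a few lines; this is the standard textbook update, not the estimator used on the PR2.

```python
def bayes_filter_update(belief, a, o, trans, obs_lik):
    """One step of a discrete Bayes filter, conditioning on action a and observation o.
    belief: dict state -> probability; trans(s, a): dict s' -> probability;
    obs_lik(o, s2): likelihood of observing o in state s2."""
    predicted = {}
    for s, p in belief.items():            # prediction: push belief through the dynamics
        for s2, pt in trans(s, a).items():
            predicted[s2] = predicted.get(s2, 0.0) + p * pt
    posterior = {s2: obs_lik(o, s2) * p    # correction: weight by observation likelihood
                 for s2, p in predicted.items()}
    z = sum(posterior.values())            # normalize by the total observation probability
    return {s2: p / z for s2, p in posterior.items()}
```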
In the traditional pomdp literature, the goal is to find an optimal or near-optimal policy using
off-line computation; the policy can then be executed on-line with little further computation, simply
determining which of a finite number of categories the belief state belongs to, and executing the
associated action. Recent point-based solvers [Kurniawati et al., 2008] can find near-optimal policies

for domains with thousands of states. However, the problem of computing a near-optimal policy
for uncertain manipulation domains with many movable objects is still prohibitively expensive.
Our strategy will be to construct a policy “in the now”: that is, to apply the hpn approach to
interleaved planning and execution, but in the space of beliefs, using a determinized version of the
belief-space dynamics. Belief space is continuous (the space of probability distributions) and so is
not amenable to standard discrete planning approaches. We address it with hpn in the same way
that we address planning for geometric problems: by dynamically constructing discretizations that
are appropriate for the problem at hand.
We will use symbolic descriptions to characterize aspects of the robot’s belief state that specify
goals and subgoals during planning. For example, the condition “With probability greater than
0.95, the cup is in the cupboard,” can be written as BIn(cup, cupboard, 0.05), and might serve
as a goal for planning. Our description of the effects of the robot’s actions, encoded as operator
descriptions, will not be in terms of their effect on the state of the external world, which is not
observable, but in terms of their effect on the logical assertions that characterize the robot’s belief.
In general, it will be very difficult to characterize the exact pre-conditions of an operation in belief
space; we will strive to provide an approximation that supports the construction of reasonable plans
and rely on execution monitoring and replanning to handle errors due to approximation. We will
describe and illustrate this approach in detail in the rest of the paper.
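As a toy illustration of how such a logical assertion can be grounded in a belief state (our own simplification; the fluents in the actual implementation are richer), BIn(cup, cupboard, 0.05) might be tested as follows.

```python
def BIn(belief, obj, region, eps):
    """BIn(obj, region, eps) holds when the belief assigns probability at least
    1 - eps to obj being in region. belief[obj] maps regions to probabilities."""
    return belief[obj].get(region, 0.0) >= 1.0 - eps

# "With probability greater than 0.95, the cup is in the cupboard."
belief = {'cup': {'cupboard': 0.97, 'counter': 0.02, 'elsewhere': 0.01}}
assert BIn(belief, 'cup', 'cupboard', 0.05)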
1.2 HPN in observable domains
hpn [Kaelbling and Lozano-Pérez, 2011, 2012a] is a hierarchical approach to solving long-horizon
problems, which performs a temporal decomposition by planning operations at multiple levels of
abstraction; this ensures that problems to be addressed by the planner always have a reasonably
short horizon, making planning feasible.
In order to plan with abstract operators, we must be able to characterize their preconditions
and effects at various levels of abstraction. Even at abstract levels, geometric properties of the
domain may be critical for planning; but if we plan using abstracted models of the operations, we
will not be able to determine a detailed geometric configuration that results from performing an
operation. To support our hierarchical planning strategy, we instead plan backwards from the goal
using the process known as goal regression [Ghallab et al., 2004] in task planning and pre-image
backchaining in motion planning [Lozano-Pérez et al., 1984]. Starting from the set of states that
satisfies the goal condition, we work backward, constructing descriptions of pre-images of the goal
under various abstract operations. The pre-image is the set of states such that, if the operation
were executed, a goal state would result. The key observation is that, whereas the description of
the detailed world state is an enormously complicated structure, the descriptions of the goal set,
and of pre-images of the goal set, are often simple conjunctions of a few logical requirements.
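A schematic version of this backward search, under STRIPS-like simplifying assumptions (set-valued preconditions and add effects, delete effects ignored) rather than the full hpn machinery, is sketched below.

```python
from collections import deque

def preimage(goal, op):
    """Pre-image of a goal (set of fluents) under op: what must hold before
    executing op so that a goal state results. Delete effects are ignored."""
    if not (op['effects'] & goal):      # op must achieve some part of the goal
        return None
    return (goal - op['effects']) | op['preconds']

def regression_plan(goal, start, operators):
    """Back-chain from the goal, constructing pre-images until one is
    satisfied by the start state; returns a list of operator names."""
    frontier = deque([(frozenset(goal), [])])
    seen = {frozenset(goal)}
    while frontier:
        subgoal, plan = frontier.popleft()
        if subgoal <= start:            # start state satisfies this subgoal
            return plan
        for op in operators:
            pre = preimage(subgoal, op)
            if pre is not None and pre not in seen:
                seen.add(pre)
                frontier.append((pre, [op['name']] + plan))
    return None
```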
In a continuous space, pre-images may be characterized geometrically: if the goal is a circle of
locations in x, y space, then the operation of moving one meter in x will have a pre-image that is
also a circle of locations, but with the x coordinate displaced by a meter. In a logical, or combined
logical and geometric space, the definition of pre-image is the same, but computing pre-images will
require a combination of logical and geometric reasoning. We support abstract geometric reasoning
by constructing and referring to salient geometric objects in the logical representation used by the
planner. So, for example, we can say that a region of space must be clear of obstacles before an
object can be placed in its goal location, without specifying a particular geometric arrangement of
the obstacles in the domain.
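Written out, the circle example above is (our rendering in LaTeX):

```latex
% Goal: a circle of locations in x, y space, radius r, centered at (c_x, c_y).
G = \{\, (x, y) : (x - c_x)^2 + (y - c_y)^2 \le r^2 \,\}
% Action: move one meter in x.
a : (x, y) \mapsto (x + 1, y)
% Pre-image: the set of states from which executing a yields a goal state,
% i.e., the same circle with the x coordinate displaced by a meter.
\mathrm{preimage}(G, a)
  = \{\, (x, y) : (x + 1 - c_x)^2 + (y - c_y)^2 \le r^2 \,\}
  = \{\, (x, y) : \big(x - (c_x - 1)\big)^2 + (y - c_y)^2 \le r^2 \,\}
```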

The complexity of planning depends both on the horizon and the branching factor. We use
hierarchy to reduce the horizon, but the branching factor, in general, remains infinite: there are
innumerable places to put an object, innumerable paths for the robot to follow from one
configuration to another, and so on. We use generators: functions that make use both of constraints
from the goal and heuristic information from the starting state to make choices from these infinite
domains that are likely to be successful. Our approach can be extended to support sample-based
backtracking over these choices if they fail. Because the value-generation process can depend on
the goal and the initial state, the generated values are much more likely to be successful than ones
chosen through an arbitrary sampling or discretization process.
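A generator, in this sense, is simply a function from goal constraints and the current state to a stream of candidate values. The sketch below uses a hypothetical placement example of our own devising: candidates are sampled inside the goal region and ordered by a heuristic computed from the starting state.

```python
import random

def placement_generator(goal_region, robot_xy, n=20, seed=0):
    """Yield candidate (x, y) placements inside goal_region
    (xmin, xmax, ymin, ymax), trying placements near the robot first."""
    rng = random.Random(seed)
    xmin, xmax, ymin, ymax = goal_region
    candidates = [(rng.uniform(xmin, xmax), rng.uniform(ymin, ymax))
                  for _ in range(n)]
    # Heuristic information from the starting state: prefer nearby placements.
    candidates.sort(key=lambda p: (p[0] - robot_xy[0]) ** 2 +
                                  (p[1] - robot_xy[1]) ** 2)
    yield from candidates

# The planner draws values until one satisfies the remaining constraints,
# backtracking over further samples if a choice fails downstream.
```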
To handle execution errors robustly, we interleave planning with execution, so that all planning
problems have short horizons and start from a current, known initial state. If an action has an
unexpected effect, a new plan can be constructed at not too great a cost, and execution resumed.
We refer to this overall approach as hpn for “hierarchical planning in the now.”
1.3 Related work
Advances in perception, navigation and motion planning have enabled sophisticated manipulation
systems that interleave perception, planning and acting in realistic domains (e.g., those of Srinivasa
et al. [2009], Rusu et al. [2009], Pangercic et al. [2010]). Although these systems use various forms
of planning, there is no systematic planning framework that can cope with manipulation tasks
that involve abstract goals (such as “cook dinner”), that can plan long sequences of actions in the
presence of substantial uncertainty, both in the current state and in the effect of actions, and that
can plan for acquiring information.
The work of Dogar and Srinivasa [2012] comes closest among existing systems to satisfying our
goals. They tackle manipulation of multiple objects using multiple types of primitives, including
grasping and pushing. The inclusion of pushing enables very efficient solutions to some problems
of manipulation in cluttered environments. They also include explicit modeling of uncertainty in
object poses and the effect of the actions on this uncertainty. Their implementation does not
currently encompass actions intended to gather information, but it appears that their approach
could readily be extended to such actions. A fundamental difference from our approach is that
they plan at the level of object poses and do not integrate task-level reasoning, nor do they reason
hierarchically.
There have been several recent approaches to integrated task and motion planning [Cambon
et al., 2009, Plaku and Hager, 2010, Marthi et al., 2010, Dornhege et al., 2009] with the potential
of scaling to mobile manipulation problems; we review this related work in detail in [Kaelbling and
Lozano-Pérez, 2012a]. However, none of these approaches addresses uncertainty directly.
A number of recent papers [Alterovitz et al., 2007, Levihn et al., 2012] tackle motion planning
problems with substantial future-state uncertainty. Most relevant is the work of Levihn et al., which
addresses the problem of navigating among movable obstacles in the presence of uncertainty in the
effect of robot actions on the obstacles. However, this work assumes that there is no uncertainty
in the current state.
There have been attempts to formulate and solve problems of planning robot motions
under uncertainty in non-probabilistic frameworks dating back to the “pre-image backchaining”
paper [Lozano-Pérez et al., 1984]. This work seeks to construct strategies, based on information
sets, that are guaranteed to reach a goal and to know that they have reached it, with guaranteed
termination conditions [Brafman et al., 1997, Erdmann, 1994, Donald, 1988]. Excellent summaries