Open Access Proceedings Article

Multi-time Models for Temporally Abstract Planning

TLDR
A more general form of temporally abstract model is introduced, the multi-time model, and its suitability for planning and learning by virtue of its relationship to the Bellman equations is established.
Abstract
Planning and learning at multiple levels of temporal abstraction is a key problem for artificial intelligence. In this paper we summarize an approach to this problem based on the mathematical framework of Markov decision processes and reinforcement learning. Current model-based reinforcement learning is based on one-step models that cannot represent common-sense higher-level actions, such as going to lunch, grasping an object, or flying to Denver. This paper generalizes prior work on temporally abstract models [Sutton, 1995] and extends it from the prediction setting to include actions, control, and planning. We introduce a more general form of temporally abstract model, the multi-time model, and establish its suitability for planning and learning by virtue of its relationship to the Bellman equations. This paper summarizes the theoretical framework of multi-time models and illustrates their potential advantages in a grid world planning task.
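As a concrete illustration (not taken from the paper itself), the relationship between a temporally abstract model and the Bellman equations can be sketched for a deterministic corridor world. The sketch below assumes the multi-time-model convention of folding the discount factor into the terminal-state prediction: an option's model consists of its expected discounted reward `R_o(s)` and a discounted termination prediction `P_o(s, s')`, so that a single option backup `R_o(s) + P_o(s, s') * V(s')` agrees with the value obtained by ordinary one-step value iteration. The corridor world and the "run right to the goal" option are illustrative choices, not the paper's grid world task.

```python
# Corridor world: states 0..N-1, goal at the right end (state N-1).
# Primitive actions move left or right; reward 1 is received on
# entering the goal, which is absorbing with value 0.
N, GAMMA = 6, 0.9

def value_iteration(n_states, gamma, tol=1e-10):
    """Ordinary one-step Bellman backups over primitive actions."""
    V = [0.0] * n_states
    while True:
        V_new = [0.0] * n_states
        for s in range(n_states - 1):          # goal state keeps value 0
            backups = []
            for step in (-1, +1):
                s2 = min(max(s + step, 0), n_states - 1)
                r = 1.0 if s2 == n_states - 1 else 0.0
                v2 = 0.0 if s2 == n_states - 1 else V[s2]
                backups.append(r + gamma * v2)
            V_new[s] = max(backups)
        if max(abs(a - b) for a, b in zip(V_new, V)) < tol:
            return V_new
        V = V_new

def option_model(s, n_states, gamma):
    """Multi-time model of the option 'run right until the goal'.

    R_o(s): expected discounted reward accumulated while the option runs.
    P_o(s, goal): gamma**k times the probability of terminating in the
    goal after k steps -- the discounting is folded into the prediction.
    """
    d = (n_states - 1) - s                     # steps until the goal
    R_o = gamma ** (d - 1)                     # reward 1 arrives on step d
    P_o = gamma ** d                           # discounted termination weight
    return R_o, P_o

V = value_iteration(N, GAMMA)
# One option backup per state spans the whole temporally extended action:
# V(s) = R_o(s) + P_o(s, goal) * V(goal), with V(goal) = 0.
option_values = [option_model(s, N, GAMMA)[0] for s in range(N - 1)]
```

Because the option is optimal in this world, a single backup through its multi-time model reproduces the values that one-step value iteration reaches only after many sweeps, which is the planning advantage the abstract describes.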


Citations
Journal ArticleDOI

Deep learning in neural networks

TL;DR: This historical survey compactly summarizes relevant work, much of it from the previous millennium, reviewing deep supervised learning, unsupervised learning, reinforcement learning and evolutionary computation, and indirect search for short programs encoding deep and large networks.
Journal ArticleDOI

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way, and that options may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
Journal ArticleDOI

Hierarchical reinforcement learning with the MAXQ value function decomposition

TL;DR: The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction.
Journal ArticleDOI

Recent Advances in Hierarchical Reinforcement Learning

TL;DR: This work reviews several approaches to temporal abstraction and hierarchical organization that machine learning researchers have recently developed and discusses extensions of these ideas to concurrent activities, multiagent coordination, and hierarchical memory for addressing partial observability.
Proceedings Article

Reinforcement Learning with Hierarchies of Machines

TL;DR: This work presents provably convergent algorithms for problem-solving and learning with hierarchical machines and demonstrates their effectiveness on a problem with several thousand states.
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, which ranges from the history of the field's intellectual foundations to the most recent developments and applications.
Book

Introduction to Reinforcement Learning

TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.
Journal ArticleDOI

Learning from delayed rewards

TL;DR: Watkins' thesis introduces Q-learning, a model-free method for learning optimal action values from delayed rewards through direct interaction with the environment. (The aggregator's original summary described an unrelated stereo-receiver patent and has been corrected here.)
Book

Reinforcement Learning

TL;DR: Reinforcement learning, as presented in this book, is an approach to artificial intelligence that emphasizes learning by the individual from its interaction with its environment; a central theme is that neither exploration nor exploitation can be pursued exclusively without failing at the task.
Book

A Structure for Plans and Behavior

TL;DR: Progress to date in the ability of a computer system to understand and reason about actions is described, and the structure of a plan of actions is as important for problem solving and execution monitoring as the nature of the actions themselves.