Open Access Proceedings Article

Prioritized goal decomposition of Markov decision processes: toward a synthesis of classical and decision theoretic planning

TL;DR: An abstraction mechanism is used to generate abstract MDPs associated with different objectives, and several methods for merging the policies for these different objectives are considered.
Abstract:
We describe an approach to goal decomposition for a certain class of Markov decision processes (MDPs). An abstraction mechanism is used to generate abstract MDPs associated with different objectives, and several methods for merging the policies for these different objectives are considered. In one technique, causal (least-commitment) structures are generated for
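As a minimal illustration of the decomposition idea (names and the merging rule are hypothetical, not the authors' implementation): solve one abstract MDP per objective with value iteration, extract each objective's greedy policy, and combine them by priority.

```python
# Hypothetical sketch of per-objective MDP solving and a naive prioritized
# merge. The paper's actual merge uses causal (least-commitment) structures;
# here we simply let the highest-priority objective dictate the action.

def value_iteration(states, actions, P, R, gamma=0.9, tol=1e-6):
    """P[(s, a)] -> list of (next_state, prob); R[(s, a)] -> reward."""
    V = {s: 0.0 for s in states}
    while True:
        delta = 0.0
        for s in states:
            best = max(R[(s, a)] + gamma * sum(p * V[s2] for s2, p in P[(s, a)])
                       for a in actions)
            delta = max(delta, abs(best - V[s]))
            V[s] = best
        if delta < tol:
            return V

def greedy_policy(states, actions, P, R, V, gamma=0.9):
    """Greedy one-step-lookahead policy for a single objective's MDP."""
    return {s: max(actions,
                   key=lambda a: R[(s, a)] + gamma *
                   sum(p * V[s2] for s2, p in P[(s, a)]))
            for s in states}

def merge_by_priority(policies, states):
    """Toy merge: policies are ordered from highest to lowest priority and
    the top-priority policy wins everywhere. A real merge would detect and
    resolve conflicts between objectives."""
    return {s: policies[0][s] for s in states}
```

The per-objective solves are standard dynamic programming; only the merge step stands in for the paper's contribution.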



Citations
Journal ArticleDOI

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
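The options framework summarized above can be sketched in a few lines (an illustrative rendering, following Sutton, Precup and Singh's definition of an option as an initiation set, an internal policy, and a termination condition; the update rule is the standard SMDP Q-learning backup):

```python
from dataclasses import dataclass
from typing import Callable, Set

@dataclass
class Option:
    """An option: initiation set I, internal policy pi, and per-state
    termination probability beta. Names are illustrative."""
    initiation_set: Set[str]
    policy: Callable[[str], str]   # state -> primitive action
    beta: Callable[[str], float]   # state -> termination probability

def smdp_q_update(Q, s, option, r_cum, s_next, k, options,
                  alpha=0.1, gamma=0.9):
    """One SMDP Q-learning backup after executing `option` for k steps
    from s to s_next, with r_cum the discounted reward accumulated
    along the way. Q is keyed by (state, option)."""
    target = r_cum + (gamma ** k) * max(Q[(s_next, o)] for o in options)
    Q[(s, option)] += alpha * (target - Q[(s, option)])
    return Q
```

Because an option may last several steps, the bootstrap term is discounted by gamma to the power of the option's duration k, which is what lets options and primitive (k = 1) actions share one update rule.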
Journal ArticleDOI

Decision-theoretic planning: structural assumptions and computational leverage

TL;DR: In this article, the authors present an overview and synthesis of MDP-related methods, showing how they provide a unifying framework for modeling many classes of planning problems studied in AI.
Proceedings Article

FeUdal Networks for Hierarchical Reinforcement Learning

TL;DR: This work introduces FeUdal Networks (FuNs), a novel architecture for hierarchical reinforcement learning inspired by the feudal reinforcement learning proposal of Dayan and Hinton, and gains power and efficacy by decoupling end-to-end learning across multiple levels -- allowing it to utilise different resolutions of time.
Journal ArticleDOI

Stochastic dynamic programming with factored representations

TL;DR: This work uses dynamic Bayesian networks (with decision trees representing the local families of conditional probability distributions) to represent stochastic actions in an MDP, together with a decision-tree representation of rewards, and develops versions of standard dynamic programming algorithms that directly manipulate decision-tree representations of policies and value functions.
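A toy illustration of the tree-structured conditional probabilities this summary refers to (variable names and probabilities are invented for illustration): internal nodes test a parent variable, leaves store a probability, and contexts that share a leaf share one parameter instead of one row per joint assignment.

```python
# Hypothetical sketch: a decision tree compactly encoding
# P(wet' = true | parents) for one state variable in a factored MDP.
# Internal nodes test a boolean variable; leaves hold probabilities.

tree = {
    "test": "raining",
    "true": {"leaf": 0.9},                  # raining -> wet regardless
    "false": {"test": "sprinkler_on",
              "true": {"leaf": 0.7},
              "false": {"leaf": 0.05}},
}

def tree_prob(tree, assignment):
    """Walk the tree using a dict of boolean variable assignments and
    return the probability stored at the reached leaf."""
    while "leaf" not in tree:
        branch = "true" if assignment[tree["test"]] else "false"
        tree = tree[branch]
    return tree["leaf"]
```

A full table over (raining, sprinkler_on) would need four entries; the tree needs only three leaves because the raining = true context ignores the sprinkler.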
Posted Content

FeUdal Networks for Hierarchical Reinforcement Learning

TL;DR: FeUdal Networks (FuNs) as mentioned in this paper is a hierarchical reinforcement learning architecture that employs a Manager module and a Worker module, where the Manager operates at lower temporal resolution and sets abstract goals which are conveyed to and enacted by the Worker.
References
Book

Dynamic Programming

TL;DR: The more the authors study the information processing aspects of the mind, the more perplexed and impressed they become, and it will be a very long time before they understand these processes sufficiently to reproduce them.
Book

Markov Decision Processes: Discrete Stochastic Dynamic Programming

TL;DR: Puterman as discussed by the authors provides a uniquely up-to-date, unified, and rigorous treatment of the theoretical, computational, and applied research on Markov decision process models, focusing primarily on infinite horizon discrete time models and models with discrete state spaces while also examining models with arbitrary state spaces, finite horizon models, and continuous time discrete state models.
Book

Decisions with Multiple Objectives: Preferences and Value Trade-Offs

TL;DR: In this article, a confused decision maker, who wishes to make a reasonable and responsible choice among alternatives, can systematically probe his true feelings in order to make those critically important, vexing trade-offs between incommensurable objectives.
Journal ArticleDOI

A model for reasoning about persistence and causation

TL;DR: A model of causal reasoning that accounts for knowledge concerning cause-and-effect relationships and knowledge concerning the tendency for propositions to persist or not as a function of time passing is described.