Open Access Proceedings Article

Generalizing plans to new environments in relational MDPs

TLDR
This paper presents an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs), and proves that a polynomial number of sampled environments suffices to achieve performance close to the performance achievable when optimizing over the entire space.
Abstract
A longstanding goal in planning research is the ability to generalize plans developed for some set of environments to a new but similar environment, with minimal or no replanning. Such generalization can both reduce planning time and allow us to tackle larger domains than the ones tractable for direct planning. In this paper, we present an approach to the generalization problem based on a new framework of relational Markov Decision Processes (RMDPs). An RMDP can model a set of similar environments by representing objects as instances of different classes. In order to generalize plans to multiple environments, we define an approximate value function specified in terms of classes of objects and, in a multiagent setting, by classes of agents. This class-based approximate value function is optimized relative to a sampled subset of environments, and computed using an efficient linear programming method. We prove that a polynomial number of sampled environments suffices to achieve performance close to the performance achievable when optimizing over the entire space. Our experimental results show that our method generalizes plans successfully to new, significantly larger, environments, with minimal loss of performance relative to environment-specific planning. We demonstrate our approach on a real strategic computer war game.
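
The class-based value function and the linear program over sampled environments can be made concrete with a small sketch. The snippet below is a minimal illustration under assumptions of our own (the Phi/R/P data layout, the toy environments, and the uniform state-relevance weights are not from the paper): each state's value is approximated as a weighted sum of class-level features, and Bellman constraints from every sampled environment are pooled into one LP over the shared class weights.

```python
# A minimal sketch of a class-based approximate linear program, in the spirit
# of the abstract. The data layout, names, and toy numbers are illustrative
# assumptions, not the authors' implementation.
import numpy as np
from scipy.optimize import linprog

def fit_class_weights(envs, gamma=0.95):
    """envs: list of sampled environments, each a dict with
         Phi: (S, d) class-based features; Phi[s, c] sums the basis values
              of all objects of class c in state s
         R:   (S, A) rewards
         P:   (A, S, S) transition probabilities
       Returns shared weights w with V(s) ~= Phi[s] @ w."""
    d = envs[0]["Phi"].shape[1]
    c = np.zeros(d)                      # objective: minimize the average V_w(s)
    rows, rhs = [], []
    for env in envs:
        Phi, R, P = env["Phi"], env["R"], env["P"]
        S, A = R.shape
        c += Phi.sum(axis=0) / S
        for a in range(A):
            # Require V_w(s) >= R(s, a) + gamma * E[V_w(s') | s, a] for all s.
            lhs = Phi - gamma * P[a] @ Phi
            rows.append(-lhs)            # linprog expects A_ub @ w <= b_ub
            rhs.append(-R[:, a])
    res = linprog(c, A_ub=np.vstack(rows), b_ub=np.concatenate(rhs),
                  bounds=[(None, None)] * d, method="highs")
    return res.x

# Toy usage: two sampled environments with 3 states, 2 actions, and
# 3 class-level features (the constant first column keeps the LP feasible).
rng = np.random.default_rng(0)
def toy_env():
    P = rng.random((2, 3, 3))
    P /= P.sum(axis=2, keepdims=True)    # row-normalize transitions
    Phi = np.hstack([np.ones((3, 1)), rng.random((3, 2))])
    return {"Phi": Phi, "R": rng.random((3, 2)), "P": P}

print("shared class weights:", fit_class_weights([toy_env(), toy_env()]))
```

Because the weights are indexed by class rather than by individual object, the same weight vector can then be used to score states in a new, larger environment, which is the generalization step the abstract describes.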



Citations
Journal Article

Transfer Learning for Reinforcement Learning Domains: A Survey

TL;DR: This article presents a framework that classifies transfer learning methods in terms of their capabilities and goals, and then uses it to survey the existing literature, as well as to suggest future directions for transfer learning work.
Journal Article

Markov Decision Processes

TL;DR: The theory of Markov Decision Processes is the theory of controlled Markov chains; it has found applications in areas such as computer science, engineering, operations research, biology, and economics.
Journal Article

Knowledge in Action: Logical Foundations for Specifying and Implementing Dynamical Systems

Alex M. Andrew
01 Aug 2002
TL;DR: When I started out as a newly hatched PhD student, one of the first articles I read and understood was Ray Reiter’s classic article on default logic, and I became fascinated by both default logic and, more generally, non-monotonic logics.
Proceedings Article

Hierarchical deep reinforcement learning: integrating temporal abstraction and intrinsic motivation

TL;DR: This work presents h-DQN, a framework that integrates hierarchical value functions operating at different temporal scales with intrinsically motivated deep reinforcement learning, and that allows for flexible goal specifications such as functions over entities and relations.
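
The two-timescale idea in this TL;DR can be illustrated with a tiny tabular sketch: a meta-controller picks subgoals and is trained on extrinsic reward, while a controller picks primitive actions to reach the current subgoal and is trained on intrinsic reward. The real h-DQN uses deep Q-networks with experience replay; the chain environment, goal set, and learning constants below are illustrative assumptions, not the authors' code.

```python
# Tabular sketch of a meta-controller (picks goals) over a controller
# (picks actions, rewarded intrinsically for reaching the chosen goal).
import random
from collections import defaultdict

N = 6                      # chain states 0..5; extrinsic reward at state 5
GOALS = tuple(range(N))    # candidate subgoals for the meta-controller
ACTIONS = (-1, +1)         # primitive actions: step left / right

def step(s, a):
    """Chain dynamics: extrinsic reward only for reaching the last state."""
    s2 = min(max(s + a, 0), N - 1)
    return s2, (1.0 if s2 == N - 1 else 0.0), s2 == N - 1

q_meta = defaultdict(float)   # Q over (state, goal)
q_ctrl = defaultdict(float)   # Q over (state, goal, action)
eps, alpha, gamma = 0.1, 0.5, 0.9

def pick(q, keys):
    """Epsilon-greedy choice among candidate table keys."""
    return random.choice(keys) if random.random() < eps else max(keys, key=lambda k: q[k])

for episode in range(500):
    s, done = 1, False
    while not done:
        g = pick(q_meta, [(s, goal) for goal in GOALS])[1]   # slow timescale: choose a subgoal
        s0, ext_return, t = s, 0.0, 0
        while not done and s != g and t < 2 * N:             # fast timescale: act toward the subgoal
            a = pick(q_ctrl, [(s, g, act) for act in ACTIONS])[2]
            s2, r_ext, done = step(s, a)
            r_int = 1.0 if s2 == g else 0.0                  # intrinsic reward: did we hit the subgoal?
            boot = 0.0 if (done or s2 == g) else gamma * max(q_ctrl[(s2, g, b)] for b in ACTIONS)
            q_ctrl[(s, g, a)] += alpha * (r_int + boot - q_ctrl[(s, g, a)])
            ext_return += r_ext
            s, t = s2, t + 1
        boot = 0.0 if done else gamma * max(q_meta[(s, goal)] for goal in GOALS)
        q_meta[(s0, g)] += alpha * (ext_return + boot - q_meta[(s0, g)])

print("preferred subgoal from the start state:", max(GOALS, key=lambda g: q_meta[(1, g)]))
```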
References
Book

Reinforcement Learning: An Introduction

TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the field's intellectual foundations to the most recent developments and applications.
Book

Introduction to Reinforcement Learning

TL;DR: In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning.
Book Chapter

Some philosophical problems from the standpoint of artificial intelligence

TL;DR: In this paper, the authors consider the problem of reasoning about whether a strategy will achieve a goal in a deterministic world and present a method to construct a sentence of first-order logic which will be true in all models of certain axioms if and only if a certain strategy can achieve a certain goal.
Book

A mathematical introduction to logic

TL;DR: An introductory textbook on mathematical logic, covering first-order logic and its metatheory and including a comparison of first- and second-order logic.
Journal Article

Hierarchical reinforcement learning with the MAXQ value function decomposition

TL;DR: The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction.
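
The value-function decomposition named in this TL;DR can be shown in a few lines. The sketch below covers only the evaluation recursion Q(i, s, a) = V(a, s) + C(i, s, a), not the MAXQ-Q learning algorithm or its state abstractions; the toy task hierarchy and table values are illustrative assumptions, not Dietterich's taxi domain.

```python
# MAXQ two-part decomposition: the value of a composite task is the value of
# the chosen child plus the completion value of finishing the task afterwards.
PRIMITIVE_REWARD = {("north", 0): -1.0, ("south", 0): -1.0}   # V(a, s) for primitive actions
COMPLETION = {("navigate", 0, "north"): 5.0,                  # C(task, s, child) tables
              ("navigate", 0, "south"): 0.0,
              ("root", 0, "navigate"): 2.0}
CHILDREN = {"root": ["navigate"], "navigate": ["north", "south"]}

def V(task, s):
    """Projected value of executing `task` starting in state s."""
    if task not in CHILDREN:                                  # primitive action
        return PRIMITIVE_REWARD[(task, s)]
    return max(Q(task, s, child) for child in CHILDREN[task])

def Q(task, s, child):
    """Q(task, s, child) = V(child, s) + C(task, s, child)."""
    return V(child, s) + COMPLETION[(task, s, child)]

print(V("root", 0))   # -1 (north) + 5 (complete navigate) + 2 (complete root) = 6
```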