Proceedings ArticleDOI

Cascading Decomposition and State Abstractions for Reinforcement Learning

TL;DR
This work proposes a cascading decomposition algorithm, based on spectral analysis of a normalized graph Laplacian, that decomposes the problem into several sub-problems and conducts parameter relevance analysis on each sub-problem to perform dynamic state abstraction.
Abstract
Problem decomposition and state abstraction, as applied in hierarchical problem solving, often require manual construction of a hierarchy structure in advance. This work provides automatic algorithms for dimension reduction in problem solving. We propose a cascading decomposition algorithm, based on spectral analysis of a normalized graph Laplacian, that decomposes the problem into several sub-problems, and we conduct parameter relevance analysis on each sub-problem to perform dynamic state abstraction. In each decomposed sub-problem, only the parameters of the projected state space that are relevant to its sub-goal are retained, and identical sub-problems are merged into one through feature comparison. The whole problem is thus transformed into a combination of projected sub-problems, and problem solving in the abstracted space is more efficient. The paper demonstrates the resulting performance improvement of reinforcement learning, using the proposed state space decomposition and abstraction methods, on a capture-the-flag scenario.
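The page does not include code, but the cascading spectral step the abstract describes can be illustrated. Below is a minimal Python sketch: build the normalized graph Laplacian of a state-transition graph, bipartition along the Fiedler vector, and recurse. The function names, stopping criteria, and parameters are illustrative assumptions, not the authors' implementation.

```python
import numpy as np
from scipy.linalg import eigh

def normalized_laplacian(A):
    """L = I - D^(-1/2) A D^(-1/2) for a symmetric adjacency matrix A."""
    d = A.sum(axis=1)
    d_inv_sqrt = np.zeros_like(d)
    nz = d > 0
    d_inv_sqrt[nz] = 1.0 / np.sqrt(d[nz])
    return np.eye(len(A)) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

def cascade_decompose(A, states, depth=0, max_depth=3, min_size=4):
    """Recursively bipartition a state graph along the Fiedler vector,
    returning a list of state subsets (candidate sub-problems).
    max_depth and min_size are assumed stopping criteria."""
    if depth >= max_depth or len(states) <= min_size:
        return [states]
    _, vecs = eigh(normalized_laplacian(A))
    fiedler = vecs[:, 1]                    # eigenvector of 2nd-smallest eigenvalue
    left = fiedler < np.median(fiedler)
    if left.all() or (~left).all():         # degenerate cut: stop recursing
        return [states]
    parts = []
    for mask in (left, ~left):
        sub_A = A[np.ix_(mask, mask)]
        sub_states = [s for s, keep in zip(states, mask) if keep]
        parts += cascade_decompose(sub_A, sub_states, depth + 1,
                                   max_depth, min_size)
    return parts
```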


Citations
Journal Article

Q-Cut: Dynamic discovery of sub-goals in reinforcement learning

TL;DR: The Q-Cut algorithm is a graph-theoretic approach for the automatic detection of sub-goals in a dynamic environment, used to accelerate the Q-Learning algorithm.
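Q-Cut's core idea, finding bottleneck states by cutting the observed transition graph, can be sketched briefly. The snippet below uses networkx's standard max-flow/min-cut rather than the paper's exact cut-quality measure; the transition format and function name are assumptions for illustration.

```python
import networkx as nx

def bottleneck_subgoals(transitions, source, target):
    """Cut the observed state-transition graph between source and target;
    endpoints of cut edges are candidate sub-goals (bottlenecks).
    `transitions` is an iterable of (state, next_state, visit_count)."""
    G = nx.DiGraph()
    for s, s_next, count in transitions:
        G.add_edge(s, s_next, capacity=count)
    cut_value, (reachable, unreachable) = nx.minimum_cut(G, source, target)
    cut_edges = [(u, v) for u in reachable
                 for v in G[u] if v in unreachable]
    return cut_edges, cut_value
```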
Book ChapterDOI

Autonomous discovery of subgoals using acyclic state trajectories

TL;DR: This work proposes an approach that enables an agent to autonomously discover subgoals for task decomposition, accelerating reinforcement learning by using divide and conquer to solve large and complex problems.
DissertationDOI

Reinforcement learning in a Multi-agent Framework for Pedestrian Simulation

TL;DR: The authors use reinforcement learning to generate plausible pedestrian simulators for different environments. The resulting simulators are robust and exhibit abstraction capabilities (behaviours at the tactical and planning levels) similar to real pedestrian dynamics.
Proceedings ArticleDOI

Abstract Concept Learning Approach Based on Behavioural Feature Extraction

TL;DR: This approach provides a useful tool for non-episodic problems, where the agent must search the environment to find special concepts, and the resulting abstract representation of those concepts can be used in further high-level planning tasks.
References
Journal ArticleDOI

Normalized cuts and image segmentation

TL;DR: This work treats image segmentation as a graph partitioning problem and proposes a novel global criterion, the normalized cut, for segmenting the graph, which measures both the total dissimilarity between the different groups and the total similarity within the groups.
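The normalized cut criterion itself is compact; here is a small Python sketch of its value for a given bipartition (variable names are assumptions):

```python
import numpy as np

def ncut(W, mask):
    """Ncut(A, B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V)
    for a symmetric weight matrix W and a boolean partition mask."""
    A, B = mask, ~mask
    cut = W[np.ix_(A, B)].sum()       # total weight between the two groups
    return cut / W[A, :].sum() + cut / W[B, :].sum()
```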
Book

Network Flows: Theory, Algorithms, and Applications

TL;DR: Presents in-depth, self-contained treatments of the shortest path, maximum flow, and minimum cost flow problems, including descriptions of polynomial-time algorithms for these core models.
Book

Spectral Graph Theory

TL;DR: Covers eigenvalues and the Laplacian of a graph; isoperimetric problems; diameters and eigenvalues; paths, flows, and routing; and eigenvalues and quasi-randomness.
Journal ArticleDOI

Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning

TL;DR: It is shown that options enable temporally abstract knowledge and action to be included in the reinforcement learning framework in a natural and general way, and may be used interchangeably with primitive actions in planning methods such as dynamic programming and in learning methods such as Q-learning.
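The learning rule underlying this framework is the SMDP Q-learning update over options: after an option o runs for k primitive steps from s to s', accumulating discounted return r, Q(s,o) moves toward r + gamma^k * max over o' of Q(s',o'). A minimal sketch in Python, with names and defaults assumed:

```python
from collections import defaultdict

def smdp_q_update(Q, s, o, r, k, s_next, options, alpha=0.1, gamma=0.95):
    """One SMDP Q-learning update: option o ran k primitive steps from
    s to s_next and accumulated discounted return r along the way."""
    target = r + gamma ** k * max(Q[(s_next, o2)] for o2 in options)
    Q[(s, o)] += alpha * (target - Q[(s, o)])

Q = defaultdict(float)  # state-option value table, zero-initialized
```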