
Showing papers by "Thomas G. Dietterich published in 1999"


Proceedings Article
29 Nov 1999
TL;DR: This paper defines five conditions under which state abstraction can be combined with the MAXQ value function decomposition, proves that the MAXQ-Q learning algorithm converges under these conditions, and shows experimentally that state abstraction is important for the successful application of MAXQ-Q learning.
Abstract: Many researchers have explored methods for hierarchical reinforcement learning (RL) with temporal abstractions, in which abstract actions are defined that can perform many primitive actions before terminating. However, little is known about learning with state abstractions, in which aspects of the state space are ignored. In previous work, we developed the MAXQ method for hierarchical RL. In this paper, we define five conditions under which state abstraction can be combined with the MAXQ value function decomposition. We prove that the MAXQ-Q learning algorithm converges under these conditions and show experimentally that state abstraction is important for the successful application of MAXQ-Q learning.
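The decomposition this abstract builds on is easiest to see in code. Below is a minimal, illustrative sketch (not the authors' implementation) of how a MAXQ-style hierarchy might store its value function when each subtask keeps only the state variables relevant to it; the `Subtask` class and the `relevant_vars` and `v_primitive` names are assumptions made for illustration.

```python
# Illustrative sketch of MAXQ-style value decomposition with state abstraction.
# Each subtask projects the full state onto its own relevant variables, so its
# completion-value table C(i, s, a) is indexed only by the abstracted state.

from collections import defaultdict


class Subtask:
    def __init__(self, name, children, relevant_vars):
        self.name = name
        self.children = children            # child subtasks; empty if primitive
        self.relevant_vars = relevant_vars  # state variables this subtask keeps
        self.C = defaultdict(float)         # completion values C(i, s, a)

    def abstract(self, state):
        """Project the full state (a dict) onto this subtask's relevant variables."""
        return tuple(state[v] for v in self.relevant_vars)


def q_value(task, state, action, v_primitive):
    """Q(i, s, a) = V(a, s) + C(i, s, a): value of doing `action` in `state`
    and then completing `task`.  `v_primitive` maps (primitive action name,
    abstracted state) to its learned expected reward; composite values are
    computed recursively down the hierarchy."""
    if not action.children:                           # primitive action
        v = v_primitive[(action.name, action.abstract(state))]
    else:                                             # composite: best child Q
        v = max(q_value(action, state, a, v_primitive) for a in action.children)
    return v + task.C[(task.abstract(state), action.name)]
```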

74 citations


Posted Content
TL;DR: The MAXQ-Q algorithm as mentioned in this paper decomposes the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposes the value function of the target MDP into an additive combination of the value functions of the smaller MDPs, and it is proven to converge with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy.
Abstract: This paper presents the MAXQ approach to hierarchical reinforcement learning based on decomposing the target Markov decision process (MDP) into a hierarchy of smaller MDPs and decomposing the value function of the target MDP into an additive combination of the value functions of the smaller MDPs. The paper defines the MAXQ hierarchy, proves formal results on its representational power, and establishes five conditions for the safe use of state abstractions. The paper presents an online model-free learning algorithm, MAXQ-Q, and proves that it converges with probability 1 to a kind of locally-optimal policy known as a recursively optimal policy, even in the presence of the five kinds of state abstraction. The paper evaluates the MAXQ representation and MAXQ-Q through a series of experiments in three domains and shows experimentally that MAXQ-Q (with state abstractions) converges to a recursively optimal policy much faster than flat Q learning. The fact that MAXQ learns a representation of the value function has an important benefit: it makes it possible to compute and execute an improved, non-hierarchical policy via a procedure similar to the policy improvement step of policy iteration. The paper demonstrates the effectiveness of this non-hierarchical execution experimentally. Finally, the paper concludes with a comparison to related work and a discussion of the design tradeoffs in hierarchical reinforcement learning.
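As a rough illustration of the learning rule the abstract refers to, the sketch below shows one plausible form of the MAXQ-Q update to a parent subtask's completion function. It is a paraphrase under stated assumptions, not the paper's pseudocode; `q_value` and `child_actions` are assumed helpers supplied by the caller.

```python
# Sketch of the MAXQ-Q update for a parent subtask's completion function
# C(i, s, a).  After child subtask `a` executes for n_steps and the state
# moves from s to s_next, C is nudged toward the discounted value of
# finishing subtask `i` from s_next under a greedy choice of next child.

from collections import defaultdict

C = defaultdict(float)  # completion values C(i, s, a)


def maxq_q_update(i, s, a, s_next, n_steps, child_actions, q_value,
                  alpha=0.1, gamma=0.95):
    # q_value(i, s, a) is assumed to return V(a, s) + C(i, s, a), i.e. the
    # MAXQ decomposition of the Q function for subtask i.
    best_next = max(q_value(i, s_next, a_next) for a_next in child_actions)
    target = (gamma ** n_steps) * best_next
    C[(i, s, a)] += alpha * (target - C[(i, s, a)])
```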

20 citations


01 Jan 1999
TL;DR: This work studies methods for modifying C4.5 to incorporate arbitrary loss matrices, comparing a wrapper method against some simple heuristics, and defines a complexity measure for loss matrices that predicts when the more efficient heuristics will suffice and when the wrapper method must be applied.
Abstract: Many machine learning applications require classifiers that minimize an asymmetric loss function rather than the raw misclassification rate. We study methods for modifying C4.5 to incorporate arbitrary loss matrices. One way to incorporate loss information into C4.5 is to manipulate the weights assigned to the examples from different classes. For 2-class problems, this works for any loss matrix, but for k > 2 classes, it is not sufficient. Nonetheless, we ask what is the set of class weights that best approximates an arbitrary k × k loss matrix, and we test and compare several methods: a wrapper method and some simple heuristics. The best method is a wrapper method that directly optimizes the loss using a holdout data set. We define a complexity measure for loss matrices and show that this measure can predict when more efficient methods will suffice and when the wrapper method must be applied.
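To make the two ideas concrete, here is a hedged sketch (not the paper's experimental code) of folding a 2-class loss matrix into class weights and of a brute-force wrapper that picks a weight vector by holdout loss; `train_weighted_tree` is a hypothetical stand-in for a weighted C4.5-style learner, and the candidate weight grid is an arbitrary choice.

```python
# Sketch: cost-sensitive learning via class weights.
# (1) For two classes, any loss matrix (with zero diagonal) can be folded
#     exactly into per-class example weights.
# (2) For k > 2 classes, a wrapper can search candidate weight vectors and
#     keep the one with the lowest loss on a holdout set.

import itertools
import numpy as np


def two_class_weights(loss):
    """loss[i][j] = cost of predicting class j when the truth is class i.
    Weight each class by the cost of misclassifying it (zero-cost diagonal assumed)."""
    return np.array([loss[0][1], loss[1][0]], dtype=float)


def holdout_loss(model, X, y, loss):
    """Average asymmetric loss of `model` on a labeled holdout set."""
    preds = model.predict(X)
    return np.mean([loss[t][p] for t, p in zip(y, preds)])


def wrapper_weights(train, holdout, loss, k, candidates=(0.5, 1.0, 2.0, 4.0)):
    """Brute-force wrapper: choose the class-weight vector minimizing holdout loss."""
    X_tr, y_tr = train
    X_ho, y_ho = holdout
    best_w, best_l = None, float("inf")
    for w in itertools.product(candidates, repeat=k):
        model = train_weighted_tree(X_tr, y_tr, class_weights=w)  # assumed learner
        l = holdout_loss(model, X_ho, y_ho, loss)
        if l < best_l:
            best_w, best_l = w, l
    return best_w
```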

19 citations


Posted Content
TL;DR: In this paper, five conditions under which state abstraction can be combined with the MAXQ value function decomposition are defined, and it is shown experimentally that state abstraction is important for the successful application of MAXQ-Q learning.
Abstract: Many researchers have explored methods for hierarchical reinforcement learning (RL) with temporal abstractions, in which abstract actions are defined that can perform many primitive actions before terminating. However, little is known about learning with state abstractions, in which aspects of the state space are ignored. In previous work, we developed the MAXQ method for hierarchical RL. In this paper, we define five conditions under which state abstraction can be combined with the MAXQ value function decomposition. We prove that the MAXQ-Q learning algorithm converges under these conditions and show experimentally that state abstraction is important for the successful application of MAXQ-Q learning.

7 citations