
Showing papers by "Thomas G. Dietterich published in 2012"


Proceedings Article
03 Dec 2012
TL;DR: A probabilistic model, the Logistic Stick-Breaking Conditional Multinomial Model (LSB-CMM), is proposed, derived from the logistic stick-breaking process, to solve the superset label learning problem by maximizing the likelihood of the candidate label sets of training instances.
Abstract: In the superset label learning problem (SLL), each training instance provides a set of candidate labels of which one is the true label of the instance. As in ordinary regression, the candidate label set is a noisy version of the true label. In this work, we solve the problem by maximizing the likelihood of the candidate label sets of training instances. We propose a probabilistic model, the Logistic Stick-Breaking Conditional Multinomial Model (LSB-CMM), to do the job. The LSB-CMM is derived from the logistic stick-breaking process. It first maps data points to mixture components and then assigns to each mixture component a label drawn from a component-specific multinomial distribution. The mixture components can capture underlying structure in the data, which is very useful when the model is weakly supervised. This advantage comes at little cost, since the model introduces few additional parameters. Experimental tests on several real-world problems with superset labels show results that are competitive or superior to the state of the art. The discovered underlying structures also provide improved explanations of the classification predictions.
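
To make the generative story concrete, here is a minimal NumPy sketch of the LSB-CMM sampling process. The parameter shapes (W for the stick-breaking logistics, theta for the per-component multinomials) are hypothetical choices for illustration; the paper's actual contribution, fitting these parameters by maximizing the likelihood of candidate label sets, is omitted.

```python
import numpy as np

rng = np.random.default_rng(0)

def lsb_cmm_sample(x, W, theta):
    """Sample (component, label) for input x under the LSB-CMM generative story.

    W: (K-1, d) weights, one logistic "stick break" per component (hypothetical).
    theta: (K, L) per-component multinomial label distributions (hypothetical).
    """
    K, L = theta.shape
    v = 1.0 / (1.0 + np.exp(-(W @ x)))       # logistic break points in [0, 1]
    pi = np.empty(K)
    remaining = 1.0
    for k in range(K - 1):                   # stick-breaking: each component
        pi[k] = v[k] * remaining             # takes a fraction of what remains
        remaining *= 1.0 - v[k]
    pi[-1] = remaining                       # last component absorbs the rest
    z = rng.choice(K, p=pi)                  # map x to a mixture component
    y = rng.choice(L, p=theta[z])            # component-specific multinomial label
    return z, y
```

Training would then maximize the probability that the generated label falls inside each instance's candidate set, which requires the inference machinery developed in the paper.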

165 citations


Proceedings Article
14 Aug 2012
TL;DR: This paper considers active imitation learning with the goal of reducing this effort by querying the expert about the desired action at individual states, which are selected based on answers to past queries and the learner's interactions with an environment simulator.
Abstract: In standard passive imitation learning, the goal is to learn a target policy by passively observing full execution trajectories of it. Unfortunately, generating such trajectories can require substantial expert effort and be impractical in some cases. In this paper, we consider active imitation learning with the goal of reducing this effort by querying the expert about the desired action at individual states, which are selected based on answers to past queries and the learner's interactions with an environment simulator. We introduce a new approach based on reducing active imitation learning to i.i.d. active learning, which can leverage progress in the i.i.d. setting. Our first contribution is to analyze reductions for both non-stationary and stationary policies, showing that the label complexity (number of queries) of active imitation learning can be substantially less than that of passive learning. Our second contribution is to introduce a practical algorithm inspired by the reductions, which is shown to be highly effective in four test domains compared to a number of alternatives.
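
As a rough illustration of the query-based setting, here is a schematic uncertainty-sampling loop. This is not the paper's exact reduction (which maps active imitation learning to i.i.d. active learning over the induced state distribution); `simulate` and `expert` are assumed interfaces.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def active_imitation(simulate, expert, n_queries, pool_size=200):
    """Schematic active imitation loop. simulate(policy) returns a state
    (feature vector) visited when rolling out the current policy, and
    expert(state) returns the desired action; both are assumed interfaces."""
    X, y, policy = [], [], None
    for _ in range(n_queries):
        pool = np.array([simulate(policy) for _ in range(pool_size)])
        if policy is None:
            state = pool[0]                       # no model yet: query arbitrarily
        else:
            probs = np.sort(policy.predict_proba(pool), axis=1)
            margin = probs[:, -1] - probs[:, -2]  # small margin = ambiguous state
            state = pool[np.argmin(margin)]
        X.append(state)
        y.append(expert(state))                   # one expert query per round
        if len(set(y)) > 1:                       # need two classes to fit
            policy = LogisticRegression().fit(np.array(X), np.array(y))
    return policy
```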

46 citations


Posted Content
TL;DR: In this paper, a Dynamic Bayesian Network (DBN) model is proposed for analyzing sensor observations and distinguishing sensor failures from valid data for the case of air temperature measured at 15-minute time resolution.
Abstract: Remote sensors are becoming the standard for observing and recording ecological data in the field. Such sensors can record data at fine temporal resolutions, and they can operate under extreme conditions prohibitive to human access. Unfortunately, sensor data streams exhibit many kinds of errors ranging from corrupt communications to partial or total sensor failures. This means that the raw data stream must be cleaned before it can be used by domain scientists. In our application environment, the H.J. Andrews Experimental Forest, this data cleaning is performed manually. This paper introduces a Dynamic Bayesian Network model for analyzing sensor observations and distinguishing sensor failures from valid data for the case of air temperature measured at 15-minute time resolution. The model combines an accurate distribution of long-term and short-term temperature variations with a single generalized fault model. Experiments with historical data show that the precision and recall of the method are comparable to those of the domain expert. The system is currently being deployed to perform real-time automated data cleaning.
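
A toy version of the core idea, scoring each reading under a short-term dynamics model versus a single broad fault model, might look like the following. The distributions, ranges, and threshold are illustrative stand-ins, not the paper's fitted DBN.

```python
import numpy as np
from scipy.stats import norm, uniform

def flag_faults(temps, p_fault_prior=0.01, sigma=1.5):
    """Toy per-reading fault detection: score each observation under a
    short-term dynamics model (Gaussian around the last trusted reading)
    and under a broad fault model (uniform over an assumed reportable
    range of -40 to 60 degrees C). All numbers are illustrative."""
    flags = []
    prev = temps[0]
    for t in temps[1:]:
        like_ok = norm.pdf(t, loc=prev, scale=sigma)   # valid-data model
        like_bad = uniform.pdf(t, loc=-40, scale=100)  # generalized fault model
        post_bad = (like_bad * p_fault_prior) / (
            like_bad * p_fault_prior + like_ok * (1 - p_fault_prior))
        flags.append(post_bad > 0.5)
        if post_bad <= 0.5:
            prev = t                                   # only trust valid readings
    return flags
```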

30 citations


Posted Content
TL;DR: This paper shows how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms.
Abstract: When training data is sparse, more domain knowledge must be incorporated into the learning algorithm in order to reduce the effective size of the hypothesis space. This paper builds on previous work in which knowledge about qualitative monotonicities was formally represented and incorporated into learning algorithms (e.g., Clark & Matwin's work with the CN2 rule learning algorithm). We show how to interpret knowledge of qualitative influences, and in particular of monotonicities, as constraints on probability distributions, and to incorporate this knowledge into Bayesian network learning algorithms. We show that this yields improved accuracy, particularly with very small training sets (e.g. less than 10 examples).
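
The standard formalization of such a constraint (following Wellman's qualitative probabilistic networks) states a positive influence of a parent X on a child Y as first-order stochastic dominance between the corresponding rows of Y's conditional probability table:

```latex
% A positive qualitative influence of X on Y, holding the other parents
% Z fixed: raising X never makes large values of Y less likely.
\[
x_1 \ge x_2 \;\Longrightarrow\;
P(Y \ge y \mid x_1, \mathbf{z}) \;\ge\; P(Y \ge y \mid x_2, \mathbf{z})
\quad \text{for all values } y \text{ and all parent settings } \mathbf{z}.
\]
```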

15 citations


Journal ArticleDOI
TL;DR: The application of this novel learning and problem solving architecture to the domain of airspace management, where multiple requests for the use of airspaces need to be deconflicted, reconciled, and managed automatically is described.
Abstract: We present a novel ensemble architecture for learning problem-solving techniques from a very small number of expert solutions and demonstrate its effectiveness in a complex real-world domain. The key feature of our “Generalized Integrated Learning Architecture” (GILA) is a set of heterogeneous independent learning and reasoning (ILR) components, coordinated by a central meta-reasoning executive (MRE). The ILRs are weakly coupled in the sense that all coordination during learning and performance happens through the MRE. Each ILR learns independently from a small number of expert demonstrations of a complex task. During performance, each ILR proposes partial solutions to subproblems posed by the MRE, which are then selected from and pieced together by the MRE to produce a complete solution. The heterogeneity of the learner-reasoners allows both learning and problem solving to be more effective because their abilities and biases are complementary and synergistic. We describe the application of this novel learning and problem solving architecture to the domain of airspace management, where multiple requests for the use of airspaces need to be deconflicted, reconciled, and managed automatically. Formal evaluations show that our system performs as well as or better than humans after learning from the same training data. Furthermore, GILA outperforms any individual ILR run in isolation, thus demonstrating the power of the ensemble architecture for learning and problem solving.
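
A schematic of the MRE's select-and-assemble loop might look like the sketch below; `problem`, the ILR objects, and `score` are all hypothetical interfaces introduced for illustration, not GILA's actual API.

```python
def mre_solve(problem, ilrs, score):
    """Schematic meta-reasoning executive (MRE) loop over weakly coupled
    independent learning and reasoning (ILR) components."""
    solution = []
    pending = list(problem.subproblems())    # e.g., individual airspace conflicts
    while pending:
        sub = pending.pop(0)
        proposals = [p for ilr in ilrs       # every ILR proposes a partial
                     if (p := ilr.propose(sub)) is not None]  # solution, or passes
        best = max(proposals, key=score)     # MRE selects among partial solutions
        solution.append(best)
        pending = problem.update(best, pending)  # one choice can reshape the rest
    return solution
```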

15 citations


Proceedings ArticleDOI
04 Jun 2012
TL;DR: Three machine learning efforts for modeling ecosystems are described, including a novel approach to modeling the migration of birds; a major challenge for all of these methods is scaling up to large, spatially-distributed systems.
Abstract: To avoid ecological collapse, we must manage Earth's ecosystems sustainably. Viewed as a control problem, the two central challenges of ecosystem management are to acquire a model of the system that is sufficient to guide good decision making and then optimize the control policy against that model. This paper describes three efforts aimed at addressing the first of these challenges—machine learning methods for modeling ecosystems. The first effort focuses on automated quality control of environmental sensor data. Next, we consider the problem of learning species distribution models from citizen science observational data. Finally, we describe a novel approach to modeling the migration of birds. A major challenge for all of these methods is to scale up to large, spatially-distributed systems.

9 citations


Posted Content
TL;DR: In this article, a dynamic Bayes net model of strategies in the RTS game Starcraft is presented, which combines a generative model of how strategies relate to observable quantities with a principled framework for incorporating evidence gained via scouting.
Abstract: In typical real-time strategy (RTS) games, enemy units are visible only when they are within sight range of a friendly unit. Knowledge of an opponent's disposition is limited to what can be observed through scouting. Information is costly, since units dedicated to scouting are unavailable for other purposes, and the enemy will resist scouting attempts. It is important to infer as much as possible about the opponent's current and future strategy from the available observations. We present a dynamic Bayes net model of strategies in the RTS game Starcraft that combines a generative model of how strategies relate to observable quantities with a principled framework for incorporating evidence gained via scouting. We demonstrate the model's ability to infer unobserved aspects of the game from realistic observations.
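
The inference the abstract describes can be illustrated with exact forward filtering over a single discrete strategy variable. This is a simplification for illustration: the paper's model factors strategy into several interacting variables rather than one.

```python
import numpy as np

def filter_strategies(prior, transition, obs_like_seq):
    """Forward filtering over a discrete hidden strategy variable.

    prior: (S,) initial strategy distribution.
    transition: (S, S) with transition[i, j] = P(strategy_t = j | strategy_{t-1} = i).
    obs_like_seq: iterable of (S,) likelihoods P(observation_t | strategy).
    """
    belief = np.asarray(prior, dtype=float)
    for obs_like in obs_like_seq:
        belief = belief @ transition     # predict: the strategy may evolve
        belief = belief * obs_like       # update: weight by scouting evidence
        belief = belief / belief.sum()   # renormalize to a distribution
    return belief
```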

8 citations


Proceedings Article
14 Aug 2012
TL;DR: This work presents a dynamic Bayes net model of strategies in the RTS game Starcraft that combines a generative model of how strategies relate to observable quantities with a principled framework for incorporating evidence gained via scouting and demonstrates the model's ability to infer unobserved aspects of the game from realistic observations.
Abstract: In typical real-time strategy (RTS) games, enemy units are visible only when they are within sight range of a friendly unit. Knowledge of an opponent's disposition is limited to what can be observed through scouting. Information is costly, since units dedicated to scouting are unavailable for other purposes, and the enemy will resist scouting attempts. It is important to infer as much as possible about the opponent's current and future strategy from the available observations. We present a dynamic Bayes net model of strategies in the RTS game Starcraft that combines a generative model of how strategies relate to observable quantities with a principled framework for incorporating evidence gained via scouting. We demonstrate the model's ability to infer unobserved aspects of the game from realistic observations.

6 citations


Posted Content
TL;DR: In this paper, a new approach based on reducing active imitation learning to i.i.d. active learning is proposed, which can leverage progress in the i.i.d. setting.
Abstract: In standard passive imitation learning, the goal is to learn a target policy by passively observing full execution trajectories of it. Unfortunately, generating such trajectories can require substantial expert effort and be impractical in some cases. In this paper, we consider active imitation learning with the goal of reducing this effort by querying the expert about the desired action at individual states, which are selected based on answers to past queries and the learner's interactions with an environment simulator. We introduce a new approach based on reducing active imitation learning to i.i.d. active learning, which can leverage progress in the i.i.d. setting. Our first contribution is to analyze reductions for both non-stationary and stationary policies, showing that the label complexity (number of queries) of active imitation learning can be substantially less than that of passive learning. Our second contribution is to introduce a practical algorithm inspired by the reductions, which is shown to be highly effective in four test domains compared to a number of alternatives.

3 citations


01 Jan 2012
TL;DR: The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary “flat” Q learning.
Abstract: This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics—as a subroutine hierarchy—and a declarative semantics—as a representation of the value function of a hierarchical policy. MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling, and Dayan and Hinton. Conditions under which the MAXQ decomposition can represent the optimal value function are derived. The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary “flat” Q learning. Finally, the paper discusses some interesting issues that arise in hierarchical reinforcement learning including the hierarchical credit assignment problem and non-hierarchical execution of the MAXQ hierarchy.
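
For reference, the MAXQ decomposition the abstract summarizes expresses the value function recursively over the subroutine hierarchy:

```latex
% The value of invoking child subtask a within parent subtask i at
% state s splits into an internal value and a completion term,
\[
Q(i, s, a) \;=\; V(a, s) \;+\; C(i, s, a),
\]
% where V is defined recursively over the hierarchy,
\[
V(i, s) \;=\;
\begin{cases}
\max_{a}\, Q(i, s, a) & \text{if } i \text{ is composite},\\[2pt]
\sum_{s'} P(s' \mid s, i)\, R(s' \mid s, i) & \text{if } i \text{ is a primitive action},
\end{cases}
\]
% and C(i, s, a) is the expected discounted reward of completing
% subtask i after child a terminates.
```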

2 citations