Showing papers by "Thomas G. Dietterich" published in 2010

Book Chapter

Cyber SA: situational awareness for cyber defense

Paul Barford¹, Marc Dacier², Thomas G. Dietterich³, Matt Fredrikson¹, Jonathon Giffin⁴, Sushil Jajodia⁵, Somesh Jha¹, Jason Li, Peng Liu⁶, Peng Ning⁷, Xinming Ou⁸, Dawn Song⁹, Laura Strater, Vipin Swarup, George P. Tadda¹⁰, Cliff Wang, John Yen⁶•Institutions (10)

University of Wisconsin-Madison¹, Symantec², Oregon State University³, Georgia Institute of Technology⁴, George Mason University⁵, Pennsylvania State University⁶, North Carolina State University⁷, Kansas State University⁸, University of California⁹, Air Force Research Laboratory¹⁰

01 Dec 2010

TL;DR: Intrusion detection is a very primitive element of this aspect of situation perception; it only identifies an event that may be part of an attack once that event adds to a recognition or identification activity.

Abstract: 1. Be aware of the current situation. This aspect can also be called situation perception. Situation perception includes both situation recognition and identification. Situation identification can include identifying the type of attack (recognition is only recognizing that an attack is occurring), the source (who, what) of an attack, the target of an attack, etc. Situation perception is beyond intrusion detection. Intrusion detection is a very primitive element of this aspect. An IDS (intrusion detection system) is usually only a sensor; it neither identifies nor recognizes an attack but simply identifies an event that may be part of an attack once that event adds to a recognition or identification activity.

129 citations

Proceedings Article

Reinforcement learning via practice and critique advice

Kshitij Judah¹, Saikat Roy¹, Alan Fern¹, Thomas G. Dietterich¹•Institutions (1)

Oregon State University¹

11 Jul 2010

TL;DR: This work considers the problem of incorporating end-user advice into reinforcement learning (RL) and proposes an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy.

Abstract: We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on actual world experience, and end-user critique sessions where advice is gathered. During each critique session the end-user is allowed to analyze a trajectory of the current policy and then label an arbitrary subset of the available actions as good or bad. Our main contribution is an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy. The approach optimizes a loss function that linearly combines losses measured against the world experience and the critique data. We evaluate our approach using a prototype system for teaching tactical battle behavior in a real-time strategy game engine. Results are given for a significant evaluation involving ten end-users showing the promise of this approach and also highlighting challenges involved in inserting end-users into the RL loop.
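The linear combination of losses mentioned in the abstract can be illustrated with a minimal sketch. The function name `combined_loss`, the trade-off weight `lam`, the convex form of the combination, and the toy quadratic losses are all assumptions made for illustration; the abstract does not specify the paper's actual loss terms or weighting.

```python
from typing import Callable, Sequence

Loss = Callable[[Sequence[float]], float]


def combined_loss(
    theta: Sequence[float],
    experience_loss: Loss,
    critique_loss: Loss,
    lam: float,
) -> float:
    """Linearly combine a practice (experience) loss and a critique loss.

    `lam` is a hypothetical trade-off weight in [0, 1]: 0 uses only the
    experience loss, 1 uses only the critique loss.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return (1.0 - lam) * experience_loss(theta) + lam * critique_loss(theta)


# Toy stand-ins for the two losses (not the losses used in the paper).
def toy_experience_loss(theta: Sequence[float]) -> float:
    return theta[0] ** 2


def toy_critique_loss(theta: Sequence[float]) -> float:
    return (theta[0] - 3.0) ** 2


print(combined_loss([1.0], toy_experience_loss, toy_critique_loss, lam=0.25))
```

With a single weight, shifting `lam` toward 1 makes the optimizer trust end-user critiques more than world experience, and vice versa.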

77 citations

Journal Article

Automated processing and identification of benthic invertebrate samples

David A. Lytle¹, Gonzalo Martínez-Muñoz¹, Wei Zhang¹, Natalia Larios², Linda G. Shapiro², Robert Paasch¹, Andrew R. Moldenke¹, Eric N. Mortensen¹, Sinisa Todorovic¹, Thomas G. Dietterich¹•Institutions (2)

Oregon State University¹, University of Washington²

08 Jun 2010 - Journal of the North American Benthological Society

TL;DR: BugID is the first system of its kind that allows users to select thresholds for rejection depending on the required use, and has several advantages over other automated insect classification systems, including automated handling of specimens, the ability to isolate nontarget and novel species, and the ability to identify specimens across different stages of larval development.

Abstract: We present a visually based method for the taxonomic identification of benthic invertebrates that automates image capture, image processing, and specimen classification. The BugID system automatically positions and images specimens with minimal user input. Images are then processed with interest operators (machine-learning algorithms for locating informative visual regions) to identify informative pattern features, and this information is used to train a classifier algorithm. Naive Bayes modeling of stacked decision trees is used to determine whether a specimen is an unknown distractor (taxon not in the training data set) or one of the species in the training set. When tested on images from 9 larval stonefly taxa, BugID correctly identified 94.5% of images, even though small or damaged specimens were included in testing. When distractor taxa (10 common invertebrates not present in the training set) were included to make classification more challenging, overall accuracy decreased but generally was close to 90%. At the equal error rate (EER), 89.5% of stonefly images were correctly classified and the accuracy of nonrejected stoneflies increased to 96.4%, a result suggesting that many difficult-to-identify or poorly imaged stonefly specimens had been rejected prior to classification. BugID is the first system of its kind that allows users to select thresholds for rejection depending on the required use. Rejected images of distractor taxa or difficult specimens can be identified later by a taxonomic expert, and new taxa ultimately can be incorporated into the training set of known taxa. BugID has several advantages over other automated insect classification systems, including automated handling of specimens, the ability to isolate nontarget and novel species, and the ability to identify specimens across different stages of larval development.
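The user-selectable rejection threshold described in the abstract can be sketched as follows. This is a minimal illustration, not the BugID implementation: it assumes each specimen already has per-taxon confidence scores, and it does not reproduce the naive Bayes modeling of stacked decision trees. The function names, taxon labels, and scores are hypothetical.

```python
from typing import Dict, List, Optional, Tuple


def classify_with_rejection(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the top-scoring label, or None (reject) if its score is below threshold."""
    label = max(scores, key=lambda k: scores[k])
    return label if scores[label] >= threshold else None


def evaluate(predictions: List[Optional[str]], truths: List[str]) -> Tuple[float, float]:
    """Return (accuracy among accepted specimens, fraction of specimens accepted)."""
    accepted = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    if not accepted:
        return 0.0, 0.0
    correct = sum(1 for p, t in accepted if p == t)
    return correct / len(accepted), len(accepted) / len(predictions)


# Made-up confidence scores and true labels, for illustration only.
samples = [
    ({"taxon_a": 0.90, "taxon_b": 0.10}, "taxon_a"),
    ({"taxon_a": 0.55, "taxon_b": 0.45}, "taxon_b"),
    ({"taxon_a": 0.20, "taxon_b": 0.80}, "taxon_b"),
    ({"taxon_a": 0.60, "taxon_b": 0.40}, "taxon_a"),
]
truths = [truth for _, truth in samples]

for threshold in (0.0, 0.7):
    preds = [classify_with_rejection(scores, threshold) for scores, _ in samples]
    print(threshold, evaluate(preds, truths))
```

On this toy data, raising the threshold rejects the low-confidence specimens, so fewer specimens are classified but accuracy among those accepted goes up. That is the trade-off a user tunes when choosing a threshold for a given use.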

65 citations

Proceedings Article

The life and times of files and information: a study of desktop provenance

Carlos Jensen¹, Heather Lonsdale¹, Eleanor Wynn², Jill Cao¹, Michael Slater¹, Thomas G. Dietterich¹•Institutions (2)

Oregon State University¹, Intel²

10 Apr 2010

TL;DR: A longitudinal study of knowledge workers at Intel Corporation, tracking provenance events in their computer use and the effectiveness of provenance cues for document recall, shows that provenance relationships are common and that provenance cues aid recall.

Abstract: In the field of Human-Computer Interaction, provenance refers to the history and genealogy of a document or file. Provenance helps us to understand the evolution and relationships of files; how and when different versions of a document were created, or how different documents in a collection build on each other through copy-paste events. Though methods for tracking provenance and the subsequent use of this meta-data have been proposed and developed into tools, there have been no studies documenting the types and frequency of provenance events in typical computer use. This is knowledge essential for the design of efficient query methods and information displays. We conducted a longitudinal study of knowledge workers at Intel Corporation tracking provenance events in their computer use. We also interviewed knowledge workers to determine the effectiveness of provenance cues for document recall. Our data shows that provenance relationships are common, and provenance cues aid recall.

60 citations

Proceedings Article

Haar Random Forest Features and SVM Spatial Matching Kernel for Stonefly Species Identification

Natalia Larios¹, Bilge Soran¹, Linda G. Shapiro¹, Gonzalo Martínez-Muñoz, Lin Junyuan², Thomas G. Dietterich²•Institutions (2)

University of Washington¹, Oregon State University²

23 Aug 2010

TL;DR: This paper proposes an image classification method based on extracting image features using Haar random forests and combining them with a spatial matching kernel SVM that has state-of-the-art or better performance, but with much higher efficiency.

Abstract: This paper proposes an image classification method based on extracting image features using Haar random forests and combining them with a spatial matching kernel SVM. The method works by combining multiple efficient, yet powerful, learning algorithms at every stage of the recognition process. On the task of identifying aquatic stonefly larvae, the method has state-of-the-art or better performance, but with much higher efficiency.

40 citations

Book Chapter

Machine Learning Methods for High Level Cyber Situation Awareness

Thomas G. Dietterich¹, Xinlong Bao¹, Victoria Keiser¹, Jianqiang Shen¹•Institutions (1)

Oregon State University¹

01 Jan 2010

TL;DR: The goal is to develop an awareness of what desktop users are doing as they work, which has many potential applications including cyber situation awareness.

Abstract: Cyber situation awareness needs to operate at many levels of abstraction. In this chapter, we discuss situation awareness at a very high level—the behavior of desktop computer users. Our goal is to develop an awareness of what desktop users are doing as they work. Such awareness has many potential applications including

15 citations

Proceedings Article

Towards learning rules from natural texts

Janardhan Rao Doppa¹, Mohammad NasrEsfahani¹, Mohammad S. Sorower¹, Thomas G. Dietterich¹, Xiaoli Z. Fern¹, Prasad Tadepalli¹•Institutions (1)

Oregon State University¹

06 Jun 2010

TL;DR: This paper presents a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules.

Abstract: In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging due to two reasons. First, natural texts are radically incomplete since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. We first present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble colearning approach, where multiple decision-trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on weighted majority.
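Two ingredients named in the abstract, bootstrap sampling of the training data and imputing a fact by weighted majority, can be sketched in a few lines. This is an illustrative sketch only: the helper names, the vote weights, and the example facts are hypothetical, and the decision-tree learning step itself is omitted.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, Iterable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def bootstrap_sample(data: Sequence[T], rng: random.Random) -> List[T]:
    """Draw len(data) items from data uniformly at random, with replacement."""
    return rng.choices(data, k=len(data))


def weighted_majority(votes: Iterable[Tuple[Hashable, float]]) -> Hashable:
    """Return the value whose votes carry the largest total weight."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for value, weight in votes:
        totals[value] += weight
    if not totals:
        raise ValueError("at least one vote is required")
    return max(totals, key=lambda v: totals[v])


# Hypothetical incomplete training facts; each ensemble member would be
# trained on its own bootstrap sample of them.
facts = ["fact_1", "fact_2", "fact_3", "fact_4"]
rng = random.Random(0)
samples = [bootstrap_sample(facts, rng) for _ in range(3)]

# Hypothetical (predicted value, weight) votes from three ensemble members
# for one missing fact.
imputed = weighted_majority([(True, 0.4), (False, 0.35), (False, 0.3)])
print(imputed)
```

Here the two `False` votes together outweigh the single `True` vote, so `False` is imputed even though `True` has the largest individual weight.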

5 citations

Proceedings Article

Learning Rules from Incomplete Examples: A Pragmatic Approach

Janardhan Rao Doppa¹, Mohammad NasrEsfahani¹, Mohammad S. Sorower¹, Thomas G. Dietterich¹, Xiaoli Z. Fern¹, Prasad Tadepalli¹•Institutions (1)

Oregon State University¹

01 Jun 2010

TL;DR: A Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules, and an iterative ensemble colearning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data.

Abstract: In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging due to two reasons. First, natural texts are radically incomplete since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. We first present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble colearning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on weighted majority.
