Showing papers by "Thomas G. Dietterich" published in 2010


Book ChapterDOI
01 Dec 2010
TL;DR: Intrusion detection is a very primitive element of situation perception: an IDS identifies an event that may be part of an attack once that event adds to a recognition or identification activity.
Abstract: 1. Be aware of the current situation. This aspect can also be called situation perception. Situation perception includes both situation recognition and identification. Situation identification can include identifying the type of attack (recognition is only recognizing that an attack is occurring), the source (who, what) of an attack, the target of an attack, etc. Situation perception goes beyond intrusion detection. Intrusion detection is a very primitive element of this aspect. An IDS (intrusion detection system) is usually only a sensor; it neither identifies nor recognizes an attack but simply identifies an event that may be part of an attack once that event adds to a recognition or identification activity.

129 citations


Proceedings Article
11 Jul 2010
TL;DR: This work considers the problem of incorporating end-user advice into reinforcement learning (RL) and proposes an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy.
Abstract: We consider the problem of incorporating end-user advice into reinforcement learning (RL). In our setting, the learner alternates between practicing, where learning is based on actual world experience, and end-user critique sessions where advice is gathered. During each critique session the end-user is allowed to analyze a trajectory of the current policy and then label an arbitrary subset of the available actions as good or bad. Our main contribution is an approach for integrating all of the information gathered during practice and critiques in order to effectively optimize a parametric policy. The approach optimizes a loss function that linearly combines losses measured against the world experience and the critique data. We evaluate our approach using a prototype system for teaching tactical battle behavior in a real-time strategy game engine. Results are given for a significant evaluation involving ten end-users showing the promise of this approach and also highlighting challenges involved in inserting end-users into the RL loop.
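
The loss described in this abstract, a linear combination of a loss on world experience and a loss on critique data, might be sketched as follows. This is a minimal illustration, not the authors' formulation: the softmax policy, the form of `critique_loss` (negative log of the probability mass on actions labelled good), and the mixing weight `lam` are all assumptions introduced here.

```python
import math
from typing import Iterable, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Turn action scores into action probabilities (assumed policy form)."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def critique_loss(scores: Sequence[float], good_actions: Iterable[int]) -> float:
    """Hypothetical critique loss: negative log of the probability mass
    the policy assigns to the actions the end-user labelled as good."""
    probs = softmax(scores)
    mass = sum(probs[a] for a in good_actions)
    return -math.log(mass)


def combined_loss(experience_loss: float, crit_loss: float, lam: float) -> float:
    """Linear combination of the two losses; lam in [0, 1] is an assumed
    mixing weight (lam = 1 ignores critiques, lam = 0 ignores experience)."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    return lam * experience_loss + (1.0 - lam) * crit_loss
```

With `lam` close to 1 the learner relies mostly on practice; lowering it gives the critique labels more influence on the policy update.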

77 citations


Journal ArticleDOI
TL;DR: BugID is the first system of its kind that allows users to select thresholds for rejection depending on the required use, and has several advantages over other automated insect classification systems, including automated handling of specimens, the ability to isolate nontarget and novel species, and the ability to identify specimens across different stages of larval development.
Abstract: We present a visually based method for the taxonomic identification of benthic invertebrates that automates image capture, image processing, and specimen classification. The BugID system automatically positions and images specimens with minimal user input. Images are then processed with interest operators (machine-learning algorithms for locating informative visual regions) to identify informative pattern features, and this information is used to train a classifier algorithm. Naive Bayes modeling of stacked decision trees is used to determine whether a specimen is an unknown distractor (taxon not in the training data set) or one of the species in the training set. When tested on images from 9 larval stonefly taxa, BugID correctly identified 94.5% of images, even though small or damaged specimens were included in testing. When distractor taxa (10 common invertebrates not present in the training set) were included to make classification more challenging, overall accuracy decreased but generally was close to 90%. At the equal error rate (EER), 89.5% of stonefly images were correctly classified and the accuracy of nonrejected stoneflies increased to 96.4%, a result suggesting that many difficult-to-identify or poorly imaged stonefly specimens had been rejected prior to classification. BugID is the first system of its kind that allows users to select thresholds for rejection depending on the required use. Rejected images of distractor taxa or difficult specimens can be identified later by a taxonomic expert, and new taxa ultimately can be incorporated into the training set of known taxa. BugID has several advantages over other automated insect classification systems, including automated handling of specimens, the ability to isolate nontarget and novel species, and the ability to identify specimens across different stages of larval development.
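
The user-selectable rejection threshold mentioned in this abstract can be illustrated with a generic reject-option rule. This is a sketch under stated assumptions, not BugID's actual classifier: the per-taxon confidence scores, the taxon names, and the function `classify_with_rejection` are hypothetical.

```python
from typing import Dict, Optional


def classify_with_rejection(scores: Dict[str, float], threshold: float) -> Optional[str]:
    """Return the highest-scoring taxon, or None (rejected) when its
    confidence falls below the user-chosen threshold."""
    if not scores:
        return None
    label = max(scores, key=scores.get)
    return label if scores[label] >= threshold else None


# Hypothetical confidence scores for two specimens.
confident = {"taxon_a": 0.90, "taxon_b": 0.10}
ambiguous = {"taxon_a": 0.55, "taxon_b": 0.45}

accepted = classify_with_rejection(confident, threshold=0.8)  # "taxon_a"
rejected = classify_with_rejection(ambiguous, threshold=0.8)  # None
```

Raising the threshold rejects more specimens, which would be passed to a taxonomic expert, and tends to raise accuracy on the specimens that remain.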

65 citations


Proceedings ArticleDOI
10 Apr 2010
TL;DR: A longitudinal study of knowledge workers at Intel Corporation tracking provenance events in their computer use and the effectiveness of provenance cues for document recall shows that provenance relationships are common, and provenance cues aid recall.
Abstract: In the field of Human-Computer Interaction, provenance refers to the history and genealogy of a document or file. Provenance helps us to understand the evolution and relationships of files; how and when different versions of a document were created, or how different documents in a collection build on each other through copy-paste events. Though methods for tracking provenance and the subsequent use of this meta-data have been proposed and developed into tools, there have been no studies documenting the types and frequency of provenance events in typical computer use. This is knowledge essential for the design of efficient query methods and information displays. We conducted a longitudinal study of knowledge workers at Intel Corporation tracking provenance events in their computer use. We also interviewed knowledge workers to determine the effectiveness of provenance cues for document recall. Our data shows that provenance relationships are common, and provenance cues aid recall.

60 citations


Proceedings ArticleDOI
23 Aug 2010
TL;DR: This paper proposes an image classification method based on extracting image features using Haar random forests and combining them with a spatial matching kernel SVM that has state-of-the-art or better performance, but with much higher efficiency.
Abstract: This paper proposes an image classification method based on extracting image features using Haar random forests and combining them with a spatial matching kernel SVM. The method works by combining multiple efficient, yet powerful, learning algorithms at every stage of the recognition process. On the task of identifying aquatic stonefly larvae, the method has state-of-the-art or better performance, but with much higher efficiency.

40 citations


Book ChapterDOI
01 Jan 2010
TL;DR: The goal is to develop an awareness of what desktop users are doing as they work, which has many potential applications including cyber situation awareness.
Abstract: Cyber situation awareness needs to operate at many levels of abstraction. In this chapter, we discuss situation awareness at a very high level—the behavior of desktop computer users. Our goal is to develop an awareness of what desktop users are doing as they work. Such awareness has many potential applications including

15 citations


Proceedings Article
06 Jun 2010
TL;DR: This paper presents a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules.
Abstract: In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging due to two reasons. First, natural texts are radically incomplete since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. We first present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble colearning approach, where multiple decision-trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on weighted majority.
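
The two generic building blocks named in the second approach, bootstrap sampling and a weighted majority vote, might look like the sketch below. It does not reproduce the paper's learner or its observation model; the function names, the vote weights, and the example values are assumptions made for illustration.

```python
import random
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def bootstrap_sample(data: Sequence, rng: random.Random) -> List:
    """Draw len(data) items from data uniformly with replacement."""
    return [rng.choice(data) for _ in data]


def weighted_majority(votes: Sequence[Tuple[Hashable, float]]) -> Hashable:
    """Return the value whose votes carry the largest total weight."""
    totals: Dict[Hashable, float] = defaultdict(float)
    for value, weight in votes:
        totals[value] += weight
    return max(totals, key=totals.get)


# Hypothetical votes from three ensemble members on one missing fact.
imputed = weighted_majority([("win", 0.6), ("lose", 0.9), ("win", 0.5)])
```

In an ensemble of this kind, each member would be trained on its own bootstrap sample and its vote weighted, for example, by an estimate of its accuracy.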

5 citations


Proceedings Article
01 Jun 2010
TL;DR: A Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules, and an iterative ensemble colearning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data.
Abstract: In this paper, we consider the problem of inductively learning rules from specific facts extracted from texts. This problem is challenging due to two reasons. First, natural texts are radically incomplete since there are always too many facts to mention. Second, natural texts are systematically biased towards novelty and surprise, which presents an unrepresentative sample to the learner. Our solutions to these two problems are based on building a generative observation model of what is mentioned and what is extracted given what is true. We first present a Multiple-predicate Bootstrapping approach that consists of iteratively learning if-then rules based on an implicit observation model and then imputing new facts implied by the learned rules. Second, we present an iterative ensemble colearning approach, where multiple decision trees are learned from bootstrap samples of the incomplete training data, and facts are imputed based on weighted majority.