
Showing papers by "Thomas G. Dietterich published in 2008"



Journal ArticleDOI
28 Jan 2008
TL;DR: A computer vision approach to automated rapid-throughput taxonomic identification of stonefly larvae, evaluated on a task of discriminating among four stonefly taxa, two of which, Calineuria and Doroneuria, are difficult even for experts to distinguish.
Abstract: This paper describes a computer vision approach to automated rapid-throughput taxonomic identification of stonefly larvae. The long-term objective of this research is to develop a cost-effective method for environmental monitoring based on automated identification of indicator species. Recognition of stonefly larvae is challenging because they are highly articulated, they exhibit a high degree of intraspecies variation in size and color, and some species are difficult to distinguish visually, despite prominent dorsal patterning. The stoneflies are imaged via an apparatus that manipulates the specimens into the field of view of a microscope so that images are obtained under highly repeatable conditions. The images are then classified through a process that involves (a) identification of regions of interest, (b) representation of those regions as SIFT vectors (Lowe, in Int J Comput Vis 60(2):91–110, 2004), (c) classification of the SIFT vectors into learned “features” to form a histogram of detected features, and (d) classification of the feature histogram via state-of-the-art ensemble classification algorithms. Steps (a) to (c) constitute the concatenated feature histogram (CFH) method. We apply three region detectors for part (a) above, including a newly developed principal curvature-based region (PCBR) detector. This detector finds stable regions of high curvature via a watershed segmentation algorithm. We compute a separate dictionary of learned features for each region detector, and then concatenate the histograms prior to the final classification step. We evaluate this classification methodology on a task of discriminating among four stonefly taxa, two of which, Calineuria and Doroneuria, are difficult even for experts to discriminate. The results show that the combination of all three detectors gives four-class accuracy of 82% and three-class accuracy (pooling Calineuria and Doroneuria) of 95%. Each region detector makes a valuable contribution. In particular, our new PCBR detector is able to discriminate Calineuria and Doroneuria much better than the other detectors.
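The CFH pipeline above can be sketched in a few lines: descriptors from each region detector are quantized against that detector's learned dictionary, and the per-detector histograms are concatenated into one feature vector. This is a minimal illustrative sketch, not the paper's implementation; all function names and data shapes are assumptions.

```python
# Sketch of the concatenated feature histogram (CFH) idea: descriptors from
# each region detector are quantized against that detector's dictionary, and
# the per-detector histograms are concatenated. All names are illustrative.

def nearest_codeword(descriptor, dictionary):
    """Index of the dictionary entry closest to the descriptor (squared Euclidean)."""
    best, best_dist = 0, float("inf")
    for i, word in enumerate(dictionary):
        dist = sum((d - w) ** 2 for d, w in zip(descriptor, word))
        if dist < best_dist:
            best, best_dist = i, dist
    return best

def feature_histogram(descriptors, dictionary):
    """Histogram of codeword assignments for one detector's descriptors."""
    hist = [0] * len(dictionary)
    for desc in descriptors:
        hist[nearest_codeword(desc, dictionary)] += 1
    return hist

def concatenated_histogram(per_detector_descriptors, per_detector_dicts):
    """Concatenate the per-detector histograms into one feature vector."""
    vector = []
    for descs, dictionary in zip(per_detector_descriptors, per_detector_dicts):
        vector.extend(feature_histogram(descs, dictionary))
    return vector
```

The concatenated vector would then be passed to the ensemble classifier of step (d).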

123 citations


Journal ArticleDOI
TL;DR: The goal of the current paper is to consider these emerging trends and chart out the strategic directions and open problems for the broader area of structured machine learning for the next 10 years.
Abstract: The field of inductive logic programming (ILP) has made steady progress, since the first ILP workshop in 1991, based on a balance of developments in theory, implementations and applications. More recently there has been an increased emphasis on Probabilistic ILP and the related fields of Statistical Relational Learning (SRL) and Structured Prediction. The goal of the current paper is to consider these emerging trends and chart out the strategic directions and open problems for the broader area of structured machine learning for the next 10 years.

123 citations


Proceedings ArticleDOI
05 Jul 2008
TL;DR: It is demonstrated empirically that HI-MAT constructs compact hierarchies that are comparable to manually-engineered hierarchies and facilitate significant speedup in learning when transferred to a target task.
Abstract: We present an algorithm, HI-MAT (Hierarchy Induction via Models And Trajectories), that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables employing safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually-engineered hierarchies and facilitate significant speedup in learning when transferred to a target task.
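A toy stand-in for the trajectory analysis: HI-MAT uses DBN action models to find causal and temporal structure, but the flavor of subtask discovery can be illustrated by segmenting a trajectory wherever consecutive actions stop affecting the same state variable. Everything below is a simplified assumption, not the actual algorithm.

```python
# Illustrative stand-in for HI-MAT's trajectory analysis: group consecutive
# actions that affect the same state variable into candidate subtasks. The
# real algorithm derives this from DBN models; here each action is simply
# annotated with the variable it changes.

def segment_trajectory(trajectory):
    """trajectory: list of (action, changed_variable) pairs.
    Returns (variable, [actions]) segments -- candidate subtask boundaries."""
    segments = []
    current_var, current_actions = None, []
    for action, var in trajectory:
        if var != current_var and current_actions:
            segments.append((current_var, current_actions))
            current_actions = []
        current_var = var
        current_actions.append(action)
    if current_actions:
        segments.append((current_var, current_actions))
    return segments
```

Each segment would correspond to a candidate MAXQ subtask whose termination condition involves the variable it changes.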

87 citations


Journal ArticleDOI
TL;DR: A language that consists of quantified conditional influence statements and captures most relational probabilistic models based on directed graphs is described and algorithms based on gradient descent and expectation maximization for different combining rules are derived and implemented.
Abstract: Many real-world domains exhibit rich relational structure and stochasticity and motivate the development of models that combine predicate logic with probabilities. These models describe probabilistic influences between attributes of objects that are related to each other through known domain relationships. To keep these models succinct, each such influence is considered independent of others, which is called the assumption of "independence of causal influences" (ICI). In this paper, we describe a language that consists of quantified conditional influence statements and captures most relational probabilistic models based on directed graphs. The influences due to different statements are combined using a set of combining rules such as Noisy-OR. We motivate and introduce multi-level combining rules, where the lower level rules combine the influences due to different ground instances of the same statement, and the upper level rules combine the influences due to different statements. We present algorithms and empirical results for parameter learning in the presence of such combining rules. Specifically, we derive and implement algorithms based on gradient descent and expectation maximization for different combining rules and evaluate them on synthetic data and on a real-world task. The results demonstrate that the algorithms are able to learn both the conditional probability distributions of the influence statements and the parameters of the combining rules.
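The two-level combining scheme can be sketched directly: Noisy-OR at the lower level combines the ground instances of one statement, and an upper-level rule combines the per-statement results. The weighted mean used as the upper rule here is one illustrative choice, not necessarily the paper's.

```python
# Sketch of two-level combining rules: the lower level combines influences
# from ground instances of the same statement with Noisy-OR; the upper level
# combines the per-statement results (here, a weighted mean).

def noisy_or(probs):
    """Noisy-OR combination: the effect occurs unless every cause fails."""
    p_none = 1.0
    for p in probs:
        p_none *= 1.0 - p
    return 1.0 - p_none

def combine_two_level(instance_probs_per_statement, statement_weights):
    """Lower level: Noisy-OR over each statement's ground instances.
    Upper level: weighted mean over the per-statement results."""
    statement_probs = [noisy_or(ps) for ps in instance_probs_per_statement]
    total = sum(statement_weights)
    return sum(w * p for w, p in zip(statement_weights, statement_probs)) / total
```

The learning algorithms in the paper fit both the conditional distributions feeding the lower level and parameters such as the upper-level weights.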

49 citations


Journal Article
TL;DR: A new algorithm for training CRFs via gradient tree boosting, which scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially as in previous algorithms based on iterative scaling and gradient descent.
Abstract: Conditional random fields (CRFs) provide a flexible and powerful model for sequence labeling problems. However, existing learning algorithms are slow, particularly in problems with large numbers of potential input features and feature combinations. This paper describes a new algorithm for training CRFs via gradient tree boosting. In tree boosting, the CRF potential functions are represented as weighted sums of regression trees, which provide compact representations of feature interactions. So the algorithm does not explicitly consider the potentially large parameter space. As a result, gradient tree boosting scales linearly in the order of the Markov model and in the order of the feature interactions, rather than exponentially as in previous algorithms based on iterative scaling and gradient descent. Gradient tree boosting also makes it possible to use instance weighting (as in C4.5) and surrogate splitting (as in CART) to handle missing values. Experimental studies of the effectiveness of these two methods (as well as standard imputation and indicator feature methods) show that instance weighting is the best method in most cases when feature values are missing at random.
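The core representation above, a potential function expressed as a weighted sum of regression trees, can be sketched as follows. The trees here are placeholder callables; in the actual method each tree is fit to the functional gradient of the CRF log-likelihood, which is omitted. All details are illustrative.

```python
# Sketch of the representation used in gradient tree boosting: each CRF
# potential function is a weighted sum of regression trees. A "tree" here is
# any callable on a feature dict; real trees would be learned from the
# functional gradient of the CRF log-likelihood.

class BoostedPotential:
    def __init__(self):
        self.trees = []  # list of (weight, tree) pairs

    def add_tree(self, tree, weight=1.0):
        """Each boosting iteration appends one regression tree fit to the
        current functional gradient (tree learning omitted here)."""
        self.trees.append((weight, tree))

    def __call__(self, features):
        """Potential value F(x) = sum over m of w_m * tree_m(x)."""
        return sum(w * tree(features) for w, tree in self.trees)
```

Because the potential is additive in compact trees, the large explicit parameter space of feature combinations is never enumerated.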

28 citations


Journal ArticleDOI
TL;DR: An aquatic insect imaging system was designed as part of a system to automate aquatic insect classification and was tested using several species and size classes of stonefly (Plecoptera) to evaluate image quality, specimen handling, and system usability.
Abstract: Population counts of aquatic insects are a valuable tool for monitoring the water quality of rivers and streams. However, the handling of samples in the lab for species identification is time consuming and requires specially trained experts. An aquatic insect imaging system was designed as part of a system to automate aquatic insect classification and was tested using several species and size classes of stonefly (Plecoptera). The system uses ethanol to transport specimens via a transparent rectangular tube to a digital camera. A small jet is used to position and reorient the specimens so that sufficient pictures can be taken to classify them with pattern recognition. A mirror system is used to provide a split set of images 90° apart. The system is evaluated with respect to engineering requirements developed during the research, including image quality, specimen handling, and system usability.

22 citations


Proceedings ArticleDOI
01 Dec 2008
TL;DR: Experiments on benchmark object recognition datasets show that the system based on the new discriminative dictionaries and BDL classifier gives performance comparable or superior to state-of-the-art generic object recognition approaches.
Abstract: Visual dictionaries are widely employed in object recognition to map unordered bags of local region descriptors into feature vectors for image classification. Most visual dictionaries have been constructed by unsupervised clustering. This paper presents an efficient discriminative approach, called iterative discriminative clustering (IDC), for dictionary learning. In this approach, each dictionary entry is defined by a representative value and a learned distance metric. In the IDC algorithm, the dictionary entries are initialized by unsupervised clustering and then locally adapted to improve their discriminative power. Motivated by studies of the characteristics of individual dictionary entries, we employ bagged decision lists (BDL) as our image classifier in order to exploit conjunctions of small numbers of informative dictionary entries for classification. Experiments on benchmark object recognition datasets show that the system based on the new discriminative dictionaries and BDL classifier gives performance comparable or superior to state-of-the-art generic object recognition approaches.
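The entry representation used by IDC, a representative value paired with a learned distance metric, can be sketched with a diagonal (per-dimension) metric. The discriminative update that learns the metric weights is omitted, and all values are illustrative assumptions.

```python
# Sketch of an IDC-style dictionary entry: a representative vector plus a
# learned per-dimension metric, so "nearest entry" is measured under
# entry-specific weights. The discriminative weight-learning step is omitted.

def weighted_dist(descriptor, representative, metric):
    """Squared distance under an entry-specific diagonal metric."""
    return sum(m * (d - r) ** 2
               for d, r, m in zip(descriptor, representative, metric))

def assign(descriptor, entries):
    """Index of the dictionary entry with the smallest learned-metric
    distance. Each entry is a (representative, metric_weights) pair."""
    dists = [weighted_dist(descriptor, rep, met) for rep, met in entries]
    return dists.index(min(dists))
```

A large metric weight on a dimension makes an entry "strict" along that dimension, which is how local adaptation can sharpen an entry's discriminative power.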

7 citations


Proceedings Article
13 Jul 2008
TL;DR: Integrating multiple learning components through Markov Logic improves the performance of the system, and the Marginal Probability Architecture performs better than the MPE Architecture.
Abstract: This paper addresses the question of how statistical learning algorithms can be integrated into a larger AI system both from a practical engineering perspective and from the perspective of correct representation, learning, and reasoning. Our goal is to create an integrated intelligent system that can combine observed facts, hand-written rules, learned rules, and learned classifiers to perform joint learning and reasoning. Our solution, which has been implemented in the CALO system, integrates multiple learning components with a Markov Logic inference engine, so that the components can benefit from each other's predictions. We introduce two designs of the learning and reasoning layer in CALO: the MPE Architecture and the Marginal Probability Architecture. The architectures, interfaces, and algorithms employed in our two designs are described, followed by experimental evaluations of the performance of the two designs. We show that by integrating multiple learning components through Markov Logic, the performance of the system can be improved and that the Marginal Probability Architecture performs better than the MPE Architecture.

7 citations


Book ChapterDOI
15 Sep 2008
TL;DR: A regression tree algorithm is introduced in which each leaf node is modeled as a finite mixture of deterministic functions; the mixture is approximated via a greedy set cover.
Abstract: This paper addresses the problem of learning dynamic Bayesian network (DBN) models to support reinforcement learning. It focuses on learning regression tree (context-specific dependence) models of the conditional probability distributions of the DBNs. Existing algorithms rely on standard regression tree learning methods (both propositional and relational). However, such methods presume that the stochasticity in the domain can be modeled as a deterministic function with additive noise. This is inappropriate for many RL domains, where the stochasticity takes the form of stochastic choice over deterministic functions. This paper introduces a regression tree algorithm in which each leaf node is modeled as a finite mixture of deterministic functions. This mixture is approximated via a greedy set cover. Experiments on three challenging RL domains show that this approach finds trees that are more accurate and that are more likely to correctly identify the conditional dependencies in the DBNs based on small samples.
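The greedy set cover used to approximate a leaf's mixture can be illustrated as follows: from a pool of candidate deterministic functions, repeatedly pick the one that explains the most not-yet-covered (input, outcome) examples until all are covered. The candidate pool and data below are made up for illustration.

```python
# Illustrative greedy set cover for a leaf's mixture: repeatedly choose the
# candidate deterministic function that explains the most still-uncovered
# (x, y) examples. Candidates and data are hypothetical.

def greedy_mixture(examples, candidates):
    """examples: list of (x, y) pairs; candidates: list of functions x -> y.
    Returns indices of the chosen functions."""
    uncovered = set(range(len(examples)))
    chosen = []
    while uncovered:
        best_i, best_cover = None, set()
        for i, f in enumerate(candidates):
            cover = {j for j in uncovered if f(examples[j][0]) == examples[j][1]}
            if len(cover) > len(best_cover):
                best_i, best_cover = i, cover
        if best_i is None:  # no candidate explains any remaining example
            break
        chosen.append(best_i)
        uncovered -= best_cover
    return chosen
```

Mixture probabilities for the leaf could then be estimated from the fraction of examples each chosen function covers.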

6 citations