
Showing papers by "Thomas G. Dietterich" published in 2006


Proceedings ArticleDOI
29 Jan 2006
TL;DR: This paper introduces TaskPredictor, a machine learning system that attempts to predict the user's current activity, and provides experimental results on data collected from TaskTracer users.
Abstract: The TaskTracer system seeks to help multi-tasking users manage the resources that they create and access while carrying out their work activities. It does this by associating with each user-defined activity the set of files, folders, email messages, contacts, and web pages that the user accesses when performing that activity. The initial TaskTracer system relies on the user to notify the system each time the user changes activities. However, this is burdensome, and users often forget to tell TaskTracer what activity they are working on. This paper introduces TaskPredictor, a machine learning system that attempts to predict the user's current activity. TaskPredictor has two components: one for general desktop activity and another specifically for email. TaskPredictor achieves high prediction precision by combining three techniques: (a) feature selection via mutual information, (b) classification based on a confidence threshold, and (c) a hybrid design in which a Naive Bayes classifier estimates the classification confidence but where the actual classification decision is made by a support vector machine. This paper provides experimental results on data collected from TaskTracer users.

124 citations
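
The three techniques named in the abstract above can be illustrated with a minimal sketch. It is not the TaskPredictor implementation: the function names (`mutual_information`, `select_features`, `gated_predict`) are hypothetical, features are assumed to be discrete, and the confidence estimator and decision function are passed in as plain callables standing in for the Naive Bayes and support vector machine components.

```python
import math
from collections import Counter


def mutual_information(feature_values, labels):
    """Mutual information (in bits) between one discrete feature and the label."""
    n = len(labels)
    joint = Counter(zip(feature_values, labels))
    f_counts = Counter(feature_values)
    y_counts = Counter(labels)
    mi = 0.0
    for (f, y), count in joint.items():
        p_fy = count / n
        # p(f, y) / (p(f) * p(y)) written with raw counts
        mi += p_fy * math.log2(p_fy * n * n / (f_counts[f] * y_counts[y]))
    return mi


def select_features(rows, labels, k):
    """Indices of the k features with the highest mutual information."""
    n_features = len(rows[0])
    scores = [
        mutual_information([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    return sorted(range(n_features), key=lambda j: scores[j], reverse=True)[:k]


def gated_predict(x, confidence_fn, decision_fn, threshold):
    """Hybrid prediction: one model gates, another decides.

    confidence_fn stands in for the confidence estimator and decision_fn
    for the classifier that makes the actual decision (both hypothetical
    callables). Returns None (abstains) when confidence is below threshold.
    """
    if confidence_fn(x) >= threshold:
        return decision_fn(x)
    return None


# Toy data: feature 0 determines the task, feature 1 is noise.
rows = [(1, 0), (1, 1), (0, 0), (0, 1)]
labels = ["email", "email", "coding", "coding"]
best = select_features(rows, labels, k=1)
```

Raising the threshold trades coverage for precision: the predictor stays silent more often but is wrong less often when it does speak, which matches the abstract's emphasis on high prediction precision.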


Journal ArticleDOI
TL;DR: In this article, the authors showed that classification error is not always a good predictor of errors in landscape pattern indices, and some types of image postprocessing (for example, smoothing) might result in the underestimation of habitat fragmentation.
Abstract: Although habitat fragmentation is one of the greatest threats to biodiversity worldwide, virtually no attention has been paid to the quantification of error in fragmentation statistics. Landscape pattern indices (LPIs), such as mean patch size and number of patches, are routinely used to quantify fragmentation and are often calculated using remote-sensing imagery that has been classified into different land-cover classes. No classified map is ever completely correct, so we asked if different maps with similar misclassification rates could result in widely different errors in pattern indices. We simulated landscapes with varying proportions of habitat and clumpiness (autocorrelation) and then simulated classification errors on the same maps. We simulated higher misclassification at patch edges (as is often observed), and then used a smoothing algorithm routinely used on images to correct salt-and-pepper classification error. We determined how well classification errors (and smoothing) corresponded to errors seen in four pattern indices. Maps with low misclassification rates often yielded errors in LPIs of much larger magnitude and substantial variability. Although smoothing usually improved classification error, it sometimes increased LPI error and reversed the direction of error in LPIs introduced by misclassification. Our results show that classification error is not always a good predictor of errors in LPIs, and some types of image postprocessing (for example, smoothing) might result in the underestimation of habitat fragmentation. Furthermore, our results suggest that there is potential for large errors in nearly every landscape pattern analysis ever published, because virtually none quantify the errors in LPIs themselves.

117 citations
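
A small worked example may help show why a low misclassification rate can still produce a large error in a landscape pattern index. The sketch below computes two of the indices named in the abstract (number of patches and mean patch size) on a binary habitat grid. It is an illustration only: the 4-neighbor definition of a patch, the toy maps, and the function names are assumptions, not the authors' simulation procedure.

```python
from collections import deque


def patch_sizes(grid, habitat=1):
    """Sizes of connected habitat patches (4-neighbor rule, an assumption)."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    sizes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != habitat or seen[r][c]:
                continue
            # Breadth-first flood fill over one patch
            seen[r][c] = True
            queue = deque([(r, c)])
            size = 0
            while queue:
                y, x = queue.popleft()
                size += 1
                for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx] and grid[ny][nx] == habitat):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            sizes.append(size)
    return sizes


def pattern_indices(grid):
    """Return (number of patches, mean patch size in cells)."""
    sizes = patch_sizes(grid)
    n = len(sizes)
    return n, (sum(sizes) / n if n else 0.0)


true_map = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 1],
]
# Same map with a single misclassified cell (1 of 16 cells, 6.25% error).
noisy_map = [row[:] for row in true_map]
noisy_map[0][3] = 1

true_indices = pattern_indices(true_map)
noisy_indices = pattern_indices(noisy_map)
```

In this toy case one wrong cell out of sixteen raises the patch count from 2 to 3 and lowers the mean patch size from 3 cells to about 2.33, so the relative error in each index is several times the pixel-level error rate.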


Patent
30 May 2006
TL;DR: In this paper, a method for assisting multi-tasking computer users includes receiving from a user a specification of a task being performed by the user or an indication of completion of the task.
Abstract: A method for assisting multi-tasking computer users includes receiving from a user a specification of a task being performed by the user or an indication of completion of a task, collecting state changes in multiple executing programs, predicting a current task being performed by the user based on a recent state change event, a past specification of a task being performed by the user, and past events and associated tasks. Based on the predicted current task, user interface elements in multiple executing programs are adapted to facilitate performance of the task. The method may also allow a user to specify a new task based on a task template derived from a completed task to facilitate completion of the new task. The task templates also may be shared among users, and active tasks may also be team tasks shared among users.

113 citations


Proceedings ArticleDOI
29 Jan 2006
TL;DR: A software system that can reduce the cost of locating files in hierarchical folders by 50% by applying a cost-sensitive prediction algorithm to the user's previous file access information to predict the next folder that will be accessed.
Abstract: Helping computer users rapidly locate files in their folder hierarchies has become an important research topic in today's intelligent user interface design. This paper reports on FolderPredictor, a software system that can reduce the cost of locating files in hierarchical folders. FolderPredictor applies a cost-sensitive prediction algorithm to the user's previous file access information to predict the next folder that will be accessed. Experimental results show that, on average, FolderPredictor reduces the cost of locating a file by 50%. Another advantage of FolderPredictor is that it does not require users to adapt to a new interface, but rather meshes with the existing interface for opening files on the Windows platform.

52 citations


Proceedings ArticleDOI
20 Aug 2006
TL;DR: Experimental results show that the new hierarchical object recognition system outperforms the comparable solutions on most of the datasets tested.
Abstract: This paper proposes a new generic object recognition system based on multi-scale affine-invariant image regions. Image segments are obtained by a watershed transform of the principal curvature of a contrast-enhanced image. Each region is described by an intensity-based statistical descriptor and a PCA-SIFT descriptor. The spatial relations between regions are represented by a cluster-index distribution histogram. With these new descriptors, we develop a hierarchical object recognition system which uses an improved boosting feature selection method (Opelt et al., 2004) to construct layer classifiers by automatically selecting the most discriminative features in each layer. All layer classifiers are then combined to give the final classification. This system is tested on various object recognition problems. Experimental results show that the new hierarchical system outperforms the comparable solutions on most of the datasets tested.

21 citations


Proceedings ArticleDOI
17 Jun 2006
TL;DR: A reinforcement matching scheme is employed that provides greater robustness to occlusion and clutter than previous methods, which non-discriminately compare accumulated bin values over the entire context; the scheme is compared with robust matching methods (RANSAC and PROSAC).
Abstract: Local feature-based matching is robust to both clutter and occlusion. However, a primary shortcoming of local features is a deficiency of global information that can cause ambiguities in matching. Local features combined with global relationships convey much more information, but global spatial information is often not robust to occlusion and/or non-rigid transformations. This paper proposes a new framework for including global context information into local feature matching, while still maintaining robustness to occlusion, clutter, and nonrigid transformations. To generate global context information, we extend previous fixed-scale, circular-bin methods by using affine-invariant log-polar elliptical bins. Further, we employ a reinforcement matching scheme that provides greater robustness to occlusion and clutter than previous methods that non-discriminately compare accumulated bins values over the entire context. We also present a more robust method of calculating a feature’s dominant orientation. We compare reinforcement matching to nearest neighbor matching without region context and to robust matching methods (RANSAC and PROSAC).

18 citations


01 Jan 2006
TL;DR: This chapter describes the development of general-purpose pattern-recognition algorithms for identification and classification of insects and mesofauna and the design and construction of mechanical devices for handling and photographing specimens.
Abstract: Many ecological science and environmental monitoring problems can benefit from inexpensive, automated methods of counting insect and mesofaunal populations. Existing methods for obtaining population counts require expensive and tedious manual identification by human experts. This chapter describes the development of general-purpose pattern-recognition algorithms for identification and classification of insects and mesofauna and the design and construction of mechanical devices for handling and photographing specimens. This chapter presents techniques being explored in the first two years of a four-year project, along with the results obtained thus far. This project’s primary focus to date has been the classification of stonefly larvae for assessment of stream water quality. Imaging and specimen manipulation apparatus that semi-automatically provides high-resolution images of individual specimens from multiple angles has also been designed and assembled in the context of this project. An additional project target has been the development of robust classification algorithms based on interest operators, region descriptors, clustering, and 3D reconstruction to automatically classify each specimen from its images.

13 citations


Proceedings Article
01 Feb 2006
TL;DR: The Dagstuhl Seminar 05051 "Probabilistic, Logical and Relational Learning - Towards a Synthesis" was held at the International Conference and Research Center (IBFI), Schloss Dagstuhl, from 30.01.05 to 04.02.05; this paper collects abstracts of the presentations and of the seminar results and ideas.
Abstract: From 30.01.05 to 04.02.05, the Dagstuhl Seminar 05051 "Probabilistic, Logical and Relational Learning - Towards a Synthesis" was held in the International Conference and Research Center (IBFI), Schloss Dagstuhl. During the seminar, several participants presented their current research, and ongoing work and open problems were discussed. Abstracts of the presentations given during the seminar as well as abstracts of seminar results and ideas are put together in this paper. The first section describes the seminar topics and goals in general. Links to extended abstracts or full papers are provided, if available.

3 citations


31 Aug 2006
TL;DR: The goal in this research effort was to develop a new methodology, called KI-LEARN (Knowledge Intensive LEARNing), that combines domain knowledge and sparse training data to construct high-performance systems and specifically shows how qualitative constraints can be incorporated into learning algorithms.
Abstract: Knowledge Representation and Reasoning (KRR) has developed a wide range of methods for representing knowledge and reasoning from it to produce expert-level performance. Despite these accomplishments, there is one major problem preventing the wide-spread application of KRR technology: the inability to support learning. This makes KRR systems brittle and difficult to maintain. On the other hand, Machine Learning (ML) has developed a wide range of methods for learning from examples. However, there are two major problems preventing the wide-spread application of machine learning technology: the need for large amounts of training data and the high cost of manually designing the hypothesis space of the learning system. Our goal in this research effort was to develop a new methodology, called KI-LEARN (Knowledge Intensive LEARNing), that combines domain knowledge and sparse training data to construct high-performance systems. This report provides an overview of the major results we obtained on specific tasks as outlined in our proposal. More specifically, to address issues in knowledge representation and efficient learning we designed a language called First-Order Conditional Influence (FOCI) Language for expressing attributes relevant to learning. Our language extends probabilistic relational models (PRMs) which are themselves probabilistic representations most similar to first-order representation languages employed in KRR systems. A distinct feature of our language is its support for explicit expression of qualitative constraints such as monotonicity, saturation, and synergies. More importantly, we have demonstrated via mathematical proofs and experimental results how these qualitative constraints can be used and exploited when learning with sparse training data. We specifically show how qualitative constraints can be incorporated into learning algorithms. In addition, this report describes the models we constructed for our testbed domains.

1 citation