Showing papers by "Padhraic Smyth" published in 1995


Book Chapter
09 Jul 1995
TL;DR: A hybrid scheme which uses decision trees to find the relevant structure in high-dimensional classification problems and then uses local kernel density estimates to fit smooth probability estimates within this structure is discussed.
Abstract: A novel method for combining decision trees and kernel density estimators is proposed. Standard classification trees, or class probability trees, provide piecewise constant estimates of class posterior probabilities. Kernel density estimators can provide smooth non-parametric estimates of class probabilities, but scale poorly as the dimensionality of the problem increases. This paper discusses a hybrid scheme which uses decision trees to find the relevant structure in high-dimensional classification problems and then uses local kernel density estimates to fit smooth probability estimates within this structure. Experimental results on simulated data indicate that the method provides substantial improvement over trees or density methods alone for certain classes of problems. The paper briefly discusses various extensions of the basic approach and the types of application for which the method is best suited.
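As a rough illustration of the hybrid idea in the abstract above, the following minimal sketch assumes a decision tree has already routed the training points to a leaf, and then fits a smooth class posterior inside that leaf with Gaussian kernel density estimates and Bayes' rule. The function names, the one-dimensional feature, and the fixed bandwidth are all assumptions made for illustration; the abstract does not specify the paper's actual estimator or bandwidth choice.

```python
import math


def gaussian_kde(x, samples, bandwidth):
    """Average of 1-D Gaussian kernels centred on each sample, evaluated at x."""
    norm = bandwidth * math.sqrt(2.0 * math.pi)
    total = sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
    return total / (len(samples) * norm)


def leaf_posterior(x, leaf_points, bandwidth=0.5):
    """Smooth class posterior at x from the labelled points in one tree leaf.

    leaf_points: list of (feature_value, class_label) pairs that a tree
    (not shown, hypothetical) has routed to this leaf.
    """
    by_class = {}
    for value, label in leaf_points:
        by_class.setdefault(label, []).append(value)
    n = len(leaf_points)

    # Bayes' rule within the leaf: p(c | x) is proportional to p(x | c) * p(c).
    scores = {
        c: gaussian_kde(x, vals, bandwidth) * len(vals) / n
        for c, vals in by_class.items()
    }
    total = sum(scores.values())
    if total == 0.0:
        # All kernels underflowed: fall back to the leaf's class frequencies,
        # which is the piecewise constant estimate a plain tree would give.
        return {c: len(vals) / n for c, vals in by_class.items()}
    return {c: s / total for c, s in scores.items()}


# Hypothetical leaf containing three points of class "a" and two of class "b".
leaf = [(0.0, "a"), (0.2, "a"), (0.4, "a"), (2.0, "b"), (2.2, "b")]
near_a = leaf_posterior(0.1, leaf)
near_b = leaf_posterior(2.1, leaf)
```

A plain class probability tree would report the same class frequencies (3/5 and 2/5) everywhere in this leaf, whereas the kernel estimate varies with the query point's position inside the leaf.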

81 citations


Journal Article
TL;DR: This paper shows that, given the experts' labels, one can compute simple bounds on the average classification accuracy of the experts relative to the unknown true labels.

40 citations


Journal Article
TL;DR: This paper presents recent progress in developing interactive semi-automated image database exploration tools based on pattern recognition and machine learning technology and presents a completed and successful application that illustrates the basic approach.
Abstract: In areas as diverse as earth remote sensing, astronomy, and medical imaging, image acquisition technology has undergone tremendous improvements in recent years. The vast amounts of scientific data are potential treasure-troves for scientific investigation and analysis. Unfortunately, advances in our ability to deal with this volume of data in an effective manner have not paralleled the hardware gains. While special-purpose tools for particular applications exist, there is a dearth of useful general-purpose software tools and algorithms which can assist a scientist in exploring large scientific image databases. This paper presents our recent progress in developing interactive semi-automated image database exploration tools based on pattern recognition and machine learning technology. We first present a completed and successful application that illustrates the basic approach: the SKICAT system used for the reduction and analysis of a 3 terabyte astronomical data set. SKICAT integrates techniques from image processing, data classification, and database management. It represents a system in which machine learning played a powerful and enabling role, and solved a difficult, scientifically significant problem. We then proceed to discuss the general problem of automated image database exploration, the particular aspects of image databases which distinguish them from other databases, and how this impacts the application of off-the-shelf learning algorithms to problems of this nature. A second large image database is used to ground this discussion: Magellan's images of the surface of the planet Venus. The paper concludes with a discussion of current and future challenges.

22 citations


Journal Article
TL;DR: In this paper, the authors describe the development of a trainable cataloging system: the user indicates the location of the objects of interest for a number of training images and the system learns to detect and catalog these objects in the rest of the database.
Abstract: Users of digital image libraries are often not interested in image data per se but in derived products such as catalogs of objects of interest. Converting an image database into a usable catalog is typically carried out manually at present. For many larger image databases the purely manual approach is completely impractical. In this paper we describe the development of a trainable cataloging system: the user indicates the location of the objects of interest for a number of training images and the system learns to detect and catalog these objects in the rest of the database. In particular we describe the application of this system to the cataloging of small volcanoes in radar images of Venus. The volcano problem is of interest because of the scale (30,000 images, order of 1 million detectable volcanoes), technical difficulty (the variability of the volcanoes in appearance) and the scientific importance of the problem. The problem of uncertain or subjective ground truth is of fundamental importance in cataloging problems of this nature and is discussed in some detail. Experimental results are presented which quantify and compare the detection performance of the system relative to human detection performance. The paper concludes by discussing the limitations of the proposed system and the lessons learned of general relevance to the development of digital image libraries.

6 citations



01 Jun 1995
TL;DR: Graphical techniques for modeling the dependencies of random variables are discussed.
Abstract: Graphical techniques for modeling the dependencies of random variables have been explored in a.

3 citations


Proceedings Article
31 Jan 1995
TL;DR: Two such applications at JPL are presented: the SKICAT system used for the reduction and analysis of a 3 terabyte astronomical data set, and the JARtool system to be used in automatically analyzing the Magellan data set consisting of over 30,000 images of the surface of Venus.
Abstract: In areas as diverse as Earth remote sensing, astronomy, and medical imaging, there has been an explosive growth in the amount of image data available for creating digital image libraries. However, the lack of automated analysis and useful retrieval methods stands in the way of creating true digital image libraries. In order to perform query-by-content type searches, the query formulation problem needs to be addressed: it is often not possible for users to formulate the targets of their searches in terms of queries. We present a natural and powerful approach to this problem to assist scientists in exploring large digital image libraries. We target a system that the user trains to find certain patterns by providing it with examples. The learning algorithms use the training data to produce classifiers to detect and identify other targets in the large image collection. This forms the basis for query-by-content capabilities and for library indexing purposes. We ground the discussion by presenting two such applications at JPL: the SKICAT system used for the reduction and analysis of a 3 terabyte astronomical data set, and the JARtool system to be used in automatically analyzing the Magellan data set consisting of over 30,000 images of the surface of Venus. General issues which impact the application of learning algorithms to image analysis applications are discussed. © 1995 SPIE, The International Society for Optical Engineering.

1 citation