Showing papers by "Thomas G. Dietterich" published in 1998


Journal ArticleDOI
TL;DR: This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task and measures the power (ability to detect algorithm differences when they do exist) of these tests.
Abstract: This article reviews five approximate statistical tests for determining whether one learning algorithm outperforms another on a particular learning task. These tests are compared experimentally to determine their probability of incorrectly detecting a difference when no difference exists (type I error). Two widely used statistical tests are shown to have high probability of type I error in certain situations and should never be used: a test for the difference of two proportions and a paired-differences t test based on taking several random train-test splits. A third test, a paired-differences t test based on 10-fold cross-validation, exhibits somewhat elevated probability of type I error. A fourth test, McNemar's test, is shown to have low type I error. The fifth test is a new test, 5 × 2 cv, based on five iterations of twofold cross-validation. Experiments show that this test also has acceptable type I error. The article also measures the power (ability to detect algorithm differences when they do exist)...

3,356 citations
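
As an illustration of the fourth test named in the abstract above, here is a minimal sketch of McNemar's test applied to two classifiers evaluated on the same test set. The function name `mcnemar_test` and its list-based interface are assumptions made for this example, not taken from the article. The sketch uses the usual continuity-corrected statistic, compared against a chi-squared distribution with one degree of freedom.

```python
import math

def mcnemar_test(y_true, pred_a, pred_b):
    """Continuity-corrected McNemar's test (hypothetical helper for illustration).

    Returns (statistic, p_value). Only examples on which exactly one of the
    two classifiers is correct contribute to the statistic.
    """
    n10 = 0  # A correct, B wrong
    n01 = 0  # A wrong, B correct
    for y, a, b in zip(y_true, pred_a, pred_b):
        a_ok, b_ok = (a == y), (b == y)
        if a_ok and not b_ok:
            n10 += 1
        elif b_ok and not a_ok:
            n01 += 1
    if n01 + n10 == 0:
        # The classifiers never disagree in correctness: no evidence of a difference.
        return 0.0, 1.0
    stat = (abs(n01 - n10) - 1) ** 2 / (n01 + n10)
    # Survival function of chi-squared with 1 degree of freedom:
    # P(Z^2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value
```

A small p-value suggests the two classifiers have different error rates on this test set. The test needs only one train-test split, so it says nothing about variability due to the choice of training set.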


Book
01 Jan 1998
TL;DR: This book attempts to give an overview of the different recent efforts to deal with covariate shift, a challenging situation where the joint distribution of inputs and outputs differs between the training and test stages.
Abstract: All rights reserved. No part of this book may be reproduced in any form by any electronic or mechanical means (including photocopying, recording, or information storage and retrieval) without permission in writing from the publisher. Overview: Dataset shift is a challenging situation where the joint distribution of inputs and outputs differs between the training and test stages. Covariate shift is a simpler particular case of dataset shift where only the input distribution changes (covariate denotes input), while the conditional distribution of the outputs given the inputs p(y|x) remains unchanged. Dataset shift is present in most practical applications for reasons ranging from the bias introduced by experimental design, to the mere irreproducibility of the testing conditions at training time. For example, in an image classification task, training data might have been recorded under controlled laboratory conditions, whereas the test data may show different lighting conditions. In other applications, the process that generates data is in itself adaptive. Some of our authors consider the problem of spam email filtering: successful "spammers" will try to build spam in a form that differs from the spam the automatic filter has been built on. Dataset shift seems to have raised relatively little interest in the machine learning community until very recently. Indeed, many machine learning algorithms are based on the assumption that the training data is drawn from exactly the same distribution as the test data on which the model will later be evaluated. Semi-supervised learning and active learning, two problems that seem very similar to covariate shift, have received much more attention. How do they differ from covariate shift?
Semi-supervised learning is designed to take advantage of unlabeled data present at training time, but is not conceived to be robust against changes in the input distribution. In fact, one can easily construct examples of covariate shift for which common SSL strategies such as the "cluster assumption" will lead to disaster. In active learning the algorithm is asked to select from the available unlabeled inputs those for which obtaining the label will be most beneficial for learning. This is very relevant in contexts where labeling data is very costly, but active learning strategies are not specifically designed for dealing with covariate shift. This book attempts to give an overview of the different recent efforts that are being …

1,037 citations
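
The covariate-shift definition in the abstract above (the input distribution changes while p(y|x) stays fixed) is often handled by reweighting training examples by the ratio of test to training input densities. The sketch below shows that idea under a strong assumption: both input densities are known one-dimensional Gaussians. The helper names are hypothetical, and this is one common approach, not necessarily a method the book presents.

```python
import math

def gaussian_pdf(x, mean, std):
    """Density of a 1-D Gaussian at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def importance_weight(x, train_mean, train_std, test_mean, test_std):
    """w(x) = p_test(x) / p_train(x), assuming both densities are known Gaussians."""
    return gaussian_pdf(x, test_mean, test_std) / gaussian_pdf(x, train_mean, train_std)

def weighted_risk(losses, weights):
    """Self-normalized importance-weighted average of per-example training losses."""
    total = sum(weights)
    return sum(loss * w for loss, w in zip(losses, weights)) / total
```

In practice the densities are unknown and the ratio has to be estimated from data. The sketch only shows how the weights enter the risk estimate once they are available.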


Proceedings Article
24 Jul 1998
TL;DR: The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary “flat” Q learning.
Abstract: This paper presents a new approach to hierarchical reinforcement learning based on the MAXQ decomposition of the value function. The MAXQ decomposition has both a procedural semantics—as a subroutine hierarchy—and a declarative semantics—as a representation of the value function of a hierarchical policy. MAXQ unifies and extends previous work on hierarchical reinforcement learning by Singh, Kaelbling, and Dayan and Hinton. Conditions under which the MAXQ decomposition can represent the optimal value function are derived. The paper defines a hierarchical Q learning algorithm, proves its convergence, and shows experimentally that it can learn much faster than ordinary “flat” Q learning. Finally, the paper discusses some interesting issues that arise in hierarchical reinforcement learning including the hierarchical credit assignment problem and non-hierarchical execution of the MAXQ hierarchy.

330 citations
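
For context on the "flat" Q learning baseline that the abstract above compares against, here is a minimal sketch of the standard tabular Q-learning update. It is not the MAXQ algorithm from the paper. The function name, the dictionary-based table, and the parameter defaults are assumptions made for this example.

```python
from collections import defaultdict

def q_learning_update(q, state, action, reward, next_state, actions,
                      alpha=0.5, gamma=0.9):
    """One tabular ("flat") Q-learning update, applied in place.

    q maps (state, action) pairs to value estimates; missing entries count as 0.
    alpha is the learning rate and gamma the discount factor (illustrative defaults).
    """
    best_next = max(q[(next_state, a)] for a in actions)
    td_target = reward + gamma * best_next
    q[(state, action)] += alpha * (td_target - q[(state, action)])
    return q[(state, action)]

# Usage: a table whose unseen entries default to 0.0.
q_table = defaultdict(float)
q_learning_update(q_table, "s0", "right", 1.0, "s1", ["left", "right"])
```

A flat learner keeps one such table over all primitive actions. The hierarchical approach described in the abstract instead decomposes the value function along a subroutine hierarchy.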


01 Jan 1998
TL;DR: This research explores the hypothesis that methods from decision theory and machine learning can be combined to provide practical solutions to current manufacturing control problems by developing an integrated approach to solving one manufacturing problem: the optimization of die-level functional test.
Abstract: Approved: Thomas G. Dietterich. This research explores the hypothesis that methods from decision theory and machine learning can be combined to provide practical solutions to current manufacturing control problems. This hypothesis is explored by developing an integrated approach to solving one manufacturing problem: the optimization of die-level functional test. An integrated circuit (IC) is an electronic circuit in which a number of devices are fabricated and interconnected on a single chip of semiconductor material. According to current manufacturing practice, integrated circuits are produced en masse in the form of processed silicon wafers. While still in wafer form the ICs are referred to as dice; an individual IC is called a die. The process of cutting the dice from wafers and embedding them into mountable containers is called packaging. During the manufacturing process the dice undergo a number of tests. One type of test is die-level functional test (DLFT). The conventional approach is to perform DLFT on all dice. An alternative to exhaustive die-level testing is selective testing. With this approach only a sample of the dice on each wafer is tested. Determining which dice to test and which to package is referred to as the "optimal test problem", and this problem provides the application focus for this research. In this study, the optimal test problem is formulated as a partially observable Markov decision model that is evaluated in real time to provide answers to test questions such as which dice to test, which dice to package, and when to stop testing. Principles from decision theory (expected utility, value of information) are employed to generate tractable decision models, and machine learning techniques (Expectation Maximization, Gibbs Sampling) are employed to acquire the real-valued parameters of these models.
Several problem formulations are explored and empirical tests are performed on historical test data from Hewlett-Packard Company. There are two significant results: (1) the selective test approach produces an expected net profit in manufacturing costs as compared to the current testing policy, and (2) the selective test approach greatly reduces the amount of testing performed while maintaining an appropriate level of performance monitoring. Thesis: "Just Enough Die-Level Test: Optimizing IC Test via Machine Learning and Decision Theory" by Tony R. Fountain, submitted to Oregon State University in partial fulfillment of the requirements for the degree of Doctor of Philosophy. Presented August 21, 1998; commencement June 1999.

3 citations