Showing papers by "Thomas G. Dietterich published in 1994"


Journal ArticleDOI
TL;DR: In this article, error-correcting output codes are employed as a distributed output representation to improve the performance of decision-tree algorithms such as C4.5 and CART on multiclass learning problems.
Abstract: Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying collections of training examples of the form (xi, f(xi)). Existing approaches to multiclass learning problems include direct application of multiclass algorithms such as the decision-tree algorithms C4.5 and CART, application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and application of binary concept learning algorithms with distributed output representations. This paper compares these three approaches to a new technique in which error-correcting codes are employed as a distributed output representation. We show that these output representations improve the generalization performance of both C4.5 and backpropagation on a wide range of multiclass learning tasks. We also demonstrate that this approach is robust with respect to changes in the size of the training sample, the assignment of distributed representations to particular classes, and the application of overfitting avoidance techniques such as decision-tree pruning. Finally, we show that--like the other methods--the error-correcting code technique can provide reliable class probability estimates. Taken together, these results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
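
A minimal sketch of the output-coding technique, assuming scikit-learn's DecisionTreeClassifier as a stand-in for C4.5. The random code matrix below is only an illustrative choice; the paper constructs codes with guaranteed row and column separation.

```python
# Error-correcting output codes: one binary learner per code-matrix
# column, Hamming-distance decoding at prediction time.
import numpy as np
from sklearn.tree import DecisionTreeClassifier

class ECOCClassifier:
    def __init__(self, n_bits=15, random_state=0):
        self.n_bits = n_bits
        self.rng = np.random.RandomState(random_state)

    def fit(self, X, y):
        self.classes_ = np.unique(y)
        idx = {c: i for i, c in enumerate(self.classes_)}
        y_idx = np.array([idx[c] for c in y])
        # One codeword (row) per class; each column relabels the
        # training set as a binary problem for one learner.
        self.code_ = self.rng.randint(0, 2, (len(self.classes_), self.n_bits))
        self.learners_ = [DecisionTreeClassifier().fit(X, self.code_[y_idx, b])
                          for b in range(self.n_bits)]
        return self

    def predict(self, X):
        # Each learner predicts one bit; decode to the class whose
        # codeword is nearest in Hamming distance.
        bits = np.column_stack([clf.predict(X) for clf in self.learners_])
        dists = np.abs(bits[:, None, :] - self.code_[None, :, :]).sum(axis=2)
        return self.classes_[np.argmin(dists, axis=1)]
```

Hamming decoding is what provides the error correction: if the codewords have minimum pairwise distance d, up to floor((d-1)/2) individual bit errors still decode to the correct class.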

2,542 citations


Journal ArticleDOI
TL;DR: Five algorithms that identify a subset of features sufficient to construct a hypothesis consistent with the training examples are presented, and it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((ln(1/δ) + 2^p + p ln n)/ε) training examples to guarantee PAC-learning a concept having p relevant features out of n available features.
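
A brute-force sketch of the MIN-FEATURES bias: enumerate feature subsets in order of increasing size and return the first subset sufficient to separate the classes (no two training examples agree on the subset yet disagree in label). This only illustrates the bias; the paper's five algorithms organize the search more carefully, and the 2^p term in the quoted bound reflects the same exponential dependence on the number of relevant features.

```python
from itertools import combinations

def sufficient(examples, subset):
    """True if no two examples agree on `subset` but differ in label."""
    seen = {}
    for x, label in examples:
        key = tuple(x[i] for i in subset)
        if seen.setdefault(key, label) != label:
            return False
    return True

def min_features(examples, n_features):
    """Smallest feature subset consistent with the training examples."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if sufficient(examples, subset):
                return subset
    return None  # unreachable if the data itself is consistent

# e.g. f(x) = x0 XOR x2, with x1 irrelevant:
data = [((0,0,0),0), ((0,0,1),1), ((0,1,0),0), ((0,1,1),1),
        ((1,0,0),1), ((1,0,1),0), ((1,1,0),1), ((1,1,1),0)]
print(min_features(data, 3))  # -> (0, 2)
```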

537 citations


Dissertation
01 Jan 1994
TL;DR: It is shown that the k-nearest neighbor algorithm (kNN) outperforms the first nearest neighbor algorithm only under certain conditions; methods for choosing the value of k for kNN are investigated, and two methods for learning feature weights for a weighted Euclidean distance metric are proposed.
Abstract: Distance-based algorithms are machine learning algorithms that classify queries by computing distances between these queries and a number of internally stored exemplars. Exemplars that are closest to the query have the largest influence on the classification assigned to the query. Two specific distance-based algorithms, the nearest neighbor algorithm and the nearest-hyperrectangle algorithm, are studied in detail. It is shown that the k-nearest neighbor algorithm (kNN) outperforms the first nearest neighbor algorithm only under certain conditions. Data sets must contain moderate amounts of noise. Training examples from the different classes must belong to clusters that allow an increase in the value of k without reaching into clusters of other classes. Methods for choosing the value of k for kNN are investigated. It is shown that one-fold cross-validation on a restricted number of values for k suffices for best performance. It is also shown that for best performance the votes of the k-nearest neighbors of a query should be weighted in inverse proportion to their distances from the query. Principal component analysis is shown to reduce the number of relevant dimensions substantially in several domains. Two methods for learning feature weights for a weighted Euclidean distance metric are proposed. These methods improve the performance of kNN and NN in a variety of domains. The nearest-hyperrectangle algorithm (NGE) is found to give predictions that are substantially inferior to those given by kNN in a variety of domains. Experiments performed to understand this inferior performance led to the discovery of several improvements to NGE. Foremost of these is BNGE, a batch algorithm that avoids construction of overlapping hyperrectangles from different classes. Although it is generally superior to NGE, BNGE is still significantly inferior to kNN in a variety of domains. Hence, a hybrid algorithm (KBNGE), which uses BNGE in parts of the input space that can be represented by a single hyperrectangle and kNN otherwise, is introduced. The primary contributions of this dissertation are (a) several improvements to existing distance-based algorithms, (b) several new distance-based algorithms, and (c) an experimentally supported understanding of the conditions under which various distance-based algorithms are likely to give good performance.
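
A minimal sketch, assuming hand-supplied feature weights (the dissertation learns them from data), of two of the ingredients above: a weighted Euclidean metric and votes weighted in inverse proportion to distance.

```python
# Distance-weighted kNN under a weighted Euclidean metric.
import numpy as np

def knn_predict(X_train, y_train, query, k=5, w=None):
    w = np.ones(X_train.shape[1]) if w is None else np.asarray(w)
    # Weighted Euclidean distance from the query to every exemplar.
    d = np.sqrt((w * (X_train - query) ** 2).sum(axis=1))
    votes = {}
    for i in np.argsort(d)[:k]:
        # Vote in inverse proportion to distance; the epsilon guards
        # against a query that coincides with a stored exemplar.
        votes[y_train[i]] = votes.get(y_train[i], 0.0) + 1.0 / (d[i] + 1e-9)
    return max(votes, key=votes.get)

X = np.array([[0.0, 0.0], [0.1, 0.1], [1.0, 1.0]])
print(knn_predict(X, ["a", "a", "b"], np.array([0.2, 0.2]), k=3))  # -> a
```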

139 citations


Patent
20 May 1994
TL;DR: In this patent, explicit representation of molecular shape is combined with neural network learning methods to provide models with high predictive ability that generalize across chemical classes, treating structurally diverse molecules that exhibit similar surface characteristics as similar.
Abstract: Explicit representation of molecular shape of molecules is combined with neural network learning methods to provide models with high predictive ability that generalize to different chemical classes where structurally diverse molecules exhibiting similar surface characteristics are treated as similar. A new machine-learning methodology is disclosed that can accept multiple representations of objects (100) and construct models (102-114) that predict characteristics of those objects (116). An extension of this methodology can be applied in cases where the representations of the objects are determined by a set of adjustable parameters. An iterative process applies intermediate models to generate new representations of the objects by adjusting parameters (108) and repeatedly retrains the models to obtain better predictive models. This method can be applied to molecules because each molecule can have many orientations and conformations, or representations, that are determined by a set of translation, rotation, and torsion angle parameters.
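
A toy sketch of that iterative loop. The regressor and the pose-selection rule below are illustrative stand-ins, not the patent's actual models or its translation/rotation/torsion parameterization: each object carries several candidate feature vectors ("poses"), the intermediate model picks the candidate it explains best, and the model is retrained on the updated representations.

```python
# Outer loop: train -> re-pose each object with the current model ->
# retrain on the improved representations.
import numpy as np
from sklearn.neural_network import MLPRegressor

def iterative_fit(candidates, y, n_rounds=5, seed=0):
    """candidates[i]: (n_poses_i, n_features) array of alternative
    representations of object i; y[i]: the property to predict."""
    rng = np.random.RandomState(seed)
    chosen = [c[rng.randint(len(c))] for c in candidates]  # random start
    model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000,
                         random_state=seed)
    for _ in range(n_rounds):
        model.fit(np.array(chosen), y)
        # Re-pose: keep the candidate whose prediction best matches the
        # target (an assumed selection rule), then retrain.
        chosen = [c[np.argmin(np.abs(model.predict(c) - t))]
                  for c, t in zip(candidates, y)]
    model.fit(np.array(chosen), y)  # final retrain on the last poses
    return model, chosen
```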

102 citations


Journal ArticleDOI
TL;DR: A novel technique is presented that removes a major obstacle to accurate prediction by automatically selecting conformations and alignments of molecules without the benefit of a characterized active site, and the resulting models can provide graphical guidance for chemical modifications.
Abstract: Building predictive models for iterative drug design in the absence of a known target protein structure is an important challenge. We present a novel technique, Compass, that removes a major obstacle to accurate prediction by automatically selecting conformations and alignments of molecules without the benefit of a characterized active site. The technique combines explicit representation of molecular shape with neural network learning methods to produce highly predictive models, even across chemically distinct classes of molecules. We apply the method to predicting human perception of musk odor and show how the resulting models can provide graphical guidance for chemical modifications.

95 citations


Journal ArticleDOI
TL;DR: A system of rotating 3-year terms was developed so that over time a wide range of researchers in machine learning would have the opportunity to serve on the editorial board and contribute to the quality and success of the journal.
Abstract: As the field of machine learning has grown and matured, the number of active, senior researchers in the field has grown as well. Our journal faced a problem: If we added all of these excellent scientists to the editorial board, it would soon become large and unwieldy. Hence, at the 1993 meeting of the editorial board, it was decided to develop a system of rotating 3-year terms, so that over time a wide range of researchers in machine learning would have the opportunity to serve on the editorial board and contribute to the quality and success of the journal. To implement this rotating-term system, currently serving editorial board members were randomly assigned to one of three groups with terms ending December 31, 1993, December 31, 1994, and December 31, 1995. I want to take this opportunity to recognize and thank those board members whose terms expired December 31, 1993:

1 citation


01 Jan 1994
TL;DR: This preliminary paper shows that learning to solve scheduling problems such as Space Shuttle Payload Processing and Automatic Guided Vehicle scheduling can be usefully studied in the reinforcement learning framework.
Abstract: The goal of this research is to apply reinforcement learning methods to real-world problems like scheduling. In this preliminary paper, we show that learning to solve scheduling problems such as Space Shuttle Payload Processing and Automatic Guided Vehicle (AGV) scheduling can be usefully studied in the reinforcement learning framework. We discuss some of the special challenges posed by the scheduling domain to these methods and propose some possible solutions we plan to implement.
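
A minimal tabular Q-learning sketch on an abstract job-sequencing task, as a stand-in for the framework described above; the step_cost function is a hypothetical placeholder for a real scheduling cost, and the paper's domains are far richer than this toy.

```python
# Tabular Q-learning over (set of scheduled jobs, next job) pairs.
import random

def q_learning(n_jobs, step_cost, episodes=5000, alpha=0.1,
               gamma=1.0, eps=0.1, seed=0):
    rng = random.Random(seed)
    Q = {}  # (frozenset of scheduled jobs, candidate job) -> value
    for _ in range(episodes):
        done = frozenset()
        while len(done) < n_jobs:
            actions = [j for j in range(n_jobs) if j not in done]
            # Epsilon-greedy choice of the next job to schedule.
            if rng.random() < eps:
                a = rng.choice(actions)
            else:
                a = max(actions, key=lambda j: Q.get((done, j), 0.0))
            nxt, r = done | {a}, -step_cost(done, a)
            future = 0.0 if len(nxt) == n_jobs else max(
                Q.get((nxt, j), 0.0) for j in range(n_jobs) if j not in nxt)
            # One-step temporal-difference update.
            Q[(done, a)] = Q.get((done, a), 0.0) + alpha * (
                r + gamma * future - Q.get((done, a), 0.0))
            done = nxt
    return Q

# e.g. a cost that grows the later a "heavy" job is placed, so the
# greedy policy w.r.t. Q learns to schedule heavy jobs first:
weights = [3, 1, 2]
Q = q_learning(3, lambda done, j: (len(done) + 1) * weights[j])
```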

1 citation