
Showing papers by "Thomas G. Dietterich published in 1991"


Proceedings Article
14 Jul 1991
TL;DR: It is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((1/ε) ln(1/δ) + (1/ε)[2^p + p ln n]) training examples to guarantee PAC-learning a concept having p relevant features out of n available features, and it is suggested that training data should be preprocessed to remove irrelevant features before being given to ID3 or FRINGE.
Abstract: In many domains, an appropriate inductive bias is the MIN-FEATURES bias, which prefers consistent hypotheses definable over as few features as possible. This paper defines and studies this bias. First, it is shown that any learning algorithm implementing the MIN-FEATURES bias requires Θ((1/ε) ln(1/δ) + (1/ε)[2^p + p ln n]) training examples to guarantee PAC-learning a concept having p relevant features out of n available features. This bound is only logarithmic in the number of irrelevant features. The paper also presents a quasi-polynomial time algorithm, FOCUS, which implements MIN-FEATURES. Experimental studies are presented that compare FOCUS to the ID3 and FRINGE algorithms. These experiments show that, contrary to expectations, these algorithms do not implement good approximations of MIN-FEATURES. The coverage, sample complexity, and generalization performance of FOCUS are substantially better than those of either ID3 or FRINGE on learning problems where the MIN-FEATURES bias is appropriate. This suggests that, in practical applications, training data should be preprocessed to remove irrelevant features before being given to ID3 or FRINGE.
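As an illustration of the MIN-FEATURES bias itself (not the paper's FOCUS implementation, which is quasi-polynomial), the following sketch brute-forces feature subsets in order of increasing size and returns the smallest subset over which the training examples remain label-consistent; all names and the toy data are illustrative.

```python
from itertools import combinations

def consistent(examples, subset):
    """True if no two examples agree on `subset` but carry different labels."""
    seen = {}
    for x, y in examples:
        key = tuple(x[i] for i in subset)
        if key in seen and seen[key] != y:
            return False
        seen[key] = y
    return True

def min_features(examples, n_features):
    """Smallest feature subset consistent with the data (brute-force
    illustration of the MIN-FEATURES bias, not the FOCUS algorithm itself)."""
    for size in range(n_features + 1):
        for subset in combinations(range(n_features), size):
            if consistent(examples, subset):
                return subset
    return tuple(range(n_features))

# Toy data: the label copies feature 2; features 0 and 1 are irrelevant.
examples = [((0, 1, 0), 0), ((0, 0, 1), 1), ((1, 1, 1), 1), ((1, 0, 0), 0)]
print(min_features(examples, 3))   # -> (2,)
```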

716 citations


Book
01 Mar 1991
TL;DR: Readings in Machine Learning collects the best of the published machine learning literature, including papers that address a wide range of learning tasks, and that introduce a variety of techniques for giving machines the ability to learn.
Abstract: From the Publisher: The ability to learn is a fundamental characteristic of intelligent behavior. Consequently, machine learning has been a focus of artificial intelligence since the beginnings of AI in the 1950s. The 1980s saw tremendous growth in the field, and this growth promises to continue with valuable contributions to science, engineering, and business. Readings in Machine Learning collects the best of the published machine learning literature, including papers that address a wide range of learning tasks, and that introduce a variety of techniques for giving machines the ability to learn. The editors, in cooperation with a group of expert referees, have chosen important papers that empirically study, theoretically analyze, or psychologically justify machine learning algorithms. The papers are grouped into a dozen categories, each of which is introduced by the editors.

325 citations


01 Jan 1991
TL;DR: It is demonstrated that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
Abstract: Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form [xi, f(xi)]. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH error-correcting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task. These results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
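To make the error-correcting output code idea concrete, here is a hedged sketch using a hand-made code matrix and scikit-learn decision trees standing in for ID3; the paper itself derives its codewords from BCH codes, so everything below is illustrative rather than the authors' implementation.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical 4-class code matrix (rows = classes, columns = codeword bits);
# every pair of rows differs in 4 positions. The paper derives such matrices
# from BCH codes; this one is hand-made purely for illustration.
CODE = np.array([[0, 0, 0, 0, 0, 0],
                 [0, 1, 1, 1, 1, 0],
                 [1, 0, 1, 1, 0, 1],
                 [1, 1, 0, 0, 1, 1]])

def train_ecoc(X, y):
    """Train one binary learner per codeword bit (y holds class indices 0..3)."""
    learners = []
    for bit in range(CODE.shape[1]):
        binary_labels = CODE[y, bit]      # relabel every example by this bit
        learners.append(DecisionTreeClassifier().fit(X, binary_labels))
    return learners

def predict_ecoc(learners, X):
    """Decode by picking the class whose codeword is nearest in Hamming distance."""
    bits = np.column_stack([clf.predict(X) for clf in learners])
    dists = np.abs(bits[:, None, :] - CODE[None, :, :]).sum(axis=2)
    return dists.argmin(axis=1)
```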

212 citations


Book ChapterDOI
14 Jul 1991
TL;DR: In this paper, error-correcting output codes are employed as a distributed output representation to improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task.
Abstract: Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form 〈Xi, f(Xi)〉. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH error-correcting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task. These results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.
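The sense in which the output code is "error-correcting" can be stated with a standard coding-theory fact (not specific to this paper's BCH construction):

```latex
% If every pair of class codewords differs in at least d bit positions
% (minimum Hamming distance d), nearest-codeword decoding still recovers
% the correct class when at most t of the binary classifiers are wrong:
\[
  t \;=\; \left\lfloor \frac{d - 1}{2} \right\rfloor
\]
```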

188 citations


Proceedings Article
02 Dec 1991
TL;DR: It is concluded that supervised learning of center locations can be very important for radial basis function learning.
Abstract: Three methods for improving the performance of (Gaussian) radial basis function (RBF) networks were tested on the NETtalk task. In RBF, a new example is classified by computing its Euclidean distance to a set of centers chosen by unsupervised methods. The application of supervised learning to learn a non-Euclidean distance metric was found to reduce the error rate of RBF networks, while supervised learning of each center's variance resulted in inferior performance. The best improvement in accuracy was achieved by networks called generalized radial basis function (GRBF) networks. In GRBF, the center locations are determined by supervised learning. After training on 1000 words, RBF classifies 56.5% of letters correctly, while GRBF classifies 73.4% correctly (on a separate test set). From these and other experiments, we conclude that supervised learning of center locations can be very important for radial basis function learning.
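A hedged sketch of the contrast, assuming Gaussian basis functions and a simple squared-error loss (both assumptions, not the paper's exact setup): plain RBF keeps the centers fixed after unsupervised placement, while a GRBF-style update also moves the centers by gradient descent on the supervised loss.

```python
import numpy as np

def rbf_features(X, centers, width):
    """Gaussian activations, one per center, from squared Euclidean distances."""
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)   # (n, k)
    return np.exp(-d2 / (2.0 * width ** 2))

def rbf_predict(X, centers, width, W):
    """Plain RBF: fixed centers, linear output layer."""
    return rbf_features(X, centers, width) @ W

def grbf_step(X, Y, centers, width, W, lr=0.01):
    """One squared-error gradient step that adapts the output weights AND the
    center locations -- the supervised movement of centers is what the
    abstract above identifies as the key ingredient of GRBF."""
    Phi = rbf_features(X, centers, width)            # (n, k)
    err = Phi @ W - Y                                # (n, c); Y is one-hot
    grad_W = Phi.T @ err / len(X)
    dL_dPhi = err @ W.T                              # (n, k)
    diff = X[:, None, :] - centers[None, :, :]       # (n, k, d)
    coef = dL_dPhi * Phi / width ** 2                # (n, k)
    grad_C = (coef[:, :, None] * diff).sum(axis=0) / len(X)
    return centers - lr * grad_C, W - lr * grad_W

# Toy usage: 2-D inputs, 3 centers, 2 classes (one-hot targets).
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
Y = np.eye(2)[rng.integers(0, 2, size=20)]
centers, W = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
centers, W = grbf_step(X, Y, centers, width=1.0, W=W)
```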

175 citations



01 Jan 1991
TL;DR: A set of machine learning methods for automatically constructing letter-to-sound rules by analyzing a dictionary of words and their pronunciations is presented, showing that error-correcting output codes provide a domain-independent, algorithm-independent approach to multiclass learning problems.
Abstract: The task of mapping spelled English words into strings of phonemes and stresses ("reading aloud") has many practical applications. Several commercial systems perform this task by applying a knowledge base of expert-supplied letter-to-sound rules. This dissertation presents a set of machine learning methods for automatically constructing letter-to-sound rules by analyzing a dictionary of words and their pronunciations. Taken together, these methods provide a substantial performance improvement over the best commercial system--DECtalk from Digital Equipment Corporation. In a performance test, the learning methods were trained on a dictionary of 19,002 words. Then, human subjects were asked to compare the performance of the resulting letter-to-sound rules against the dictionary for an additional 1,000 words not used during training. In a blind procedure, the subjects rated the pronunciations of both the learned rules and the DECtalk rules according to whether they were noticeably different from the dictionary pronunciation. The error rate for the learned rules was 28.8% (288 words noticeably different), while the error rate for the DECtalk rules was 44.3% (433 words noticeably different). If, instead of using human judges, we required that the pronunciations of the letter-to-sound rules exactly match the dictionary to be counted correct, then the error rate for our learned rules is 35.2% and the error rate for DECtalk is 63.6%. Similar results were observed at the level of individual letters, phonemes, and stresses. To achieve these results, several techniques were combined. The key learning technique represents the output classes by the codewords of an error-correcting code. Boolean concept learning methods, such as the standard ID3 decision-tree algorithm, can be applied to learn the individual bits of these codewords. This converts the multiclass learning problem into a number of Boolean concept learning problems. This method is shown to be superior to several other methods: multiclass ID3, one-tree-per-class ID3, the domain-specific distributed code employed by T. Sejnowski and C. Rosenberg in their NETtalk system, and a method developed by D. Wolpert. Similar results in the domain of isolated-letter speech recognition with the backpropagation algorithm show that error-correcting output codes provide a domain-independent, algorithm-independent approach to multiclass learning problems.
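The key conversion described above (one multiclass problem into several Boolean problems, one per codeword bit) can be sketched in a few lines; each resulting Boolean dataset would then be handed to a learner such as ID3. Names and codewords here are illustrative, not the BCH codewords used in the dissertation.

```python
def boolean_datasets(examples, codewords):
    """Split a multiclass dataset into one Boolean learning problem per
    codeword bit. `codewords` maps each class label (e.g. a phoneme/stress
    symbol) to a bit tuple."""
    n_bits = len(next(iter(codewords.values())))
    return [[(x, codewords[y][bit]) for x, y in examples] for bit in range(n_bits)]

# Toy usage: three phoneme classes mapped to hand-made 5-bit codewords.
codewords = {"AA": (0, 0, 0, 0, 0), "IY": (1, 1, 1, 0, 0), "EH": (1, 0, 0, 1, 1)}
examples = [(("c", "a", "t"), "AA"), (("s", "e", "e"), "IY")]
per_bit = boolean_datasets(examples, codewords)   # 5 Boolean training sets
```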

19 citations


Journal ArticleDOI
TL;DR: Two general approaches to closing the gap between specifications and run-time architectures are described: converting specifications into a form the run-time architecture can interpret directly (the focus of knowledge compilation research), and changing the run-time architecture so that it can interpret the given specifications directly (the focus of work on model-directed reasoning and task-specific architectures).
Abstract: The claim that in knowledge compilation the gap between specifications and run-time architectures is substantial is examined. The forces that create the gap are identified and discussed. Two general approaches to closing this gap are described. One approach, which has been the focus of knowledge compilation research, converts specifications into a form that the run-time architecture can interpret directly. The other approach, which has been the focus of work on model-directed reasoning and task-specific architectures, changes the run-time architecture so that it can interpret the given specifications directly.

9 citations



Book ChapterDOI
01 Jun 1991
TL;DR: A method is described that replaces a single inefficient non-gradient-based optimization by a set of efficient numerical gradient-directed optimizations that can be performed in parallel, and that decreases the dependence of the numerical methods on having a good starting point.
Abstract: Many important application problems can be formalized as constrained non-linear optimization tasks. However, numerical methods for solving such problems are brittle and do not scale well. This paper describes a method to speed up and increase the reliability of numerical optimization by (a) optimizing the computation of the objective function, and (b) splitting the objective function into special cases that possess differentiable closed forms. This allows us to replace a single inefficient non-gradient-based optimization by a set of efficient numerical gradient-directed optimizations that can be performed in parallel. In the domain of 2-dimensional structural design, this procedure yields a 95% speedup over traditional optimization methods and decreases the dependence of the numerical methods on having a good starting point.
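A hedged sketch of the split-into-special-cases idea, using a toy piecewise objective and scipy's L-BFGS-B as a stand-in gradient-directed optimizer; everything here is illustrative rather than the paper's structural-design method.

```python
import numpy as np
from scipy.optimize import minimize

# Toy piecewise objective: one smooth closed form applies when x0 >= 0,
# another when x0 <= 0 (only the shape of the idea, not the paper's objective).
g_pos = lambda x: (x[0] - 1.0) ** 2 + x[1] ** 2
g_neg = lambda x: (x[0] + 2.0) ** 2 + (x[1] - 1.0) ** 2 + 0.5

def optimize_by_cases(cases, x0):
    """Each (objective, bounds) pair is a smooth special case of the original
    problem. Every case is solved with a gradient-directed method (L-BFGS-B);
    the calls are independent, so they could run in parallel. The best case
    result is returned."""
    results = [minimize(g, x0, method="L-BFGS-B", bounds=bounds)
               for g, bounds in cases]
    return min(results, key=lambda r: r.fun)

cases = [(g_pos, [(0, None), (None, None)]),    # case 1: x0 >= 0
         (g_neg, [(None, 0), (None, None)])]    # case 2: x0 <= 0
best = optimize_by_cases(cases, x0=np.zeros(2))
print(best.x, best.fun)
```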

3 citations


Book ChapterDOI
01 Jun 1991
TL;DR: A number of challenges that engineering applications pose to learning systems are described, including noisy data, continuous quantities, mathematical formulas, large problem spaces, incorporating multiple sources and forms of knowledge, and the need for user-system interaction, along with a taxonomy of engineering tasks for applying machine learning technology.
Abstract: Engineers need intelligent tools to assist them with problems such as design, planning, monitoring, control, diagnosis, and analysis. Manual construction of these tools can be costly or impossible due to problems such as large amounts of data, lack of problem understanding, and the expense of knowledge engineering. Machine learning techniques hold promise for assisting in solutions to many of these problems, but engineering domains present significant challenges to learning systems, including: noisy data, continuous quantities, mathematical formulas, large problem spaces, incorporating multiple sources and forms of knowledge, and the need for user-system interaction. This paper describes a number of challenges to learning systems motivated by engineering applications and describes a taxonomy of engineering tasks for application of machine learning technology.

01 Jan 1991
TL;DR: A method is described that replaces a single inefficient non-gradient-based optimization by a set of efficient numerical gradient-directed optimizations that can be performed in parallel; it yields a 95% speedup over traditional optimization methods and decreases the dependence of the numerical methods on having a good starting point.
Abstract: Many important application problems can be formalized as constrained non-linear optimization tasks. However, numerical methods for solving such problems are brittle and do not scale well. Furthermore, for large classes of engineering problems, the objective function cannot be converted into a differentiable closed form. This prevents the application of efficient gradient optimization methods--only slower, non-gradient methods can be applied. This paper describes a method to speed up and increase the reliability of numerical optimization by (a) optimizing the computation of the objective function, and (b) splitting the objective function into special cases that possess differentiable closed forms. This allows us to replace a single inefficient non-gradient-based optimization by a set of efficient numerical gradient-directed optimizations that can be performed in parallel. In the domain of 2-dimensional structural design, this procedure yields a 95% speedup over traditional optimization methods and decreases the dependence of the numerical methods on having a good starting point.