
Showing papers in "Machine Learning in 1989"


Journal ArticleDOI
TL;DR: A description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present.
Abstract: Systems for inducing concept descriptions from examples are valuable tools for assisting in the task of knowledge acquisition for expert systems. This paper presents a description and empirical evaluation of a new induction system, CN2, designed for the efficient induction of simple, comprehensible production rules in domains where problems of poor description language and/or noise may be present. Implementations of the CN2, ID3, and AQ algorithms are compared on three medical classification tasks.

2,193 citations
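
The abstract above describes CN2 only at the system level. As a rough illustration of the kind of covering algorithm involved, the following is a minimal sketch, in Python, of CN2-style rule induction: a beam search over conjunctive attribute tests scored by class entropy, with covered examples removed after each rule is formed. The example format, every name, and the scoring details are assumptions made for illustration, not the authors' implementation (CN2's significance test and other refinements are omitted).

# Minimal CN2-style rule-induction sketch (illustrative only).
# Each example is a (dict_of_attribute_values, class_label) pair.
from collections import Counter
from math import log2

def entropy(labels):
    counts = Counter(labels)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def covers(condition, example):
    attrs, _ = example
    return all(attrs.get(a) == v for a, v in condition)

def best_condition(examples, beam_width=3, max_len=3):
    """Beam search for the conjunction of attribute tests with lowest class entropy."""
    tests = {(a, v) for attrs, _ in examples for a, v in attrs.items()}
    beam = [()]                      # start from the empty (always-true) condition
    best = None
    for _ in range(max_len):
        candidates = []
        for cond in beam:
            for t in tests:
                if t in cond:
                    continue
                new = cond + (t,)
                covered = [ex for ex in examples if covers(new, ex)]
                if covered:
                    candidates.append((entropy([lab for _, lab in covered]), -len(covered), new))
        if not candidates:
            break
        candidates.sort(key=lambda c: c[:2])
        beam = [c[2] for c in candidates[:beam_width]]
        if best is None or candidates[0][:2] < best[:2]:
            best = candidates[0]
    return best[2] if best else None

def induce_rules(examples):
    """Repeatedly find a good condition, emit a rule, and remove the covered examples."""
    rules, remaining = [], list(examples)
    while remaining:
        cond = best_condition(remaining)
        if cond is None:
            break
        covered = [ex for ex in remaining if covers(cond, ex)]
        majority = Counter(lab for _, lab in covered).most_common(1)[0][0]
        rules.append((cond, majority))
        remaining = [ex for ex in remaining if not covers(cond, ex)]
    return rules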


Journal ArticleDOI
TL;DR: An incremental algorithm, named ID5R, is presented for inducing decision trees equivalent to those formed by Quinlan's nonincremental ID3 algorithm, given the same training instances.
Abstract: This article presents an incremental algorithm for inducing decision trees equivalent to those formed by Quinlan's nonincremental ID3 algorithm, given the same training instances. The new algorithm, named ID5R, lets one apply the ID3 induction process to learning tasks in which training instances are presented serially. Although the basic tree-building algorithms differ only in how the decision trees are constructed, experiments show that incremental training makes it possible to select training instances more carefully, which can result in smaller decision trees. The ID3 algorithm and its variants are compared in terms of theoretical complexity and empirical behavior.

805 citations
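
For concreteness, the sketch below shows the brute-force incremental approach of the kind ID5R is designed to improve on: accept one training instance at a time and rebuild an ID3-style tree from all instances seen so far. ID5R produces an equivalent tree by restructuring the existing one in place rather than rebuilding, which is the mechanism this sketch deliberately omits. The representation and all helper names here are assumptions for illustration.

# Illustrative serial wrapper around a batch ID3-style builder (hypothetical names).
# ID5R avoids this rebuild-from-scratch step by revising the existing tree in place.
from collections import Counter
from math import log2

def entropy(labels):
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def id3(instances, attributes):
    labels = [lab for _, lab in instances]
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]           # leaf: majority (or only) class
    def gain(a):
        rem = 0.0
        for v in {attrs[a] for attrs, _ in instances}:
            subset = [lab for attrs, lab in instances if attrs[a] == v]
            rem += (len(subset) / len(instances)) * entropy(subset)
        return entropy(labels) - rem
    best = max(attributes, key=gain)
    rest = [a for a in attributes if a != best]
    return (best, {v: id3([ex for ex in instances if ex[0][best] == v], rest)
                   for v in {attrs[best] for attrs, _ in instances}})

class IncrementalTree:
    """Accepts training instances serially and rebuilds the whole tree each time;
    ID5R reaches an equivalent tree without discarding the old one."""
    def __init__(self, attributes):
        self.attributes, self.instances, self.tree = list(attributes), [], None
    def add(self, attrs, label):
        self.instances.append((attrs, label))
        self.tree = id3(self.instances, self.attributes)
    def classify(self, attrs):
        node = self.tree
        while isinstance(node, tuple):
            attribute, branches = node
            if attrs.get(attribute) not in branches:
                return None                                   # unseen value: no prediction here
            node = branches[attrs[attribute]]
        return node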


Journal ArticleDOI
TL;DR: This paper compares five methods for pruning decision trees, developed from sets of examples, and shows that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems.
Abstract: This paper compares five methods for pruning decision trees, developed from sets of examples. When used with uncertain rather than deterministic data, decision-tree induction involves three main stages—creating a complete tree able to classify all the training examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper concerns the second stage—pruning. It presents empirical comparisons of the five methods across several domains. The results show that three methods—critical value, error complexity and reduced error—perform well, while the other two may cause problems. They also show that there is no significant interaction between the creation and pruning methods.

635 citations
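
Of the methods compared, reduced-error pruning is the simplest to state: working bottom-up, replace a subtree with a leaf labeled by its majority class whenever doing so does not increase the error on a held-out pruning set. The sketch below is a minimal illustration under an assumed tree and data representation (all names hypothetical); the other four methods are not shown.

# Minimal reduced-error pruning sketch (illustrative; representations are assumed).
# A tree is either a class label (leaf) or a tuple (attribute, {value: subtree}, majority_class).
def classify(tree, attrs):
    while isinstance(tree, tuple):
        attribute, branches, majority = tree
        tree = branches.get(attrs.get(attribute), majority)   # unseen value falls back to majority
    return tree

def errors(tree, pruning_set):
    return sum(classify(tree, attrs) != label for attrs, label in pruning_set)

def reduced_error_prune(tree, pruning_set):
    """Bottom-up: collapse a subtree to its majority-class leaf whenever that is
    no worse on the pruning set (ties and empty pruning subsets also collapse)."""
    if not isinstance(tree, tuple):
        return tree
    attribute, branches, majority = tree
    branches = {v: reduced_error_prune(sub, [ex for ex in pruning_set
                                             if ex[0].get(attribute) == v])
                for v, sub in branches.items()}
    candidate = (attribute, branches, majority)
    if errors(majority, pruning_set) <= errors(candidate, pruning_set):
        return majority
    return candidate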


Journal ArticleDOI
TL;DR: The paper considers a number of different measures of “goodness of split” and experimentally examines their behavior in four domains, showing that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.
Abstract: One approach to induction is to develop a decision tree from a set of examples. When used with noisy rather than deterministic data, the method involves three main stages – creating a complete tree able to classify all the examples, pruning this tree to give statistical reliability, and processing the pruned tree to improve understandability. This paper is concerned with the first stage – tree creation – which relies on a measure for “goodness of split,” that is, how well the attributes discriminate between classes. Some problems encountered at this stage are missing data and multi-valued attributes. The paper considers a number of different measures and experimentally examines their behavior in four domains. The results show that the choice of measure affects the size of a tree but not its accuracy, which remains the same even when attributes are selected randomly.

502 citations
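
The best-known measure of this kind is the information gain used by ID3, often set against normalized variants such as the gain ratio. As a reference point only (a standard formulation, not necessarily the exact set of measures evaluated in the paper), with p_k the proportion of class k among the examples S at a node and n_v/n the fraction of those examples taking value v of attribute A:

H(S) = -\sum_{k} p_k \log_2 p_k

\mathrm{Gain}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{n_v}{n} \, H(S_v)

\mathrm{GainRatio}(S, A) = \frac{\mathrm{Gain}(S, A)}{-\sum_{v} \frac{n_v}{n} \log_2 \frac{n_v}{n}}

Choosing the attribute that maximizes such a measure (S_v denotes the examples with A = v) is the “goodness of split” decision whose alternatives the paper compares.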


Journal ArticleDOI
TL;DR: A computer-mediated method for acquiring strategic knowledge is presented and demonstrated with a human-computer dialog in which ASK acquires strategic knowledge for medical diagnosis and treatment; the contribution of knowledge representation to automated knowledge acquisition is also discussed.
Abstract: Strategic knowledge is used by an agent to decide what action to perform next, where actions have consequences external to the agent. This article presents a computer-mediated method for acquiring strategic knowledge. The general knowledge acquisition problem and the special difficulties of acquiring strategic knowledge are analyzed in terms of representation mismatch: the difference between the form in which knowledge is available from the world and the form required for knowledge systems. ASK is an interactive knowledge acquisition tool that elicits strategic knowledge from people in the form of justifications for action choices and generates strategy rules that operationalize and generalize the expert's advice. The basic approach is demonstrated with a human–computer dialog in which ASK acquires strategic knowledge for medical diagnosis and treatment. The rationale for and consequences of specific design decisions in ASK are analyzed, and the scope of applicability and limitations of the approach are assessed. The paper concludes by discussing the contribution of knowledge representation to automated knowledge acquisition.

212 citations


Journal ArticleDOI
TL;DR: The results indicate that MACLEARN's filtering heuristics all improve search performance, sometimes dramatically, and when the system was given practice on simpler training problems, it learned a set of macros that led to successful solutions of several much harder problems.
Abstract: This paper describes a heuristic approach to the discovery of useful macro-operators (macros) in problem solving. The approach has been implemented in a program, MACLEARN, that has three parts: macro-proposer, static filter, and dynamic filter. Learning occurs during problem solving, so that performance improves in the course of a single problem trial. Primitive operators and macros are both represented within a uniform representational framework that is closed under composition. This means that new macros can be defined in terms of others, which leads to a definitional hierarchy. The representation also supports the transfer of macros to related problems. MACLEARN is embedded in a supporting system that carries out best-first search. Experiments in macro learning were conducted for two classes of problems: peg solitaire (generalized “Hi-Q puzzle”), and tile sliding (generalized “Fifteen puzzle”). The results indicate that MACLEARN's filtering heuristics all improve search performance, sometimes dramatically. When the system was given practice on simpler training problems, it learned a set of macros that led to successful solutions of several much harder problems.

163 citations
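
Because the representation is closed under composition, a macro is itself an operator, and the search procedure need not distinguish learned macros from primitives. The toy sketch below illustrates only that point: a generic best-first search whose operator set is the primitives plus composed macros that survive a crude static filter. The domain, heuristic, filter rule, and every name are hypothetical; MACLEARN's macro-proposer and dynamic filter are not modeled.

# Toy sketch: best-first search that treats learned macros (sequences of primitive
# operators) as additional operators. Domain and filtering rule are assumptions.
import heapq
from itertools import count

def compose(ops):
    """A macro is the composition of primitive operators, hence an operator itself."""
    def macro(state):
        for op in ops:
            state = op(state)
            if state is None:                 # some primitive was inapplicable
                return None
        return state
    return macro

def static_filter(macros, max_length=4):
    """Crude stand-in for a static filter: keep only short macros."""
    return [m for m in macros if len(m) <= max_length]

def best_first_search(start, is_goal, primitives, macros, heuristic, limit=100000):
    operators = list(primitives) + [compose(m) for m in static_filter(macros)]
    tie = count()                             # unique tie-breaker so states are never compared
    frontier = [(heuristic(start), next(tie), start, [])]
    seen = {start}
    while frontier and limit:
        limit -= 1
        _, _, state, path = heapq.heappop(frontier)
        if is_goal(state):
            return path                       # indices of the operators applied, in order
        for i, op in enumerate(operators):
            nxt = op(state)
            if nxt is not None and nxt not in seen:
                seen.add(nxt)
                heapq.heappush(frontier, (heuristic(nxt), next(tie), nxt, path + [i]))
    return None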


Journal ArticleDOI
TL;DR: This paper presents some results on the probabilistic analysis of learning, illustrating the applicability of these results to settings such as connectionist networks.
Abstract: This paper presents some results on the probabilistic analysis of learning, illustrating the applicability of these results to settings such as connectionist networks. In particular, it concerns the learning of sets and functions from examples and background information. After a formal statement of the problem, some theorems are provided identifying the conditions necessary and sufficient for efficient learning, with respect to measures of information complexity and computational complexity. Intuitive interpretations of the definitions and theorems are provided.

159 citations
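
The notion of efficient learning analyzed here is in the spirit of Valiant's probably approximately correct (PAC) framework. As a reference point only, in the standard formulation (not necessarily the paper's exact definitions), a concept class is learnable if for every target concept c in the class, every distribution D over examples, and every ε, δ in (0, 1), an algorithm given m labeled examples drawn from D outputs a hypothesis h with

\Pr\left[\, \mathrm{err}_D(h) \le \varepsilon \,\right] \ge 1 - \delta, \qquad \mathrm{err}_D(h) = \Pr_{x \sim D}\left[\, h(x) \ne c(x) \,\right],

where m and the running time are polynomial in 1/ε, 1/δ, and the relevant size parameters. The information complexity and computational complexity mentioned in the abstract correspond, roughly, to the sample-size and running-time sides of this requirement.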


Journal ArticleDOI
TL;DR: An empirical study evaluates three methods for identifying a correct concept definition from positive examples such that the concept is some specialization of a target concept defined by a domain theory, and concludes that the new method, IOE, avoids the shortcomings of the two existing methods, EBG and mEBG.
Abstract: This paper formalizes a new learning-from-examples problem: identifying a correct concept definition from positive examples such that the concept is some specialization of a target concept defined by a domain theory. It describes an empirical study that evaluates three methods for solving this problem: explanation-based generalization (EBG), multiple example explanation-based generalization (mEBG), and a new method, induction over explanations (IOE). The study demonstrates that the two existing methods (EBG and mEBG) exhibit two shortcomings: (a) they rarely identify the correct definition, and (b) they are brittle in that their success depends greatly on the choice of encoding of the domain theory rules. The study demonstrates that the new method, IOE, does not exhibit these shortcomings. This method applies the domain theory to construct explanations from multiple training examples as in mEBG, but forms the concept definition by employing a similarity-based generalization policy over the explanations. IOE has the advantage that an explicit domain theory can be exploited to aid the learning process, the dependence on the initial encoding of the domain theory is significantly reduced, and the correct concepts can be learned from few examples. The study evaluates the methods in the context of an implemented system, called Wyl2, which learns a variety of concepts in chess including “skewer” and “knight-fork.”

144 citations


Journal ArticleDOI
TL;DR: WITT is a computational model of categorization and conceptual clustering that has been motivated and guided by research on human categorization; its flexible representation scheme, based on pairwise feature correlations, can model both common-feature categories and polymorphous categories.
Abstract: In this paper we describe WITT, a computational model of categorization and conceptual clustering that has been motivated and guided by research on human categorization. Properties of categories to which humans are sensitive include best or prototypical members, relative contrasts between categories, and polymorphy (neither necessary nor sufficient feature rules). The system uses pairwise feature correlations to determine the “similarity” between objects and clusters of objects, allowing the system a flexible representation scheme that can model common-feature categories and polymorphous categories. This intercorrelation measure is cast in terms of an information-theoretic evaluation function that directs WITT's search through the space of clusterings. This information-theoretic similarity metric also can be used to explain basic-level and typicality effects that occur in humans. WITT has been tested both on artificial domains and on data from the 1985 World Almanac, and we have examined the effect of various system parameters on the quality of the model's behavior.

127 citations
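
The abstract names the ingredients (pairwise feature correlations cast as an information-theoretic evaluation function) without giving the formula, so the following is only a flavor-of-the-idea sketch: a within-cluster cohesion score computed as the average mutual information between pairs of features over the objects in a cluster. It is not WITT's actual evaluation function, and the object format and all names are assumptions.

# Illustrative within-cluster cohesion score based on pairwise feature co-occurrence
# (average mutual information over feature pairs). Not WITT's actual function.
from collections import Counter
from itertools import combinations
from math import log2

def mutual_information(xs, ys):
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
               for (x, y), c in pxy.items())

def cohesion(cluster, features):
    """Average pairwise mutual information over feature pairs, computed on the
    objects in `cluster` (each object is a dict mapping feature name to value)."""
    pairs = list(combinations(features, 2))
    if not pairs or len(cluster) < 2:
        return 0.0
    total = 0.0
    for f, g in pairs:
        xs = [obj[f] for obj in cluster]
        ys = [obj[g] for obj in cluster]
        total += mutual_information(xs, ys)
    return total / len(pairs)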


Journal ArticleDOI
TL;DR: The challenge to knowledge acquisition today is to clarify what the authors are doing (computer modeling), clarify the difficult problems (the nature of knowledge and representations), and reformulate their research program accordingly.
Abstract: Machine learning will never progress beyond its current state until people realize that knowledge is not a substance that can be stored. Knowledge acquisition, in particular, is a process of developing computer models, often for the first time, not a process of transferring or accessing statements or diagrams that are already written down and filed away in an expert’s mind. The “knowledge acquisition bottleneck” is a wrong and misleading metaphor, suggesting that the problem is to squeeze a large amount of already-formed concepts and relations through a narrow communication channel; the metaphor seriously misconstrues the theory formation process of computer modeling. The difficulties of choosing and evaluating knowledge acquisition methods are founded on a number of related misconceptions, clarified as follows: 1) the primary concern of knowledge engineering is modeling systems in the world (not replicating how people think—a matter for psychology); 2) knowledge-level analysis is how observers describe and explain the recurrent behaviors of a situated system, that is, some system interacting with an embedding environment; the knowledge level describes the product of an evolving, adaptive interaction between the situated system and its environment, not the internal, physical processes of an isolated system; 3) modeling intelligent behavior is fraught with frame-of-reference confusions, requiring that we tease apart the roles and points of view of the human expert, the mechanical devices he interacts with, the social and physical environment, and the observer-theoretician (with his own interacting suite of recording devices, representations, and purposes). The challenge to knowledge acquisition today is to clarify what we are doing (computer modeling), clarify the difficult problems (the nature of knowledge and representations), and reformulate our research program accordingly.

85 citations


Journal ArticleDOI
TL;DR: The PROTÉGÉ knowledge-acquisition system addresses the two interrelated phases of knowledge-base construction, model building and model extension, individually and facilitates the construction of expert systems when the same general model can be applied to a variety of application tasks.
Abstract: Building a knowledge-based system is like developing a scientific theory. Although a knowledge base does not constitute a theory of some natural phenomenon, it does represent a theory of how a class of professionals approaches an application task. As when scientists develop a natural theory, builders of expert systems first must formulate a model of the behavior that they wish to understand and then must corroborate and extend that model with the aid of specific examples. Thus there are two interrelated phases of knowledge-base construction: (1) model building and (2) model extension. Computer-based tools can assist developers with both phases of the knowledge-acquisition process. Workers in the area of knowledge acquisition have developed computer-based tools that emphasize either the building of new models or the extension of existing models. The PROTÉGÉ knowledge-acquisition system addresses these two activities individually and facilitates the construction of expert systems when the same general model can be applied to a variety of application tasks.

Journal ArticleDOI
TL;DR: Protos is a knowledge-acquisition tool that adjusts the training it expects and the assistance it provides as its knowledge grows; a second tool, KI, evaluates new information to determine its consequences for existing knowledge.
Abstract: Developing knowledge bases using knowledge-acquisition tools is difficult because each stage of development requires performing a distinct knowledge-acquisition task. This paper describes these different tasks and surveys current tools that perform them. It also addresses two issues confronting tools for start-to-finish development of knowledge bases. The first issue is how to support multiple stages of development. This paper describes Protos, a knowledge-acquisition tool that adjusts the training it expects and assistance it provides as its knowledge grows. The second issue is how to integrate new information into a large knowledge base. This issue is addressed in the description of a second tool, KI, that evaluates new information to determine its consequences for existing knowledge.

Journal ArticleDOI
TL;DR: The state-of-the-art in knowledge acquisition research is briefly described and the technology of interactive knowledge acquisition is discussed, including a descriptive framework, dimensions of use, and research patterns.
Abstract: Notes from the organizers of a series of knowledge acquisition workshops are presented here. The state-of-the-art in knowledge acquisition research is briefly described. Then the technology of interactive knowledge acquisition is discussed, including a descriptive framework, dimensions of use, and research patterns. Finally, dissemination of information from knowledge acquisition workshops is detailed.

Journal ArticleDOI
TL;DR: This editorial examines seven dichotomies that have emerged in recent years to partition the field of machine learning and argues that long-term progress will occur only if the authors can find ways to unify these apparently competing views into a coherent whole.
Abstract: Machine learning is a diverse discipline that acts as host to a variety of research goals, learning techniques, and methodological approaches. Researchers are making continual progress on all of these fronts tackling new problems, formulating innovative solutions to those problems, and devising new ways to evaluate their solutions. Such variety is the sign of a healthy and growing field. However, diversification also has its dangers. Subdisciplines can emerge that focus on one goal or evaluation scheme to the exclusion of others, and similarities among methods can be obscured by different notations and terminology. Thus, it is equally important to search for basic principles that unify the different paradigms within a field. Just as the twin forces of gravity and pressure hold a star in dynamic equilibrium while generating energy, so the joint processes of diversification and unification can hold a science together while fostering progress. In this editorial, I examine seven dichotomies that have emerged in recent years to partition the field of machine learning. I begin with three issues related to research goals and evaluation methodologies, then turn to four more substantive issues about learning methods themselves. In each case, I argue that long-term progress will occur only if we can find ways to unify these apparently competing views into a coherent whole.

Journal ArticleDOI
TL;DR: It is argued that both the forms of declarative knowledge required for problem solving and the problem-solving strategies themselves are functions of the problem-solving task, and a family of generic tasks is identified that can be used as building blocks for the construction of knowledge systems.
Abstract: One of the old saws about learning in AI is that an agent can only learn what it can be told, i.e., the agent has to have a vocabulary for the target structure which is to be acquired by learning. What this vocabulary is, for various tasks, is an issue that is common to whether one is building a knowledge system by learning or by other more direct forms of knowledge acquisition. I long have argued that both the forms of declarative knowledge required for problem solving as well as problem-solving strategies are functions of the problem-solving task and have identified a family of generic tasks that can be used as building blocks for the construction of knowledge systems. In this editorial, I discuss the implication of this line of research for knowledge acquisition and learning.

Journal ArticleDOI
TL;DR: Today’s expert systems have no ability to learn from experience, and learning capabilities are needed for intelligent systems that can remain useful in the face of changing environments or changing standards of expertise.
Abstract: Today’s expert systems have no ability to learn from experience. This commonly heard criticism, unfortunately, is largely true. Except for simple classification systems, expert systems do not employ a learning component to construct parts of their knowledge bases from libraries of previously solved cases. And none that I know of couples learning into closed-loop modification based on experience, although the SOAR architecture [Rosenbloom and Newell 1985] comes the closest to being the sort of integrated system needed for continuous learning. Learning capabilities are needed for intelligent systems that can remain useful in the face of changing environments or changing standards of expertise. Why are the learning methods we know how to implement not being used to build or maintain expert systems in the commercial world?

Journal ArticleDOI
TL;DR: Why don’t the authors' learning programs just keep on going and become generally intelligent?
Abstract: Why don’t our learning programs just keep on going and become generally intelligent? The source of the problem is that most of our learning occurs at the fringe of what we already know. The more you know, the more (and faster) you can learn.

Journal ArticleDOI
TL;DR: An explanation-based learning system based on a version of Newell, Shaw, and Simon's LOGIC-THEORIST (LT) is described, with the aim of characterizing and analyzing differences between non-learning, rote learning (LT's original learning method), and EBL.
Abstract: This paper describes an explanation-based learning (EBL) system based on a version of Newell, Shaw, and Simon's LOGIC-THEORIST (LT). Results of applying this system to propositional calculus problems from Principia Mathematica are compared with results of applying several other versions of the same performance element to these problems. The primary goal of this study is to characterize and analyze differences between non-learning, rote learning (LT's original learning method), and EBL. Another aim is to provide a characterization of the performance of a simple problem solver in the context of the Principia problems, in the hope that these problems can be used as a benchmark for testing improved learning methods, just as problems like chess and the eight puzzle have been used as benchmarks in research on search methods.

Journal ArticleDOI
TL;DR: An algorithm based on manipulation of bit vectors is presented for a common induction problem, the specialization of overly general concepts; it has provided good performance in practice.
Abstract: An algorithm is presented for a common induction problem, the specialization of overly general concepts. A concept is too general when it matches a negative example. The particular case addressed here assumes that concepts are represented as conjunctions of positive literals, that specialization is performed by conjoining literals to the overly general concept, and that the resulting specializations are to be as general as possible. Although the problem is NP-hard, there exists an algorithm, based on manipulation of bit vectors, that has provided good performance in practice.
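
The abstract states the setting precisely enough to sketch the bit-vector idea, though not the paper's actual algorithm: associate with each candidate literal a bit vector marking which negative examples it would exclude, and specialization becomes a covering problem over those vectors. The greedy version below only approximates the maximally general specializations (exact minimization is NP-hard, as the abstract notes); the representation and all names are assumptions for illustration.

# Illustrative bit-vector sketch for specializing an overly general conjunctive concept:
# conjoin as few literals as possible so that every negative example is excluded.
def exclusion_mask(literal, negatives):
    """Bit i is set if `literal` is false of negative example i, so conjoining the
    literal would exclude that negative. Each example is the set of literals true of it."""
    mask = 0
    for i, neg in enumerate(negatives):
        if literal not in neg:
            mask |= 1 << i
    return mask

def specialize(concept, candidate_literals, negatives):
    """Greedy cover: keep conjoining the literal that excludes the most remaining negatives."""
    all_negatives = (1 << len(negatives)) - 1
    covered = 0
    concept = set(concept)
    masks = {lit: exclusion_mask(lit, negatives)
             for lit in candidate_literals if lit not in concept}
    while covered != all_negatives:
        lit = max(masks, key=lambda l: bin(masks[l] & ~covered).count("1"), default=None)
        if lit is None or (masks[lit] & ~covered) == 0:
            return None                       # no remaining literal excludes any remaining negative
        concept.add(lit)
        covered |= masks.pop(lit)
    return concept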

Book ChapterDOI
TL;DR: From the perspective of one who simply wants an understanding of the computational requirements of a class of tasks, the application program over-commits.
Abstract: Each task that anyone, man or machine, might want to perform imposes some set of computational requirements on the performer. This is obvious—how could it be otherwise? But if you ask anyone what the computational requirements are that a class of tasks imposes, you surely won’t get a very good answer. As application programmers, we can expose the computational requirements of the task we just wrote a program to solve. But we typically do so by pointing to the program; this doesn’t provide much insight, either to ourselves or to others, into the requirements that other, similar tasks impose. From the perspective of one who simply wants an understanding of the computational requirements of a class of tasks, the application program over-commits.

Journal ArticleDOI
TL;DR: It is shown that many learnable concept classes are also ss-learnable, where ss-learning (semi-supervised learning) requires a collection of disjoint concepts to be learned simultaneously with only partial information concerning concept membership available to the learning algorithm.
Abstract: The distribution-independent model of (supervised) concept learning due to Valiant (1984) is extended to that of semi-supervised learning (ss-learning), in which a collection of disjoint concepts is to be simultaneously learned with only partial information concerning concept membership available to the learning algorithm. It is shown that many learnable concept classes are also ss-learnable. A new technique of learning, using an intermediate oracle, is introduced. Sufficient conditions for a collection of concept classes to be ss-learnable are given.

Journal ArticleDOI
TL;DR: This special issue is devoted to invited editorials and technical papers on knowledge acquisition, whose relationship to machine learning is not as clear-cut as that of recognized subfields such as genetic algorithms.
Abstract: This special issue is devoted to invited editorials and technical papers on knowledge acquisition. In the past, special issues have been devoted to recognized subfields of machine learning, where a subfield might be characterized by a particular method of machine learning, such as genetic algorithms. The relationship between machine learning and knowledge acquisition is not so clear-cut as the field-subfield one. Neither are the methods of knowledge acquisition so homogeneous and easily characterized as for genetic algorithms.