
Showing papers by "Padhraic Smyth" published in 1988


Journal ArticleDOI
TL;DR: A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented; the algorithm is shown to be equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived.
Abstract: A communication theory approach to decision tree design based on a top-down mutual information algorithm is presented. It is shown that this algorithm is equivalent to a form of Shannon-Fano prefix coding, and several fundamental bounds relating decision-tree parameters are derived. The bounds are used in conjunction with a rate-distortion interpretation of tree design to explain several phenomena previously observed in practical decision-tree design. A termination rule for the algorithm called the delta-entropy rule is proposed that improves its robustness in the presence of noise. Simulation results are presented, showing that the tree classifiers derived by the algorithm compare favourably to the single nearest neighbour classifier.
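Illustration (not from the paper): the greedy step of a top-down mutual information algorithm can be sketched as choosing, at each node, the attribute that shares the most information with the class label. The function names `entropy`, `mutual_information`, and `best_attribute` below are hypothetical, the attributes are assumed to be discrete, and the sketch covers only the split choice, not the paper's bounds or its delta-entropy termination rule.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of discrete labels."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())


def mutual_information(values, labels):
    """I(A; C) = H(C) - H(C | A) for attribute values A and class labels C."""
    n = len(labels)
    groups = {}
    for v, c in zip(values, labels):
        groups.setdefault(v, []).append(c)
    # H(C | A): entropy of the class within each attribute value, weighted by frequency.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional


def best_attribute(rows, labels):
    """Index of the attribute with the largest mutual information with the class."""
    n_attrs = len(rows[0])
    return max(
        range(n_attrs),
        key=lambda j: mutual_information([r[j] for r in rows], labels),
    )


# Toy data (assumed): attribute 0 determines the class, attribute 1 is irrelevant.
rows = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 0, 1, 1]
split = best_attribute(rows, labels)
```

A full tree builder would apply `best_attribute` recursively to the subsets induced by the chosen attribute and stop according to some termination rule.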

83 citations


Proceedings Article
01 Aug 1988
TL;DR: The problem of induction or "learning from examples" can roughly be divided into two distinct categories; symbolic algorithms often incorporate incremental learning as a basic mechanism.
Abstract: 1 Background and motivation. The problem of induction or "learning from examples" can roughly be divided into two distinct categories, namely the symbolic manipulation approach [...] incremental learning (e.g., in decision tree design the entire tree algorithm must be re-run) while symbolic algorithms often incorporate incremental learning as a basic mechanism. On the other hand, symbolic techniques cannot handle noise in the instance data [...]

29 citations


Proceedings Article
01 Jan 1988
TL;DR: Architectures for executing probabilistic rule-bases in a parallel manner are discussed, using as a theoretical basis recently introduced information-theoretic models.
Abstract: We discuss in this paper architectures for executing probabilistic rule-bases in a parallel manner, using as a theoretical basis recently introduced information-theoretic models. We will begin by describing our (non-neural) learning algorithm and theory of quantitative rule modelling, followed by a discussion on the exact nature of two particular models. Finally we work through an example of our approach, going from database to rules to inference network, and compare the network's performance with the theoretical limits for specific problems.

23 citations


Dissertation
01 Jan 1988
TL;DR: An information-theoretic model for rules and rule-based systems is developed and the ability to specialise and generalise in a quantitative manner is demonstrated from a simple definition of rule information content.
Abstract: This thesis examines the problems of designing decision trees and expert systems from an information-theoretic viewpoint. A well-known greedy algorithm using mutual information for tree design is analysed. A basic model for tree design is developed leading to a series of bounds relating tree performance parameters. Analogies with prefix-coding and rate-distortion theory lead to interesting interpretations and results. The problem of finding termination rules for such greedy algorithms is discussed in the context of the theoretical models derived earlier, and several experimentally observed phenomena are explained in this manner. In two classification experiments, involving alphanumeric LEDs and local edge detection, the hierarchical approach is seen to offer significant advantages over alternative techniques. The second part of the thesis begins by analysing the difficulties in designing rule-based expert systems. The inability to model uncertainty in an effective manner is identified as a key limitation of existing approaches. Accordingly, an information-theoretic model for rules and rule-based systems is developed. From a simple definition of rule information content, the ability to specialise and generalise (akin to cognitive processes) in a quantitative manner is demonstrated. The problem of generalised rule induction is posed and the ITRULE algorithm is described which derives optimal rule sets from data. The problem of probabilistic updating in inference nets is discussed and a new maximum-likelihood rule is proposed based on bounded probabilities. Utility functions and statistical decision theory concepts are used to develop a model of implicit control for rule-based inference. The theory is demonstrated by deriving rules from expert-supplied data and performing backward and forward chaining based on decision-theoretic criteria.
The thesis concludes by outlining the many problems which remain to be solved in this area, and by briefly discussing the analogies between rule-based inference nets and neural networks.

4 citations