
Showing papers on "Active learning (machine learning) published in 1995"


Journal ArticleDOI
TL;DR: High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated, and the performance of the support-vector network is compared to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
Abstract: The support-vector network is a new learning machine for two-group classification problems. The machine conceptually implements the following idea: input vectors are non-linearly mapped to a very high-dimension feature space. In this feature space a linear decision surface is constructed. Special properties of the decision surface ensure high generalization ability of the learning machine. The idea behind the support-vector network was previously implemented for the restricted case where the training data can be separated without errors. We here extend this result to non-separable training data. High generalization ability of support-vector networks utilizing polynomial input transformations is demonstrated. We also compare the performance of the support-vector network to various classical learning algorithms that all took part in a benchmark study of Optical Character Recognition.
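As a concrete illustration of the idea, here is a minimal sketch using scikit-learn's SVC, a modern descendant of the support-vector network: a polynomial kernel plays the role of the non-linear input transformation, and the penalty parameter C implements the soft-margin extension to non-separable data. The dataset and hyperparameters are illustrative assumptions, not the paper's benchmark setup.

```python
# Soft-margin SVM with a polynomial kernel, as a modern sketch of the
# support-vector network idea: inputs are implicitly mapped to a
# high-dimensional feature space, where a linear decision surface is found.
# scikit-learn's SVC is used for illustration; it is not the paper's
# original implementation, and the data and parameters are assumptions.
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = make_moons(n_samples=400, noise=0.25, random_state=0)  # non-separable data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# kernel='poly' corresponds to the polynomial input transformation;
# C penalizes margin violations (the extension to non-separable data).
clf = SVC(kernel="poly", degree=3, C=1.0)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
print("number of support vectors:", clf.n_support_.sum())
```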

37,861 citations


Journal ArticleDOI
Yoav Freund1
TL;DR: An algorithm for improving the accuracy of algorithms for learning binary concepts by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples, is presented.
Abstract: We present an algorithm for improving the accuracy of algorithms for learning binary concepts. The improvement is achieved by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples. Our algorithm is based on ideas presented by Schapire and represents an improvement over his results. The analysis of our algorithm provides general upper bounds on the resources required for learning in Valiant's polynomial PAC learning framework, which are the best general upper bounds known today. We show that the number of hypotheses that are combined by our algorithm is the smallest number possible. Other outcomes of our analysis are results regarding the representational power of threshold circuits, the relation between learnability and compression, and a method for parallelizing PAC learning algorithms. We provide extensions of our algorithms to cases in which the concepts are not binary and to the case where the accuracy of the learning algorithm depends on the distribution of the instances.
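The combination step can be illustrated with a toy sketch: many hypotheses, each trained on a different set of examples, are merged by majority vote. This shows the combination idea only; Freund's boost-by-majority chooses the training distributions adaptively rather than by the uniform resampling assumed here.

```python
# Toy sketch of combining many weak hypotheses by majority vote, each trained
# on a different set of examples. Illustrates only the combination idea;
# Freund's boost-by-majority selects training distributions adaptively,
# not by uniform resampling as done here.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, y_train, X_test, y_test = X[:700], y[:700], X[700:], y[700:]

hypotheses = []
for _ in range(51):  # odd number of voters so the vote cannot tie
    idx = rng.choice(len(X_train), size=200, replace=True)  # a different example set
    stump = DecisionTreeClassifier(max_depth=1).fit(X_train[idx], y_train[idx])
    hypotheses.append(stump)

votes = np.mean([h.predict(X_test) for h in hypotheses], axis=0)
combined = (votes > 0.5).astype(int)  # majority vote over binary labels
print("combined accuracy:", (combined == y_test).mean())
```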

1,632 citations


Book
15 Sep 1995
TL;DR: Elements of Machine Learning by Pat Langley surveys the science of machine learning, its methodology, and its prospects for the coming years.
Abstract: Elements of Machine Learning by Pat Langley. Contents:
Preface
1. An overview of machine learning: 1.1 The science of machine learning; 1.2 Nature of the environment; 1.3 Nature of representation and performance; 1.4 Nature of the learning component; 1.5 Five paradigms for machine learning; 1.6 Summary of the chapter
2. The induction of logical conjunctions: 2.1 General issues in logical induction; 2.2 Nonincremental induction of logical conjunctions; 2.3 Heuristic induction of logical conjunctions; 2.4 Incremental induction of logical conjunctions; 2.5 Incremental hill climbing for logical conjunctions; 2.6 Genetic algorithms for logical concept induction; 2.7 Summary of the chapter
3. The induction of threshold concepts: 3.1 General issues for threshold concepts; 3.2 Induction of criteria tables; 3.3 Induction of linear threshold units; 3.4 Induction of spherical threshold units; 3.5 Summary of the chapter
4. The induction of competitive concepts: 4.1 Instance-based learning; 4.2 Learning probabilistic concept descriptions; 4.3 Summary of the chapter
5. The construction of decision lists: 5.1 General issues in disjunctive concept induction; 5.2 Nonincremental learning using separate and conquer; 5.3 Incremental induction using separate and conquer; 5.4 Induction of decision lists through exceptions; 5.5 Induction of competitive disjunctions; 5.6 Instance-storing algorithms; 5.7 Complementary beam search for disjunctive concepts; 5.8 Summary of the chapter
6. Revision and extension of inference networks: 6.1 General issues surrounding inference networks; 6.2 Extending an incomplete inference network; 6.3 Inducing specialized concepts with inference networks; 6.4 Revising an incorrect inference network; 6.5 Network construction and term generation; 6.6 Summary of the chapter
7. The formation of concept hierarchies: 7.1 General issues concerning concept hierarchies; 7.2 Nonincremental divisive formation of hierarchies; 7.3 Incremental formation of concept hierarchies; 7.4 Agglomerative formation of concept hierarchies; 7.5 Variations on hierarchies into other structures; 7.7 Summary of the chapter
8. Other issues in concept induction: 8.1 Overfitting and pruning; 8.2 Selecting useful features; 8.3 Induction for numeric prediction; 8.4 Unsupervised concept induction; 8.5 Inducing relational concepts; 8.6 Handling missing features; 8.7 Summary of the chapter
9. The formation of transition networks: 9.1 General issues for state-transition networks; 9.2 Constructing finite-state transition networks; 9.3 Forming recursive transition networks; 9.4 Learning rules and networks for prediction; 9.5 Summary of the chapter
10. The acquisition of search-control knowledge: 10.1 General issues in search control; 10.2 Reinforcement learning; 10.3 Learning state-space heuristics from solution traces; 10.4 Learning control knowledge for problem reduction; 10.5 Learning control knowledge for means-ends analysis; 10.6 The utility of search-control knowledge; 10.7 Summary of the chapter
11. The formation of macro-operators: 11.1 General issues related to macro-operators; 11.2 The creation of simple macro-operators; 11.3 The formation of flexible macro-operators; 11.4 Problem solving by analogy; 11.5 The utility of macro-operators; 11.6 Summary of the chapter
12. Prospects for machine learning: 12.1 Additional areas of machine learning; 12.2 Methodological trends in machine learning; 12.3 The future of machine learning
References
Index

538 citations


Proceedings Article
27 Nov 1995
TL;DR: It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks.
Abstract: This paper investigates learning in a lifelong context. Lifelong learning addresses situations in which a learner faces a whole stream of learning tasks. Such scenarios provide the opportunity to transfer knowledge across multiple learning tasks, in order to generalize more accurately from less training data. In this paper, several different approaches to lifelong learning are described, and applied in an object recognition domain. It is shown that across the board, lifelong learning approaches generalize consistently more accurately from less training data, by their ability to transfer knowledge across learning tasks.

474 citations


Journal ArticleDOI
TL;DR: A reinforcement learning algorithm is proposed that can construct a neural fuzzy control network automatically and dynamically through a reward-penalty signal; it combines a proposed on-line supervised structure-parameter learning technique, the temporal difference prediction method, and a stochastic exploratory algorithm.
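The entry names temporal difference prediction as one ingredient of the algorithm. A minimal tabular TD(0) sketch of that ingredient alone follows; the neural fuzzy network, structure learning, and reward-penalty exploration are not reproduced, and the random-walk chain is an assumed toy problem.

```python
# Minimal tabular TD(0) prediction, the value-estimation ingredient named in
# the entry above. The neuro-fuzzy network and exploration scheme are not
# reproduced; the 5-state random-walk chain below is an assumed toy problem.
import numpy as np

n_states, alpha, gamma = 5, 0.1, 1.0
V = np.zeros(n_states + 2)  # states 0 and n_states+1 are terminal
rng = np.random.default_rng(0)

for _ in range(2000):
    s = (n_states + 1) // 2            # start in the middle state
    while s not in (0, n_states + 1):
        s_next = s + rng.choice([-1, 1])
        r = 1.0 if s_next == n_states + 1 else 0.0
        # TD(0): move V(s) toward the bootstrapped target r + gamma * V(s')
        V[s] += alpha * (r + gamma * V[s_next] - V[s])
        s = s_next

print("estimated values:", V[1:-1])  # true values are 1/6 .. 5/6
```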

327 citations


Journal ArticleDOI
TL;DR: The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules; the rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values.
Abstract: We describe a machine learning method for predicting the value of a real-valued function, given the values of multiple input variables. The method induces solutions from samples in the form of ordered disjunctive normal form (DNF) decision rules. A central objective of the method and representation is the induction of compact, easily interpretable solutions. This rule-based decision model can be extended to search efficiently for similar cases prior to approximating function values. Experimental results on real-world data demonstrate that the new techniques are competitive with existing machine learning and statistical methods and can sometimes yield superior regression performance.
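To make the representation concrete, here is a minimal sketch of predicting with an ordered rule list: each rule pairs a conjunctive condition with a constant value, and the first matching rule determines the prediction. The rules and feature names are invented for illustration; the paper's induction procedure is not shown.

```python
# Minimal sketch of prediction with an ordered list of decision rules for
# regression: each rule pairs a conjunctive condition with a constant value,
# and the first matching rule wins. The rules below are hand-written for
# illustration; the paper induces such rules from samples, which is not shown.
rules = [
    (lambda x: x["age"] < 25 and x["income"] < 30_000, 1200.0),
    (lambda x: x["age"] < 25,                          1800.0),
    (lambda x: x["income"] >= 80_000,                  4500.0),
]
DEFAULT = 2500.0  # used when no rule fires

def predict(x):
    for condition, value in rules:
        if condition(x):
            return value
    return DEFAULT

print(predict({"age": 22, "income": 25_000}))  # 1200.0, first rule fires
print(predict({"age": 40, "income": 90_000}))  # 4500.0, third rule fires
```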

181 citations


Book ChapterDOI
09 Jul 1995
TL;DR: This paper evaluates different techniques for learning from partitioned data and the meta-learning approach is empirically compared with techniques in the literature that aim to combine multiple evidence to arrive at one prediction.
Abstract: Much of the research in inductive learning concentrates on problems with relatively small amounts of data. With the coming age of very large network computing, it is likely that orders of magnitude more data in databases will be available for various learning problems of real world importance. Some learning algorithms assume that the entire data set fits into main memory, which is not feasible for massive amounts of data. One approach to handling a large data set is to partition the data set into subsets, run the learning algorithm on each of the subsets, and combine the results. In this paper we evaluate different techniques for learning from partitioned data. Our meta-learning approach is empirically compared with techniques in the literature that aim to combine multiple evidence to arrive at one prediction.
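A minimal sketch of the partitioned-data setup, assuming scikit-learn learners: train one base classifier per disjoint subset and combine by plurality vote, one of the simpler combining techniques the paper compares. (A sketch of the meta-learning combiner itself appears with the related entry later in this listing.)

```python
# Minimal sketch of learning from partitioned data: split the training set
# into disjoint subsets, run the same learner on each, and combine the
# resulting classifiers by plurality vote. Voting is one of the simpler
# combining schemes compared in the paper; scikit-learn learners and the
# synthetic dataset are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=3000, n_features=20, random_state=0)
X_train, y_train, X_test, y_test = X[:2400], y[:2400], X[2400:], y[2400:]

partitions = np.array_split(np.arange(len(X_train)), 8)  # 8 disjoint subsets
base = [DecisionTreeClassifier(random_state=0).fit(X_train[p], y_train[p])
        for p in partitions]

preds = np.array([clf.predict(X_test) for clf in base])  # shape (8, n_test)
vote = (preds.mean(axis=0) > 0.5).astype(int)            # plurality vote
print("voted accuracy:", (vote == y_test).mean())
```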

179 citations


Journal ArticleDOI
TL;DR: This paper concentrates on Doppelgänger's learning techniques and their implementation in an application-independent, sensor-independent environment.
Abstract: Doppelgänger is a generalized user modeling system that gathers data about users, performs inferences upon the data, and makes the resulting information available to applications. Doppelgänger's learning is called heterogeneous for two reasons: first, multiple learning techniques are used to interpret the data, and second, the learning techniques must often grapple with disparate data types. These computations take place at geographically distributed sites, and make use of portable user models carried by individuals. This paper concentrates on Doppelgänger's learning techniques and their implementation in an application-independent, sensor-independent environment.

159 citations


Journal ArticleDOI
TL;DR: It is shown that the convergence condition of the learning control in the feedback configuration does not change from the condition in an open-loop configuration, but the learning speed can be improved greatly in the feedback configuration.
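The entry concerns iterative learning control, where the same task is repeated and the input is updated from the previous trial's tracking error. A minimal sketch on an assumed scalar first-order plant follows; the paper's feedback-configuration analysis is not reproduced.

```python
# Minimal sketch of iterative learning control (ILC) on an assumed scalar
# first-order plant y(t+1) = a*y(t) + b*u(t): the same task is repeated,
# and after each trial the input is corrected with the previous trial's
# tracking error, u_{k+1}(t) = u_k(t) + gamma * e_k(t+1).
import numpy as np

T, a, b, gamma = 50, 0.9, 1.0, 0.5
y_ref = np.sin(np.linspace(0, 2 * np.pi, T))  # desired trajectory (assumed)
u = np.zeros(T)

def run_trial(u):
    y = np.zeros(T)
    for t in range(T - 1):
        y[t + 1] = a * y[t] + b * u[t]
    return y

for k in range(25):
    e = y_ref - run_trial(u)        # tracking error of trial k
    u = u + gamma * np.roll(e, -1)  # u(t) corrected with e(t+1), since u(t) acts on y(t+1)
    if k % 5 == 0:
        print(f"trial {k:2d}: max |error| = {np.abs(e).max():.4f}")
# The wraparound introduced by np.roll touches only u[T-1], which never acts
# within the horizon, so it is harmless.
```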

153 citations


Journal ArticleDOI
TL;DR: Attribute-based learning is limited to non-relational descriptions of objects in the sense that the learned descriptions do not specify relations among the objects' parts, and the lack of relations makes the concept description language inappropriate for some domains.
Abstract: Techniques of machine learning have been successfully applied to various problems [1, 12]. Most of these applications rely on attribute-based learning, exemplified by the induction of decision trees as in the program C4.5 [20]. Broadly speaking, attribute-based learning also includes such approaches to learning as neural networks and nearest neighbor techniques. The advantages of attribute-based learning are: relative simplicity, efficiency, and existence of effective techniques for handling noisy data. However, attribute-based learning is limited to non-relational descriptions of objects in the sense that the learned descriptions do not specify relations among the objects' parts. Attribute-based learning thus has two strong limitations: the background knowledge can be expressed in rather limited form, and the lack of relations makes the concept description language inappropriate for some domains.

149 citations


Proceedings ArticleDOI
26 Jun 1995
TL;DR: This work predicts discourse segment boundaries from linguistic features of utterances, using a corpus of spoken narratives as data, and develops segmentation algorithms from training data by hand tuning and by machine learning.
Abstract: We predict discourse segment boundaries from linguistic features of utterances, using a corpus of spoken narratives as data. We present two methods for developing segmentation algorithms from training data: hand tuning and machine learning. When multiple types of features are used, results approach human performance on an independent test set (both methods), and using cross-validation (machine learning).
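A toy sketch of the machine-learning route: encode each utterance as a vector of linguistic features and train a classifier to label boundary versus non-boundary. The three boolean features and the tiny dataset below are invented stand-ins for the paper's corpus-derived features.

```python
# Toy sketch of the machine-learning route to segmentation: each utterance
# is encoded as a feature vector and a classifier predicts whether a
# discourse segment boundary precedes it. The three boolean features and
# the tiny dataset are invented stand-ins for the paper's corpus features.
from sklearn.tree import DecisionTreeClassifier

# features: [long pause before utterance, starts with cue word, contains pronoun]
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [0, 0, 1],
     [1, 1, 1], [0, 0, 0], [1, 0, 1], [0, 1, 0]]
y = [1, 1, 0, 0, 1, 0, 0, 1]  # 1 = segment boundary (toy labels)

clf = DecisionTreeClassifier().fit(X, y)
print(clf.predict([[1, 1, 0], [0, 0, 1]]))  # expected: boundary, non-boundary
```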

Book ChapterDOI
09 Jul 1995
TL;DR: The performance of the theoretically founded algorithm T2, which performs agnostic PAC-learning of decision trees of at most 2 levels, is evaluated on 15 common “real-world” datasets, and it is shown that for most of these datasets T2 provides simple decision trees with little or no loss in predictive power.
Abstract: We exhibit a theoretically founded algorithm T2 for agnostic PAC-learning of decision trees of at most 2 levels, whose computation time is almost linear in the size of the training set. We evaluate the performance of this learning algorithm T2 on 15 common “real-world” datasets, and show that for most of these datasets T2 provides simple decision trees with little or no loss in predictive power (compared with C4.5). In fact, for datasets with continuous attributes its error rate tends to be lower than that of C4.5. To the best of our knowledge this is the first time that a PAC-learning algorithm is shown to be applicable to “real-world” classification problems. Since one can prove that T2 is an agnostic PAC-learning algorithm, T2 is guaranteed to produce close to optimal 2-level decision trees from sufficiently large training sets for any (!) distribution of data. In this regard T2 differs strongly from all other learning algorithms that are considered in applied machine learning, for which no guarantee can be given about their performance on new datasets. We also demonstrate that this algorithm T2 can be used as a diagnostic tool for the investigation of the expressive limits of 2-level decision trees. Finally, T2, in combination with new bounds on the VC-dimension of decision trees of bounded depth that we derive, provides us now for the first time with the tools necessary for comparing learning curves of decision trees for “real-world” datasets with the theoretical estimates of PAC-learning theory.
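For contrast with T2's exhaustive search, a greedy depth-2 tree is easy to obtain with scikit-learn; the sketch below uses it only as a rough stand-in for the 2-level hypothesis class (it carries none of T2's agnostic PAC guarantees), on an assumed public dataset.

```python
# Rough stand-in for the 2-level decision tree hypothesis class:
# scikit-learn's greedy learner capped at depth 2. Unlike T2, which searches
# for a near-optimal depth-2 tree with agnostic PAC guarantees, the greedy
# splits below carry no such guarantee. Dataset choice is an assumption.
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)  # a common "real-world" dataset
shallow = DecisionTreeClassifier(max_depth=2, random_state=0)
deep = DecisionTreeClassifier(random_state=0)  # unrestricted, C4.5-like baseline

print("depth-2 accuracy:     ", cross_val_score(shallow, X, y, cv=5).mean())
print("unrestricted accuracy:", cross_val_score(deep, X, y, cv=5).mean())
```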

Journal ArticleDOI
TL;DR: A learning rule for neural networks based on simultaneous perturbation, and an analog feedforward neural network circuit using this rule; the rule requires only forward operations of the network and is suitable for hardware implementation.
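A minimal sketch of a simultaneous-perturbation learning rule (SPSA-style): all weights are perturbed at once by a random sign vector, and the gradient is estimated from two forward evaluations of the loss, with no backpropagation. The quadratic loss below is an assumed stand-in for a network's training error.

```python
# Minimal sketch of a simultaneous-perturbation (SPSA-style) learning rule:
# every weight is perturbed at once by a random +/-1 vector, and the gradient
# is estimated from just two forward evaluations of the loss, without
# backpropagation. This forward-only property is what makes such rules
# attractive for analog hardware. The quadratic loss is an assumed stand-in
# for a neural network's training error.
import numpy as np

rng = np.random.default_rng(0)
w_true = np.array([1.0, -2.0, 0.5])
loss = lambda w: np.sum((w - w_true) ** 2)  # forward evaluation only

w = np.zeros(3)
a, c = 0.05, 0.1  # step size and perturbation magnitude
for k in range(500):
    delta = rng.choice([-1.0, 1.0], size=w.shape)  # simultaneous perturbation
    g_hat = (loss(w + c * delta) - loss(w - c * delta)) / (2 * c) * (1.0 / delta)
    w -= a * g_hat  # unbiased (to O(c^2)) gradient estimate drives the update

print("learned weights:", w.round(3))  # close to w_true
```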

Journal ArticleDOI
TL;DR: In this article, the authors investigated how learning from different media, either from real pulley systems or from simple line diagrams, affected mechanical learning and problem solving, and found that subjects who learned hands-on, by manipulating real pulley systems, solved application problems more accurately than those who learned from diagrams.
Abstract: In this study, we investigated how learning from different media, either from real pulley systems or from simple line diagrams, affected mechanical learning and problem solving. Novice subjects learned about pulley systems by comparing the efficiency of different systems and receiving feedback on their accuracy. The main outcome measures were subjects' ability to compare pulley system efficiency, their level of mechanical reasoning, and their ability to apply knowledge of system efficiency and construction details. Experiment 1 showed that (a) subjects learning with the two types of media made equal improvement on the learning task, and (b) all subjects showed an increase in quantitative understanding as they learned, but (c) subjects who learned hands-on, by manipulating real pulley systems, solved application problems more accurately than those who learned from diagrams. Experiment 2 showed that both the realism of the stimuli and the opportunity to manipulate systems contributed to the improved performance.

Journal ArticleDOI
TL;DR: This approach joins two forms of learning, neural networks and rough sets, and aims to improve the overall classification effectiveness of the learned object descriptions and to refine the dependency factors of the rules.

Proceedings ArticleDOI
26 Jun 1995
TL;DR: The paper points out problems with global learning methods in local model networks and illustrates that local learning has a regularizing effect that can make it favorable compared to global learning in some cases.
Abstract: Local model networks are hybrid models which allow the easy integration of a priori knowledge, as well as the ability to learn from data, in order to represent complex, multidimensional dynamic systems. The paper points out problems with global learning methods in local model networks. The bias/variance trade-offs for local and global learning are examined, and it is illustrated that local learning has a regularizing effect that can make it favorable compared to global learning in some cases.
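A minimal sketch of local learning in a local model network, under assumed Gaussian validity functions: the output blends local affine models, and each model is fitted by weighted least squares using only its own validity weights rather than jointly with the others (global learning).

```python
# Minimal sketch of a local model network with *local* learning: the output
# is a validity-weighted blend of local affine models, and each local model
# is fitted by weighted least squares on its own validity weights instead of
# jointly with the others (global learning). Gaussian validity functions and
# the toy 1-D target are assumptions for illustration.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 200)
y = np.sin(2 * np.pi * x) + 0.05 * rng.standard_normal(200)

centers, width = np.array([0.17, 0.5, 0.83]), 0.15
rho = np.exp(-0.5 * ((x[:, None] - centers) / width) ** 2)
rho /= rho.sum(axis=1, keepdims=True)  # normalized validity functions

Phi = np.column_stack([x, np.ones_like(x)])  # local models are affine in x
theta = []
for i in range(len(centers)):
    W = rho[:, i]  # local learning: each model sees only its own weights
    A = Phi.T @ (W[:, None] * Phi)
    theta.append(np.linalg.solve(A, Phi.T @ (W * y)))

y_hat = sum(rho[:, i] * (Phi @ theta[i]) for i in range(len(centers)))
print("RMS error:", np.sqrt(np.mean((y - y_hat) ** 2)))
```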

Journal ArticleDOI
TL;DR: The performance of the presented Stochastic Estimator Learning Automaton (SELA) is superior to that of all previous well-known S-model ergodic schemes, and it is proved that SELA is ϵ-optimal in every S-model random environment.
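SELA's stochastic-estimator machinery is not reproduced here; the sketch below is a generic S-model reward-inaction automaton, in which the environment returns a graded response beta in [0, 1] and the chosen action's probability is reinforced in proportion to 1 - beta. The environment and step size are assumptions.

```python
# Generic S-model learning automaton sketch (a reward-inaction scheme), not
# SELA itself: the environment returns a graded response beta in [0, 1]
# (0 = best), and the chosen action's probability grows in proportion to
# (1 - beta). SELA's stochastic estimators of each action's mean response
# are not reproduced; the environment and step size are assumptions.
import numpy as np

rng = np.random.default_rng(0)
mean_penalty = np.array([0.7, 0.4, 0.2])  # unknown to the automaton

n, lam = 3, 0.01
p = np.full(n, 1.0 / n)  # action probabilities

for t in range(20000):
    i = rng.choice(n, p=p)
    beta = np.clip(mean_penalty[i] + 0.1 * rng.standard_normal(), 0.0, 1.0)
    # S-model reward-inaction update, driven by the graded response:
    # p_j shrinks for all j, then the chosen action i gets the freed mass.
    p = p - lam * (1.0 - beta) * p
    p[i] += lam * (1.0 - beta)
    p /= p.sum()  # guard against numerical drift

print("final probabilities:", p.round(3))  # typically concentrates on action 2
```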

Book ChapterDOI
Matthias Rauterberg1
28 Mar 1995
TL;DR: A concept of information processing is presented that derives an inverted U-shaped function between incongruity and information: a homeostatic model of ‘in-homeostasis’.
Abstract: Information and information processing are among the most important aspects of dynamic systems. The term ‘information’, which is used in various contexts, might better be replaced with one that incorporates novelty, activity and learning. Many important communications of learning systems are non-ergodic. The ergodicity assumption in Shannon’s communication theory restricts it, and all related concepts, to systems that cannot learn. For learning systems that interact with their environments, the more primitive concept of ‘variety’ must be used instead of probability. Humans have a fundamental need for variety: they cannot permanently perceive the same context, and they cannot always do the same things. This fundamental need for variety leads to a different interpretation of human behaviour that is often classified as “errors”. Variety is the basis for measuring complexity. Complexity in the relationship between a learning system and its context can be expressed as incongruity: the difference between the internal complexity of a learning system and the complexity of its context. Traditional concepts of information processing are models of homeostasis on a basic level, without learning. Activity and the irreversible learning process are driving forces that cause permanent in-homeostasis in the relationship between a learning system and its context. A suitable model of information processing for learning systems must be conceptualised on a higher level: a homeostatic model of ‘in-homeostasis’. A concept of information processing is presented that derives an inverted U-shaped function between incongruity and information. This concept leads to some design recommendations for man-machine systems.


Proceedings Article
20 Aug 1995
TL;DR: This paper compares the arbiter tree strategy to a new but related approach called the combiner tree strategy, which learns how to combine a number of base classifiers so as to scale efficiently to larger learning problems and, where possible, boost the accuracy of the constituent classifiers.
Abstract: Knowledge discovery in databases has become an increasingly important research topic with the advent of wide area network computing. One of the crucial problems we study in this paper is how to scale machine learning algorithms, that typically are designed to deal with main memory based datasets, to efficiently learn from large distributed databases. We have explored an approach called meta-learning that is related to the traditional approaches of data reduction commonly employed in distributed query processing systems. Here we seek efficient means to learn how to combine a number of base classifiers, which are learned from subsets of the data, so that we scale efficiently to larger learning problems, and boost the accuracy of the constituent classifiers if possible. In this paper we compare the arbiter tree strategy to a new but related approach called the combiner tree strategy.
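A minimal sketch of the combiner idea, with scikit-learn learners standing in for the originals: base classifiers are learned from disjoint subsets, and a combiner is trained on their predictions over a held-out set. This is a single flat combiner rather than the paper's combiner tree.

```python
# Minimal sketch of the combiner strategy: base classifiers are learned from
# disjoint data subsets, and a combiner is then trained to map their
# predictions to the true class. This is one flat combiner rather than the
# paper's combiner *tree*, and scikit-learn learners are assumptions.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=4000, n_features=20, random_state=0)
X_base, y_base = X[:2400], y[:2400]          # for the base classifiers
X_meta, y_meta = X[2400:3200], y[2400:3200]  # held out, for the combiner
X_test, y_test = X[3200:], y[3200:]

parts = np.array_split(np.arange(len(X_base)), 4)
base = [DecisionTreeClassifier(random_state=0).fit(X_base[p], y_base[p])
        for p in parts]

def base_preds(X):
    # each column is one base classifier's prediction
    return np.column_stack([clf.predict(X) for clf in base])

combiner = LogisticRegression().fit(base_preds(X_meta), y_meta)
print("combiner accuracy:", combiner.score(base_preds(X_test), y_test))
```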

Journal ArticleDOI
TL;DR: In multi-agent systems two forms of learning can be distinguished: centralized learning, that is, learning done by a single agent independent of the other agents; and distributed learning, which becomes possible only because several agents are present.

Book ChapterDOI
09 Jul 1995
TL;DR: This paper shows how to develop a dynamic programming version of EBL, called Explanation-Based Reinforcement Learning (EBRL), and shows that EBRL combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of RL (learning of optimal policies).
Abstract: In speedup-learning problems, where full descriptions of operators are always known, both explanation-based learning (EBL) and reinforcement learning (RL) can be applied. This paper shows that both methods involve fundamentally the same process of propagating information backward from the goal toward the starting state. RL performs this propagation on a state-by-state basis, while EBL computes the weakest preconditions of operators, and hence, performs this propagation on a region-by-region basis. Based on the observation that RL is a form of asynchronous dynamic programming, this paper shows how to develop a dynamic programming version of EBL, which we call Explanation-Based Reinforcement Learning (EBRL). The paper compares batch and online versions of EBRL to batch and online versions of RL and to standard EBL. The results show that EBRL combines the strengths of EBL (fast learning and the ability to scale to large state spaces) with the strengths of RL (learning of optimal policies). Results are shown in chess endgames and in synthetic maze tasks.
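The state-by-state propagation attributed to RL can be made concrete with tabular value iteration on a toy deterministic grid, backing values up from the goal one state at a time; EBRL's region-by-region backups via weakest preconditions are not reproduced here.

```python
# Minimal sketch of the state-by-state propagation the paper attributes to
# RL: tabular value iteration on a toy deterministic grid, backing up values
# from the goal toward the start one state at a time. EBRL's region-by-region
# backups via weakest preconditions are not reproduced.
import numpy as np

H, W = 5, 5
goal = (4, 4)
V = np.zeros((H, W))
moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]

for sweep in range(50):  # synchronous sweeps; RL performs them asynchronously
    V_new = V.copy()
    for r in range(H):
        for c in range(W):
            if (r, c) == goal:
                continue
            best = max(
                V[r + dr, c + dc]
                for dr, dc in moves
                if 0 <= r + dr < H and 0 <= c + dc < W
            )
            V_new[r, c] = -1.0 + best  # step cost -1, deterministic moves
    if np.allclose(V_new, V):
        break
    V = V_new

print(V)  # V[r, c] converges to -(Manhattan distance to the goal)
```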


Book ChapterDOI
01 Jan 1995
TL;DR: In designing learning algorithms it seems quite reasonable to construct them in such a way that all data the algorithm has already obtained are correctly and completely reflected in the hypothesis the algorithm outputs on these data, but this approach may totally fail.
Abstract: In designing learning algorithms it seems quite reasonable to construct them in such a way that all data the algorithm has already obtained are correctly and completely reflected in the hypothesis the algorithm outputs on these data. However, this approach may totally fail. It may lead to the unsolvability of the learning problem, or it may preclude any efficient solution of it.

Journal ArticleDOI
TL;DR: In this paper, the authors describe the conceptual model and architecture of the system (Collaborative Distance Learning Support System: CODILESS), which aims for effective learning and efficiency, both from the learner's and course provider's point of view.
Abstract: In order to meet the growing demand for flexible and continuing education, distance learning is increasingly being used to supplement conventional classroom-based education. The learning approaches that have trained students to work alone and independently are also being augmented with collaborative approaches that better fit the needs of today's organizations. To devise a model and develop an implementation for an effective and efficient collaborative distance learning system, the authors have started an international cooperative project. In this paper, we describe our research objectives and illustrate the key design criteria and system features by using the experiences from recent work on collaborative distance learning. We describe the conceptual model and architecture of the system (Collaborative Distance Learning Support System: CODILESS). The system aims for effective learning and efficiency, both from the learner's and course provider's point of view. CODILESS supports both collaborative and resource based learning within the same environment by integrating asynchronous and synchronous multimedia communication with electronic learning resources on the local workstation and on the Internet.

Proceedings Article
Kenji Fukumizu1
27 Nov 1995
TL;DR: This work derives the singularity condition of the information matrix and proposes an active learning technique applicable to MLP, whose effectiveness is verified through experiments.
Abstract: We propose an active learning method with hidden-unit reduction, which is devised especially for multilayer perceptrons (MLP). First, we review our active learning method, and point out that many Fisher-information-based methods applied to MLP have a critical problem: the information matrix may be singular. To solve this problem, we derive the singularity condition of the information matrix, and propose an active learning technique that is applicable to MLP. Its effectiveness is verified through experiments.
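The singularity problem can be demonstrated numerically: for a small MLP in which one hidden unit has zero output weight, the Fisher information matrix F = Σ_x ∇_w f(x; w) ∇_w f(x; w)^T (regression with unit Gaussian noise) is singular, because that unit's input weights cannot affect the output. Gradients are taken by finite differences for brevity; the paper's analytical condition and active learning scheme are not shown.

```python
# Numerical sketch of the singularity problem: for an MLP in which one
# hidden unit has zero output weight, the Fisher information matrix
#   F = sum_x grad_w f(x; w) grad_w f(x; w)^T   (regression, unit noise)
# is singular, since the redundant unit's input weights cannot change the
# output. Finite-difference gradients are used for brevity; the paper's
# analytical singularity condition is not reproduced.
import numpy as np

def mlp(w, x):
    # 1 input, 2 hidden tanh units, 1 linear output
    w1, b1, w2 = w[0:2], w[2:4], w[4:6]
    return w2 @ np.tanh(w1 * x + b1)

w = np.array([0.5, -1.0, 0.1, 0.3, 1.2, 0.0])  # w2[1] = 0: unit 2 is redundant
xs = np.linspace(-2, 2, 25)

def grad(w, x, eps=1e-6):
    g = np.zeros_like(w)
    for i in range(len(w)):
        e = np.zeros_like(w)
        e[i] = eps
        g[i] = (mlp(w + e, x) - mlp(w - e, x)) / (2 * eps)
    return g

F = sum(np.outer(grad(w, x), grad(w, x)) for x in xs)
print("eigenvalues of F:", np.linalg.eigvalsh(F).round(8))  # two are ~0
```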

Book ChapterDOI
25 Apr 1995
TL;DR: It is proved that the non-monotonic learning algorithm that realizes these ideas converges asymptotically to the concept to be learned.
Abstract: In this paper we present a framework for learning non-monotonic logic programs. The method is parametric on a classical learning algorithm whose generated rules are to be understood as default rules. This means that these rules must be tolerant to the negative information by allowing for the possibility of exceptions. The same classical algorithm is then used to learn recursively these exceptions. We prove that the non-monotonic learning algorithm that realizes these ideas converges asymptotically to the concept to be learned. We also discuss various general issues concerning the problem of learning nonmonotonic theories in the proposed framework.

Proceedings ArticleDOI
27 Nov 1995
TL;DR: In this method data gathering is reduced to a minimum, yet modelling accuracy is uncompromised, and the authors' active querying criterion is determined by whether or not several models agree when they are fitted to random subsamples of a small amount of data.
Abstract: Uses the 'query-by-committee' approach for building an active scheme for data collection. In this method data gathering is reduced to a minimum, yet modelling accuracy is uncompromised. The authors' active querying criterion is determined by whether or not several models agree when they are fitted to random subsamples of a small amount of collected data. Experiments with neural network models to establish the feasibility of the authors' algorithm have produced encouraging results.
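A minimal sketch of the querying criterion: fit a committee of models to random subsamples of the small labelled set, then query the candidate input on which the committee's predictions disagree most. Polynomial regression members and variance as the disagreement measure are assumptions; the paper used neural network models.

```python
# Minimal sketch of the query-by-committee criterion described above: fit a
# committee of models to random subsamples of the small labelled set, then
# query the candidate input on which the committee disagrees most.
# Polynomial regression members and variance as the disagreement measure
# are assumptions; the paper used neural network models.
import numpy as np

rng = np.random.default_rng(0)
f = lambda x: np.sin(3 * x)                       # unknown target function
X_lab = rng.uniform(-1, 1, 8)                     # small labelled set
y_lab = f(X_lab) + 0.05 * rng.standard_normal(8)

committee = []
for _ in range(10):
    idx = rng.choice(len(X_lab), size=6, replace=False)  # random subsample
    committee.append(np.polyfit(X_lab[idx], y_lab[idx], deg=3))

candidates = np.linspace(-1, 1, 200)
preds = np.array([np.polyval(c, candidates) for c in committee])
disagreement = preds.var(axis=0)                  # committee variance
x_query = candidates[disagreement.argmax()]
print("next input to label:", round(float(x_query), 3))
```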

ReportDOI
01 Nov 1995
TL;DR: This paper investigates learning in a lifelong context where a learner faces a stream of learning tasks and proposes and evaluates several approaches to lifelong learning that generalize consistently more accurately from scarce training data than comparable "single-task" approaches.
Abstract: Machine learning has not yet succeeded in the design of robust learning algorithms that generalize well from very small datasets. In contrast, humans often generalize correctly from only a single training example, even if the number of potentially relevant features is large. To do so, they successfully exploit knowledge acquired in previous learning tasks, to bias subsequent learning. This paper investigates learning in a lifelong context. Lifelong learning addresses situations where a learner faces a stream of learning tasks. Such scenarios provide the opportunity for synergetic effects that arise if knowledge is transferred across multiple learning tasks. To study the utility of transfer, several approaches to lifelong learning are proposed and evaluated in an object recognition domain. It is shown that all these algorithms generalize consistently more accurately from scarce training data than comparable "single-task" approaches.

Proceedings Article
Lei Xu1
27 Nov 1995
TL;DR: A Bayesian-Kullback learning scheme, called the Ying-Yang Machine, is proposed based on two complementary but equivalent Bayesian representations of the joint density and their Kullback divergence.
Abstract: A Bayesian-Kullback learning scheme, called the Ying-Yang Machine, is proposed based on two complementary but equivalent Bayesian representations of the joint density and their Kullback divergence. The scheme not only unifies existing major supervised and unsupervised learning methods, including the classical maximum likelihood or least squares learning, maximum information preservation, the EM and em algorithms and information geometry, the recently popular Helmholtz machine, and other learning methods with new variants and new results; it also provides a number of new learning models.