
Showing papers on "Active learning (machine learning) published in 1991"


Journal ArticleDOI
TL;DR: Dyna as mentioned in this paper is an AI architecture that integrates learning, planning, and reactive execution, where learning methods are used both for compiling planning results and for updating a model of the effects of the agent's actions on the world.
Abstract: Dyna is an AI architecture that integrates learning, planning, and reactive execution. Learning methods are used in Dyna both for compiling planning results and for updating a model of the effects of the agent's actions on the world. Planning is incremental and can use the probabilistic and ofttimes incorrect world models generated by learning processes. Execution is fully reactive in the sense that no planning intervenes between perception and action. Dyna relies on machine learning methods for learning from examples---these are among the basic building blocks making up the architecture---yet is not tied to any particular method. This paper briefly introduces Dyna and discusses its strengths and weaknesses with respect to other architectures.

681 citations
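The integration the abstract describes, learning from real experience, updating a world model, and planning from that model, can be sketched as a toy Dyna-style loop (illustrative Python on an assumed 5-state corridor environment, not the paper's code):

```python
import random

# Toy Dyna-style sketch: a Q-learning agent learns from real steps,
# records a world model, and runs extra "planning" updates from the
# model after every real step. Environment and constants are assumed.
N, ACTIONS, GAMMA, ALPHA = 5, (0, 1), 0.9, 0.5   # actions: 0 = left, 1 = right

def step(s, a):
    """Assumed corridor: reward 1 on reaching the rightmost state."""
    s2 = max(0, s - 1) if a == 0 else min(N - 1, s + 1)
    return s2, 1.0 if s2 == N - 1 else 0.0

def dyna_q(episodes=20, planning_steps=10):
    random.seed(0)
    Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}
    model = {}                                   # learned model: (s, a) -> (s', r)

    def update(s, a, r, s2):
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, b)] for b in ACTIONS) - Q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != N - 1:
            a = random.choice(ACTIONS)           # exploratory behavior policy
            s2, r = step(s, a)
            update(s, a, r, s2)                  # learn from real experience
            model[(s, a)] = (s2, r)              # update the world model
            for _ in range(planning_steps):      # plan from simulated experience
                (ps, pa), (ps2, pr) = random.choice(list(model.items()))
                update(ps, pa, pr, ps2)
            s = s2

    return [max(ACTIONS, key=lambda a: Q[(s, a)]) for s in range(N - 1)]

print(dyna_q())   # greedy policy: 1 ("right") in every nonterminal state
```

Because planning replays model transitions many times per real step, reward information propagates back along the corridor far faster than with direct Q-learning alone.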



Book ChapterDOI
06 Mar 1991
TL;DR: This paper describes how continuous attributes can be converted economically into ordered discrete attributes before being given to the learning system, and suggests this change of representation does not often result in a significant loss of accuracy, but offers large reductions in learning time.
Abstract: The large real-world datasets now commonly tackled by machine learning algorithms are often described in terms of attributes whose values are real numbers on some continuous interval, rather than being taken from a small number of discrete values. Many algorithms are able to handle continuous attributes, but learning requires far more CPU time than for a corresponding task with discrete attributes. This paper describes how continuous attributes can be converted economically into ordered discrete attributes before being given to the learning system. Experimental results from a wide variety of domains suggest this change of representation does not often result in a significant loss of accuracy (in fact it sometimes significantly improves accuracy), but offers large reductions in learning time, typically more than a factor of 10 in domains with a large number of continuous attributes.

461 citations
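The pre-discretization step the abstract describes can be illustrated with a simple quantile-based binning sketch (illustrative Python; the paper's actual economical discretization procedure may differ):

```python
# Hypothetical pre-discretization in the spirit of the paper: replace each
# continuous attribute by the index of the interval its value falls into,
# so the learner only ever sees ordered discrete values.

def make_cut_points(values, n_bins=4):
    """Choose n_bins - 1 cut points at equally spaced quantiles."""
    v = sorted(values)
    return [v[len(v) * i // n_bins] for i in range(1, n_bins)]

def discretize(value, cuts):
    """Map a continuous value to an ordered bin index 0..len(cuts)."""
    return sum(value > c for c in cuts)

temps = [36.5, 37.0, 38.2, 39.9, 36.8, 37.4, 40.1, 38.8]   # made-up data
cuts = make_cut_points(temps)
print(cuts)
print([discretize(t, cuts) for t in temps])
```

After this one-off conversion, an algorithm that branches on discrete values never has to sort or scan the continuous attribute again, which is the source of the claimed learning-time savings.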


Proceedings Article
02 Dec 1991
TL;DR: It is shown that arbitrary distributions of binary vectors can be approximated by the combination model, how the weight vectors in the model can be interpreted as high order correlation patterns among the input bits, and how the combination machine can be used as a mechanism for detecting these patterns.
Abstract: We present a distribution model for binary vectors, called the influence combination model, and show how this model can be used as the basis for unsupervised learning algorithms for feature selection. The model can be represented by a particular type of Boltzmann machine with a bipartite graph structure that we call the combination machine. This machine is closely related to the Harmonium model defined by Smolensky. In the first part of the paper we analyze properties of this distribution representation scheme. We show that arbitrary distributions of binary vectors can be approximated by the combination model. We show how the weight vectors in the model can be interpreted as high order correlation patterns among the input bits, and how the combination machine can be used as a mechanism for detecting these patterns. We compare the combination model with the mixture model and with principal component analysis. In the second part of the paper we present two algorithms for learning the combination model from examples. The first learning algorithm is the standard gradient ascent heuristic for computing maximum likelihood estimates for the parameters of the model. Here we give a closed form for this gradient that is significantly easier to compute than the corresponding gradient for the general Boltzmann machine. The second learning algorithm is a greedy method that creates the hidden units and computes their weights one at a time. This method is a variant of projection pursuit density estimation. In the third part of the paper we give experimental results for these learning methods on synthetic data and on natural data of handwritten digit images.

350 citations


Proceedings ArticleDOI
08 Jul 1991
TL;DR: An original approach to neural modeling based on the idea of searching, with learning methods, for a synaptic learning rule which is biologically plausible and yields networks that are able to learn to perform difficult tasks is discussed.
Abstract: Summary form only given, as follows. The authors discuss an original approach to neural modeling based on the idea of searching, with learning methods, for a synaptic learning rule which is biologically plausible and yields networks that are able to learn to perform difficult tasks. The proposed method of automatically finding the learning rule relies on the idea of considering the synaptic modification rule as a parametric function. This function has local inputs and is the same in many neurons. The parameters that define this function can be estimated with known learning methods. For this optimization, particular attention is given to gradient descent and genetic algorithms. In both cases, estimation of this function consists of a joint global optimization of the synaptic modification function and the networks that are learning to perform some tasks. Both network architecture and the learning function can be designed within constraints derived from biological knowledge.

293 citations


Journal ArticleDOI
TL;DR: It is shown that the direct transmission term of the plant plays a crucial role in the error convergence of the learning process, and a sufficient condition is given for nonlinear systems to achieve the desired output by iterative learning control.

220 citations


Book ChapterDOI
16 Oct 1991
TL;DR: This paper shows that the existing approaches to learning from inconsistent examples are not sufficient, and a new method is suggested, which transforms the original decision table with unknown values into a new decision table in which every attribute value is known.
Abstract: In many real-life applications of machine learning, data are characterized by attributes with unknown values. This paper shows that the existing approaches to learning from such examples are not sufficient. A new method is suggested, which transforms the original decision table with unknown values into a new decision table in which every attribute value is known. Such a new table, in general, is inconsistent. This problem is solved by a technique of learning from inconsistent examples, based on rough set theory. Thus, two sets of rules, certain and possible, are induced. Certain rules are categorical, while possible rules are supported by existing data, although conflicting data may exist as well. The presented approach may be combined with any other approach to uncertainty when processing of possible rules is concerned.

195 citations
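The distinction between certain and possible rules can be illustrated with a minimal rough-set sketch (toy Python on an assumed inconsistent table, not the paper's data): examples that agree on all attributes form indiscernibility classes; a class with a single decision supports a certain rule, while a conflicting class supports only possible rules.

```python
from collections import defaultdict

# Assumed toy decision table: (attribute tuple, decision).
# The two ("high", "no") rows conflict, making the table inconsistent.
table = [
    (("high", "yes"), "flu"),
    (("high", "yes"), "flu"),
    (("high", "no"),  "flu"),
    (("high", "no"),  "cold"),
    (("low",  "no"),  "cold"),
]

# Group examples into indiscernibility classes by their attribute values.
classes = defaultdict(list)
for attrs, decision in table:
    classes[attrs].append(decision)

# Unanimous classes yield certain rules; mixed classes yield possible rules.
certain  = {a: ds[0] for a, ds in classes.items() if len(set(ds)) == 1}
possible = {a: sorted(set(ds)) for a, ds in classes.items() if len(set(ds)) > 1}
print(certain)
print(possible)
```

Certain rules here correspond to the lower approximation of each decision class, possible rules to the upper approximation.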


Book ChapterDOI
14 Jul 1991
TL;DR: In this paper, error-correcting output codes are employed as a distributed output representation to improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task.
Abstract: Multiclass learning problems involve finding a definition for an unknown function f(x) whose range is a discrete set containing k > 2 values (i.e., k "classes"). The definition is acquired by studying large collections of training examples of the form 〈Xi, f(Xi)〉. Existing approaches to this problem include (a) direct application of multiclass algorithms such as the decision-tree algorithms ID3 and CART, (b) application of binary concept learning algorithms to learn individual binary functions for each of the k classes, and (c) application of binary concept learning algorithms with distributed output codes such as those employed by Sejnowski and Rosenberg in the NETtalk system. This paper compares these three approaches to a new technique in which BCH error-correcting codes are employed as a distributed output representation. We show that these output representations improve the performance of ID3 on the NETtalk task and of backpropagation on an isolated-letter speech-recognition task. These results demonstrate that error-correcting output codes provide a general-purpose method for improving the performance of inductive learning programs on multiclass problems.

188 citations
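The decoding step behind error-correcting output codes can be sketched as follows (illustrative Python with made-up codewords, not the paper's BCH codes): each class is assigned a binary codeword, one binary learner is trained per bit, and a prediction is the class whose codeword is nearest in Hamming distance to the predicted bit vector.

```python
# Illustrative 7-bit codewords for 4 classes (assumed, minimum distance 4).
CODES = {
    "a": (0, 0, 0, 0, 0, 0, 0),
    "b": (0, 1, 1, 1, 1, 0, 0),
    "c": (1, 0, 1, 1, 0, 1, 0),
    "d": (1, 1, 0, 1, 0, 0, 1),
}

def hamming(u, v):
    """Count positions where two bit vectors disagree."""
    return sum(x != y for x, y in zip(u, v))

def decode(bits):
    """Pick the class whose codeword is closest to the predicted bits."""
    return min(CODES, key=lambda c: hamming(CODES[c], bits))

# Here two of the seven bit-learners are wrong, yet 'c' is still recovered.
print(decode((1, 0, 1, 0, 0, 1, 1)))
```

Because the codewords are spread apart in Hamming space, individual bit-learner errors can be corrected at decoding time, which is the mechanism behind the reported accuracy gains on multiclass tasks.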


Book ChapterDOI
11 Nov 1991
TL;DR: This paper gives a survey of the relationship between the fields of cryptography and machine learning, with an emphasis on how each field has contributed ideas and techniques to the other.
Abstract: This paper gives a survey of the relationship between the fields of cryptography and machine learning, with an emphasis on how each field has contributed ideas and techniques to the other. Some suggested directions for future cross-fertilization are also proposed.

110 citations


Book ChapterDOI
01 Jun 1991
TL;DR: Providing domain knowledge to the integrated system can decrease the amount of search required during learning and increase the accuracy of learned concepts, even when the domain knowledge is incorrect and incomplete.
Abstract: We describe a new approach to integrating explanation-based and empirical learning methods for learning relational concepts. The approach uses an information-based heuristic to evaluate components of a hypothesis that are proposed either by explanation-based or empirical learning methods. Providing domain knowledge to the integrated system can decrease the amount of search required during learning and increase the accuracy of learned concepts, even when the domain knowledge is incorrect and incomplete.

84 citations


Book ChapterDOI
01 Jan 1991
TL;DR: In this paper, a cascade-correlation learning algorithm has been used to predict real-valued timeseries, and the results of learning to predict the Mackey-Glass chaotic timeseries using Cascade-Correlation are compared with other neural net learning algorithms as well as standard techniques.
Abstract: The cascade-correlation learning algorithm has been shown to learn some binary output tasks 10-100 times more quickly than back-propagation. This paper shows that the cascade-correlation algorithm can be used to predict a real-valued timeseries. Results of learning to predict the Mackey-Glass chaotic timeseries using Cascade-Correlation are compared with other neural net learning algorithms as well as standard techniques. Learning speed results are presented in terms that allow easy comparison between cascade-correlation and other learning algorithms, independent of machine architecture or simulator implementation.

01 Jan 1991
TL;DR: A method of on-line adaptation and learning is proposed which makes use of a probing signal whose frequency content is concentrated at the bandwidth of the current controller.
Abstract: A method of on-line adaptation and learning is proposed which makes use of a probing signal whose frequency content is concentrated at the bandwidth of the current controller. As the plant is learned the procedure naturally increases the learning bandwidth.

Book ChapterDOI
06 Mar 1991
TL;DR: This paper looks at the issues in multi-agent machine learning and examines what effect the presence of multiple agents has on current learning methodologies.
Abstract: Many real world situations are currently being modelled as a set of cooperating intelligent agents. Trying to introduce learning into such a system requires dealing with the existence of multiple autonomous agents. The inherent distribution means that effective learning has to be based on a cooperative framework in which each agent contributes its part. In this paper we look at the issues in multi-agent machine learning and examine what effect the presence of multiple agents has on current learning methodologies. We describe a model for cooperative learning based on structured dialogue between the agents. MALE is an implementation of this model and we describe some results from it.

Book ChapterDOI
01 Jun 1991
TL;DR: A new method for learning to refine the control rules of approximate reasoning-based controllers that can use the control knowledge of an experienced operator and fine-tune it through the process of learning.
Abstract: Previous reinforcement learning models for learning control do not use existing knowledge of a physical system's behavior, but rather train the network from scratch. The learning process is usually long, and even after the learning is completed, the resulting network can not be easily explained. On the other hand, approximate reasoning-based controllers provide a clear understanding of the control strategy but can not learn from experience. In this paper, we introduce a new method for learning to refine the control rules of approximate reasoning-based controllers. A reinforcement learning technique is used in conjunction with a multi-layer neural network model of an approximate reasoning-based controller. The model learns by updating its prediction of the physical system's behavior. Unlike previous models, our model can use the control knowledge of an experienced operator and fine-tune it through the process of learning. We demonstrate the application of the new approach to a small but challenging real-world control problem.

Patent
23 Apr 1991
TL;DR: In this paper, a neural network system capable of performing integrated processing of a plurality of information includes a feature extractor group and an information processing unit for learning features of the learning data.
Abstract: A neural network system capable of performing integrated processing of a plurality of information includes a feature extractor group for extracting a plurality of learning feature data from learning data in a learning mode and a plurality of object feature data from object data to be processed in an execution mode, and an information processing unit for learning features of the learning data, based on the plurality of learning feature data from the feature extractor group and corresponding teacher data in the learning mode, and determining final learning result data from the plurality of object feature data from the feature extractor group in accordance with the learning result, including a logic representing relation between the plurality of object feature data in the execution mode.

Proceedings Article
24 Aug 1991
TL;DR: Comparative experiments show the derived Bayesian algorithm is consistently as good as or better than the several mature AI and statistical families of tree learning algorithms currently in use, although sometimes at computational cost.
Abstract: This paper describes how a competitive tree learning algorithm can be derived from first principles. The algorithm approximates the Bayesian decision theoretic solution to the learning task. Comparative experiments with the algorithm and the several mature AI and statistical families of tree learning algorithms currently in use show the derived Bayesian algorithm is consistently as good as or better, although sometimes at computational cost. Using the same strategy, we can design algorithms for many other supervised and model learning tasks given just a probabilistic representation for the kind of knowledge to be learned. As an illustration, a second learning algorithm is derived for learning Bayesian networks from data. Implications for incremental learning and the use of multiple models are also discussed.

Proceedings ArticleDOI
18 Nov 1991
TL;DR: The authors present a learning algorithm that uses a genetic algorithm for creating novel examples to teach multilayer feedforward networks and shows that the self-teaching neural networks not only reduce the teaching efforts of the human, but the genetically created examples also contribute robustly to the improvement of generalization performance and the interpretation of the connectionist knowledge.
Abstract: The authors introduce an active learning paradigm for neural networks. In contrast to the passive paradigm, the learning in the active paradigm is initiated by the machine learner instead of its environment or teacher. The authors present a learning algorithm that uses a genetic algorithm for creating novel examples to teach multilayer feedforward networks. The creative learning networks, based on their own knowledge, discover new examples, criticize and select useful ones, train themselves, and thereby extend their existing knowledge. Experiments on function extrapolation show that the self-teaching neural networks not only reduce the teaching efforts of the human, but the genetically created examples also contribute robustly to the improvement of generalization performance and the interpretation of the connectionist knowledge.
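The machine-initiated paradigm described above can be sketched as a simple uncertainty-driven loop (illustrative Python; the paper's genetic example-creation operator is replaced here by random candidate generation and a toy nearest-neighbor learner, so this is an assumed stand-in rather than the authors' algorithm):

```python
import random

def teacher(x):
    """Hypothetical oracle standing in for the human teacher."""
    return x >= 0.5

def predict_true_prob(x, labeled):
    """Toy learner: fraction of the 3 nearest labeled points labeled True."""
    nearest = sorted(labeled, key=lambda p: abs(p[0] - x))[:3]
    return sum(y for _, y in nearest) / len(nearest)

random.seed(1)
labeled = [(0.0, False), (1.0, True)]          # initial knowledge
for _ in range(5):
    candidates = [random.random() for _ in range(50)]   # self-generated examples
    # keep the candidate the learner is least sure about, then query the teacher
    x = min(candidates, key=lambda c: abs(predict_true_prob(c, labeled) - 0.5))
    labeled.append((x, teacher(x)))

print(len(labeled))   # 2 seed examples + 5 machine-initiated queries
```

The key inversion relative to passive learning is visible in the loop: the learner, not the teacher, decides which examples are worth labeling.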

Proceedings ArticleDOI
08 Jul 1991
TL;DR: A methodology for faster supervised temporal learning in nonlinear neural networks is presented; the concept of terminal teacher forcing is introduced and the activation dynamics of the neural network are modified appropriately.
Abstract: A methodology for faster supervised temporal learning in nonlinear neural networks is presented. The authors introduce the concept of terminal teacher forcing and appropriately modify the activation dynamics of the neural network. They also indicate how teacher forcing can be decreased as the learning proceeds. In order to make the algorithm more tangible, the authors compare its different phases to an important aspect of learning inspired by a real-life analogy. The results show that the learning time is reduced by one to two orders of magnitude with respect to conventional methods. The authors limited themselves to an example of representative complexity. It is demonstrated that a circular trajectory can be learned in about 400 iterations.

Book ChapterDOI
01 Jun 1991
TL;DR: Three extensions to the two basic learning algorithms are investigated and it is shown that the extensions can effectively improve the learning rate and in many cases even the asymptotic performance.
Abstract: AHC-learning and Q-learning are slow learning methods. This paper investigates three extensions to the two basic learning algorithms. The three extensions are 1) experience replay, 2) learning action models for planning, and 3) teaching. The basic algorithms and their extensions were evaluated using a dynamic environment as a testbed. The environment is nontrivial and nondeterministic. The results show that the extensions can effectively improve the learning rate and in many cases even the asymptotic performance.
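Extension 1), experience replay, can be sketched as Q-learning that re-presents a sample of stored real transitions after every new step (toy Python on an assumed 4-state corridor, not the paper's dynamic testbed):

```python
import random
from collections import deque

def q_learning_with_replay(episodes=15, alpha=0.3, gamma=0.9):
    """Q-learning on an assumed corridor: states 0..3, reward 1 at state 3."""
    random.seed(0)
    Q = {(s, a): 0.0 for s in range(4) for a in (0, 1)}
    buffer = deque(maxlen=200)              # stored real transitions

    def update(s, a, r, s2):
        Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, b)] for b in (0, 1)) - Q[(s, a)])

    for _ in range(episodes):
        s = 0
        while s != 3:
            a = random.choice((0, 1))       # exploratory behavior policy
            s2 = min(3, s + 1) if a == 1 else max(0, s - 1)
            r = 1.0 if s2 == 3 else 0.0
            update(s, a, r, s2)             # learn from the new transition
            buffer.append((s, a, r, s2))
            for t in random.sample(buffer, min(8, len(buffer))):
                update(*t)                  # replay stored experience
            s = s2

    return [max((0, 1), key=lambda a: Q[(s, a)]) for s in range(3)]

print(q_learning_with_replay())   # greedy policy moves right: [1, 1, 1]
```

Replaying old transitions lets each real experience drive many value updates, which is why the extension improves the learning rate of the basic algorithm.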

01 May 1991
TL;DR: This thesis abstracts a prescription for the preliminary design of design systems that learn, called M²LTD (Mapping Machine Learning To Design), based on the experience of building one instance called Bridger, which is built around two machine learning programs: COBWEB and Protos.
Abstract: This thesis addresses the issue of building design systems that acquire knowledge and improve their performance by using machine learning techniques. Until recently, no serious attempts have been made at assisting in the knowledge acquisition for the complete design process by machine learning techniques. Previous attempts at identifying machine learning techniques for design knowledge acquisition have not been successful, mainly because they have tried to support a complete design process with a single learning method. This thesis abstracts a prescription for the preliminary design of design systems that learn, from the experience of building one instance called Bridger. The prescription, called M²LTD (Mapping Machine Learning To Design), is based on the following steps: (1) analysis of the design process and its decomposition into a collection of smaller tasks; (2) identification of the representation of design objects used in each of the tasks described in Step (1) for the particular domain of interest, and of the strategies each task uses; (3) selection of closely related machine learning paradigms, also called generic learning tasks, that have the characteristics identified in Step (2); and (4) use of additional domain characteristics to select particular machine learning programs, from the collection available in each paradigm found in Step (3), that can acquire the knowledge in the right representation and support the strategies employed. Rarely will an existing machine learning program do the task as specified; but a close match reduces the effort, and M²LTD focuses that effort on the important modifications required. Bridger, the system that demonstrates M²LTD in the domain of the preliminary design of cable-stayed bridges, is built around two machine learning programs: COBWEB and Protos. COBWEB has been extended significantly along many dimensions to allow it to acquire and use synthesis knowledge.
Protos has been modified slightly to allow it to acquire and use redesign knowledge. Bridger's development is described and evaluated in light of M²LTD. Two other uses of M²LTD in the preliminary design of two design systems in other domains (i.e., ship design and the design of finite-element models) are briefly described; these further illustrate the potential of the approach.

Proceedings ArticleDOI
08 Jul 1991
TL;DR: A neural network architecture that autonomously learns to classify arbitrarily many, arbitrarily ordered vectors into recognition categories based on predictive success by using an internal controller that conjointly maximizes predictive generalization and minimizes predictive error.
Abstract: The authors present a neural network architecture, called ARTMAP, that autonomously learns to classify arbitrarily many, arbitrarily ordered vectors into recognition categories based on predictive success. This supervised learning system is built up from a pair of adaptive resonance theory modules that are capable of self-organizing stable recognition categories in response to arbitrary sequences of input patterns. Tested on a benchmark machine learning database in both online and offline simulations, the ARTMAP system learned orders of magnitude more quickly, efficiently, and accurately than alternative algorithms, and achieved 100% accuracy after training on less than half the input patterns in the database. It achieves these properties by using an internal controller that conjointly maximizes predictive generalization and minimizes predictive error by linking predictive success to category size on a trial-by-trial basis, using only local operations.

Book ChapterDOI
01 Jun 1991
TL;DR: A new learning algorithm and an architecture that allows transfer of learning by the "sharing" of solutions to the common parts of multiple tasks is presented.
Abstract: Most "weak" learning algorithms, including reinforcement learning methods, have been applied on tasks with single goals. The effort to build more sophisticated learning systems that operate in complex environments will require the ability to handle multiple goals. Methods that allow transfer of learning will play a crucial role in learning systems that support multiple goals. In this paper I describe a class of multiple tasks that represents a subset of routine animal activity. I present a new learning algorithm and an architecture that allows transfer of learning by the "sharing" of solutions to the common parts of multiple tasks. A proof of the algorithm is also provided.

Proceedings ArticleDOI
13 May 1991
TL;DR: A machine learning method is used to construct a kind of associative memory which features a sophisticated local interpolation scheme and fast searching algorithms, suitable for applications in domains involving complex nonlinear systems.
Abstract: A machine learning method, suitable for applications in domains involving complex nonlinear systems, is presented. The learning algorithm is used to construct a kind of associative memory which features a sophisticated local interpolation scheme and fast searching algorithms. Experiments with the implemented algorithm in the acquisition of the dynamics model for the well-known pole-balancing system verify the algorithm's theoretically derived time and space requirements and demonstrate its efficiency.

Journal ArticleDOI
TL;DR: OCCAM is a computational model of this learning task that integrates three learning methods: similarity-based learning (SBL), explanation-based learning (EBL), and theory-driven learning (TDL), and the strengths and weaknesses of each learning method are described.
Abstract: We analyze the types of information that human learners rely on in the acquisition of predictive and explanatory knowledge. We present OCCAM, a computational model of this learning task that integrates three learning methods: similarity-based learning (SBL), explanation-based learning (EBL), and theory-driven learning (TDL). We focus on the strengths and weaknesses of each learning method and describe how they can be integrated in a complementary fashion. The goal of this integration is to provide a learning architecture that accounts for the effects of prior knowledge on human learning. The integration helps to explain how a learner can learn rapidly when new experiences are consistent with prior knowledge while still retaining the ability to learn in novel domains (although more slowly). In addition, we present experimental evidence that an integrated model converges on accurate concepts more rapidly than either method applied individually. OCCAM is unique among learning models in that it can make use o...

Book ChapterDOI
01 Jun 1991
TL;DR: This paper describes an inductive system for learning search control rules and compares it with an existing explanation-based learning system.
Abstract: The computational complexity of planning has motivated significant efforts in machine learning. However, much of this work has concentrated on explanation-based learning techniques. An alternative approach is to use inductive learning. An inductive approach does not require a complete and tractable domain theory to be encoded and has the potential to create more effective rules by learning from more than one example at a time. In this paper, we describe an inductive system for learning search control rules and compare it with an existing explanation-based learning system.

Journal ArticleDOI
01 Mar 1991
TL;DR: It is shown that the learning task addressed by connectionist methods, including the back-propagation method, is computationally intractable, and it is proposed that the power of neural networks may be enhanced by developing task-specific connectionist methods.
Abstract: A set of experiments that precisely identify the power and limitations of the method of back-propagation is reported. The experiment on learning to compute the exclusive-OR function suggests that the computational efficiency of learning by the method of back-propagation depends on the initial weights in the network. The experiment on learning to play tic-tac-toe suggests that the information content of what is learned by the back-propagation method is dependent on the initial abstractions in the network. It also suggests that these abstractions are a major source of power for learning in parallel distributed processing networks. In addition, it is shown that the learning task addressed by connectionist methods, including the back-propagation method, is computationally intractable. These experimental and theoretical results strongly indicate that current connectionist methods may be too limited for the complex task of learning they seek to solve. It is proposed that the power of neural networks may be enhanced by developing task-specific connectionist methods.

Journal ArticleDOI
01 Jul 1991
TL;DR: A survey of the state of the art in learning systems (automata and neural networks) which are of increasing importance in both theory and practice is presented.
Abstract: A survey of the state of the art in learning systems (automata and neural networks), which are of increasing importance in both theory and practice, is presented. Learning systems are a response to engineering design problems arising from nonlinearities and uncertainty. Definitions and properties of learning systems are detailed. An analysis of the reinforcement schemes which are the heart of learning systems is given. Some results related to the asymptotic properties of learning automata are presented, as well as learning-system models covering at the same time the controller (optimiser) and the controlled process (the criterion to be optimised). Two learning schemes for neural network synthesis are presented. Several applications of learning systems are also described.

Proceedings ArticleDOI
08 Jul 1991
TL;DR: If a learning system is able to provide some estimate of the reliability of the generalizations it produces, then the rate of learning can be considerably increased, and experience becomes concentrated in regions of the control space which are relevant to the task at hand.
Abstract: It is shown that if a learning system is able to provide some estimate of the reliability of the generalizations it produces, then the rate of learning can be considerably increased. The increase is achieved by a decision-theoretic estimate of the value of trying alternative experimental actions. A further consequence of this kind of learning is that experience becomes concentrated in regions of the control space which are relevant to the task at hand. Such a restriction of experience is essential for continuous multivariate control tasks because the entire state space of such tasks could not possibly be learned in a practical amount of time.

Proceedings ArticleDOI
18 Nov 1991
TL;DR: The authors explore a learning process of recurrent neural networks in the learning surface where learning is executed, and show that the learning is gently descending on the steepest gradient forward along the bottom of a curved valley.
Abstract: The authors explore a learning process of recurrent neural networks in the learning surface where learning is executed. Computer simulations show that the learning, which is the process of searching for optimal adjustable parameters, is gently descending on the steepest gradient forward along the bottom of a curved valley. This also means that the learning surface has a specific shape. These characteristics in learning are basically consistent with those of the multilayer neural networks analyzed by Gouhara et al.

Proceedings ArticleDOI
13 Aug 1991
TL;DR: A comparison of the relative merits of these two learning methods on reinforcement learning tasks shows that the reinforcement learning method used performs better than the supervised learning method, which provides grounds for believing that similar performance differences can be expected on other reinforcementlearning tasks as well.
Abstract: The forward modeling approach of M.I. Jordan and J.E. Rumelhart (1990) has been shown to be applicable when supervised learning methods are to be used for solving reinforcement learning tasks. Because such tasks are natural candidates for the application of reinforcement learning methods, there is a need to evaluate the relative merits of these two learning methods on reinforcement learning tasks. The author presents one such comparison on a task involving learning to control an unstable, nonminimum phase, dynamic system. The comparison shows that the reinforcement learning method used performs better than the supervised learning method. An examination of the learning behavior of the two methods indicates that the differences in performance can be attributed to the underlying mechanics of the two learning methods, which provides grounds for believing that similar performance differences can be expected on other reinforcement learning tasks as well.