
Showing papers on "Unsupervised learning" published in 1988


Book
01 Jan 1988
TL;DR: This book provides a clear and simple account of the key ideas and algorithms of reinforcement learning, ranging from the history of the field's intellectual foundations to the most recent developments and applications.
Abstract: Reinforcement learning, one of the most active research areas in artificial intelligence, is a computational approach to learning whereby an agent tries to maximize the total amount of reward it receives when interacting with a complex, uncertain environment. In Reinforcement Learning, Richard Sutton and Andrew Barto provide a clear and simple account of the key ideas and algorithms of reinforcement learning. Their discussion ranges from the history of the field's intellectual foundations to the most recent developments and applications. The only necessary mathematical background is familiarity with elementary concepts of probability. The book is divided into three parts. Part I defines the reinforcement learning problem in terms of Markov decision processes. Part II provides basic solution methods: dynamic programming, Monte Carlo methods, and temporal-difference learning. Part III presents a unified view of the solution methods and incorporates artificial neural networks, eligibility traces, and planning; the two final chapters present case studies and consider the future of reinforcement learning.

37,989 citations


Journal ArticleDOI
TL;DR: This work presents one such algorithm that learns disjunctive Boolean functions, along with variants for learning other classes of Boolean functions.
Abstract: Valiant (1984) and others have studied the problem of learning various classes of Boolean functions from examples. Here we discuss incremental learning of these functions. We consider a setting in which the learner responds to each example according to a current hypothesis. Then the learner updates the hypothesis, if necessary, based on the correct classification of the example. One natural measure of the quality of learning in this setting is the number of mistakes the learner makes. For suitable classes of functions, learning algorithms are available that make a bounded number of mistakes, with the bound independent of the number of examples seen by the learner. We present one such algorithm that learns disjunctive Boolean functions, along with variants for learning other classes of Boolean functions. The basic method can be expressed as a linear-threshold algorithm. A primary advantage of this algorithm is that the number of mistakes grows only logarithmically with the number of irrelevant attributes in the examples. At the same time, the algorithm is computationally efficient in both time and space.
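The mistake-bounded linear-threshold learner this abstract describes can be illustrated with a minimal Winnow-style sketch (the promotion/demotion factor of 2, the threshold of n/2, and the function name are illustrative assumptions, not details from the paper):

```python
def winnow(examples, n, threshold=None):
    """Winnow-style learner for monotone disjunctions over n Boolean attributes.

    Multiplicative weight updates are made only on mistakes, so the number of
    mistakes, not the number of examples, bounds the amount of learning work.
    """
    if threshold is None:
        threshold = n / 2.0
    w = [1.0] * n
    mistakes = 0
    for x, label in examples:
        pred = sum(w[i] for i in range(n) if x[i]) >= threshold
        if pred and not label:          # false positive: demote active weights
            mistakes += 1
            for i in range(n):
                if x[i]:
                    w[i] /= 2.0
        elif label and not pred:        # false negative: promote active weights
            mistakes += 1
            for i in range(n):
                if x[i]:
                    w[i] *= 2.0
    return w, mistakes
```

Because demotions halve and promotions double, weights of irrelevant attributes stay small while the mistake count grows only logarithmically with their number, which is the property the abstract highlights.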

1,669 citations


Book
01 May 1988
TL;DR: In this paper, competitive learning is applied to parallel networks of neuron-like elements to discover salient, general features which can be used to classify a set of stimulus input patterns, and these feature detectors form the basis of a multilayer system that serves to learn categorizations of stimulus sets which are not linearly separable.
Abstract: This paper reports the results of our studies with an unsupervised learning paradigm which we have called "Competitive Learning". We have examined competitive learning using both computer simulation and formal analysis and have found that when it is applied to parallel networks of neuron-like elements, many potentially useful learning tasks can be accomplished. We were attracted to competitive learning because it seems to provide a way to discover the salient, general features which can be used to classify a set of patterns. We show how a very simple competitive mechanism can discover a set of feature detectors which capture important aspects of the set of stimulus input patterns. We also show how these feature detectors can form the basis of a multilayer system that can serve to learn categorizations of stimulus sets which are not linearly separable. We show how the use of correlated stimuli can serve as a kind of "teaching" input to the system to allow the development of feature detectors which would not develop otherwise. Although we find the competitive learning mechanism a very interesting and powerful learning principle, we do not, of course, imagine that it is the only learning principle. Competitive learning is an essentially nonassociative statistical learning scheme. We certainly imagine that other kinds of learning mechanisms will be involved in the building of associations among patterns of activation in a more complete neural network. We offer this analysis of these competitive learning mechanisms to further our understanding of how simple adaptive networks can discover features important in the description of the stimulus environment in which the system finds itself.
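The winner-take-all update at the heart of competitive learning can be sketched as follows (learning rate, unit count, and initialization are illustrative; the normalization of active input lines approximates the paper's formal rule):

```python
import numpy as np

def competitive_learning(X, n_units, lr=0.1, epochs=10, seed=0):
    """Hard competitive learning: only the most activated unit moves its weights."""
    rng = np.random.default_rng(seed)
    W = rng.random((n_units, X.shape[1]))
    W /= W.sum(axis=1, keepdims=True)          # keep each unit's weight vector normalized
    for _ in range(epochs):
        for x in X:
            winner = np.argmax(W @ x)          # unit most activated by this pattern
            # move the winner toward the (normalized) active input lines
            W[winner] += lr * (x / max(x.sum(), 1e-9) - W[winner])
    return W
```

On data drawn from distinct stimulus clusters, different units come to win for different clusters, i.e. the units become feature detectors for the salient pattern groups.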

1,319 citations


Journal ArticleDOI
TL;DR: The authors used adaptive network theory to extend the Rescorla-Wagner (1972) least mean squares (LMS) model of associative learning to phenomena of human learning and judgment.
Abstract: We used adaptive network theory to extend the Rescorla-Wagner (1972) least mean squares (LMS) model of associative learning to phenomena of human learning and judgment. In three experiments subjects learned to categorize hypothetical patients with particular symptom patterns as having certain diseases. When one disease is far more likely than another, the model predicts that subjects will substantially overestimate the diagnosticity of the more valid symptom for the rare disease. The results of Experiments 1 and 2 provide clear support for this prediction in contradistinction to predictions from probability matching, exemplar retrieval, or simple prototype learning models. Experiment 3 contrasted the adaptive network model with one predicting pattern-probability matching when patients always had four symptoms (chosen from four opponent pairs) rather than the presence or absence of each of four symptoms, as in Experiment 1. The results again support the Rescorla-Wagner LMS learning rule as embedded within an adaptive network model.
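The Rescorla-Wagner/LMS rule the experiments build on fits in a few lines (learning rate and trial format are illustrative assumptions); the shared prediction-error term is what produces cue-competition effects such as blocking:

```python
def rescorla_wagner(trials, n_cues, lr=0.1):
    """LMS / Rescorla-Wagner learning: associative strengths track prediction error.

    Each trial is (cues, outcome), where cues is a 0/1 presence vector.
    All present cues share one error signal, so a cue that already predicts
    the outcome blocks learning about a redundant cue.
    """
    V = [0.0] * n_cues
    for cues, outcome in trials:
        pred = sum(V[i] for i in range(n_cues) if cues[i])
        error = outcome - pred
        for i in range(n_cues):
            if cues[i]:
                V[i] += lr * error
    return V
```

Training cue A alone to predict the outcome and then presenting the compound A+B leaves B with almost no associative strength, the blocking pattern the adaptive network account exploits.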

844 citations


Journal ArticleDOI
TL;DR: An appropriate causal learning law for inductively inferring FCMs from time-series data is the differential Hebbian law, which modifies causal connections by correlating time derivatives of FCM node outputs.
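A discrete-time sketch of a differential Hebbian update of this kind follows (the decay and learning-rate constants are illustrative assumptions): causal edge weights grow only when the connected nodes change together.

```python
def differential_hebbian(x_series, decay=0.1, lr=0.5):
    """Differential Hebbian law: correlate time derivatives of node outputs.

    x_series is a list of node-output vectors over time; W[i][j] estimates the
    causal edge from node i to node j and decays toward zero when the nodes
    stop changing together.
    """
    n = len(x_series[0])
    W = [[0.0] * n for _ in range(n)]
    for t in range(1, len(x_series)):
        dx = [x_series[t][i] - x_series[t - 1][i] for i in range(n)]  # discrete derivatives
        for i in range(n):
            for j in range(n):
                if i != j:
                    W[i][j] += lr * (dx[i] * dx[j] - decay * W[i][j])
    return W
```

Unlike a plain Hebbian product of activations, constant co-activation contributes nothing here; only concurrent change strengthens an edge, which is why the law suits inferring FCM causal links from time-series data.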

320 citations


Proceedings Article
01 Jan 1988
TL;DR: Parallelizable optimization techniques such as the Polak-Ribiere method are significantly more efficient than the Backpropagation algorithm, as demonstrated on small Boolean learning problems and the noisy real-valued learning problem of hand-written character recognition.
Abstract: Parallelizable optimization techniques are applied to the problem of learning in feedforward neural networks. In addition to having superior convergence properties, optimization techniques such as the Polak-Ribiere method are also significantly more efficient than the Backpropagation algorithm. These results are based on experiments performed on small boolean learning problems and the noisy real-valued learning problem of hand-written character recognition.
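The Polak-Ribiere method referred to above can be sketched as follows (a fixed step size stands in for the line search a real implementation would use, and the restart-on-negative-beta rule is a common practical choice, not necessarily the paper's):

```python
import numpy as np

def polak_ribiere(f_grad, x0, steps=50, lr=0.05):
    """Polak-Ribiere conjugate gradient sketch with a fixed-step stand-in for line search."""
    x = np.asarray(x0, dtype=float)
    g = f_grad(x)
    d = -g                                   # initial search direction: steepest descent
    for _ in range(steps):
        x = x + lr * d                       # step along the current search direction
        g_new = f_grad(x)
        beta = g_new @ (g_new - g) / max(g @ g, 1e-12)   # Polak-Ribiere coefficient
        beta = max(beta, 0.0)                # restart (plain gradient step) if beta < 0
        d = -g_new + beta * d                # new conjugate direction
        g = g_new
    return x
```

For network training, f_grad would be the backpropagation gradient of the error with respect to all weights; the conjugate direction reuses gradient history instead of the fixed-rate steps of plain backpropagation.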

157 citations


Proceedings ArticleDOI
Kung, Hwang
24 Jul 1988
TL;DR: An algebraic projection (AP) analysis method is proposed that provides an analytical solution to both critical issues in back-propagation (BP) learning: the number of hidden units and the learning rate of the BP rule.
Abstract: Two critical issues in back-propagation (BP) learning are the discrimination capability given a number of hidden units and the speed of convergence in learning. The number of hidden units must be sufficient to provide the discriminating capability required by the given application. On the other hand, the training of an excessively large number of synaptic weights may be computationally costly and unreliable. This makes it desirable to have an a priori estimate of an optimal number of hidden neurons. Another closely related issue is the learning rate of the BP rule. In general, it is desirable to have fast learning, but not so fast as to bring about instability of the iterative computation. An algebraic projection (AP) analysis method is proposed that provides an analytical solution to both of these problems. If the training patterns are completely irregular, then the predicted optimal number of hidden neurons is the same as the number of training patterns. In the case of regularity-embedded patterns, the number of hidden neurons will depend on the type of regularity inherent. The optimal learning rate parameter is found to be inversely proportional to the number of hidden neurons.
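The two quantitative conclusions of the abstract can be stated as a tiny helper (the constant c is an illustrative placeholder; the paper derives the actual proportionality, and the regular-pattern case depends on the type of regularity, which is not modeled here):

```python
def ap_estimates(n_patterns, c=1.0):
    """Sketch of the AP analysis conclusions for completely irregular patterns:
    hidden units = number of training patterns, learning rate ~ 1 / n_hidden."""
    n_hidden = n_patterns      # regularity-embedded patterns would need fewer
    lr = c / n_hidden          # optimal rate inversely proportional to hidden count
    return n_hidden, lr
```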

147 citations


Proceedings Article
01 Jan 1988
TL;DR: An optimality principle is proposed for training an unsupervised feedforward neural network based upon maximal ability to reconstruct the input data from the network outputs and an algorithm which can be used to train either linear or nonlinear networks with certain types of nonlinearity.
Abstract: We propose an optimality principle for training an unsupervised feedforward neural network based upon maximal ability to reconstruct the input data from the network outputs. We describe an algorithm which can be used to train either linear or nonlinear networks with certain types of nonlinearity. Examples of applications to the problems of image coding, feature detection, and analysis of random-dot stereograms are presented.
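In the linear case, the reconstruction principle reduces to training a two-layer network to minimize squared reconstruction error; a minimal gradient-descent sketch (sizes, rates, and epoch counts are illustrative assumptions, and the paper's nonlinear variants are not covered):

```python
import numpy as np

def train_autoencoder(X, n_hidden, lr=0.02, epochs=500, seed=0):
    """Unsupervised training by reconstruction: minimize ||X W_enc W_dec - X||^2."""
    rng = np.random.default_rng(seed)
    n_in = X.shape[1]
    W_enc = rng.normal(scale=0.1, size=(n_in, n_hidden))
    W_dec = rng.normal(scale=0.1, size=(n_hidden, n_in))
    for _ in range(epochs):
        H = X @ W_enc                     # code: network outputs
        X_hat = H @ W_dec                 # reconstruction of the input from the code
        E = X_hat - X                     # reconstruction error
        W_dec -= lr * H.T @ E             # gradient of squared error w.r.t. decoder
        W_enc -= lr * X.T @ (E @ W_dec.T) # gradient w.r.t. encoder
    return W_enc, W_dec
```

With fewer hidden units than inputs, minimizing reconstruction error forces the code to capture the directions along which the data actually varies, which is why the same principle applies to image coding and feature detection.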

126 citations


Journal ArticleDOI
TL;DR: This work evaluated whether and when focused sampling benefits observational learning, investigated the effects of different distributions of systematic and unsystematic features, and compared observational learning to learning with feedback.

81 citations


Patent
19 Jan 1988
TL;DR: In this paper, a learning algorithm for the N-dimensional Coulomb network is disclosed which is applicable to multi-layer networks; the central concept is to define a potential energy of a collection of memory sites.
Abstract: A learning algorithm for the N-dimensional Coulomb network is disclosed which is applicable to multi-layer networks. The central concept is to define a potential energy of a collection of memory sites. Then each memory site is an attractor of other memory sites. With the proper definition of attractive and repulsive potentials between various memory sites, it is possible to minimize the energy of the collection of memories. By this method, internal representations may be "built-up" one layer at a time. Following the method of Bachmann et al. a system is considered in which memories of events have already been recorded in a layer of cells. A method is found for the consolidation of the number of memories required to correctly represent the pattern environment. This method is shown to be applicable to a supervised or unsupervised learning paradigm in which pairs of input and output patterns are presented sequentially to the network. The resulting learning procedure develops internal representations in an incremental or cumulative fashion, from the layer closest to the input, to the output layer.

80 citations


Proceedings ArticleDOI
Williams
24 Jul 1988
TL;DR: A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks and it is observed that such an approach even permits a seamless blend of associatives reinforcement learning and supervised learning within the same network.
Abstract: A description is given of several ways that backpropagation can be useful in training networks to perform associative reinforcement learning tasks. One way is to train a second network to model the environmental reinforcement signal and to backpropagate through this network into the first network. This technique has been proposed and explored previously in various forms. Another way is based on the use of the REINFORCE algorithm and amounts to backpropagating through deterministic parts of the network while performing a correlation-style computation where the behavior is stochastic. A third way, which is an extension of the second, allows backpropagation through the stochastic parts of the network as well. The mathematical validity of this third technique rests on the use of continuous-valued stochastic units. Some implications of this result for using supervised learning to train networks of stochastic units are noted, and it is also observed that such an approach even permits a seamless blend of associative reinforcement learning and supervised learning within the same network.
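The second technique, backpropagating through the deterministic part while handling the stochastic part with a correlation-style (REINFORCE) computation, can be sketched for a single Gaussian output unit (the environment, the running-baseline rule, and all constants are illustrative assumptions, not Williams' exact setup):

```python
import random

def reinforce_gaussian(env_reward, n_inputs, episodes=3000, lr=0.05, sigma=0.3, seed=0):
    """One Gaussian unit: deterministic mean mu = w.x, stochastic action a ~ N(mu, sigma).

    The weight update correlates the noise (a - mu) with reward relative to a
    running baseline; the factor x[i] is the backpropagated deterministic part.
    """
    rng = random.Random(seed)
    w = [0.0] * n_inputs
    baseline = 0.0
    for _ in range(episodes):
        x = [1.0] + [rng.uniform(-1.0, 1.0) for _ in range(n_inputs - 1)]  # bias + input
        mu = sum(wi * xi for wi, xi in zip(w, x))   # deterministic part
        a = rng.gauss(mu, sigma)                    # stochastic part
        r = env_reward(x, a)
        elig = (a - mu) / sigma ** 2                # d log N(a; mu, sigma) / d mu
        for i in range(n_inputs):
            w[i] += lr * (r - baseline) * elig * x[i]
        baseline += 0.1 * (r - baseline)            # track average reward
    return w
```

The unit needs no explicit error signal: actions that earn more than the baseline reward pull the mean toward themselves, which is the correlation-style computation the abstract describes.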

Patent
30 Mar 1988
TL;DR: The subject neural network as discussed by the authors was created by adding a non-linear layer to a more standard neural network architecture to expand a functional input space to a signal set including orthonormal elements, when the input signal is visualized as a vector representation.
Abstract: A neural network system includes means for accomplishing artificial intelligence functions in three formerly divergent implementations. These functions include: supervised learning, unsupervised learning, and associative memory storage and retrieval. The subject neural network is created by addition of a non-linear layer to a more standard neural network architecture. The non-linear layer functions to expand a functional input space to a signal set including orthonormal elements, when the input signal is visualized as a vector representation. An input signal is selectively passed to a non-linear transform circuit, which outputs a transform signal therefrom. Both the input signal and the transform signal are placed in communication with a first layer of a plurality of processing nodes. An improved hardware implementation of the subject system includes a highly parallel, hybrid analog/digital circuitry. Included therein is a digitally addressed, random access memory means for storage and retrieval of an analog signal.

Proceedings ArticleDOI
24 Jul 1988
TL;DR: In this article, the unconditional stability of Hebbian learning systems is summarized in the adaptive bidirectional associative memory (ABAM) theorem, and sufficient conditions for global stability are established for dynamical systems that adapt according to competitive and differential Hebbians learning laws.
Abstract: Global stability is examined for nonlinear feedback dynamical systems subject to unsupervised learning. Only differentiable neural models are discussed. The unconditional stability of Hebbian learning systems is summarized in the adaptive bidirectional associative memory (ABAM) theorem. When no learning occurs, the resulting BAM models include Cohen-Grossberg autoassociators, Hopfield circuits, brain-state-in-a-box models, and masking field models. The ABAM theorem is extended to arbitrary higher-order Hebbian learning. Conditions for exponential convergence are discussed. Sufficient conditions for global stability are established for dynamical systems that adapt according to competitive and differential Hebbian learning laws.

Journal ArticleDOI
TL;DR: Although neural net models show great promise in areas where traditional AI approaches falter, their success is constrained by slow learning rates; moreover, schemes such as error back-propagation are implausible as biological models.

Proceedings ArticleDOI
24 Jul 1988
TL;DR: The author describes a paradigm for creating novel examples from the class of patterns recognized by a trained gradient-descent associative learning network; it can be used for creative problems, such as music composition, which are not described by an input-output mapping.
Abstract: The author describes a paradigm for creating novel examples from the class of patterns recognized by a trained gradient-descent associative learning network. The paradigm consists of a learning phase, in which the network learns to identify patterns of the desired class, followed by a simple synthesis algorithm, in which a haphazard 'creation' is refined by a gradient-descent search complementary to the one used in learning. This paradigm is an alternative to one in which novel patterns are obtained by applying novel inputs to a learned mapping, and can be used for creative problems, such as music composition, which are not described by an input-output mapping. A simple simulation is shown in which a back-propagation network learns to judge simple patterns representing musical motifs, and then creates similar motifs.
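The synthesis phase can be sketched abstractly as gradient ascent on the input rather than the weights (here score_grad stands in for backpropagation through the trained judge network, and the quadratic score in the example is a toy stand-in):

```python
import numpy as np

def synthesize(score_grad, x0, steps=100, lr=0.1):
    """Creation phase: refine a haphazard initial pattern by climbing the
    judge network's score, the gradient search complementary to learning."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        x += lr * score_grad(x)    # ascend the score with respect to the input
    return x
```

Learning adjusts weights to score good patterns highly; synthesis then holds the weights fixed and adjusts the pattern itself, so no input-output mapping is ever needed.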


Journal ArticleDOI
TL;DR: In this article, a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis is described, and the performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier.
Abstract: This paper describes a procedure for classifying tissue types from unlabeled acoustic measurements (data type unknown) using unsupervised cluster analysis. These techniques are being applied to unsupervised ultrasonic image segmentation and tissue characterization. The performance of a new clustering technique is measured and compared with supervised methods, such as a linear Bayes classifier. In these comparisons two objectives are sought: a) How well does the clustering method group the data? b) Do the clusters correspond to known tissue classes? The first question is investigated by a measure of cluster similarity and dispersion. The second question involves a comparison with a supervised technique using labeled data.
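As a stand-in for the paper's clustering technique, unsupervised grouping of unlabeled measurements can be sketched with plain k-means (purely illustrative; the paper's method and its similarity/dispersion measures are its own):

```python
import numpy as np

def kmeans(X, k, iters=20, seed=0):
    """Cluster unlabeled measurement vectors into k groups without any tissue labels."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]   # init from random samples
    for _ in range(iters):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)                            # assign to nearest center
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(0)     # recompute cluster means
    return labels, centers
```

The paper's two evaluation questions map directly onto this output: cluster dispersion measures how well the groups cohere, and comparing labels against a supervised classifier's output checks whether the clusters correspond to known tissue classes.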

Proceedings ArticleDOI
Hassoun, Clark
24 Jul 1988
TL;DR: Simulation results for the adaptive algorithm for supervised learning in single-layer neural networks are shown to be superior to those of the Widrow-Hoff (or least-mean-squares) adaptive learning algorithm.
Abstract: An adaptive algorithm for supervised learning in single-layer neural networks is proposed. The algorithm is characterized by fast convergence and high learning accuracy. It also allows for attentive learning and control of the dynamics of single-layer neural networks. This learning algorithm is based on the Ho-Kashyap associative neural memory (ANM) recording algorithm and is suited for the learning and association of binary patterns. Simulation results for the algorithm are shown to be superior to those of the Widrow-Hoff (or least-mean-squares) adaptive learning algorithm.

Journal ArticleDOI
TL;DR: In this article, classification accuracies of unsupervised classification methods were evaluated for Landsat TM data in comparison with a conventional supervised maximum likelihood classification method for high-resolution satellite data.
Abstract: Supervised classification methods have been mainly used for land‐cover/use classification from the viewpoint of classification accuracy, especially in areas where detailed land use dominates, as in Japan. However, for high-ground-resolution image data such as Landsat TM and SPOT HRV data, it has become clear that the classification accuracy of supervised classifications is lower than expected. One of the major reasons for this is the difficulty of selecting sufficient training data. This problem may be solved by using an unsupervised learning method because of its independent sampling characteristics. However, quantitative evaluations of the performance of unsupervised classification methods for high-resolution satellite data are not yet established. In this study, classification accuracies of unsupervised classification methods were evaluated for Landsat TM data in comparison with a conventional supervised maximum likelihood classification method.


Proceedings ArticleDOI
01 Dec 1988
TL;DR: The efficiency of learning from unclassified data (unsupervised learning) is examined by constructing a framework similar in style to the recent work on supervised concept learning inspired by Valiant; the framework is compared to both the supervised learnability model and other models of unsupervised learning.
Abstract: The efficiency of learning from unclassified data (unsupervised learning) is examined by constructing a framework similar in style to the recent work on supervised concept learning inspired by Valiant. We define the framework and illustrate it with results on three model classes. The framework is compared to both the supervised learnability model and other models of unsupervised learning.

Journal ArticleDOI
TL;DR: Unsupervised learning techniques are applied to the problems of detecting tumors within an organ and of discriminating between tissue types of two neighboring organs such as the liver and the kidney.
Abstract: The application of a procedure for classifying tissue types from unlabeled acoustic measurements using unsupervised analysis is reviewed and evaluated. Unsupervised learning techniques are applied to the problems of detecting tumors within an organ and of discriminating between tissue types of two neighboring organs such as the liver and the kidney.

Proceedings ArticleDOI
03 May 1988
TL;DR: The drive-reinforcement neuronal model is described as an example of a newly discovered class of real-time learning mechanisms that correlate earlier derivatives of inputs with later derivatives of outputs as discussed by the authors.
Abstract: The drive-reinforcement neuronal model is described as an example of a newly discovered class of real-time learning mechanisms that correlate earlier derivatives of inputs with later derivatives of outputs. The drive-reinforcement neuronal model has been demonstrated to predict a wide range of classical conditioning phenomena in animal learning. A variety of classes of connectionist and neural network models have been investigated in recent years (Hinton and Anderson, 1981; Levine, 1983; Barto, 1985; Feldman, 1985; Rumelhart and McClelland, 1986). After a brief review of these models, discussion will focus on the class of real-time models because they appear to be making the strongest contact with the experimental evidence of animal learning. Theoretical models in physics have inspired Boltzmann machines (Ackley, Hinton, and Sejnowski, 1985) and what are sometimes called Hopfield networks (Hopfield, 1982; Hopfield and Tank, 1986). These connectionist models utilize symmetric connections and adaptive equilibrium processes during which the networks settle into minimal energy states. Networks utilizing error-correction learning mechanisms go back to Rosenblatt's (1962) perceptron and Widrow's (1962) adaline and currently take the form of back-propagation networks (Parker, 1985; Rumelhart, Hinton, and Williams, 1985, 1986). These networks require a "teacher" or "trainer" to provide error signals indicating the difference between desired and actual responses. Networks employing real-time learning mechanisms, in which the temporal association of signals is of fundamental importance, go back to Hebb (1949). Real-time learning mechanisms may require no teacher or trainer and thus may lend themselves to unsupervised learning. Such models have been extended by Klopf (1972, 1982), who introduced the notions of synaptic eligibility and generalized reinforcement.
Sutton and Barto (1981) advanced this class of models by proposing that a derivative of the theoretical neuron's out-put be utilized as a reinforcement signal. Klopf (1986) has recently extended the Sutton-Barto (1981) model, yielding a learning mechanism that correlates earlier derivatives of the theoretical neuron's inputs with later derivatives of the theoretical neuron's output. Independently, Kosko (1986) has also discovered this new class of differential learning mechanisms. Kosko (1986), approaching from philosophical and mathematical directions, and Klopf (1986), approaching from the directions of neuronal modeling and animal learning research, came to the same conclusion: correlating earlier derivatives of inputs with later derivatives of outputs may constitute a fundamental improvement over a Hebbian correlation of approximately simultaneous input and output signals. Klopf's version of the learning mechanism, termed a drive-reinforcement model, has been demonstrated to predict a wide range of classical conditioning phenomena in animal learning. This will be illustrated with results of computer simulations of the drive-reinforcement neuronal model and with a videotape of a simulated network of drive-reinforcement neurons controlling a simulated robot operating in a simulated environment.

30 Sep 1988
TL;DR: A stochastic reinforcement learning algorithm designed to be implemented as a unit in a connectionist network is presented that learns the inverse kinematic transform of a simulated 3 degree-of-freedom arm.
Abstract: Reinforcement learning is the process by which the probability of the response of a system to a stimulus increases with reward and decreases with punishment [19]. Most of the research in reinforcement learning (with the exception of the work in function optimization) has been on problems with discrete action spaces, in which the learning system chooses one of a finite number of possible actions. However, many control problems require the application of continuous control signals. In this paper, we present a stochastic reinforcement learning algorithm for learning functions with continuous outputs. Our algorithm is designed to be implemented as a unit in a connectionist network. We assume that the learning system computes its real-valued output as some function of a random activation generated using the normal distribution. The activation at any time depends on the two parameters, the mean and the standard deviation, used in the normal distribution, which, in turn, depend on the current inputs to the unit. Learning takes place by using our algorithm to adjust these two parameters so as to increase the probability of producing the optimal real value for each input pattern. The performance of the algorithm is studied by using it to learn tasks of varying levels of difficulty. Further, as an example of a potential application, we present a network incorporating these real-valued units that learns the inverse kinematic transform of a simulated 3 degree-of-freedom arm.
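The mean/standard-deviation adaptation described can be sketched for a single unit with no input weights (the constants, the reward-tracking rule, and the exact form of the sigma schedule are illustrative assumptions, not the paper's equations):

```python
import random

def srv_unit(env_reward, episodes=5000, lr=0.1, seed=0):
    """Stochastic real-valued unit sketch: Gaussian action whose mean is learned
    and whose standard deviation shrinks as the expected reward rises."""
    rng = random.Random(seed)
    mu, rhat = 0.0, 0.0                            # action mean and expected reward
    for _ in range(episodes):
        sigma = max(0.01, 0.5 * (1.0 - rhat))      # explore widely while reward is poor
        a = rng.gauss(mu, sigma)                   # random activation, normal distribution
        r = env_reward(a)                          # reward assumed in [0, 1]
        mu += lr * (r - rhat) * (a - mu) / sigma   # reinforce better-than-expected actions
        rhat += 0.05 * (r - rhat)                  # track expected reward
    return mu, rhat
```

The standard deviation plays the role of a search width: it stays large while outcomes are poor and collapses as the unit homes in on the optimal real value, which is how the unit handles continuous action spaces.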

Book ChapterDOI
01 Jan 1988
TL;DR: This work has developed an algorithm and architecture for a connectionist system which mimics unsupervised competitive learning in function but resembles a reinforcement learning scheme in form, and suggests that this type of algorithm may be important in building large-scale networks for learning complex tasks.
Abstract: A crucial problem for connectionist learning schemes is how to partition the space of input vectors into useful categories for subsequent processing. Currently this partitioning is usually accomplished by unsupervised competitive learning algorithms. Although these schemes are simple and fast, they are unable to deal with categorizations that depend on factors other than the vectors' superficial similarity. Specifically they do not take into account feedback (or reinforcement) from outside the system as to the appropriateness of the categorizations that are being learned. We have developed an algorithm and architecture for a connectionist system which mimics unsupervised competitive learning in function; however in form it resembles a reinforcement learning scheme. We call this algorithm competitive reinforcement. This algorithm is inherently more stable than traditional competitive learning paradigms and can be easily and naturally adapted to function in reinforcement learning networks, allowing feature detection to be guided by externally generated reinforcement. A demonstration of the algorithm and its features using the classic dipole stimulus is presented. We suggest that this type of algorithm may be important in the building of large-scale networks for learning complex tasks.

Proceedings ArticleDOI
27 Oct 1988
TL;DR: A robotic system architecture is described that facilitates the symbiotic integration of teleoperative and automated modes of task execution, allowing improved speed, accuracy, and efficiency while retaining the man in the loop for innovative reasoning and decision-making.
Abstract: The man-robot symbiosis concept has the fundamental objective of bridging the gap between fully human-controlled and fully autonomous systems to achieve true man-robot cooperative control and intelligence. Such a system would allow improved speed, accuracy, and efficiency of task execution, while retaining the man in the loop for innovative reasoning and decision-making. The symbiont would have capabilities for supervised and unsupervised learning, allowing an increase of expertise in a wide task domain. This paper describes a robotic system architecture facilitating the symbiotic integration of teleoperative and automated modes of task execution. The architecture reflects a unique blend of many disciplines of artificial intelligence into a working system, including job or mission planning, dynamic task allocation, man-robot communication, automated monitoring, and machine learning. These disciplines are embodied in five major components of the symbiotic framework: the Job Planner, the Dynamic Task Allocator, the Presenter/Interpreter, the Automated Monitor, and the Learning System.


Proceedings ArticleDOI
01 Jan 1988
TL;DR: The author examines both unsupervised learning algorithms, which allow networks to find correlations in the input, and supervised learning algorithms that allow the pairing of arbitrary patterns.
Abstract: The basic computing element in models of neural networks that focus on information processing capabilities is a 'neural' unit that has an output that is a function of the sum of its inputs. Information is stored in 'synapses' or connection strengths between units. Networks of these neurons are not programmed like standard computers, but trained by data input. The author examines both unsupervised learning algorithms, which allow networks to find correlations in the input, and supervised learning algorithms, which allow the pairing of arbitrary patterns.