
Showing papers on "Unsupervised learning" published in 1990


Book ChapterDOI
E.R. Davies1
01 Jan 1990
TL;DR: This chapter introduces the subject of statistical pattern recognition (SPR) by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier.
Abstract: This chapter introduces the subject of statistical pattern recognition (SPR). It starts by considering how features are defined and emphasizes that the nearest neighbor algorithm achieves error rates comparable with those of an ideal Bayes’ classifier. The concepts of an optimal number of features, representativeness of the training data, and the need to avoid overfitting to the training data are stressed. The chapter shows that methods such as the support vector machine and artificial neural networks are subject to these same training limitations, although each has its advantages. For neural networks, the multilayer perceptron architecture and back-propagation algorithm are described. The chapter distinguishes between supervised and unsupervised learning, demonstrating the advantages of the latter and showing how methods such as clustering and principal components analysis fit into the SPR framework. The chapter also defines the receiver operating characteristic, which allows an optimum balance between false positives and false negatives to be achieved.
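The nearest neighbor rule the chapter builds on fits in a few lines. The following is a generic 1-NN sketch with invented toy data, not code from the chapter:

```python
# Classify a query point by the label of its closest training point (1-NN).

def nearest_neighbor(train, query):
    """train: list of (feature_vector, label) pairs."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    _, label = min(train, key=lambda pair: sq_dist(pair[0], query))
    return label

# Toy two-class data (hypothetical):
train = [((0.0, 0.0), "A"), ((0.1, 0.2), "A"),
         ((1.0, 1.0), "B"), ((0.9, 1.1), "B")]
print(nearest_neighbor(train, (0.2, 0.1)))  # nearest training point is class "A"
```

Despite its simplicity, this is the rule whose error rate the chapter compares with the ideal Bayes classifier.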

1,189 citations


Proceedings ArticleDOI
01 Jul 1990
TL;DR: In this article, the authors present an algorithm for improving the accuracy of algorithms for learning binary concepts by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples.
Abstract: We present an algorithm for improving the accuracy of algorithms for learning binary concepts. The improvement is achieved by combining a large number of hypotheses, each of which is generated by training the given learning algorithm on a different set of examples. Our algorithm is based on ideas presented by Schapire and represents an improvement over his results. The analysis of our algorithm provides general upper bounds on the resources required for learning in Valiant's polynomial PAC learning framework, which are the best general upper bounds known today. We show that the number of hypotheses that are combined by our algorithm is the smallest number possible. Other outcomes of our analysis are results regarding the representational power of threshold circuits, the relation between learnability and compression, and a method for parallelizing PAC learning algorithms. We provide extensions of our algorithms to cases in which the concepts are not binary and to the case where the accuracy of the learning algorithm depends on the distribution of the instances.
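The combination step can be pictured as a majority vote over many hypotheses. The sketch below shows only that final vote; the paper's actual contribution lies in how the training distributions for the individual hypotheses are chosen, which this toy version omits. The three "weak" hypotheses are invented for illustration:

```python
# Combine binary hypotheses (functions x -> 0/1) by unweighted majority vote.

def majority_vote(hypotheses, x):
    votes = sum(h(x) for h in hypotheses)
    return 1 if votes * 2 > len(hypotheses) else 0

# Three hypothetical weak hypotheses for the concept "x >= 5":
hs = [lambda x: 1 if x > 3 else 0,
      lambda x: 1 if x > 5 else 0,
      lambda x: 1 if x > 4 else 0]
print(majority_vote(hs, 6))  # all three vote 1, so the combination outputs 1
print(majority_vote(hs, 4))  # only one votes 1, so the combination outputs 0
```

The vote can be correct even when individual hypotheses err, which is the intuition behind combining many weakly accurate hypotheses.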

865 citations


Proceedings Article
01 Jan 1990

557 citations


Journal ArticleDOI
TL;DR: The heart of these algorithms is the pocket algorithm, a modification of perceptron learning that makes perceptron learning well-behaved with nonseparable training data, even if the data are noisy and contradictory.
Abstract: A key task for connectionist research is the development and analysis of learning algorithms. An examination is made of several supervised learning algorithms for single-cell and network models. The heart of these algorithms is the pocket algorithm, a modification of perceptron learning that makes perceptron learning well-behaved with nonseparable training data, even if the data are noisy and contradictory. Features of these algorithms include speed, i.e. algorithms fast enough to handle large sets of training data; network scaling properties, i.e. network methods scale up almost as well as single-cell models when the number of inputs is increased; analytic tractability, i.e. upper bounds on classification error are derivable; online learning, i.e. some variants can learn continually, without referring to previous data; and winner-take-all groups or choice groups, i.e. algorithms can be adapted to select one out of a number of possible classifications. These learning algorithms are suitable for applications in machine learning, pattern recognition, and connectionist expert systems.
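The pocket idea can be sketched briefly: run ordinary perceptron updates, but keep ("pocket") the weight vector that has survived the longest run of consecutive correct classifications. Gallant's version presents training examples at random; the sketch below cycles through the data so the result is deterministic, and the data are an invented toy problem:

```python
# Pocket algorithm sketch: perceptron updates plus a "pocketed" best weight vector.

def pocket(data, iters=100):
    """data: list of (inputs, target) pairs with target in {-1, +1}.
    Returns the pocketed weight vector (bias folded in as the last weight)."""
    n = len(data[0][0])
    w = [0.0] * (n + 1)
    pocket_w, best_run, run = list(w), 0, 0
    for i in range(iters):
        x, t = data[i % len(data)]
        xa = list(x) + [1.0]                       # append constant bias input
        s = sum(wi * xi for wi, xi in zip(w, xa))
        if (1 if s >= 0 else -1) == t:
            run += 1
            if run > best_run:                     # new longest correct run
                best_run, pocket_w = run, list(w)
        else:
            run = 0
            w = [wi + t * xi for wi, xi in zip(w, xa)]   # perceptron step
    return pocket_w

def predict(w, x):
    s = sum(wi * xi for wi, xi in zip(w, list(x) + [1.0]))
    return 1 if s >= 0 else -1

# Learn the AND function (toy, separable data):
data = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
w = pocket(data)
print([predict(w, x) for x, _ in data])  # [-1, -1, -1, 1], matching the targets
```

On nonseparable data plain perceptron weights cycle forever, but the pocketed weights still record the best solution seen so far, which is the paper's point.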

529 citations


Journal ArticleDOI
TL;DR: A stochastic reinforcement learning algorithm for learning functions with continuous outputs using a connectionist network that learns to perform an underconstrained positioning task using a simulated 3 degree-of-freedom robot arm.

306 citations


Journal ArticleDOI
TL;DR: A new hybrid unsupervised-learning law, called the differential competitive law, which uses the signal velocity as a local unsupervised reinforcement mechanism, is introduced, and its coding and stability behavior in feedforward and feedback networks is studied.
Abstract: A new hybrid learning law, the differential competitive law, which uses the neuronal signal velocity as a local unsupervised reinforcement mechanism, is introduced, and its coding and stability behavior in feedforward and feedback networks is examined. This analysis is facilitated by the recent Gluck-Parker pulse-coding interpretation of signal functions in differential Hebbian learning systems. The second-order behavior of RABAM (random adaptive bidirectional associative memory) Brownian-diffusion systems is summarized by the RABAM noise suppression theorem: the mean-squared activation and synaptic velocities decrease exponentially quickly to their lower bounds, the instantaneous noise variances driving the system. This result is extended to the RABAM annealing model, which provides a unified framework from which to analyze Geman-Hwang combinatorial optimization dynamical systems and continuous Boltzmann machine learning.
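The differential competitive idea is that a synaptic vector moves toward the input in proportion to the *change* in its neuron's competition signal (its velocity), not the signal itself. The toy below uses a winner-take-all 0/1 signal and invented numbers; it is an illustration of the law's form, not the paper's model:

```python
# One step of a toy differential competitive update.

def dcl_step(weights, prev_signal, x, rate=0.5):
    """weights: list of weight vectors; prev_signal: previous 0/1 win
    indicator per neuron. Returns (new_weights, new_signal)."""
    def dot(a, b):
        return sum(ai * bi for ai, bi in zip(a, b))
    winner = max(range(len(weights)), key=lambda j: dot(weights[j], x))
    signal = [1.0 if j == winner else 0.0 for j in range(len(weights))]
    new_w = []
    for j, wj in enumerate(weights):
        vel = signal[j] - prev_signal[j]           # discrete signal velocity
        new_w.append([wi + rate * vel * (xi - wi) for wi, xi in zip(wj, x)])
    return new_w, signal

w = [[1.0, 0.0], [0.0, 1.0]]
sig = [0.0, 0.0]
w, sig = dcl_step(w, sig, [0.9, 0.1])   # neuron 0 wins; its weights move toward x
print(w[0])                              # nudged toward [0.9, 0.1]
```

Note that a neuron that keeps winning has zero signal velocity on later steps, so it stops learning until the competition outcome changes; this is the "reinforcement only on change" character of the law.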

176 citations


Journal ArticleDOI
TL;DR: It is shown that the optical neural network is capable of performing both unsupervised learning and pattern recognition operations simultaneously by setting two matching scores in the learning algorithm, and that a slower learning rate yields a more topologically organized memory matrix.
Abstract: A key requirement for neural computing is the ability to adapt to a changeable environment and to recognize unknown objects. This paper deals with an adaptive optical neural network using Kohonen's self-organizing feature map algorithm for unsupervised learning. A compact optical neural network of 64 neurons using liquid crystal televisions is used for this study. To test the performance of the self-organizing neural network, experimental demonstrations and computer simulations are provided. Effects due to unsupervised learning parameters are analyzed. We show that the optical neural network is capable of performing both unsupervised learning and pattern recognition operations simultaneously by setting two matching scores in the learning algorithm. By using a slower learning rate, the construction of the memory matrix becomes more topologically organized. Moreover, the introduction of forbidden regions in the memory space enables the neural network to learn new patterns without erasing the old ones.
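The self-organizing feature map at the heart of this system can be sketched in software. The following minimal 1-D Kohonen map uses toy data and parameter values chosen for illustration (the paper's optical implementation and its two matching scores are not modeled); it shows the role of the learning rate the abstract discusses:

```python
# Minimal 1-D Kohonen self-organizing feature map on scalar inputs.
import math
import random

def train_som(data, n_neurons=8, epochs=60, rate=0.3, radius=1.5, seed=1):
    rng = random.Random(seed)
    weights = [rng.random() for _ in range(n_neurons)]
    for _ in range(epochs):
        for x in data:
            # Best-matching unit: the neuron whose weight is closest to x.
            bmu = min(range(n_neurons), key=lambda j: abs(weights[j] - x))
            for j in range(n_neurons):
                # Gaussian neighborhood pulls the BMU and its neighbors toward x.
                h = math.exp(-((j - bmu) ** 2) / (2 * radius ** 2))
                weights[j] += rate * h * (x - weights[j])
    return weights

data = [0.05, 0.1, 0.5, 0.55, 0.9, 0.95]
w = train_som(data)
# Neighboring neurons tend to carry neighboring values (topological order).
print(w)
```

Lowering `rate` slows adaptation but, as the abstract notes for the optical network, tends to leave the final map more smoothly organized.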

141 citations



Proceedings ArticleDOI
17 Jun 1990
TL;DR: An online learning algorithm for reinforcement learning with continually running recurrent networks in nonstationary reactive environments is described and the possibility of using the system for planning future action sequences is investigated and this approach is compared to approaches based on temporal difference methods.
Abstract: An online learning algorithm for reinforcement learning with continually running recurrent networks in nonstationary reactive environments is described. Various kinds of reinforcement are considered as special types of input to an agent living in the environment. The agent's only goal is to maximize the amount of reinforcement received over time. Supervised learning techniques for recurrent networks serve to construct a differentiable model of the environmental dynamics which includes a model of future reinforcement. This model is used for learning goal-directed behavior in an online fashion. The possibility of using the system for planning future action sequences is investigated and this approach is compared to approaches based on temporal difference methods. A connection to metalearning (learning how to learn) is noted.

98 citations


Book ChapterDOI
01 Jun 1990
TL;DR: An improvement of Wilson's classifier system BOOLE is proposed that shows how the learning rates of genetics-based machine learning systems can be greatly improved; the modified system is compared to a neural net using back-propagation on a difficult Boolean learning task, the multiplexer function.
Abstract: Genetics-based machine learning systems are considered by a majority of machine learners to be slow-learning systems. In this paper, we propose an improvement of Wilson's classifier system BOOLE that shows how the learning rates of genetics-based machine learning systems can be greatly improved. The modification consists of a change to the reinforcement component. We then compare the respective performances of this modified BOOLE, called NEWBOOLE, and a neural net using back propagation on a difficult Boolean learning task, the multiplexer function. The results of this comparison show that NEWBOOLE obtains significantly faster learning rates.

76 citations


Proceedings Article
01 Oct 1990
TL;DR: It is shown by simulation that relaxation networks of the kind the authors are implementing in VLSI are capable of learning large problems just like back-propagation networks.
Abstract: Feedback connections are required so that the teacher signal on the output neurons can modify weights during supervised learning. Relaxation methods are needed for learning static patterns with full-time feedback connections. Feedback network learning techniques have not achieved wide popularity because of the still greater computational efficiency of back-propagation. We show by simulation that relaxation networks of the kind we are implementing in VLSI are capable of learning large problems just like back-propagation networks. A microchip incorporates deterministic mean-field theory learning as well as stochastic Boltzmann learning. A multiple-chip electronic system implementing these networks will make high-speed parallel learning in them feasible in the future.

Proceedings Article
01 Oct 1990
TL;DR: Using an unsupervised learning procedure, a network is trained on an ensemble of images of the same two-dimensional object at different positions, orientations and sizes, and can reject instances of other shapes by using the fact that the predictions made by its two halves disagree.
Abstract: Using an unsupervised learning procedure, a network is trained on an ensemble of images of the same two-dimensional object at different positions, orientations and sizes. Each half of the network "sees" one fragment of the object, and tries to produce as output a set of 4 parameters that have high mutual information with the 4 parameters output by the other half of the network. Given the ensemble of training patterns, the 4 parameters on which the two halves of the network can agree are the position, orientation, and size of the whole object, or some recoding of them. After training, the network can reject instances of other shapes by using the fact that the predictions made by its two halves disagree. If two competing networks are trained on an unlabelled mixture of images of two objects, they cluster the training cases on the basis of the objects' shapes, independently of the position, orientation, and size.

Proceedings Article
29 Jul 1990
TL;DR: Six myths in the machine learning community that address issues of bias, learning as search, computational learning theory, Occam's razor, "universal" learning algorithms, and interactive learning are proposed.
Abstract: This paper is a discussion of machine learning theory on empirically learning classification rules. The paper proposes six myths in the machine learning community that address issues of bias, learning as search, computational learning theory, Occam's razor, "universal" learning algorithms, and interactive learning. Some of the problems raised are also addressed from a Bayesian perspective. The paper concludes by suggesting questions that machine learning researchers should be addressing both theoretically and experimentally.

Proceedings Article
29 Jul 1990
TL;DR: The learning part of a system developed to provide expert systems capability augmented with learning is described; the learning scheme is a hybrid connectionist-symbolic one, and current results include learning the well-known Iris data set.
Abstract: This paper describes the learning part of a system which has been developed to provide expert systems capability augmented with learning. The learning scheme is a hybrid connectionist-symbolic one. A network representation is used. Learning may be done incrementally and requires only one pass through the data set to be learned. Attribute-value pairs are supported as a variable implementation. Variables are represented by groups of connected cells in the network. The learning algorithm is described and an example given. Current results are discussed, which include learning the well-known Iris data set. The results show that the system has promise.

Proceedings ArticleDOI
17 Jun 1990
TL;DR: A discussion is presented of three techniques which offer significant improvement in training time, including an acceleration process for neurons which produce the same output class for the inputs provided by the training sample.
Abstract: A discussion is presented of three techniques which offer significant improvement in training time. In the first, training is restricted to those samples for which the network fails to predict correctly. The training process is extended to the entire training data set as the performance of the network improves. In the second technique, an acceleration process is used for neurons which produce the same output class for the inputs provided by the training sample. In the third technique, the learning rate is optimized, on the fly, to get the optimal improvement for each training pass. A derivation is presented for an optimal matching of momentum and learning rate.

Journal ArticleDOI
A. Carlson1
TL;DR: Local, unsupervised learning rules for the threshold and the transition width are proposed, and a network using these rules sorts the input patterns into classes, which it identifies by a binary code, with the coarser structure coded by the earlier neurons in the hierarchy.
Abstract: The Hebbian rule (Hebb 1949), coupled with an appropriate mechanism to limit the growth of synaptic weights, allows a neuron to learn to respond to the first principal component of the distribution of its input signals (Oja 1982). Rubner and Schulten (1990) have recently suggested the use of an "anti-Hebbian" rule in a network with hierarchical lateral connections. When applied to neurons with linear response functions, this model allows additional neurons to learn to respond to additional principal components (Rubner and Tavan 1989). Here we apply the model to neurons with non-linear response functions characterized by a threshold and a transition width. We propose local, unsupervised learning rules for the threshold and the transition width, and illustrate the operation of these rules with some simple examples. A network using these rules sorts the input patterns into classes, which it identifies by a binary code, with the coarser structure coded by the earlier neurons in the hierarchy.
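Oja's rule, which the abstract takes as its starting point, is Hebbian learning with a decay term that bounds the weights, so a linear neuron converges toward the first principal component of its inputs. The sketch below shows Oja's rule only (not the paper's threshold and transition-width rules), on invented data stretched along the direction (1, 1):

```python
# Oja's rule: Hebbian growth (rate*y*x) minus a decay term (rate*y^2*w).
import random

def oja(data, rate=0.05, epochs=200, seed=3):
    rng = random.Random(seed)
    w = [rng.random(), rng.random()]
    for _ in range(epochs):
        for x in data:
            y = sum(wi * xi for wi, xi in zip(w, x))     # linear neuron output
            w = [wi + rate * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Toy samples along the direction (1, 1):
data = [(1.0, 1.0), (-1.0, -1.0), (0.5, 0.5), (-0.5, -0.5)]
w = oja(data)
norm = sum(wi * wi for wi in w) ** 0.5
print([wi / norm for wi in w])   # approximately the unit vector (0.707, 0.707)
```

The decay term keeps the weight vector near unit length, which is what allows the fixed point to be the principal eigenvector rather than growing without bound.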

Book ChapterDOI
01 Jan 1990
TL;DR: GAL is a new algorithm that is able to quantize vectors as members of categories in an incremental fashion; the network grows if and when necessary during learning.
Abstract: Learning by changing connection weights only is time-consuming and does not always work. Freedom to modify the network structure is also needed. Grow-and-Learn (GAL) is a new algorithm that is able to quantize vectors as members of categories in an incremental fashion. When a new vector is encountered, it is tested as in nearest neighbor search, and if it is not already quantized correctly, units and links are added to accommodate this additional requirement. Thus the network, when learning, grows if and when necessary. As the structure of the resulting network in such a learning phase depends on the order in which the vectors are encountered, a second phase is added to eliminate old, no-longer-necessary associations. In this phase, the network is closed to the environment and the input patterns are generated by the network itself, during which the relevance of units is computed and those that are not vital are removed. Simulation results when applied to character recognition are promising. Physiological plausibility and how the idea may be extended to unsupervised learning are discussed.
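The growing phase described above can be sketched as incremental nearest-neighbor storage: classify each incoming vector against the stored units, and add a new unit only when the vector would be misclassified. The function name and data below are illustrative assumptions, not the paper's code, and the pruning (second) phase is omitted:

```python
# Sketch of GAL's growing phase: store a unit only when needed.

def gal_train(stream):
    """stream: iterable of (vector, label) pairs. Returns stored units."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    units = []                                   # the growing set of units
    for x, label in stream:
        if units:
            _, nearest = min(units, key=lambda u: sq_dist(u[0], x))
            if nearest == label:
                continue                         # already quantized correctly
        units.append((x, label))                 # grow: add a unit for x
    return units

stream = [((0.0, 0.0), "A"), ((0.1, 0.1), "A"),   # second "A" is redundant
          ((1.0, 1.0), "B"), ((0.9, 1.0), "B")]
units = gal_train(stream)
print(len(units))  # 2: one unit per region suffices
```

As the abstract notes, the stored set depends on presentation order, which is why GAL adds a separate pruning phase afterward.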

Journal ArticleDOI
TL;DR: A learning scheme based on Hebb's rule, which allows the system's neuronal cells to specialize on different patterns during learning, is introduced, appropriately modified, and applied to the competitive network under study.

Journal ArticleDOI
TL;DR: A new artificial neural model for unsupervised learning iterates the weights in such a way as to move the decision boundary to a place of low pattern density; it is extended to the multiclass case by applying this procedure in a hierarchical manner.

Proceedings ArticleDOI
03 Apr 1990
TL;DR: A new approach for the learning process of multilayer perceptron neural networks using a recursive-least-squares (RLS) type algorithm is proposed, and an analog of the back-propagation strategy used in the conventional learning algorithms is developed.
Abstract: A new approach for the learning process of multilayer perceptron neural networks using a recursive-least-squares (RLS) type algorithm is proposed. The weights in the network are updated recursively upon the arrival of a new training sample. To determine the desired target in the hidden layers, an analog of the back-propagation strategy used in the conventional learning algorithms is developed. This permits the application of the learning procedure to all the other lower layers. Simulation results on the 4-bit parity-checker problem are provided.

Book ChapterDOI
01 Jun 1990
TL;DR: In control tasks, such as pole balancing, it is found that a program that learns to balance the pole quickly produces a control strategy that is so specific as to make it impossible to transfer expertise from one related task to another.
Abstract: The most frequently used measure of performance for reinforcement learning algorithms is learning rate. That is, how many learning trials are required before the program is able to perform its task adequately. In this paper, we argue that this is not necessarily the best measure of performance and, in some cases, can even be misleading. In control tasks, such as pole balancing, we have found that a program that learns to balance the pole quickly produces a control strategy that is so specific as to make it impossible to transfer expertise from one related task to another. We examine the reasons for this and suggest ways of obtaining general control strategies. We also make the conjecture that, as a broad principle, there is a trade-off between rapid learning rate and the ability to generalise. We also introduce methods for analysing the results of reinforcement learning algorithms to produce readable control rules.

Book ChapterDOI
01 Jan 1990
TL;DR: The goal of the research is to understand the power and appropriateness of instance-based representations and their associated acquisition methods and to mitigate the effects of non-convex concepts.
Abstract: The goal of our research is to understand the power and appropriateness of instance-based representations and their associated acquisition methods. This paper concerns two methods for reducing storage requirements for instance-based learning algorithms. The first method, termed instance-saving, represents concept descriptions by selecting and storing a representative subset of the given training instances. We provide an analysis for instance-saving techniques and specify one general class of concepts that instance-saving algorithms can learn. The second method, termed instance-averaging, represents concept descriptions by averaging together some training instances while simply saving others. We describe why analyses for instance-averaging algorithms are difficult to produce. Our empirical results indicate that storage requirements for these two methods are roughly equivalent. We outline the assumptions of instance-averaging algorithms and describe how their violation might degrade performance. To mitigate the effects of non-convex concepts, a dynamic distance-thresholding technique is introduced and applied in both the averaging and non-averaging learning algorithms. Thresholding increases storage requirements but also increases the quality of the resulting concept descriptions.
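The instance-averaging idea can be sketched concretely: when a training instance's nearest stored neighbor has the same class, merge the instance into that neighbor by averaging instead of storing it separately. The procedure and data below are simplified assumptions for illustration (the paper's algorithms also involve classification checks and distance thresholding, omitted here):

```python
# Instance-averaging sketch: merge same-class instances into a running mean.

def train_averaging(stream):
    """stream: (vector, label) pairs. Returns [(vector, label, count)]."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    stored = []
    for x, label in stream:
        if stored:
            i = min(range(len(stored)), key=lambda k: sq_dist(stored[k][0], x))
            v, lab, n = stored[i]
            if lab == label:
                # Fold the new instance into the stored one's running average.
                stored[i] = ([(vi * n + xi) / (n + 1) for vi, xi in zip(v, x)],
                             lab, n + 1)
                continue
        stored.append((list(x), label, 1))       # instance-saving fallback
    return stored

stream = [((0.0, 0.0), "A"), ((0.2, 0.0), "A"),
          ((1.0, 1.0), "B"), ((1.0, 0.8), "B")]
stored = train_averaging(stream)
print([(v, lab) for v, lab, _ in stored])  # two averaged prototypes
```

Averaging keeps storage at one prototype per cluster, but (as the abstract warns) the averaged point can drift outside a non-convex concept, which is what the paper's distance-thresholding technique addresses.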

Proceedings Article
01 Jan 1990
TL;DR: This paper proves the general inability of simple learning programs to learn complex concepts from few input data, independently of the epistemological problems of inductive inference.
Abstract: Machine learning is widely regarded as a tool for overcoming the bottleneck in knowledge acquisition. Especially in knowledge-intensive domains, there is hope for using machine learning techniques successfully. This paper proves the general inability of simple learning programs to learn complex concepts from few input data. This holds independently of the epistemological problems of inductive inference. These results are obtained by the use of algorithmic information theory.

Book ChapterDOI
01 Jun 1990
TL;DR: This work applies an approach to modeling the average case behavior of learning algorithms to a purely empirical learning algorithm, and to an algorithm that combines empirical and explanation-based learning.
Abstract: We present an approach to modeling the average case behavior of learning algorithms. Our motivation is to predict the expected accuracy of learning algorithms as a function of the number of training examples. We apply this framework to a purely empirical learning algorithm (the one-sided algorithm for pure conjunctive concepts) and to an algorithm that combines empirical and explanation-based learning. We evaluate the average-case models by comparing the accuracy predicted by the models to the actual accuracy obtained by running the learning algorithms.
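The one-sided algorithm for pure conjunctive concepts, in its standard form, starts from the most specific conjunction and deletes any literal that disagrees with a positive example; only positive examples are used, hence "one-sided". The data below are invented:

```python
# One-sided (specific-to-general) learning of a pure conjunctive concept.

def one_sided(positives):
    """positives: list of boolean feature tuples. Returns a dict mapping the
    kept feature indices to their required values."""
    hypothesis = dict(enumerate(positives[0]))    # most specific start
    for x in positives[1:]:
        for i in list(hypothesis):
            if hypothesis[i] != x[i]:
                del hypothesis[i]                 # drop the contradicted literal
    return hypothesis

def matches(hypothesis, x):
    return all(x[i] == v for i, v in hypothesis.items())

# Hypothetical target concept: feature 0 true AND feature 2 false.
positives = [(1, 0, 0), (1, 1, 0)]
h = one_sided(positives)
print(h)                      # {0: 1, 2: 0}
print(matches(h, (1, 1, 1)))  # False: feature 2 must be 0
```

Because the hypothesis only ever generalizes, its expected accuracy after n examples can be analyzed in closed form, which is what makes this algorithm a natural test case for the paper's average-case framework.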

Proceedings ArticleDOI
01 Sep 1990
TL;DR: It is shown that the optical neural network is capable of performing both unsupervised learning and pattern recognition operations simultaneously by setting two matching scores in the learning algorithm, and that introducing forbidden regions in the memory space lets it learn new patterns without erasing old ones.
Abstract: A key requirement for neural computing is the ability to adapt to a changeable environment and to recognize unknown objects. This paper deals with an adaptive optical neural network using Kohonen's self-organizing feature map algorithm for unsupervised learning. A compact optical neural network of 64 neurons using liquid crystal televisions is used for this study. To test the performance of the self-organizing neural network, experimental demonstrations with computer simulations are provided. Effects due to unsupervised learning parameters are analyzed. We have shown that the optical neural network is capable of performing both unsupervised learning and pattern recognition operations simultaneously by setting two matching scores in the learning algorithm. By using a slower learning rate, the construction of the memory matrix becomes topologically more organized. Moreover, introducing forbidden regions in the memory space enables the neural network to learn new patterns without erasing the old ones.

Proceedings Article
29 Jul 1990
TL;DR: A representation language is presented that supports a hybrid analytical and similarity-based classification scheme and can be seen as providing an inductive bias to the learning procedure, thereby shortening the required training phase, and reducing the brittleness of the induced generalizations.
Abstract: This paper is concerned with knowledge representation issues in machine learning. In particular, it presents a representation language that supports a hybrid analytical and similarity-based classification scheme. Analytical classification is produced using a KL-ONE-like term-subsumption strategy, while similarity-based classification is driven by generalizations induced from a training set by an unsupervised learning procedure. This approach can be seen as providing an inductive bias to the learning procedure, thereby shortening the required training phase, and reducing the brittleness of the induced generalizations.

Proceedings ArticleDOI
J. Zhang1
06 Nov 1990
TL;DR: The method for combining inductive learning and exemplar-based learning has been implemented in the flexible concept learning system and experiments showed that the combined method has comparable performance to that of AQ16 and ASSISTANT in three natural domains.
Abstract: A learning approach that combines inductive learning with exemplar-based learning is described. In the method, a concept is represented by two parts: a generalized abstract description and a set of exemplars (exceptions). Generalized descriptions represent the principles of concepts, whereas exemplars represent the exceptional or rare cases. The method is an alternative for solving the problem of small disjuncts and for representing concepts with imprecise and irregular boundaries. The method for combining inductive learning and exemplar-based learning has been implemented in the flexible concept learning system. Experiments showed that the combined method has comparable performance to that of AQ16 and ASSISTANT in three natural domains.

01 Jul 1990
TL;DR: The theory of input-output mapping from a set of examples is extended by introducing ways of dealing with two aspects of learning: learning in the presence of unreliable examples and learning from positive and negative examples.
Abstract: Learning an input-output mapping from a set of examples can be regarded as synthesizing an approximation of a multi-dimensional function. From this point of view, this form of learning is closely related to regularization theory. In this note, we extend the theory by introducing ways of dealing with two aspects of learning: learning in the presence of unreliable examples and learning from positive and negative examples. The first extension corresponds to dealing with outliers among the sparse data. The second one corresponds to exploiting information about points or regions in the range of the function that are forbidden.

Proceedings ArticleDOI
17 Jun 1990
TL;DR: The authors show how the network itself can infer the grammar and show that nonstochastic nets can perform signature verification with high reliability, raising the possibility of signature verification on a robust smart card.
Abstract: A syntactic neural network is equivalent to a parser for a certain type of grammar, in this case strictly hierarchical context-free. This allows an efficient method for pattern description and has the added advantage of being a generative model. The authors show how the network itself can infer the grammar. Syntactic neural nets can model stochastic or nonstochastic grammars. The stochastic nets are properly probabilistic and are powerful discriminators; the nonstochastic nets are less powerful, but have straightforward silicon implementations with existing technology. Learning in syntactic nets may proceed supervised or unsupervised. In each case, the algorithm is the same; the difference lies in the data presented to the net. In prior publications, the authors applied syntactic neural nets to character recognition and cursive script recognition. The authors presently show that nonstochastic nets can perform signature verification with high reliability. This raises the possibility of signature verification on a robust smart card.