
Showing papers on "Artificial neural network published in 1989"


Journal ArticleDOI
TL;DR: The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal supervised learning tasks.
Abstract: The exact form of a gradient-following learning algorithm for completely recurrent networks running in continually sampled time is derived and used as the basis for practical algorithms for temporal supervised learning tasks. These algorithms have (1) the advantage that they do not require a precisely defined training interval, operating while the network runs; and (2) the disadvantage that they require nonlocal communication in the network being trained and are computationally expensive. These algorithms allow networks having recurrent connections to learn complex tasks that require the retention of information over time periods having either fixed or indefinite length.

4,351 citations
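The gradient recursion described above can be sketched in a few lines: every weight's effect on every unit's activity is carried forward while the network runs, which is exactly the nonlocal, computationally expensive bookkeeping the abstract mentions. The network size, task, and learning rate below are illustrative choices, not the paper's.

```python
import numpy as np

# Minimal real-time sketch of gradient-following learning in a fully
# recurrent tanh network. All names (n, W, p) and the toy prediction
# task are our own assumptions, not the paper's setup.

rng = np.random.default_rng(0)
n, n_in = 3, 1                        # recurrent units, external inputs
W = rng.normal(scale=0.1, size=(n, n + n_in))
y = np.zeros(n)                       # unit activations
# p[k, i, j] = d y_k / d W[i, j], carried forward while the net runs
p = np.zeros((n, n, n + n_in))
lr = 0.05

for t in range(200):
    x = np.array([np.sin(0.3 * t)])   # external input stream
    target = np.sin(0.3 * (t + 1))    # unit 0 predicts the next input
    z = np.concatenate([y, x])
    y_new = np.tanh(W @ z)
    dphi = 1.0 - y_new ** 2
    # sensitivity recursion: each dy_k/dW_ij flows through the recurrence
    p_new = np.zeros_like(p)
    for k in range(n):
        for i in range(n):
            for j in range(n + n_in):
                acc = (z[j] if k == i else 0.0) + W[k, :n] @ p[:, i, j]
                p_new[k, i, j] = dphi[k] * acc
    err = target - y_new[0]
    W += lr * err * p_new[0]          # gradient step on unit 0's error
    y, p = y_new, p_new
```

The triple loop makes the cost the abstract warns about explicit: the sensitivities scale as (number of units) x (number of weights).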


Journal ArticleDOI
TL;DR: It is proved that any continuous mapping can be approximately realized by Rumelhart-Hinton-Williams' multilayer neural networks with at least one hidden layer whose output functions are sigmoid functions.

3,989 citations


Proceedings Article
01 Jan 1989
TL;DR: A class of practical and nearly optimal schemes for adapting the size of a neural network by using second-derivative information to make a tradeoff between network complexity and training set error is derived.
Abstract: We have used information-theoretic ideas to derive a class of practical and nearly optimal schemes for adapting the size of a neural network. By removing unimportant weights from a network, several improvements can be expected: better generalization, fewer training examples required, and improved speed of learning and/or classification. The basic idea is to use second-derivative information to make a tradeoff between network complexity and training set error. Experiments confirm the usefulness of the methods on a real-world application.

3,961 citations
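The second-derivative tradeoff can be made concrete with a minimal sketch: rank each weight by the saliency 0.5 * h_ii * w_i**2 (its predicted contribution to training-set error) and delete the least salient. The toy weights and diagonal Hessian values below are invented for illustration.

```python
import numpy as np

# Saliency-based pruning sketch: remove the weights whose deletion is
# predicted to increase the error least. The network and Hessian
# estimates here are stand-ins, not a real trained model.

rng = np.random.default_rng(1)
w = rng.normal(size=8)                 # weights of some trained network
h = rng.uniform(0.1, 2.0, size=8)      # diagonal second derivatives d2E/dw2
saliency = 0.5 * h * w ** 2            # predicted error increase if w_i -> 0

k = 3                                  # prune the 3 least important weights
prune = np.argsort(saliency)[:k]
w[prune] = 0.0
```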


Proceedings Article
01 Jan 1989
TL;DR: The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network.
Abstract: Cascade-Correlation is a new architecture and supervised learning algorithm for artificial neural networks. Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure. Once a new hidden unit has been added to the network, its input-side weights are frozen. This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors. The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no back-propagation of error signals through the connections of the network.

2,698 citations
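The candidate-training phase can be sketched in miniature: train one hidden unit to maximize the magnitude of the covariance between its output and the network's residual error, then freeze its input weights. The data, residual signal, and sizes below are invented for illustration.

```python
import numpy as np

# Toy sketch of a Cascade-Correlation-style candidate phase. The
# residual signal is a stand-in; a real run would use the current
# network's remaining output error.

rng = np.random.default_rng(2)
X = rng.normal(size=(50, 4))             # training inputs
residual = np.sin(X @ np.ones(4))        # stand-in residual error signal
w = rng.normal(scale=0.1, size=4)        # candidate unit's input weights

history = []
for _ in range(300):
    v = np.tanh(X @ w)
    vc, ec = v - v.mean(), residual - residual.mean()
    s = vc @ ec                          # covariance-style score S to maximize
    grad = ((1.0 - v ** 2) * ec) @ X     # dS/dw through the tanh (mean term vanishes since ec is centered)
    w += 0.01 * np.sign(s) * grad        # ascend |S|
    history.append(abs(s))

w_frozen = w.copy()                      # input weights are frozen once the unit is installed
```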


Journal ArticleDOI
TL;DR: In this article, the authors presented a time-delay neural network (TDNN) approach to phoneme recognition, which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time and therefore not blurred by temporal shifts in the input
Abstract: The authors present a time-delay neural network (TDNN) approach to phoneme recognition which is characterized by two important properties: (1) using a three-layer arrangement of simple computing units, a hierarchy can be constructed that allows for the formation of arbitrary nonlinear decision surfaces, which the TDNN learns automatically using error backpropagation; and (2) the time-delay arrangement enables the network to discover acoustic-phonetic features and the temporal relationships between them independently of position in time and therefore not blurred by temporal shifts in the input. As a recognition task, the speaker-dependent recognition of the phonemes B, D, and G in varying phonetic contexts was chosen. For comparison, several discrete hidden Markov models (HMM) were trained to perform the same task. Performance evaluation over 1946 testing tokens from three speakers showed that the TDNN achieves a recognition rate of 98.5% correct while the rate obtained by the best of the HMMs was only 93.7%.

2,319 citations
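The time-delay arrangement amounts to applying one shared weight matrix to every window of consecutive input frames, so a learned feature detector fires regardless of where in time its pattern occurs. A minimal sketch, with invented shapes rather than the paper's exact topology:

```python
import numpy as np

# One time-delay (TDNN) layer: the same weights scan a sliding window
# of input frames, giving shift-invariant feature detection in time.
# Frame count, coefficient count, and unit count are illustrative.

rng = np.random.default_rng(3)
frames = rng.normal(size=(15, 16))     # 15 time frames x 16 spectral coeffs
delay, n_hidden = 3, 8                 # window of 3 consecutive frames
W = rng.normal(scale=0.1, size=(n_hidden, delay * 16))

hidden = np.array([
    np.tanh(W @ frames[t:t + delay].ravel())   # shared weights at every shift
    for t in range(15 - delay + 1)
])
```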


Book
01 Jan 1989
TL;DR: A book on adaptive pattern recognition and neural networks.

2,166 citations


MonographDOI
24 Jul 1989
TL;DR: From the Publisher: Substantial progress in understanding memory, the learning process, and self-organization by studying the properties of models of neural networks has resulted in discoveries of important parallels between the properties of statistical, nonlinear cooperative systems in physics and neural networks.

Abstract: From the Publisher: Substantial progress in understanding memory, the learning process, and self-organization by studying the properties of models of neural networks has resulted in discoveries of important parallels between the properties of statistical, nonlinear cooperative systems in physics and neural networks.

1,721 citations


Proceedings ArticleDOI
01 Jan 1989
TL;DR: A speculative neurophysiological model illustrating how the backpropagation neural network architecture might plausibly be implemented in the mammalian brain for corticocortical learning between nearby regions of the cerebral cortex is presented.
Abstract: The author presents a survey of the basic theory of the backpropagation neural network architecture covering architectural design, performance measurement, function approximation capability, and learning. The survey includes previously known material, as well as some new results, namely, a formulation of the backpropagation neural network architecture to make it a valid neural network (past formulations violated the locality of processing restriction) and a proof that the backpropagation mean-squared-error function exists and is differentiable. Also included is a theorem showing that any L^2 function from (0,1)^n to R^m can be implemented to any desired degree of accuracy with a three-layer backpropagation neural network. The author presents a speculative neurophysiological model illustrating how the backpropagation neural network architecture might plausibly be implemented in the mammalian brain for corticocortical learning between nearby regions of the cerebral cortex.

1,668 citations


Journal ArticleDOI
TL;DR: An optimality principle based upon preserving maximal information in the output units is proposed, and an algorithm for unsupervised learning based upon a Hebbian learning rule, which achieves the desired optimality, is presented.

1,554 citations


Journal ArticleDOI
TL;DR: The main result is a complete description of the landscape attached to E in terms of principal component analysis, showing that E has a unique minimum corresponding to the projection onto the subspace generated by the first principal vectors of a covariance matrix associated with the training patterns.

1,456 citations


Book
01 Jan 1989
TL;DR: Combinatorial Optimization and Boltzmann Machines, Parallel Simulated Annealing Algorithms, and Neural Computing.
Abstract: SIMULATED ANNEALING. Combinatorial Optimization. Simulated Annealing. Asymptotic Convergence. Finite-Time Approximation. Simulated Annealing in Practice. Parallel Simulated Annealing Algorithms. BOLTZMANN MACHINES. Neural Computing. Boltzmann Machines. Combinatorial Optimization and Boltzmann Machines. Classification and Boltzmann Machines. Learning and Boltzmann Machines. Appendix. Bibliography. Indices.
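The core loop the book formalizes can be sketched briefly: propose a random move, always accept improvements, accept uphill moves with probability exp(-dE/T), and cool T slowly. The toy tour problem, move set, and cooling schedule below are our own choices, not the book's examples.

```python
import math, random

# Minimal simulated-annealing sketch on an 8-city tour (points on a
# circle, so the optimal tour is the hull order). Schedule and move
# set are illustrative.

random.seed(4)
pts = [(math.cos(2 * math.pi * i / 8), math.sin(2 * math.pi * i / 8))
       for i in range(8)]
tour = list(range(8))
random.shuffle(tour)

def length(t):
    return sum(math.dist(pts[t[i]], pts[t[(i + 1) % 8]]) for i in range(8))

T = 2.0
while T > 1e-3:
    i, j = sorted(random.sample(range(8), 2))
    cand = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]   # 2-opt reversal
    dE = length(cand) - length(tour)
    if dE < 0 or random.random() < math.exp(-dE / T):      # Metropolis rule
        tour = cand
    T *= 0.995                                             # geometric cooling
```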

Book
01 Jan 1989
TL;DR: Advanced Methods in Neural Computing meets the reference needs of electronics engineers, control systems engineers, programmers, and others in scientific disciplines by explaining diverse high-performance paradigms for artificial neural networks that function effectively in real-world situations.
Abstract: From the Publisher: Following up where Neural Computing: Theory and Practice left off, this guide explains diverse high-performance paradigms for artificial neural networks (ANNs) that function effectively in real-world situations. The tutorial approach, use of standardized notation, undergraduate-level mathematics, and extensive examples explain methods for solving practical neural network engineering problems in a clear and comprehensible manner. Emphasis is given to paradigms that perform well rather than those of academic interest. Explanations of the paradigms are program-oriented and are written in algorithmic form. Self-contained chapters cover field theory methods, including Nestor's restricted coulomb energy system; probabilistic neural networks, which can increase training speed by orders of magnitude; genetic algorithms that mimic biological evolution; sparse distributed memory, a powerful associative memory paradigm, which is compatible with VLSI implementation; fuzzy logic methods that are finding widespread application in control systems; neural engineering, including a set of techniques for designing, training, and applying artificial neural systems to real-world problems; and additional chapters cover basis function methods, chaos, and automatic control. Most of the paradigms presented have been used by the author in actual applications. Paradigms that are still in the research stage, but offer great potential, are also discussed. Advanced Methods in Neural Computing meets the reference needs of electronics engineers, control systems engineers, programmers, and others in scientific disciplines.

Journal ArticleDOI
TL;DR: In this paper, each connection computes the derivative, with respect to the connection strength, of a global measure of the error in the performance of the network, and the strength is then adjusted in the direction that decreases the error.

Journal ArticleDOI
TL;DR: Concepts and analytical results from the literatures of mathematical statistics, econometrics, systems identification, and optimization theory relevant to the analysis of learning in artificial neural networks are reviewed.
Abstract: The premise of this article is that learning procedures used to train artificial neural networks are inherently statistical techniques. It follows that statistical theory can provide considerable insight into the properties, advantages, and disadvantages of different network learning methods. We review concepts and analytical results from the literatures of mathematical statistics, econometrics, systems identification, and optimization theory relevant to the analysis of learning in artificial neural networks. Because of the considerable variety of available learning procedures and necessary limitations of space, we cannot provide a comprehensive treatment. Our focus is primarily on learning procedures for feedforward networks. However, many of the concepts and issues arising in this framework are also quite broadly relevant to other network learning paradigms. In addition to providing useful insights, the material reviewed here suggests some potentially useful new training methods for artificial neural networks.

Journal ArticleDOI
TL;DR: In this paper, the authors analyzed the dynamics of continuous-time analog networks with delay, and showed that there is a critical delay above which a symmetrically connected network will oscillate.
Abstract: Continuous-time analog neural networks with symmetric connections will always converge to fixed points when the neurons have infinitely fast response, but can oscillate when a small time delay is present. Sustained oscillation resulting from time delay is relevant to hardware implementations of neural networks where delay due to the finite switching speed of amplifiers can be appreciable compared to the network relaxation time. We analyze the dynamics of continuous-time analog networks with delay, and show that there is a critical delay above which a symmetrically connected network will oscillate. Two different stability analyses are presented for low and high neuron gain. The results are useful as design criteria for building fast but stable electronic networks. We find that for some connection topologies, a delay much smaller than the relaxation time can lead to oscillation, whereas for other topologies, including associative memory networks, even long delays will not produce oscillation. The most oscillation-prone network configuration is the all-inhibitory network; in this configuration, the critical delay for oscillation is smaller than the network relaxation time by a factor of N, the size of the network. Theoretical results are compared with numerical simulations and with experiments performed on a small (eight neurons) electronic network with controllable delay.

Proceedings Article
01 Jun 1989
TL;DR: The RoboCup Keepaway Machine Learning testbed provided an excellent environment to train the authors' agents, but the problem still needed to be scaled down in order to do a feasibility study.
Abstract: RoboCup has come a long way since its creation in '97 [1] and is a respected place for machine learning researchers to try out new algorithms in a competitive fashion. RoboCup is now an international competition that draws many teams and respected researchers looking for a chance to create the best team. Originally we set out to create a team to compete in RoboCup. This was an ambitious project, and we had hopes to finish within the next year. For this semester, we chose to scale down the RoboCup team towards a smaller research area to try our learning algorithm on. The scaled-down version of the RoboCup soccer environment is known as the "Keepaway Testbed" and was started by Peter Stone, University of Texas [2]. Here the task is simple: you have two teams on the field, each with the same number of players. Instead of trying to score a goal on the opponent, the teams are given tasks; one team is labeled the keepers and the other is labeled the takers. It is the task of the keepers to maintain possession of the ball and the task of the takers to take the ball. The longer the keepers are able to maintain possession of the ball, the better the team. There are several advantages to this environment. First, it provides some of the essential characteristics of a real soccer game. Typically it is believed that if a team is able to maintain possession of the ball for long periods of time, it will win the match. Secondly, it provides realistic behavior much the same as the original RoboCup server. This is accomplished by introducing noise into the system similar to the original RoboCup, and similar to what would be received by real robots. Finally, when you want to go through the learning process, this environment is capable of stopping play once the takers have touched the ball, and the environment is capable of starting a new trial based on that occurrence.
Although the RoboCup Keepaway Machine Learning testbed provided an excellent environment to train our agents, we still needed to scale down the problem in order to do a feasibility study. Based on the Keepaway testbed, we created a simulation world with one simple task. One agent is placed into the world and has to locate the position of the goal. This can be thought of as an agent in a soccer environment needing to locate either the ball or another teammate. It was in this environment where we tested our methods for learning autonomous agents.

Journal ArticleDOI
TL;DR: A single neuron with Hebbian-type learning for the connection weights, and with nonlinear internal feedback, has been shown to extract the statistical principal components of its stationary input pattern sequence, which yields a multi-dimensional, principal component subspace.
Abstract: A single neuron with Hebbian-type learning for the connection weights, and with nonlinear internal feedback, has been shown to extract the statistical principal components of its stationary input pattern sequence. A generalization of this model to a layer of neuron units is given, called the Subspace Network, which yields a multi-dimensional, principal component subspace. This can be used as an associative memory for the input vectors or as a module in nonsupervised learning of data clusters in the input space. It is also able to realize a powerful pattern classifier based on projections on class subspaces. Some classification results for natural textures are given.
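The Hebbian-plus-feedback rule for a single unit can be sketched directly: the feedback (decay) term keeps the weight vector approximately normalized while the Hebbian term rotates it toward the direction of maximum input variance. The data distribution and learning rate below are illustrative assumptions.

```python
import numpy as np

# Single-unit sketch of Hebbian learning with normalizing feedback
# (an Oja-style rule): w converges toward the leading principal
# direction of the inputs. Data and step size are invented.

rng = np.random.default_rng(5)
# anisotropic data: most variance along the first axis
X = rng.normal(size=(2000, 3)) * np.array([3.0, 1.0, 0.5])
w = rng.normal(size=3)
w /= np.linalg.norm(w)

for x in X:
    y = w @ x                          # unit output
    w += 0.01 * y * (x - y * w)        # Hebbian term minus feedback decay

w /= np.linalg.norm(w)
# w should now lie close to +-[1, 0, 0], the top principal direction
```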

Journal ArticleDOI
TL;DR: The author extends a previous review and focuses on feed-forward neural-net classifiers for static patterns with continuous-valued inputs, examining probabilistic, hyperplane, kernel, and exemplar classifiers.
Abstract: The author extends a previous review and focuses on feed-forward neural-net classifiers for static patterns with continuous-valued inputs. He provides a taxonomy of neural-net classifiers, examining probabilistic, hyperplane, kernel, and exemplar classifiers. He then discusses back-propagation and decision-tree classifiers; matching classifier complexity to training data; GMDH (generalized method of data handling) networks and high-order nets; K nearest-neighbor classifiers; the feature-map classifier; the learning vector quantizer; hypersphere classifiers; and radial-basis function classifiers.

01 Jul 1989
TL;DR: In this article, learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, is regarded as synthesizing an approximation of a multi-dimensional function, and a regularization framework leads to a class of three-layer networks called Generalized Radial Basis Functions (GRBF).
Abstract: Learning an input-output mapping from a set of examples, of the type that many neural networks have been constructed to perform, can be regarded as synthesizing an approximation of a multi-dimensional function. We develop a theoretical framework for approximation based on regularization techniques that leads to a class of three-layer networks that we call Generalized Radial Basis Functions (GRBF). GRBF networks are not only equivalent to generalized splines, but are also closely related to several pattern recognition methods and neural network algorithms. The paper introduces several extensions and applications of the technique and discusses intriguing analogies with neurobiological data.
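With the centers fixed, fitting such a radial-basis-function approximator reduces to a regularized linear solve for the output weights. A minimal sketch; the target function, centers, width, and regularization strength are our own choices:

```python
import numpy as np

# RBF approximation sketch: Gaussian units at fixed centers, output
# weights from ridge-regularized least squares. All hyperparameters
# here are illustrative.

x = np.linspace(0.0, 2.0 * np.pi, 40)
y = np.sin(x)                                   # target mapping to learn

centers = np.linspace(0.0, 2.0 * np.pi, 10)
width = 0.8
Phi = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2.0 * width ** 2))

lam = 1e-3                                      # regularization strength
w = np.linalg.solve(Phi.T @ Phi + lam * np.eye(10), Phi.T @ y)

pred = Phi @ w
max_err = np.abs(pred - y).max()                # small on this smooth target
```

The regularization term is what distinguishes this from plain interpolation: it trades a little training error for a smoother approximant.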

Journal ArticleDOI
12 Jan 1989-Nature
TL;DR: The remarkable properties of some recent computer algorithms for neural networks seemed to promise a fresh approach to understanding the computational properties of the brain, but most of these neural nets are unrealistic in important respects.
Abstract: The remarkable properties of some recent computer algorithms for neural networks seemed to promise a fresh approach to understanding the computational properties of the brain. Unfortunately most of these neural nets are unrealistic in important respects.


Book
30 Dec 1989
TL;DR: This book discusses why neural nets are important, how they are improving, and how they can be improved in the real world.
Abstract: Introduction - Why neural nets? Principles and promises. The McCulloch and Pitts legacy. The hard learning problem. Making neurons. The secrets of Wisard. Multi-layer perceptrons. Dynamic networks. Variations. Neurocontrol. Varieties of pattern analysis. Developments in weightless systems. Trends and promises.

Journal ArticleDOI
TL;DR: In this article, an inverted pendulum is simulated as a control task with the goal of learning to balance the pendulum with no a priori knowledge of the dynamics, and reinforcement and temporal-difference learning methods are presented that deal with these issues to avoid unstable conditions.
Abstract: An inverted pendulum is simulated as a control task with the goal of learning to balance the pendulum with no a priori knowledge of the dynamics. In contrast to other applications of neural networks to the inverted pendulum task, performance feedback is assumed to be unavailable on each step, appearing only as a failure signal when the pendulum falls or reaches the bounds of a horizontal track. To solve this task, the controller must deal with issues of delayed performance evaluation, learning under uncertainty, and the learning of nonlinear functions. Reinforcement and temporal-difference learning methods are presented that deal with these issues to avoid unstable conditions and balance the pendulum.

Journal ArticleDOI
TL;DR: Further work is necessary for large-vocabulary continuous-speech problems, to develop training algorithms that progressively build internal word models, and to develop compact VLSI neural net hardware.
Abstract: The performance of current speech recognition systems is far below that of humans. Neural nets offer the potential of providing massive parallelism, adaptation, and new algorithmic approaches to problems in speech recognition. Initial studies have demonstrated that multilayer networks with time delays can provide excellent discrimination between small sets of pre-segmented difficult-to-discriminate words, consonants, and vowels. Performance for these small vocabularies has often exceeded that of more conventional approaches. Physiological front ends have provided improved recognition accuracy in noise and a cochlea filter-bank that could be used in these front ends has been implemented using micro-power analog VLSI techniques. Techniques have been developed to scale networks up in size to handle larger vocabularies, to reduce training time, and to train nets with recurrent connections. Multilayer perceptron classifiers are being integrated into conventional continuous-speech recognizers. Neural net architectures have been developed to perform the computations required by vector quantizers, static pattern classifiers, and the Viterbi decoding algorithm. Further work is necessary for large-vocabulary continuous-speech problems, to develop training algorithms that progressively build internal word models, and to develop compact VLSI neural net hardware.

Journal ArticleDOI
TL;DR: A novel modified method for obtaining approximate solutions to difficult optimization problems within the neural network paradigm is presented, which considers the graph partition and the travelling salesman problems and exhibits an impressive level of parameter insensitivity.
Abstract: A novel modified method for obtaining approximate solutions to difficult optimization problems within the neural network paradigm is presented. We consider the graph partition and the travelling salesman problems. The key new ingredient is a reduction of solution space by one dimension by using graded neurons, thereby avoiding the destructive redundancy that has plagued these problems when using straightforward neural network techniques. This approach maps the problems onto Potts glass rather than spin glass theories. A systematic prescription is given for estimating the phase transition temperatures in advance, which facilitates the choice of optimal parameters. This analysis, which is performed for both serial and synchronous updating of the mean field theory equations, makes it possible to consistently avoid chaotic behavior. When exploring this new technique numerically we find the results very encouraging; the quality of the solutions is in parity with those obtained by using optimally tuned simulated annealing heuristics. Our numerical study, which for TSP extends to 200-city problems, exhibits an impressive level of parameter insensitivity.

Proceedings Article
20 Aug 1989
TL;DR: For these problems, which have relatively few hypotheses and features, the machine learning procedures for rule induction or tree induction clearly performed best.
Abstract: Classification methods from statistical pattern recognition, neural nets, and machine learning were applied to four real-world data sets. Each of these data sets has been previously analyzed and reported in the statistical, medical, or machine learning literature. The data sets are characterized by statistical uncertainty; there is no completely accurate solution to these problems. Training and testing or resampling techniques are used to estimate the true error rates of the classification methods. Detailed attention is given to the analysis of performance of the neural nets using back propagation. For these problems, which have relatively few hypotheses and features, the machine learning procedures for rule induction or tree induction clearly performed best.

Proceedings ArticleDOI
Nguyen, Widrow
01 Jan 1989
TL;DR: In this paper, a two-layer neural network containing 26 adaptive neural elements was used to back up a computer-simulated trailer truck to a loading dock, even when initially jackknifed.
Abstract: Neural networks can be used to solve highly nonlinear control problems. A two-layer neural network containing 26 adaptive neural elements has learned to back up a computer-simulated trailer truck to a loading dock, even when initially jackknifed. It is not yet known how to design a controller to perform this steering task. Nevertheless, the neural net was able to learn of its own accord to do this, regardless of initial conditions. Experience gained with the truck backer-upper should be applicable to a wide variety of nonlinear control problems.

Journal ArticleDOI
TL;DR: Two novel methods for achieving handwritten digit recognition are described, based on a neural network chip that performs line thinning and feature extraction using local template matching and on a digital signal processor that makes extensive use of constrained automatic learning.
Abstract: Two novel methods for achieving handwritten digit recognition are described. The first method is based on a neural network chip that performs line thinning and feature extraction using local template matching. The second method is implemented on a digital signal processor and makes extensive use of constrained automatic learning. Experimental results obtained using isolated handwritten digits taken from postal zip codes, a rather difficult data set, are reported and discussed.

Proceedings Article
01 Jan 1989
TL;DR: It is shown that once the output layer of a multilayer perceptron is modified to provide mathematically correct probability distributions, and the usual squared error criterion is replaced with a probability-based score, the result is equivalent to Maximum Mutual Information training.
Abstract: One of the attractions of neural network approaches to pattern recognition is the use of a discrimination-based training method. We show that once we have modified the output layer of a multilayer perceptron to provide mathematically correct probability distributions, and replaced the usual squared error criterion with a probability-based score, the result is equivalent to Maximum Mutual Information training, which has been used successfully to improve the performance of hidden Markov models for speech recognition. If the network is specially constructed to perform the recognition computations of a given kind of stochastic model based classifier then we obtain a method for discrimination-based training of the parameters of the models. Examples include an HMM-based word discriminator, which we call an 'Alphanet'.
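The modification described, a normalized output layer scored by the log-probability of the correct class rather than squared error, can be sketched in a few lines (the logits and class index below are invented):

```python
import numpy as np

# Sketch of a probability-valued output layer with a log-probability
# training criterion. Values are illustrative.

def softmax(a):
    e = np.exp(a - a.max())            # shift for numerical stability
    return e / e.sum()

logits = np.array([2.0, 0.5, -1.0])    # pre-output activations, 3 classes
probs = softmax(logits)                # a mathematically correct distribution
target = 0                             # index of the correct class

nll = -np.log(probs[target])           # probability-based score to minimize
```

Minimizing this score over the training set pushes probability mass onto correct classes at the expense of competitors, which is the discriminative effect the abstract relates to Maximum Mutual Information training.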

Journal Article
TL;DR: The multilayer feedforward networks (MLFNs) outperformed the single-layer networks, achieving 100% accuracy on the training set and up to 98% accuracy on the testing set.