
Showing papers in "IEEE Transactions on Neural Networks in 1990"


Journal ArticleDOI
TL;DR: It is demonstrated that neural networks can be used effectively for the identification and control of nonlinear dynamical systems and the models introduced are practically feasible.
Abstract: It is demonstrated that neural networks can be used effectively for the identification and control of nonlinear dynamical systems. The emphasis is on models for both identification and control. Static and dynamic backpropagation methods for the adjustment of parameters are discussed. In the models that are introduced, multilayer and recurrent networks are interconnected in novel configurations, and hence there is a real need to study them in a unified fashion. Simulation results reveal that the identification and adaptive control schemes suggested are practically feasible. Basic concepts and definitions are introduced throughout, and theoretical questions that have to be addressed are also described. >

7,692 citations
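
As context for the identification models discussed above, here is a minimal sketch (not code from the paper) of a series-parallel identification setup: a feedforward network is trained to predict the next plant output from the measured past output and input, so that static backpropagation suffices for parameter adjustment. The plant function, network sizes, and the use of scikit-learn's MLPRegressor are illustrative choices, not the paper's.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def plant(y, u):
    """Illustrative nonlinear plant (an assumption for this sketch):
    y(k+1) = y(k)/(1 + y(k)^2) + u(k)^3."""
    return y / (1.0 + y ** 2) + u ** 3

# Excite the plant with random inputs to collect identification data.
rng = np.random.default_rng(0)
u = rng.uniform(-1.0, 1.0, 2000)
y = np.zeros(len(u) + 1)
for k in range(len(u)):
    y[k + 1] = plant(y[k], u[k])

# Series-parallel identification model: predict y(k+1) from the *measured*
# y(k) and u(k), so the network stays feedforward and static
# backpropagation applies.
X = np.column_stack([y[:-1], u])
targets = y[1:]
model = MLPRegressor(hidden_layer_sizes=(20, 10), max_iter=2000, random_state=0)
model.fit(X, targets)
```

In a parallel (recurrent) configuration the network would be fed its own past predictions instead of the measured outputs, which is where dynamic backpropagation methods become necessary.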


Journal ArticleDOI
TL;DR: The multilayer perceptron, when trained as a classifier using backpropagation, is shown to approximate the Bayes optimal discriminant function.
Abstract: The multilayer perceptron, when trained as a classifier using backpropagation, is shown to approximate the Bayes optimal discriminant function. The result is demonstrated for both the two-class problem and multiple classes. It is shown that the outputs of the multilayer perceptron approximate the a posteriori probability functions of the classes being trained. The proof applies to any number of layers and any type of unit activation function, linear or nonlinear. >

866 citations
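
The key step behind this result can be sketched in two lines. Assuming 1-of-K target coding (y_k equal to 1 for the true class, 0 otherwise) and a network output F_k(x), the mean-squared error decomposes so that its minimizer over all functions is the conditional expectation of the target, which is exactly the a posteriori class probability:

```latex
% For targets y_k in {0,1} (1-of-K coding) and network output F_k(x):
\begin{align*}
E\!\left[(F_k(x) - y_k)^2\right]
  &= E_x\!\left[\bigl(F_k(x) - E[y_k \mid x]\bigr)^2\right]
   + E_x\!\left[\operatorname{Var}(y_k \mid x)\right],\\
\text{so}\quad \arg\min_{F_k} E\!\left[(F_k(x) - y_k)^2\right]
  &= E[y_k \mid x] \;=\; P(\text{class } k \mid x).
\end{align*}
```

The paper's contribution is showing that a multilayer perceptron trained with backpropagation approximates this minimizer for any number of layers and any activation function.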


Journal ArticleDOI
TL;DR: Shadow arrays are introduced which keep track of the incremental changes to the synaptic weights during a single pass of back-propagation learning; the synapses are then ordered by decreasing sensitivity numbers so that the network can be efficiently pruned by discarding the last items of the sorted list.
Abstract: The sensitivity of the global error (cost) function to the inclusion/exclusion of each synapse in the artificial neural network is estimated. Shadow arrays are introduced which keep track of the incremental changes to the synaptic weights during a single pass of back-propagation learning. The synapses are then ordered by decreasing sensitivity numbers so that the network can be efficiently pruned by discarding the last items of the sorted list. Unlike previous approaches, this simple procedure does not require a modification of the cost function, does not interfere with the learning process, and demands a negligible computational overhead. >

684 citations
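
A rough sketch of the shadow-array bookkeeping follows. The exact sensitivity estimator in the paper is not reproduced here; this version simply accumulates, for every weight, the product of its gradient and its update during ordinary training as a first-order estimate of how much the cost changed on that weight's account, then prunes the weights with the smallest accumulated magnitude. Function names and the pruning fraction are illustrative.

```python
import numpy as np

def train_with_shadow(W, grad_fn, lr=0.1, epochs=100):
    """Gradient-descent training that also maintains a 'shadow' array.

    The shadow array accumulates grad * weight_update for every weight on
    every step, i.e. a running first-order estimate of how much the cost
    changed on account of each individual weight.  (This is an illustrative
    reading of the shadow-array idea, not the paper's exact estimator.)
    """
    shadow = np.zeros_like(W)
    for _ in range(epochs):
        g = grad_fn(W)              # dE/dW at the current weights
        dW = -lr * g                # plain gradient-descent / backprop update
        shadow += g * dW            # track each weight's incremental effect
        W = W + dW
    return W, shadow

def prune(W, shadow, fraction=0.5):
    """Zero out the weights with the smallest accumulated sensitivity."""
    sens = np.abs(shadow).ravel()
    cutoff = np.sort(sens)[int(fraction * sens.size)]
    mask = (np.abs(shadow) >= cutoff).astype(W.dtype)
    return W * mask
```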


Journal ArticleDOI
TL;DR: The heart of these algorithms is the pocket algorithm, a modification of perceptron learning that makes perceptron learning well-behaved with nonseparable training data, even if the data are noisy and contradictory.
Abstract: A key task for connectionist research is the development and analysis of learning algorithms. An examination is made of several supervised learning algorithms for single-cell and network models. The heart of these algorithms is the pocket algorithm, a modification of perceptron learning that makes perceptron learning well-behaved with nonseparable training data, even if the data are noisy and contradictory. Features of these algorithms include speed, i.e. algorithms fast enough to handle large sets of training data; network scaling properties, i.e. network methods scale up almost as well as single-cell models when the number of inputs is increased; analytic tractability, i.e. upper bounds on classification error are derivable; online learning, i.e. some variants can learn continually, without referring to previous data; and winner-take-all groups or choice groups, i.e. algorithms can be adapted to select one out of a number of possible classifications. These learning algorithms are suitable for applications in machine learning, pattern recognition, and connectionist expert systems. >

529 citations
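
For concreteness, a compact sketch of the pocket algorithm (ratchet variant) is given below; the data layout, random sampling, and stopping rule are assumptions for this sketch, not the paper's exact formulation.

```python
import numpy as np

def pocket_perceptron(X, y, epochs=10000, rng=np.random.default_rng(0)):
    """Pocket algorithm sketch (ratchet variant), assuming y in {-1, +1}
    and X of shape (n_samples, n_features) with a bias column included.

    Ordinary perceptron updates run as usual; a separate 'pocket' copy of
    the weights is kept and only replaced when the current weights have
    both a longer run of consecutive correct classifications and a higher
    training accuracy than the pocketed ones.
    """
    n, d = X.shape
    w = np.zeros(d)
    pocket_w = w.copy()
    pocket_run, pocket_correct, run = 0, 0, 0
    for _ in range(epochs):
        i = rng.integers(n)                       # pick a random training example
        if y[i] * (X[i] @ w) > 0:                 # correctly classified
            run += 1
            correct = int((np.sign(X @ w) == y).sum())
            if run > pocket_run and correct > pocket_correct:
                pocket_w, pocket_run, pocket_correct = w.copy(), run, correct
        else:                                     # misclassified: perceptron step
            w = w + y[i] * X[i]
            run = 0
    return pocket_w
```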


Journal ArticleDOI
TL;DR: Two methods for classification based on the Bayes strategy and nonparametric estimators for probability density functions are reviewed: the probabilistic neural network (PNN) and the polynomial Adaline.
Abstract: Two methods for classification based on the Bayes strategy and nonparametric estimators for probability density functions are reviewed. The two methods are named the probabilistic neural network (PNN) and the polynomial Adaline. Both methods involve one-pass learning algorithms that can be implemented directly in parallel neural network architectures. The performances of the two methods are compared with multipass backpropagation networks, and relative advantages and disadvantages are discussed. PNN and the polynomial Adaline are complementary techniques because they implement the same decision boundaries but have different advantages for applications. PNN is easy to use and is extremely fast for moderate-sized databases. For very large databases and for mature applications in which classification speed is more important than training speed, the polynomial equivalent can be found. >

524 citations
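
A minimal sketch of the PNN decision rule described above, assuming equal class priors and a single Gaussian smoothing parameter sigma (the names and defaults here are illustrative):

```python
import numpy as np

def pnn_classify(x, X_train, y_train, sigma=0.5):
    """One-pass probabilistic neural network (PNN) sketch.

    'Training' is just storing the examples; classification sums a Gaussian
    Parzen kernel over the stored examples of each class and picks the class
    with the largest kernel-density estimate (equal priors assumed).
    """
    scores = {}
    for c in np.unique(y_train):
        Xc = X_train[y_train == c]
        d2 = np.sum((Xc - x) ** 2, axis=1)   # squared distances to class-c patterns
        scores[c] = np.mean(np.exp(-d2 / (2.0 * sigma ** 2)))
    return max(scores, key=scores.get)
```

The one-pass character is visible here: building the classifier is just storing the patterns, and all the work happens at classification time, which is why PNN is fast to train but slower per query on very large databases.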


Journal ArticleDOI
TL;DR: A description is given of 11 papers from the April 1990 special issue on neural networks in control systems of IEEE Control Systems Magazine, on the design of associative memories using feedback neural networks and the modeling of nonlinear chemical systems using neural networks.
Abstract: A description is given of 11 papers from the April 1990 special issue on neural networks in control systems of IEEE Control Systems Magazine. The emphasis was on presenting as varied and current a picture as possible of the use of neural networks in control. The papers described cover: the design of associative memories using feedback neural networks; a method to use neural networks to control highly nonlinear systems; the modeling of nonlinear chemical systems using neural networks; the identification of dynamical systems; the comparison of conventional adaptive controllers and neural-network-based controllers; a method to provide adaptive control for nonlinear systems; neural networks and back-propagation; the back-propagation algorithm; the use of trained neural networks to regulate the pitch attitude of an underwater telerobot; the control of mobile robots; and the issues involved in integrating neural networks and knowledge-based systems. >

462 citations


Journal ArticleDOI
TL;DR: In this paper, the behavior of the Hopfield model as a content-addressable memory and as a method of solving the traveling salesman problem (TSP) is analyzed based on the geometry of the subspace set up by the degenerate eigenvalues of the connection matrix.
Abstract: An analysis is made of the behavior of the Hopfield model as a content-addressable memory (CAM) and as a method of solving the traveling salesman problem (TSP). The analysis is based on the geometry of the subspace set up by the degenerate eigenvalues of the connection matrix. The dynamic equation is shown to be equivalent to a projection of the input vector onto this subspace. In the case of content-addressable memory, it is shown that spurious fixed points can occur at any corner of the hypercube that is on or near the subspace spanned by the memory vectors. It is also analyzed why the network can frequently converge to an invalid solution when applied to the traveling salesman problem energy function. With the resulting expressions, the network can be made robust and can reliably solve the traveling salesman problem with tour sizes of 50 cities or more. >

395 citations
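
The paper's subject is the analysis rather than a new algorithm, but for reference, here is a minimal sketch of the Hopfield content-addressable memory being analyzed, with outer-product (Hebbian) storage and asynchronous threshold updates; the spurious fixed points the analysis characterizes are exactly the extra stable states such a recall loop can land in.

```python
import numpy as np

def hopfield_store(patterns):
    """Outer-product (Hebbian) connection matrix for +/-1 patterns."""
    P = np.asarray(patterns, dtype=float)      # shape: (num_patterns, n)
    W = P.T @ P / P.shape[1]
    np.fill_diagonal(W, 0.0)                   # no self-connections
    return W

def hopfield_recall(W, probe, steps=1000, rng=np.random.default_rng(0)):
    """Asynchronous updates until a fixed point (possibly a spurious one)."""
    s = np.array(probe, dtype=float)
    for _ in range(steps):
        i = rng.integers(len(s))
        s[i] = 1.0 if W[i] @ s >= 0 else -1.0
    return s
```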


Journal ArticleDOI
N.E. Cotter
TL;DR: The Stone-Weierstrass theorem is reviewed, and a modified logistic network satisfying the theorem is proposed as an alternative to commonly used networks based on logistic squashing functions.
Abstract: The Stone-Weierstrass theorem and its terminology are reviewed, and neural network architectures based on this theorem are presented. Specifically, exponential functions, polynomials, partial fractions, and Boolean functions are used to create networks capable of approximating arbitrary bounded measurable functions. A modified logistic network satisfying the theorem is proposed as an alternative to commonly used networks based on logistic squashing functions. >

358 citations


Journal ArticleDOI
TL;DR: Two innovations are discussed: dynamic weighting of the input signals at each input of each cell, which improves the ordering when very different input signals are used, and definition of neighborhoods in the learning algorithm by the minimum spanning tree, which provides a far better and faster approximation of prominently structured density functions.
Abstract: Self-organizing maps have a bearing on traditional vector quantization. A characteristic that makes them more closely resemble certain biological brain maps, however, is the spatial order of their responses, which is formed in the learning process. A discussion is presented of the basic algorithms and two innovations: dynamic weighting of the input signals at each input of each cell, which improves the ordering when very different input signals are used, and definition of neighborhoods in the learning algorithm by the minimal spanning tree, which provides a far better and faster approximation of prominently structured density functions. It is cautioned that if the maps are used for pattern recognition and decision processes, it is necessary to fine-tune the reference vectors so that they directly define the decision borders. >

337 citations
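
A minimal sketch of the basic self-organizing map update the discussion starts from (plain Gaussian neighborhood with decaying radius; the paper's two innovations, dynamic input weighting and minimum-spanning-tree neighborhoods, are not shown, and the grid size and schedules are illustrative):

```python
import numpy as np

def train_som(X, grid_shape=(10, 10), epochs=20, lr0=0.5, radius0=3.0,
              rng=np.random.default_rng(0)):
    """Basic Kohonen self-organizing map sketch."""
    rows, cols = grid_shape
    W = rng.random((rows, cols, X.shape[1]))
    coords = np.stack(np.meshgrid(np.arange(rows), np.arange(cols),
                                  indexing="ij"), axis=-1)
    n_steps, t = epochs * len(X), 0
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            lr = lr0 * (1 - t / n_steps)                      # decaying learning rate
            radius = radius0 * (1 - t / n_steps) + 1e-9       # shrinking neighborhood
            d = np.linalg.norm(W - x, axis=-1)                # distance to every unit
            winner = np.unravel_index(np.argmin(d), d.shape)  # best-matching unit
            grid_d = np.linalg.norm(coords - np.array(winner), axis=-1)
            h = np.exp(-(grid_d ** 2) / (2 * radius ** 2))    # neighborhood function
            W += lr * h[..., None] * (x - W)                  # move units toward input
            t += 1
    return W
```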


Journal ArticleDOI
TL;DR: A review is presented of ATR (automatic target recognition), and some of the highlights of neural network technology developments that have the potential for making a significant impact on ATR are presented.
Abstract: A review is presented of ATR (automatic target recognition), and some of the highlights of neural network technology developments that have the potential for making a significant impact on ATR are presented. In particular, neural network technology developments in the areas of collective computation, learning algorithms, expert systems, and neurocomputer hardware could provide crucial tools for developing improved algorithms and computational hardware for ATR. The discussion covers previous ATR system efforts, ATR issues and needs, early vision and collective computation, learning and adaptation for ATR, feature extraction, higher vision and expert systems, and neurocomputer hardware. >

315 citations


Journal ArticleDOI
Eric A. Wan
TL;DR: The relationship between minimizing a mean squared error and finding the optimal Bayesian classifier is reviewed and a number of confidence measures are proposed to evaluate the performance of the neural network classifier within a statistical framework.
Abstract: The relationship between minimizing a mean squared error and finding the optimal Bayesian classifier is reviewed. This provides a theoretical interpretation for the process by which neural networks are used in classification. A number of confidence measures are proposed to evaluate the performance of the neural network classifier within a statistical framework. >

Journal ArticleDOI
TL;DR: An extension of T. Kohonen's (1982) self-organizing mapping algorithm together with an error-correction scheme based on the Widrow-Hoff learning rule is applied to develop a learning algorithm for the visuomotor coordination of a simulated robot arm.
Abstract: An extension of T. Kohonen's (1982) self-organizing mapping algorithm together with an error-correction scheme based on the Widrow-Hoff learning rule is applied to develop a learning algorithm for the visuomotor coordination of a simulated robot arm. Learning occurs by a sequence of trial movements without the need for an external teacher. Using input signals from a pair of cameras, the closed-loop robot arm system is able to reduce its positioning error to about 0.3% of the linear dimensions of its work space. This is achieved by choosing the connectivity of a three-dimensional lattice consisting of the units of the neural net. >

Journal ArticleDOI
TL;DR: An approximation is derived which expresses the probability of error for an output neuron of a large network (a network with many neurons per layer) as a function of the percentage change in the weights.
Abstract: An analysis is made of the sensitivity of feedforward layered networks of Adaline elements (threshold logic units) to weight errors. An approximation is derived which expresses the probability of error for an output neuron of a large network (a network with many neurons per layer) as a function of the percentage change in the weights. As would be expected, the probability of error increases with the number of layers in the network and with the percentage change in the weights. The probability of error is essentially independent of the number of weights per neuron and of the number of neurons per layer, as long as these numbers are large (on the order of 100 or more). >

Journal ArticleDOI
TL;DR: The temperature behavior of MFA during bipartitioning is analyzed and shown to have an impact on the tuning of neural networks for improved performance, and a new modification to MFA is presented that supports partitioning of random or structured graphs into three or more bins-a problem that has previously shown resistance to solution by neural networks.
Abstract: A new algorithm, mean field annealing (MFA), is applied to the graph-partitioning problem. The MFA algorithm combines characteristics of the simulated-annealing algorithm and the Hopfield neural network. MFA exhibits the rapid convergence of the neural network while preserving the solution quality afforded by simulated annealing (SA). The rate of convergence of MFA on graph bipartitioning problems is 10-100 times that of SA, with nearly equal quality of solutions. A new modification to mean field annealing is also presented which supports partitioning graphs into three or more bins, a problem which has previously shown resistance to solution by neural networks. The temperature behavior of MFA during graph partitioning is analyzed approximately and shown to possess a critical temperature at which most of the optimization occurs. This temperature is analogous to the gain of the neurons in a neural network and can be used to tune such networks for better performance. The value of the repulsion penalty needed to force MFA (or a neural network) to divide a graph into equal-sized pieces is also estimated. >
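
A small sketch of mean field annealing on the bipartitioning case, to make the update concrete. Each vertex carries a continuous spin updated through a tanh of its mean field, and the temperature is lowered on a fixed schedule; the coupling construction, penalty value, and schedule below are illustrative assumptions, not the paper's tuned settings.

```python
import numpy as np

def mfa_bipartition(A, T0=10.0, T_min=0.05, cooling=0.9, balance=1.0, sweeps=50):
    """Mean-field-annealing sketch for graph bipartitioning.

    A is a symmetric 0/1 adjacency matrix.  Each vertex i carries a
    continuous spin v_i in [-1, 1]; the sign of v_i at the end assigns the
    vertex to one of the two bins.  The couplings reward keeping adjacent
    vertices in the same bin and penalize imbalance ('balance' plays the
    role of the repulsion penalty discussed in the paper).
    """
    n = A.shape[0]
    J = A - balance * (np.ones((n, n)) - np.eye(n))    # attraction minus repulsion
    v = 0.01 * (np.random.default_rng(0).random(n) - 0.5)
    T = T0
    while T > T_min:
        for _ in range(sweeps):
            for i in range(n):
                field = J[i] @ v                       # mean field seen by spin i
                v[i] = np.tanh(field / T)              # mean-field update
        T *= cooling                                   # anneal the temperature
    return np.sign(v)
```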

Journal ArticleDOI
TL;DR: The authors present single- and multispeaker recognition results for the voiced stop consonants /b, d, g/ using time-delay neural networks (TDNN), a new objective function for training these networks, and a simple arbitration scheme for improved classification accuracy.
Abstract: Single-speaker and multispeaker recognition results are presented for the voiced stop consonants /b,d,g/ using time-delay neural networks (TDNNs) with a number of enhancements, including a new objective function for training these networks. The new objective function, called the classification figure of merit (CFM), differs markedly from the traditional mean-squared-error (MSE) objective function and the related cross-entropy (CE) objective function. Where the MSE and CE objective functions seek to minimize the difference between each output node and its ideal activation, the CFM function seeks to maximize the difference between the output activation of the node representing the correct classification and the activations of the nodes representing incorrect classifications. A simple arbitration mechanism is used with all three objective functions to achieve a median 30% reduction in the number of misclassifications when compared to TDNNs trained with the traditional MSE back-propagation objective function alone. >
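
The contrast with MSE can be made concrete with a toy objective of the CFM type. The exact parameterization used in the paper may differ; the sketch below simply rewards a large gap between the correct output node's activation and each incorrect node's activation through a sigmoid.

```python
import numpy as np

def cfm_objective(outputs, correct_idx, beta=4.0):
    """Classification-figure-of-merit style objective (illustrative form only).

    Instead of pushing every output toward a 0/1 target as MSE does, the
    objective looks at the difference between the correct node's activation
    and each incorrect node's activation, and rewards making that difference
    large.  Training maximizes this value (or minimizes its negative).
    """
    z = np.asarray(outputs, dtype=float)
    diffs = z[correct_idx] - np.delete(z, correct_idx)   # positive when correctly ranked
    return np.mean(1.0 / (1.0 + np.exp(-beta * diffs)))
```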

Journal ArticleDOI
TL;DR: In representative computer simulations, multiple training has been shown to lead to an improvement over the original Kosko strategy for recall of multiple pairs as well, and theorems underlying the results are presented.
Abstract: Enhancements of the encoding strategy of a discrete bidirectional associative memory (BAM) reported by B. Kosko (1987) are presented. There are two major concepts in this work: multiple training, which can be guaranteed to achieve recall of a single trained pair under suitable initial conditions of data, and dummy augmentation, which can be guaranteed to achieve recall of all trained pairs if attaching dummy data to the training pairs is allowable. In representative computer simulations, multiple training has been shown to lead to an improvement over the original Kosko strategy for recall of multiple pairs as well. A sufficient condition for a correlation matrix to make the energies of the training pairs be local minima is discussed. The use of the multiple training and dummy augmentation concepts is illustrated, and theorems underlying the results are presented. >
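
A minimal BAM sketch showing how multiple training enters the encoding: the correlation matrix is a sum of outer products of the bipolar training pairs, and training a pair multiple times simply weights its outer product more heavily. The recall loop below is a simplified version that always thresholds at zero; names and defaults are illustrative.

```python
import numpy as np

def bam_matrix(pairs, multiplicity=None):
    """Bidirectional associative memory encoding with 'multiple training'.

    pairs is a list of (x, y) bipolar (+/-1) vectors.  Kosko's original
    encoding sums the outer products x y^T once per pair; multiple training
    adds a chosen pair q times, which (for large enough q) guarantees that
    pair becomes recallable.
    """
    q = multiplicity or [1] * len(pairs)
    return sum(qi * np.outer(x, y) for qi, (x, y) in zip(q, pairs))

def bam_recall(W, x, iterations=20):
    """Alternate thresholded passes through W and W^T until stable."""
    x = np.asarray(x, dtype=float)
    for _ in range(iterations):
        y = np.where(x @ W >= 0, 1.0, -1.0)
        x = np.where(W @ y >= 0, 1.0, -1.0)
    return x, y
```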

Journal ArticleDOI
TL;DR: A new hybrid unsupervised-learning law, called the differential competitive law, which uses the signal velocity as a local unsupervised reinforcement mechanism, is introduced, and its coding and stability behavior in feedforward and feedback networks is studied.
Abstract: A new hybrid learning law, the differential competitive law, which uses the neuronal signal velocity as a local unsupervised reinforcement mechanism, is introduced, and its coding and stability behavior in feedforward and feedback networks is examined. This analysis is facilitated by the recent Gluck-Parker pulse-coding interpretation of signal functions in differential Hebbian learning systems. The second-order behavior of RABAM (random adaptive bidirectional associative memory) Brownian-diffusion systems is summarized by the RABAM noise suppression theorem: the mean-squared activation and synaptic velocities decrease exponentially quickly to their lower bounds, the instantaneous noise variances driving the system. This result is extended to the RABAM annealing model, which provides a unified framework from which to analyze Geman-Hwang combinatorial optimization dynamical systems and continuous Boltzmann machine learning. >

Journal ArticleDOI
TL;DR: A learning method that uses neural networks for service quality control in the asynchronous transfer mode (ATM) communications network is described and a training data selection method called the leaky pattern table method is proposed to learn precise relations.
Abstract: A learning method that uses neural networks for service quality control in the asynchronous transfer mode (ATM) communications network is described. Because the precise characteristics of the source traffic are not known and the service quality requirements change over time, building an efficient network controller which can control the network traffic is a difficult task. The proposed ATM network controller uses backpropagation neural networks for learning the relations between the offered traffic and service quality. The neural network is adaptive and easy to implement. A training data selection method called the leaky pattern table method is proposed to learn precise relations. The performance of the proposed controller is evaluated by simulation of basic call admission models. >

Journal ArticleDOI
TL;DR: A new algorithm called the self-organizing neural network (SONN) is introduced and its use is demonstrated in a system identification task, where it yields a simpler, more accurate model requiring less training data and fewer epochs than backpropagation.
Abstract: A new algorithm called the self-organizing neural network (SONN) is introduced. Its use is demonstrated in a system identification task. The algorithm constructs a network, chooses the node functions, and adjusts the weights. It is compared to the backpropagation algorithm in the identification of a chaotic time series. The results show that SONN constructs a simpler, more accurate model, requiring less training data and fewer epochs. The algorithm can also be applied as a classifier. >

Journal ArticleDOI
TL;DR: A novel derivation is presented of T. Kohonen's topographic mapping training algorithm, based upon an extension of the Linde-Buzo-Gray algorithm for vector quantizer design, which stabilizes the internal representations chosen by a network against anticipated noise or distortion processes.
Abstract: A novel derivation is presented of T. Kohonen's topographic mapping training algorithm (Self-Organization and Associative Memory, 1984), based upon an extension of the Linde-Buzo-Gray (LBG) algorithm for vector quantizer design. Thus a vector quantizer is designed by minimizing an L2 reconstruction distortion measure, including an additional contribution from the effect of code noise which corrupts the output of the vector quantizer. The neighborhood updating scheme of Kohonen's topographic mapping training algorithm emerges as a special case of this code noise model. This formulation of Kohonen's algorithm is a specific instance of the robust hidden layer principle, which stabilizes the internal representations chosen by a network against anticipated noise or distortion processes. >

Journal ArticleDOI
TL;DR: A new neural-network architecture called the parallel, self-organizing, hierarchical neural network (PSHNN) is presented, which has many desirable properties, such as optimized system complexity, high classification accuracy, minimized learning and recall times, and truly parallel architectures in which all stages operate simultaneously without waiting for data from other stages during testing.
Abstract: A new neural-network architecture called the parallel, self-organizing, hierarchical neural network (PSHNN) is presented. The new architecture involves a number of stages in which each stage can be a particular neural network (SNN). At the end of each stage, error detection is carried out, and a number of input vectors are rejected. Between two stages there is a nonlinear transformation of input vectors rejected by the previous stage. The new architecture has many desirable properties, such as optimized system complexity (in the sense of minimized self-organizing number of stages), high classification accuracy, minimized learning and recall times, and truly parallel architectures in which all stages operate simultaneously without waiting for data from other stages during testing. The experiments performed indicated the superiority of the new architecture over multilayered networks with back-propagation training. >

Journal ArticleDOI
TL;DR: A parallel algorithm for tiling with polyominoes is presented and can be used for placement of components or cells in a very large-scale integrated circuit (VLSI) chip, designing and compacting printed circuit boards, and solving a variety of two- or three-dimensional packing problems.
Abstract: A parallel algorithm for tiling with polyominoes is presented. The tiling problem is to pack polyominoes in a finite checkerboard. The algorithm, using l*m*n processing elements, requires O(1) time, where l is the number of different kinds of polyominoes and m*n is the size of the checkerboard. The algorithm can be used for placement of components or cells in a very large-scale integrated circuit (VLSI) chip, designing and compacting printed circuit boards, and solving a variety of two- or three-dimensional packing problems. >

Journal ArticleDOI
TL;DR: A theory and computer simulation of a neural controller that learns to move and position a link carrying an unforeseen payload accurately are presented.
Abstract: A theory and computer simulation of a neural controller that learns to move and position a link carrying an unforeseen payload accurately are presented. The neural controller learns adaptive dynamic control from its own experience. It does not use information about link mass, link length, or direction of gravity, and it uses only indirect uncalibrated information about payload and actuator limits. Its average positioning accuracy across a large range of payloads after learning is 3% of the positioning range. This neural controller can be used as a basis for coordinating any number of sensory inputs with limbs of any number of joints. The feedforward nature of control allows parallel implementation in real time across multiple joints. >

Journal ArticleDOI
TL;DR: Model-free learning for synchronous and asynchronous quasi-static networks is presented, which allows for integrated, on-chip learning in large analog and optical networks.
Abstract: Model-free learning for synchronous and asynchronous quasi-static networks is presented. The network weights are continuously perturbed, while the time-varying performance index is measured and correlated with the perturbation signals; the correlation output determines the changes in the weights. The perturbation may be either via noise sources or orthogonal signals. The invariance to detailed network structure mitigates large variability between supposedly identical networks as well as implementation defects. This local, regular, and completely distributed mechanism requires no central control and involves only a few global signals. Thus, it allows for integrated, on-chip learning in large analog and optical networks. >
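
A single update of this perturbation-correlation scheme can be sketched as follows; the Gaussian perturbation, step sizes, and function names are illustrative assumptions rather than the paper's hardware-oriented formulation.

```python
import numpy as np

def perturbation_step(w, performance, sigma=1e-3, lr=0.1,
                      rng=np.random.default_rng(0)):
    """One step of model-free learning by weight perturbation (sketch).

    All weights are perturbed simultaneously with small zero-mean noise,
    the change in the measured performance index is correlated with the
    perturbation, and that correlation determines the weight update.  No
    knowledge of the network's internal structure is needed, which is why
    the scheme suits analog or optical hardware.
    """
    delta = sigma * rng.standard_normal(w.shape)     # perturbation signal
    dJ = performance(w + delta) - performance(w)     # measured change in cost
    grad_estimate = dJ * delta / (sigma ** 2)        # correlate change with noise
    return w - lr * grad_estimate                    # descend the estimated gradient
```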

Journal ArticleDOI
TL;DR: The simulator discovered several solutions which are more stable structures, in a sequence of 359 bases from the potato spindle tuber viroid, than previously proposed structures.
Abstract: A parallel algorithm for finding a near-maximum independent set in a circle graph is presented. An independent set in a graph is a set of vertices, no two of which are adjacent. A maximum independent set is an independent set whose cardinality is the largest among all independent sets of a graph. The algorithm is modified for predicting the secondary structure in ribonucleic acids (RNA). The proposed system, composed of an array of n neural processing elements (where n is the number of edges in the circle graph, i.e. the number of possible base pairs), not only generates a near-maximum independent set but also predicts the secondary structure of ribonucleic acids within several hundred iteration steps. For a sequence of 359 bases from the potato spindle tuber viroid, the simulator discovered several solutions that are more stable structures than previously proposed ones. >

Journal ArticleDOI
TL;DR: A perceptron learning algorithm may be viewed as a steepest-descent method whereby an instantaneous performance function is iteratively minimized and the update term of the algorithm is the gradient of this function.
Abstract: A perceptron learning algorithm may be viewed as a steepest-descent method whereby an instantaneous performance function is iteratively minimized. An appropriate performance function for the most widely used perceptron algorithm is described and it is shown that the update term of the algorithm is the gradient of this function. An example is given of the corresponding performance surface based on Gaussian assumptions and it is shown that there is an infinity of stationary points. The performance surfaces of two related performance functions are examined. Computer simulations that demonstrate the convergence properties of the adaptive algorithms are given. >
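
One standard way to write such an instantaneous performance function (the paper's exact choice may differ) is the perceptron criterion over the currently misclassified set M(w); its gradient reproduces the familiar update:

```latex
% Perceptron criterion over the misclassified set M(w), with y_i in {-1,+1}:
\begin{align*}
J(w) &= \sum_{i \in \mathcal{M}(w)} \bigl(-\, y_i\, w^{\top} x_i \bigr) \;\ge\; 0,\\
\nabla_w J(w) &= -\sum_{i \in \mathcal{M}(w)} y_i\, x_i,
\qquad
w_{k+1} = w_k + \eta \sum_{i \in \mathcal{M}(w_k)} y_i\, x_i,
\end{align*}
% which, applied one misclassified sample at a time, is the classical
% perceptron update rule.
```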

Journal ArticleDOI
TL;DR: Analytical results are supported by Monte Carlo simulation runs which indicate that the detection capability of the proposed neural receiver is not sensitive to the level of training or number of patterns in the training set.
Abstract: The M-input optimum likelihood-ratio receiver is generalized by considering the case of different signal amplitudes on the receiver primary input lines. Using the more general likelihood-ratio receiver as a reference, an equivalent optimum multilayer perceptron neural network (or neural receiver) is identified for detecting the presence of an M-dimensional target signal corrupted by bandlimited white Gaussian noise. Analytical results are supported by Monte Carlo simulation runs which indicate that the detection capability of the proposed neural receiver is not sensitive to the level of training or number of patterns in the training set. >

Journal ArticleDOI
TL;DR: A fault tolerant neural network is described, patterned after the trellis graph description of convolutional codes and is able to tolerate errors in its inputs and failures of constituent neurons.
Abstract: Relationships between locally interconnected neural networks that use receptive field representations and trellis or convolutional codes are explored. A fault tolerant neural network is described. It is patterned after the trellis graph description of convolutional codes and is able to tolerate errors in its inputs and failures of constituent neurons. This network incorporates learning, which adds failure tolerance; the network is able to modify its connection weights and internal representation so that spare neurons can replace neurons which fail. A brief review of trellis-coding concepts is included. >

Journal ArticleDOI
TL;DR: The minimal number of times for using a pair for training to guarantee recall of that pair among a set of training pairs is derived for a bidirectional associative memory.
Abstract: The minimal number of times for using a pair for training to guarantee recall of that pair among a set of training pairs is derived for a bidirectional associative memory. >

Journal ArticleDOI
TL;DR: An associative neural network whose architecture is greatly influenced by biological data is described and constitutes a good mathematical prototype to analyze the properties of modularity, recurrent connections, and feedback.
Abstract: An associative neural network whose architecture is greatly influenced by biological data is described. The proposed neural network is significantly different in architecture and connectivity from previous models. Its emphasis is on high parallelism and modularity. The network connectivity is enriched by recurrent connections within the modules. Each module is, effectively, a Hopfield net. Connections within a module are plastic and are modified by associative learning. Connections between modules are fixed and thus not subject to learning. Although the network is tested with character recognition, it cannot be directly used as such for real-world applications. It must be incorporated as a module in a more complex structure. The architectural principles of the proposed network model can be used in the design of other modules of a whole system. Its architecture is such that it constitutes a good mathematical prototype to analyze the properties of modularity, recurrent connections, and feedback. The model does not make any contribution to the subject of learning in neural networks. >