Showing papers in "IEEE Transactions on Neural Networks in 1991"


Journal ArticleDOI
TL;DR: The general regression neural network (GRNN) is a one-pass learning algorithm with a highly parallel structure that provides smooth transitions from one observed value to another.
Abstract: A memory-based network that provides estimates of continuous variables and converges to the underlying (linear or nonlinear) regression surface is described. The general regression neural network (GRNN) is a one-pass learning algorithm with a highly parallel structure. It is shown that, even with sparse data in a multidimensional measurement space, the algorithm provides smooth transitions from one observed value to another. The algorithmic form can be used for any regression problem in which an assumption of linearity is not justified.
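
As a concrete illustration of the one-pass, memory-based estimate, here is a minimal GRNN-style sketch (a kernel-weighted average of stored targets with Gaussian kernels); the bandwidth value and toy data are illustrative choices, not taken from the paper.

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma=0.5):
    """One-pass GRNN-style estimate: a kernel-weighted average of stored targets."""
    preds = []
    for x in np.atleast_2d(X_query):
        d2 = np.sum((X_train - x) ** 2, axis=1)       # squared distance to every stored sample
        w = np.exp(-d2 / (2.0 * sigma ** 2))          # Gaussian (Parzen) kernel weights
        preds.append(np.dot(w, y_train) / np.sum(w))  # smooth interpolation between observed values
    return np.array(preds)

# Toy use: noisy samples of a nonlinear curve, estimated at two query points
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(40, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=40)
print(grnn_predict(X, y, np.array([[0.0], [1.5]])))
```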

4,091 citations


Journal ArticleDOI
TL;DR: The authors propose an alternative learning procedure based on the orthogonal least-squares method, which provides a simple and efficient means for fitting radial basis function networks.
Abstract: The radial basis function network offers a viable alternative to the two-layer neural network in many applications of signal processing. A common learning algorithm for radial basis function networks is based on first choosing randomly some data points as radial basis function centers and then using singular-value decomposition to solve for the weights of the network. Such a procedure has several drawbacks, and, in particular, an arbitrary selection of centers is clearly unsatisfactory. The authors propose an alternative learning procedure based on the orthogonal least-squares method. The procedure chooses radial basis function centers one by one in a rational way until an adequate network has been constructed. In the algorithm, each selected center maximizes the increment to the explained variance or energy of the desired output and does not suffer numerical ill-conditioning problems. The orthogonal least-squares learning strategy provides a simple and efficient means for fitting radial basis function networks. This is illustrated using examples taken from two different signal processing applications.
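
The sketch below captures the greedy center-selection idea under simplifying assumptions: at each step it adds the candidate center that most reduces the residual error, re-fitting the output weights by ordinary least squares rather than using the paper's orthogonal (Gram-Schmidt) recursion; the Gaussian width and centers-from-data choice are illustrative.

```python
import numpy as np

def rbf_col(X, center, width=1.0):
    return np.exp(-np.sum((X - center) ** 2, axis=1) / (2.0 * width ** 2))

def greedy_rbf_centers(X, y, n_centers=5, width=1.0):
    """Choose centers one by one: each step keeps the training point whose basis
    function most reduces the residual sum of squares (a plain least-squares
    re-fit stands in for the paper's orthogonal recursion)."""
    chosen, cols = [], []
    for _ in range(n_centers):
        best = None
        for i, c in enumerate(X):
            if i in chosen:
                continue
            Phi = np.column_stack(cols + [rbf_col(X, c, width)])
            w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
            sse = np.sum((y - Phi @ w) ** 2)
            if best is None or sse < best[0]:
                best = (sse, i)
        chosen.append(best[1])
        cols.append(rbf_col(X, X[best[1]], width))
    Phi = np.column_stack(cols)
    w, *_ = np.linalg.lstsq(Phi, y, rcond=None)
    return X[chosen], w
```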

3,414 citations


Journal ArticleDOI
TL;DR: An extension of the backpropagation method, termed dynamic backpropagation, which can be applied in a straightforward manner for the optimization of the weights (parameters) of multilayer neural networks, is discussed.
Abstract: An extension of the backpropagation method, termed dynamic backpropagation, which can be applied in a straightforward manner for the optimization of the weights (parameters) of multilayer neural networks is discussed. The method is based on the fact that gradient methods used in linear dynamical systems can be combined with backpropagation methods for neural networks to obtain the gradient of a performance index of nonlinear dynamical systems. The method can be applied to any complex system which can be expressed as the interconnection of linear dynamical systems and multilayer neural networks. To facilitate the practical implementation of the proposed method, emphasis is placed on the diagrammatic representation of the system which generates the gradient of the performance function.

662 citations


Journal ArticleDOI
TL;DR: An overview of the current-mode approach for designing analog VLSI neural systems in subthreshold CMOS technology is presented, and emphasis is given to design techniques at the device level using the current-controlled current conveyor and the translinear principle.
Abstract: An overview of the current-mode approach for designing analog VLSI neural systems in subthreshold CMOS technology is presented. Emphasis is given to design techniques at the device level using the current-controlled current conveyor and the translinear principle. Circuits for associative memory and silicon retina systems are used as examples. The design methodology and how it relates to actual biological microcircuits are discussed.

342 citations


Journal ArticleDOI
TL;DR: A least upper bound is derived for the number of hidden neurons needed to realize an arbitrary function which maps from a finite subset of E^n into E^d, and a nontrivial lower bound is obtained for realizations of injective functions.
Abstract: Fundamental issues concerning the capability of multilayer perceptrons with one hidden layer are investigated. The studies are focused on realizations of functions which map from a finite subset of E^n into E^d. Real-valued and binary-valued functions are considered. In particular, a least upper bound is derived for the number of hidden neurons needed to realize an arbitrary function which maps from a finite subset of E^n into E^d. A nontrivial lower bound is also obtained for realizations of injective functions. This result can be applied in studies of pattern recognition and database retrieval. An upper bound is given for realizing binary-valued functions that are related to pattern-classification problems.

312 citations


Journal ArticleDOI
TL;DR: A pattern recognition system which works with the mechanism of the neocognitron, a neural network model for deformation-invariant visual pattern recognition, is discussed; the system has been trained to recognize 35 handwritten alphanumeric characters.
Abstract: A pattern recognition system which works with the mechanism of the neocognitron, a neural network model for deformation-invariant visual pattern recognition, is discussed. The neocognitron was developed by Fukushima (1980). The system has been trained to recognize 35 handwritten alphanumeric characters. The ability to recognize deformed characters correctly depends strongly on the choice of the training pattern set. Some techniques for selecting training patterns useful for deformation-invariant recognition of a large number of characters are suggested.

249 citations


Journal ArticleDOI
TL;DR: The pulse-stream technique, which represents neural states as sequences of pulses, is reviewed; generic methods for pulsed encoding, arithmetic, and intercommunication are appraised, and two contrasting synapse designs are presented and compared.
Abstract: The pulse-stream technique, which represents neural states as sequences of pulses, is reviewed. Several general issues are raised, and generic methods appraised, for pulsed encoding, arithmetic, and intercommunication schemes. Two contrasting synapse designs are presented and compared. The first is based on a fully analog computational form in which the only digital component is the signaling mechanism itself: asynchronous, pulse-rate encoded digital voltage pulses. In this circuit, multiplication occurs in the voltage/current domain. The second design uses more conventional digital memory for weight storage, with synapse circuits based on pulse stretching. Integrated circuits implementing up to 15000 analog, fully programmable synaptic connections are described. A demonstrator project is described in which a small robot localization network is implemented using asynchronous, analog, pulse-stream devices.
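
A toy software analogy of pulse-rate encoding (not a model of the paper's circuits): values in [0, 1] are represented as random pulse trains, and gating one train by another yields a post-synaptic pulse rate close to the product of the two encoded values.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 20000                                  # number of time slots in the stream

def pulse_stream(rate, n=T):
    """A pulse occurs in each slot with probability `rate` (rate-coded value in [0, 1])."""
    return rng.random(n) < rate

state = 0.7                                # neural state, encoded as a pulse rate
weight = 0.4                               # synaptic weight, encoded as a second rate
post = pulse_stream(state) & pulse_stream(weight)   # gate the state pulses by the weight stream
print(post.mean())                         # ~0.28, i.e. approximately state * weight
```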

214 citations


Journal ArticleDOI
TL;DR: A method for designing near-optimal nonlinear classifiers, based on a self-organizing technique for estimating probability density functions when only weak assumptions are made about the densities, is described.
Abstract: A method for designing near-optimal nonlinear classifiers, based on a self-organizing technique for estimating probability density functions when only weak assumptions are made about the densities, is described. The method avoids disadvantages of other existing methods by parametrizing a set of component densities from which the actual densities are constructed. The parameters of the component densities are optimized by a self-organizing algorithm, reducing to a minimum the labeling of design samples. All the required computations are realized with the simple sum-of-product units commonly used in connectionist models. The density approximations produced by the method are illustrated in two dimensions for a multispectral image classification task. The practical use of the method is illustrated by a small speech recognition problem. Related issues of invariant projections, cross-class pooling of data, and subspace partitioning are discussed.

203 citations


Journal ArticleDOI
TL;DR: A derivative of Akaike's information criterion (AIC) is given which can be used to objectively select a 'best' network for binary classification problems and can be extended to problems with an arbitrary number of classes.
Abstract: The choice of an optimal neural network design for a given problem is addressed. A relationship between optimal network design and statistical model identification is described. A derivative of Akaike's information criterion (AIC) is given. This modification yields an information statistic which can be used to objectively select a 'best' network for binary classification problems. The technique can be extended to problems with an arbitrary number of classes.
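
A rough sketch of how an information criterion drives network selection; this uses the standard AIC form (-2 log-likelihood + 2 x number of parameters) rather than the specific derivative developed in the paper, and the candidate-network interface in the comment is hypothetical.

```python
import numpy as np

def aic_binary(y_true, p_pred, n_params):
    """Standard AIC for a binary classifier that outputs class-1 probabilities."""
    eps = 1e-12                             # guard against log(0)
    log_lik = np.sum(y_true * np.log(p_pred + eps) +
                     (1 - y_true) * np.log(1 - p_pred + eps))
    return -2.0 * log_lik + 2.0 * n_params

# Hypothetical usage: among trained candidate networks, keep the smallest-AIC one.
# best = min(candidates, key=lambda net: aic_binary(y, net.predict_proba(X), net.n_weights))
```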

189 citations


Journal ArticleDOI
E.B. Baum
TL;DR: The author's algorithm is proved to PAC learn in polynomial time the class of target functions defined by layered, depth two, threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k <= 4.
Abstract: An algorithm which trains networks using examples and queries is proposed. In a query, the algorithm supplies a y and is told t(y) by an oracle. Queries appear to be available in practice for most problems of interest, e.g. by appeal to a human expert. The author's algorithm is proved to PAC learn in polynomial time the class of target functions defined by layered, depth two, threshold nets having n inputs connected to k hidden threshold units connected to one or more output units, provided k <= 4.

189 citations


Journal ArticleDOI
TL;DR: An algorithm that is faster than back-propagation and for which it is not necessary to specify the number of hidden units in advance is described, and accuracy is comparable to that for the nearest-neighbor algorithm, which is slower and requires more storage space.
Abstract: An algorithm that is faster than back-propagation and for which it is not necessary to specify the number of hidden units in advance is described. The relationship with other fast pattern-recognition algorithms, such as algorithms based on k-d trees, is discussed. The algorithm has been implemented and tested on artificial problems, such as the parity problem, and on real problems arising in speech recognition. Experimental results, including training times and recognition accuracy, are given. Generally, the algorithm achieves accuracy as good as or better than nets trained using back-propagation. Accuracy is comparable to that for the nearest-neighbor algorithm, which is slower and requires more storage space.

Journal ArticleDOI
TL;DR: The training set can be implemented with zero error with two layers and with the number of the hidden-layer neurons equal to #1 >= p-1, and the method presented exactly solves (M), the multilayer neural network training problem, for any arbitrary training set.
Abstract: A new derivation is presented for the bounds on the size of a multilayer neural network to exactly implement an arbitrary training set; namely the training set can be implemented with zero error with two layers and with the number of the hidden-layer neurons equal to #1 >= p-1. The derivation does not require the separation of the input space by particular hyperplanes, as in previous derivations. The weights for the hidden layer can be chosen almost arbitrarily, and the weights for the output layer can be found by solving #1+1 linear equations. The method presented exactly solves (M), the multilayer neural network training problem, for any arbitrary training set.
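
A small numerical illustration of the construction's flavor, under assumed choices (random tanh hidden units): with p training pairs, p-1 hidden neurons plus an output bias leave a square linear system for the output weights, which generically yields zero training error.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n_in = 8, 3                                    # p training pairs with n_in inputs each
X = rng.normal(size=(p, n_in))
y = rng.normal(size=p)

H = np.tanh(X @ rng.normal(size=(n_in, p - 1)))   # p-1 hidden units, weights chosen (almost) arbitrarily
A = np.column_stack([H, np.ones(p)])              # output bias makes p unknowns for p equations
w_out = np.linalg.solve(A, y)                     # solve the linear system for the output layer
print(np.max(np.abs(A @ w_out - y)))              # ~1e-15: the training set is implemented exactly
```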

Journal ArticleDOI
TL;DR: A modified version of the PNN (probabilistic neural network) learning phase which allows a considerable simplification of network structure by including a vector quantization of learning data is proposed and has been shown to improve the classification performance of the LVQ (learning vector quantization) procedure.
Abstract: A modified version of the PNN (probabilistic neural network) learning phase which allows a considerable simplification of network structure by including a vector quantization of learning data is proposed. It can be useful if large training sets are available. The procedure has been successfully tested in two synthetic data experiments. The proposed network has been shown to improve the classification performance of the LVQ (learning vector quantization) procedure.
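
A sketch of the general idea under assumptions: a PNN-style classifier scores each class by a sum of Gaussian kernels over stored patterns, and replacing each class's full training set with a small k-means codebook shrinks the network; the clustering routine and kernel width below are illustrative, not the paper's procedure.

```python
import numpy as np

def kmeans_codebook(X, k, iters=20, seed=0):
    """Plain k-means to shrink one class's training set to k code vectors."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((X[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return centers

def pnn_scores(x, codebooks, sigma=0.5):
    """PNN-style class scores: mean Gaussian kernel between x and each class codebook."""
    return {c: np.mean(np.exp(-((cb - x) ** 2).sum(axis=1) / (2.0 * sigma ** 2)))
            for c, cb in codebooks.items()}

# Hypothetical usage:
# codebooks = {label: kmeans_codebook(X_of_that_class, k=8) for each class}
# scores = pnn_scores(x, codebooks); predicted label = max(scores, key=scores.get)
```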

Journal ArticleDOI
TL;DR: An approach is presented for query-based neural network learning that combines a layered perceptron partially trained for binary classification with an inversion algorithm for the neural network that allows generation of the classification boundary.
Abstract: An approach is presented for query-based neural network learning. A layered perceptron partially trained for binary classification is considered. The single-output neuron is trained to be either a zero or a one. A test decision is made by thresholding the output at, for example, one-half. The set of inputs that produce an output of one-half forms the classification boundary. The authors adopted an inversion algorithm for the neural network that allows generation of this boundary. For each boundary point, the classification gradient can be generated. The gradient provides a useful measure of the steepness of the multidimensional decision surfaces. Conjugate input pairs are generated using the boundary point and gradient information and presented to an oracle for proper classification. These data are used to refine further the classification boundary, thereby increasing the classification accuracy. The result can be a significant reduction in the training set cardinality in comparison with, for example, randomly generated data points. An application example to power system security assessment is given.
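
A minimal sketch of the inversion step, assuming a small two-layer network with a sigmoid output: gradient descent on the input (with the weights frozen) drives the output toward the 0.5 decision level, producing a point on the classification boundary. The architecture and step size are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def invert_to_boundary(x0, W1, b1, w2, b2, lr=0.1, steps=500):
    """Gradient descent on the input (weights frozen) until the output reaches the
    0.5 decision level, i.e. a point on the classification boundary.
    Shapes: W1 (n_hidden, n_in), b1 (n_hidden,), w2 (n_hidden,), b2 scalar."""
    x = np.asarray(x0, dtype=float).copy()
    for _ in range(steps):
        h = np.tanh(W1 @ x + b1)
        y = sigmoid(w2 @ h + b2)
        grad_x = W1.T @ (w2 * (1.0 - h ** 2) * (y * (1.0 - y)))   # dy/dx by the chain rule
        x -= lr * (y - 0.5) * grad_x                              # descend on 0.5*(y - 0.5)^2
    return x
```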

Journal ArticleDOI
TL;DR: A neural network approach to the problem of color constancy is presented and an electronic system that is based on the original algorithm and that operates at video rates was built using subthreshold analog CMOS VLSI resistive grids.
Abstract: A neural network approach to the problem of color constancy is presented. Various algorithms based on Land's retinex theory are discussed with respect to neurobiological parallels, computational efficiency, and suitability for VLSI implementation. The efficiency of one algorithm is improved by the application of resistive grids and is tested in computer simulations; the simulations make clear the strengths and weaknesses of the algorithm. A novel extension to the algorithm is developed to address its weaknesses. An electronic system that is based on the original algorithm and that operates at video rates was built using subthreshold analog CMOS VLSI resistive grids. The system displays color constancy abilities and qualitatively mimics aspects of human color perception.

Journal ArticleDOI
TL;DR: It is rigorously established that the sequence of weight estimates can be approximated by a certain ordinary differential equation, in the sense of weak convergence of random processes as epsilon tends to zero.
Abstract: The behavior of neural network learning algorithms with a small, constant learning rate, epsilon, in stationary, random input environments is investigated. It is rigorously established that the sequence of weight estimates can be approximated by a certain ordinary differential equation, in the sense of weak convergence of random processes as epsilon tends to zero. As applications, backpropagation in feedforward architectures and some feature extraction algorithms are studied in more detail.
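
Schematically, results of this kind relate the constant-step recursion to an averaged ODE; the notation below is a generic statement of the idea, not the paper's exact theorem or conditions.

```latex
% Constant-rate weight recursion and its averaged ODE (schematic form)
w_{k+1} = w_k + \varepsilon\, H(w_k, x_k), \qquad x_k \ \text{stationary random inputs};
\qquad
w^{\varepsilon}(t) := w_{\lfloor t/\varepsilon \rfloor}
\ \xrightarrow[\ \varepsilon \to 0\ ]{\ \text{weak convergence}\ }\ \bar{w}(t),
\qquad
\frac{d\bar{w}}{dt} = \mathbb{E}_{x}\!\left[ H(\bar{w}, x) \right].
```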

Journal ArticleDOI
TL;DR: The author proposes a technique based on the idea that for most of the data, only a few dimensions of the input may be necessary to compute the desired output function, and it can be used to reduce the number of required measurements in situations where there is a cost associated with sensing.
Abstract: Nonlinear function approximation is often solved by finding a set of coefficients for a finite number of fixed nonlinear basis functions. However, if the input data are drawn from a high-dimensional space, the number of required basis functions grows exponentially with dimension, leading many to suggest the use of adaptive nonlinear basis functions whose parameters can be determined by iterative methods. The author proposes a technique based on the idea that for most of the data, only a few dimensions of the input may be necessary to compute the desired output function. Additional input dimensions are incorporated only where needed. The learning procedure grows a tree whose structure depends upon the input data and the function to be approximated. This technique has a fast learning algorithm with no local minima once the network shape is fixed, and it can be used to reduce the number of required measurements in situations where there is a cost associated with sensing. Three examples are given: controlling the dynamics of a simulated planar two-joint robot arm, predicting the dynamics of the chaotic Mackey-Glass equation, and predicting pixel values in real images from pixel values above and to the left.

Journal ArticleDOI
TL;DR: The present method provides guidelines for reducing the number of spurious states and for estimating the extent of the patterns' domains of attraction, and provides a means of implementing neural networks by serial processors and special digital hardware.
Abstract: A qualitative analysis is presented for a class of synchronous discrete-time neural networks defined on hypercubes in the state space. Analysis results are utilized to establish a design procedure for associative memories to be implemented on the present class of neural networks. To demonstrate the storage ability and flexibility of the synthesis procedure, several specific examples are considered. The design procedure has essentially the same desirable features as the results of J. Li et al. (1988, 1989) for continuous-time neural networks. For a given system dimension, networks designed by the present method may have the ability to store more patterns (as asymptotically stable equilibria) than corresponding discrete-time networks designed by other techniques. The design method guarantees the storage of all the desired patterns as asymptotically stable equilibrium points. The present method provides guidelines for reducing the number of spurious states and for estimating the extent of the patterns' domains of attraction. The present results provide a means of implementing neural networks by serial processors and special digital hardware.

Journal ArticleDOI
TL;DR: A variant of nearest-neighbor (NN) pattern classification and supervised learning by learning vector quantization (LVQ) is described and the DSM algorithm outperforms these methods with respect to error rates, learning rates, and the number of prototypes required to describe class boundaries.
Abstract: A variant of nearest-neighbor (NN) pattern classification and supervised learning by learning vector quantization (LVQ) is described. The decision surface mapping method (DSM) is a fast supervised learning algorithm and is a member of the LVQ family of algorithms. A relatively small number of prototypes are selected from a training set of correctly classified samples. The training set is then used to adapt these prototypes to map the decision surface separating the classes. This algorithm is compared with NN pattern classification, learning vector quantization, and a two-layer perceptron trained by error backpropagation. When the class boundaries are sharply defined (i.e., no classification error in the training set), the DSM algorithm outperforms these methods with respect to error rates, learning rates, and the number of prototypes required to describe class boundaries.
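
A sketch of an error-driven, LVQ-family prototype update in the spirit of the decision surface mapping idea (the exact DSM schedule and learning-rate handling are not reproduced): prototypes move only when the nearest prototype misclassifies a sample.

```python
import numpy as np

def dsm_like_update(protos, proto_labels, x, label, lr=0.05):
    """Error-driven LVQ-style step: adapt prototypes only when the nearest one
    misclassifies x; protos is an (m, d) array, proto_labels an (m,) int array."""
    d = np.sum((protos - x) ** 2, axis=1)
    nearest = np.argmin(d)
    if proto_labels[nearest] == label:
        return protos                                   # correct decision: leave the map unchanged
    protos[nearest] -= lr * (x - protos[nearest])       # repel the misclassifying prototype
    same = np.where(proto_labels == label)[0]
    best_same = same[np.argmin(d[same])]
    protos[best_same] += lr * (x - protos[best_same])   # attract the nearest correct-class prototype
    return protos
```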

Journal ArticleDOI
TL;DR: The resulting adaptively trained neural network (ATNN), based on nonlinear programming techniques, is shown to adapt to new training data that are in conflict with earlier training data without affecting the neural network's response to data elsewhere.
Abstract: A training procedure that adapts the weights of a trained layered perceptron artificial neural network to training data originating from a slowly varying nonstationary process is proposed. The resulting adaptively trained neural network (ATNN), based on nonlinear programming techniques, is shown to adapt to new training data that are in conflict with earlier training data without affecting the neural network's response to data elsewhere. The adaptive training procedure also allows for new data to be weighted in terms of their significance. The adaptive algorithm is applied to the problem of electric load forecasting and is shown to outperform the conventionally trained layered perceptron.

Journal ArticleDOI
TL;DR: The asymptotic storage capacity of the ECAM with limited dynamic range in its exponentiation nodes is found to be proportional to that dynamic range, and it meets the ultimate upper bound for the capacity of associative memories.
Abstract: A model for a class of high-capacity associative memories is presented. Since they are based on two-layer recurrent neural networks and their operations depend on the correlation measure, these associative memories are called recurrent correlation associative memories (RCAMs). The RCAMs are shown to be asymptotically stable in both synchronous and asynchronous (sequential) update modes as long as their weighting functions are continuous and monotone nondecreasing. In particular, a high-capacity RCAM named the exponential correlation associative memory (ECAM) is proposed. The asymptotic storage capacity of the ECAM scales exponentially with the length of memory patterns, and it meets the ultimate upper bound for the capacity of associative memories. The asymptotic storage capacity of the ECAM with limited dynamic range in its exponentiation nodes is found to be proportional to that dynamic range. Design and fabrication of a 3-mm CMOS ECAM chip is reported. The prototype chip can store 32 24-bit memory patterns, and its speed is higher than one associative recall operation every 3 µs. An application of the ECAM chip to vector quantization is also described.
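
A minimal sketch of the exponential-correlation recall dynamics described above, for bipolar (+/-1) patterns; the scaling constant a and the convergence check are illustrative choices.

```python
import numpy as np

def ecam_recall(probe, patterns, a=1.0, max_iters=20):
    """Recurrent recall: weight each stored +/-1 pattern by exp(a * correlation)
    with the current state, sum, and threshold; iterate until a fixed point."""
    x = probe.copy()
    for _ in range(max_iters):
        corr = patterns @ x                           # correlation with each stored pattern
        x_new = np.sign(np.exp(a * corr) @ patterns)  # exponentially weighted superposition
        x_new[x_new == 0] = 1
        if np.array_equal(x_new, x):
            break
        x = x_new
    return x

# patterns: (m, n) array of +/-1 memory vectors; probe: a noisy +/-1 vector of length n
```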

Journal ArticleDOI
TL;DR: A novel nonlinear regulator design method is presented that integrates linear optimal control techniques and nonlinear neural network learning methods; the regulator can compensate for nonlinear system uncertainties that are not considered in the LOR design and can tolerate a wider range of uncertainties than the LOR alone.
Abstract: The authors present a novel nonlinear regulator design method that integrates linear optimal control techniques and nonlinear neural network learning methods. Multilayered neural networks are used to add nonlinear effects to the linear optimal regulator (LOR). The regulator can compensate for nonlinear system uncertainties that are not considered in the LOR design and can tolerate a wider range of uncertainties than the LOR alone. The salient feature of the regulator is that the control performance is much improved by using a priori knowledge of the plant dynamics as the system equation and the corresponding LOR. Computer simulations are performed to show the applicability and the limitations of the regulator.

Journal ArticleDOI
TL;DR: It is shown that, for a class of vector quantization processes related to neural modeling, the asymptotic density Q(x) of the quantization levels in one dimension in terms of the input signal distribution P(x) is a power law Q(x) = C * P(x)^alpha, where the exponent alpha depends on the number n of neighbors on each side of a unit.
Abstract: It is shown that, for a class of vector quantization processes related to neural modeling, the asymptotic density Q(x) of the quantization levels in one dimension in terms of the input signal distribution P(x) is a power law Q(x) = C * P(x)^alpha, where the exponent alpha depends on the number n of neighbors on each side of a unit and is given by alpha = 2/3 - 1/(3n^2 + 3(n+1)^2). The asymptotic level density is calculated, and Monte Carlo simulations are presented.
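
A quick numeric check of the quoted exponent, showing how alpha rises toward 2/3 as the neighborhood size n grows:

```python
from fractions import Fraction

# Exponent quoted in the abstract: alpha = 2/3 - 1/(3n^2 + 3(n+1)^2),
# where n is the number of neighbors on each side of a unit.
for n in range(1, 6):
    alpha = Fraction(2, 3) - Fraction(1, 3 * n**2 + 3 * (n + 1) ** 2)
    print(n, alpha, float(alpha))
# n=1 gives 3/5; the exponent increases toward 2/3 as the neighborhood widens.
```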

Journal ArticleDOI
TL;DR: A classifier that incorporates both preprocessing and postprocessing procedures as well as a multilayer feedforward network (based on the back-propagation algorithm) in its design to distinguish between several major classes of radar returns including weather, birds, and aircraft is described.
Abstract: A classifier that incorporates both preprocessing and postprocessing procedures as well as a multilayer feedforward network (based on the back-propagation algorithm) in its design to distinguish between several major classes of radar returns including weather, birds, and aircraft is described. The classifier achieves an average classification accuracy of 89% on generalization for data collected during a single scan of the radar antenna. The procedures of feature selection for neural network training, the classifier design considerations, the learning algorithm development, the implementation, and the experimental results of the neural clutter classifier, which is simulated on a Warp systolic computer, are discussed. A comparative evaluation of the multilayer neural network with a traditional Bayes classifier is presented.

Journal ArticleDOI
TL;DR: A linear programming/multiple training (LP/MT) method that determines weights which satisfy the conditions when a solution is feasible is presented and the sequential multiple training (SMT) method is shown to yield integers for the weights, which are multiplicities of the training pairs.
Abstract: Necessary and sufficient conditions are derived for the weights of a generalized correlation matrix of a bidirectional associative memory (BAM) which guarantee the recall of all training pairs. A linear programming/multiple training (LP/MT) method that determines weights which satisfy the conditions when a solution is feasible is presented. The sequential multiple training (SMT) method is shown to yield integers for the weights, which are multiplicities of the training pairs. Computer simulation results, including capacity comparisons of BAM, LP/MT BAM, and SMT BAM, are presented.

Journal ArticleDOI
TL;DR: Instead of reserving one input line (as a memory) for each quantized state, the integrated technique distributively stores learned information; this reduces the required memory and makes the self-learning control scheme applicable to problems of larger size.
Abstract: A technique that integrates the cerebellar model articulation controller (CMAC) into a self-learning control scheme developed by A.G. Barto et al. (IEEE Trans. Syst., Man, Cybern., vol. SMC-13, pp. 834-846, Sept./Oct. 1983) is presented. Instead of reserving one input line (as a memory) for each quantized state, the integrated technique distributively stores learned information; this reduces the required memory and makes the self-learning control scheme applicable to problems of larger size. CMAC's capability with regard to information interpolation also helps improve the learning speed.
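
A toy sketch of the CMAC mechanism the integration relies on, assuming a 1-D input: several coarsely quantized tilings, each slightly offset, store the value distributively, so nearby inputs share most of their active cells and memory grows with the number of cells rather than with the number of distinct quantized states. The sizes and rates below are illustrative.

```python
import numpy as np

class TinyCMAC:
    """CMAC over a 1-D input in [lo, hi]: several coarse tilings, each slightly
    offset, store the learned value distributively across their active cells."""
    def __init__(self, n_tilings=8, n_cells=16, lo=0.0, hi=1.0):
        self.n_tilings, self.n_cells = n_tilings, n_cells
        self.lo = lo
        self.cell = (hi - lo) / (n_cells - 1)                    # width of one cell
        self.offsets = np.arange(n_tilings) * self.cell / n_tilings
        self.w = np.zeros((n_tilings, n_cells))

    def _active(self, x):
        idx = ((x - self.lo + self.offsets) / self.cell).astype(int)
        return np.clip(idx, 0, self.n_cells - 1)

    def predict(self, x):
        return self.w[np.arange(self.n_tilings), self._active(x)].sum()

    def update(self, x, target, lr=0.5):
        err = target - self.predict(x)                           # spread the correction over active cells
        self.w[np.arange(self.n_tilings), self._active(x)] += lr * err / self.n_tilings

# cmac = TinyCMAC(); cmac.update(0.30, 1.0); print(cmac.predict(0.30), cmac.predict(0.32))
```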

Journal ArticleDOI
TL;DR: A hierarchical approach is proposed for solving the surface and vertex correspondence problems in multiple-view-based 3D object-recognition systems and provides a more general and compact formulation of the problem and a solution more suitable for parallel implementation.
Abstract: A hierarchical approach is proposed for solving the surface and vertex correspondence problems in multiple-view-based 3D object-recognition systems. The proposed scheme is a coarse-to-fine search process, and a Hopfield network is used at each stage. Compared with conventional object-matching schemes, the proposed technique provides a more general and compact formulation of the problem and a solution more suitable for parallel implementation. At the coarse search stage, the surface matching scores between the input image and each object model in the database are computed through a Hopfield network and are used to select the candidates for further consideration. At the fine search stage, the object models selected from the previous stage are fed into another Hopfield network for vertex matching. The object model that has the best surface and vertex correspondences with the input image is finally singled out as the best matched model. Experimental results are reported using both synthetic and real range images to corroborate the proposed theory.

Journal ArticleDOI
TL;DR: It is shown that networks using large arrays of nonuniform components can perform analog computations with a much higher degree of accuracy than might be expected given the degree of variation in the network's elements.
Abstract: Experimental results from adaptive learning using an optically controlled neural network are presented. The authors have used example problems in nonlinear system identification and signal prediction, two areas of potential neural network application, to study the capabilities of analog neural hardware. These experiments investigated the effects of a variety of nonidealities typical of analog hardware systems. They show that networks using large arrays of nonuniform components can perform analog computations with a much higher degree of accuracy than might be expected given the degree of variation in the network's elements. The effects of other common nonidealities, such as noise, weight quantization, and dynamic range limitations, were also investigated.

Journal ArticleDOI
TL;DR: The authors study various techniques for obtaining this invariance with neural net classifiers and identify the invariant-feature technique as the most suitable for current neural classifiers.
Abstract: Application of neural nets to invariant pattern recognition is considered. The authors study various techniques for obtaining this invariance with neural net classifiers and identify the invariant-feature technique as the most suitable for current neural classifiers. A novel formulation of invariance in terms of constraints on the feature values leads to a general method for transforming any given feature space so that it becomes invariant to specified transformations. A case study using range imagery is used to exemplify these ideas, and good performance is obtained.

Journal ArticleDOI
TL;DR: A special class of mutually inhibitory networks is analyzed, and parameters for reliable K-winner performance are presented; restrictions on initial states are derived which ensure accurate K-winner performance when unequal external inputs are used.
Abstract: A special class of mutually inhibitory networks is analyzed, and parameters for reliable K-winner performance are presented. The network dynamics are modeled using interactive activation, and results are compared with the sigmoid model. For equal external inputs, network parameters that select the units with the larger initial activations (the network converges to the nearest stable state) are derived. Conversely, for equal initial activations, networks that select the units with larger external inputs (the network converges to the lowest energy stable state) are derived. When initial activations are mixed with external inputs, anomalous behavior results. These discrepancies are analyzed with several examples. Restrictions on initial states are derived which ensure accurate K-winner performance when unequal external inputs are used.