
Showing papers on "Deep learning" published in 1994


Book
01 Jul 1994
TL;DR: Neural nets based on competition, adaptive resonance theory, and the backpropagation neural net are studied, along with simple neural nets for pattern classification and pattern association.
Abstract: 1. Introduction. 2. Simple Neural Nets for Pattern Classification. 3. Pattern Association. 4. Neural Networks Based on Competition. 5. Adaptive Resonance Theory. 6. Backpropagation Neural Net. 7. A Sampler of Other Neural Nets. Glossary. References. Index.

2,665 citations


Book
08 Sep 1994
TL;DR: Topics include perceptron learning with a hidden layer, an object-oriented backpropagation learning model, and an adaptive conjugate gradient learning algorithm for efficient training of neural networks.
Abstract: Perceptron Learning with a Hidden Layer. An Object-Oriented Backpropagation Learning Model. Concurrent Backpropagation Learning Algorithms. An Adaptive Conjugate Gradient Learning Algorithm for Efficient Training of Neural Networks. A Concurrent Adaptive Conjugate Gradient Learning Algorithm on MIMD Shared Memory Machines. A Concurrent Genetic/Neural Network Learning Algorithm for MIMD Shared Memory Machines. A Hybrid Learning Algorithm for Distributed Memory Multicomputers. A Fuzzy Neural Network Learning Model. Appendices. References. Index.

473 citations


Proceedings ArticleDOI
01 Aug 1994
TL;DR: This paper presents an APL system for forecasting univariate time series with artificial neural networks that delivered better forecasting performance than the well-known ARIMA technique.
Abstract: Artificial neural networks are suitable for many tasks in pattern recognition and machine learning. In this paper we present an APL system for forecasting univariate time series with artificial neural networks. Unlike conventional techniques for time series analysis, an artificial neural network needs little information about the time series data and can be applied to a broad range of problems. However, the problem of network "tuning" remains: parameters of the backpropagation algorithm as well as the network topology need to be adjusted for optimal performance. For our application, we conducted experiments to find the right parameters for a forecasting network. The artificial neural networks that were found delivered better forecasting performance than the well-known ARIMA technique.

290 citations
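
The forecasting setup described in the abstract lends itself to a short illustration. The sketch below (in Python with NumPy rather than the paper's APL) trains a one-hidden-layer network with plain backpropagation to make one-step forecasts from a lag window of the series; the window length, layer size, learning rate and synthetic data are illustrative assumptions, not the paper's settings.

```python
import numpy as np

# Minimal one-step-ahead forecaster: a single hidden layer trained with
# plain backpropagation on lag windows of a synthetic series.  All sizes
# and rates are illustrative assumptions, not the paper's settings.
rng = np.random.default_rng(0)
series = np.sin(np.linspace(0, 20, 400)) + 0.1 * rng.standard_normal(400)

LAGS, HIDDEN, LR, EPOCHS = 8, 6, 0.05, 2000
X = np.array([series[i:i + LAGS] for i in range(len(series) - LAGS)])
y = series[LAGS:]

W1 = 0.1 * rng.standard_normal((LAGS, HIDDEN)); b1 = np.zeros(HIDDEN)
W2 = 0.1 * rng.standard_normal(HIDDEN); b2 = 0.0

for _ in range(EPOCHS):
    h = np.tanh(X @ W1 + b1)          # hidden activations
    pred = h @ W2 + b2                # one-step forecast
    err = pred - y                    # forecast error
    dh = np.outer(err, W2) * (1 - h ** 2)
    W2 -= LR * h.T @ err / len(y);  b2 -= LR * err.mean()
    W1 -= LR * X.T @ dh / len(y);   b1 -= LR * dh.mean(axis=0)

print("training MSE:", np.mean(err ** 2))
```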


Journal ArticleDOI
TL;DR: A series of simulations and analyses with modular neural networks is presented, suggesting a number of design principles, in the form of explicit ways in which neural modules can cooperate in recognition tasks, that may supplement recent accounts of the relation between structure and function in the brain.

289 citations


Proceedings ArticleDOI
30 May 1994
TL;DR: The search for redundant data components is performed for networks with continuous outputs and is based on the concept of sensitivity of linearized neural networks; removing such components could lead to smaller networks and reduced-size data vectors.
Abstract: Multilayer feedforward networks are often used for modeling complex relationships between data sets. Deleting unimportant data components in the training sets could lead to smaller networks and reduced-size data vectors. This can be achieved by analyzing the total disturbance of network outputs due to perturbed inputs. The search for redundant data components is performed for networks with continuous outputs and is based on the concept of sensitivity of linearized neural networks. The formalized criteria and algorithm for pruning data vectors are formulated and illustrated with examples.

216 citations
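
The abstract only names the idea of sensitivity-based pruning of input components, so the sketch below is a generic illustration rather than the paper's exact criterion: it linearizes a trained one-hidden-layer tanh network around each sample, averages the absolute output-to-input Jacobian, and ranks input components by that sensitivity. All weights and data are random placeholders.

```python
import numpy as np

def input_sensitivity(W1, b1, W2, X):
    """Rank input components by the mean absolute output sensitivity of a
    trained 1-hidden-layer tanh network, linearized around each sample.
    This is a generic illustration, not the paper's exact criterion."""
    h = np.tanh(X @ W1 + b1)
    # Jacobian of the linear output w.r.t. the inputs at each sample:
    # dy/dx_i = sum_j W1[i, j] * (1 - h_j^2) * W2[j]
    J = ((1 - h ** 2) * W2) @ W1.T         # shape: (samples, inputs)
    return np.mean(np.abs(J), axis=0)      # average sensitivity per input

# Example usage with random weights and data (placeholders only)
rng = np.random.default_rng(1)
W1, b1, W2 = rng.normal(size=(5, 4)), np.zeros(4), rng.normal(size=4)
X = rng.normal(size=(100, 5))
sens = input_sensitivity(W1, b1, W2, X)
print("candidate inputs to prune (least sensitive first):", np.argsort(sens))
```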


Proceedings Article
01 Jan 1994
TL;DR: In this paper, validity interval analysis, a generic tool for extracting symbolic knowledge by propagating rule-like knowledge through backpropagation-style neural networks, is presented; its appropriateness is illustrated in a robot arm domain with real-valued and distributed representations.
Abstract: Although artificial neural networks have been applied in a variety of real-world scenarios with remarkable success, they have often been criticized for exhibiting a low degree of human comprehensibility. Techniques that compile compact sets of symbolic rules out of artificial neural networks offer a promising perspective to overcome this obvious deficiency of neural network representations. This paper presents an approach to the extraction of if-then rules from artificial neural networks. Its key mechanism is validity interval analysis, which is a generic tool for extracting symbolic knowledge by propagating rule-like knowledge through Backpropagation-style neural networks. Empirical studies in a robot arm domain illustrate the appropriateness of the proposed method for extracting rules from networks with real-valued and distributed representations.

196 citations
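
Validity interval analysis propagates intervals rather than single values through a trained network. The fragment below sketches only the forward half of that idea, pushing an axis-aligned input interval through one linear-plus-tanh layer with interval arithmetic; the backward refinement and the rule-extraction machinery of the paper are not reproduced, and the layer weights are placeholders.

```python
import numpy as np

def propagate_interval(lo, hi, W, b):
    """Propagate an axis-aligned input interval [lo, hi] through one
    linear layer followed by tanh, using interval arithmetic.
    A sketch of the forward phase only."""
    W_pos, W_neg = np.maximum(W, 0.0), np.minimum(W, 0.0)
    out_lo = lo @ W_pos + hi @ W_neg + b   # smallest achievable pre-activation
    out_hi = hi @ W_pos + lo @ W_neg + b   # largest achievable pre-activation
    return np.tanh(out_lo), np.tanh(out_hi)  # tanh is monotone

# Example: bound the outputs of a random layer over the hypercube [0, 1]^3
rng = np.random.default_rng(2)
W, b = rng.normal(size=(3, 2)), rng.normal(size=2)
print(propagate_interval(np.zeros(3), np.ones(3), W, b))
```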


Book
01 Jan 1994
TL;DR: Topics include neural networks for speech processing, neural networks for identification and control of nonlinear systems, hybrid neural networks and image restoration, and dynamic systems and perception.
Abstract: Neural Networks for speech processing. Neural Networks in the acquisition of speech by machine. The nervous system - fantasy and reality. Processing of complex stimuli in the mammalian cochlear nucleus. On the possible role of auditory peripheral feedback in the representation of speech sounds. Is there a role for neural networks in speech recognition? The neurophysiology of word reading - a connectionist approach. Status versus stacks - representing grammatical structure in a recurrent neural network. Connections and associations in language acquisition. Some relationships between artificial neural nets and hidden Markov models. A learning neural tree for phoneme classification. Decision feedback learning of neural networks. Visual focus of attention in language acquisition. Integrated segmentation and recognition of handprinted characters. Neural net image analysis for postal applications - from locating address blocks to determining zip codes. Space invariant active vision. Engineering document processing with neural networks. Goal-oriented training of neural networks. Hybrid neural networks and image restoration. Dynamic systems and perception. Deterministic annealing for optimization. Neural networks in vision. A neural chip set for supervised learning and CAM. A discrete Radon transform method for invariant image analysis using artificial neural networks. Recurrent neural networks and sequential machines. Non-literal transfer of information among inductive learners. Neural networks for identification and control of nonlinear systems. Using neural networks to identify DNA sequences.

174 citations


09 Aug 1994
TL;DR: A massively parallel fingerprint classification system is described that uses image-based ridge-valley features, K-L transforms, and neural networks to perform pattern level classification and is capable of 95% classification accuracy with 10% rejects.
Abstract: A massively parallel fingerprint classification system is described that uses image-based ridge-valley features, K-L transforms, and neural networks to perform pattern level classification. The speed of classification is 2.65 seconds per fingerprint on a massively parallel computer. The system is capable of 95% classification accuracy with 10% rejects. All tests were performed using a sample of 4000 fingerprints, 2000 matched pairs. Finding two ridge-valley direction sets takes 0.5 seconds per image, the alignment 0.1 seconds per image, the K-L transform 20 ms per image, and the classification 1 ms per image. The image processing prior to classification takes more than 99% of total processing time; the classification time is 0.03% of the total system's time.

166 citations
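
The K-L transform mentioned in the abstract is the classical Karhunen-Loeve (principal component) projection applied to the ridge-valley direction features before classification. The sketch below shows that step in Python/NumPy; the feature and component dimensions are made up for illustration and are not those of the fingerprint system.

```python
import numpy as np

def kl_transform(X, k):
    """Karhunen-Loeve (principal component) transform: project feature
    vectors onto the top-k eigenvectors of their covariance matrix."""
    mean = X.mean(axis=0)
    eigvals, eigvecs = np.linalg.eigh(np.cov(X, rowvar=False))
    basis = eigvecs[:, np.argsort(eigvals)[::-1][:k]]
    return (X - mean) @ basis, mean, basis

# Illustrative dimensions only: compress 512-dim ridge-direction feature
# vectors to 64 K-L components before the neural network classifier.
rng = np.random.default_rng(3)
features = rng.normal(size=(200, 512))
compressed, mean, basis = kl_transform(features, 64)
print(compressed.shape)   # (200, 64)
```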


Journal ArticleDOI
TL;DR: This article explores the use of a real-valued modular genetic algorithm to evolve continuous-time recurrent neural networks capable of sequential behavior and learning and utilizes concepts from dynamical systems theory to understand the operation of some of these evolved networks.
Abstract: This article explores the use of a real-valued modular genetic algorithm to evolve continuous-time recurrent neural networks capable of sequential behavior and learning. We evolve networks that can generate a fixed sequence of outputs in response to an external trigger occurring at varying intervals of time. We also evolve networks that can learn to generate one of a set of possible sequences based on reinforcement from the environment. Finally, we utilize concepts from dynamical systems theory to understand the operation of some of these evolved networks. A novel feature of our approach is that we assume neither an a priori discretization of states or time nor an a priori learning algorithm that explicitly modifies network parameters during learning. Rather, we merely expose dynamical neural networks to tasks that require sequential behavior and learning and allow the genetic algorithm to evolve network dynamics capable of accomplishing these tasks.

156 citations
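
The networks evolved in this article follow standard continuous-time recurrent neural network (CTRNN) dynamics. The sketch below Euler-integrates those dynamics for a small network driven by an external trigger; the genetic algorithm that would actually evolve the time constants, weights and biases is omitted, and the parameter values are random placeholders.

```python
import numpy as np

def ctrnn_step(y, I, tau, W, theta, dt=0.01):
    """One Euler step of CTRNN dynamics:
        tau_i * dy_i/dt = -y_i + sum_j W[j, i] * sigmoid(y_j + theta_j) + I_i"""
    s = 1.0 / (1.0 + np.exp(-(y + theta)))
    return y + dt * (-y + W.T @ s + I) / tau

rng = np.random.default_rng(4)
n = 5
y, tau = np.zeros(n), rng.uniform(0.1, 1.0, n)
W, theta = rng.normal(size=(n, n)), rng.normal(size=n)
for t in range(1000):                      # simulate 10 time units
    I = np.zeros(n)
    I[0] = 1.0 if t == 100 else 0.0        # brief external trigger
    y = ctrnn_step(y, I, tau, W, theta)
print(y)                                   # final neuron states
```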


Journal ArticleDOI
TL;DR: The superior convergence property of the parallel hybrid neural network learning algorithm presented in this paper is demonstrated.
Abstract: A new algorithm is presented for training of multilayer feedforward neural networks by integrating a genetic algorithm with an adaptive conjugate gradient neural network learning algorithm. The parallel hybrid learning algorithm has been implemented in C on an MIMD shared memory machine (Cray Y-MP8/864 supercomputer). It has been applied to two different domains, engineering design and image recognition. The performance of the algorithm has been evaluated by applying it to three examples. The superior convergence property of the parallel hybrid neural network learning algorithm presented in this paper is demonstrated.

140 citations


Journal ArticleDOI
TL;DR: This work presents a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks and shows that rules extracted from networks trained with this heuristic are more consistent with the rules to be learned.
Abstract: Determining the architecture of a neural network is an important issue for any learning task. For recurrent neural networks no general methods exist that permit the estimation of the number of layers of hidden neurons, the size of layers or the number of weights. We present a simple pruning heuristic that significantly improves the generalization performance of trained recurrent networks. We illustrate this heuristic by training a fully recurrent neural network on positive and negative strings of a regular grammar. We also show that rules extracted from networks trained with this pruning heuristic are more consistent with the rules to be learned. This performance improvement is obtained by pruning and retraining the networks. Simulations are shown for training and pruning a recurrent neural net on strings generated by two regular grammars, a randomly-generated 10-state grammar and an 8-state, triple-parity grammar. Further simulations indicate that this pruning method can have generalization performance superior to that obtained by training with weight decay.
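
The abstract does not spell out the pruning heuristic itself, so the fragment below only sketches a generic prune-and-retrain loop, assuming magnitude-based pruning as a stand-in for the authors' rule.

```python
import numpy as np

def prune_smallest_weights(W, fraction=0.1):
    """Zero out the smallest-magnitude fraction of weights in W and return
    a mask for keeping them at zero during retraining.  Magnitude pruning
    is an assumption here; the paper's specific heuristic is not reproduced."""
    threshold = np.quantile(np.abs(W).ravel(), fraction)
    mask = np.abs(W) > threshold
    return W * mask, mask

# Typical usage: alternate pruning with retraining of the recurrent net
#   W, mask = prune_smallest_weights(W, 0.1)
#   ... retrain, multiplying weight gradients by `mask` so pruned
#       connections stay removed, then re-measure generalization ...
```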

Journal ArticleDOI
TL;DR: Compared with the neural networks, the traditional K-means procedure had fewer points misclassified, and the classification accuracy of the neural networks worsened as the number of clusters in the data increased from two to five.
Abstract: Several neural networks have been proposed in the general literature for pattern recognition and clustering, but little empirical comparison with traditional methods has been done. The results reported here compare neural networks using Kohonen learning with a traditional clustering method (K-means) in an experimental design using simulated data with known cluster solutions. Two types of neural networks were examined, both of which used unsupervised learning to perform the clustering. One used Kohonen learning with a conscience and the other used Kohonen learning without a conscience mechanism. The performance of these nets was examined with respect to changes in the number of attributes, the number of clusters, and the amount of error in the data. Generally, the K-means procedure had fewer points misclassified, while the classification accuracy of the neural networks worsened as the number of clusters in the data increased from two to five.
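
As a rough illustration of the neural side of this comparison, the sketch below implements competitive (Kohonen-style) learning with a simple conscience mechanism that handicaps units that win too often; the learning rate, bias gain and toy data are assumptions, not the study's experimental settings.

```python
import numpy as np

def kohonen_conscience(X, k, epochs=20, lr=0.1, bias_gain=10.0):
    """Competitive learning with a 'conscience': units that win too often
    are handicapped so that all units end up being used.  Parameter values
    are illustrative, not taken from the paper."""
    rng = np.random.default_rng(0)
    W = X[rng.choice(len(X), k, replace=False)].copy()   # initial prototypes
    win_freq = np.full(k, 1.0 / k)
    for _ in range(epochs):
        for x in rng.permutation(X):
            dist = np.linalg.norm(W - x, axis=1)
            biased = dist + bias_gain * (win_freq - 1.0 / k)  # conscience term
            j = np.argmin(biased)
            W[j] += lr * (x - W[j])                  # move winner toward input
            win_freq += 0.01 * ((np.arange(k) == j) - win_freq)
    return W

rng = np.random.default_rng(5)
data = np.vstack([rng.normal(c, 0.3, size=(100, 2)) for c in (0.0, 2.0, 4.0)])
print(kohonen_conscience(data, k=3))   # learned cluster prototypes
```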

Journal ArticleDOI
TL;DR: It is shown that a sufficient condition for the existence of a sparse neural network design is self feedback for every neuron in the network, and the synthesis procedure makes it possible to design in a systematic manner neural networks which store all desired memory patterns as reachable memory vectors.
Abstract: We first present results for the analysis and synthesis of a class of neural networks without any restrictions on the interconnecting structure. The class of neural networks which we consider have the structure of analog Hopfield nets and utilize saturation functions to model the neurons. Our analysis results make it possible to locate in a systematic manner all equilibrium points of the neural network and to determine the stability properties of the equilibrium points. The synthesis procedure makes it possible to design in a systematic manner neural networks (for associative memories) which store all desired memory patterns as reachable memory vectors. We generalize the above results to develop a design procedure for neural networks with sparse coefficient matrices. Our results guarantee that the synthesized neural networks have predetermined sparse interconnection structures and store any set of desired memory patterns as reachable memory vectors. We show that a sufficient condition for the existence of a sparse neural network design is self feedback for every neuron in the network. We apply our synthesis procedure to the design of cellular neural networks for associative memories. Our design procedure for neural networks with sparse interconnecting structure can take into account various problems encountered in VLSI realizations of such networks. For example, our procedure can be used to design neural networks with few or without any line-crossings resulting from the network interconnections. Several specific examples are included to demonstrate the applicability of the methodology advanced herein.
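
For orientation only, the fragment below iterates analog Hopfield-style dynamics with a saturation activation, using plain Hebbian outer-product storage with a gain in place of the paper's synthesis procedure; it illustrates associative recall but gives none of the paper's guarantees about sparsity or about storing arbitrary pattern sets.

```python
import numpy as np

def recall(W, x, steps=20):
    """Iterate analog Hopfield-style dynamics x <- sat(W x) until settled.
    The saturation activation matches the neuron model named in the
    abstract; the Hebbian-with-gain storage below is only a stand-in for
    the paper's synthesis procedure."""
    for _ in range(steps):
        x = np.clip(W @ x, -1.0, 1.0)
    return x

p1 = np.ones(8)
p2 = np.array([1, -1, 1, -1, 1, -1, 1, -1], float)     # orthogonal to p1
W = 1.3 * (np.outer(p1, p1) + np.outer(p2, p2)) / 8    # store both patterns
probe = p1.copy(); probe[1] = -1                        # corrupt one entry
print(recall(W, probe))                                 # recovers p1
```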

Journal ArticleDOI
TL;DR: An algorithm based on the backpropagation procedure is presented that dynamically configures the structure of feedforward multilayered neural networks and demonstrates its potential for control applications.

Proceedings ArticleDOI
27 Jun 1994
TL;DR: By adapting separate smoothing parameters for each dimension, the classification accuracy of the probabilistic neural network (PNN) and the estimation accuracy of the general regression neural network (GRNN) can both be greatly improved.
Abstract: By adapting separate smoothing parameters for each dimension, the classification accuracy of the probabilistic neural network (PNN) and the estimation accuracy of the general regression neural network (GRNN) can both be greatly improved. Accuracy comparisons are given for 28 databases. In addition, the dimensionality of the problem and the complexity of the network can usually be simultaneously reduced. The price to be paid for these benefits is increased training time.
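
The key idea, a separate smoothing parameter per input dimension, is easy to show for the PNN case. The sketch below classifies a point with anisotropic Gaussian Parzen kernels; how the per-dimension sigmas are adapted (the paper's actual contribution) is not reproduced, and the fixed values used here are placeholders.

```python
import numpy as np

def pnn_classify(x, train_X, train_y, sigmas):
    """Probabilistic neural network with a separate smoothing parameter per
    input dimension: each class density is a sum of anisotropic Gaussian
    kernels centred on that class's training points."""
    z = (train_X - x) / sigmas                     # per-dimension scaling
    kernel = np.exp(-0.5 * np.sum(z ** 2, axis=1))
    classes = np.unique(train_y)
    scores = [kernel[train_y == c].mean() for c in classes]
    return classes[int(np.argmax(scores))]

rng = np.random.default_rng(6)
X = np.vstack([rng.normal(0, 1, (50, 3)), rng.normal(2, 1, (50, 3))])
y = np.array([0] * 50 + [1] * 50)
sigmas = np.array([0.5, 0.5, 0.5])                 # would be tuned per dimension
print(pnn_classify(np.array([1.8, 2.1, 1.9]), X, y, sigmas))
```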

Proceedings ArticleDOI
27 Jun 1994
TL;DR: Two algorithms for the construction of pattern classifier neural architectures are proposed, a comparison with other known similar architectures is given, and simulation results are presented.
Abstract: In this paper two algorithms for the construction of pattern classifier neural architectures are proposed. A comparison with other known similar architectures is given and simulation results are presented.

BookDOI
01 Jan 1994
TL;DR: The underlying principles of many of the practical approaches developed in artificial intelligence and connectionism have been reviewed, with the goal of placing them in a common perspective and providing a unifying overview.
Abstract: Predictive learning has been traditionally studied in applied mathematics (function approximation), statistics (nonparametric regression), and engineering (pattern recognition). Recently the fields of artificial intelligence (machine learning) and connectionism (neural networks) have emerged, increasing interest in this problem, both in terms of wider application and methodological advances. This paper reviews the underlying principles of many of the practical approaches developed in these fields, with the goal of placing them in a common perspective and providing a unifying overview.

Journal ArticleDOI
TL;DR: This paper proposes a novel approach for a hybrid connectionist-hidden Markov model (HMM) speech recognition system based on the use of a neural network as vector quantizer and demonstrates how the new learning approach can be applied to multiple-feature hybrid speech recognition systems, using a joint information theory-based optimization procedure.
Abstract: This paper proposes a novel approach for a hybrid connectionist-hidden Markov model (HMM) speech recognition system based on the use of a neural network as vector quantizer. The neural network is trained with a new learning algorithm offering the following innovations. (1) It is an unsupervised learning algorithm for perceptron-like neural networks that are usually trained in the supervised mode. (2) Information theory principles are used as learning criteria, making the network especially suitable for combination with a HMM-based speech recognition system. (3) The neural network is not trained using the standard error-backpropagation algorithm but using instead a newly developed self-organizing learning approach. The use of the hybrid system with the neural vector quantizer results in a 25% error reduction compared with the same HMM system using a standard k-means vector quantizer. The training algorithm can be further refined by using a combination of unsupervised and supervised learning algorithms. Finally, it is demonstrated how the new learning approach can be applied to multiple-feature hybrid speech recognition systems, using a joint information theory-based optimization procedure for the multiple neural codebooks, resulting in a 30% error reduction.
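
For context, the sketch below shows the baseline stage the paper replaces: a k-means vector quantizer that turns acoustic feature frames into discrete codebook symbols for a discrete HMM. The information-theoretic neural network training that yields the reported error reductions is not reproduced, and the frame dimension and codebook size are illustrative.

```python
import numpy as np

def kmeans_codebook(features, k, iters=20, seed=0):
    """Baseline k-means vector quantizer: acoustic feature vectors are
    mapped to discrete codebook indices that a discrete HMM then models.
    The paper replaces this stage with an information-theoretically
    trained neural network, which is not reproduced here."""
    rng = np.random.default_rng(seed)
    codebook = features[rng.choice(len(features), k, replace=False)].copy()
    for _ in range(iters):
        d = np.linalg.norm(features[:, None, :] - codebook[None, :, :], axis=2)
        labels = d.argmin(axis=1)                    # nearest codeword per frame
        for j in range(k):
            if np.any(labels == j):
                codebook[j] = features[labels == j].mean(axis=0)
    return codebook, labels

rng = np.random.default_rng(7)
frames = rng.normal(size=(500, 12))      # e.g. 12-dim cepstral frames (illustrative)
codebook, symbols = kmeans_codebook(frames, k=64)
print(symbols[:20])                      # discrete observation sequence for the HMM
```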

Journal ArticleDOI
TL;DR: This second part of a tutorial on neural networks focuses on the Kohonen self-organising feature map and the Hopfield network; a theoretical description of each type is given.

Proceedings ArticleDOI
27 Jun 1994
TL;DR: The concept of a modular neural network structure, which is capable of clustering input patterns through unsupervised learning, and representing a self-consistent hierarchy of clusters at several levels of specificity, is introduced.
Abstract: This paper introduces the concept of a modular neural network structure, which is capable of clustering input patterns through unsupervised learning, and representing a self-consistent hierarchy of clusters at several levels of specificity. In particular, we use the ART neural network as a building block, and name our architecture SMART (for Self-consistent Modular ART). We also show some experimental results for "proof-of-concept" using the ARTMAP network, that can be seen as an implementation of a two-level SMART network.

Proceedings ArticleDOI
30 May 1994
TL;DR: A behavioral approach to the impact of errors due to faults in neural computation is analyzed and the probability of error detection at the neuron's output and at the network's outputs is derived.
Abstract: A behavioral approach to the impact of errors due to faults in neural computation is analyzed. Starting from a geometrical description of errors affecting neural values, we derive the probability of error detection at the neuron's output and at the network's outputs.

Book ChapterDOI
26 May 1994
TL;DR: In recent years, neural networks have been successfully used to attack a wide variety of difficult nonlinear regression and classification tasks, and their effectiveness, particularly for problems of high dimension as measured by the number of variables involved, has been widely documented.
Abstract: In recent years, neural networks have been successfully used to attack a wide variety of difficult nonlinear regression and classification tasks, and their effectiveness, particularly for problems of high dimension as measured by the number of variables involved, has been widely documented (Finnoff 1993).

Journal ArticleDOI
TL;DR: A new network that maps n-dimensional binary vectors into m-dimensional binary vectors using 3-layered feedforward neural networks is described; its performance may be gauged from the example that the exclusive-OR problem was solved in eight or fewer steps.

Journal ArticleDOI
TL;DR: An example of applying neural networks to time series analysis and prediction, using the backpropagation algorithm to train layered, feed-forward networks to model a complex, non-linear time series.
Abstract: Neural network algorithms have been shown to provide good solutions for a variety of non-linear optimization problems, ranging from classification to function approximation in high-dimensional space. These algorithms are capable of "learning" a target function from a set of "training examples" without strong assumptions about the function. In this paper we show an example of applying neural networks to time series analysis and prediction. The backpropagation algorithm is used to train layered, feed-forward networks to model a complex, non-linear time series. A general state space formulation is adopted to analyze the problem and a Cascaded Method is used to predict multiple steps into the future. A fast parallel implementation of backpropagation on the Connection Machine allowed us to do extensive exploratory data analysis to search for good neural net predictive models on large data sets.
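
The "Cascaded Method" for multi-step prediction presumably feeds each one-step forecast back in as an input for the next step; the sketch below implements that iterated scheme around an arbitrary one-step model, with the stand-in model, lag window and data chosen purely for illustration.

```python
import numpy as np

def cascaded_forecast(one_step_model, history, n_steps, lags):
    """Iterated multi-step forecasting: feed each one-step prediction back
    in as an input for the next step.  This is an assumption about what
    the paper's 'Cascaded Method' does, based on the abstract."""
    window = list(history[-lags:])
    out = []
    for _ in range(n_steps):
        nxt = one_step_model(np.array(window))
        out.append(nxt)
        window = window[1:] + [nxt]       # slide the lag window forward
    return np.array(out)

# Example with a trivial stand-in for a trained one-step network
model = lambda w: float(0.6 * w[-1] + 0.4 * w[-2])   # weights the last two lags
series = np.sin(np.linspace(0, 6, 60))
print(cascaded_forecast(model, series, n_steps=5, lags=8))
```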

Journal ArticleDOI
TL;DR: It is shown here that for many applications standard data-fitting and approximation techniques are much better than neural networks, in the sense of giving more accurate results with fewer adjustable parameters.

Journal ArticleDOI
TL;DR: These basic computational concepts are reviewed in this paper with the purposes of providing mathematical continuity to seemingly disparate techniques, establishing basic mathematical limitations on the applicability of existing techniques, and discerning fundamental questions facing the classification field.
Abstract: A large number of algorithms have been developed for classification and recognition. These algorithms can be divided into three major paradigms: statistical pattern recognition, neural networks, and model-based vision. Neural networks embody an especially rich field of approaches based on a variety of architectures, learning mechanisms, biological and algorithmic motivations, and application areas. Mathematical analysis of these approaches and paradigms reveals that there are only a few computational concepts permeating all the diverse approaches and serving as a basis for all paradigms and algorithms for classification and recognition.

Journal ArticleDOI
TL;DR: A family of methods for flexible regression and discrimination was developed in multivariate statistics, and tree-induction methods have been developed in both machine learning and statistics.
Abstract: Feed-forward neural networks—also known as multi-layer perceptrons—are now widely used for regression and classification. In parallel but slightly earlier, a family of methods for flexible regression and discrimination were developed in multivariate statistics, and tree-induction methods have been developed in both machine learning and statistics. We expound and compare these approaches in the context of a number of examples.

Journal ArticleDOI
TL;DR: It is demonstrated that the proposed learning algorithm and the network architecture provide stable and accurate tracking performance; the issue of robustness of the controller to system parameter variations, as well as to measurement disturbances, is also addressed.

Journal ArticleDOI
TL;DR: This work has explored the use of principal component analysis for data pre-processing prior to classification of stellar spectra with a non-linear neural network; this significantly enhances classification replicability, network stability, and convergence.

Proceedings ArticleDOI
01 Dec 1994
TL;DR: It is shown that for some special neurons (corresponding to wavelets), neural networks are optimal approximators in the sense that they require (asymptotically) the smallest possible number of bits.
Abstract: Neural networks are universal approximators. For example, it has been proved (K. Hornik et al., 1989) that for every ε > 0 an arbitrary continuous function on a compact set can be ε-approximated by a 3-layer neural network. This and other results prove that in principle, any function (e.g., any control) can be implemented by an appropriate neural network. But why neural networks? In addition to neural networks, an arbitrary continuous function can also be approximated by polynomials, etc. What is so special about neural networks that makes them preferable approximators? To compare different approximators, one can compare the number of bits that we must store in order to be able to reconstruct a function with a given precision ε. For neural networks, we must store weights and thresholds. For polynomials, we must store coefficients, etc. We consider functions of one variable, and show that for some special neurons (corresponding to wavelets), neural networks are optimal approximators in the sense that they require (asymptotically) the smallest possible number of bits.