
Showing papers on "Deep learning" published in 2001


Journal ArticleDOI
TL;DR: Results show that neural networks are valuable tools for modeling and forecasting nonlinear time series while traditional linear methods are not as competent for this task.

335 citations


Journal ArticleDOI
TL;DR: A Real-Coded Genetic Algorithm is presented that uses the appropriate operators for this encoding type to train Recurrent Neural Networks and is compared with the Real-Time Recurrent Learning algorithm to perform the fuzzy grammatical inference.

234 citations


01 Jan 2001
TL;DR: Backpropagation learning is described for feedforward networks, adapted to suit the authors' (probabilistic) modeling needs, and extended to cover recurrent networks.
Abstract: This paper provides guidance to some of the concepts surrounding recurrent neural networks. Contrary to feedforward networks, recurrent networks can be sensitive to, and be adapted to, past inputs. Backpropagation learning is described for feedforward networks, adapted to suit our (probabilistic) modeling needs, and extended to cover recurrent networks. The aim of this brief paper is to set the scene for applying and understanding recurrent neural networks.

230 citations



Proceedings ArticleDOI
15 Jul 2001
TL;DR: A supervised learning algorithm is derived for a spiking neural network which encodes information in the timing of spike trains; the algorithm is similar to the classical error backpropagation algorithm for sigmoidal neural networks, but the learning parameter is adaptively changed.
Abstract: We derive a supervised learning algorithm for a spiking neural network which encodes information in the timing of spike trains. This algorithm is similar to the classical error backpropagation algorithm for sigmoidal neural networks, but the learning parameter is adaptively changed. The algorithm is applied to a complex nonlinear classification problem and the results show that the spiking neural network is capable of performing nonlinearly separable classification tasks. Several issues concerning the spiking neural network are discussed.

104 citations


Journal ArticleDOI
TL;DR: An investigation has been made into the use of stochastic arithmetic to implement an artificial neural network solution to a typical pattern recognition application, with results indicating an order of magnitude improvement over the floating-point implementation assuming clock frequency parity.
Abstract: For pt. I see ibid., p.891-905. An investigation has been made into the use of stochastic arithmetic to implement an artificial neural network solution to a typical pattern recognition application. Optical character recognition is performed on very noisy characters in the E-13B MICR font. The artificial neural network is composed of two layers, the first layer being a set of soft competitive learning subnetworks and the second a set of fully connected linear output neurons. The observed number of clock cycles in the stochastic case represents an order of magnitude improvement over the floating-point implementation assuming clock frequency parity. Network generalization capabilities were also compared based on the network squared error as a function of the amount of noise added to the input patterns. The stochastic network maintains a squared error within 10 percent of that of the floating-point implementation for a wide range of noise levels.

96 citations
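
As a rough illustration of the stochastic-arithmetic idea behind this entry (not the E-13B MICR hardware design itself), the sketch below encodes values as Bernoulli bit-streams so that multiplication reduces to a bit-wise AND; accuracy improves with stream length, which is the trade-off exploited for cheap neural hardware. The stream length and test values are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def to_stream(p, n_bits=4096):
    """Unipolar stochastic encoding: a value p in [0, 1] becomes a Bernoulli bit-stream."""
    return rng.random(n_bits) < p

def stochastic_mul(a, b, n_bits=4096):
    """With unipolar streams, multiplication is just a bit-wise AND of the streams."""
    product_stream = to_stream(a, n_bits) & to_stream(b, n_bits)
    return product_stream.mean()          # decode: fraction of 1s estimates a*b

print(stochastic_mul(0.6, 0.5))           # ~0.30; the estimate sharpens as n_bits grows
```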


Proceedings ArticleDOI
15 Jul 2001
TL;DR: This work observes that the model learns to generate melodies according to composition rules on tonality and rhythm with interesting variations and finds a neural network that maximizes the chance of generating good melodies.
Abstract: Music composition is a domain well-suited for evolutionary reinforcement learning. Instead of applying explicit composition rules, a neural network is used to generate melodies. An evolutionary algorithm is used to find a neural network that maximizes the chance of generating good melodies. Composition rules on tonality and rhythm are used as a fitness function for the evolution. We observe that the model learns to generate melodies according to these rules with interesting variations.

87 citations
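
A toy sketch of the evolve-generate-score cycle described in this entry, under heavy assumptions: the "network" is reduced to a softmax distribution over pitches, and the fitness encodes only two stand-in composition rules (notes in C major, small melodic intervals). It illustrates the loop, not the authors' network or rule set.

```python
import numpy as np

rng = np.random.default_rng(1)
C_MAJOR = {0, 2, 4, 5, 7, 9, 11}                 # allowed pitch classes (tonality rule)

def fitness(melody):
    """Stand-in composition rules: reward in-scale notes and small melodic steps."""
    in_scale = sum(n % 12 in C_MAJOR for n in melody)
    smooth = sum(abs(a - b) <= 4 for a, b in zip(melody, melody[1:]))
    return in_scale + smooth

def generate(weights, length=16):
    """Minimal melody generator: sample notes from a softmax over two octaves."""
    p = np.exp(weights - weights.max())
    return rng.choice(weights.size, size=length, p=p / p.sum())

# (1+1) evolution strategy over the generator's parameters
w, best = rng.normal(size=24), -np.inf
for _ in range(2000):
    w_new = w + 0.1 * rng.normal(size=24)
    score = np.mean([fitness(generate(w_new)) for _ in range(5)])
    if score >= best:
        w, best = w_new, score
print("best average fitness:", best)
```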


BookDOI
01 Jan 2001
TL;DR: Learning and Other Plasticity Phenomena, and Complex Systems Dynamics.
Abstract: Foundations of Connectionism and Biophysical Models of Neurons.- Dendrites: The Last-Generation Computers.- Homogeneity in the Electrical Activity Pattern as a Function of Intercellular Coupling in Cell Networks.- A Realistic Computational Model of the Local Circuitry of the Cuneate Nucleus.- Algorithmic Extraction of Morphological Statistics from Electronic Archives of Neuroanatomy.- What Can We Compute with Lateral Inhibition Circuits?.- Neuronal Models with Current Inputs.- Decoding the Population Responses of Retinal Ganglions Cells Using Information Theory.- Numerical Study of Effects of Co-transmission by Substance P and Acetylcholine on Synaptic Plasticity in Myenteric Neurons.- Neurobiological Modeling of Bursting Response During Visual Attention.- Sensitivity of Simulated Striate Neurons to Cross-Like Stimuli Based on Disinhibitory Mechanism.- Synchronisation Mechanisms in Neuronal Networks.- Detection of Oriented Repetitive Alternating Patterns in color Images.- Synchronization in Brain - Assessment by Electroencephalographic Signals.- Strategies for the Optimization of Large Scale Networks of Integrate and Fire Neurons.- Structural and Functional Models of Neurons.- A Neural Network Model of Working Memory (Processing of "What" and "Where" Information).- Orientation Selectivity of Intracortical Inhibitory Cells in the Striate Visual Cortex: A Computational Theory and a Neural Circuitry.- Interpreting Neural Networks in the Frame of the Logic of Lukasiewicz.- Time-Dispersive Effects in the J. Gonzalo's Research on Cerebral Dynamics.- Verifying Properties of Neural Networks.- Algorithms and Implementation Architectures for Hebbian Neural Networks.- The Hierarchical Neuro-Fuzzy BSP Model: An Application in Electric Load Forecasting.- The Chemical Metaphor in Neural Computation.- The General Neural-Network Paradigm for Visual Cryptography.- ?-DTB, Discrete Time Backpropagation with Product Units.- Neocognitron-Type Network for Recognizing Rotated and Shifted Patterns with Reduction of Resources.- Classification with Synaptic Radial Basis Units.- A Randomized Hypercolumn Model and Gesture Recognition.- Heterogeneous Kohonen Networks.- Divided-Data Analysis in a Financial Case Classification with Multi-dendritic Neural Networks.- Neuro Fuzzy Systems: State-of-the-Art Modeling Techniques.- Generating Linear Regression Rules from Neural Networks Using Local Least Squares Approximation.- Speech Recognition Using Fuzzy Second-Order Recurrent Neural Networks.- A Measure of Noise Immunity for Functional Networks.- A Functional-Neural Network for Post-Nonlinear Independent Component Analysis.- Optimal Modular Feedfroward Neural Nets Based on Functional Network Architectures.- Optimal Transformations in Multiple Linear Regression Using Functional Networks.- Learning and Other Plasticity Phenomena, and Complex Systems Dynamics.- Generalization Error and Training Error at Singularities of Multilayer Perceptrons.- Bistable Gradient Neural Networks: Their Computational Properties.- Inductive Bias in Recurrent Neural Networks.- Accelerating the Convergence of EM-Based Training Algorithms for RBF Networks.- Expansive and Competitive Neural Networks.- Fast Function Approximation with Hierarchical Neural Networks and Their Application to a Reinforcement Learning Agent.- Two Dimensional Evaluation Reinforcement Learning.- Comparing the Learning Processes of Cognitive Distance Learning and Search Based Agent.- Selective Learning for Multilayer Feedforward Neural Networks.- Connectionist 
Models of Cortico-Basal Ganglia Adaptive Neural Networks During Learning of Motor Sequential Procedures.- Practical Consideration on Generalization Property of Natural Gradient Learning.- Novel Training Algorithm Based on Quadratic Optimisation Using Neural Networks.- Non-symmetric Support Vector Machines.- Natural Gradient Learning in NLDA Networks.- AUTOWISARD: Unsupervised Modes for the WISARD.- Neural Steering: Difficult and Impossible Sequential Problems for Gradient Descent.- Analysis of Scaling Exponents of Waken and Sleeping Stage in EEG.- Model Based Predictive Control Using Genetic Algorithms. Application to Greenhouses Climate Control.- Nonlinear Parametric Model Identification with Genetic Algorithms. Application to a Thermal Process.- A Comparison of Several Evolutionary Heuristics for the Frequency Assignment Problem.- GA Techniques Applied to Contour Search in Images of Bovine Livestock.- Richer Network Dynamics of Intrinsically Non-regular Neurons Measured through Mutual Information.- RBF Neural Networks, Multiobjective Optimization and Time Series Forecasting.- Evolving RBF Neural Networks.- Evolutionary Cellular Configurations for Designing Feed-Forward Neural Networks Architectures.- A Recurrent Multivalued Neural Network for the N-Queens Problem.- A Novel Approach to Self-Adaptation of Neuro-Fuzzy Controllers in Real Time.- Expert Mutation Operators for the Evolution of Radial Basis Function Neural Networks.- Studying Neural Networks of Bifurcating Recursive Processing Elements - Quantitative Methods for Architecture Design.- Topology-Preserving Elastic Nets.- Optimization with Linear Constraints in the Neural Network.- Optimizing RBF Networks with Cooperative/Competitive Evolution of Units and Fuzzy Rules.- Study of Chaos in a Simple Discrete Recurrence Neural Network.- Genetic Algorithm versus Scatter Search and Solving Hard MAX-W-SAT Problems.- A New Approach to Evolutionary Computation: Segregative Genetic Algorithms (SEGA).- Evolution of Firms in Complex Worlds: Generalized NK Model.- Learning Adaptive Parameters with Restricted Genetic Optimization Method.- Solving NP-Complete Problems with Networks of Evolutionary Processors.- Using SOM for Neural Network Visualization.- Comparison of Supervised Self-Organizing Maps Using Euclidian or Mahalanobis Distance in Classification Context.- Introducing Multi-objective Optimization in Cooperative Coevolution of Neural Networks.- STAR - Sparsity through Automated Rejection.- Ordinal Regression with K-SVCR Machines.- Large Margin Nearest Neighbor Classifiers.- Reduced Support Vector Selection by Linear Programs.- Edge Detection in Noisy Images Using the Support Vector Machines.- Initialization in Genetic Algorithms for Constraint Satisfaction Problems.- Evolving High-Posterior Self-Organizing Maps.- Using Statistical Techniques to Predict GA Performance.- Multilevel Genetic Algorithm for the Complete Development of ANN.- Graph Based GP Applied to Dynamical Systems Modeling.- Nonlinear System Dynamics in the Normalisation Process of a Self-Organising Neural Network for Combinatorial Optimisation.- Continuous Function Optimisation via Gradient Descent on a Neural Network Approxmiation Function.- An Evolutionary Algorithm for the Design of Hybrid Fiber Optic-Coaxial Cable Networks in Small Urban Areas.- Channel Assignment for Mobile Communications Using Stochastic Chaotic Simulated Annealing.- Artificial Intelligence and Cognitive Processes.- Seeing is Believing: Depictive Neuromodelling of Visual Awareness.- DIAGEN-WebDB: A 
Connectionist Approach to Medical Knowledge Representation and Inference.- Conceptual Spaces as Voltage Maps.- Determining Hyper-planes to Generate Symbolic Rules.- Automatic Symbolic Modelling of Co-evolutionarily Learned Robot Skills.- ANNs and the Neural Basis for General Intelligence.- Knowledge and Intelligence.- Conjecturing the Cognitive Plausibility of an ANN Theorem-Prover.

77 citations


Journal ArticleDOI
TL;DR: A technical framework is presented to assess the impact of re-sampling on the ability of a supervised learning algorithm to correctly learn a classification problem, using the bootstrap expression of the prediction error to identify the optimal re-sampling proportions in binary classification experiments using artificial neural networks.

59 citations
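
The TL;DR does not spell out which bootstrap expression of the prediction error is used; a common choice is Efron's 0.632 estimate, sketched below as a hypothetical illustration of the machinery (the model, data, and number of replicates are placeholders, not the paper's setup).

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def bootstrap_632_error(make_model, X, y, n_boot=50, seed=0):
    """Efron's 0.632 bootstrap estimate of the prediction error of a classifier."""
    rng = np.random.default_rng(seed)
    n = len(X)
    train_err, oob_err = [], []
    for _ in range(n_boot):
        idx = rng.integers(0, n, n)                    # resample with replacement
        oob = np.setdiff1d(np.arange(n), idx)          # out-of-bag patterns
        if oob.size == 0:
            continue
        model = make_model().fit(X[idx], y[idx])
        train_err.append(1 - model.score(X[idx], y[idx]))
        oob_err.append(1 - model.score(X[oob], y[oob]))
    return 0.368 * np.mean(train_err) + 0.632 * np.mean(oob_err)

# e.g. bootstrap_632_error(lambda: MLPClassifier(max_iter=1000), X, y)
```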


Proceedings ArticleDOI
15 Jul 2001
TL;DR: One of the systems, based on the long short-term memory neural network, developed a learning algorithm that could learn any two-dimensional quadratic function (from a set of such functions) after only 30 training examples.
Abstract: Introduces gradient descent methods applied to meta-learning (learning how to learn) in neural networks. Meta-learning has been of interest in the machine learning field for decades because of its appealing applications to intelligent agents, non-stationary time series, autonomous robots, and improved learning algorithms. Many previous neural network-based approaches toward meta-learning have been based on evolutionary methods. We show how to use gradient descent for meta-learning in recurrent neural networks. Based on previous work on fixed-weight learning neural networks, we hypothesize that any recurrent network topology and its corresponding learning algorithm(s) is a potential meta-learning system. We tested several recurrent neural network topologies and their corresponding forms of backpropagation for their ability to meta-learn. One of our systems, based on the long short-term memory neural network, developed a learning algorithm that could learn any two-dimensional quadratic function (from a set of such functions) after only 30 training examples.

57 citations
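
The sketch below illustrates the episode format this kind of gradient-based meta-learning relies on: within each episode the recurrent network receives the current input together with the previous target, so it can infer the underlying function from the examples already shown. The LSTM size, the optimizer, and the quadratic family are assumptions; this is not the authors' exact topology.

```python
import torch
import torch.nn as nn

class MetaLearner(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        # input = (x1, x2, previous target), so the function can be inferred in-context
        self.lstm = nn.LSTM(input_size=3, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x, y_prev):
        h, _ = self.lstm(torch.cat([x, y_prev], dim=-1))
        return self.head(h)

def sample_episode(n=30):
    """One episode: a randomly drawn 2-D quadratic (a placeholder family)."""
    a, b, c = torch.randn(3)
    x = torch.rand(1, n, 2)
    y = a * x[..., :1] ** 2 + b * x[..., 1:] ** 2 + c * x[..., :1] * x[..., 1:]
    return x, y

model = MetaLearner()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    x, y = sample_episode()
    y_prev = torch.cat([torch.zeros(1, 1, 1), y[:, :-1]], dim=1)   # shift targets by one step
    loss = nn.functional.mse_loss(model(x, y_prev), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```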



Book ChapterDOI
01 Jan 2001
TL;DR: The chapter shows how to solve the dynamics of fully connected as well as extremely diluted networks, emphasizing the crucial issue of the presence (or absence) of synaptic symmetry, and compares the predictions of the (exact) generating functional formalism to both numerical simulations and simple approximate theories.
Abstract: This chapter focuses on solving the dynamics of recurrent neural networks using nonequilibrium statistical mechanical techniques. It introduces recurrent neural networks and their properties. The chapter starts with relatively simple networks, with a small number of attractors (such as systems with uniform synapses or with a small number of patterns stored with Hebbian-type rules), which can be solved with relatively simple mathematical techniques. The chapter shows how to solve the dynamics of fully connected as well as extremely diluted networks, emphasizing the crucial issue of the presence (or absence) of synaptic symmetry, and compares the predictions of the (exact) generating functional formalism to both numerical simulations and simple approximate theories. Any finite degree of synaptic symmetry, whether in a fully connected or in an extremely diluted attractor network, immediately generates an effective retarded self-interaction in the dynamics that is responsible for highly nontrivial "glassy" dynamics.

Proceedings ArticleDOI
15 Jul 2001
TL;DR: This work explores the use of discrete-time recurrent neural networks for part-of-speech disambiguation of textual corpora, achieving performance at least similar to that of a standard hidden Markov model trained using the Baum-Welch algorithm.
Abstract: Explores the use of discrete-time recurrent neural networks for part-of-speech disambiguation of textual corpora. Our approach does not need a hand-tagged text for training the tagger, probably making it the first neural approach to do so. Preliminary results show that the performance of this approach is, at least, similar to that of a standard hidden Markov model trained using the Baum-Welch algorithm.

Journal ArticleDOI
Eung Sup Jun, Jae Kyu Lee
TL;DR: To find the quasi-optimal model from the hierarchy of reduced neural network models, this work adopted the beam search technique and devised the case-set selection algorithm, and it is shown that the resulting model significantly outperforms the original full model for the software effort estimation.
Abstract: A number of software effort estimation attempts have been made using statistical models, case-based reasoning, and neural networks. The research results showed that the neural network models perform at least as well as the other approaches, so we selected the neural network model as the estimator. However, since the computing environment changes so rapidly in terms of programming languages, development tools, and methodologies, it is very difficult to maintain the performance of estimation models for the new breed of projects. Therefore, we propose a search method that finds the right level of relevant cases for the neural network model. For the selected case set, the scale of the neural network model can be reduced by eliminating the qualitative input factors with the same values. Since there exists a multitude of combinations of case sets, we need to search for the optimal reduced neural network model and corresponding case set. To find the quasi-optimal model from the hierarchy of reduced neural network models, we adopted the beam search technique and devised the case-set selection algorithm. We have shown that the resulting model significantly outperforms the original full model for the software effort estimation. This approach can also be used for building any case-selective neural network.

Proceedings ArticleDOI
15 Jul 2001
TL;DR: A general class of dynamic network is introduced, the layered digital dynamic network, and the backpropagation-through-time algorithm for computing the gradient of the network error with respect to the weights of the network is derived.
Abstract: This paper introduces a general class of dynamic network, the layered digital dynamic network. It then derives the backpropagation-through-time algorithm for computing the gradient of the network error with respect to the weights of the network.
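
A minimal sketch of backpropagation through time for a plain single-layer tanh recurrent network, to make the gradient computation concrete; the layered digital dynamic network framework of the paper is more general, and the shapes and summed squared-error loss used here are assumptions.

```python
import numpy as np

def bptt(x, y, Wx, Wh, Wo):
    """BPTT for h_{t+1} = tanh(x_t Wx + h_t Wh), out_t = h_{t+1} Wo (minimal sketch).
    x: (T, n_in), y: (T, n_out); returns gradients of the summed squared error."""
    T, H = x.shape[0], Wh.shape[0]
    h = np.zeros((T + 1, H))
    for t in range(T):                                 # forward pass, unrolled in time
        h[t + 1] = np.tanh(x[t] @ Wx + h[t] @ Wh)
    err = h[1:] @ Wo - y                               # output error at every time step
    dWx, dWh, dWo = np.zeros_like(Wx), np.zeros_like(Wh), np.zeros_like(Wo)
    dh_next = np.zeros(H)
    for t in reversed(range(T)):                       # backward pass through time
        dWo += np.outer(h[t + 1], err[t])
        dpre = (err[t] @ Wo.T + dh_next) * (1 - h[t + 1] ** 2)
        dWx += np.outer(x[t], dpre)
        dWh += np.outer(h[t], dpre)
        dh_next = dpre @ Wh.T                          # gradient flowing to earlier steps
    return dWx, dWh, dWo
```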

Journal ArticleDOI
TL;DR: A simple neural network with asymmetric basis functions is proposed as a feature extractor for P waves in electrocardiographic signals (ECG) using the classical backward-error-propagation algorithm.
Abstract: In this work a simple neural network with asymmetric basis functions is proposed as a feature extractor for P waves in electrocardiographic signals (ECG). The neural network is trained using the classical backward-error-propagation algorithm. The performance of the proposed network was tested using actual ECG signals and compared with other types of neural feature extractors.

Proceedings ArticleDOI
15 Jul 2001
TL;DR: A new architecture that can be used in combining neural network ensembles is presented, based on training two neural networks to perform the aggregation, which is compared with standard fixed and trained combining schemes.
Abstract: We present a comparison between different combining techniques in neural network ensembles. The main focus of this paper is on a new architecture that can be used in combining neural network ensembles. This architecture is based on training two neural networks to perform the aggregation. One network is trained to establish a confidence factor for each member of the ensemble for every training entry. The other network performs the aggregation of the ensemble to present the final decision. Both these networks evolve together during training. This approach is compared with standard fixed and trained combining schemes.

Proceedings ArticleDOI
15 Jul 2001
TL;DR: It is shown that for statistically neutral problems such as parity and majority function, the stacked generalization scheme improves classification performance and generalization accuracy over the single level cross-validation model.
Abstract: Generalization continues to be one of the most important topics in neural networks and other classifiers. In recent years, a number of different methods have been developed to improve generalization accuracy. Any classifier that uses induction to find the class concept from the training patterns will have a hard time achieving an acceptable level of generalization accuracy when the problem to be learned is a statistically neutral problem. A problem is statistically neutral if the probability of mapping an input onto an output is always the chance value of 0.5. We examine the generalization behaviour of multilayer neural networks on learning statistically neutral problems using single-level learning models (e.g., the conventional cross-validation scheme) as well as multiple-level learning models (e.g., the stacked generalization method). We show that for statistically neutral problems such as the parity and majority functions, the stacked generalization scheme improves classification performance and generalization accuracy over the single-level cross-validation model.
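
A minimal sketch of the stacked generalization scheme on a statistically neutral problem (6-bit parity), using scikit-learn as a stand-in: level-0 networks are combined by a level-1 learner trained on their cross-validated outputs. Network sizes and the choice of level-1 learner are assumptions, not the paper's configuration.

```python
import numpy as np
from itertools import product
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression

# 6-bit parity: statistically neutral, since P(class | any single input) = 0.5
X = np.array(list(product([0, 1], repeat=6)))
y = X.sum(axis=1) % 2

level0 = [(f"mlp{i}", MLPClassifier(hidden_layer_sizes=(12,), max_iter=3000,
                                    random_state=i)) for i in range(3)]
stack = StackingClassifier(estimators=level0,
                           final_estimator=LogisticRegression(),
                           cv=5)              # level-1 model sees cross-validated outputs
print("training accuracy:", stack.fit(X, y).score(X, y))
```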

Proceedings ArticleDOI
15 Jul 2001
TL;DR: An incremental learning system (called RAN-LTM) is proposed in which long-term memory (LTM) is introduced into a resource allocating network (RAN) to suppress interference; since retrieving many LTM data requires large computations, appropriate procedures for producing and retrieving LTM data are important.
Abstract: When neural networks are trained incrementally, previously trained input-output relationships tend to be collapsed by the learning of new training data. This phenomenon is called "interference". To suppress the interference, we have proposed an incremental learning system (called RAN-LTM), in which long-term memory (LTM) is introduced into a resource allocating network (RAN). Since RAN-LTM needs to train not only on new data but also on some LTM data to suppress the interference, retrieving many LTM data requires large computations. Therefore, it is important to design appropriate procedures for producing and retrieving LTM data in RAN-LTM. In this paper, these procedures in the previous version of RAN-LTM are improved. In simulations, the improved RAN-LTM is applied to the approximation of a one-dimensional function, and the approximation error and the training speed are evaluated in comparison with RAN and the previous RAN-LTM.
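
The core idea above, retraining on a few memorized items alongside new data to suppress interference, can be sketched generically as a rehearsal buffer; the code below uses a scikit-learn MLP with warm starts as a stand-in for the resource allocating network, and the rule for choosing which patterns to memorize is an arbitrary placeholder rather than the paper's LTM production procedure.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=500, warm_start=True)
ltm_x, ltm_y = [], []                      # long-term memory of representative patterns

def incremental_fit(x_new, y_new):
    """Train on the new batch plus the rehearsed LTM items to limit interference."""
    x = np.vstack([x_new] + ltm_x) if ltm_x else x_new
    y = np.concatenate([y_new] + ltm_y) if ltm_y else y_new
    net.fit(x, y)                          # warm_start=True keeps the earlier weights
    ltm_x.append(x_new[::5])               # memorize every 5th new pattern (placeholder rule)
    ltm_y.append(y_new[::5])

# e.g. incremental_fit(X_batch1, y_batch1); incremental_fit(X_batch2, y_batch2)
```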

Proceedings ArticleDOI
15 Jul 2001
TL;DR: Some of the difficulties in training recurrent neural networks are described, explanations for why these difficulties occur are provided, and ways they can be mitigated are explained.
Abstract: This paper describes some of the difficulties in training recurrent neural networks, provides explanations for why these difficulties occur, and explains how they can be mitigated.
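
The abstract does not list the specific difficulties; one commonly cited problem with gradient-based RNN training is that gradients shrink (or blow up) exponentially as they are propagated back through time. The demonstration below is a generic illustration of that effect, not a summary of the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
W = 0.9 * rng.standard_normal((16, 16)) / np.sqrt(16)   # small recurrent weights
h = rng.standard_normal(16)
J = np.eye(16)                                           # accumulated Jacobian dh_T/dh_0
for t in range(1, 51):
    h = np.tanh(W @ h)
    J = (np.diag(1 - h ** 2) @ W) @ J                    # chain rule through one time step
    if t % 10 == 0:
        print(f"T={t:2d}   ||dh_T/dh_0|| = {np.linalg.norm(J):.2e}")   # decays rapidly with T
```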

Proceedings ArticleDOI
15 Jul 2001
TL;DR: The sensitivity analysis approach to incremental learning presented by Engelbrecht and Cloete (1999) is extended with an unsupervised clustering of the candidate training set, and the most informative pattern is then selected from each of the clusters.
Abstract: The sensitivity analysis approach to incremental learning presented by Engelbrecht and Cloete (1999) is extended in this paper. That approach selects at each subset selection interval only one new informative pattern from the candidate training set, and adds the selected pattern to the current training subset. This approach is extended with an unsupervised clustering of the candidate training set. The most informative pattern is then selected from each of the clusters. Experimental results are given to show that the clustering approach to incremental learning performs substantially better than the original approach.
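
A sketch of the selection step described above: cluster the candidate training set and take the most informative pattern from each cluster. The informativeness scores are assumed to come from a sensitivity analysis as in Engelbrecht and Cloete; here they are just an input array, and KMeans with a fixed number of clusters is an assumption.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_informative(candidates, informativeness, n_clusters=10, seed=0):
    """Pick the most informative candidate pattern from each cluster."""
    labels = KMeans(n_clusters=n_clusters, n_init=10,
                    random_state=seed).fit_predict(candidates)
    chosen = []
    for c in range(n_clusters):
        members = np.flatnonzero(labels == c)
        if members.size:                                   # skip empty clusters
            chosen.append(members[np.argmax(informativeness[members])])
    return np.array(chosen)      # indices to add to the current training subset
```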

Journal ArticleDOI
TL;DR: A heuristic pattern correction scheme is proposed using adaptively trained generalized regression neural networks (GRNNs), which is based upon both network growing and dual-stage shrinking mechanisms.
Abstract: In many pattern classification problems, an intelligent neural system is required which can learn the newly encountered but misclassified patterns incrementally, while keeping a good classification performance over the past patterns stored in the network. In this paper, a heuristic pattern correction scheme is proposed using adaptively trained generalized regression neural networks (GRNNs). The scheme is based upon both network growing and dual-stage shrinking mechanisms. In the network growing phase, a subset of the misclassified patterns in each incoming data set is iteratively added into the network until all the patterns in the incoming data set are classified correctly. Then, the redundancy in the growing phase is removed in the dual-stage network shrinking. Both long- and short-term memory models are considered in the network shrinking, which are motivated by biological studies of the brain. The learning capability of the proposed scheme is investigated through extensive simulation studies.
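
To make the growing phase concrete, the sketch below implements a plain generalized regression neural network (Nadaraya-Watson form) and adds only the misclassified patterns of an incoming batch as new pattern units. It assumes binary 0/1 labels and a fixed smoothing parameter, and it omits the paper's iterative loop and dual-stage shrinking.

```python
import numpy as np

class GRNN:
    """Generalized regression neural network with a naive growing step."""
    def __init__(self, sigma=0.5):
        self.sigma, self.X, self.y = sigma, None, None

    def predict(self, X):
        d2 = ((X[:, None, :] - self.X[None, :, :]) ** 2).sum(-1)
        w = np.exp(-d2 / (2 * self.sigma ** 2))            # kernel activations of pattern units
        return (w @ self.y) / w.sum(axis=1)

    def grow(self, X_new, y_new):
        """Add only the misclassified patterns of the incoming data set as new units."""
        if self.X is None:
            self.X, self.y = X_new[:1].copy(), y_new[:1].copy()
        wrong = np.round(self.predict(X_new)) != y_new
        self.X = np.vstack([self.X, X_new[wrong]])
        self.y = np.concatenate([self.y, y_new[wrong]])
```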

Patent
29 Aug 2001
TL;DR: In this patent, a technique for machine learning, such as supervised artificial neural network learning, includes receiving data, checking the dimensionality of the read data, and reducing the dimensionality to enhance machine learning performance using Principal Component Analysis methodology.
Abstract: A technique for machine learning, such as supervised artificial neural network learning, includes receiving data, checking the dimensionality of the read data, and reducing the dimensionality to enhance machine learning performance using Principal Component Analysis methodology. The technique further includes specifying the neural network architecture and initializing weights to establish a connection between the read data (including the reduced dimensionality) and the predicted values. The technique also includes performing supervised machine learning using the specified neural network architecture, initialized weights, and the read data including the reduced dimensionality to predict values. Predicted values are then compared to a normalized system error threshold value, and the initialized weights are revised based on the outcome of the comparison to generate a learnt neural network having a reduced error in weight space. The learnt neural network is validated using known values and is then used for predicting values.
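
The patent describes a generic pipeline (PCA dimensionality reduction, supervised network training, error-threshold check, validation) without tying it to a particular toolkit; a hedged scikit-learn sketch of that pipeline is below, with the data, network size, and error threshold all placeholders.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split

X = np.random.rand(500, 30)                                # placeholder data
y = X[:, :3].sum(axis=1) + 0.05 * np.random.randn(500)
X_tr, X_val, y_tr, y_val = train_test_split(X, y, random_state=0)

model = make_pipeline(PCA(n_components=0.95),              # keep 95% of the variance
                      MLPRegressor(hidden_layer_sizes=(16,), max_iter=2000,
                                   random_state=0))
model.fit(X_tr, y_tr)

error_threshold = 0.01                                     # assumed system error threshold
mse = np.mean((model.predict(X_val) - y_val) ** 2)         # validate against known values
print("validation MSE:", mse, "within threshold:", mse < error_threshold)
```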

Book ChapterDOI
21 Aug 2001
TL;DR: Long Short-Term Memory recurrent networks are trained to maximize two information-theoretic objectives for unsupervised learning: Binary Information Gain Optimization (BINGO) and Nonparametric Entropy Optimization (NEO).
Abstract: While much work has been done on unsupervised learning in feedforward neural network architectures, its potential with (theoretically more powerful) recurrent networks and time-varying inputs has rarely been explored. Here we train Long Short-Term Memory (LSTM) recurrent networks to maximize two information-theoretic objectives for unsupervised learning: Binary Information Gain Optimization (BINGO) and Nonparametric Entropy Optimization (NEO). LSTM learns to discriminate different types of temporal sequences and group them according to a variety of features.

01 Jan 2001
TL;DR: Two novel ART-based neural network architectures, namely Ellipsoid ART and Ellipsoid ARTMAP, which utilize hyper-ellipsoids for category representation, are introduced; they are capable of fast, stable learning and aid in gaining a clearer understanding of the networks' training and performance phases.
Abstract: Fuzzy ART and Fuzzy ARTMAP are two prominent neural network architectures based on the principles of Grossberg's Adaptive Resonance Theory (ART). While the former architecture employs unsupervised learning to perform clustering tasks, the latter one associates clusters belonging to an input and output domain in a supervised manner. As a special case, Fuzzy ARTMAP can also be used as a classifier. Both networks implement an exemplar-based learning method and summarize training patterns into categories (exemplars), whose geometric representations are hyper-rectangles embedded in the input domain. We introduce two novel ART-based neural network architectures, namely Ellipsoid ART and Ellipsoid ARTMAP, which utilize hyper-ellipsoids for category representation. We have designed these two architectures so that they share all essential properties and characteristics of their Fuzzy counterparts. Of foremost importance, they are capable of fast, stable learning, meaning that learning completes in a finite number of steps. We also present selected experimental results that illustrate the potential of Ellipsoid ARTMAP to successfully perform classification tasks by exhibiting high prediction accuracy, while maintaining a relatively small number of categories. Next, we introduce category regions as novel concepts that enrich the geometric facet of Fuzzy ART and Fuzzy ARTMAP operations. Their definition stems from the geometric interpretation of two particular conditions that are examined in order to assess the degree to which an input pattern matches the characteristics of an existing category already memorized by a Fuzzy ART or Fuzzy ARTMAP network. Apart from aiding us in gaining a clearer understanding of the networks' training and performance phases, based on the regions' properties we arrive at several results that are primarily of theoretical interest. Furthermore, due to the underlying similarity, all these results are shown to be applicable to Ellipsoid ART and Ellipsoid ARTMAP as well.

Journal ArticleDOI
TL;DR: A neural model based on a partially recurrent neural network is proposed as a better alternative for multi-step time series prediction, and the results suggest that the recurrent model can help in improving the prediction accuracy.
Abstract: Multi-step prediction is a difficult task that has attracted increasing interest in recent years. It tries to achieve predictions several steps ahead into the future starting from current information. The interest in this work is the development of nonlinear neural models for the purpose of building multi-step time series prediction schemes. In that context, the most popular neural models are based on traditional feedforward neural networks. However, this kind of model may present some disadvantages when a long-term prediction problem is formulated, because they are trained to predict only the next sampling time. In this paper, a neural model based on a partially recurrent neural network is proposed as a better alternative. For the recurrent model, a learning phase with the purpose of long-term prediction is imposed, which makes it possible to obtain better predictions of the time series further into the future. In order to validate the performance of the recurrent neural model in predicting the dynamic behaviour of the series in the future, three different time series have been used as study cases: an artificial time series (the logistic map) and two real time series (sunspots and laser data). Models based on feedforward neural networks have also been used and compared against the proposed model. The results suggest that the recurrent model can help in improving the prediction accuracy.
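
As a point of reference for the multi-step setting, the sketch below shows the recursive (iterated) forecasting scheme the paper contrasts with: a feedforward network trained for one-step prediction on the logistic map, whose predictions are fed back to forecast several steps ahead. Window length and network size are assumptions.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_windows(series, order=4):
    X = np.array([series[i:i + order] for i in range(len(series) - order)])
    return X, series[order:]

def forecast(model, history, horizon, order=4):
    """Iterated multi-step forecast: feed each prediction back as the next input."""
    window = list(history[-order:])
    preds = []
    for _ in range(horizon):
        y_hat = model.predict(np.array(window[-order:]).reshape(1, -1))[0]
        preds.append(y_hat)
        window.append(y_hat)
    return np.array(preds)

x = np.empty(500)
x[0] = 0.3
for t in range(499):                                       # logistic map, one of the study cases
    x[t + 1] = 4.0 * x[t] * (1 - x[t])

X, y = make_windows(x)
net = MLPRegressor(hidden_layer_sizes=(20,), max_iter=3000, random_state=0).fit(X, y)
print(forecast(net, x, horizon=5))
```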


Proceedings ArticleDOI
15 Jul 2001
TL;DR: This paper first applies negative correlation learning to the traffic flow prediction problem, and then proposes an evolutionary approach to deciding the penalty coefficient automatically in negative correlation learning.
Abstract: It is well-known that large neural networks with many unshared weights can be very difficult to train. A neural network ensemble consisting of a number of individual neural networks usually performs better than a complex monolithic neural network. One of the motivations behind neural network ensembles is the divide-and-conquer strategy, where a complex problem is decomposed into different components each of which is tackled by an individual neural network. A promising algorithm for training neural network ensembles is the negative correlation learning algorithm which penalizes positive correlations among individual networks by introducing a penalty term in the error function. A penalty coefficient is used to balance the minimization of the error and the minimization of the correlation. It is often very difficult to select an optimal penalty coefficient for a given problem because as yet there is no systematic method available for setting the parameter. This paper first applies negative correlation learning to the traffic flow prediction problem, and then proposes an evolutionary approach to deciding the penalty coefficient automatically in negative correlation learning. Experimental results on the traffic flow prediction problem will be presented.
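
For reference, the penalty term used in negative correlation learning (after Liu and Yao) can be written down directly; the sketch below computes, for one training pattern, each member's penalized error and the gradient commonly used to drive its backpropagation. Choosing the coefficient lam is exactly what the paper's evolutionary approach automates; here it is simply an argument.

```python
import numpy as np

def ncl_terms(outputs, y, lam):
    """Negative correlation learning terms for one pattern.
    outputs: length-M array of the individual networks' outputs F_i; y: target."""
    f_bar = outputs.mean()                                  # ensemble (average) output
    # penalty p_i = (F_i - Fbar) * sum_{j != i} (F_j - Fbar) = -(F_i - Fbar)^2
    penalty = -(outputs - f_bar) ** 2
    loss = 0.5 * (outputs - y) ** 2 + lam * penalty         # per-member error function
    grad = (outputs - y) - lam * (outputs - f_bar)          # d(loss_i)/dF_i used in backprop
    return loss, grad

# e.g. loss, grad = ncl_terms(np.array([0.8, 1.1, 0.9]), y=1.0, lam=0.5)
```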

Proceedings ArticleDOI
25 Jul 2001
TL;DR: This paper examines the effect of instance and feature selection on the generalization ability of trained neural networks through computer simulations on various artificial and real-world pattern classification problems.
Abstract: We examine the effect of instance and feature selection on the generalization ability of trained neural networks for pattern classification problems. Before the learning of neural networks, a genetic-algorithm-based instance and feature selection method is applied for reducing the size of training data. Nearest neighbor classification is used for evaluating the classification ability of subsets of training data in instance and feature selection. Neural networks are trained by the selected subset (i.e., reduced training data). In this paper, we first explain our GA-based instance and feature selection method. Then we examine the effect of instance and feature selection on the generalization ability of trained neural networks through computer simulations on various artificial and real-world pattern classification problems.
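
A compact sketch of the GA-based instance and feature selection loop: binary masks over instances and features are evolved, and each candidate subset is scored by how well a 1-NN classifier built from it classifies the data, with a small size penalty. The population size, mutation rates, and penalty weight are placeholders, not the authors' settings.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(0)

def fitness(mask_inst, mask_feat, X, y):
    """1-NN accuracy of the reduced data on the full set, minus a small size penalty."""
    if mask_inst.sum() == 0 or mask_feat.sum() == 0:
        return 0.0
    knn = KNeighborsClassifier(n_neighbors=1).fit(X[mask_inst][:, mask_feat], y[mask_inst])
    return knn.score(X[:, mask_feat], y) - 0.01 * (mask_inst.mean() + mask_feat.mean())

def evolve(X, y, pop=20, gens=30):
    n, d = X.shape
    population = [(rng.random(n) < 0.5, rng.random(d) < 0.5) for _ in range(pop)]
    for _ in range(gens):
        ranked = sorted(population, key=lambda m: fitness(*m, X, y), reverse=True)
        parents = ranked[: pop // 2]
        children = [(p[0] ^ (rng.random(n) < 0.02),          # bit-flip mutation
                     p[1] ^ (rng.random(d) < 0.05)) for p in parents]
        population = parents + children
    return max(population, key=lambda m: fitness(*m, X, y))  # best (instance, feature) masks
```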

Proceedings ArticleDOI
15 Jul 2001
TL;DR: The feedforward neural network model is extended to one which works recursively, obtaining a small number of the real roots of a polynomial (less than the total number of roots to be found) at a time.
Abstract: This paper proposes applying feedforward neural networks (FNN) with problem decomposition and constrained learning to finding the real roots of polynomials. In order to alleviate the computational complexity for high-order polynomials, this network model is extended to one which works recursively, obtaining a small number of the real roots of a polynomial (less than the total number of roots to be found) at a time. The recursive formulae for finding i real roots at a time are presented. Finally, some computer simulation results are reported.
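
The recursive decomposition idea, finding a few real roots and deflating the polynomial before continuing, can be sketched with an ordinary Newton iteration standing in for the paper's constrained-learning feedforward network; the tolerances and the one-root-at-a-time deflation are simplifying assumptions.

```python
import numpy as np

def real_roots_by_deflation(coeffs, tol=1e-8):
    """Find real roots one at a time and deflate the polynomial after each one."""
    coeffs = np.asarray(coeffs, dtype=float)
    roots = []
    while coeffs.size > 1:
        p = np.poly1d(coeffs)
        dp = p.deriv()
        x = 0.0
        for _ in range(200):                              # Newton's method for one root
            d = dp(x) if abs(dp(x)) > tol else tol
            step = p(x) / d
            x -= step
            if abs(step) < tol:
                break
        if abs(p(x)) > 1e-6:                              # no further real root was found
            break
        roots.append(x)
        coeffs, _ = np.polydiv(coeffs, [1.0, -x])         # deflate by the factor (x - root)
    return roots

print(real_roots_by_deflation([1, -6, 11, -6]))           # (x-1)(x-2)(x-3) -> ~[1, 2, 3]
```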