
Showing papers in "IEEE Transactions on Neural Networks" in 2000


Journal ArticleDOI
TL;DR: The two-stage procedure--first using SOM to produce the prototypes that are then clustered in the second stage--is found to perform well when compared with direct clustering of the data and to reduce the computation time.
Abstract: The self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining. It projects the input space onto prototypes of a low-dimensional regular grid that can be effectively utilized to visualize and explore properties of the data. When the number of SOM units is large, similar units need to be grouped, i.e., clustered, to facilitate quantitative analysis of the map and the data. In this paper, different approaches to clustering of the SOM are considered. In particular, hierarchical agglomerative clustering and partitive clustering using K-means are investigated. The two-stage procedure, in which the SOM first produces the prototypes that are then clustered in the second stage, is found to perform well compared with direct clustering of the data, and to reduce the computation time.
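
The two-stage procedure is straightforward to reproduce. A minimal sketch in Python, assuming the third-party minisom package and scikit-learn (tooling choices of this sketch, not of the paper, which predates both); the grid size and cluster count are illustrative:

```python
# Two-stage clustering: train a SOM, then cluster its prototype vectors
# instead of the raw data points (AgglomerativeClustering works the same way).
import numpy as np
from minisom import MiniSom
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 8))          # stand-in for the real data set

# Stage 1: project the data onto a 15x15 grid of prototype vectors.
som = MiniSom(15, 15, data.shape[1], sigma=1.5, learning_rate=0.5)
som.train_random(data, 5_000)
prototypes = som.get_weights().reshape(-1, data.shape[1])   # 225 prototypes

# Stage 2: cluster the 225 prototypes rather than the 10,000 points.
proto_labels = KMeans(n_clusters=5, n_init=10).fit_predict(prototypes)

# Each data point inherits the cluster of its best-matching prototype.
def cluster_of(x):
    bmu = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    return proto_labels[bmu]
```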

2,387 citations


Journal ArticleDOI
TL;DR: A system that is able to organize vast document collections according to textual similarities based on the self-organizing map (SOM) algorithm, based on 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
Abstract: Describes the implementation of a system that is able to organize vast document collections according to textual similarities. It is based on the self-organizing map (SOM) algorithm. Statistical representations of the documents' vocabularies are used as the feature vectors. The main goal of our work has been to scale up the SOM algorithm to deal with large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As the feature vectors we used 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
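
Random projection is the key scaling step: a fixed random matrix maps each weighted word histogram down to 500 dimensions while approximately preserving similarities. A hedged numpy sketch (the vocabulary size and the idf weighting are illustrative assumptions, not the paper's exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, target_dim = 50_000, 500

# One fixed random projection matrix, shared by all documents.
R = rng.normal(size=(target_dim, vocab_size)) / np.sqrt(target_dim)

def document_vector(word_counts, idf_weights):
    """Project a weighted word histogram to a 500-dim feature vector."""
    histogram = word_counts * idf_weights      # weighted word histogram
    x = R @ histogram                          # random projection
    return x / np.linalg.norm(x)               # normalize for cosine use
```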

1,007 citations


Journal ArticleDOI
TL;DR: Using clues from the KKT conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression that perform significantly faster than the original SMO on the datasets tried.
Abstract: This paper points out an important source of inefficiency in Smola and Schölkopf's (1998) sequential minimal optimization (SMO) algorithm for support vector machine regression that is caused by the use of a single threshold value. Using clues from the Karush-Kuhn-Tucker (KKT) conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression. These modified algorithms perform significantly faster than the original SMO on the datasets tried.

837 citations


Journal ArticleDOI
TL;DR: This article proposes to bring the various neuro-fuzzy models used for rule generation under a unified soft computing framework, and includes both rule extraction and rule refinement in the broader perspective of rule generation.
Abstract: The present article is a novel attempt at providing an exhaustive survey of neuro-fuzzy rule generation algorithms. Rule generation from artificial neural networks is gaining popularity in recent times due to its capability of providing some insight to the user about the symbolic knowledge embedded within the network. Fuzzy sets are an aid in providing this information in a more human-comprehensible or natural form, and can handle uncertainties at various levels. The neuro-fuzzy approach, symbiotically combining the merits of connectionist and fuzzy approaches, constitutes a key component of soft computing at this stage. To date, there has been no detailed and integrated categorization of the various neuro-fuzzy models used for rule generation. We propose to bring these together under a unified soft computing framework. Moreover, we include both rule extraction and rule refinement in the broader perspective of rule generation. Rules learned and generated for fuzzy reasoning and fuzzy control are also considered from this wider viewpoint. Models are grouped on the basis of their level of neuro-fuzzy synthesis. The use of other soft computing tools like genetic algorithms and rough sets is emphasized. Rule generation from fuzzy knowledge-based networks, which initially encode some crude domain knowledge, is found to result in more refined rules. Finally, a real-life application to medical diagnosis is provided.

726 citations


Journal ArticleDOI
TL;DR: The growing self-organizing map (GSOM) is presented in detail and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated.
Abstract: The growing self-organizing map (GSOM) algorithm is presented in detail, and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated. The spread factor is independent of the dimensionality of the data and as such can be used as a controlling measure for generating maps with different dimensionality, which can then be compared and analyzed with better accuracy. The spread factor is also presented as a method of achieving hierarchical clustering of a data set with the GSOM. Such hierarchical clustering allows the data analyst to identify significant and interesting clusters at a higher level of the hierarchy, and to continue with finer clustering of the interesting clusters only. Initially, only a small map is created with a low spread factor, which can be generated even for a very large data set. Further analysis is then conducted on selected, smaller sections of the data. This method therefore facilitates the analysis of even very large data sets.
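
In the GSOM formulation, the spread factor SF in (0, 1) enters through a growth threshold GT = -D * ln(SF), where D is the data dimensionality: a unit spawns new neighbors once its accumulated quantization error exceeds GT. A low SF therefore yields a small, coarse map and a high SF a large, detailed one. A tiny illustration (the dimensionality below is an assumption):

```python
import math

def growth_threshold(dim, spread_factor):
    # Low SF -> high threshold -> little growth -> small, coarse map.
    return -dim * math.log(spread_factor)

print(growth_threshold(dim=20, spread_factor=0.1))  # ~46.1: coarse first pass
print(growth_threshold(dim=20, spread_factor=0.9))  # ~2.1: fine drill-down
```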

529 citations


Journal ArticleDOI
TL;DR: An adaptive output feedback control scheme for the output tracking of a class of continuous-time nonlinear plants is presented and it is shown that by using adaptive control in conjunction with robust control, it is possible to tolerate larger approximation errors resulting from the use of lower order networks.
Abstract: An adaptive output feedback control scheme for the output tracking of a class of continuous-time nonlinear plants is presented. An RBF neural network is used to adaptively compensate for the plant nonlinearities. The network weights are adapted using a Lyapunov-based design. The method uses parameter projection, control saturation, and a high-gain observer to achieve semi-global uniform ultimate boundedness. The effectiveness of the proposed method is demonstrated through simulations. The simulations also show that by using adaptive control in conjunction with robust control, it is possible to tolerate larger approximation errors resulting from the use of lower order networks.

529 citations


Journal ArticleDOI
TL;DR: An on-line version of the proposed algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems and reaches the error minimum in a much smaller number of iterations.
Abstract: How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major categories. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an online fashion. We have also developed an online version of the proposed algorithm, which is based on updating the error gradient approximation in a recursive manner.

432 citations


Journal ArticleDOI
TL;DR: Two networks are employed: a multilayer perceptron and a radial basis function network to account for the exact satisfaction of the boundary conditions of complex boundary geometry.
Abstract: Partial differential equations (PDEs) with boundary conditions (Dirichlet or Neumann) defined on boundaries with simple geometry have been successfully treated using sigmoidal multilayer perceptrons in previous works. This article deals with the case of complex boundary geometry, where the boundary is determined by a number of points that belong to it and are closely located, so as to offer a reasonable representation. Two networks are employed: a multilayer perceptron and a radial basis function network. The latter is used to account for the exact satisfaction of the boundary conditions. The method has been successfully tested on two-dimensional and three-dimensional PDEs and has yielded accurate results.

420 citations


Journal ArticleDOI
TL;DR: Comparative computational evaluation of the new fast iterative algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that the algorithm is very competitive.
Abstract: In this paper we give a new fast iterative algorithm for support vector machine (SVM) classifier design. The basic problem treated is one that does not allow classification violations. The problem is converted to a problem of computing the nearest point between two convex polytopes. The suitability of two classical nearest point algorithms, due to Gilbert, and Mitchell et al., is studied. Ideas from both these algorithms are combined and modified to derive our fast algorithm. For problems which require classification violations to be allowed, the violations are quadratically penalized and an idea due to Cortes and Vapnik and Friess is used to convert it to a problem in which there are no classification violations. Comparative computational evaluation of our algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that our algorithm is very competitive.

401 citations


Journal ArticleDOI
TL;DR: A novel architecture of an oscillatory neural network that consists of phase-locked loop (PLL) circuits that stores and retrieves complex oscillatory patterns as synchronized states with appropriate phase relations between neurons is proposed.
Abstract: We propose a novel architecture of an oscillatory neural network that consists of phase-locked loop (PLL) circuits. It stores and retrieves complex oscillatory patterns as synchronized states with appropriate phase relations between neurons.
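
The abstract is terse, so the sketch below shows a generic phase-oscillator associative memory in the same spirit, not the paper's PLL circuit model: binary patterns are stored as 0/π phase relations through Hebbian coupling and retrieved as synchronized states. All constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
patterns = rng.choice([-1.0, 1.0], size=(3, N))   # patterns to store
K = patterns.T @ patterns / N                     # Hebbian phase coupling

theta = rng.uniform(0.0, 2.0 * np.pi, N)          # random initial phases
omega, dt = 2.0 * np.pi, 0.01
for _ in range(2_000):                            # Euler integration
    diff = theta[None, :] - theta[:, None]        # theta_j - theta_i
    theta += dt * (omega + (K * np.sin(diff)).sum(axis=1))

# Read the retrieved pattern out of the relative phases (0 vs. pi).
retrieved = np.sign(np.cos(theta - theta[0]))
```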

356 citations


Journal ArticleDOI
TL;DR: A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
Abstract: This paper describes a general fuzzy min-max (GFMM) neural network which is a generalization and extension of the fuzzy min-max clustering and classification algorithms of Simpson (1992, 1993). The GFMM method combines supervised and unsupervised learning in a single training algorithm. The fusion of clustering and classification resulted in an algorithm that can be used as pure clustering, pure classification, or hybrid clustering-classification. It exhibits the property of finding decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used as a representation of clusters and classes. Learning is usually completed in a few passes and consists of placing and adjusting the hyperboxes in the pattern space; this is an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine the supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
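
The hyperbox representation is the heart of the method: a class or cluster is a box [v, w] in pattern space, and membership decays with distance outside the box. A simplified sketch of such a membership function (the exact GFMM formula differs in detail; the sensitivity parameter gamma is an assumption):

```python
import numpy as np

def hyperbox_membership(x, v, w, gamma=4.0):
    """x: pattern; v, w: min and max corners of the hyperbox (v <= w)."""
    below = np.maximum(0.0, v - x)     # how far x falls under the min point
    above = np.maximum(0.0, x - w)     # how far x exceeds the max point
    per_dim = 1.0 - np.minimum(1.0, gamma * (below + above))
    return float(per_dim.min())        # worst-fitting dimension decides

v, w = np.array([0.2, 0.3]), np.array([0.5, 0.6])
print(hyperbox_membership(np.array([0.3, 0.4]), v, w))  # 1.0: inside the box
print(hyperbox_membership(np.array([0.7, 0.4]), v, w))  # 0.2: partly outside
```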

Journal ArticleDOI
TL;DR: These experiments show that under a wide variety of assumptions concerning the cost of intervention and the retention rate resulting from intervention, using predictive techniques to identify potential churners and offering incentives can yield significant savings to a carrier.
Abstract: We explore techniques from statistical machine learning to predict churn and, based on these predictions, to determine what incentives should be offered to subscribers to improve retention and maximize profitability to the carrier. The techniques include logit regression, decision trees, neural networks, and boosting. Our experiments are based on a database of nearly 47,000 U.S. domestic subscribers that includes information about their usage, billing, credit, application, and complaint history. Our experiments show that under a wide variety of assumptions concerning the cost of intervention and the retention rate resulting from intervention, using predictive techniques to identify potential churners and offering incentives can yield significant savings to a carrier. We also show the importance of a data representation crafted by domain experts. Finally, we report on a real-world test of the techniques that validates our simulation experiments.
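
The economics behind the conclusion can be made concrete with a back-of-the-envelope calculation. All figures below are illustrative assumptions, not the paper's numbers:

```python
# Contact the top-ranked fraction of subscribers and compare what the
# incentives cost against the churn revenue they save.
subscribers    = 47_000
monthly_value  = 50.0    # revenue per retained subscriber (assumed)
incentive_cost = 10.0    # cost of one incentive offer (assumed)
retention_lift = 0.30    # fraction of true churners the offer retains
top_k          = 0.05    # contact the 5% ranked most likely to churn
precision      = 0.40    # contacted subscribers who would really churn

contacted   = subscribers * top_k
saved_value = contacted * precision * retention_lift * monthly_value * 12
offer_cost  = contacted * incentive_cost
print(f"net annual savings: ${saved_value - offer_cost:,.0f}")  # ~$145,700
```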

Journal ArticleDOI
TL;DR: A supervised network structure determination algorithm that identifies an appropriate smoothing parameter using a genetic algorithm and determines suitable pattern layer neurons using a forward regression orthogonal algorithm is proposed.
Abstract: Network structure determination is an important issue in pattern classification based on a probabilistic neural network. In this study, a supervised network structure determination algorithm is proposed. The proposed algorithm consists of two parts and runs in an iterative way. The first part identifies an appropriate smoothing parameter using a genetic algorithm, while the second part determines suitable pattern layer neurons using a forward regression orthogonal algorithm. The proposed algorithm is capable of offering a fairly small network structure with satisfactory classification accuracy.
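
For context, the underlying classifier is simple: every retained training pattern is a Gaussian kernel in the pattern layer, and the summation layer averages kernel responses per class. The smoothing parameter sigma below is the quantity the paper tunes with a genetic algorithm (the pattern-selection stage is not shown); this is a minimal sketch, not the paper's algorithm:

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.3):
    """Classify x by the class with the largest mean kernel response."""
    d2 = ((X_train - x) ** 2).sum(axis=1)      # squared distances
    k = np.exp(-d2 / (2.0 * sigma ** 2))       # pattern-layer outputs
    classes = np.unique(y_train)
    scores = [k[y_train == c].mean() for c in classes]   # summation layer
    return classes[int(np.argmax(scores))]
```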

Journal ArticleDOI
TL;DR: It is shown that the information available in an ROLS algorithm after network training can be used to sequentially select centers to minimize the network output error and provide efficient methods for network reduction to achieve smaller architectures with acceptable accuracy and without retraining.
Abstract: Recursive orthogonal least squares (ROLS) is a numerically robust method for solving for the output layer weights of a radial basis function (RBF) network, and requires less computer memory than the batch alternative. In the paper, the use of ROLS is extended to selecting the centers of an RBF network. It is shown that the information available in an ROLS algorithm after network training can be used to sequentially select centers to minimize the network output error. This provides efficient methods for network reduction to achieve smaller architectures with acceptable accuracy and without retraining. Two selection methods are developed, forward and backward. The methods are illustrated in applications of RBF networks to modeling a nonlinear time series and a real multiinput-multioutput chemical process. The final network models obtained achieve acceptable accuracy with significant reductions in the number of required centers.

Journal ArticleDOI
TL;DR: A sufficient condition for the existence of a unique equilibrium point and its global asymptotic stability for cellular neural networks with delay (DCNNs) is derived, and it is shown that the condition relies on the feedback matrices and is independent of the delay parameter.
Abstract: A sufficient condition related to the existence of a unique equilibrium point and its global asymptotic stability for cellular neural networks with delay (DCNNs) is derived. It is shown that the condition relies on the feedback matrices and is independent of the delay parameter. Furthermore, this condition is less restrictive than that given in the literature.

Journal ArticleDOI
TL;DR: Two constructive learning algorithms, MPyramid-real and MTiling-real, are presented that extend the pyramid and tiling algorithms, respectively, for learning real to M-ary mappings; the convergence of these algorithms is proved and their applicability to practical pattern classification problems is demonstrated empirically.
Abstract: Constructive learning algorithms offer an attractive approach for the incremental construction of near-minimal neural-network architectures for pattern classification. They help overcome the need for ad hoc and often inappropriate choices of network topology in algorithms that search for suitable weights in a priori fixed network architectures. Several such algorithms are proposed in the literature and shown to converge to zero classification errors (under certain assumptions) on tasks that involve learning a binary to binary mapping (i.e., classification problems involving binary-valued input attributes and two output categories). We present two constructive learning algorithms, MPyramid-real and MTiling-real, that extend the pyramid and tiling algorithms, respectively, for learning real to M-ary mappings (i.e., classification problems involving real-valued input attributes and multiple output classes). We prove the convergence of these algorithms and empirically demonstrate their applicability to practical pattern classification problems. Additionally, we show how the incorporation of a local pruning step can eliminate several redundant neurons from MTiling-real networks.

Journal ArticleDOI
TL;DR: The variational methods of Jaakkola and Jordan are applied to Gaussian processes to produce an efficient Bayesian binary classifier.
Abstract: Gaussian processes are a promising nonlinear regression tool, but it is not straightforward to solve classification problems with them. In the paper the variational methods of Jaakkola and Jordan (2000) are applied to Gaussian processes to produce an efficient Bayesian binary classifier.

Journal ArticleDOI
TL;DR: In this article, the authors describe the application of mixtures of experts on gender and ethnic classification of human faces and pose classification, and show their feasibility on the FERET database of facial images.
Abstract: We describe the application of mixtures of experts to gender and ethnic classification of human faces, and to pose classification, and show their feasibility on the FERET database of facial images. The mixture of experts is implemented using the "divide and conquer" modularity principle with respect to the granularity and/or the locality of information. The mixture of experts consists of ensembles of radial basis functions (RBFs). Inductive decision trees (DTs) and support vector machines (SVMs) implement the "gating network" components for deciding which of the experts should be used to determine the classification output and to restrict the support of the input space. Both the ensemble of RBFs (ERBF) and the SVM use the RBF kernel ("expert") for gating the inputs. Our experimental results yield an average accuracy rate of 96% on gender classification and 92% on ethnic classification using the ERBF/DT approach on frontal face images, while the SVM yields 100% on pose classification.

Journal ArticleDOI
TL;DR: It is proved that single hidden layer feedforward neural networks (SLFN's) with any continuous bounded nonconstant activation function or any arbitrary bounded (continuous or not continuous) activation function which has unequal limits at infinities (not just perceptrons) can form disjoint decision regions with arbitrary shapes in multidimensional cases.
Abstract: Multilayer perceptrons with hard-limiting (signum) activation functions can form complex decision regions. It is well known that a three-layer perceptron (two hidden layers) can form arbitrary disjoint decision regions and a two-layer perceptron (one hidden layer) can form single convex decision regions. This paper further proves that single hidden layer feedforward neural networks (SLFNs) with any continuous bounded nonconstant activation function, or any arbitrary bounded (continuous or not continuous) activation function which has unequal limits at infinities (not just perceptrons), can form disjoint decision regions with arbitrary shapes in multidimensional cases. SLFNs with some unbounded activation functions can also form disjoint decision regions with arbitrary shapes.

Journal ArticleDOI
TL;DR: A new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals using a feature extractor using wavelet packets in conjunction with linear predictive coding, a feature selection scheme, and a backpropagation neural-network classifier.
Abstract: In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets, for several aspect angles. Simulation results on ten different noisy realizations and for a signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier, generated from these results, demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.
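
A hedged sketch of the front end: decompose each return with a wavelet packet, then fit a short LPC model per subband and concatenate the coefficients as the feature vector. It assumes the third-party PyWavelets package; the wavelet, depth, and LPC order are illustrative, and LPC is computed here by plain Levinson-Durbin:

```python
import numpy as np
import pywt

def lpc(signal, order):
    """LPC coefficients via autocorrelation and Levinson-Durbin."""
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i + 1] += k * a[i - 1::-1][:i]   # reflection update
        err *= 1.0 - k * k
    return a[1:]

def subband_lpc_features(signal, wavelet="db4", level=3, order=6):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, "natural")   # 2**level subbands
    return np.concatenate([lpc(node.data, order) for node in nodes])
```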

Journal ArticleDOI
TL;DR: A new robust control technique for induction motors using neural networks (NNs) which is systematic and robust to parameter variations and does not require regression matrices, so that no preliminary dynamical analysis is needed.
Abstract: We present a new robust control technique for induction motors using neural networks (NNs). The method is systematic and robust to parameter variations. Motivated by the backstepping design technique, we first treat certain signals in the system as fictitious control inputs to a simpler subsystem. A two-layer NN is used in this stage to design the fictitious controller. We then apply a second two-layer NN to robustly realize the fictitious NN signals designed in the previous step. A new tuning scheme is proposed which can guarantee the boundedness of tracking error and weight updates. A main advantage of our method is that it does not require regression matrices, so that no preliminary dynamical analysis is needed. Another salient feature of our NN approach is that the off-line learning phase is not needed. Full state feedback is needed for implementation. Load torque and rotor resistance can be unknown but bounded.

Journal ArticleDOI
TL;DR: This paper proposes a neural controller for a class of unknown, minimum phase, feedback linearizable nonlinear systems with known relative degree, based on the backstepping design technique in conjunction with a linearly parameterized neural-network structure.
Abstract: We propose, from an adaptive control perspective, a neural controller for a class of unknown, minimum phase, feedback linearizable nonlinear systems with known relative degree. The control scheme is based on the backstepping design technique in conjunction with a linearly parameterized neural-network structure. The resulting controller, however, moves the complex mechanics involved in a typical backstepping design from off-line to online. With an appropriate choice of the network size and neural basis functions, the same controller can be trained online to control different nonlinear plants with the same relative degree, with semi-global stability as shown by a simple Lyapunov analysis. Meanwhile, the controller also preserves some of the performance properties of the standard backstepping controllers. Simulation results are shown to demonstrate these properties and to compare the neural controller with a standard backstepping controller.

Journal ArticleDOI
TL;DR: The annealing robust backpropagation learning algorithm (ARBP), which adopts the annealing concept into robust learning algorithms, is proposed to deal with the problem of modeling under the existence of outliers.
Abstract: Multilayer feedforward neural networks are often referred to as universal approximators. Nevertheless, if the training data are corrupted by large noise, such as outliers, traditional backpropagation learning schemes may not always come up with acceptable performance. Even though various robust learning algorithms have been proposed in the literature, those approaches still suffer from the initialization problem. In those robust learning algorithms, the so-called M-estimator is employed, whose loss function plays the role of discriminating outliers from the majority of the data by degrading the effects of those outliers during learning. However, the loss function used in those algorithms may not correctly discriminate against those outliers. In this paper, the annealing robust backpropagation learning algorithm (ARBP), which adopts the annealing concept into the robust learning algorithms, is proposed to deal with the problem of modeling under the existence of outliers. The proposed algorithm has been employed in various examples, and the results all demonstrated its superiority over other robust learning algorithms, independent of the outliers. Moreover, the annealing schedule k/t, where k is a constant and t is the epoch number, was found experimentally to achieve the best performance among the annealing schedules considered.
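
The annealing idea is easy to illustrate on a one-parameter regression with gross outliers: use a bounded-influence M-estimator whose scale beta follows the schedule beta = k/t, so early epochs behave like least squares and later epochs reject outliers aggressively. The Cauchy-type influence function here is an assumption; the paper's exact loss differs:

```python
import numpy as np

def cauchy_influence(residual, beta):
    # Bounded influence: large residuals (outliers) contribute little.
    return residual / (1.0 + residual ** 2 / beta)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 3.0 * x + rng.normal(0.0, 0.1, 200)
y[:10] += 20.0                      # inject gross outliers

w, lr, k = 0.0, 0.1, 100.0
for epoch in range(1, 101):
    beta = k / epoch                # annealing schedule beta = k / t
    r = y - w * x
    w += lr * np.mean(cauchy_influence(r, beta) * x)
print(w)                            # approaches 3 despite the outliers
```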

Journal ArticleDOI
H. Tsukimoto
TL;DR: The algorithm is a decompositional approach which can be applied to any neural network whose output function is monotone, such as a sigmoid function; it does not depend on training algorithms, and its computational complexity is polynomial.
Abstract: Presents an algorithm for extracting rules from trained neural networks. The algorithm is a decompositional approach which can be applied to any neural network whose output function is monotone such as a sigmoid function. Therefore, the algorithm can be applied to multilayer neural networks, recurrent neural networks and so on. It does not depend on training algorithms, and its computational complexity is polynomial. The basic idea is that the units of neural networks are approximated by Boolean functions. But the computational complexity of the approximation is exponential, and so a polynomial algorithm is presented. The author has applied the algorithm to several problems to extract understandable and accurate rules. The paper shows the results for the votes data, mushroom data, and others. The algorithm is extended to the continuous domain, where extracted rules are continuous Boolean functions. Roughly speaking, the representation by continuous Boolean functions means the representation using conjunction, disjunction, direct proportion, and reverse proportion. This paper shows the results for iris data.
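
The basic idea, approximating each trained unit by a Boolean function, can be shown directly, albeit with the exponential enumeration that the paper replaces by a polynomial algorithm. A toy sketch for a single sigmoid unit (the weights are made up to represent a learned AND):

```python
import itertools
import numpy as np

def unit_output(x, w, b):
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def boolean_approximation(w, b):
    """Truth table of the unit over binary inputs, thresholded at 0.5."""
    return {
        bits: bool(unit_output(np.array(bits), w, b) >= 0.5)
        for bits in itertools.product([0, 1], repeat=len(w))
    }

# A unit whose trained weights implement roughly "x1 AND x2":
print(boolean_approximation(np.array([5.0, 5.0]), -7.5))
# {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
```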

Journal ArticleDOI
TL;DR: It is shown that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework, which allows automatic adjustment of the regularization parameter and the kernel parameter to their near-optimal values.
Abstract: We show that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework (1992). We further show that levels 2 and 3 of the evidence framework can also be applied to SVMs. This integration allows automatic adjustment of the regularization parameter and the kernel parameter to their near-optimal values. Moreover, it opens up a wealth of Bayesian tools for use with SVMs. Performance of this method is evaluated on both synthetic and real-world data sets.

Journal ArticleDOI
TL;DR: This paper connects this method to projected gradient methods and provides theoretical proofs for a version of decomposition methods, showing that the convergence proof is valid for general decomposition methods if their working set selection meets a simple requirement.
Abstract: The support vector machine (SVM) is a promising technique for pattern recognition. It requires the solution of a large dense quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Up to now, very few methods can handle the memory problem and an important one is the "decomposition method." However, there is no convergence proof so far. We connect this method to projected gradient methods and provide theoretical proofs for a version of decomposition methods. An extension to bound-constrained formulation of SVM is also provided. We then show that this convergence proof is valid for general decomposition methods if their working set selection meets a simple requirement.

Journal ArticleDOI
TL;DR: This paper presents a continuous-time recurrent neural-network model for nonlinear optimization with any continuously differentiable objective function and bound constraints and shows that the recurrent neural network is globally exponentially stable for almost any positive network parameters.
Abstract: This paper presents a continuous-time recurrent neural-network model for nonlinear optimization with any continuously differentiable objective function and bound constraints. Quadratic optimization with bound constraints is a special problem which can be solved by the recurrent neural network. The proposed recurrent neural network has the following characteristics. 1) It is regular in the sense that any optimum of the objective function with bound constraints is also an equilibrium point of the neural network. If the objective function to be minimized is convex, then the recurrent neural network is complete in the sense that the set of optima of the function with bound constraints coincides with the set of equilibria of the neural network. 2) The recurrent neural network is primal and quasiconvergent in the sense that its trajectory cannot escape from the feasible region and will converge to the set of equilibria of the neural network for any initial point in the feasible bound region. 3) The recurrent neural network has an attractivity property in the sense that its trajectory will eventually converge to the feasible region for any initial state, even outside the bounded feasible region. 4) For minimizing any strictly convex quadratic objective function subject to bound constraints, the recurrent neural network is globally exponentially stable for almost any positive network parameters. Simulation results are given to demonstrate the convergence and performance of the proposed recurrent neural network for nonlinear optimization with bound constraints.
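
Characteristic 2) is easy to visualize with a small simulation. The dynamics below, x' = P(x - alpha * grad f(x)) - x with P the projection onto the box, are a common projection-network form offered as an illustration rather than the paper's exact model:

```python
import numpy as np

def grad_f(x):                       # f(x) = (x0 - 2)^2 + (x1 + 1)^2
    return 2.0 * (x - np.array([2.0, -1.0]))

lo, hi = np.zeros(2), np.ones(2)     # bound constraints: the unit box
x, alpha, dt = np.array([0.5, 0.5]), 0.5, 0.05
for _ in range(1_000):               # Euler integration of the network
    x += dt * (np.clip(x - alpha * grad_f(x), lo, hi) - x)
print(x)                             # -> [1, 0], the constrained minimum
```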

Journal ArticleDOI
TL;DR: The results indicate that combination strategies based on a single ANN outperform the other approaches to prediction of daily natural gas consumption needed by gas utilities.
Abstract: The focus of this paper is on the combination of artificial neural-network (ANN) forecasters with application to the prediction of daily natural gas consumption needed by gas utilities. ANN forecasters can model the complex relationship between weather parameters and previous gas consumption with the future consumption. A two-stage system is proposed with the first stage containing two ANN forecasters, a multilayer feedforward ANN and a functional link ANN. These forecasters are initially trained with the error backpropagation algorithm, but an adaptive strategy is employed to adjust their weights during online forecasting. The second stage consists of a combination module to mix the two individual forecasts produced in the first stage. Eight different combination algorithms are examined; they are based on averaging, recursive least squares, fuzzy logic, a feedforward ANN, a functional link ANN, a temperature space approach, Karmarkar's linear programming algorithm (1984), and an adaptive mixture of local experts (modular neural networks). The performance is tested on real data from six different gas utilities. The results indicate that combination strategies based on a single ANN outperform the other approaches.
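
As one example of a combination module, a recursive-least-squares combiner learns online how to weight the two stage-one forecasts. The forgetting factor and initialization below are illustrative assumptions:

```python
import numpy as np

class RLSCombiner:
    def __init__(self, n_forecasters=2, lam=0.99, delta=100.0):
        self.w = np.zeros(n_forecasters)        # combination weights
        self.P = delta * np.eye(n_forecasters)  # inverse-correlation estimate
        self.lam = lam                          # forgetting factor

    def combine(self, forecasts):
        return float(self.w @ np.asarray(forecasts, dtype=float))

    def update(self, forecasts, actual):
        f = np.asarray(forecasts, dtype=float)
        g = self.P @ f / (self.lam + f @ self.P @ f)    # gain vector
        self.w += g * (actual - self.w @ f)             # error correction
        self.P = (self.P - np.outer(g, f @ self.P)) / self.lam

combiner = RLSCombiner()
yhat = combiner.combine([102.0, 97.0])       # mix the two ANN forecasts
combiner.update([102.0, 97.0], actual=99.5)  # adapt once the truth arrives
```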

Journal ArticleDOI
TL;DR: Elimination of variables through neural network sensitivity analysis and predicting performance through model cross-validation allows the analyst to reduce the number of inputs and improve the model's predictive ability at the same time.
Abstract: A novel neural-network-based technique, called "data strip mining," extracts predictive models from data sets which have a large number of potential inputs and comparatively few data points. This methodology uses neural network sensitivity analysis to determine which predictors are most significant in the problem. Neural network sensitivity analysis holds all but one input to a trained neural network constant while varying each input over its entire range to determine its effect on the output. Elimination of variables through neural network sensitivity analysis and predicting performance through model cross-validation allow the analyst to reduce the number of inputs and improve the model's predictive ability at the same time. This paper demonstrates the technique's effectiveness on a pair of problems from combinatorial chemistry with over 400 potential inputs each. For these data sets, model selection by neural sensitivity analysis outperformed other variable selection methods, including forward selection and a genetic algorithm.
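
The sensitivity procedure itself fits in a few lines: freeze all inputs at a baseline, sweep one input across its observed range, and score it by the spread it induces in the output. `model` is any trained predictor with a scikit-learn style predict(); the baseline choice and 20-step sweep are assumptions:

```python
import numpy as np

def sensitivity(model, X, n_steps=20):
    baseline = X.mean(axis=0)               # hold all other inputs here
    scores = []
    for j in range(X.shape[1]):
        sweep = np.tile(baseline, (n_steps, 1))
        sweep[:, j] = np.linspace(X[:, j].min(), X[:, j].max(), n_steps)
        out = model.predict(sweep)
        scores.append(out.max() - out.min())   # effect of input j alone
    return np.array(scores)

# e.g., keep only the 25 most influential inputs, then retrain:
# keep = np.argsort(sensitivity(model, X))[-25:]
```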

Journal ArticleDOI
TL;DR: It is shown that the design of such nonadaptive indirect control systems necessitates only the training of the inverse of the model deprived of its delay, and that the presence of the delay thus does not increase the order of the inverse.
Abstract: We propose a design procedure for neural internal model control systems for stable processes with delay. We show that the design of such nonadaptive indirect control systems necessitates only the training of the inverse of the model deprived of its delay, and that the presence of the delay thus does not increase the order of the inverse. The controller is then obtained by cascading this inverse with a rallying model which imposes the regulation dynamic behavior and ensures the robustness of the stability. A change in the desired regulation dynamic behavior, or an improvement of the stability, can be obtained by simply tuning the rallying model, without retraining the whole model reference controller. Since the robustness properties of internal model control systems are obtained when the inverse is perfect, we detail the precautions that must be taken in training the inverse so that it is accurate over the whole space visited during operation with the process. In the same spirit, we place emphasis on neural models affine in the control input, whose perfect inverse is derived without training. The control of simulated processes illustrates the proposed design procedure and the properties of the neural internal model control system for processes without and with delay.
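
A discrete-time skeleton of the loop described above, with linear blocks standing in for the neural model and its trained inverse (all constants are assumptions): the controller inverts the delay-free model, a first-order rallying model shapes the regulation dynamics, and the plant/model mismatch is fed back:

```python
a, b, delay = 0.9, 0.5, 5        # plant: y[t+1] = a*y[t] + b*u[t-delay]
alpha = 0.8                      # rallying-model pole (the tuning knob)

y = ym_free = ym_delay = r_f = 0.0
u_buf = [0.0] * delay            # transport delay line
for t in range(200):
    mismatch = y - ym_delay                  # plant vs. internal model
    r_f = alpha * r_f + (1 - alpha) * (1.0 - mismatch)  # setpoint = 1.0
    u = (r_f - a * ym_free) / b              # invert the delay-free model
    u_buf.append(u)
    u_del = u_buf.pop(0)
    y        = a * y        + b * u_del      # true plant (with delay)
    ym_delay = a * ym_delay + b * u_del      # internal model (with delay)
    ym_free  = a * ym_free  + b * u          # delay-free model, inverted above
print(y)                                     # settles near the setpoint 1.0
```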