
Showing papers in "IEEE Transactions on Neural Networks" in 2000


Journal ArticleDOI
TL;DR: The two-stage procedure--first using SOM to produce the prototypes that are then clustered in the second stage--is found to perform well when compared with direct clustering of the data and to reduce the computation time.
Abstract: The self-organizing map (SOM) is an excellent tool in the exploratory phase of data mining. It projects the input space onto prototypes of a low-dimensional regular grid that can be effectively utilized to visualize and explore properties of the data. When the number of SOM units is large, similar units need to be grouped, i.e., clustered, to facilitate quantitative analysis of the map and the data. In this paper, different approaches to clustering of the SOM are considered. In particular, hierarchical agglomerative clustering and partitive clustering using K-means are investigated. The two-stage procedure, in which the SOM first produces the prototypes that are then clustered in the second stage, is found to perform well compared with direct clustering of the data, and to reduce the computation time.
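
The two-stage procedure is straightforward to reproduce. A minimal sketch in Python, assuming the third-party minisom package and scikit-learn (tooling choices of this sketch, not of the paper, which predates both); the grid size and cluster count are illustrative:

```python
# Two-stage clustering: train a SOM, then cluster its prototype vectors
# instead of the raw data points (AgglomerativeClustering works the same way).
import numpy as np
from minisom import MiniSom
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
data = rng.normal(size=(10_000, 8))          # stand-in for the real data set

# Stage 1: project the data onto a 15x15 grid of prototype vectors.
som = MiniSom(15, 15, data.shape[1], sigma=1.5, learning_rate=0.5)
som.train_random(data, 5_000)
prototypes = som.get_weights().reshape(-1, data.shape[1])   # 225 prototypes

# Stage 2: cluster the 225 prototypes rather than the 10,000 points.
proto_labels = KMeans(n_clusters=5, n_init=10).fit_predict(prototypes)

# Each data point inherits the cluster of its best-matching prototype.
def cluster_of(x):
    bmu = np.argmin(np.linalg.norm(prototypes - x, axis=1))
    return proto_labels[bmu]
```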

2,387 citations


Journal ArticleDOI
TL;DR: A system that is able to organize vast document collections according to textual similarities based on the self-organizing map (SOM) algorithm, based on 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
Abstract: Describes the implementation of a system that is able to organize vast document collections according to textual similarities. It is based on the self-organizing map (SOM) algorithm. Statistical representations of the documents' vocabularies are used as the feature vectors. The main goal of our work has been to scale up the SOM algorithm to deal with large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. As the feature vectors we used 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
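
Random projection is the key scaling step: a fixed random matrix maps each weighted word histogram down to 500 dimensions while approximately preserving similarities. A hedged numpy sketch (the vocabulary size and the idf weighting are illustrative assumptions, not the paper's exact scheme):

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, target_dim = 50_000, 500

# One fixed random projection matrix, shared by all documents.
R = rng.normal(size=(target_dim, vocab_size)) / np.sqrt(target_dim)

def document_vector(word_counts, idf_weights):
    """Project a weighted word histogram to a 500-dim feature vector."""
    histogram = word_counts * idf_weights      # weighted word histogram
    x = R @ histogram                          # random projection
    return x / np.linalg.norm(x)               # normalize for cosine use
```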

1,007 citations


Journal ArticleDOI
TL;DR: Using clues from the KKT conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression that perform significantly faster than the original SMO on the datasets tried.
Abstract: This paper points out an important source of inefficiency in Smola and Schölkopf's (1998) sequential minimal optimization (SMO) algorithm for support vector machine regression that is caused by the use of a single threshold value. Using clues from the Karush-Kuhn-Tucker (KKT) conditions for the dual problem, two threshold parameters are employed to derive modifications of SMO for regression. These modified algorithms perform significantly faster than the original SMO on the datasets tried.

837 citations


Journal ArticleDOI
TL;DR: This article proposes to bring the various neuro-fuzzy models used for rule generation under a unified soft computing framework, and includes both rule extraction and rule refinement in the broader perspective of rule generation.
Abstract: The present article is a novel attempt at providing an exhaustive survey of neuro-fuzzy rule generation algorithms. Rule generation from artificial neural networks is gaining popularity in recent times due to its capability of providing some insight to the user about the symbolic knowledge embedded within the network. Fuzzy sets are an aid in providing this information in a more human-comprehensible or natural form, and can handle uncertainties at various levels. The neuro-fuzzy approach, symbiotically combining the merits of connectionist and fuzzy approaches, constitutes a key component of soft computing at this stage. To date, there has been no detailed and integrated categorization of the various neuro-fuzzy models used for rule generation. We propose to bring these together under a unified soft computing framework. Moreover, we include both rule extraction and rule refinement in the broader perspective of rule generation. Rules learned and generated for fuzzy reasoning and fuzzy control are also considered from this wider viewpoint. Models are grouped on the basis of their level of neuro-fuzzy synthesis. The use of other soft computing tools like genetic algorithms and rough sets is emphasized. Rule generation from fuzzy knowledge-based networks, which initially encode some crude domain knowledge, is found to result in more refined rules. Finally, a real-life application to medical diagnosis is provided.

726 citations


Journal ArticleDOI
TL;DR: The growing self-organizing map (GSOM) is presented in detail and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated.
Abstract: The growing self-organizing map (GSOM) algorithm is presented in detail, and the effect of a spread factor, which can be used to measure and control the spread of the GSOM, is investigated. The spread factor is independent of the dimensionality of the data and as such can be used as a controlling measure for generating maps with different dimensionality, which can then be compared and analyzed with better accuracy. The spread factor is also presented as a method of achieving hierarchical clustering of a data set with the GSOM. Such hierarchical clustering allows the data analyst to identify significant and interesting clusters at a higher level of the hierarchy, and to continue with finer clustering of the interesting clusters only. Initially, only a small map is created with a low spread factor, which can be generated even for a very large data set. Further analysis is then conducted on selected, smaller sections of the data. This method therefore facilitates the analysis of even very large data sets.
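
In the GSOM formulation, the spread factor SF in (0, 1) enters through a growth threshold GT = -D * ln(SF), where D is the data dimensionality: a unit spawns new neighbors once its accumulated quantization error exceeds GT. A low SF therefore yields a small, coarse map and a high SF a large, detailed one. A tiny illustration (the dimensionality below is an assumption):

```python
import math

def growth_threshold(dim, spread_factor):
    # Low SF -> high threshold -> little growth -> small, coarse map.
    return -dim * math.log(spread_factor)

print(growth_threshold(dim=20, spread_factor=0.1))  # ~46.1: coarse first pass
print(growth_threshold(dim=20, spread_factor=0.9))  # ~2.1: fine drill-down
```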

529 citations


Journal ArticleDOI
TL;DR: An adaptive output feedback control scheme for the output tracking of a class of continuous-time nonlinear plants is presented and it is shown that by using adaptive control in conjunction with robust control, it is possible to tolerate larger approximation errors resulting from the use of lower order networks.
Abstract: An adaptive output feedback control scheme for the output tracking of a class of continuous-time nonlinear plants is presented. An RBF neural network is used to adaptively compensate for the plant nonlinearities. The network weights are adapted using a Lyapunov-based design. The method uses parameter projection, control saturation, and a high-gain observer to achieve semi-global uniform ultimate boundedness. The effectiveness of the proposed method is demonstrated through simulations. The simulations also show that by using adaptive control in conjunction with robust control, it is possible to tolerate larger approximation errors resulting from the use of lower order networks.

529 citations


Journal ArticleDOI
TL;DR: An on-line version of the proposed algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems and reaches the error minimum in a much smaller number of iterations.
Abstract: How to efficiently train recurrent networks remains a challenging and active research topic. Most of the proposed training approaches are based on computational ways to efficiently obtain the gradient of the error function, and can generally be grouped into five major categories. In this study we present a derivation that unifies these approaches. We demonstrate that the approaches are only five different ways of solving a particular matrix equation. The second goal of this paper is to develop a new algorithm based on the insights gained from the novel formulation. The new algorithm, which is based on approximating the error gradient, has lower computational complexity in computing the weight update than the competing techniques for most typical problems. In addition, it reaches the error minimum in a much smaller number of iterations. A desirable characteristic of recurrent network training algorithms is the ability to update the weights in an online fashion. We have also developed an online version of the proposed algorithm, which is based on updating the error gradient approximation in a recursive manner.

432 citations


Journal ArticleDOI
TL;DR: Two networks are employed: a multilayer perceptron and a radial basis function network to account for the exact satisfaction of the boundary conditions of complex boundary geometry.
Abstract: Partial differential equations (PDEs) with boundary conditions (Dirichlet or Neumann) defined on boundaries with simple geometry have been successfully treated using sigmoidal multilayer perceptrons in previous works. This article deals with the case of complex boundary geometry, where the boundary is determined by a number of points that belong to it and are closely located, so as to offer a reasonable representation. Two networks are employed: a multilayer perceptron and a radial basis function network. The latter is used to account for the exact satisfaction of the boundary conditions. The method has been successfully tested on two-dimensional and three-dimensional PDEs and has yielded accurate results.

420 citations


Journal ArticleDOI
TL;DR: Comparative computational evaluation of the new fast iterative algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that the algorithm is very competitive.
Abstract: In this paper we give a new fast iterative algorithm for support vector machine (SVM) classifier design. The basic problem treated is one that does not allow classification violations. The problem is converted to a problem of computing the nearest point between two convex polytopes. The suitability of two classical nearest point algorithms, due to Gilbert, and Mitchell et al., is studied. Ideas from both these algorithms are combined and modified to derive our fast algorithm. For problems which require classification violations to be allowed, the violations are quadratically penalized and an idea due to Cortes and Vapnik and Friess is used to convert it to a problem in which there are no classification violations. Comparative computational evaluation of our algorithm against powerful SVM methods such as Platt's sequential minimal optimization shows that our algorithm is very competitive.

401 citations


Journal ArticleDOI
TL;DR: A novel architecture of an oscillatory neural network that consists of phase-locked loop (PLL) circuits that stores and retrieves complex oscillatory patterns as synchronized states with appropriate phase relations between neurons is proposed.
Abstract: We propose a novel architecture of an oscillatory neural network that consists of phase-locked loop (PLL) circuits. It stores and retrieves complex oscillatory patterns as synchronized states with appropriate phase relations between neurons.
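
The abstract is terse, so the sketch below shows a generic phase-oscillator associative memory in the same spirit, not the paper's PLL circuit model: binary patterns are stored as 0/π phase relations through Hebbian coupling and retrieved as synchronized states. All constants are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64
patterns = rng.choice([-1.0, 1.0], size=(3, N))   # patterns to store
K = patterns.T @ patterns / N                     # Hebbian phase coupling

theta = rng.uniform(0.0, 2.0 * np.pi, N)          # random initial phases
omega, dt = 2.0 * np.pi, 0.01
for _ in range(2_000):                            # Euler integration
    diff = theta[None, :] - theta[:, None]        # theta_j - theta_i
    theta += dt * (omega + (K * np.sin(diff)).sum(axis=1))

# Read the retrieved pattern out of the relative phases (0 vs. pi).
retrieved = np.sign(np.cos(theta - theta[0]))
```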

356 citations


Journal ArticleDOI
TL;DR: A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
Abstract: This paper describes a general fuzzy min-max (GFMM) neural network which is a generalization and extension of the fuzzy min-max clustering and classification algorithms of Simpson (1992, 1993). The GFMM method combines supervised and unsupervised learning in a single training algorithm. The fusion of clustering and classification resulted in an algorithm that can be used as pure clustering, pure classification, or hybrid clustering-classification. It exhibits the property of finding decision boundaries between classes while clustering patterns that cannot be said to belong to any of the existing classes. As in the original algorithms, hyperbox fuzzy sets are used as a representation of clusters and classes. Learning is usually completed in a few passes and consists of placing and adjusting the hyperboxes in the pattern space; this is an expansion-contraction process. The classification results can be crisp or fuzzy. New data can be included without the need for retraining. While retaining all the interesting features of the original algorithms, a number of modifications to their definition have been made in order to accommodate fuzzy input patterns in the form of lower and upper bounds, combine the supervised and unsupervised learning, and improve the effectiveness of operations. A detailed account of the GFMM neural network, its comparison with Simpson's fuzzy min-max neural networks, a set of examples, and an application to leakage detection and identification in water distribution systems are given.
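
The hyperbox representation is the heart of the method: a class or cluster is a box [v, w] in pattern space, and membership decays with distance outside the box. A simplified sketch of such a membership function (the exact GFMM formula differs in detail; the sensitivity parameter gamma is an assumption):

```python
import numpy as np

def hyperbox_membership(x, v, w, gamma=4.0):
    """x: pattern; v, w: min and max corners of the hyperbox (v <= w)."""
    below = np.maximum(0.0, v - x)     # how far x falls under the min point
    above = np.maximum(0.0, x - w)     # how far x exceeds the max point
    per_dim = 1.0 - np.minimum(1.0, gamma * (below + above))
    return float(per_dim.min())        # worst-fitting dimension decides

v, w = np.array([0.2, 0.3]), np.array([0.5, 0.6])
print(hyperbox_membership(np.array([0.3, 0.4]), v, w))  # 1.0: inside the box
print(hyperbox_membership(np.array([0.7, 0.4]), v, w))  # 0.2: partly outside
```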

Journal ArticleDOI
TL;DR: These experiments show that under a wide variety of assumptions concerning the cost of intervention and the retention rate resulting from intervention, using predictive techniques to identify potential churners and offering incentives can yield significant savings to a carrier.
Abstract: We explore techniques from statistical machine learning to predict churn and, based on these predictions, to determine what incentives should be offered to subscribers to improve retention and maximize profitability to the carrier. The techniques include logit regression, decision trees, neural networks, and boosting. Our experiments are based on a database of nearly 47,000 U.S. domestic subscribers that includes information about their usage, billing, credit, application, and complaint history. Our experiments show that under a wide variety of assumptions concerning the cost of intervention and the retention rate resulting from intervention, using predictive techniques to identify potential churners and offering incentives can yield significant savings to a carrier. We also show the importance of a data representation crafted by domain experts. Finally, we report on a real-world test of the techniques that validates our simulation experiments.
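
The economics behind the conclusion can be made concrete with a back-of-the-envelope calculation. All figures below are illustrative assumptions, not the paper's numbers:

```python
# Contact the top-ranked fraction of subscribers and compare what the
# incentives cost against the churn revenue they save.
subscribers    = 47_000
monthly_value  = 50.0    # revenue per retained subscriber (assumed)
incentive_cost = 10.0    # cost of one incentive offer (assumed)
retention_lift = 0.30    # fraction of true churners the offer retains
top_k          = 0.05    # contact the 5% ranked most likely to churn
precision      = 0.40    # contacted subscribers who would really churn

contacted   = subscribers * top_k
saved_value = contacted * precision * retention_lift * monthly_value * 12
offer_cost  = contacted * incentive_cost
print(f"net annual savings: ${saved_value - offer_cost:,.0f}")  # ~$145,700
```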

Journal ArticleDOI
TL;DR: A supervised network structure determination algorithm that identifies an appropriate smoothing parameter using a genetic algorithm and determines suitable pattern layer neurons using a forward regression orthogonal algorithm is proposed.
Abstract: Network structure determination is an important issue in pattern classification based on a probabilistic neural network. In this study, a supervised network structure determination algorithm is proposed. The proposed algorithm consists of two parts and runs in an iterative way. The first part identifies an appropriate smoothing parameter using a genetic algorithm, while the second part determines suitable pattern layer neurons using a forward regression orthogonal algorithm. The proposed algorithm is capable of offering a fairly small network structure with satisfactory classification accuracy.
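
For context, the underlying classifier is simple: every retained training pattern is a Gaussian kernel in the pattern layer, and the summation layer averages kernel responses per class. The smoothing parameter sigma below is the quantity the paper tunes with a genetic algorithm (the pattern-selection stage is not shown); this is a minimal sketch, not the paper's algorithm:

```python
import numpy as np

def pnn_predict(X_train, y_train, x, sigma=0.3):
    """Classify x by the class with the largest mean kernel response."""
    d2 = ((X_train - x) ** 2).sum(axis=1)      # squared distances
    k = np.exp(-d2 / (2.0 * sigma ** 2))       # pattern-layer outputs
    classes = np.unique(y_train)
    scores = [k[y_train == c].mean() for c in classes]   # summation layer
    return classes[int(np.argmax(scores))]
```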

Journal ArticleDOI
TL;DR: It is shown that the information available in an ROLS algorithm after network training can be used to sequentially select centers to minimize the network output error and provide efficient methods for network reduction to achieve smaller architectures with acceptable accuracy and without retraining.
Abstract: Recursive orthogonal least squares (ROLS) is a numerically robust method for solving for the output layer weights of a radial basis function (RBF) network, and requires less computer memory than the batch alternative. In the paper, the use of ROLS is extended to selecting the centers of an RBF network. It is shown that the information available in an ROLS algorithm after network training can be used to sequentially select centers to minimize the network output error. This provides efficient methods for network reduction to achieve smaller architectures with acceptable accuracy and without retraining. Two selection methods are developed, forward and backward. The methods are illustrated in applications of RBF networks to modeling a nonlinear time series and a real multiinput-multioutput chemical process. The final network models obtained achieve acceptable accuracy with significant reductions in the number of required centers.

Journal ArticleDOI
TL;DR: A sufficient condition for the existence of a unique equilibrium point and its global asymptotic stability for cellular neural networks with delay (DCNNs) is derived, and it is shown that the condition relies on the feedback matrices and is independent of the delay parameter.
Abstract: A sufficient condition related to the existence of a unique equilibrium point and its global asymptotic stability for cellular neural networks with delay (DCNNs) is derived. It is shown that the condition relies on the feedback matrices and is independent of the delay parameter. Furthermore, this condition is less restrictive than that given in the literature.

Journal ArticleDOI
TL;DR: Two constructive learning algorithms, MPyramid-real and MTiling-real, are presented that extend the pyramid and tiling algorithms, respectively, for learning real to M-ary mappings; the convergence of these algorithms is proved and their applicability to practical pattern classification problems is demonstrated empirically.
Abstract: Constructive learning algorithms offer an attractive approach for the incremental construction of near-minimal neural-network architectures for pattern classification. They help overcome the need for ad hoc and often inappropriate choices of network topology in algorithms that search for suitable weights in a priori fixed network architectures. Several such algorithms are proposed in the literature and shown to converge to zero classification errors (under certain assumptions) on tasks that involve learning a binary to binary mapping (i.e., classification problems involving binary-valued input attributes and two output categories). We present two constructive learning algorithms, MPyramid-real and MTiling-real, that extend the pyramid and tiling algorithms, respectively, for learning real to M-ary mappings (i.e., classification problems involving real-valued input attributes and multiple output classes). We prove the convergence of these algorithms and empirically demonstrate their applicability to practical pattern classification problems. Additionally, we show how the incorporation of a local pruning step can eliminate several redundant neurons from MTiling-real networks.

Journal ArticleDOI
TL;DR: The variational methods of Jaakkola and Jordan are applied to Gaussian processes to produce an efficient Bayesian binary classifier.
Abstract: Gaussian processes are a promising nonlinear regression tool, but it is not straightforward to solve classification problems with them. In the paper the variational methods of Jaakkola and Jordan (2000) are applied to Gaussian processes to produce an efficient Bayesian binary classifier.

Journal ArticleDOI
TL;DR: In this article, the authors describe the application of mixtures of experts on gender and ethnic classification of human faces and pose classification, and show their feasibility on the FERET database of facial images.
Abstract: We describe the application of mixtures of experts to gender and ethnic classification of human faces, and to pose classification, and show their feasibility on the FERET database of facial images. The mixture of experts is implemented using the "divide and conquer" modularity principle with respect to the granularity and/or the locality of information. The mixture of experts consists of ensembles of radial basis functions (RBFs). Inductive decision trees (DTs) and support vector machines (SVMs) implement the "gating network" components for deciding which of the experts should be used to determine the classification output and to restrict the support of the input space. Both the ensemble of RBFs (ERBF) and the SVM use the RBF kernel ("expert") for gating the inputs. Our experimental results yield an average accuracy rate of 96% on gender classification and 92% on ethnic classification using the ERBF/DT approach on frontal face images, while the SVM yields 100% on pose classification.

Journal ArticleDOI
TL;DR: It is proved that single hidden layer feedforward neural networks (SLFN's) with any continuous bounded nonconstant activation function or any arbitrary bounded (continuous or not continuous) activation function which has unequal limits at infinities (not just perceptrons) can form disjoint decision regions with arbitrary shapes in multidimensional cases.
Abstract: Multilayer perceptrons with hard-limiting (signum) activation functions can form complex decision regions. It is well known that a three-layer perceptron (two hidden layers) can form arbitrary disjoint decision regions and a two-layer perceptron (one hidden layer) can form single convex decision regions. This paper further proves that single hidden layer feedforward neural networks (SLFNs) with any continuous bounded nonconstant activation function, or any arbitrary bounded (continuous or not continuous) activation function which has unequal limits at infinities (not just perceptrons), can form disjoint decision regions with arbitrary shapes in multidimensional cases. SLFNs with some unbounded activation functions can also form disjoint decision regions with arbitrary shapes.

Journal ArticleDOI
TL;DR: A new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals using a feature extractor using wavelet packets in conjunction with linear predictive coding, a feature selection scheme, and a backpropagation neural-network classifier.
Abstract: In this paper, a new subband-based classification scheme is developed for classifying underwater mines and mine-like targets from the acoustic backscattered signals. The system consists of a feature extractor using wavelet packets in conjunction with linear predictive coding (LPC), a feature selection scheme, and a backpropagation neural-network classifier. The data set used for this study consists of the backscattered signals from six different objects: two mine-like targets and four nontargets, for several aspect angles. Simulation results on ten different noisy realizations and for a signal-to-noise ratio (SNR) of 12 dB are presented. The receiver operating characteristic (ROC) curve of the classifier, generated from these results, demonstrated excellent classification performance of the system. The generalization ability of the trained network was demonstrated by computing the error and classification rate statistics on a large data set. A multiaspect fusion scheme was also adopted in order to further improve the classification performance.
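
A hedged sketch of the front end: decompose each return with a wavelet packet, then fit a short LPC model per subband and concatenate the coefficients as the feature vector. It assumes the third-party PyWavelets package; the wavelet, depth, and LPC order are illustrative, and LPC is computed here by plain Levinson-Durbin:

```python
import numpy as np
import pywt

def lpc(signal, order):
    """LPC coefficients via autocorrelation and Levinson-Durbin."""
    r = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    a = np.zeros(order + 1)
    a[0], err = 1.0, r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i + 1] += k * a[i - 1::-1][:i]   # reflection update
        err *= 1.0 - k * k
    return a[1:]

def subband_lpc_features(signal, wavelet="db4", level=3, order=6):
    wp = pywt.WaveletPacket(data=signal, wavelet=wavelet, maxlevel=level)
    nodes = wp.get_level(level, "natural")   # 2**level subbands
    return np.concatenate([lpc(node.data, order) for node in nodes])
```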

Journal ArticleDOI
TL;DR: A new robust control technique for induction motors using neural networks (NNs) which is systematic and robust to parameter variations and does not require regression matrices, so that no preliminary dynamical analysis is needed.
Abstract: We present a new robust control technique for induction motors using neural networks (NNs). The method is systematic and robust to parameter variations. Motivated by the backstepping design technique, we first treat certain signals in the system as fictitious control inputs to a simpler subsystem. A two-layer NN is used in this stage to design the fictitious controller. We then apply a second two-layer NN to robustly realize the fictitious NN signals designed in the previous step. A new tuning scheme is proposed which can guarantee the boundedness of tracking error and weight updates. A main advantage of our method is that it does not require regression matrices, so that no preliminary dynamical analysis is needed. Another salient feature of our NN approach is that the off-line learning phase is not needed. Full state feedback is needed for implementation. Load torque and rotor resistance can be unknown but bounded.

Journal ArticleDOI
TL;DR: This paper proposes a neural controller for a class of unknown, minimum phase, feedback linearizable nonlinear systems with known relative degree, based on the backstepping design technique in conjunction with a linearly parameterized neural-network structure.
Abstract: We propose, from an adaptive control perspective, a neural controller for a class of unknown, minimum phase, feedback linearizable nonlinear systems with known relative degree. The control scheme is based on the backstepping design technique in conjunction with a linearly parameterized neural-network structure. The resulting controller, however, moves the complex mechanics involved in a typical backstepping design from off-line to online. With an appropriate choice of the network size and neural basis functions, the same controller can be trained online to control different nonlinear plants with the same relative degree, with semi-global stability as shown by a simple Lyapunov analysis. Meanwhile, the controller also preserves some of the performance properties of the standard backstepping controllers. Simulation results are shown to demonstrate these properties and to compare the neural controller with a standard backstepping controller.

Journal ArticleDOI
TL;DR: The annealing robust backpropagation learning algorithm (ARBP), which adopts the annealing concept into robust learning algorithms, is proposed to deal with the problem of modeling under the existence of outliers.
Abstract: Multilayer feedforward neural networks are often referred to as universal approximators. Nevertheless, if the training data are corrupted by large noise, such as outliers, traditional backpropagation learning schemes may not always come up with acceptable performance. Even though various robust learning algorithms have been proposed in the literature, those approaches still suffer from the initialization problem. In those robust learning algorithms, the so-called M-estimator is employed, whose loss function plays the role of discriminating outliers from the majority of the data by degrading the effects of those outliers during learning. However, the loss function used in those algorithms may not correctly discriminate against those outliers. In this paper, the annealing robust backpropagation learning algorithm (ARBP), which adopts the annealing concept into the robust learning algorithms, is proposed to deal with the problem of modeling under the existence of outliers. The proposed algorithm has been employed in various examples, and the results all demonstrated its superiority over other robust learning algorithms, independent of the outliers. Moreover, the annealing schedule k/t, where k is a constant and t is the epoch number, was found experimentally to achieve the best performance among the annealing schedules considered.
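
The annealing idea is easy to illustrate on a one-parameter regression with gross outliers: use a bounded-influence M-estimator whose scale beta follows the schedule beta = k/t, so early epochs behave like least squares and later epochs reject outliers aggressively. The Cauchy-type influence function here is an assumption; the paper's exact loss differs:

```python
import numpy as np

def cauchy_influence(residual, beta):
    # Bounded influence: large residuals (outliers) contribute little.
    return residual / (1.0 + residual ** 2 / beta)

rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, 200)
y = 3.0 * x + rng.normal(0.0, 0.1, 200)
y[:10] += 20.0                      # inject gross outliers

w, lr, k = 0.0, 0.1, 100.0
for epoch in range(1, 101):
    beta = k / epoch                # annealing schedule beta = k / t
    r = y - w * x
    w += lr * np.mean(cauchy_influence(r, beta) * x)
print(w)                            # approaches 3 despite the outliers
```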

Journal ArticleDOI
H. Tsukimoto
TL;DR: The algorithm is a decompositional approach which can be applied to any neural network whose output function is monotone, such as a sigmoid function; it does not depend on training algorithms, and its computational complexity is polynomial.
Abstract: Presents an algorithm for extracting rules from trained neural networks. The algorithm is a decompositional approach which can be applied to any neural network whose output function is monotone such as a sigmoid function. Therefore, the algorithm can be applied to multilayer neural networks, recurrent neural networks and so on. It does not depend on training algorithms, and its computational complexity is polynomial. The basic idea is that the units of neural networks are approximated by Boolean functions. But the computational complexity of the approximation is exponential, and so a polynomial algorithm is presented. The author has applied the algorithm to several problems to extract understandable and accurate rules. The paper shows the results for the votes data, mushroom data, and others. The algorithm is extended to the continuous domain, where extracted rules are continuous Boolean functions. Roughly speaking, the representation by continuous Boolean functions means the representation using conjunction, disjunction, direct proportion, and reverse proportion. This paper shows the results for iris data.
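
The basic idea, approximating each trained unit by a Boolean function, can be shown directly, albeit with the exponential enumeration that the paper replaces by a polynomial algorithm. A toy sketch for a single sigmoid unit (the weights are made up to represent a learned AND):

```python
import itertools
import numpy as np

def unit_output(x, w, b):
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

def boolean_approximation(w, b):
    """Truth table of the unit over binary inputs, thresholded at 0.5."""
    return {
        bits: bool(unit_output(np.array(bits), w, b) >= 0.5)
        for bits in itertools.product([0, 1], repeat=len(w))
    }

# A unit whose trained weights implement roughly "x1 AND x2":
print(boolean_approximation(np.array([5.0, 5.0]), -7.5))
# {(0, 0): False, (0, 1): False, (1, 0): False, (1, 1): True}
```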

Journal ArticleDOI
TL;DR: It is shown that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework, which allows automatic adjustment of the regularization parameter and the kernel parameter to their near-optimal values.
Abstract: We show that training of the support vector machine (SVM) can be interpreted as performing the level 1 inference of MacKay's evidence framework (1992). We further show that levels 2 and 3 of the evidence framework can also be applied to SVMs. This integration allows automatic adjustment of the regularization parameter and the kernel parameter to their near-optimal values. Moreover, it opens up a wealth of Bayesian tools for use with SVMs. Performance of this method is evaluated on both synthetic and real-world data sets.

Journal ArticleDOI
TL;DR: This paper connects this method to projected gradient methods and provides theoretical proofs for a version of decomposition methods, showing that the convergence proof is valid for general decomposition methods if their working set selection meets a simple requirement.
Abstract: The support vector machine (SVM) is a promising technique for pattern recognition. It requires the solution of a large dense quadratic programming problem. Traditional optimization methods cannot be directly applied due to memory restrictions. Up to now, very few methods can handle the memory problem and an important one is the "decomposition method." However, there is no convergence proof so far. We connect this method to projected gradient methods and provide theoretical proofs for a version of decomposition methods. An extension to bound-constrained formulation of SVM is also provided. We then show that this convergence proof is valid for general decomposition methods if their working set selection meets a simple requirement.

Journal ArticleDOI
TL;DR: This paper presents a continuous-time recurrent neural-network model for nonlinear optimization with any continuously differentiable objective function and bound constraints and shows that the recurrent neural network is globally exponentially stable for almost any positive network parameters.
Abstract: This paper presents a continuous-time recurrent neural-network model for nonlinear optimization with any continuously differentiable objective function and bound constraints. Quadratic optimization with bound constraints is a special problem which can be solved by the recurrent neural network. The proposed recurrent neural network has the following characteristics. 1) It is regular in the sense that any optimum of the objective function with bound constraints is also an equilibrium point of the neural network. If the objective function to be minimized is convex, then the recurrent neural network is complete in the sense that the set of optima of the function with bound constraints coincides with the set of equilibria of the neural network. 2) The recurrent neural network is primal and quasiconvergent in the sense that its trajectory cannot escape from the feasible region and will converge to the set of equilibria of the neural network for any initial point in the feasible bound region. 3) The recurrent neural network has an attractivity property in the sense that its trajectory will eventually converge to the feasible region for any initial state, even outside the bounded feasible region. 4) For minimizing any strictly convex quadratic objective function subject to bound constraints, the recurrent neural network is globally exponentially stable for almost any positive network parameters. Simulation results are given to demonstrate the convergence and performance of the proposed recurrent neural network for nonlinear optimization with bound constraints.
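
Characteristic 2) is easy to visualize with a small simulation. The dynamics below, x' = P(x - alpha * grad f(x)) - x with P the projection onto the box, are a common projection-network form offered as an illustration rather than the paper's exact model:

```python
import numpy as np

def grad_f(x):                       # f(x) = (x0 - 2)^2 + (x1 + 1)^2
    return 2.0 * (x - np.array([2.0, -1.0]))

lo, hi = np.zeros(2), np.ones(2)     # bound constraints: the unit box
x, alpha, dt = np.array([0.5, 0.5]), 0.5, 0.05
for _ in range(1_000):               # Euler integration of the network
    x += dt * (np.clip(x - alpha * grad_f(x), lo, hi) - x)
print(x)                             # -> [1, 0], the constrained minimum
```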

Journal ArticleDOI
TL;DR: The results indicate that combination strategies based on a single ANN outperform the other approaches to prediction of daily natural gas consumption needed by gas utilities.
Abstract: The focus of this paper is on the combination of artificial neural-network (ANN) forecasters with application to the prediction of daily natural gas consumption needed by gas utilities. ANN forecasters can model the complex relationship between weather parameters and previous gas consumption with the future consumption. A two-stage system is proposed with the first stage containing two ANN forecasters, a multilayer feedforward ANN and a functional link ANN. These forecasters are initially trained with the error backpropagation algorithm, but an adaptive strategy is employed to adjust their weights during online forecasting. The second stage consists of a combination module to mix the two individual forecasts produced in the first stage. Eight different combination algorithms are examined; they are based on averaging, recursive least squares, fuzzy logic, a feedforward ANN, a functional link ANN, a temperature space approach, Karmarkar's linear programming algorithm (1984), and an adaptive mixture of local experts (modular neural networks). The performance is tested on real data from six different gas utilities. The results indicate that combination strategies based on a single ANN outperform the other approaches.
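
As one example of a combination module, a recursive-least-squares combiner learns online how to weight the two stage-one forecasts. The forgetting factor and initialization below are illustrative assumptions:

```python
import numpy as np

class RLSCombiner:
    def __init__(self, n_forecasters=2, lam=0.99, delta=100.0):
        self.w = np.zeros(n_forecasters)        # combination weights
        self.P = delta * np.eye(n_forecasters)  # inverse-correlation estimate
        self.lam = lam                          # forgetting factor

    def combine(self, forecasts):
        return float(self.w @ np.asarray(forecasts, dtype=float))

    def update(self, forecasts, actual):
        f = np.asarray(forecasts, dtype=float)
        g = self.P @ f / (self.lam + f @ self.P @ f)    # gain vector
        self.w += g * (actual - self.w @ f)             # error correction
        self.P = (self.P - np.outer(g, f @ self.P)) / self.lam

combiner = RLSCombiner()
yhat = combiner.combine([102.0, 97.0])       # mix the two ANN forecasts
combiner.update([102.0, 97.0], actual=99.5)  # adapt once the truth arrives
```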

Journal ArticleDOI
TL;DR: Elimination of variables through neural network sensitivity analysis and predicting performance through model cross-validation allows the analyst to reduce the number of inputs and improve the model's predictive ability at the same time.
Abstract: A novel neural-network-based technique, called "data strip mining," extracts predictive models from data sets which have a large number of potential inputs and comparatively few data points. This methodology uses neural network sensitivity analysis to determine which predictors are most significant in the problem. Neural network sensitivity analysis holds all but one input to a trained neural network constant while varying each input over its entire range to determine its effect on the output. Elimination of variables through neural network sensitivity analysis and predicting performance through model cross-validation allow the analyst to reduce the number of inputs and improve the model's predictive ability at the same time. This paper demonstrates the technique's effectiveness on a pair of problems from combinatorial chemistry with over 400 potential inputs each. For these data sets, model selection by neural sensitivity analysis outperformed other variable selection methods, including forward selection and a genetic algorithm.
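
The sensitivity procedure itself fits in a few lines: freeze all inputs at a baseline, sweep one input across its observed range, and score it by the spread it induces in the output. `model` is any trained predictor with a scikit-learn style predict(); the baseline choice and 20-step sweep are assumptions:

```python
import numpy as np

def sensitivity(model, X, n_steps=20):
    baseline = X.mean(axis=0)               # hold all other inputs here
    scores = []
    for j in range(X.shape[1]):
        sweep = np.tile(baseline, (n_steps, 1))
        sweep[:, j] = np.linspace(X[:, j].min(), X[:, j].max(), n_steps)
        out = model.predict(sweep)
        scores.append(out.max() - out.min())   # effect of input j alone
    return np.array(scores)

# e.g., keep only the 25 most influential inputs, then retrain:
# keep = np.argsort(sensitivity(model, X))[-25:]
```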

Journal ArticleDOI
TL;DR: It is shown that the design of such nonadaptive indirect control systems necessitates only the training of the inverse of the model deprived of its delay, and that the presence of the delay thus does not increase the order of the inverse.
Abstract: We propose a design procedure for neural internal model control systems for stable processes with delay. We show that the design of such nonadaptive indirect control systems necessitates only the training of the inverse of the model deprived of its delay, and that the presence of the delay thus does not increase the order of the inverse. The controller is then obtained by cascading this inverse with a rallying model which imposes the regulation dynamic behavior and ensures the robustness of the stability. A change in the desired regulation dynamic behavior, or an improvement of the stability, can be obtained by simply tuning the rallying model, without retraining the whole model reference controller. Since the robustness properties of internal model control systems are obtained when the inverse is perfect, we detail the precautions that must be taken in training the inverse so that it is accurate over the whole space visited during operation with the process. In the same spirit, we place emphasis on neural models affine in the control input, whose perfect inverse is derived without training. The control of simulated processes illustrates the proposed design procedure and the properties of the neural internal model control system for processes without and with delay.
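
A discrete-time skeleton of the loop described above, with linear blocks standing in for the neural model and its trained inverse (all constants are assumptions): the controller inverts the delay-free model, a first-order rallying model shapes the regulation dynamics, and the plant/model mismatch is fed back:

```python
a, b, delay = 0.9, 0.5, 5        # plant: y[t+1] = a*y[t] + b*u[t-delay]
alpha = 0.8                      # rallying-model pole (the tuning knob)

y = ym_free = ym_delay = r_f = 0.0
u_buf = [0.0] * delay            # transport delay line
for t in range(200):
    mismatch = y - ym_delay                  # plant vs. internal model
    r_f = alpha * r_f + (1 - alpha) * (1.0 - mismatch)  # setpoint = 1.0
    u = (r_f - a * ym_free) / b              # invert the delay-free model
    u_buf.append(u)
    u_del = u_buf.pop(0)
    y        = a * y        + b * u_del      # true plant (with delay)
    ym_delay = a * ym_delay + b * u_del      # internal model (with delay)
    ym_free  = a * ym_free  + b * u          # delay-free model, inverted above
print(y)                                     # settles near the setpoint 1.0
```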