
Showing papers in "IEEE Transactions on Neural Networks in 2006"


Journal ArticleDOI
TL;DR: This paper proves, via an incremental constructive method, that in order for SLFNs to work as universal approximators, one may simply randomly choose hidden nodes and then need only adjust the output weights linking the hidden layer and the output layer.
Abstract: According to conventional neural network theories, single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes are universal approximators when all the parameters of the networks are allowed to be adjustable. However, as observed in most neural network implementations, tuning all the parameters of the networks may make learning complicated and inefficient, and it may be difficult to train networks with nondifferentiable activation functions such as threshold networks. Unlike conventional neural network theories, this paper proves, via an incremental constructive method, that in order for SLFNs to work as universal approximators, one may simply randomly choose hidden nodes and then need only adjust the output weights linking the hidden layer and the output layer. In such SLFN implementations, the activation functions for additive nodes can be any bounded nonconstant piecewise continuous functions g: R → R, and the activation functions for RBF nodes can be any integrable piecewise continuous functions g: R → R with ∫_R g(x)dx ≠ 0. The proposed incremental method is efficient not only for SLFNs with continuous (including nondifferentiable) activation functions but also for SLFNs with piecewise continuous (such as threshold) activation functions. Compared to other popular methods, such a network is fully automatic, and users need not intervene in the learning process by manually tuning control parameters.

2,413 citations
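
A practical upshot of this result is that training an ELM reduces to one linear least-squares solve, since the hidden layer is random and fixed. The following NumPy sketch of a basic batch ELM with sigmoid additive nodes is illustrative only; the function names, the pseudoinverse solve, and the toy sine-fitting data are our assumptions, not code from the paper.

```python
import numpy as np

def elm_train(X, T, n_hidden, seed=0):
    """Basic ELM: random hidden layer, least-squares output weights.
    X: (n_samples, n_features); T: (n_samples, n_outputs)."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], n_hidden))   # random input weights (never tuned)
    b = rng.standard_normal(n_hidden)                 # random biases (never tuned)
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))            # sigmoid hidden-layer outputs
    beta = np.linalg.pinv(H) @ T                      # output weights: one least-squares solve
    return W, b, beta

def elm_predict(X, W, b, beta):
    return (1.0 / (1.0 + np.exp(-(X @ W + b)))) @ beta

# Toy check: approximate y = sin(x) with 50 random hidden nodes.
X = np.linspace(-3, 3, 200).reshape(-1, 1)
T = np.sin(X)
W, b, beta = elm_train(X, T, n_hidden=50)
print(np.mean((elm_predict(X, W, b, beta) - T) ** 2))  # small training MSE
```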


Journal ArticleDOI
TL;DR: The results show that the OS-ELM is faster than the other sequential algorithms and produces better generalization performance on benchmark problems drawn from the regression, classification and time series prediction areas.
Abstract: In this paper, we develop an online sequential learning algorithm for single-hidden-layer feedforward networks (SLFNs) with additive or radial basis function (RBF) hidden nodes in a unified framework. The algorithm is referred to as online sequential extreme learning machine (OS-ELM) and can learn data one-by-one or chunk-by-chunk (a block of data) with fixed or varying chunk size. The activation functions for additive nodes in OS-ELM can be any bounded nonconstant piecewise continuous functions, and the activation functions for RBF nodes can be any integrable piecewise continuous functions. In OS-ELM, the parameters of hidden nodes (the input weights and biases of additive nodes or the centers and impact factors of RBF nodes) are randomly selected, and the output weights are analytically determined based on the sequentially arriving data. The algorithm builds on the ideas of the extreme learning machine (ELM) of Huang et al., developed for batch learning, which has been shown to be extremely fast and to achieve better generalization performance than other batch training methods. Apart from selecting the number of hidden nodes, no other control parameters have to be manually chosen. A detailed performance comparison of OS-ELM with other popular sequential learning algorithms is carried out on benchmark problems drawn from the regression, classification, and time series prediction areas. The results show that OS-ELM is faster than the other sequential algorithms and produces better generalization performance.

1,800 citations
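
The sequential phase of OS-ELM is recursive least squares on the random hidden-layer outputs. A compact NumPy sketch of the two phases is given below; the sigmoid nodes, chunk size, and toy sine data are illustrative assumptions, while the P and beta recursions follow the standard RLS form the abstract describes.

```python
import numpy as np

def hidden(X, W, b):
    return 1.0 / (1.0 + np.exp(-(X @ W + b)))   # random sigmoid additive nodes

rng = np.random.default_rng(0)
n_feat, n_hid = 1, 40
W, b = rng.standard_normal((n_feat, n_hid)), rng.standard_normal(n_hid)

# Initialization phase: a small batch (X0, T0) fixes P and beta.
X0 = rng.uniform(-3, 3, (60, n_feat)); T0 = np.sin(X0)
H0 = hidden(X0, W, b)
P = np.linalg.inv(H0.T @ H0 + 1e-6 * np.eye(n_hid))   # tiny ridge for numerical safety
beta = P @ H0.T @ T0

# Sequential phase: one RLS update per arriving chunk, no retraining from scratch.
for _ in range(200):
    Xk = rng.uniform(-3, 3, (5, n_feat)); Tk = np.sin(Xk)   # chunk of 5 new samples
    Hk = hidden(Xk, W, b)
    K = P @ Hk.T @ np.linalg.inv(np.eye(len(Xk)) + Hk @ P @ Hk.T)
    P -= K @ Hk @ P                                   # covariance update
    beta += P @ Hk.T @ (Tk - Hk @ beta)               # output-weight update
print(np.mean((hidden(X0, W, b) @ beta - T0) ** 2))   # error keeps shrinking
```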


Journal ArticleDOI
TL;DR: In this article, a mixed-mode analog/digital VLSI device comprising an array of leaky integrate-and-fire (I&F) neurons, adaptive synapses with spike-timing dependent plasticity, and an asynchronous event based communication infrastructure is presented.
Abstract: We present a mixed-mode analog/digital VLSI device comprising an array of leaky integrate-and-fire (I&F) neurons, adaptive synapses with spike-timing dependent plasticity, and an asynchronous event-based communication infrastructure that allows the user to (re)configure networks of spiking neurons with arbitrary topologies. The asynchronous communication protocol used by the silicon neurons to transmit spikes (events) off-chip and by the silicon synapses to receive spikes from the outside is based on the "address-event representation" (AER). We describe the analog circuits designed to implement the silicon neurons and synapses and present experimental data showing the neurons' response properties and the synapses' characteristics in response to AER input spike trains. Our results indicate that these circuits can be used in massively parallel VLSI networks of I&F neurons to simulate real-time complex spike-based learning algorithms.

876 citations


Journal ArticleDOI
TL;DR: This paper proposes some new feature extractors based on maximum margin criterion (MMC) and establishes a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA.
Abstract: In pattern recognition, feature extraction techniques are widely employed to reduce the dimensionality of data and to enhance the discriminatory information. Principal component analysis (PCA) and linear discriminant analysis (LDA) are the two most popular linear dimensionality reduction methods. However, PCA is not very effective for the extraction of the most discriminant features, and LDA is not stable due to the small sample size problem. In this paper, we propose some new (linear and nonlinear) feature extractors based on maximum margin criterion (MMC). Geometrically, feature extractors based on MMC maximize the (average) margin between classes after dimensionality reduction. It is shown that MMC can represent class separability better than PCA. As a connection to LDA, we may also derive LDA from MMC by incorporating some constraints. By using some other constraints, we establish a new linear feature extractor that does not suffer from the small sample size problem, which is known to cause serious stability problems for LDA. The kernelized (nonlinear) counterpart of this linear feature extractor is also established in the paper. Our extensive experiments demonstrate that the new feature extractors are effective, stable, and efficient.

838 citations
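
The linear MMC extractor admits a very direct implementation: with the usual between-class and within-class scatter matrices S_b and S_w, the projection directions are the top eigenvectors of S_b - S_w, so no inversion of S_w is needed and the small sample size problem does not arise. A sketch under that reading (variable names are ours):

```python
import numpy as np
from scipy.linalg import eigh

def mmc_features(X, y, n_dim):
    """Project onto the top eigenvectors of S_b - S_w (linear MMC)."""
    d = X.shape[1]
    mean = X.mean(axis=0)
    Sb, Sw = np.zeros((d, d)), np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        diff = (Xc.mean(axis=0) - mean)[:, None]
        Sb += (len(Xc) / len(X)) * (diff @ diff.T)                      # between-class scatter
        Sw += (len(Xc) / len(X)) * np.cov(Xc, rowvar=False, bias=True)  # within-class scatter
    vals, vecs = eigh(Sb - Sw)      # symmetric, so eigh; no inverse of Sw anywhere
    return vecs[:, np.argsort(vals)[::-1][:n_dim]]

# Usage: Z = X @ mmc_features(X, y, n_dim=2) gives the reduced representation.
```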


Journal ArticleDOI
TL;DR: It is shown that the addressed stochastic Cohen-Grossberg neural networks with mixed delays are globally asymptotically stable in the mean square if two LMIs are feasible, where the feasibility of LMIs can be readily checked by the Matlab LMI toolbox.
Abstract: In this letter, the global asymptotic stability analysis problem is considered for a class of stochastic Cohen-Grossberg neural networks with mixed time delays, which consist of both discrete and distributed time delays. Based on a Lyapunov-Krasovskii functional and the stochastic stability analysis theory, a linear matrix inequality (LMI) approach is developed to derive several sufficient conditions guaranteeing the global asymptotic convergence of the equilibrium point in the mean square. It is shown that the addressed stochastic Cohen-Grossberg neural networks with mixed delays are globally asymptotically stable in the mean square if two LMIs are feasible, where the feasibility of the LMIs can be readily checked by the Matlab LMI toolbox. It is also pointed out that the main results comprise some existing results as special cases. A numerical example is given to demonstrate the usefulness of the proposed global stability criteria.

433 citations
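
The stability test here amounts to a numerical LMI feasibility check (done in the paper with the Matlab LMI toolbox). The paper's actual LMIs carry delay-dependent terms; the snippet below only illustrates the mechanics of such a check in Python with cvxpy, on the generic Lyapunov LMI pair P > 0, A'P + PA < 0 for a hypothetical matrix A.

```python
import numpy as np
import cvxpy as cp

A = np.array([[-2.0, 0.5], [0.3, -1.5]])   # hypothetical system matrix
n = A.shape[0]
eps = 1e-6

P = cp.Variable((n, n), symmetric=True)
constraints = [P >> eps * np.eye(n),                  # P positive definite
               A.T @ P + P @ A << -eps * np.eye(n)]   # Lyapunov LMI
prob = cp.Problem(cp.Minimize(0), constraints)        # pure feasibility problem
prob.solve()
print("LMIs feasible:", prob.status == cp.OPTIMAL)    # feasible -> stability certificate
```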


Journal ArticleDOI
TL;DR: The presented deterministic learning mechanism and the neural learning control scheme provide elementary components toward the development of a biologically plausible learning and control methodology.
Abstract: One of the amazing successes of biological systems is their ability to "learn by doing" and so adapt to their environment. In this paper, first, a deterministic learning mechanism is presented, by which an appropriately designed adaptive neural controller is capable of learning closed-loop system dynamics during tracking control of a periodic reference orbit. Among various neural network (NN) architectures, the localized radial basis function (RBF) network is employed. A property of persistence of excitation (PE) for RBF networks is established, and a partial PE condition of closed-loop signals, i.e., the PE condition of a regression subvector constructed out of the RBFs along a periodic state trajectory, is proven to be satisfied. Accurate NN approximation of the closed-loop system dynamics is achieved in a local region along the periodic state trajectory, and a learning ability is implemented during a closed-loop feedback control process. Second, based on the deterministic learning mechanism, a neural learning control scheme is proposed which can effectively recall and reuse the learned knowledge to achieve closed-loop stability and improved control performance. The significance of this paper is that the presented deterministic learning mechanism and the neural learning control scheme provide elementary components toward the development of a biologically plausible learning and control methodology. Simulation studies are included to demonstrate the effectiveness of the approach.

366 citations


Journal ArticleDOI
TL;DR: The notion of "reduced convex hull" is employed and supported by a set of new theoretical results that allow existing geometric algorithms to be directly and practically applied to solve not only separable, but also nonseparable classification problems both accurately and efficiently.
Abstract: The geometric framework for the support vector machine (SVM) classification problem provides an intuitive ground for the understanding and the application of geometric optimization algorithms, leading to practical solutions of real world classification problems. In this work, the notion of "reduced convex hull" is employed and supported by a set of new theoretical results. These results allow existing geometric algorithms to be directly and practically applied to solve not only separable, but also nonseparable classification problems both accurately and efficiently. As a practical application of the new theoretical results, a known geometric algorithm has been employed and transformed accordingly to solve nonseparable problems successfully.

342 citations


Journal ArticleDOI
TL;DR: Two supervised methods for enhancing the classification accuracy of the Nonnegative Matrix Factorization (NMF) algorithm are presented and greatly enhance the performance of NMF for frontal face verification.
Abstract: In this paper, two supervised methods for enhancing the classification accuracy of the Nonnegative Matrix Factorization (NMF) algorithm are presented. The idea is to extend the NMF algorithm in order to extract features that enforce not only the spatial locality, but also the separability between classes in a discriminant manner. The first method employs discriminant analysis on the features derived from NMF. In this way, a two-phase discriminant feature extraction procedure is implemented, namely NMF plus Linear Discriminant Analysis (LDA). The second method incorporates the discriminant constraints inside the NMF decomposition. Thus, a decomposition of a face into its discriminant parts is obtained, and new update rules for both the weights and the basis images are derived. The introduced methods have been applied to the problem of frontal face verification using the well-known XM2VTS database. Both methods greatly enhance the performance of NMF for frontal face verification.

330 citations
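
The first of the two methods (NMF followed by LDA) is straightforward to sketch. Below, the NMF stage uses the classic Lee-Seung multiplicative updates, which is an assumption on our part; the paper's second method, which pushes the discriminant constraints inside the decomposition and changes the update rules, is not shown.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

def nmf(V, r, n_iter=200, seed=0):
    """Lee-Seung multiplicative updates for V ~= W @ H with V, W, H >= 0."""
    rng = np.random.default_rng(seed)
    W, H = rng.random((V.shape[0], r)), rng.random((r, V.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # updates keep entries nonnegative
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

# V: columns are nonnegative vectorized face images; y: identity labels.
rng = np.random.default_rng(1)
V = rng.random((64, 120))
y = np.repeat(np.arange(6), 20)
W, H = nmf(V, r=10)
clf = LinearDiscriminantAnalysis().fit(H.T, y)   # discriminant stage on NMF coefficients
```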


Journal ArticleDOI
TL;DR: The main results include a simple asymptotic convergence proof, a general explanation of the shrinking and caching techniques, and the linear convergence of the methods.
Abstract: Decomposition methods are currently one of the major methods for training support vector machines. They vary mainly according to different working set selections. Existing implementations and analysis usually consider some specific selection rules. This paper studies sequential minimal optimization type decomposition methods under a general and flexible way of choosing the two-element working set. The main results include: 1) a simple asymptotic convergence proof, 2) a general explanation of the shrinking and caching techniques, and 3) the linear convergence of the methods. Extensions to some support vector machine variants are also discussed.

302 citations


Journal ArticleDOI
TL;DR: The proposed Lyapunov-Krasovskii functional and linear matrix inequality (LMI) result is computationally efficient, as it can be solved numerically using standard commercial software.
Abstract: By employing the Lyapunov-Krasovskii functional and linear matrix inequality (LMI) approach, the problem of global asymptotical stability is studied for recurrent neural networks with both discrete time-varying delays and distributed time-varying delays. Some sufficient conditions are given for checking the global asymptotical stability of recurrent neural networks with mixed time-varying delays. The proposed LMI result is computationally efficient, as it can be solved numerically using standard commercial software. Two examples are given to show the usefulness of the results.

296 citations


Journal ArticleDOI
TL;DR: A hybrid Taguchi-genetic algorithm (HTGA) is applied to solve the problem of tuning both network structure and parameters of a feedforward neural network and can obtain better results than the existing method reported recently in the literature.
Abstract: In this paper, a hybrid Taguchi-genetic algorithm (HTGA) is applied to solve the problem of tuning both the network structure and the parameters of a feedforward neural network. The HTGA approach combines the traditional genetic algorithm (TGA), which has a powerful global exploration capability, with the Taguchi method, which can exploit the optimum offspring. The Taguchi method is inserted between the crossover and mutation operations of the TGA. The systematic reasoning ability of the Taguchi method is thereby incorporated into the crossover operations to select the better genes for crossover, and consequently enhance the genetic algorithm. Therefore, the HTGA approach is more robust, statistically sound, and faster converging. First, the authors evaluate the performance of the presented HTGA approach by studying some global numerical optimization problems. Then, the presented HTGA approach is effectively applied to solve three examples on forecasting the sunspot numbers, tuning the associative memory, and solving the XOR problem. The numbers of hidden nodes and links of the feedforward neural network are chosen by increasing them from small numbers until the learning performance is good enough. As a result, a partially connected feedforward neural network can be obtained after tuning. This implies that the cost of implementing the neural network can be reduced. In these problems of tuning both the network structure and the parameters of a feedforward neural network, there are many parameters and numerous local optima, so these problems are challenging enough for evaluating the performance of any proposed GA-based approach. The computational experiments show that the presented HTGA approach can obtain better results than the existing method reported recently in the literature.

Journal ArticleDOI
TL;DR: An improved version of the FastICA algorithm is proposed which is asymptotically efficient, i.e., its accuracy given by the residual error variance attains the Cramer-Rao lower bound (CRB).
Abstract: FastICA is one of the most popular algorithms for independent component analysis (ICA), demixing a set of statistically independent sources that have been mixed linearly. A key question is how accurate the method is for finite data samples. We propose an improved version of the FastICA algorithm which is asymptotically efficient, i.e., its accuracy given by the residual error variance attains the Cramer-Rao lower bound (CRB). The error is thus as small as possible. This result is rigorously proven under the assumption that the probability distribution of the independent signal components belongs to the class of generalized Gaussian (GG) distributions with parameter alpha, denoted GG(alpha), for alpha > 2. We name the algorithm efficient FastICA (EFICA). The computational complexity of a Matlab implementation of the algorithm is shown to be only slightly (about three times) higher than that of the standard symmetric FastICA. Simulations corroborate these claims and show superior performance of the algorithm compared with the JADE algorithm of Cardoso and Souloumiac and the nonparametric ICA of Boscolo on separating sources with distribution GG(alpha) with arbitrary alpha, as well as good performance in separating linearly mixed speech signals.
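
EFICA's refinements (component-wise nonlinearities adapted to GG(alpha), reaching the CRB) do not fit a short sketch, but the one-unit FastICA fixed-point iteration it builds on does. The tanh nonlinearity and the assumption of pre-whitened data below are our choices for illustration.

```python
import numpy as np

def fastica_one_unit(X, n_iter=100, tol=1e-8, seed=0):
    """One-unit FastICA on whitened data X of shape (dim, samples)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(X.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        wx = w @ X
        g, g_prime = np.tanh(wx), 1.0 - np.tanh(wx) ** 2
        w_new = (X * g).mean(axis=1) - g_prime.mean() * w   # fixed-point step
        w_new /= np.linalg.norm(w_new)                      # one-unit normalization
        if abs(abs(w_new @ w) - 1.0) < tol:                 # converged up to sign
            return w_new
        w = w_new
    return w
```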

Journal ArticleDOI
TL;DR: This correspondence addresses the facial expression recognition problem using kernel canonical correlation analysis (KCCA) and proposes an improved KCCA algorithm to tackle the singularity problem of the Gram matrix.
Abstract: In this correspondence, we address the facial expression recognition problem using kernel canonical correlation analysis (KCCA). Following the method proposed by Lyons et al. and Zhang et al., we manually locate 34 landmark points from each facial image and then convert these geometric points into a labeled graph (LG) vector using the Gabor wavelet transformation method to represent the facial features. On the other hand, for each training facial image, the semantic ratings describing the basic expressions are combined into a six-dimensional semantic expression vector. Learning the correlation between the LG vector and the semantic expression vector is performed by KCCA. According to this correlation, we estimate the associated semantic expression vector of a given test image and then perform the expression classification according to this estimated semantic expression vector. Moreover, we also propose an improved KCCA algorithm to tackle the singularity problem of the Gram matrix. The experimental results on the Japanese female facial expression database and Ekman's "Pictures of Facial Affect" database illustrate the effectiveness of the proposed method.
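
The singularity issue arises because KCCA constrains directions through squared Gram matrices, which are rank deficient on centered data. One standard remedy, possibly different from the paper's improved algorithm, is ridge regularization of the Gram matrices before solving the generalized eigenproblem, as sketched here.

```python
import numpy as np
from scipy.linalg import eigh

def kcca(Kx, Ky, kappa=1e-3, n_comp=2):
    """Regularized KCCA on centered Gram matrices Kx, Ky (n x n)."""
    n = Kx.shape[0]
    Z, I = np.zeros((n, n)), np.eye(n)
    # Generalized eigenproblem A v = rho B v with v = [alpha; beta].
    A = np.block([[Z, Kx @ Ky], [Ky @ Kx, Z]])
    Rx, Ry = Kx + kappa * I, Ky + kappa * I      # ridge keeps B positive definite
    B = np.block([[Rx @ Rx, Z], [Z, Ry @ Ry]])
    vals, vecs = eigh(A, B)
    order = np.argsort(vals)[::-1][:n_comp]      # largest canonical correlations
    return vecs[:n, order], vecs[n:, order]      # dual coefficients alpha, beta
```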

Journal ArticleDOI
TL;DR: A simple learning algorithm capable of real-time learning, which can automatically select appropriate values of the neural quantizers and analytically determine the parameters (weights and bias) of the network in a single pass, is proposed.
Abstract: In some practical applications of neural networks, fast response to external events within an extremely short time is highly demanded and expected. However, the extensively used gradient-descent-based learning algorithms obviously cannot satisfy the real-time learning needs of many applications, especially for large-scale applications and/or when higher generalization performance is required. Based on Huang's constructive network model, this paper proposes a simple learning algorithm capable of real-time learning, which can automatically select appropriate values of the neural quantizers and analytically determine the parameters (weights and bias) of the network in a single pass. The performance of the proposed algorithm has been systematically investigated on a large batch of benchmark real-world regression and classification problems. The experimental results demonstrate that our algorithm can not only produce good generalization performance but also have real-time learning and prediction capability. Thus, it may provide an alternative approach for the practical applications of neural networks where real-time learning and prediction implementation is required.

Journal ArticleDOI
TL;DR: BTS and its enhanced version, c-BTS, decrease the number of binary classifiers to the greatest extent without increasing the complexity of the original problem, achieving high classification efficiency for multiclass problems.
Abstract: We present a new architecture named Binary Tree of Support Vector Machines (SVM), or BTS, in order to achieve high classification efficiency for multiclass problems. BTS and its enhanced version, c-BTS, decrease the number of binary classifiers to the greatest extent without increasing the complexity of the original problem. In the training phase, BTS has N-1 binary classifiers in the best situation (N is the number of classes), while it requires log_{4/3}((N+3)/4) binary tests on average when making a decision. At the same time, the upper bound of the convergence complexity is determined. The experiments in this paper indicate that, while maintaining comparable accuracy, BTS is much faster to train than other methods. In classification especially, due to its logarithmic complexity, it is much faster than directed acyclic graph SVM (DAGSVM) and ECOC on problems with a large number of classes.
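
The tree structure is what buys the logarithmic number of binary tests at decision time: each internal node holds one binary SVM that routes a sample into one of two class subsets. In the scikit-learn sketch below, the class list is simply halved at each node; BTS chooses its splits more carefully, so this is only a structural illustration.

```python
import numpy as np
from sklearn.svm import SVC

class Node:
    def __init__(self, classes):
        self.classes, self.clf, self.left, self.right = list(classes), None, None, None

def build(X, y, classes):
    """One binary SVM per internal node; leaves hold single classes."""
    node = Node(classes)
    if len(classes) > 1:
        left, right = classes[: len(classes) // 2], classes[len(classes) // 2:]
        mask = np.isin(y, classes)          # train only on this node's classes
        node.clf = SVC(kernel="rbf").fit(X[mask], np.isin(y[mask], left).astype(int))
        node.left, node.right = build(X, y, left), build(X, y, right)
    return node

def predict_one(root, x):
    node = root
    while len(node.classes) > 1:            # O(log N) binary tests per decision
        node = node.left if node.clf.predict(x[None])[0] == 1 else node.right
    return node.classes[0]

# root = build(X_train, y_train, sorted(set(y_train))); y_hat = predict_one(root, x)
```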

Journal ArticleDOI
TL;DR: A stable neural network (NN)-based observer for general multivariable nonlinear systems is presented in this paper and the stability of the recurrent neural network observer is shown by Lyapunov's direct method.
Abstract: A stable neural network (NN)-based observer for general multivariable nonlinear systems is presented in this paper. Unlike most previous neural network observers, the proposed observer uses a nonlinear-in-parameters neural network (NLPNN). Therefore, it can be applied to systems with higher degrees of nonlinearity without any a priori knowledge about system dynamics. The learning rule for the neural network is a novel approach based on the modified backpropagation (BP) algorithm. An e-modification term is added to guarantee robustness of the observer. No strictly positive real (SPR) or any other strong assumption is imposed on the proposed approach. The stability of the recurrent neural network observer is shown by Lyapunov's direct method. Simulation results for a flexible-joint manipulator are presented to demonstrate the enhanced performance achieved by utilizing the proposed neural network observer.

Journal ArticleDOI
TL;DR: In this paper, a neural net-based actuator saturation compensation scheme for nonlinear systems in Brunovsky canonical form is presented, rigorously proved, and verified on a general "pendulum-type" system and a robot manipulator dynamical system.
Abstract: A neural net (NN)-based actuator saturation compensation scheme for nonlinear systems in Brunovsky canonical form is presented. The scheme, which leads to stability, command following, and disturbance rejection, is rigorously proved and verified on a general "pendulum-type" system and a robot manipulator dynamical system. The online weight tuning law, the overall closed-loop system performance, and the boundedness of the NN weights are derived and guaranteed based on the Lyapunov approach. The actuator saturation is assumed to be unknown, and the saturation compensator is inserted into a feedforward path. Simulation results indicate that the proposed scheme can effectively compensate for the saturation nonlinearity in the presence of system uncertainty.

Journal ArticleDOI
TL;DR: Simulation results verify that the proposed WABC can achieve favorable tracking performance by incorporating WNN identification, adaptive backstepping control, and L2 robust control techniques.
Abstract: This paper proposes a wavelet adaptive backstepping control (WABC) system for a class of second-order nonlinear systems. The WABC comprises a neural backstepping controller and a robust controller. The neural backstepping controller containing a wavelet neural network (WNN) identifier is the principal controller, and the robust controller is designed to achieve L2 tracking performance with a desired attenuation level. Since the WNN uses wavelet functions, its learning capability is superior to that of the conventional neural network for system identification. Moreover, the adaptation laws of the control system are derived in the sense of the Lyapunov function and Barbalat's lemma, thus the system can be guaranteed to be asymptotically stable. The proposed WABC is applied to two nonlinear systems, a chaotic system and a wing-rock motion system, to illustrate its effectiveness. Simulation results verify that the proposed WABC can achieve favorable tracking performance by incorporating WNN identification, adaptive backstepping control, and L2 robust control techniques.

Journal ArticleDOI
TL;DR: The simplified dual neural network is shown to be globally convergent to the exact optimal solution of k-winners-take-all (KWTA) operation.
Abstract: The design, analysis, and application of a new recurrent neural network for quadratic programming, called the simplified dual neural network, are discussed. The analysis mainly concentrates on the convergence property and the computational complexity of the neural network. The simplified dual neural network is shown to be globally convergent to the exact optimal solution. The complexity of the neural network architecture is reduced, with the number of neurons equal to the number of inequality constraints. Its application to the k-winners-take-all (KWTA) operation is discussed to demonstrate how to solve problems with this neural network.

Journal ArticleDOI
TL;DR: The proposed method is based on the free-weighting matrix approach and is applicable to the case in which the derivative of the time-varying delay takes any value, and an algorithm is presented to compute the state estimator.
Abstract: In this letter, the delay-dependent state estimation problem for neural networks with time-varying delay is investigated. A delay-dependent criterion is established to estimate the neuron states through available output measurements such that the dynamics of the estimation error is globally exponentially stable. The proposed method is based on the free-weighting matrix approach and is applicable to the case in which the derivative of the time-varying delay takes any value. An algorithm is presented to compute the state estimator. Finally, a numerical example is given to demonstrate the effectiveness of this approach and the improvement over existing ones.

Journal ArticleDOI
TL;DR: It is shown that the projection neural network can also be used to solve pseudomonotone variational inequalities and related pseudoconvex optimization problems, and a new concept, called componentwise pseudomonotonicity, different from pseudomonotonicity in general, is introduced.
Abstract: In recent years, a recurrent neural network called the projection neural network was proposed for solving monotone variational inequalities and related convex optimization problems. In this paper, we show that the projection neural network can also be used to solve pseudomonotone variational inequalities and related pseudoconvex optimization problems. Under various pseudomonotonicity conditions and other conditions, the projection neural network is proved to be stable in the sense of Lyapunov and globally convergent, globally asymptotically stable, and globally exponentially stable. Since monotonicity is a special case of pseudomonotonicity, the projection neural network can be applied to solve a broader class of constrained optimization problems related to variational inequalities. Moreover, a new concept, called componentwise pseudomonotonicity, different from pseudomonotonicity in general, is introduced. Under this new concept, two stability results of the projection neural network for solving variational inequalities are also obtained. Finally, numerical examples show the effectiveness and performance of the projection neural network.
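
The projection neural network in question is the dynamical system dx/dt = lambda (P_Omega(x - alpha F(x)) - x), whose equilibria solve the variational inequality VI(F, Omega). A forward-Euler sketch for a box constraint set follows; the quadratic example, step sizes, and iteration count are illustrative choices of ours.

```python
import numpy as np

def projection_network(F, lo, hi, x0, alpha=0.1, lam=1.0, dt=1e-2, steps=20000):
    """Integrate dx/dt = lam * (P_box(x - alpha*F(x)) - x) by forward Euler."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = x + dt * lam * (np.clip(x - alpha * F(x), lo, hi) - x)
    return x

# Example: F is the gradient map of a convex quadratic, so the VI solution
# is the box-constrained minimizer of 0.5*x'Qx - b'x over [-1, 1]^2.
Q = np.array([[2.0, 0.5], [0.5, 1.0]])
b = np.array([1.0, -1.0])
x_star = projection_network(lambda x: Q @ x - b, lo=-1.0, hi=1.0, x0=np.zeros(2))
print(x_star)   # approx. (0.75, -1.0): the constrained optimum
```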

Journal ArticleDOI
TL;DR: The proposed approach solves some problems inherent to objective metrics that should predict subjective quality scores obtained using the single stimulus continuous quality evaluation (SSCQE) method, and relies on the use of a convolutional neural network that allows continuous-time scoring of the video.
Abstract: This paper describes an application of neural networks in the field of objective measurement methods designed to automatically assess the perceived quality of digital videos. This challenging issue aims to emulate human judgment and to replace very complex and time-consuming subjective quality assessment. Several metrics have been proposed in the literature to tackle this issue. They are based on a general framework that combines different stages, each of them addressing complex problems. The ambition of this paper is not to present a globally perfect quality metric but rather to focus on an original way to use neural networks in such a framework, in the context of a reduced reference (RR) quality metric. In particular, we point out the value of such a tool for combining features and pooling them in order to compute quality scores. The proposed approach solves some problems inherent to objective metrics that must predict subjective quality scores obtained using the single stimulus continuous quality evaluation (SSCQE) method. The latter has been adopted by the Video Quality Experts Group (VQEG) in its recently finalized reduced reference and no reference (RRNR-TV) test plan. The originality of this approach, compared with previous attempts to use neural networks for quality assessment, lies in the use of a convolutional neural network (CNN) that allows continuous-time scoring of the video. Objective features are extracted on a frame-by-frame basis on both the reference and the distorted sequences; they are derived from a perceptual-based representation and integrated along the temporal axis using a time-delay neural network (TDNN). Experiments conducted on different MPEG-2 videos, with bit rates ranging from 2 to 6 Mb/s, show the effectiveness of the proposed approach in obtaining a plausible model of temporal pooling from the human vision system (HVS) point of view. More specifically, a linear correlation, between objective and subjective scoring, of up to 0.92 has been obtained on a set of typical TV videos.

Journal ArticleDOI
TL;DR: A novel weakness analysis theory is developed that attempts to boost a strong learner by increasing the diversity between the classifiers created by the learner, at the expense of decreasing their margins, so as to achieve a tradeoff suggested by recent boosting studies for a low generalization error.
Abstract: In this paper, we propose a novel ensemble-based approach to boost performance of traditional Linear Discriminant Analysis (LDA)-based methods used in face recognition. The ensemble-based approach is based on the recently emerged technique known as "boosting". However, it is generally believed that boosting-like learning rules are not suited to a strong and stable learner such as LDA. To break this limitation, a novel weakness analysis theory is developed here. The theory attempts to boost a strong learner by increasing the diversity between the classifiers created by the learner, at the expense of decreasing their margins, so as to achieve a tradeoff suggested by recent boosting studies for a low generalization error. In addition, a novel distribution accounting for the pairwise class discriminant information is introduced for effective interaction between the booster and the LDA-based learner. The integration of all these methodologies proposed here leads to the novel ensemble-based discriminant learning approach, capable of taking advantage of both the boosting and LDA techniques. Promising experimental results obtained on various difficult face recognition scenarios demonstrate the effectiveness of the proposed approach. We believe that this work is especially beneficial in extending the boosting framework to accommodate general (strong/weak) learners.

Journal ArticleDOI
TL;DR: This paper shows that the good convergence properties of the one-unit case are also shared by the full algorithm with symmetrical normalization and the global behavior is illustrated numerically for two sources and two mixtures in several typical cases.
Abstract: The fast independent component analysis (FastICA) algorithm is one of the most popular methods to solve problems in ICA and blind source separation. It has been shown experimentally that it outperforms most of the commonly used ICA algorithms in convergence speed. A rigorous local convergence analysis has been presented only for the so-called one-unit case, in which just one of the rows of the separating matrix is considered. However, in the FastICA algorithm, there is also an explicit normalization step, and it may be questioned whether the extra rotation caused by the normalization will affect the convergence speed. The purpose of this paper is to show that this is not the case and that the good convergence properties of the one-unit case are also shared by the full algorithm with symmetrical normalization. A local convergence analysis is given for the general case, and the global behavior is illustrated numerically for two sources and two mixtures in several typical cases.
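
The normalization step whose effect the paper analyzes is the symmetric decorrelation W <- (W W^T)^(-1/2) W, applied after all rows are updated in parallel. A compact sketch with a tanh nonlinearity, assuming whitened inputs (the nonlinearity and data handling are our choices):

```python
import numpy as np

def sym_decorrelate(W):
    """Symmetric normalization: W <- (W W^T)^(-1/2) W."""
    vals, vecs = np.linalg.eigh(W @ W.T)
    return vecs @ np.diag(vals ** -0.5) @ vecs.T @ W

def fastica_symmetric(X, n_iter=200, seed=0):
    """Symmetric FastICA on whitened X (dim x samples): parallel row updates,
    then one joint re-orthogonalization per sweep."""
    d, N = X.shape
    W = sym_decorrelate(np.random.default_rng(seed).standard_normal((d, d)))
    for _ in range(n_iter):
        G = np.tanh(W @ X)
        W_new = (G @ X.T) / N - np.diag((1.0 - G ** 2).mean(axis=1)) @ W
        W = sym_decorrelate(W_new)   # the step analyzed in the paper
    return W
```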

Journal ArticleDOI
TL;DR: The parallel SMO is developed using message passing interface (MPI) and shows great speedup on the adult data set and the Modified National Institute of Standards and Technology (MNIST) data set when many processors are used.
Abstract: Sequential minimal optimization (SMO) is one popular algorithm for training support vector machines (SVMs), but it still requires a large amount of computation time for solving large-size problems. This paper proposes a parallel implementation of SMO for training SVMs. The parallel SMO is developed using the message passing interface (MPI). Specifically, the parallel SMO first partitions the entire training data set into smaller subsets and then simultaneously runs multiple CPU processors to deal with each of the partitioned data sets. Experiments show that there is great speedup on the adult data set and the Modified National Institute of Standards and Technology (MNIST) data set when many processors are used. There are also satisfactory results on the Web data set.

Journal ArticleDOI
TL;DR: The center-constrained MEB problem is introduced and the generalized CVM algorithm is extended, which can now be used with any linear/nonlinear kernel and can also be applied to kernel methods such as SVR and the ranking SVM.
Abstract: Kernel methods, such as the support vector machine (SVM), are often formulated as quadratic programming (QP) problems. However, given m training patterns, a naive implementation of the QP solver takes O(m^3) training time and at least O(m^2) space. Hence, scaling up these QPs is a major stumbling block in applying kernel methods on very large data sets, and a replacement of the naive method for finding the QP solutions is highly desirable. Recently, by using approximation algorithms for the minimum enclosing ball (MEB) problem, we proposed the core vector machine (CVM) algorithm that is much faster and can handle much larger data sets than existing SVM implementations. However, the CVM can only be used with certain kernel functions and kernel methods. For example, the very popular support vector regression (SVR) cannot be used with the CVM. In this paper, we introduce the center-constrained MEB problem and subsequently extend the CVM algorithm. The generalized CVM algorithm can now be used with any linear/nonlinear kernel and can also be applied to kernel methods such as SVR and the ranking SVM. Moreover, like the original CVM, its asymptotic time complexity is again linear in m and its space complexity is independent of m. Experiments show that the generalized CVM has comparable performance with state-of-the-art SVM and SVR implementations, but is faster and produces fewer support vectors on very large data sets.
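
The MEB approximation underlying the CVM can be seen in miniature in the simple core-set update of Badoiu and Clarkson: repeatedly step the center toward the current farthest point, with shrinking steps. The CVM runs this kind of update in the kernel-induced feature space; the plain-Euclidean sketch below is only an illustration of the geometry.

```python
import numpy as np

def meb_approx(X, eps=0.05):
    """(1 + eps)-approximate minimum enclosing ball of the rows of X."""
    c = X[0].astype(float).copy()
    for i in range(1, int(np.ceil(1.0 / eps ** 2)) + 1):
        far = X[np.argmax(np.linalg.norm(X - c, axis=1))]   # farthest point = core-set addition
        c += (far - c) / (i + 1)                            # shrinking step toward it
    return c, np.linalg.norm(X - c, axis=1).max()           # center, radius

rng = np.random.default_rng(0)
center, radius = meb_approx(rng.standard_normal((1000, 5)))
print(center, radius)
```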

Journal ArticleDOI
TL;DR: Investigations made in this paper help to better understand the learning procedure of feedforward neural networks in terms of adaptive learning rate, convergence speed, and local minima.
Abstract: This paper investigates new learning algorithms (LF I and LF II) based on Lyapunov functions for the training of feedforward neural networks. It is observed that such algorithms have an interesting parallel with the popular backpropagation (BP) algorithm, where the fixed learning rate is replaced by an adaptive learning rate computed using a convergence theorem based on Lyapunov stability theory. LF II, a modified version of LF I, has been introduced with the aim of avoiding local minima. This modification also helps in improving the convergence speed in some cases. Conditions for achieving a global minimum with this kind of algorithm have been studied in detail. The performances of the proposed algorithms are compared with the BP algorithm and extended Kalman filtering (EKF) on three benchmark function approximation problems: XOR, 3-bit parity, and 8-3 encoder. The comparisons are made in terms of the number of learning iterations and the computational time required for convergence. It is found that the proposed algorithms (LF I and II) converge much faster than the other two algorithms to attain the same accuracy. Finally, the comparison is made on a complex two-dimensional (2-D) Gabor function, and the effect of the adaptive learning rate on faster convergence is verified. In a nutshell, the investigations made in this paper help us better understand the learning procedure of feedforward neural networks in terms of adaptive learning rate, convergence speed, and local minima.
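
The paper's LF I/II rules are stated for multilayer networks and are not reproduced here; as a hedged illustration of how a Lyapunov argument yields an adaptive learning rate, consider the linear special case. With V_t = e_t^2 / 2 and the update w <- w + eta_t * e_t * x_t, choosing eta_t = mu / ||x_t||^2 with 0 < mu < 2 makes the post-update error on that sample equal to (1 - mu) * e_t, so V strictly decreases. This is the normalized-LMS rule, not the paper's algorithm.

```python
import numpy as np

# Lyapunov-derived adaptive rate in the linear (normalized-LMS) special case.
rng = np.random.default_rng(0)
w_true = np.array([1.5, -0.7, 0.3])   # unknown target weights
w = np.zeros(3)
mu = 1.0                              # 0 < mu < 2 guarantees descent of V = e^2/2
for _ in range(500):
    x = rng.standard_normal(3)
    e = (w_true - w) @ x              # instantaneous output error
    eta = mu / (x @ x + 1e-12)        # adaptive, not fixed, learning rate
    w += eta * e * x                  # post-update error on x is (1 - mu) * e
print(np.linalg.norm(w - w_true))     # -> approximately 0
```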

Journal ArticleDOI
TL;DR: Using new theoretical results, the global exponential stability of recurrent neural networks can be derived, and the estimated location of the equilibrium point can be obtained.
Abstract: This paper presents new theoretical results on the global exponential stability of recurrent neural networks with bounded activation functions and time-varying delays. The stability conditions depend on external inputs, connection weights, and time delays of recurrent neural networks. Using these results, the global exponential stability of recurrent neural networks can be derived, and the estimated location of the equilibrium point can be obtained. As typical representatives, the Hopfield neural network (HNN) and the cellular neural network (CNN) are examined in detail.

Journal ArticleDOI
TL;DR: A class of unknown perturbed nonlinear systems is theoretically stabilized by using adaptive neural network control and the idea of backstepping; semiglobal uniform ultimate boundedness of all the signals in the closed loop is proved at the equilibrium point.
Abstract: In this paper, a class of unknown perturbed nonlinear systems is theoretically stabilized by using adaptive neural network control. The systems, with disturbances and nonaffine unknown functions, have a lower triangular structure, which generalizes both strict-feedback uncertain systems and pure-feedback ones. No effective methods previously existed for stabilizing this kind of system. With some new conclusions for Nussbaum-gain functions (NGF) and the idea of backstepping, semiglobal uniform ultimate boundedness of all the signals in the closed loop is proved at the equilibrium point. The two problems of unknown control directions and control singularity are well dealt with. The effectiveness of the proposed scheme is shown by simulation on a proper nonlinear system.

Journal ArticleDOI
TL;DR: The approach combines the advantages of simulated annealing, tabu search and the backpropagation training algorithm in order to generate an automatic process for producing networks with high classification performance and low complexity.
Abstract: This paper introduces a methodology for neural network global optimization. The aim is the simultaneous optimization of multilayer perceptron (MLP) network weights and architectures, in order to generate topologies with few connections and high classification performance for any data set. The approach combines the advantages of simulated annealing, tabu search, and the backpropagation training algorithm in order to generate an automatic process for producing networks with high classification performance and low complexity. Experimental results obtained on four classification problems and one prediction problem are shown to be better than those obtained by the most commonly used optimization techniques.