
Showing papers in "IEEE Transactions on Neural Networks in 2009"


Journal ArticleDOI
TL;DR: A new neural network model, called the graph neural network (GNN) model, extends existing neural network methods to process data represented in graph domains and implements a function τ(G,n) ∈ R^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space.
Abstract: Many underlying relationships among data in several areas of science and engineering, e.g., computer vision, molecular chemistry, molecular biology, pattern recognition, and data mining, can be represented in terms of graphs. In this paper, we propose a new neural network model, called graph neural network (GNN) model, that extends existing neural network methods for processing the data represented in graph domains. This GNN model, which can directly process most of the practically useful types of graphs, e.g., acyclic, cyclic, directed, and undirected, implements a function τ(G,n) ∈ R^m that maps a graph G and one of its nodes n into an m-dimensional Euclidean space. A supervised learning algorithm is derived to estimate the parameters of the proposed GNN model. The computational cost of the proposed algorithm is also considered. Some experimental results are shown to validate the proposed learning algorithm, and to demonstrate its generalization capabilities.
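
For readers who want the mechanics: the GNN computes node states by iterating a transition function to a fixed point, then applies an output function to read out τ(G,n). The numpy sketch below is a minimal illustration of that loop, not the paper's parameterization; the tanh transition, the sum aggregation over the adjacency matrix, and all weight shapes are assumptions, and the transition weights must be kept small enough that the update is a contraction (a condition the paper handles during learning).

```python
import numpy as np

def gnn_fixed_point(adj, labels, W_state, W_label, W_out, tol=1e-6, max_iter=200):
    """Iterate the state equation x_n = f(l_n, x_ne[n]) to a fixed point,
    then read out tau(G, n) = g(x_n, l_n) for every node n.
    adj: (N, N) adjacency matrix; labels: (N, p) node labels.
    W_state should be scaled small (e.g., spectral norm < 1) so the
    update is a contraction and the iteration converges."""
    num_nodes, state_dim = adj.shape[0], W_state.shape[0]
    x = np.zeros((num_nodes, state_dim))
    for _ in range(max_iter):
        agg = adj @ x                      # each node sums its neighbors' states
        x_new = np.tanh(agg @ W_state.T + labels @ W_label.T)
        if np.linalg.norm(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    # Output function g maps the converged state (plus label) into R^m.
    return np.concatenate([x, labels], axis=1) @ W_out.T
```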

5,701 citations


Journal ArticleDOI
TL;DR: A filter method of feature selection based on mutual information, called normalized mutual information feature selection (NMIFS), is presented and is combined with a genetic algorithm to form a hybrid filter/wrapper method called GAMIFS.
Abstract: A filter method of feature selection based on mutual information, called normalized mutual information feature selection (NMIFS), is presented. NMIFS is an enhancement over Battiti's MIFS, MIFS-U, and mRMR methods. The average normalized mutual information is proposed as a measure of redundancy among features. NMIFS outperformed MIFS, MIFS-U, and mRMR on several artificial and benchmark data sets without requiring a user-defined parameter. In addition, NMIFS is combined with a genetic algorithm to form a hybrid filter/wrapper method called GAMIFS, which includes an initialization procedure and a mutation operator based on NMIFS to speed up the convergence of the genetic algorithm. GAMIFS overcomes the limitations of incremental search algorithms that are unable to find dependencies between groups of features.
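
Concretely, the greedy NMIFS step selects the candidate feature maximizing relevance minus average normalized redundancy, with normalized MI taken as I(f;s)/min{H(f),H(s)}. The sketch below assumes discretized (pre-binned), non-constant feature columns and uses scikit-learn's mutual_info_score; it is a plausible reading of the criterion, not the authors' code.

```python
import numpy as np
from scipy.stats import entropy
from sklearn.metrics import mutual_info_score

def nmifs_select(X, y, k):
    """Greedy NMIFS: pick k columns of the (discrete) feature matrix X."""
    n_feat = X.shape[1]

    def H(v):  # entropy of the empirical distribution (natural log)
        _, counts = np.unique(v, return_counts=True)
        return entropy(counts)

    relevance = [mutual_info_score(X[:, i], y) for i in range(n_feat)]
    selected = [int(np.argmax(relevance))]    # start with the most relevant feature
    while len(selected) < k:
        best, best_score = None, -np.inf
        for i in range(n_feat):
            if i in selected:
                continue
            # Average normalized MI with already-selected features (redundancy).
            red = np.mean([mutual_info_score(X[:, i], X[:, s])
                           / min(H(X[:, i]), H(X[:, s])) for s in selected])
            score = relevance[i] - red
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
    return selected
```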

989 citations


Journal ArticleDOI
TL;DR: The book presents a large overview of the SSL methods and classifies them into four classes that correspond to its first four main parts (generative models; low-density separation methods; graph-based methods; and algorithms).
Abstract: This book addresses some theoretical aspects of semisupervised learning (SSL). The book is organized as a collection of different contributions of authors who are experts on this topic. The objectives of this book are to present a large overview of the SSL methods and to classify these methods into four classes that correspond to the first four main parts of the book (this would include generative models; low-density separation methods; graph-based methods; and algorithms). The last two parts are devoted to applications and perspectives of SSL. The book responds to its major objectives and could serve as a basis for an intermediate level graduate course on SSL. It may also serve as a useful self study and reference source for practicing engineers.

777 citations


Journal ArticleDOI
TL;DR: A simple and efficient approach to automatically determine the number of hidden nodes in generalized single-hidden-layer feedforward networks (SLFNs), which need not be neuron-like; the approach is much faster than other sequential/incremental/growing algorithms while achieving good generalization performance.
Abstract: One of the open problems in neural network research is how to automatically determine network architectures for given applications. In this brief, we propose a simple and efficient approach to automatically determine the number of hidden nodes in generalized single-hidden-layer feedforward networks (SLFNs), whose hidden nodes need not be neuron-like. This approach, referred to as error-minimized extreme learning machine (EM-ELM), can add random hidden nodes to SLFNs one by one or group by group (with varying group size). During the growth of the networks, the output weights are updated incrementally. The convergence of this approach is also proved in this brief. Simulation results demonstrate that our new approach is much faster than other sequential/incremental/growing algorithms while achieving good generalization performance.
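
As a rough picture of the growth loop: random hidden nodes are appended and only the output weights are re-solved. The sketch below recomputes the least-squares solution at every step for clarity; EM-ELM's actual speedup comes from updating the hidden-layer pseudoinverse incrementally, which this sketch omits. The sigmoid activation and uniform initialization are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def em_elm(X, T, max_nodes=100, target_rmse=0.05, group_size=5, seed=0):
    """Grow a SLFN group by group until the training RMSE target is met."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    params, H = [], np.empty((n, 0))
    while H.shape[1] < max_nodes:
        # Append a group of random hidden nodes (their weights are never retrained).
        W = rng.uniform(-1, 1, (d, group_size))
        b = rng.uniform(-1, 1, group_size)
        params.append((W, b))
        H = np.hstack([H, sigmoid(X @ W + b)])
        # Output weights by least squares (EM-ELM updates these incrementally).
        beta, *_ = np.linalg.lstsq(H, T, rcond=None)
        rmse = np.sqrt(np.mean((H @ beta - T) ** 2))
        if rmse <= target_rmse:
            break
    return params, beta
```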

600 citations


Journal ArticleDOI
TL;DR: The near-optimal control problem for a class of nonlinear discrete-time systems with control constraints is solved by an iterative adaptive dynamic programming algorithm.
Abstract: In this paper, the near-optimal control problem for a class of nonlinear discrete-time systems with control constraints is solved by an iterative adaptive dynamic programming algorithm. First, a novel nonquadratic performance functional is introduced to overcome the control constraints, and then an iterative adaptive dynamic programming algorithm is developed to solve the optimal feedback control problem of the original constrained system, with convergence analysis. In the present control scheme, three neural networks are used as parametric structures to facilitate the implementation of the iterative algorithm. Two examples are given to demonstrate the convergence and feasibility of the proposed optimal control scheme.
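
The phrase "nonquadratic performance functional" refers to a standard device in this literature for bounded inputs: replace the quadratic control cost with an integral of an inverse bounded function, e.g. W(u) = 2∫₀ᵘ Ū·tanh⁻¹(v/Ū)·R dv for a constraint |u| ≤ Ū. I cannot confirm the paper's exact functional, but the snippet below shows the behavior that matters: the marginal cost blows up near the bound, so the minimizing control stays strictly inside it.

```python
import numpy as np
from scipy.integrate import quad

U_BAR, R = 1.0, 1.0  # control bound and weight (assumed scalar case)

def constrained_cost(u):
    """Nonquadratic control cost 2*R*integral_0^u U*atanh(v/U) dv.
    It behaves like R*u^2 for small u, and its marginal cost (the integrand)
    diverges as |u| -> U_BAR, which keeps the optimal control inside the bound."""
    val, _ = quad(lambda v: 2.0 * R * U_BAR * np.arctanh(v / U_BAR), 0.0, u)
    return val

for u in (0.1, 0.5, 0.9, 0.99):
    print(f"u={u:4.2f}  quadratic={R * u * u:6.3f}  nonquadratic={constrained_cost(u):6.3f}")
```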

574 citations


Journal ArticleDOI
TL;DR: The new model allows the extension of the input domain for supervised neural networks to a general class of graphs including both acyclic/cyclic, directed/undirected labeled graphs and can realize adaptive contextual transductions, learning the mapping from graphs for both classification and regression tasks.
Abstract: This paper presents a new approach for learning in structured domains (SDs) using a constructive neural network for graphs (NN4G). The new model allows the extension of the input domain for supervised neural networks to a general class of graphs, including acyclic/cyclic and directed/undirected labeled graphs. In particular, the model can realize adaptive contextual transductions, learning the mapping from graphs for both classification and regression tasks. In contrast to previous neural networks for structures that had recursive dynamics, NN4G is based on a constructive feedforward architecture with state variables that uses neurons with no feedback connections. The neurons are applied to the input graphs by a general traversal process that relaxes the constraints of previous approaches derived from the causality assumption over hierarchical input data. Moreover, the incremental approach eliminates the need to introduce cyclic dependencies in the definition of the system state variables. In the traversal process, the NN4G units exploit (local) contextual information of the graph's vertices. In spite of the simplicity of the approach, we show that, through the compositionality of the contextual information developed by the learning, the model can deal with contextual information that is incrementally extended according to the graph's topology. The effectiveness and the generality of the new approach are investigated by analyzing its theoretical properties and providing experimental results.

465 citations


Journal ArticleDOI
TL;DR: CAVIAR is a massively parallel hardware implementation of a spike-based sensing-processing-learning-actuating system inspired by the physiology of the nervous system that achieves millisecond object recognition and tracking latencies.
Abstract: This paper describes CAVIAR, a massively parallel hardware implementation of a spike-based sensing-processing-learning-actuating system inspired by the physiology of the nervous system. CAVIAR uses the asynchronous address-event representation (AER) communication framework and was developed in the context of a European Union funded project. It has four custom mixed-signal AER chips and five custom digital AER interface components, contains 45k neurons (spiking cells) and up to 5M synapses, performs 12G synaptic operations per second, and achieves millisecond object recognition and tracking latencies.

338 citations


Journal ArticleDOI
TL;DR: A unified LMI approach is developed to solve the stability analysis and synchronization problems of the class of neural networks under investigation, where the LMIs can be easily solved by using the available MATLAB LMI toolbox.
Abstract: In this paper, we introduce a new class of discrete-time neural networks (DNNs) with Markovian jumping parameters as well as mode-dependent mixed time delays (both discrete and distributed time delays). Specifically, the parameters of the DNNs are subject to the switching from one to another at different times according to a Markov chain, and the mixed time delays consist of both discrete and distributed delays that are dependent on the Markovian jumping mode. We first deal with the stability analysis problem of the addressed neural networks. A special inequality is developed to account for the mixed time delays in the discrete-time setting, and a novel Lyapunov-Krasovskii functional is put forward to reflect the mode-dependent time delays. Sufficient conditions are established in terms of linear matrix inequalities (LMIs) that guarantee the stochastic stability. We then turn to the synchronization problem among an array of identical coupled Markovian jumping neural networks with mixed mode-dependent time delays. By utilizing the Lyapunov stability theory and the Kronecker product, it is shown that the addressed synchronization problem is solvable if several LMIs are feasible. Hence, different from the commonly used matrix norm theories (such as the M-matrix method), a unified LMI approach is developed to solve the stability analysis and synchronization problems of the class of neural networks under investigation, where the LMIs can be easily solved by using the available MATLAB LMI toolbox. Two numerical examples are presented to illustrate the usefulness and effectiveness of the main results obtained.
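
To make "the LMIs can be easily solved" concrete outside MATLAB: any SDP solver can check LMI feasibility. The toy check below (Python with cvxpy, standing in for the MATLAB LMI toolbox the authors mention) verifies the ordinary discrete-time Lyapunov LMI AᵀPA − P ≺ 0 for an assumed stable matrix; the paper's actual LMIs encode the Markovian modes and mixed delays and are considerably larger.

```python
import numpy as np
import cvxpy as cp

A = np.array([[0.5, 0.2],
              [-0.1, 0.3]])  # toy system matrix (assumed example)

n = A.shape[0]
P = cp.Variable((n, n), symmetric=True)
eps = 1e-6
constraints = [P >> eps * np.eye(n),                      # P positive definite
               A.T @ P @ A - P << -eps * np.eye(n)]       # Lyapunov LMI
prob = cp.Problem(cp.Minimize(0), constraints)            # pure feasibility check
prob.solve(solver=cp.SCS)
print("LMI feasible (system stable):", prob.status == cp.OPTIMAL)
```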

329 citations


Journal ArticleDOI
TL;DR: A theoretical overview of the global optimum solution to the TR problem via the equivalent trace difference problem is presented, and eigenvalue perturbation theory is introduced to derive an efficient algorithm based on the Newton-Raphson method.
Abstract: Dimensionality reduction is an important issue in many machine learning and pattern recognition applications, and the trace ratio (TR) problem is an optimization problem involved in many dimensionality reduction algorithms. Conventionally, the solution is approximated via generalized eigenvalue decomposition due to the difficulty of the original problem. However, prior works have indicated that it is more reasonable to solve it directly than via the conventional way. In this brief, we present a theoretical overview of the global optimum solution to the TR problem via the equivalent trace difference problem. Eigenvalue perturbation theory is introduced to derive an efficient algorithm based on the Newton-Raphson method. Theoretical results on the convergence and efficiency of our algorithm, compared with those in the prior literature, are presented and further supported by extensive empirical results.
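
The trace difference connection works like this: for the TR problem max tr(WᵀAW)/tr(WᵀBW) over orthonormal W, define f(λ) = max_W tr(Wᵀ(A − λB)W); the optimal trace ratio is the root of f, and since f'(λ) = −tr(WᵀBW), updating λ to the current ratio is exactly a Newton-Raphson step on f. The numpy sketch below implements that iteration (assuming B positive definite); the paper's eigenvalue-perturbation machinery for cheap per-step updates is omitted.

```python
import numpy as np

def trace_ratio(A, B, dim, iters=50, tol=1e-10):
    """Solve max tr(W'AW)/tr(W'BW) over orthonormal W (n x dim), B assumed PD."""
    lam = 0.0
    for _ in range(iters):
        # W maximizing tr(W'(A - lam*B)W): top-`dim` eigenvectors.
        vals, vecs = np.linalg.eigh(A - lam * B)
        W = vecs[:, -dim:]
        # Newton-Raphson update on f(lam) reduces to the current trace ratio.
        lam_new = np.trace(W.T @ A @ W) / np.trace(W.T @ B @ W)
        if abs(lam_new - lam) < tol:
            return W, lam_new
        lam = lam_new
    return W, lam
```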

279 citations


Journal ArticleDOI
TL;DR: In this paper, adaptive neural network (NN) tracking control is investigated for a class of uncertain multiple-input-multiple-output (MIMO) nonlinear systems in triangular control structure with unknown nonsymmetric dead zones and control directions; the closed-loop system is proved to be semiglobally uniformly ultimately bounded.
Abstract: In this paper, adaptive neural network (NN) tracking control is investigated for a class of uncertain multiple-input-multiple-output (MIMO) nonlinear systems in triangular control structure with unknown nonsymmetric dead zones and control directions. The design is based on the principle of sliding mode control and the use of Nussbaum-type functions in solving the problem of the completely unknown control directions. It is shown that the dead-zone output can be represented as a simple linear system with a static time-varying gain and bounded disturbance by introducing a characteristic function. By utilizing the integral-type Lyapunov function and introducing an adaptive compensation term for the upper bound of the optimal approximation error and the dead-zone disturbance, the closed-loop control system is proved to be semiglobally uniformly ultimately bounded, with tracking errors converging to zero under the condition that the slopes of unknown dead zones are equal. Simulation results demonstrate the effectiveness of the approach.

251 citations


Journal ArticleDOI
TL;DR: Using these new Lyapunov-Krasovskii functionals, some new delay-dependent criteria for global asymptotic stability are derived for delayed neural networks, where both constant time delays and time-varying delays are treated.
Abstract: This brief deals with the problem of global asymptotic stability for a class of delayed neural networks. Some new Lyapunov-Krasovskii functionals are constructed by nonuniformly dividing the delay interval into multiple segments, and choosing proper functionals with different weighting matrices corresponding to different segments in the Lyapunov-Krasovskii functionals. Then using these new Lyapunov-Krasovskii functionals, some new delay-dependent criteria for global asymptotic stability are derived for delayed neural networks, where both constant time delays and time-varying delays are treated. These criteria are much less conservative than some existing results, which is shown through a numerical example.

Journal ArticleDOI
TL;DR: A systematic sparsification scheme is proposed, which can drastically reduce the time and space complexity without harming the performance of kernel adaptive filters.
Abstract: This paper discusses an information theoretic approach to designing sparse kernel adaptive filters. To determine useful data to be learned and remove redundant ones, a subjective information measure called surprise is introduced. Surprise captures the amount of information a datum contains which is transferable to a learning system. Based on this concept, we propose a systematic sparsification scheme, which can drastically reduce the time and space complexity without harming the performance of kernel adaptive filters. Nonlinear regression, short-term chaotic time-series prediction, and long-term time-series forecasting examples are presented.
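
The surprise of a datum is its negative log likelihood under the learner's current predictive distribution; in a Gaussian-process view of kernel filtering this is ½ln(2πσ²) + (d − ŷ)²/(2σ²), where ŷ and σ² are the predictive mean and variance. The class below is a deliberately simplified dictionary filter wired to that gate: redundant data (tiny surprise) and abnormal data (huge surprise, likely outliers) are both discarded. The RBF kernel, the thresholds, and the full linear solve per update are illustrative simplifications, not the paper's recursive implementation.

```python
import numpy as np

def rbf(a, b, gamma=1.0):
    return np.exp(-gamma * np.sum((a - b) ** 2, axis=-1))

class SurpriseKernelFilter:
    """Kernel regression with a surprise-gated dictionary (simplified sketch).
    Inputs x are 1-D feature arrays; targets d are scalars."""
    def __init__(self, noise=0.1, T_abnormal=20.0, T_redundant=0.1):
        self.noise, self.T1, self.T2 = noise, T_abnormal, T_redundant
        self.centers, self.targets = [], []

    def _predict(self, x):
        if not self.centers:
            return 0.0, 1.0 + self.noise          # prior mean and variance
        C = np.array(self.centers)
        k = rbf(C, x)
        K = rbf(C[:, None, :], C[None, :, :]) + self.noise * np.eye(len(C))
        mean = k @ np.linalg.solve(K, np.array(self.targets))
        var = 1.0 + self.noise - k @ np.linalg.solve(K, k)
        return mean, max(var, 1e-12)

    def update(self, x, d):
        mean, var = self._predict(x)
        # Surprise = -log p(d | past data) under the Gaussian predictive.
        surprise = 0.5 * np.log(2 * np.pi * var) + (d - mean) ** 2 / (2 * var)
        if self.T2 < surprise < self.T1:           # informative: learn it
            self.centers.append(np.atleast_1d(x))
            self.targets.append(d)
        # redundant (< T2) or abnormal (> T1) data are discarded
        return surprise
```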

Journal ArticleDOI
TL;DR: This paper aims to design a state estimator to estimate the network states such that, for all admissible parameter uncertainties and time-varying delays, the dynamics of the estimation error is guaranteed to be globally exponentially stable in the mean square.
Abstract: This paper is concerned with the problem of state estimation for a class of discrete-time coupled uncertain stochastic complex networks with missing measurements and time-varying delay. The parameter uncertainties are assumed to be norm-bounded and enter into both the network state and the network output. The stochastic Brownian motions affect not only the coupling term of the network but also the overall network dynamics. The nonlinear terms that satisfy the usual Lipschitz conditions exist in both the state and measurement equations. Through available output measurements described by a binary switching sequence that obeys a conditional probability distribution, we aim to design a state estimator to estimate the network states such that, for all admissible parameter uncertainties and time-varying delays, the dynamics of the estimation error is guaranteed to be globally exponentially stable in the mean square. By employing the Lyapunov functional method combined with the stochastic analysis approach, several delay-dependent criteria are established that ensure the existence of the desired estimator gains, and then the explicit expression of such estimator gains is characterized in terms of the solution to certain linear matrix inequalities (LMIs). Two numerical examples are exploited to illustrate the effectiveness of the proposed estimator design schemes.

Journal ArticleDOI
TL;DR: Learn++.NC is described, specifically designed for efficient incremental learning of multiple new classes using significantly fewer classifiers; it introduces dynamically weighted consult and vote (DW-CAV), a novel voting mechanism for combining classifiers.
Abstract: We have previously introduced an incremental learning algorithm Learn++, which learns novel information from consecutive data sets by generating an ensemble of classifiers with each data set, and combining them by weighted majority voting. However, Learn++ suffers from an inherent "outvoting" problem when asked to learn a new class ω_new introduced by a subsequent data set, as earlier classifiers not trained on this class are guaranteed to misclassify ω_new instances. The collective votes of earlier classifiers, for an inevitably incorrect decision, then outweigh the votes of the new classifiers' correct decision on ω_new instances, until there are enough new classifiers to counteract the unfair outvoting. This forces Learn++ to generate an unnecessarily large number of classifiers. This paper describes Learn++.NC, specifically designed for efficient incremental learning of multiple new classes using significantly fewer classifiers. To do so, Learn++.NC introduces dynamically weighted consult and vote (DW-CAV), a novel voting mechanism for combining classifiers: individual classifiers consult with each other to determine which ones are most qualified to classify a given instance, and decide how much weight, if any, each classifier's decision should carry. Experiments on real-world problems indicate that the new algorithm performs remarkably well with substantially fewer classifiers, not only as compared to its predecessor Learn++, but also as compared to several other algorithms recently proposed for similar problems.

Journal ArticleDOI
TL;DR: The functions that can be approximated by GNNs, in probability, up to any prescribed degree of precision are characterized; this set includes most of the practically useful functions on graphs.
Abstract: In this paper, we will consider the approximation properties of a recently introduced neural network model called graph neural network (GNN), which can be used to process structured data inputs, e.g., acyclic graphs, cyclic graphs, and directed or undirected graphs. This class of neural networks implements a function τ(G,n) ∈ R^m that maps a graph G and one of its nodes n onto an m-dimensional Euclidean space. We characterize the functions that can be approximated by GNNs, in probability, up to any prescribed degree of precision. This set contains the maps that satisfy a property called preservation of the unfolding equivalence, and includes most of the practically useful functions on graphs; the only known exception is when the input graph contains particular patterns of symmetries when unfolding equivalence may not be preserved. The result can be considered an extension of the universal approximation property established for the classic feedforward neural networks (FNNs). Some experimental examples are used to show the computational capabilities of the proposed model.

Journal ArticleDOI
TL;DR: A robust fault detection and isolation (FDI) scheme for a general class of nonlinear systems is presented using a neural-network-based observer strategy; it requires no restrictive assumptions on the system and/or the FDI algorithm.
Abstract: This paper presents a robust fault detection and isolation (FDI) scheme for a general class of nonlinear systems using a neural-network-based observer strategy. Both actuator and sensor faults are considered. The nonlinear system considered is subject to both state and sensor uncertainties and disturbances. Two recurrent neural networks are employed to identify general unknown actuator and sensor faults, respectively. The neural network weights are updated according to a modified backpropagation scheme. Unlike many previous methods developed in the literature, our proposed FDI scheme does not rely on the availability of full state measurements. The stability of the overall FDI scheme in the presence of unknown sensor and actuator faults as well as plant and sensor noise and uncertainties is shown by using Lyapunov's direct method. The stability analysis developed requires no restrictive assumptions on the system and/or the FDI algorithm. Magnetorquer-type actuators and magnetometer-type sensors that are commonly employed in the attitude control subsystem (ACS) of low-Earth orbit (LEO) satellites for attitude determination and control are considered in our case studies. The effectiveness and capabilities of our proposed fault diagnosis strategy are demonstrated and validated through extensive simulation studies.

Journal ArticleDOI
TL;DR: The pinning stabilization problem of linearly coupled stochastic neural networks (LCSNNs) is studied, and some criteria are derived to judge whether the LCSNNs can be controlled in mean square by using the designed controllers.
Abstract: The pinning stabilization problem of linearly coupled stochastic neural networks (LCSNNs) is studied in this paper. A minimum number of controllers are used to force the LCSNNs to the desired equilibrium point by fully utilizing the structure of the network. In order to pin the LCSNNs to a certain desired state, only one controller is required for a strongly connected network topology, and m controllers, which will be shown to be the minimum number, are needed for LCSNNs with an m-reducible coupling matrix. The isolated node of the LCSNNs can be stable, periodic, or even chaotic. The coupling Laplacian matrix of the LCSNNs can be symmetric irreducible, asymmetric irreducible, or m-reducible, which means that the network topology can be strongly connected, weakly connected, or even unconnected. There is no constraint on the network topology. Some criteria are derived to judge whether the LCSNNs can be controlled in mean square by using the designed controllers. The given criteria are expressed in terms of strict linear matrix inequalities, which can be easily checked by resorting to recently developed algorithms. Moreover, numerical examples including small-world and scale-free networks are also given to demonstrate that our theoretical results are valid and efficient for large systems.

Journal ArticleDOI
TL;DR: The logic-based input-state dynamics of Boolean networks with inputs, called Boolean control networks, is converted into an algebraic discrete-time dynamic system, and the structure of cycles of Boolean control systems is obtained as compounded cycles.
Abstract: This paper investigates the structure of Boolean networks via their input-state structure. Using the algebraic form proposed by the author, the logic-based input-state dynamics of Boolean networks with inputs, called Boolean control networks, is converted into an algebraic discrete-time dynamic system. Then the structure of cycles of Boolean control systems is obtained as compounded cycles. Using the obtained input-state description, the structure of Boolean networks is investigated, and their attractors are revealed as nested compounded cycles, called rolling gears. This structure explains why small cycles mainly decide the behaviors of cellular networks. Some illustrative examples are presented.
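
The "algebraic form" identifies a Boolean value with a canonical vector (True ↦ (1,0)ᵀ, False ↦ (0,1)ᵀ) and a joint state of k variables with the Kronecker product of its components, so the whole network collapses to a linear-looking update x(t+1) = L·x(t) over these basis vectors, with the structure matrix L read off the truth table. The sketch below builds L for an assumed toy two-node network and checks that it reproduces the logic; cycles and attractors can then be found by iterating L.

```python
import numpy as np
from itertools import product

TRUE, FALSE = np.array([1, 0]), np.array([0, 1])

def to_vec(*bits):
    """State of k Boolean variables as the Kronecker product of canonical vectors."""
    v = np.array([1])
    for b in bits:
        v = np.kron(v, TRUE if b else FALSE)
    return v

# Assumed example network: x1(t+1) = x1 AND x2, x2(t+1) = NOT x1.
def step(x1, x2):
    return (x1 and x2, not x1)

# Build the 4x4 structure matrix L column by column from the truth table.
states = list(product([True, False], repeat=2))
L = np.column_stack([to_vec(*step(*s)) for s in states])

# The algebraic dynamics x(t+1) = L x(t) reproduces the logic exactly.
for s in states:
    assert np.array_equal(L @ to_vec(*s), to_vec(*step(*s)))
print(L)
```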

Journal ArticleDOI
TL;DR: This paper shows that data topology can be integrated into the visualization of the SOM and thereby provide a more elaborate view of the cluster structure than existing schemes, by introducing a weighted Delaunay triangulation and draping it over the SOM.
Abstract: The self-organizing map (SOM) is a powerful method for visualization, cluster extraction, and data mining. It has been used successfully for data of high dimensionality and complexity where traditional methods may often be insufficient. In order to analyze data structure and capture cluster boundaries from the SOM, one common approach is to represent the SOM's knowledge by visualization methods. Different aspects of the information learned by the SOM are presented by existing methods, but data topology, which is present in the SOM's knowledge, is greatly underutilized. We show in this paper that data topology can be integrated into the visualization of the SOM and thereby provide a more elaborate view of the cluster structure than existing schemes. We achieve this by introducing a weighted Delaunay triangulation (a connectivity matrix) and draping it over the SOM. This new visualization, CONNvis, also shows both forward and backward topology violations along with the severity of forward ones, which indicate the quality of the SOM learning and the data complexity. CONNvis greatly assists in detailed identification of cluster boundaries. We demonstrate the capabilities on synthetic data sets and on a real 8D remote sensing spectral image.
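
The connectivity matrix underlying CONNvis is easy to state: for every data sample, find its best matching unit (BMU) and second BMU among the SOM prototypes and count that ordered pair; the counts weight the edges of the induced Delaunay-like graph draped over the SOM. A minimal sketch, where prototypes is assumed to be the trained SOM's weight vectors flattened into an array:

```python
import numpy as np

def conn_matrix(data, prototypes):
    """Weighted Delaunay-style connectivity: the (i, j) count records samples
    whose BMU is unit i and whose second BMU is unit j."""
    n_units = len(prototypes)
    conn = np.zeros((n_units, n_units), dtype=int)
    for x in data:
        d = np.linalg.norm(prototypes - x, axis=1)
        bmu, second = np.argsort(d)[:2]
        conn[bmu, second] += 1
    # Symmetrize for an undirected drape; the directed counts carry the
    # forward/backward topology-violation information CONNvis visualizes.
    return conn + conn.T
```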

Journal ArticleDOI
TL;DR: Two alternative complex-valued activation functions are presented: one based on a multilevel sigmoid function defined on a circle, the other on the characteristic of a multistate bifurcating neuron represented by a circle map.
Abstract: A widely used complex-valued activation function for complex-valued multistate Hopfield networks is revealed to be essentially based on a multilevel step function. By replacing the multilevel step function with other multilevel characteristics, we present two alternative complex-valued activation functions. One is based on a multilevel sigmoid function, while the other on a characteristic of a multistate bifurcating neuron. Numerical experiments show that both modifications to the complex-valued activation function bring about improvements in network performance for a multistate associative memory. The advantage of the proposed networks over the complex-valued Hopfield networks with the multilevel step function is more outstanding when a complex-valued neuron represents a larger number of multivalued states. Further, the performance of the proposed networks in reconstructing noisy 256 gray-level images is demonstrated in comparison with other recent associative memories to clarify their advantages and disadvantages.
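
For context, the multilevel step activation that the paper replaces quantizes the phase of the neuron's net input into K equal sectors of the unit circle. A sketch of that baseline quantizer (often written csign); placing the output at the sector midpoint is one common convention, assumed here:

```python
import numpy as np

def csign(z, K):
    """Multilevel step activation for complex-valued multistate neurons:
    snap the phase of z to one of K equally spaced states on the unit circle."""
    theta = np.angle(z) % (2 * np.pi)
    k = np.floor(theta * K / (2 * np.pi))
    # Output at the sector midpoint (a modeling choice, assumed here).
    return np.exp(1j * (2 * np.pi * (k + 0.5) / K))

print(csign(np.exp(1j * 0.1), K=8))   # -> the state for the first of 8 sectors
```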

Journal ArticleDOI
TL;DR: New stability results for recurrent neural networks with Markovian switching are presented, showing that the almost sure exponential stability of such a neural network does not require the stability of the neural network at every individual parametric configuration.
Abstract: This paper presents new stability results for recurrent neural networks with Markovian switching. First, algebraic criteria for the almost sure exponential stability of recurrent neural networks with Markovian switching and without time delays are derived. The results show that the almost sure exponential stability of such a neural network does not require the stability of the neural network at every individual parametric configuration. Next, both delay-dependent and delay-independent criteria for the almost sure exponential stability of recurrent neural networks with time-varying delays and Markovian-switching parameters are derived by means of a generalized stochastic Halanay inequality. The results herein include existing ones for recurrent neural networks without Markovian switching as special cases. Finally, simulation results in three numerical examples are discussed to illustrate the theoretical results.

Journal ArticleDOI
TL;DR: Experiments on a number of synthetic and real-world data sets demonstrate that the proposed approach is more accurate, much faster, and can handle data sets that are hundreds of times larger than the largest data set reported in the MMC literature.
Abstract: Motivated by the success of large margin methods in supervised learning, maximum margin clustering (MMC) is a recent approach that aims at extending large margin methods to unsupervised learning. However, its optimization problem is nonconvex and existing MMC methods all rely on reformulating and relaxing the nonconvex optimization problem as semidefinite programs (SDP). Though SDP is convex and standard solvers are available, they are computationally very expensive and only small data sets can be handled. To make MMC more practical, we avoid SDP relaxations and propose in this paper an efficient approach that performs alternating optimization directly on the original nonconvex problem. A key step to avoid premature convergence in the resultant iterative procedure is to change the loss function from the hinge loss to the Laplacian/square loss so that overconfident predictions are penalized. Experiments on a number of synthetic and real-world data sets demonstrate that the proposed approach is more accurate, much faster (hundreds to tens of thousands of times faster), and can handle data sets that are hundreds of times larger than the largest data set reported in the MMC literature.

Journal ArticleDOI
TL;DR: The approximate ANN solution automatically satisfies BCs at all stages of training, including before training commences; the method is simpler than other ANN approaches due to its unconstrained nature, and automatic satisfaction of Dirichlet BCs provides a good starting approximate solution for significant portions of the domain.
Abstract: A method for solving boundary value problems (BVPs) is introduced using artificial neural networks (ANNs) for irregular domain boundaries with mixed Dirichlet/Neumann boundary conditions (BCs). The approximate ANN solution automatically satisfies BCs at all stages of training, including before training commences. This method is simpler than other ANN methods for solving BVPs due to its unconstrained nature and because automatic satisfaction of Dirichlet BCs provides a good starting approximate solution for significant portions of the domain. Automatic satisfaction of BCs is accomplished by the introduction of an innovative length factor. Several examples of BVP solution are presented for both linear and nonlinear differential equations in two and three dimensions. Error norms in the approximate solution on the order of 10^-4 to 10^-5 are reported for all example problems.
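
The construction behind "automatic satisfaction of BCs" is a trial solution u(x) = A(x) + L(x)·N(x), where A matches the Dirichlet data, the length factor L vanishes exactly on the Dirichlet boundary, and N is the raw network output; the BCs then hold for any network parameters, trained or not. The paper computes length factors for irregular 2D/3D domains with mixed BCs; the 1D toy below, with hand-picked A and L and a random untrained network, only demonstrates the identity at the endpoints.

```python
import numpy as np

# Toy BVP on [0, 1] with Dirichlet data u(0) = u0, u(1) = u1.
u0, u1 = 0.0, 1.0

def A(x):          # any smooth function matching the boundary data
    return u0 * (1 - x) + u1 * x

def L(x):          # length factor: zero exactly on the Dirichlet boundary
    return x * (1 - x)

def N(x, params):  # a tiny one-hidden-layer network (illustrative)
    W1, b1, W2 = params
    return np.tanh(np.outer(x, W1) + b1) @ W2

def trial(x, params):
    """u(x) = A(x) + L(x)*N(x): satisfies the BCs for ANY network parameters."""
    return A(x) + L(x) * N(x, params)

rng = np.random.default_rng(0)
params = (rng.normal(size=10), rng.normal(size=10), rng.normal(size=10))
x = np.array([0.0, 0.5, 1.0])
print(trial(x, params))  # endpoints are exactly u0 and u1, even untrained
```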

Journal ArticleDOI
TL;DR: In a static case where the barriers and targets are stationary, this paper proves that the generated wave in the network spreads outward with travel times proportional to the linking strength among neurons, so the generated path is always the global shortest path from the robot to the target.
Abstract: This paper presents a modified pulse-coupled neural network (MPCNN) model for real-time collision-free path planning of mobile robots in nonstationary environments. The proposed neural network for robots is topologically organized with only local lateral connections among neurons. It works in dynamic environments and requires no prior knowledge of target or barrier movements. The target neuron fires first, and then the firing event spreads out, through the lateral connections among the neurons, like the propagation of a wave. Obstacles have no connections to their neighbors. Each neuron records its parent, that is, the neighbor that caused it to fire. The real-time optimal path is then the sequence of parents from the robot to the target. In a static case where the barriers and targets are stationary, this paper proves that the generated wave in the network spreads outward with travel times proportional to the linking strength among neurons. Thus, the generated path is always the global shortest path from the robot to the target. In addition, each neuron in the proposed model can propagate a firing event to its neighboring neuron without any comparing computations. The proposed model is applied to generate collision-free paths for a mobile robot to solve a maze-type problem, to circumvent concave U-shaped obstacles, and to track a moving target in an environment with varying obstacles. The effectiveness and efficiency of the proposed approach is demonstrated through simulation and comparison studies.
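
Functionally, the firing wave with parent recording computes the same thing as a wavefront search from the target: each free cell records which neighbor fired it, and the robot's path is read off by following parents. The MPCNN's contribution is doing this with purely local, comparison-free neuron dynamics and real-valued linking strengths; the uniform-cost BFS sketch below captures only the wave-and-parents logic on a toy grid.

```python
from collections import deque

def wavefront_path(grid, target, robot):
    """grid[r][c] == 1 marks an obstacle. The wave starts at the target and
    spreads to free neighbors; each cell records the parent that fired it.
    Following parents from the robot yields the shortest collision-free path."""
    rows, cols = len(grid), len(grid[0])
    parent = {target: None}
    q = deque([target])
    while q:
        r, c = q.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and grid[nr][nc] == 0 and (nr, nc) not in parent:
                parent[(nr, nc)] = (r, c)
                q.append((nr, nc))
    path, cell = [], robot
    while cell is not None:            # walk the parent pointers to the target
        path.append(cell)
        cell = parent.get(cell)
    return path if path[-1] == target else None

grid = [[0, 0, 0],
        [1, 1, 0],
        [0, 0, 0]]
print(wavefront_path(grid, target=(2, 0), robot=(0, 0)))
```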

Journal ArticleDOI
TL;DR: The global kernel k-means algorithm is proposed, a deterministic and incremental approach to kernel-based clustering, which identifies nonlinearly separable clusters and locates near-optimal solutions avoiding poor local minima.
Abstract: Kernel k-means is an extension of the standard k-means clustering algorithm that identifies nonlinearly separable clusters. In order to overcome the cluster initialization problem associated with this method, we propose the global kernel k-means algorithm, a deterministic and incremental approach to kernel-based clustering. Our method adds one cluster at each stage, through a global search procedure consisting of several executions of kernel k-means from suitable initializations. This algorithm does not depend on cluster initialization, identifies nonlinearly separable clusters, and, due to its incremental nature and search procedure, locates near-optimal solutions avoiding poor local minima. Furthermore, two modifications are developed to reduce the computational cost that do not significantly affect the solution quality. The proposed methods are extended to handle weighted data points, which enables their application to graph partitioning. We experiment with several data sets and the proposed approach compares favorably to kernel k-means with random restarts.
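
A compact rendering of the two ingredients: kernel k-means assigns each point to the cluster with the nearest feature-space centroid, computable from Gram-matrix entries alone, and the global variant adds clusters one at a time, trying every point as the seed of the new cluster and keeping the best run. The sketch below omits the paper's weighted variant and its computational shortcuts.

```python
import numpy as np

def kernel_kmeans(K, labels, n_clusters, iters=100):
    """Lloyd-style kernel k-means given a Gram matrix K and initial labels."""
    n = K.shape[0]
    for _ in range(iters):
        dist = np.zeros((n, n_clusters))
        for c in range(n_clusters):
            idx = np.flatnonzero(labels == c)
            if idx.size == 0:
                dist[:, c] = np.inf
                continue
            # ||phi(x) - m_c||^2 = K_xx - 2*mean(K_x,idx) + mean(K_idx,idx)
            dist[:, c] = (np.diag(K) - 2 * K[:, idx].mean(axis=1)
                          + K[np.ix_(idx, idx)].mean())
        new = dist.argmin(axis=1)
        if np.array_equal(new, labels):
            break
        labels = new
    return labels, dist[np.arange(n), labels].sum()

def global_kernel_kmeans(K, n_clusters):
    n = K.shape[0]
    labels = np.zeros(n, dtype=int)          # start from the 1-cluster solution
    for k in range(2, n_clusters + 1):
        best = (None, np.inf)
        for seed in range(n):                # try every point as the new seed
            init = labels.copy()
            init[seed] = k - 1
            cand, err = kernel_kmeans(K, init, k)
            if err < best[1]:
                best = (cand, err)
        labels = best[0]
    return labels
```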

Journal ArticleDOI
TL;DR: This paper analyzes NCL and reveals that its training corresponds to training the entire ensemble as a single learning machine that only minimizes the MSE without regularization, which explains why NCL is prone to overfitting the noise in the training set.
Abstract: Negative correlation learning (NCL) is a neural network ensemble learning algorithm that introduces a correlation penalty term to the cost function of each individual network so that each neural network minimizes its mean square error (MSE) together with the correlation of the ensemble. This paper analyzes NCL and reveals that the training of NCL (when λ = 1) corresponds to training the entire ensemble as a single learning machine that only minimizes the MSE without regularization. This analysis explains the reason why NCL is prone to overfitting the noise in the training set. This paper also demonstrates that tuning the correlation parameter λ in NCL by cross validation cannot overcome the overfitting problem. The paper analyzes this problem and proposes the regularized negative correlation learning (RNCL) algorithm, which incorporates an additional regularization term for the whole ensemble. RNCL decomposes the ensemble's training objectives, including MSE and regularization, into a set of sub-objectives, and each sub-objective is implemented by an individual neural network. In this paper, we also provide a Bayesian interpretation for RNCL and provide an automatic algorithm to optimize regularization parameters based on Bayesian inference. The RNCL formulation is applicable to any nonlinear estimator minimizing the MSE. The experiments on synthetic as well as real-world data sets demonstrate that RNCL achieves better performance than NCL, especially when the noise level is nontrivial in the data set.

Journal ArticleDOI
TL;DR: It is found that the time matrix of the model can be recognized as a human subjective sense of stimulus intensity, and the retrieval scheme is effective in extracting rotation- and scale-invariant features.
Abstract: Based on studies of existing locally connected neural network models, in this brief we present a new spiking cortical neural network model and find that the time matrix of the model can be recognized as a human subjective sense of stimulus intensity. The series of output pulse images of the proposed model represents the segment, edge, and texture features of the original image and can be summarized, using several efficient measures, into a sequence that serves as the feature of the original image. We characterize texture images by this sequence for invariant texture retrieval. The experimental results show that the retrieval scheme is effective in extracting rotation- and scale-invariant features. The new model also obtains good results when used in other image processing applications.

Journal ArticleDOI
TL;DR: This brief presents an approach to detect premature ventricular contractions (PVCs) using the neural network with weighted fuzzy membership functions (NEWFM); the locations of the eight selected features lie not only around the QRS complex, which represents ventricular depolarization in the electrocardiogram (ECG) and contains a Q wave, an R wave, and an S wave, but also in the QR segment, which carries more discriminative information than the RS segment.
Abstract: Fuzzy neural networks (FNNs) have been successfully applied to generate predictive rules for medical or diagnostic data. This brief presents an approach to detect premature ventricular contractions (PVCs) using the neural network with weighted fuzzy membership functions (NEWFM). The NEWFM classifies normal and PVC beats by the trained bounded sum of weighted fuzzy membership functions (BSWFMs) using wavelet-transformed coefficients from the MIT-BIH PVC database. The eight generalized coefficients, locally related to the time signal, are extracted by the nonoverlap area distribution measurement method. The eight generalized coefficients are used for the three PVC data sets with reliable accuracy rates of 99.80%, 99.21%, and 98.78%, respectively, which means that the selected features are less dependent on the data sets. It is shown that the locations of the eight features are not only around the QRS complex that represents ventricular depolarization in the electrocardiogram (ECG), containing a Q wave, an R wave, and an S wave, but also in the QR segment from the Q wave to the R wave, which has more discriminative information than the RS segment from the R wave to the S wave. The BSWFMs of the eight features trained by NEWFM are shown visually, which makes the features explicitly interpretable. Since each BSWFM combines multiple weighted fuzzy membership functions into one using the bounded sum, the eight small-sized BSWFMs can realize real-time PVC detection in a mobile environment.

Journal ArticleDOI
TL;DR: PCVMs outperform other algorithms, including SVMSoft, SVMHard, RVM, and SVMPCVM, on most of the data sets under the three metrics, especially under AUC.
Abstract: In this paper, a sparse learning algorithm, probabilistic classification vector machines (PCVMs), is proposed. We analyze relevance vector machines (RVMs) for classification problems and observe that adopting the same prior for different classes may lead to unstable solutions. In order to tackle this problem, a signed and truncated Gaussian prior is adopted over every weight in PCVMs, where the sign of the prior is determined by the class label, i.e., +1 or -1. The truncated Gaussian prior not only restricts the sign of the weights but also leads to a sparse estimation of the weight vectors, and thus controls the complexity of the model. In PCVMs, the kernel parameters can be optimized simultaneously within the training algorithm. The performance of PCVMs is extensively evaluated on four synthetic data sets and 13 benchmark data sets using three performance metrics: error rate (ERR), area under the curve of the receiver operating characteristic (AUC), and root mean squared error (RMSE). We compare PCVMs with soft-margin support vector machines (SVMSoft), hard-margin support vector machines (SVMHard), SVMs with the kernel parameters optimized by PCVMs (SVMPCVM), relevance vector machines (RVMs), and some other baseline classifiers. Through five replications of the twofold cross-validation F test (i.e., the 5 × 2 cross-validation F test) over single data sets, and the Friedman test with the corresponding post-hoc test to compare these algorithms over multiple data sets, we notice that PCVMs outperform other algorithms, including SVMSoft, SVMHard, RVM, and SVMPCVM, on most of the data sets under the three metrics, especially under AUC. Our results also reveal that the performance of SVMPCVM is slightly better than that of SVMSoft, implying that the parameter optimization algorithm in PCVMs is better than cross validation in terms of performance and computational complexity. In this paper, we also discuss the superiority of PCVMs' formulation using maximum a posteriori (MAP) analysis and margin analysis, which explain the empirical success of PCVMs.

Journal ArticleDOI
TL;DR: Experiments show that the proposed approach effectively reduces the number of prototypes while maintaining the same level of classification accuracy as the traditional KNN, and that it is a simple and fast condensing algorithm.
Abstract: The K-nearest neighbor (KNN) rule is one of the most widely used pattern classification algorithms. For large data sets, the computational demands for classifying patterns using KNN can be prohibitive. A way to alleviate this problem is through the condensing approach. This means we remove patterns that are more of a computational burden but do not contribute to better classification accuracy. In this brief, we propose a new condensing algorithm. The proposed idea is based on defining the so-called chain. This is a sequence of nearest neighbors from alternating classes. We make the point that patterns further down the chain are close to the classification boundary and, based on that, we set a cutoff for the patterns we keep in the training set. Experiments show that the proposed approach effectively reduces the number of prototypes while maintaining the same level of classification accuracy as the traditional KNN. Moreover, it is a simple and fast condensing algorithm.
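
One plausible reading of the chain idea in code: starting from any pattern, repeatedly jump to the nearest neighbor in the opposite class; the hop distances shrink monotonically, so the chain settles near the decision boundary, and patterns reached beyond a cutoff depth are the ones worth keeping. The sketch below (binary problems assumed, cutoff illustrative) is a reconstruction of that idea, not the authors' exact rule.

```python
import numpy as np

def nn_chain(start, X, y):
    """Chain of nearest neighbors from alternating classes, starting at `start`.
    Hop distances are non-increasing, so the walk terminates when it would
    revisit a pattern (the two chain ends face each other across the boundary)."""
    chain, i = [start], start
    while True:
        other = np.flatnonzero(y != y[i])             # opposite class of current
        j = other[np.argmin(np.linalg.norm(X[other] - X[i], axis=1))]
        if j in chain:
            return chain
        chain.append(j)
        i = j

def condense(X, y, cutoff=2):
    """Keep only patterns appearing at depth >= cutoff in some chain, i.e.,
    patterns near the class boundary (simplified reading of the method)."""
    keep = set()
    for start in range(len(X)):
        for depth, idx in enumerate(nn_chain(start, X, y)):
            if depth >= cutoff:
                keep.add(int(idx))
    return sorted(keep)
```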