Showing papers in "IEEE Transactions on Neural Networks in 2007"


Journal ArticleDOI
TL;DR: A traffic classifier that can achieve a high accuracy across a range of application types without any source or destination host-address or port information is presented, using supervised machine learning based on a Bayesian trained neural network.
Abstract: Internet traffic identification is an important tool for network management. It allows operators to better predict future traffic matrices and demands, security personnel to detect anomalous behavior, and researchers to develop more realistic traffic models. We present here a traffic classifier that can achieve a high accuracy across a range of application types without any source or destination host-address or port information. We use supervised machine learning based on a Bayesian trained neural network. Though our technique uses training data with categories derived from packet content, training and testing were done using features derived from packet streams consisting of one or more packet headers. By providing classification without access to the contents of packets, our technique offers wider application than methods that require full packet/payloads for classification. This is a powerful advantage, using samples of classified traffic to permit the categorization of traffic based only upon commonly available information

514 citations


Journal ArticleDOI
TL;DR: A new method is proposed for stability analysis of neural networks (NNs) with a time-varying delay by considering the additional useful terms, which were ignored in previous methods, when estimating the upper bound of the derivative of Lyapunov functionals and introducing the new free-weighting matrices.
Abstract: In this letter, a new method is proposed for stability analysis of neural networks (NNs) with a time-varying delay. Some less conservative delay-dependent stability criteria are established by considering the additional useful terms, which were ignored in previous methods, when estimating the upper bound of the derivative of Lyapunov functionals and introducing the new free-weighting matrices. Numerical examples are given to demonstrate the effectiveness and the benefits of the proposed method

503 citations


Journal ArticleDOI
TL;DR: This paper proposes slight modifications of the existing multiplicative updates for nonnegative matrix factorization (NMF) and proves their convergence; the techniques may also be applied to prove convergence for other bound-constrained optimization problems.
Abstract: Nonnegative matrix factorization (NMF) is useful to find basis information of nonnegative data. Currently, multiplicative updates are a simple and popular way to find the factorization. However, for the common NMF approach of minimizing the Euclidean distance between approximate and true values, no proof has shown that multiplicative updates converge to a stationary point of the NMF optimization problem. Stationarity is important as it is a necessary condition of a local minimum. This paper discusses the difficulty of proving the convergence. We propose slight modifications of existing updates and prove their convergence. Techniques invented in this paper may be applied to prove the convergence for other bound-constrained optimization problems.

440 citations
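
As a rough illustration of the multiplicative updates discussed in the abstract above, the following sketch implements the standard, unmodified Euclidean-distance updates in plain Python. It is not the modified scheme the paper proposes; the function names, the small `eps` guard in the denominators, and the random initialization are assumptions made for this example.

```python
import random

def matmul(A, B):
    # Plain list-of-lists matrix product.
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def frob_error(V, W, H):
    # Squared Euclidean (Frobenius) distance between V and W*H.
    WH = matmul(W, H)
    return sum((v - x) ** 2 for rv, rx in zip(V, WH) for v, x in zip(rv, rx))

def nmf_multiplicative(V, rank, iters=200, eps=1e-9, seed=0):
    # Standard multiplicative updates; eps (an assumption here) avoids division by zero.
    rng = random.Random(seed)
    n, m = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    H = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * a / (d + eps) for h, a, d in zip(rh, ra, rd)]
             for rh, ra, rd in zip(H, num, den)]
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * a / (d + eps) for w, a, d in zip(rw, ra, rd)]
             for rw, ra, rd in zip(W, num, den)]
    return W, H
```

Calling `nmf_multiplicative(V, rank=2)` on a small nonnegative matrix `V` returns nonnegative factors `W` and `H`; `frob_error(V, W, H)` reports how closely their product approximates `V`.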


Journal ArticleDOI
TL;DR: Some improved delay/interval-dependent stability criteria for NNs with time-varying interval delay are proposed, and numerical examples are given to demonstrate the effectiveness and the merits of the proposed method.
Abstract: This letter is concerned with the stability analysis of neural networks (NNs) with time-varying interval delay. The relationship between the time-varying delay and its lower and upper bounds is taken into account when estimating the upper bound of the derivative of Lyapunov functional. As a result, some improved delay/interval-dependent stability criteria for NNs with time-varying interval delay are proposed. Numerical examples are given to demonstrate the effectiveness and the merits of the proposed method.

318 citations


Journal ArticleDOI
TL;DR: Simulation results using the Massachusetts Institute of Technology/Beth Israel Hospital (MIT-BIH) arrhythmia database demonstrate high average detection accuracies of ventricular ectopic beat and supraventricular ectopic beat patterns for heartbeat monitoring, being a significant improvement over previously reported electrocardiogram (ECG) classification results.
Abstract: This paper presents evolvable block-based neural networks (BbNNs) for personalized ECG heartbeat pattern classification. A BbNN consists of a 2-D array of modular component NNs with flexible structures and internal configurations that can be implemented using reconfigurable digital hardware such as field-programmable gate arrays (FPGAs). Signal flow between the blocks determines the internal configuration of a block as well as the overall structure of the BbNN. Network structure and the weights are optimized using local gradient-based search and evolutionary operators with the rates changing adaptively according to their effectiveness in the previous evolution period. Such adaptive operator rate update scheme ensures higher fitness on average compared to predetermined fixed operator rates. The Hermite transform coefficients and the time interval between two neighboring R-peaks of ECG signals are used as inputs to the BbNN. A BbNN optimized with the proposed evolutionary algorithm (EA) makes a personalized heartbeat pattern classifier that copes with changing operating environments caused by individual difference and time-varying characteristics of ECG signals. Simulation results using the Massachusetts Institute of Technology/Beth Israel Hospital (MIT-BIH) arrhythmia database demonstrate high average detection accuracies of ventricular ectopic beats (98.1%) and supraventricular ectopic beats (96.6%) patterns for heartbeat monitoring, being a significant improvement over previously reported electrocardiogram (ECG) classification results.

316 citations


Journal ArticleDOI
TL;DR: This paper studies the RSVM from the viewpoint of sampling design, its robustness, and the spectral analysis of the reduced kernel, which indicates that the approximation kernels can retain most of the relevant information for learning tasks in the full kernel.
Abstract: In dealing with large data sets, the reduced support vector machine (RSVM) was proposed for the practical objective to overcome some computational difficulties as well as to reduce the model complexity. In this paper, we study the RSVM from the viewpoint of sampling design, its robustness, and the spectral analysis of the reduced kernel. We consider the nonlinear separating surface as a mixture of kernels. Instead of a full model, the RSVM uses a reduced mixture with kernels sampled from certain candidate set. Our main results center on two major themes. One is the robustness of the random subset mixture model. The other is the spectral analysis of the reduced kernel. The robustness is judged by a few criteria as follows: 1) model variation measure; 2) model bias (deviation) between the reduced model and the full model; and 3) test power in distinguishing the reduced model from the full one. For the spectral analysis, we compare the eigenstructures of the full kernel matrix and the approximation kernel matrix. The approximation kernels are generated by uniform random subsets. The small discrepancies between them indicate that the approximation kernels can retain most of the relevant information for learning tasks in the full kernel. We focus on some statistical theory of the reduced set method mainly in the context of the RSVM. The use of a uniform random subset is not limited to the RSVM. This approach can act as a supplemental algorithm on top of a basic optimization algorithm, wherein the actual optimization takes place on the subset-approximated data. The statistical properties discussed in this paper are still valid

280 citations
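
To make the idea of a reduced kernel built from a uniform random subset more concrete, here is a minimal sketch in plain Python. It only constructs the rectangular kernel matrix between all samples and a randomly chosen subset; the Gaussian kernel, the `gamma` parameter, and all function names are assumptions for illustration, and no SVM training step is shown.

```python
import math
import random

def gaussian_kernel(x, z, gamma=1.0):
    # Gaussian (RBF) kernel between two feature vectors.
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def reduced_kernel(X, subset_size, gamma=1.0, seed=0):
    # Draw a uniform random subset of row indices without replacement.
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(X)), subset_size))
    # Rectangular kernel: every sample against the subset only,
    # giving len(X) x subset_size entries instead of len(X) x len(X).
    K = [[gaussian_kernel(x, X[j], gamma) for j in idx] for x in X]
    return K, idx
```

For `n` samples and a subset of size `m`, the matrix has `n * m` entries, which is where the savings over the full `n * n` kernel come from.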


Journal ArticleDOI
TL;DR: The KLSPI algorithm provides a general RL method with generalization performance and convergence guarantee for large-scale Markov decision problems (MDPs) and can be applied to online learning control by incorporating an initial controller to ensure online performance.
Abstract: In this paper, we present a kernel-based least squares policy iteration (KLSPI) algorithm for reinforcement learning (RL) in large or continuous state spaces, which can be used to realize adaptive feedback control of uncertain dynamic systems. By using KLSPI, near-optimal control policies can be obtained without much a priori knowledge on dynamic models of control plants. In KLSPI, Mercer kernels are used in the policy evaluation of a policy iteration process, where a new kernel-based least squares temporal-difference algorithm called KLSTD-Q is proposed for efficient policy evaluation. To keep the sparsity and improve the generalization ability of KLSTD-Q solutions, a kernel sparsification procedure based on approximate linear dependency (ALD) is performed. Compared to the previous works on approximate RL methods, KLSPI makes two progresses to eliminate the main difficulties of existing results. One is the better convergence and (near) optimality guarantee by using the KLSTD-Q algorithm for policy evaluation with high precision. The other is the automatic feature selection using the ALD-based kernel sparsification. Therefore, the KLSPI algorithm provides a general RL method with generalization performance and convergence guarantee for large-scale Markov decision problems (MDPs). Experimental results on a typical RL task for a stochastic chain problem demonstrate that KLSPI can consistently achieve better learning efficiency and policy quality than the previous least squares policy iteration (LSPI) algorithm. Furthermore, the KLSPI method was also evaluated on two nonlinear feedback control problems, including a ship heading control problem and the swing up control of a double-link underactuated pendulum called acrobot. Simulation results illustrate that the proposed method can optimize controller performance using little a priori information of uncertain dynamic systems. It is also demonstrated that KLSPI can be applied to online learning control by incorporating an initial controller to ensure online performance.

279 citations


Journal ArticleDOI
TL;DR: The silicon neuron circuits are described, experimental data characterizing the 3 mm × 3 mm chip fabricated in 0.5-µm complementary metal-oxide-semiconductor (CMOS) technology is presented, and its utility is demonstrated by configuring the hardware to emulate a model of attractor dynamics and waves of neural activity during sleep in rat hippocampus.
Abstract: A mixed-signal very large scale integration (VLSI) chip for large scale emulation of spiking neural networks is presented. The chip contains 2400 silicon neurons with fully programmable and reconfigurable synaptic connectivity. Each neuron implements a discrete-time model of a single-compartment cell. The model allows for analog membrane dynamics and an arbitrary number of synaptic connections, each with tunable conductance and reversal potential. The array of silicon neurons functions as an address-event (AE) transceiver, with incoming and outgoing spikes communicated over an asynchronous event-driven digital bus. Address encoding and conflict resolution of spiking events are implemented via a randomized arbitration scheme that ensures balanced servicing of event requests across the array. Routing of events is implemented externally using dynamically programmable random-access memory that stores a postsynaptic address, the conductance, and the reversal potential of each synaptic connection. Here, we describe the silicon neuron circuits, present experimental data characterizing the 3 mm × 3 mm chip fabricated in 0.5-µm complementary metal-oxide-semiconductor (CMOS) technology, and demonstrate its utility by configuring the hardware to emulate a model of attractor dynamics and waves of neural activity during sleep in rat hippocampus.

261 citations


Journal ArticleDOI
TL;DR: A novel chaotic time-series prediction method based on support vector machines (SVMs) and echo-state mechanisms is proposed, and its generalization ability and robustness are obtained by regularization operator and robust loss function.
Abstract: A novel chaotic time-series prediction method based on support vector machines (SVMs) and echo-state mechanisms is proposed. The basic idea is replacing "kernel trick" with "reservoir trick" in dealing with nonlinearity, that is, performing linear support vector regression (SVR) in the high-dimension "reservoir" state space, and the solution benefits from the advantages from structural risk minimization principle, and we call it support vector echo-state machines (SVESMs). SVESMs belong to a special kind of recurrent neural networks (RNNs) with convex objective function, and their solution is global, optimal, and unique. SVESMs are especially efficient in dealing with real life nonlinear time series, and its generalization ability and robustness are obtained by regularization operator and robust loss function. The method is tested on the benchmark prediction problem of Mackey-Glass time series and applied to some real life time series such as monthly sunspots time series and runoff time series of the Yellow River, and the prediction results are promising

244 citations


Journal ArticleDOI
TL;DR: The sequential processing of the layers in an NN has been exploited in this paper to implement large NNs using a method of layer multiplexing, so that a larger NN can be realized on a single chip at a lower cost.
Abstract: This paper presents a hardware implementation of multilayer feedforward neural networks (NN) using reconfigurable field-programmable gate arrays (FPGAs). Despite improvements in FPGA densities, the numerous multipliers in an NN limit the size of the network that can be implemented using a single FPGA, thus making NN applications not viable commercially. The proposed implementation is aimed at reducing resource requirement, without much compromise on the speed, so that a larger NN can be realized on a single chip at a lower cost. The sequential processing of the layers in an NN has been exploited in this paper to implement large NNs using a method of layer multiplexing. Instead of realizing a complete network, only the single largest layer is implemented. The same layer behaves as different layers with the help of a control block. The control block ensures proper functioning by assigning the appropriate inputs, weights, biases, and excitation function of the layer that is currently being computed. Multilayer networks have been implemented using Xilinx FPGA "XCV400hq240." The concept used is shown to be very effective in reducing resource requirements at the cost of a moderate overhead on speed. This implementation is proposed to make NN applications viable in terms of cost and speed for online applications. An NN-based flux estimator is implemented in FPGA and the results obtained are presented

205 citations
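
The layer-multiplexing idea in the abstract above can be mimicked in software, purely as an analogy: one shared layer routine is reused for every layer, while a control loop hands it the inputs, weights, biases, and activation of the layer currently being computed. The sketch below is not the paper's FPGA design; the function names and the configuration tuple layout are assumptions for illustration.

```python
def compute_layer(inputs, weights, biases, activation):
    # The single shared "hardware" layer: one weighted sum per neuron.
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def multiplexed_forward(x, layer_configs):
    # Plays the role of the control block: for each layer in turn, select that
    # layer's weights, biases, and activation and feed them to the shared unit.
    signal = x
    for weights, biases, activation in layer_configs:
        signal = compute_layer(signal, weights, biases, activation)
    return signal
```

Each entry of `layer_configs` is a `(weights, biases, activation)` tuple, so a deeper network only adds entries to the list rather than new compute routines, at the cost of processing layers one after another.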


Journal ArticleDOI
TL;DR: Experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy and can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained.
Abstract: This paper presents a new classifier called total margin-based adaptive fuzzy support vector machines (TAF-SVM) that deals with several problems that may occur in support vector machines (SVMs) when applied to the face recognition. The proposed TAF-SVM not only solves the overfitting problem resulted from the outlier with the approach of fuzzification of the penalty, but also corrects the skew of the optimal separating hyperplane due to the very imbalanced data sets by using different cost algorithm. In addition, by introducing the total margin algorithm to replace the conventional soft margin algorithm, a lower generalization error bound can be obtained. Those three functions are embodied into the traditional SVM so that the TAF-SVM is proposed and reformulated in both linear and nonlinear cases. By using two databases, the Chung Yuan Christian University (CYCU) multiview and the facial recognition technology (FERET) face databases, and using the kernel Fisher's discriminant analysis (KFDA) algorithm to extract discriminating face features, experimental results show that the proposed TAF-SVM is superior to SVM in terms of the face-recognition accuracy. The results also indicate that the proposed TAF-SVM can achieve smaller error variances than SVM over a number of tests such that better recognition stability can be obtained

Journal ArticleDOI
TL;DR: A neural network architecture is proposed to form an unsupervised Bayesian classifier for this application domain that efficiently handles the segmentation in natural-scene sequences with complex background motion and changes in illumination.
Abstract: This paper presents a novel background modeling and subtraction approach for video object segmentation. A neural network (NN) architecture is proposed to form an unsupervised Bayesian classifier for this application domain. The constructed classifier efficiently handles the segmentation in natural-scene sequences with complex background motion and changes in illumination. The weights of the proposed NN serve as a model of the background and are temporally updated to reflect the observed statistics of background. The segmentation performance of the proposed NN is qualitatively and quantitatively examined and compared to two extant probabilistic object segmentation algorithms, based on a previously published test pool containing diverse surveillance-related sequences. The proposed algorithm is parallelized on a subpixel level and designed to enable efficient hardware implementation.

Journal ArticleDOI
TL;DR: A novel nonlinear neural network (NN) predictive control strategy based on the new tent-map chaotic particle swarm optimization (TCPSO) is presented, in which the TCPSO performs the nonlinear optimization to enhance convergence and accuracy.
Abstract: In this letter, a novel nonlinear neural network (NN) predictive control strategy based on the new tent-map chaotic particle swarm optimization (TCPSO) is presented. The TCPSO incorporating tent-map chaos, which can avoid trapping to local minima and improve the searching performance of standard particle swarm optimization (PSO), is applied to perform the nonlinear optimization to enhance the convergence and accuracy. Numerical simulations of two benchmark functions are used to test the performance of TCPSO. Furthermore, simulation on a nonlinear plant is given to illustrate the effectiveness of the proposed control scheme
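
For readers unfamiliar with the tent map mentioned above, the sketch below shows the map itself and one plausible way to turn its orbit into candidate positions inside a search interval. The abstract does not say how TCPSO injects the chaotic sequence into PSO, so the scaling step, the parameter `mu`, and all function names are assumptions for illustration only.

```python
def tent_map(x, mu=2.0):
    # One step of the tent map on the unit interval [0, 1].
    return mu * x if x < 0.5 else mu * (1.0 - x)

def chaotic_sequence(x0, length, mu=2.0):
    # Iterate the tent map starting from x0 and collect the orbit.
    xs, x = [], x0
    for _ in range(length):
        x = tent_map(x, mu)
        xs.append(x)
    return xs

def scale_to_bounds(xs, lower, upper):
    # Map values from [0, 1] onto a search interval [lower, upper]
    # (an assumed use, e.g. for spreading candidate positions).
    return [lower + (upper - lower) * x for x in xs]
```

For example, `scale_to_bounds(chaotic_sequence(0.37, 10), -5.0, 5.0)` yields ten candidate values in the interval from -5 to 5.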

Journal ArticleDOI
TL;DR: Two fast sparse approximation schemes for least squares support vector machine (LS-SVM) are presented to overcome the limitation of LS-SVM that it is not applicable to large data sets and to improve test speed.
Abstract: In this paper, we present two fast sparse approximation schemes for least squares support vector machine (LS-SVM), named FSALS-SVM and PFSALS-SVM, to overcome the limitation of LS-SVM that it is not applicable to large data sets and to improve test speed. FSALS-SVM iteratively builds the decision function by adding one basis function from a kernel-based dictionary at one time. The process is terminated by using a flexible and stable epsilon insensitive stopping criterion. A probabilistic speedup scheme is employed to further improve the speed of FSALS-SVM and the resulting classifier is named PFSALS-SVM. Our algorithms are of two compelling features: low complexity and sparse solution. Experiments on benchmark data sets show that our algorithms obtain sparse classifiers at a rather low cost without sacrificing the generalization performance

Journal ArticleDOI
TL;DR: The neurogenetic hybrid showed notable improvement on the average over the buy-and-hold strategy and the context-based ensemble further improved the results, which implies that the proposed neurogenetic hybrid can be used for financial portfolio construction.
Abstract: In this paper, we propose a hybrid neurogenetic system for stock trading. A recurrent neural network (NN) having one hidden layer is used for the prediction model. The input features are generated from a number of technical indicators being used by financial experts. The genetic algorithm (GA) optimizes the NN's weights under a 2-D encoding and crossover. We devised a context-based ensemble method of NNs which dynamically changes on the basis of the test day's context. To reduce the time in processing mass data, we parallelized the GA on a Linux cluster system using message passing interface. We tested the proposed method with 36 companies in NYSE and NASDAQ for 13 years from 1992 to 2004. The neurogenetic hybrid showed notable improvement on the average over the buy-and-hold strategy and the context-based ensemble further improved the results. We also observed that some companies were more predictable than others, which implies that the proposed neurogenetic hybrid can be used for financial portfolio construction

Journal ArticleDOI
TL;DR: A localized generalization error model is proposed which bounds from above the generalization error within a neighborhood of the training samples using stochastic sensitivity measure and is used to develop an architecture selection technique for a classifier with maximal coverage of unseen samples by specifying a generalization error threshold.
Abstract: The generalization error bounds found by current error models using the number of effective parameters of a classifier and the number of training samples are usually very loose. These bounds are intended for the entire input space. However, support vector machine (SVM), radial basis function neural network (RBFNN), and multilayer perceptron neural network (MLPNN) are local learning machines for solving problems and treat unseen samples near the training samples to be more important. In this paper, we propose a localized generalization error model which bounds from above the generalization error within a neighborhood of the training samples using stochastic sensitivity measure. It is then used to develop an architecture selection technique for a classifier with maximal coverage of unseen samples by specifying a generalization error threshold. Experiments using 17 University of California at Irvine (UCI) data sets show that, in comparison with cross validation (CV), sequential learning, and two other ad hoc methods, our technique consistently yields the best testing classification accuracy with fewer hidden neurons and less training time.

Journal ArticleDOI
TL;DR: A bidirectional associative memory NN with four neurons and multiple delays is considered and analysis of its linear stability and Hopf bifurcation is performed by applying the normal form theory and the center manifold theorem.
Abstract: Various local periodic solutions may represent different classes of storage patterns or memory patterns, and arise from the different equilibrium points of neural networks (NNs) by applying Hopf bifurcation technique. In this paper, a bidirectional associative memory NN with four neurons and multiple delays is considered. By applying the normal form theory and the center manifold theorem, analysis of its linear stability and Hopf bifurcation is performed. An algorithm is worked out for determining the direction and stability of the bifurcated periodic solutions. Numerical simulation results supporting the theoretical analysis are also given

Journal ArticleDOI
TL;DR: A kernel classifier construction algorithm using orthogonal forward selection (OFS) is presented to optimize the model generalization for imbalanced two-class data sets; it achieves minimal computational expense via a set of forward recursive updating formulas when searching for model terms with maximal incremental leave-one-out area under curve (LOO-AUC) value.
Abstract: Many kernel classifier construction algorithms adopt classification accuracy as performance metrics in model evaluation. Moreover, equal weighting is often applied to each data sample in parameter estimation. These modeling practices often become problematic if the data sets are imbalanced. We present a kernel classifier construction algorithm using orthogonal forward selection (OFS) in order to optimize the model generalization for imbalanced two-class data sets. This kernel classifier identification algorithm is based on a new regularized orthogonal weighted least squares (ROWLS) estimator and the model selection criterion of maximal leave-one-out area under curve (LOO-AUC) of the receiver operating characteristics (ROCs). It is shown that, owing to the orthogonalization procedure, the LOO-AUC can be calculated via an analytic formula based on the new regularized orthogonal weighted least squares parameter estimator, without actually splitting the estimation data set. The proposed algorithm can achieve minimal computational expense via a set of forward recursive updating formula in searching model terms with maximal incremental LOO-AUC value. Numerical examples are used to demonstrate the efficacy of the algorithm
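
Since the abstract above uses area under the ROC curve rather than accuracy as its selection criterion, a small sketch of how a plain AUC can be computed may help. This is the ordinary pairwise-ranking AUC on a fixed set of scores, not the paper's analytic leave-one-out formula; the function name and the label convention (1 for the positive class) are assumptions for illustration.

```python
def auc(scores, labels):
    # Split scores by class; label 1 marks the positive class (assumed convention).
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y != 1]
    if not pos or not neg:
        raise ValueError("AUC needs at least one sample from each class")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Because every positive sample is compared against every negative one, the value does not depend on the class proportions, which is why it is a common choice for imbalanced two-class data.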

Journal ArticleDOI
TL;DR: The problem of robust output tracking control for a class of time-delay nonlinear systems in the form of triangular structure with unmodeled dynamics is considered and an observer whose gain matrix is scheduled via linear matrix inequality approach is constructed.
Abstract: In this paper, the problem of robust output tracking control for a class of time-delay nonlinear systems is considered. The systems are in the form of triangular structure with unmodeled dynamics. First, we construct an observer whose gain matrix is scheduled via linear matrix inequality approach. For the case that the information of uncertainties bounds is not completely available, we design an observer-based neural network (NN) controller by employing the backstepping method. The resulting closed-loop system is ensured to be stable in the sense of semiglobal boundedness with the help of changing supplying function idea. The observer and the controller designed are both independent of the time delays. Finally, numerical simulations are conducted to verify the effectiveness of the main theoretic results obtained

Journal ArticleDOI
TL;DR: A completely dynamical approach is proposed, in which the problem of dynamical pattern recognition is turned into the stability and convergence of a recognition error system.
Abstract: Recognition of temporal/dynamical patterns is among the most difficult pattern recognition tasks. In this paper, based on a recent result on deterministic learning theory, a deterministic framework is proposed for rapid recognition of dynamical patterns. First, it is shown that a time-varying dynamical pattern can be effectively represented in a time-invariant and spatially distributed manner through deterministic learning. Second, a definition for characterizing similarity of dynamical patterns is given based on system dynamics inherently within dynamical patterns. Third, a mechanism for rapid recognition of dynamical patterns is presented, by which a test dynamical pattern is recognized as similar to a training dynamical pattern if state synchronization is achieved according to a kind of internal and dynamical matching on system dynamics. The synchronization errors can be taken as the measure of similarity between the test and training patterns. The significance of the paper is that a completely dynamical approach is proposed, in which the problem of dynamical pattern recognition is turned into the stability and convergence of a recognition error system. Simulation studies are included to demonstrate the effectiveness of the proposed approach

Journal ArticleDOI
TL;DR: A new criterion for the global asymptotic stability of the equilibrium point of cellular neural networks with multiple time delays is presented and possesses the structure of a linear matrix inequality.
Abstract: A new criterion for the global asymptotic stability of the equilibrium point of cellular neural networks with multiple time delays is presented. The obtained result possesses the structure of a linear matrix inequality and can be solved efficiently using the recently developed interior-point algorithm. A numerical example is used to show the effectiveness of the obtained result

Journal ArticleDOI
TL;DR: This letter presents an adaptive synchronization scheme between two different kinds of delayed chaotic neural networks (NNs) with partly unknown parameters to guarantee the global asymptotic synchronization of state trajectories for two different chaotic NNs with time delay.
Abstract: This letter presents an adaptive synchronization scheme between two different kinds of delayed chaotic neural networks (NNs) with partly unknown parameters. An adaptive controller is designed to guarantee the global asymptotic synchronization of state trajectories for two different chaotic NNs with time delay. An illustrative example is given to demonstrate the effectiveness of the present method.

Journal ArticleDOI
TL;DR: The results show that an MLP-BP network uses fewer clock cycles and consumes less real estate when compiled in an FXP format, compared with a larger and slower functioning compilation in an FLP format with similar data representation width, in bits, or a similar precision and range.
Abstract: In this paper, arithmetic representations for implementing multilayer perceptrons trained using the error backpropagation algorithm (MLP-BP) neural networks on field-programmable gate arrays (FPGAs) are examined in detail. Both floating-point (FLP) and fixed-point (FXP) formats are studied and the effect of precision of representation and FPGA area requirements are considered. A generic very high-speed integrated circuit hardware description language (VHDL) program was developed to help experiment with a large number of formats and designs. The results show that an MLP-BP network uses less clock cycles and consumes less real estate when compiled in an FXP format, compared with a larger and slower functioning compilation in an FLP format with similar data representation width, in bits, or a similar precision and range
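
To illustrate the precision trade-off behind fixed-point (FXP) formats, the sketch below quantizes a real value to a signed fixed-point integer with saturation and converts it back. It is a generic software model, not the paper's VHDL design; the function names, the default 16-bit word, and the saturation behavior are assumptions for illustration.

```python
def to_fixed(x, frac_bits, total_bits=16):
    # Scale by 2**frac_bits, round to the nearest integer, and saturate
    # to the range of a signed word with total_bits bits.
    scale = 1 << frac_bits
    lo = -(1 << (total_bits - 1))
    hi = (1 << (total_bits - 1)) - 1
    q = int(round(x * scale))
    return max(lo, min(hi, q))

def from_fixed(q, frac_bits):
    # Convert the stored integer back to a real value.
    return q / (1 << frac_bits)
```

With `frac_bits` fractional bits, an in-range value is recovered to within half a step, that is `2 ** -(frac_bits + 1)`, while values outside the representable range are clipped; more fractional bits buy precision at the cost of range for a fixed word width.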

Journal ArticleDOI
TL;DR: A neural network (NN) approach to forecasting quarterly time series with a large data set from the M3 forecasting competition is presented and results indicate that simpler models, in general, outperform more complex models.
Abstract: Forecasting of time series that have seasonal and other variations remains an important problem for forecasters. This paper presents a neural network (NN) approach to forecasting quarterly time series. With a large data set of 756 quarterly time series from the M3 forecasting competition, we conduct a comprehensive investigation of the effectiveness of several data preprocessing and modeling approaches. We consider two data preprocessing methods and 48 NN models with different possible combinations of lagged observations, seasonal dummy variables, trigonometric variables, and time index as inputs to the NN. Both parametric and nonparametric statistical analyses are performed to identify the best models under different circumstances and categorize similar models. Results indicate that simpler models, in general, outperform more complex models. In addition, data preprocessing especially with deseasonalization and detrending is very helpful in improving NN performance. Practical guidelines are also provided.
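
The input combinations mentioned above (lagged observations plus seasonal dummy variables) can be sketched as a small feature-building routine. The exact encodings used in the paper are not given in the abstract, so the choice of three dummies with the fourth quarter as baseline, the `first_quarter` parameter, and the function name are assumptions for illustration.

```python
def make_quarterly_inputs(series, n_lags, first_quarter=1):
    # Build (inputs, target) pairs for one-step-ahead forecasting.
    # inputs = the n_lags previous observations + 3 seasonal dummies.
    rows = []
    for t in range(n_lags, len(series)):
        lags = list(series[t - n_lags:t])
        quarter = (first_quarter - 1 + t) % 4  # 0 = Q1, ..., 3 = Q4
        # Q4 is the baseline: it is encoded as all zeros (assumed convention).
        dummies = [1 if quarter == k else 0 for k in range(3)]
        rows.append((lags + dummies, series[t]))
    return rows
```

Each returned pair can be fed to any regression model; dropping the dummies and deseasonalizing the series beforehand would be the alternative preprocessing route the abstract alludes to.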

Journal ArticleDOI
TL;DR: In this paper, a high-order neural network (HONN) structure is used to approximate a control law designed by the backstepping technique, applied to a block strict feedback form (BSFF).
Abstract: This paper deals with adaptive tracking for discrete-time multiple-input-multiple-output (MIMO) nonlinear systems in presence of bounded disturbances. In this paper, a high-order neural network (HONN) structure is used to approximate a control law designed by the backstepping technique, applied to a block strict feedback form (BSFF). This paper also includes the respective stability analysis, on the basis of the Lyapunov approach, for the whole controlled system, including the extended Kalman filter (EKF)-based NN learning algorithm. Applicability of the scheme is illustrated via simulation for a discrete-time nonlinear model of an electric induction motor.

Journal ArticleDOI
TL;DR: Simulation results show that the SAFNC can achieve favorable tracking performances and all the parameter learning algorithms are derived based on Lyapunov function candidate, thus the system stability can be guaranteed.
Abstract: This paper proposes a self-organizing adaptive fuzzy neural control (SAFNC) via sliding-mode approach for a class of nonlinear systems. The proposed SAFNC system comprises a computation controller and a supervisory controller. The computation controller, which includes a self-organizing fuzzy neural network (SOFNN) identifier, is the principal controller. The SOFNN identifier estimates the controlled system dynamics online, with the structure and parameter learning phases of the fuzzy neural network (FNN) running simultaneously. The structure learning phase can generate and eliminate fuzzy rules online to achieve an optimal neural structure, and the parameter learning phase adjusts the interconnection weights of the neural network to achieve favorable approximation performance. The supervisory controller is used to achieve the L2-norm bound tracking performance with a desired attenuation level. Moreover, all the parameter learning algorithms are derived from a Lyapunov function candidate, so the system stability can be guaranteed. Finally, simulation results show that the SAFNC can achieve favorable tracking performance.

Journal ArticleDOI
TL;DR: It is demonstrated that the BPTT algorithm is more efficient for gradient calculations, but the RTRL algorithm is more efficient for Jacobian calculations.
Abstract: This paper introduces a general framework for describing dynamic neural networks: the layered digital dynamic network (LDDN). This framework allows the development of two general algorithms for computing the gradients and Jacobians for these dynamic networks: backpropagation-through-time (BPTT) and real-time recurrent learning (RTRL). The structure of the LDDN framework enables an efficient implementation of both algorithms for arbitrary dynamic networks. This paper demonstrates that the BPTT algorithm is more efficient for gradient calculations, but the RTRL algorithm is more efficient for Jacobian calculations.

Journal ArticleDOI
TL;DR: Three new functions for ranking candidate models are proposed, constructed by symmetrizing the Kullback-Leibler divergence between the true model and the approximating candidate model; the original AIC criterion is found to be an asymptotically unbiased estimator of these three functions.
Abstract: The Akaike information criterion (AIC) is a widely used tool for model selection. AIC is derived as an asymptotically unbiased estimator of a function used for ranking candidate models, which is a variant of the Kullback-Leibler divergence between the true model and the approximating candidate model. Despite the computational and theoretical advantages of the Kullback-Leibler divergence, its lack of symmetry can become inconvenient in model selection applications. Simple examples can show that reversing the role of the arguments in the Kullback-Leibler divergence can yield substantially different results. In this paper, three new functions for ranking candidate models are proposed. These functions are constructed by symmetrizing the Kullback-Leibler divergence between the true model and the approximating candidate model. The operations used for symmetrizing are the average, geometric, and harmonic means. It is found that the original AIC criterion is an asymptotically unbiased estimator of these three different functions. Using one of these proposed ranking functions, an example of a new bias correction to AIC is derived for univariate linear regression models. A simulation study based on polynomial regression is provided to compare the different proposed ranking functions with AIC and the newly derived correction with AICc.
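
To make the asymmetry concrete, here is a small sketch for discrete distributions. It assumes that "symmetrizing" means taking the arithmetic, geometric, and harmonic means of the two directed divergences KL(p‖q) and KL(q‖p). The paper's exact definitions are not given in the abstract and may differ, and the names `kl` and `symmetrized` are illustrative only.

```python
import math


def kl(p, q):
    """Directed Kullback-Leibler divergence KL(p || q) for discrete distributions.

    Assumes q[i] > 0 wherever p[i] > 0; terms with p[i] == 0 contribute nothing.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def symmetrized(p, q):
    """Three symmetric combinations of KL(p || q) and KL(q || p) (assumed forms)."""
    a, b = kl(p, q), kl(q, p)
    return {
        "arithmetic": (a + b) / 2,
        "geometric": math.sqrt(a * b),
        "harmonic": 2 * a * b / (a + b) if a + b > 0 else 0.0,
    }


p = [0.9, 0.1]
q = [0.5, 0.5]

forward = kl(p, q)   # divergence of q from p
reverse = kl(q, p)   # swapping the arguments gives a different value
sym = symmetrized(p, q)
```

Each of the three combinations is unchanged when `p` and `q` are swapped, which is the property the directed divergence lacks.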

Journal ArticleDOI
TL;DR: This paper applies PyraNet to determine gender from a facial image, and compares its performance on the standard facial recognition technology (FERET) database with three classifiers: The convolutional neural network (NN), the k-nearest neighbor (k-NN), and the support vector machine (SVM).
Abstract: In this paper, we propose a new neural architecture for classification of visual patterns that is motivated by the two concepts of image pyramids and local receptive fields. The new architecture, called pyramidal neural network (PyraNet), has a hierarchical structure with two types of processing layers: pyramidal layers and one-dimensional (1-D) layers. In the new network, nonlinear two-dimensional (2-D) neurons are trained to perform both image feature extraction and dimensionality reduction. We present and analyze five training methods for PyraNet [gradient descent (GD), gradient descent with momentum, resilient backpropagation (RPROP), Polak-Ribiere conjugate gradient (CG), and Levenberg-Marquardt (LM)] and two choices of error functions [mean-square-error (mse) and cross-entropy (CE)]. In this paper, we apply PyraNet to determine gender from a facial image, and compare its performance on the standard facial recognition technology (FERET) database with three classifiers: the convolutional neural network (NN), the k-nearest neighbor (k-NN), and the support vector machine (SVM).

Journal ArticleDOI
TL;DR: A continuous-time formulation of an adaptive critic design (ACD) is investigated, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are prevalent and second-order actor adaptation using Newton's method is established for fast actor convergence for a general plant and critic.
Abstract: A continuous-time formulation of an adaptive critic design (ACD) is investigated. Connections to the discrete case are made, where backpropagation through time (BPTT) and real-time recurrent learning (RTRL) are prevalent. Practical benefits are that this framework fits in well with plant descriptions given by differential equations and that any standard integration routine with adaptive step-size does an adaptive sampling for free. A second-order actor adaptation using Newton's method is established for fast actor convergence for a general plant and critic. Also, a fast critic update for concurrent actor-critic training is introduced to immediately apply necessary adjustments of critic parameters induced by actor updates to keep the Bellman optimality correct to first-order approximation after actor changes. Thus, critic and actor updates may be performed at the same time until some substantial error builds up in the Bellman optimality or temporal difference equation, at which point a traditional critic training needs to be performed, after which another interval of concurrent actor-critic training may resume.