
Showing papers in "IEEE Transactions on Neural Networks in 2004"


Journal ArticleDOI
TL;DR: The biological plausibility and computational efficiency of some of the most useful models of spiking and bursting neurons are discussed and their applicability to large-scale simulations of cortical neural networks is compared.
Abstract: We discuss the biological plausibility and computational efficiency of some of the most useful models of spiking and bursting neurons. We compare their applicability to large-scale simulations of cortical neural networks.
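Among the models compared is Izhikevich's own two-variable "simple model," often cited as a good trade-off between biological plausibility and computational cost. A minimal Euler simulation is sketched below; the regular-spiking parameters are standard, but the input current, time step, and duration are illustrative choices, not values from the paper:

```python
import numpy as np

def izhikevich(I=10.0, T=1000.0, dt=0.5, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Euler simulation of the Izhikevich simple model (regular-spiking parameters)."""
    v, u = c, b * c                      # membrane potential and recovery variable
    spike_times = []
    for step in range(int(T / dt)):
        if v >= 30.0:                    # spike cutoff: reset v, bump u
            spike_times.append(step * dt)
            v, u = c, u + d
        v += dt * (0.04 * v * v + 5.0 * v + 140.0 - u + I)
        u += dt * a * (b * v - u)
    return np.array(spike_times)

spike_times = izhikevich()               # tonic spiking under constant input
```

With these parameters the neuron fires tonically; varying (a, b, c, d) reproduces the bursting and chattering regimes the paper discusses.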

2,396 citations


Journal ArticleDOI
TL;DR: Adapt neural control schemes are proposed for two classes of uncertain multi-input/multi-output (MIMO) nonlinear systems in block-triangular forms that avoid the controller singularity problem completely without using projection algorithms.
Abstract: In this paper, adaptive neural control schemes are proposed for two classes of uncertain multi-input/multi-output (MIMO) nonlinear systems in block-triangular forms. The MIMO systems consist of interconnected subsystems, with couplings in the forms of unknown nonlinearities and/or parametric uncertainties in the input matrices, as well as in the system interconnections without any bounding restrictions. Using the block-triangular structure properties, the stability analyses of the closed-loop MIMO systems are shown in a nested iterative manner for all the states. By exploiting the special properties of the affine terms of the two classes of MIMO systems, the developed neural control schemes avoid the controller singularity problem completely without using projection algorithms. Semiglobal uniform ultimate boundedness (SGUUB) of all the signals in the closed-loop of MIMO nonlinear systems is achieved. The outputs of the systems are proven to converge to a small neighborhood of the desired trajectories. The control performance of the closed-loop system is guaranteed by suitably choosing the design parameters. The proposed schemes offer systematic design procedures for the control of the two classes of uncertain MIMO nonlinear systems. Simulation results are presented to show the effectiveness of the approach.

771 citations


Journal ArticleDOI
TL;DR: In this article, the authors address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel, which is of central importance in some kernel applications, such as using kernel principal component analysis (PCA) for image denoising.
Abstract: In this paper, we address the problem of finding the pre-image of a feature vector in the feature space induced by a kernel. This is of central importance in some kernel applications, such as using kernel principal component analysis (PCA) for image denoising. Unlike the traditional method, which relies on nonlinear optimization, our proposed method directly finds the location of the pre-image based on distance constraints in the feature space. It is noniterative, involves only linear algebra and does not suffer from numerical instability or local minimum problems. Evaluations on performing kernel PCA and kernel clustering on the USPS data set show much improved performance.
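For the Gaussian kernel, the traditional approach the paper improves on is a fixed-point pre-image iteration. The sketch below illustrates kernel PCA denoising with that iterative baseline on toy data; the kernel width, number of components, and the omission of some centering offsets are all simplifications:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: a tight 2-D cluster; one far-off "noisy" point to denoise.
X = rng.normal(0.0, 0.3, size=(60, 2))
x_noisy = np.array([1.5, -1.2])
sigma2 = 1.0                                  # Gaussian kernel width (assumed)

def kmat(A, B):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * sigma2))

n = len(X)
K = kmat(X, X)
H = np.eye(n) - np.ones((n, n)) / n           # centering matrix
lam, A = np.linalg.eigh(H @ K @ H)
lam, A = lam[::-1], A[:, ::-1]                # descending eigenvalues
q = 5                                         # retained principal components
alpha = A[:, :q] / np.sqrt(np.maximum(lam[:q], 1e-12))

beta = alpha.T @ (H @ (kmat(x_noisy[None, :], X).ravel() - K.mean(axis=0)))
gamma = alpha @ beta                          # expansion weights of the projection
# (some centering offsets are dropped for brevity in this sketch)

z = X.mean(axis=0).copy()                     # fixed-point pre-image iteration
for _ in range(100):
    w = gamma * np.exp(-((X - z) ** 2).sum(axis=1) / (2.0 * sigma2))
    z = (w[:, None] * X).sum(axis=0) / (w.sum() + 1e-12)
```

The paper's contribution replaces this iteration with a direct, noniterative solve from feature-space distance constraints.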

414 citations


Journal ArticleDOI
TL;DR: Two different backstepping neural network (NN) control approaches are presented for a class of affine nonlinear systems in the strict-feedback form with unknown nonlinearities and the controller singularity problem is avoided perfectly in both approaches.
Abstract: In this paper, two different backstepping neural network (NN) control approaches are presented for a class of affine nonlinear systems in the strict-feedback form with unknown nonlinearities. By a special design scheme, the controller singularity problem is avoided perfectly in both approaches. Furthermore, the closed loop signals are guaranteed to be semiglobally uniformly ultimately bounded and the outputs of the system are proved to converge to a small neighborhood of the desired trajectory. The control performances of the closed-loop systems can be shaped as desired by suitably choosing the design parameters. Simulation results obtained demonstrate the effectiveness of the approaches proposed. The differences observed between the inputs of the two controllers are analyzed briefly.

404 citations


Journal ArticleDOI
TL;DR: This work proposes a novel system for voiced speech segregation that segregates resolved and unresolved harmonics differently, and it yields substantially better performance, especially for the high-frequency part of speech.
Abstract: Segregating speech from one monaural recording has proven to be very challenging. Monaural segregation of voiced speech has been studied in previous systems that incorporate auditory scene analysis principles. A major problem for these systems is their inability to deal with the high-frequency part of speech. Psychoacoustic evidence suggests that different perceptual mechanisms are involved in handling resolved and unresolved harmonics. We propose a novel system for voiced speech segregation that segregates resolved and unresolved harmonics differently. For resolved harmonics, the system generates segments based on temporal continuity and cross-channel correlation, and groups them according to their periodicities. For unresolved harmonics, it generates segments based on common amplitude modulation (AM) in addition to temporal continuity and groups them according to AM rates. Underlying the segregation process is a pitch contour that is first estimated from speech segregated according to dominant pitch and then adjusted according to psychoacoustic constraints. Our system is systematically evaluated and compared with previous systems, and it yields substantially better performance, especially for the high-frequency part of speech.

394 citations


Journal ArticleDOI
Volker Roth
TL;DR: This paper presents a different class of kernel regressors that effectively overcome the above problems, and presents a highly efficient algorithm with guaranteed global convergence that defines a unified framework for sparse regression models in the very rich class of IRLS models.

Abstract: In the last few years, the support vector machine (SVM) method has motivated new interest in kernel regression techniques. Although the SVM has been shown to exhibit excellent generalization properties in many experiments, it suffers from several drawbacks, both of a theoretical and a technical nature: the absence of probabilistic outputs, the restriction to Mercer kernels, and the steep growth of the number of support vectors with increasing size of the training set. In this paper, we present a different class of kernel regressors that effectively overcome the above problems. We call this approach generalized LASSO regression. It has a clear probabilistic interpretation, can handle learning sets that are corrupted by outliers, produces extremely sparse solutions, and is capable of dealing with large-scale problems. For regression functionals which can be modeled as iteratively reweighted least-squares (IRLS) problems, we present a highly efficient algorithm with guaranteed global convergence. This defines a unified framework for sparse regression models in the very rich class of IRLS models, including various types of robust regression models and logistic regression. Performance studies for many standard benchmark datasets effectively demonstrate the advantages of this model over related approaches.
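As a concrete instance of the IRLS view, the plain LASSO objective ||y - Xb||^2 + lam*||b||_1 can be solved by repeatedly solving a reweighted ridge problem. The data, regularisation strength, and iteration count below are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 100, 10
X = rng.normal(size=(n, p))
beta_true = np.zeros(p)
beta_true[:3] = [2.0, -1.5, 1.0]                 # sparse ground truth
y = X @ beta_true + 0.1 * rng.normal(size=n)

def irls_lasso(X, y, lam=1.0, n_iter=50, eps=1e-8):
    """Solve the LASSO by iteratively reweighted ridge regression:
    the l1 penalty lam*|b_j| is rewritten as lam * b_j^2 / |b_j|."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]  # warm start at least squares
    for _ in range(n_iter):
        D = np.diag(1.0 / (np.abs(beta) + eps))  # current reweighting
        beta = np.linalg.solve(X.T @ X + lam * D, X.T @ y)
    return beta

beta_hat = irls_lasso(X, y)
```

The irrelevant coefficients are driven essentially to zero while the three true coefficients are recovered with only mild shrinkage.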

281 citations


Journal ArticleDOI
TL;DR: An efficient face recognition scheme which has two features: representation of face images by two-dimensional wavelet subband coefficients and recognition by a modular, personalised classification method based on kernel associative memory models.
Abstract: In this paper, we propose an efficient face recognition scheme which has two features: 1) representation of face images by two-dimensional (2D) wavelet subband coefficients and 2) recognition by a modular, personalised classification method based on kernel associative memory models. Compared to PCA projections and low resolution "thumb-nail" image representations, wavelet subband coefficients can efficiently capture substantial facial features while keeping computational complexity low. As there are usually very limited samples, we constructed an associative memory (AM) model for each person and proposed to improve the performance of AM models by kernel methods. Specifically, we first applied kernel transforms to each possible training pair of face samples and then mapped the high-dimensional feature space back to input space. Our scheme using modular autoassociative memory for face recognition is inspired by the same motivation as using autoencoders for optical character recognition (OCR), for which the advantages have been proven. By associative memory, all the prototypical faces of one particular person are used to reconstruct themselves and the reconstruction error for a probe face image is used to decide if the probe face is from the corresponding person. We carried out extensive experiments on three standard face recognition datasets, the FERET data, the XM2VTS data, and the ORL data. Detailed comparisons with earlier published results are provided and our proposed scheme offers better recognition accuracy on all of the face datasets.
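The classification-by-reconstruction idea can be illustrated with plain linear autoassociative memories (the paper's kernelisation and wavelet features are omitted here): build one memory per person as a projector onto that person's training vectors and classify a probe by minimum reconstruction error. The toy clusters below merely stand in for face vectors:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy stand-in for face vectors: each "person" is a tight cluster in R^20.
def make_class(center, n=5):
    return center + 0.1 * rng.normal(size=(n, 20))

centers = [rng.normal(size=20) for _ in range(3)]
train = [make_class(c) for c in centers]      # one training matrix per person

def recon_error(M, x):
    """Reconstruction error after projecting x onto span of M's rows."""
    P = M.T @ np.linalg.pinv(M.T)             # projector onto that subspace
    return np.linalg.norm(x - P @ x)

probe = centers[1] + 0.1 * rng.normal(size=20)
errors = [recon_error(M, probe) for M in train]
pred = int(np.argmin(errors))                 # person with smallest error
```

A probe near person 1's cluster is reconstructed far better by that person's memory than by the others.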

268 citations


Journal ArticleDOI
TL;DR: Under various mild conditions, the proposed general projection neural network is shown to be globally convergent, globally asymptotically stable, and globally exponentially stable.
Abstract: Recently, a projection neural network for solving monotone variational inequalities and constrained optimization problems was developed. In this paper, we propose a general projection neural network for solving a wider class of variational inequalities and related optimization problems. In addition to its simple structure and low complexity, the proposed neural network includes existing neural networks for optimization, such as the projection neural network, the primal-dual neural network, and the dual neural network, as special cases. Under various mild conditions, the proposed general projection neural network is shown to be globally convergent, globally asymptotically stable, and globally exponentially stable. Furthermore, several improved stability criteria on two special cases of the general projection neural network are obtained under weaker conditions. Simulation results demonstrate the effectiveness and characteristics of the proposed neural network.

254 citations


Journal ArticleDOI
V. Singh1
TL;DR: A novel linear matrix inequality (LMI)-based criterion for the global asymptotic stability and uniqueness of the equilibrium point of a class of delayed cellular neural networks (CNNs) is presented and turns out to be a generalization and improvement over some previous criteria.
Abstract: A novel linear matrix inequality (LMI)-based criterion for the global asymptotic stability and uniqueness of the equilibrium point of a class of delayed cellular neural networks (CNNs) is presented. The criterion turns out to be a generalization and improvement over some previous criteria.

216 citations


Journal ArticleDOI
TL;DR: A new class of probabilistic neural networks (PNNs) working in nonstationary environments is proposed, and definitions of optimality of PNNs in time-varying environments are presented for the first time in the literature.
Abstract: In this paper, we propose a new class of probabilistic neural networks (PNNs) working in a nonstationary environment. The novelty is summarized as follows: 1) We formulate the problem of pattern classification in a nonstationary environment as a prediction problem and design a probabilistic neural network to classify patterns having time-varying probability distributions. We note that the problem of pattern classification in the nonstationary case is closely connected with the problem of prediction because, on the basis of a learning sequence of length n, a pattern at moment n+k, k ≥ 1, should be classified. 2) We present, for the first time in the literature, definitions of optimality of PNNs in a time-varying environment. Moreover, we prove that our PNNs asymptotically approach the Bayes-optimal (time-varying) decision surface. 3) We investigate the speed of convergence of constructed PNNs. 4) We design in detail PNNs based on Parzen kernels and multivariate Hermite series.
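The stationary building block of such networks is the Parzen-window density estimate; a minimal two-class Parzen classifier is sketched below (the paper's time-varying weighting and Hermite-series variants are omitted, and the data and bandwidth are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
# Two Gaussian classes (a stationary toy case).
X0 = rng.normal(-2.0, 1.0, size=(200, 1))
X1 = rng.normal(+2.0, 1.0, size=(200, 1))

def parzen(x, data, h=0.5):
    """Parzen-window density estimate with a Gaussian kernel of bandwidth h."""
    u = (x - data) / h
    return np.exp(-0.5 * u ** 2).sum() / (len(data) * h * np.sqrt(2 * np.pi))

def classify(x):
    # Equal priors: pick the class with the higher estimated density.
    return int(parzen(x, X1) > parzen(x, X0))

pred_neg, pred_pos = classify(-1.8), classify(1.8)
```

Points near each class centre are assigned to that class; the paper's contribution is making such estimates track distributions that drift over time.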

211 citations


Journal ArticleDOI
TL;DR: Experimental results show that the proposed neural connectionism approaches, with respect to the nonneural ones, are more efficient and feasible in finding the arbitrary roots of arbitrary polynomials.
Abstract: This paper proposes a constructive approach for finding arbitrary (real or complex) roots of arbitrary (real or complex) polynomials by a multilayer perceptron network (MLPN) using a constrained learning algorithm (CLA), which encodes the a priori information of constraint relations between root moments and coefficients of a polynomial into the usual BP algorithm (BPA). Moreover, the root moment method (RMM) is also simplified into a recursive version so that the computational complexity can be further decreased, which allows the roots of higher-order polynomials to be found readily. In addition, an adaptive learning parameter for the CLA is also proposed in this paper, and an initial weight selection method is also given. Finally, several experimental results show that our proposed neural connectionism approaches, with respect to the nonneural ones, are more efficient and feasible in finding the arbitrary roots of arbitrary polynomials.
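For context, the standard nonneural baseline such methods are compared against computes roots as eigenvalues of the companion matrix, which is what `numpy.roots` does:

```python
import numpy as np

# Companion-matrix baseline: polynomial roots as eigenvalues of the
# companion matrix of the coefficient vector (highest degree first).
coeffs = [1.0, -6.0, 11.0, -6.0]       # x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3)
roots = np.sort(np.roots(coeffs).real)
```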

Journal ArticleDOI
TL;DR: The wavelet-based image fusion procedure is improved by applying the discrete wavelet frame transform (DWFT), which yields a translation-invariant signal representation, together with support vector machines (SVMs).
Abstract: Many vision-related processing tasks, such as edge detection, image segmentation and stereo matching, can be performed more easily when all objects in the scene are in good focus. However, in practice, this may not always be feasible as optical lenses, especially those with long focal lengths, only have a limited depth of field. One common approach to recover an everywhere-in-focus image is to use wavelet-based image fusion. First, several source images with different focuses of the same scene are taken and processed with the discrete wavelet transform (DWT). Among these wavelet decompositions, the wavelet coefficient with the largest magnitude is selected at each pixel location. Finally, the fused image can be recovered by performing the inverse DWT. In this paper, we improve this fusion procedure by applying the discrete wavelet frame transform (DWFT) and the support vector machines (SVM). Unlike DWT, DWFT yields a translation-invariant signal representation. Using features extracted from the DWFT coefficients, an SVM is trained to select the source image that has the best focus at each pixel location, and the corresponding DWFT coefficients are then incorporated into the composite wavelet representation. Experimental results show that the proposed method outperforms the traditional approach both visually and quantitatively.
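The classic max-magnitude fusion rule described above can be sketched with a one-level Haar transform standing in for the DWT/DWFT, and with the rule itself standing in for the trained SVM selector. The two synthetic "source images" below are sharp on complementary halves:

```python
import numpy as np

rng = np.random.default_rng(4)
detail = rng.normal(size=(64, 64))          # the fully in-focus "scene"
imgA = detail.copy(); imgA[:, 32:] = 0.0    # sharp on the left half only
imgB = detail.copy(); imgB[:, :32] = 0.0    # sharp on the right half only

def haar2(x):
    """One-level 2-D Haar transform: returns (LL, LH, HL, HH) subbands."""
    a = (x[0::2] + x[1::2]) / 2.0
    d = (x[0::2] - x[1::2]) / 2.0
    return ((a[:, 0::2] + a[:, 1::2]) / 2.0, (a[:, 0::2] - a[:, 1::2]) / 2.0,
            (d[:, 0::2] + d[:, 1::2]) / 2.0, (d[:, 0::2] - d[:, 1::2]) / 2.0)

def ihaar2(ll, lh, hl, hh):
    a = np.empty((ll.shape[0], ll.shape[1] * 2))
    a[:, 0::2], a[:, 1::2] = ll + lh, ll - lh
    d = np.empty_like(a)
    d[:, 0::2], d[:, 1::2] = hl + hh, hl - hh
    x = np.empty((a.shape[0] * 2, a.shape[1]))
    x[0::2], x[1::2] = a + d, a - d
    return x

# Maximum-magnitude fusion rule applied band by band.
fused_bands = [np.where(np.abs(u) >= np.abs(v), u, v)
               for u, v in zip(haar2(imgA), haar2(imgB))]
fused = ihaar2(*fused_bands)
```

Because each half-image carries the larger coefficients over its own support, the fused result recovers the full scene.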

Journal ArticleDOI
TL;DR: A novel independent component analysis algorithm is introduced that is truly blind to the particular underlying distribution of the mixed signals; in Monte Carlo simulations it consistently outperformed all state-of-the-art ICA methods.
Abstract: In this paper, we introduce a novel independent component analysis (ICA) algorithm, which is truly blind to the particular underlying distribution of the mixed signals. Using a nonparametric kernel density estimation technique, the algorithm performs simultaneously the estimation of the unknown probability density functions of the source signals and the estimation of the unmixing matrix. Following the proposed approach, the blind signal separation framework can be posed as a nonlinear optimization problem, where a closed form expression of the cost function is available, and only the elements of the unmixing matrix appear as unknowns. We conducted a series of Monte Carlo simulations, involving linear mixtures of various source signals with different statistical characteristics and sample sizes. The new algorithm not only consistently outperformed all state-of-the-art ICA methods, but also demonstrated the following properties: 1) Only a flexible model, capable of learning the source statistics, can consistently achieve an accurate separation of all the mixed signals. 2) Adopting a suitably designed optimization framework, it is possible to derive a flexible ICA algorithm that matches the stability and convergence properties of conventional algorithms. 3) A nonparametric approach does not necessarily require large sample sizes in order to outperform methods with fixed or partially adaptive contrast functions.

Journal ArticleDOI
TL;DR: The authors discuss delayed Cohen-Grossberg neural network models and investigate the global exponential stability of their equilibrium points.
Abstract: The authors discuss delayed Cohen-Grossberg neural network models and investigate the global exponential stability of the equilibrium points of these systems. A set of sufficient conditions ensuring robust global exponential convergence of the Cohen-Grossberg neural networks with time delays is given.

Journal ArticleDOI
TL;DR: This paper proposes a neuro-fuzzy scheme for designing a classifier along with feature selection, a four-layered feed-forward network for realizing a fuzzy rule-based classifier.
Abstract: Most methods of classification either ignore feature analysis or do it in a separate phase, offline prior to the main classification task. This paper proposes a neuro-fuzzy scheme for designing a classifier along with feature selection. It is a four-layered feed-forward network for realizing a fuzzy rule-based classifier. The network is trained by error backpropagation in three phases. In the first phase, the network learns the important features and the classification rules. In the subsequent phases, the network is pruned to an "optimal" architecture that represents an "optimal" set of rules. Pruning is found to drastically reduce the size of the network without degrading the performance. The pruned network is further tuned to improve performance. The rules learned by the network can be easily read from the network. The system is tested on both synthetic and real data sets and found to perform quite well.

Journal ArticleDOI
TL;DR: This paper elaborates upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training.
Abstract: In this paper, we elaborate upon the claim that clustering in the recurrent layer of recurrent neural networks (RNNs) reflects meaningful information processing states even prior to training. By concentrating on activation clusters in RNNs, while not throwing away the continuous state space network dynamics, we extract predictive models that we call neural prediction machines (NPMs). When RNNs with sigmoid activation functions are initialized with small weights (a common technique in the RNN community), the clusters of recurrent activations emerging prior to training are indeed meaningful and correspond to Markov prediction contexts. In this case, the extracted NPMs correspond to a class of Markov models, called variable memory length Markov models (VLMMs). In order to appreciate how much information has really been induced during the training, the RNN performance should always be compared with that of VLMMs and NPMs extracted before training as the "null" base models. Our arguments are supported by experiments on a chaotic symbolic sequence and a context-free language with a deep recursive structure.

Journal ArticleDOI
TL;DR: This paper presents a neuromorphic analog very large scale integration (VLSI) circuit that contains a feedforward network of silicon neurons with STDP synapses and shows that the chip can detect and amplify hierarchical spike-timing synchrony structures embedded in noisy spike trains.
Abstract: Spike-timing dependent synaptic plasticity (STDP) is a form of plasticity driven by precise spike-timing differences between presynaptic and postsynaptic spikes. Thus, the learning rules underlying STDP are suitable for learning neuronal temporal phenomena such as spike-timing synchrony. It is well known that weight-independent STDP creates unstable learning processes resulting in balanced bimodal weight distributions. In this paper, we present a neuromorphic analog very large scale integration (VLSI) circuit that contains a feedforward network of silicon neurons with STDP synapses. The learning rule implemented can be tuned to have a moderate level of weight dependence. This helps stabilise the learning process and still generates binary weight distributions. From on-chip learning experiments we show that the chip can detect and amplify hierarchical spike-timing synchrony structures embedded in noisy spike trains. The weight distributions of the network emerging from learning are bimodal.
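The pair-based STDP window underlying such circuits can be written down directly; the amplitudes and time constant below are typical illustrative values, not those measured from the chip:

```python
import numpy as np

def stdp_dw(dt, A_plus=0.01, A_minus=0.012, tau=20.0):
    """Weight change for a spike-time difference dt = t_post - t_pre (ms):
    potentiation when the presynaptic spike precedes the postsynaptic one,
    depression otherwise, both decaying exponentially with |dt|."""
    if dt >= 0:
        return A_plus * np.exp(-dt / tau)
    return -A_minus * np.exp(dt / tau)

dw_causal = stdp_dw(+10.0)    # pre before post: potentiation
dw_acausal = stdp_dw(-10.0)   # post before pre: depression
```

Weight dependence, as tuned on the chip, scales these updates with the current weight to stabilise learning.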

Journal ArticleDOI
TL;DR: The scalar equation approach to Boolean network models is further developed and then applied to two interesting biological models and gives immediate information about both cycle and transient structure of the network.
Abstract: One way of coping with the complexity of biological systems is to use the simplest possible models which are able to reproduce at least some nontrivial features of reality. Although two-valued Boolean models have a long history in technology, it is perhaps somewhat surprising that they can also represent important features of living organisms. In this paper, the scalar equation approach to Boolean network models is further developed and then applied to two interesting biological models. In particular, a linear reduced scalar equation is derived from a more rudimentary nonlinear scalar equation. This simpler, but higher order, two term equation gives immediate information about both cycle and transient structure of the network.
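The notions of cycle and transient structure are easy to make concrete on a toy synchronous Boolean network (the update rules below are arbitrary illustrations; the paper's scalar-equation reduction is not reproduced here):

```python
from itertools import product

# A 3-node Boolean network; all nodes update synchronously from the
# current global state.
def step(s):
    a, b, c = s
    return (b & c, a | c, a ^ b)

def cycle_length(s, ):
    """Iterate from state s until a state repeats; return the cycle length."""
    seen = {}
    t = 0
    while s not in seen:
        seen[s] = t
        s = step(s)
        t += 1
    return t - seen[s]

# Exhaustive map of the attractor structure over all 2^3 states.
cycle_lengths = {s: cycle_length(s) for s in product((0, 1), repeat=3)}
```

Here (0,0,0) is a fixed point (cycle of length 1), while (1,1,1) falls, after a transient, onto a 2-cycle between (0,1,0) and (0,0,1).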

Journal ArticleDOI
TL;DR: Experimental results on simulated and real-world data sets indicate that the approach works well even on large data sets, and has the advantages of Bayesian methods for model adaptation and error bars of its predictions.
Abstract: In this paper, we use a unified loss function, called the soft insensitive loss function, for Bayesian support vector regression. We follow standard Gaussian processes for regression to set up the Bayesian framework, in which the unified loss function is used in the likelihood evaluation. Under this framework, the maximum a posteriori estimate of the function values corresponds to the solution of an extended support vector regression problem. The overall approach has the merits of support vector regression such as convex quadratic programming and sparsity in solution representation. It also has the advantages of Bayesian methods for model adaptation and error bars of its predictions. Experimental results on simulated and real-world data sets indicate that the approach works well even on large data sets.
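The standard GP regression machinery the paper builds on already yields predictive means and error bars; below is a numpy sketch with a plain Gaussian likelihood in place of the soft insensitive loss (kernel width and noise level are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(7)
# Noisy observations of sin(x) on [0, 2*pi].
X = np.linspace(0.0, 2.0 * np.pi, 30)
y = np.sin(X) + 0.1 * rng.normal(size=30)

def k(a, b, ell=1.0):
    """Squared-exponential kernel matrix between 1-D point sets a and b."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / ell ** 2)

noise = 0.01                                  # observation noise variance
K = k(X, X) + noise * np.eye(len(X))
Xs = np.linspace(0.0, 2.0 * np.pi, 100)       # test inputs
Ks = k(Xs, X)
mean = Ks @ np.linalg.solve(K, y)             # predictive mean
var = 1.0 - np.einsum('ij,ji->i', Ks, np.linalg.solve(K, Ks.T))  # error bars
```

The paper's framework keeps this Bayesian structure but swaps the Gaussian likelihood for the soft insensitive loss, recovering SVR-style sparsity.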

Journal ArticleDOI
TL;DR: A new decoding function is introduced that combines the margins through an estimate of their class conditional probabilities, which can be used to tune kernel hyperparameters and empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
Abstract: We study the problem of multiclass classification within the framework of error correcting output codes (ECOC) using margin-based binary classifiers. Specifically, we address two important open problems in this context: decoding and model selection. The decoding problem concerns how to map the outputs of the classifiers into class codewords. In this paper we introduce a new decoding function that combines the margins through an estimate of their class conditional probabilities. Concerning model selection, we present new theoretical results bounding the leave-one-out (LOO) error of ECOC of kernel machines, which can be used to tune kernel hyperparameters. We report experiments using support vector machines as the base binary classifiers, showing the advantage of the proposed decoding function over other functions of the margin commonly used in practice. Moreover, our empirical evaluations on model selection indicate that the bound leads to good estimates of kernel parameters.
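Decoding can be made concrete with a small code matrix: given the margins of the binary classifiers, loss-based decoding picks the codeword with the smallest total margin loss (the exponential loss below is a simple stand-in for the paper's probability-based combination), while Hamming decoding uses only the signs:

```python
import numpy as np

# Toy ECOC setup: 4 classes encoded by 6 binary dichotomies (matrix assumed).
M = np.array([[+1, +1, +1, -1, -1, -1],
              [+1, -1, -1, +1, +1, -1],
              [-1, +1, -1, +1, -1, +1],
              [-1, -1, +1, -1, +1, +1]])

def decode_loss(margins, beta=1.0):
    """Pick the class whose codeword minimises the summed exp-loss of margins."""
    losses = np.exp(-beta * margins[None, :] * M).sum(axis=1)
    return int(np.argmin(losses))

def decode_hamming(margins):
    """Hamming decoding: only the signs of the margins are used."""
    return int(np.argmin((np.sign(margins)[None, :] != M).sum(axis=1)))

# Margins from 6 hypothetical binary classifiers, agreeing with class 2's row.
margins = np.array([-0.9, 0.8, -0.2, 0.7, -0.6, 0.5])
pred = decode_loss(margins)
```

Unlike Hamming decoding, the loss-based rule also exploits the confidence carried by the margin magnitudes.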

Journal ArticleDOI
TL;DR: It is demonstrated that the estimation errors decrease as the SOM training proceeds, allowing the VQTAM scheme to be understood as a self-supervised gradient-based error reduction method.
Abstract: In this paper, we introduce a general modeling technique, called vector-quantized temporal associative memory (VQTAM), which uses Kohonen's self-organizing map (SOM) as an alternative to multilayer perceptron (MLP) and radial basis function (RBF) neural models for dynamical system identification and control. We demonstrate that the estimation errors decrease as the SOM training proceeds, allowing the VQTAM scheme to be understood as a self-supervised gradient-based error reduction method. The performance of the proposed approach is evaluated on a variety of complex tasks, namely: i) time series prediction; ii) identification of SISO/MIMO systems; and iii) nonlinear predictive control. For all tasks, the simulation results produced by the SOM are as accurate as those produced by the MLP network, and better than those produced by the RBF network. The SOM has also been shown to be less sensitive to weight initialization than MLP networks. We conclude the paper by discussing the main properties of the VQTAM and their relationships to other well established methods for dynamical system identification. We also suggest directions for further work.
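The core VQTAM idea, quantise joint input-output vectors, find the winner on the input part, and read the prediction off the output part, can be sketched with plain winner-take-all vector quantisation (no map topology or neighbourhood function, unlike a full SOM; the grid initialisation and learning schedule are illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)
# Learn y = sin(u) from samples by quantising joint [u, y] vectors.
u = rng.uniform(-np.pi, np.pi, size=500)
data = np.column_stack([u, np.sin(u)])

# Prototypes: input parts on a regular grid, output parts learned online.
W = np.column_stack([np.linspace(-np.pi, np.pi, 30), np.zeros(30)])
lr = 0.2
for epoch in range(20):
    for x in data:
        i = np.argmin(np.abs(W[:, 0] - x[0]))   # winner chosen on the input part
        W[i] += lr * (x - W[i])                 # move the whole joint prototype
    lr *= 0.8                                   # annealed learning rate

def predict(u_new):
    i = np.argmin(np.abs(W[:, 0] - u_new))
    return W[i, 1]                              # read the output part

err = max(abs(predict(v) - np.sin(v)) for v in np.linspace(-3.0, 3.0, 50))
```

The prediction is piecewise constant, with accuracy set by the number of prototypes; the full VQTAM inherits the SOM's topology-preserving training on top of this scheme.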

Journal ArticleDOI
TL;DR: Using the local inhibition, conditions for nondivergence are derived, which not only guarantee nondivergence, but also allow for the existence of multiequilibrium points.
Abstract: This paper studies the multistability of a class of discrete-time recurrent neural networks with unsaturating piecewise linear activation functions. It addresses the nondivergence, global attractivity, and complete stability of the networks. Using the local inhibition, conditions for nondivergence are derived, which not only guarantee nondivergence, but also allow for the existence of multiequilibrium points. Under these nondivergence conditions, global attractive compact sets are obtained. Complete stability is studied via constructing novel energy functions and using the well-known Cauchy Convergence Principle. Examples and simulation results are used to illustrate the theory.
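Multistability under an unsaturating piecewise-linear (ReLU-type) activation is easy to demonstrate: with local self-excitation and mutual inhibition (the weights below are chosen purely for illustration), different initial conditions settle at different equilibria:

```python
import numpy as np

# Discrete-time recurrent network x(k+1) = max(0, W x(k) + b) with an
# unsaturating piecewise-linear activation.
W = np.array([[0.5, -1.0],
              [-1.0, 0.5]])      # self-excitation, mutual inhibition
b = np.array([1.0, 1.0])

def run(x0, steps=100):
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x = np.maximum(0.0, W @ x + b)
    return x

xa = run([1.0, 0.0])   # converges to one equilibrium, (2, 0)
xb = run([0.0, 1.0])   # converges to a different one, (0, 2)
```

Both (2, 0) and (0, 2) satisfy x = max(0, Wx + b), so the network is nondivergent yet not globally convergent to a single point, which is exactly the regime the paper analyses.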

Journal ArticleDOI
TL;DR: A general overview and unification of several information theoretic criteria for the extraction of a single independent component is presented and tools that extend these criteria to allow the simultaneous blind extraction of subsets with an arbitrary number of independent components are presented.
Abstract: This paper reports a study on the problem of the blind simultaneous extraction of specific groups of independent components from a linear mixture. This paper first presents a general overview and unification of several information theoretic criteria for the extraction of a single independent component. Then, our contribution fills the theoretical gap that exists between extraction and separation by presenting tools that extend these criteria to allow the simultaneous blind extraction of subsets with an arbitrary number of independent components. In addition, we analyze a family of learning algorithms based on Stiefel manifolds and the natural gradient ascent, present the nonlinear optimal activations (score) functions, and provide new or extended local stability conditions. Finally, we illustrate the performance and features of the proposed approach by computer-simulation experiments.

Journal ArticleDOI
TL;DR: A discriminative learning algorithm to optimize the parameters of MQDF with the aim of improving the classification accuracy while preserving the superior noncharacter resistance is proposed, and is justified in handwritten digit recognition and numeral string recognition.
Abstract: In character string recognition integrating segmentation and classification, high classification accuracy and resistance to noncharacters are desired of the underlying classifier. In a previous evaluation study, the modified quadratic discriminant function (MQDF) proposed by Kimura et al. was shown to be superior in noncharacter resistance but inferior in classification accuracy to neural networks. This paper proposes a discriminative learning algorithm to optimize the parameters of MQDF with the aim of improving the classification accuracy while preserving the superior noncharacter resistance. We refer to the resulting classifier as discriminative learning QDF (DLQDF). The parameters of DLQDF adhere to the structure of MQDF under the Gaussian density assumption and are optimized under the minimum classification error (MCE) criterion. The promise of DLQDF is justified in handwritten digit recognition and numeral string recognition, where the performance of DLQDF is comparable or superior to that of neural classifiers. The results are also competitive with the best ones reported in the literature.

Journal ArticleDOI
TL;DR: This paper shows that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact, it can be considered as a batch-mode version of (normalized) competitive learning.
Abstract: Competitive learning mechanisms for clustering, in general, suffer from poor performance for very high-dimensional (>1000) data because of "curse of dimensionality" effects. In applications such as document clustering, it is customary to normalize the high-dimensional input vectors to unit length, and it is sometimes also desirable to obtain balanced clusters, i.e., clusters of comparable sizes. The spherical kmeans (spkmeans) algorithm, which normalizes the cluster centers as well as the inputs, has been successfully used to cluster normalized text documents in 2000+ dimensional space. Unfortunately, like regular kmeans and its soft expectation-maximization-based version, spkmeans tends to generate extremely imbalanced clusters in high-dimensional spaces when the desired number of clusters is large (tens or more). This paper first shows that the spkmeans algorithm can be derived from a certain maximum likelihood formulation using a mixture of von Mises-Fisher distributions as the generative model, and in fact, it can be considered as a batch-mode version of (normalized) competitive learning. The proposed generative model is then adapted in a principled way to yield three frequency-sensitive competitive learning variants that are applicable to static data and produce high-quality, well-balanced clusters for high-dimensional data. Like kmeans, each iteration is linear in the number of data points and in the number of clusters for all three algorithms. A frequency-sensitive algorithm to cluster streaming data is also proposed. Experimental results on clustering of high-dimensional text data sets are provided to show the effectiveness and applicability of the proposed techniques.
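A minimal spkmeans implementation shows the two ingredients the derivation rests on: cosine-similarity assignment and renormalised cluster means. Deterministic initialisation and synthetic unit vectors are used here for simplicity; a real run would use random restarts:

```python
import numpy as np

rng = np.random.default_rng(6)

def normalize(X):
    return X / np.linalg.norm(X, axis=1, keepdims=True)

# Toy "documents": unit vectors scattered around 3 random directions in R^50.
dirs = normalize(rng.normal(size=(3, 50)))
X = normalize(np.vstack([d + 0.1 * rng.normal(size=(100, 50)) for d in dirs]))

def spkmeans(X, k, n_iter=20):
    """Spherical k-means: cosine-similarity assignment, renormalised means."""
    C = X[:: len(X) // k][:k].copy()            # deterministic init (sketch only)
    for _ in range(n_iter):
        labels = np.argmax(X @ C.T, axis=1)     # nearest center by cosine
        for j in range(k):
            members = X[labels == j]
            if len(members):
                s = members.sum(axis=0)
                C[j] = s / np.linalg.norm(s)    # mean re-projected to the sphere
    return labels, C

labels, C = spkmeans(X, 3)
```

The paper's frequency-sensitive variants modify the assignment step to penalise over-full clusters, which is what restores balance when k is large.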

Journal ArticleDOI
TL;DR: This work proposes the use of a "nonnegative principal component analysis (nonnegative PCA)" algorithm, which is a special case of the nonlinear PCA algorithm, but with a rectification nonlinearity, and conjecture that this algorithm will find such nonnegative well-grounded independent sources, under reasonable initial conditions.
Abstract: We consider the task of independent component analysis when the independent sources are known to be nonnegative and well-grounded, so that they have a nonzero probability density function (pdf) in the region of zero. We propose the use of a "nonnegative principal component analysis (nonnegative PCA)" algorithm, which is a special case of the nonlinear PCA algorithm, but with a rectification nonlinearity, and we conjecture that this algorithm will find such nonnegative well-grounded independent sources, under reasonable initial conditions. While the algorithm has proved difficult to analyze in the general case, we give some analytical results that are consistent with this conjecture and some numerical simulations that illustrate its operation.

Journal ArticleDOI
TL;DR: It is shown that the proposed neural network is stable in the sense of Lyapunov and can converge to an exact optimal solution of the original problem.
Abstract: In this paper, we present a neural network for solving the nonlinear convex programming problem in real time by means of the projection method. The main idea is to convert the convex programming problem into a variational inequality problem. Then a dynamical system and a convex energy function are constructed for the resulting variational inequality problem. It is shown that the proposed neural network is stable in the sense of Lyapunov and can converge to an exact optimal solution of the original problem. Compared with existing neural networks for solving the nonlinear convex programming problem, the proposed neural network requires no Lipschitz condition, has no adjustable parameters, and has a simple structure. The validity and transient behavior of the proposed neural network are demonstrated by some simulation results.
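The projection-method idea behind such networks can be illustrated with a forward-Euler discretization of the dynamics dx/dt = P(x - alpha*F(x)) - x, where P projects onto the feasible set and F is the gradient map. This is a generic sketch on a toy box-constrained quadratic program, with step sizes and the example problem chosen by us:

```python
import numpy as np

def projection_network(F, proj, x0, alpha=0.5, dt=0.01, steps=5000):
    """Euler-discretized projection dynamics dx/dt = proj(x - alpha*F(x)) - x.
    An equilibrium x = proj(x - alpha*F(x)) solves the variational inequality."""
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        x += dt * (proj(x - alpha * F(x)) - x)
    return x

# toy problem: minimize f(x) = (x1 - 2)^2 + (x2 + 1)^2 over the box [0, 1]^2
F = lambda x: 2.0 * (x - np.array([2.0, -1.0]))   # gradient of f
proj = lambda z: np.clip(z, 0.0, 1.0)             # projection onto the box
x_star = projection_network(F, proj, x0=[0.5, 0.5])
# the trajectory settles at (1, 0), the constrained minimizer
```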

Journal ArticleDOI
TL;DR: This work presents theoretical and simulation evidence that lone noisy threshold and continuous neurons exhibit the SR effect in terms of the mutual information between random input and output sequences, and a new statistically robust learning law can find this entropy-optimal noise level.
Abstract: Noise can improve how memoryless neurons process signals and maximize their throughput information. Such favorable use of noise is the so-called "stochastic resonance" or SR effect at the level of threshold neurons and continuous neurons. This work presents theoretical and simulation evidence that 1) lone noisy threshold and continuous neurons exhibit the SR effect in terms of the mutual information between random input and output sequences, 2) a new statistically robust learning law can find this entropy-optimal noise level, and 3) the adaptive SR effect is robust against highly impulsive noise with infinite variance. Histograms estimate the relevant probability density functions at each learning iteration. A theorem shows that almost all noise probability density functions produce some SR effect in threshold neurons even if the noise is impulsive and has infinite variance. The optimal noise level in threshold neurons also behaves nonlinearly as the input signal amplitude increases. Simulations further show that the SR effect persists for several sigmoidal neurons and for Gaussian radial-basis-function neurons.
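The threshold-neuron SR effect can be demonstrated numerically: the mutual information between random input bits and the thresholded output is larger at a moderate noise level than at nearly no noise or very strong noise. In this sketch the signal amplitudes, threshold, and Gaussian noise levels are our choices, and the paper's adaptive learning law and impulsive-noise results are not reproduced:

```python
import numpy as np

def mutual_info_binary(x, y):
    """Empirical mutual information (bits) between two binary sequences."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            pab = np.mean((x == a) & (y == b))
            pa, pb = np.mean(x == a), np.mean(y == b)
            if pab > 0:
                mi += pab * np.log2(pab / (pa * pb))
    return mi

rng = np.random.default_rng(0)
n = 200_000
s = rng.integers(0, 2, n)                  # random input bits
signal = np.where(s == 1, 0.4, -0.4)       # subthreshold signal amplitudes
theta = 1.0                                # threshold above both signal levels

mi_at = {}
for sigma in (1e-6, 0.6, 5.0):
    y = (signal + sigma * rng.standard_normal(n) > theta).astype(int)
    mi_at[sigma] = mutual_info_binary(s, y)
# SR signature: moderate noise conveys more information than almost none or too much
```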

Journal ArticleDOI
TL;DR: The generalized regression neural networks (GRNN) studied in this paper are able to follow changes of the best model, i.e., time-varying regression functions; convergence of the GRNN is proven based on general learning theorems presented in Section IV.
Abstract: The current state of knowledge regarding nonstationary processes is significantly poorer than in the case of stationary signals. In many applications, signals are treated as stationary only because this makes them easier to analyze; in fact, they are nonstationary. Nonstationary processes are undoubtedly more difficult to analyze, and their diversity makes the application of universal tools impossible. In this paper we propose a new class of generalized regression neural networks working in a nonstationary environment. The generalized regression neural networks (GRNN) studied in this paper are able to follow changes of the best model, i.e., time-varying regression functions. The novelty is summarized as follows: 1) We present adaptive GRNN tracking time-varying regression functions. 2) We prove convergence of the GRNN based on general learning theorems presented in Section IV. 3) We design in detail special GRNN based on the Parzen and orthogonal series kernels. In each case we specify conditions ensuring convergence of the GRNN to the best models described by the regression function. 4) We investigate the speed of convergence of the GRNN and compare the performance of specific structures based on the Parzen kernel and orthogonal series kernel. 5) We study various nonstationarities (multiplicative, additive, "scale change," "movable argument") and design in each case the GRNN based on the Parzen kernel and orthogonal series kernel.
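For reference, the static Parzen-kernel GRNN underlying the paper's adaptive variants is the Nadaraya-Watson estimator: a kernel-weighted average of observed outputs. The sketch below is this static baseline only (bandwidth and toy data are our choices); the paper's contribution, tracking time-varying regression functions, is not reproduced:

```python
import numpy as np

def grnn_predict(x_train, y_train, x_query, h=0.1):
    """GRNN / Nadaraya-Watson regression with a Gaussian Parzen kernel:
    prediction is the kernel-weighted average of training outputs."""
    d = x_query[:, None] - x_train[None, :]
    w = np.exp(-0.5 * (d / h) ** 2)            # kernel weights
    return (w @ y_train) / w.sum(axis=1)       # weighted average of outputs

rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 3000)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(3000)
xq = np.array([0.25, 0.75])
yhat = grnn_predict(x, y, xq, h=0.05)
# estimates approach sin(2*pi*0.25) = 1 and sin(2*pi*0.75) = -1
```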

Journal ArticleDOI
TL;DR: This paper presents feature selection algorithms for multilayer perceptrons (MLPs) and multiclass support vector machines (SVMs), using mutual information between class labels and classifier outputs, as an objective function.
Abstract: This paper presents feature selection algorithms for multilayer perceptrons (MLPs) and multiclass support vector machines (SVMs), using mutual information between class labels and classifier outputs as an objective function. This objective function involves inexpensive computation of information measures only on discrete variables; provides immunity to prior class probabilities; and brackets the probability of error of the classifier. The maximum output information (MOI) algorithms employ this function for feature subset selection by greedy elimination and directed search. The output of the MOI algorithms is a feature subset of user-defined size and an associated trained classifier (MLP/SVM). These algorithms compare favorably with a number of other methods in terms of performance on various artificial and real-world data sets.
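The greedy-elimination scheme can be sketched as follows: repeatedly drop the feature whose removal best preserves the mutual information between class labels and classifier outputs. This sketch substitutes a simple nearest-centroid classifier for the paper's trained MLP/SVM, and all names and the toy data are ours:

```python
import numpy as np

def mutual_info(a, b):
    """Empirical mutual information (bits) between two discrete sequences."""
    mi = 0.0
    for u in np.unique(a):
        for v in np.unique(b):
            p = np.mean((a == u) & (b == v))
            if p > 0:
                mi += p * np.log2(p / (np.mean(a == u) * np.mean(b == v)))
    return mi

def predict(X, y, feats):
    """Nearest-centroid classifier on a feature subset
    (a stand-in for the trained MLP/SVM used in the paper)."""
    Xs = X[:, feats]
    cents = np.array([Xs[y == c].mean(axis=0) for c in np.unique(y)])
    d = ((Xs[:, None, :] - cents[None, :, :]) ** 2).sum(axis=2)
    return np.argmin(d, axis=1)

def moi_greedy_eliminate(X, y, target_size):
    """Greedy backward elimination maximizing output mutual information."""
    feats = list(range(X.shape[1]))
    while len(feats) > target_size:
        scores = {f: mutual_info(y, predict(X, y, [g for g in feats if g != f]))
                  for f in feats}
        feats.remove(max(scores, key=scores.get))  # drop the least informative feature
    return feats

rng = np.random.default_rng(0)
y = rng.integers(0, 2, 400)
X = np.c_[y + 0.3 * rng.standard_normal(400),      # informative feature 0
          rng.standard_normal((400, 2))]           # two pure-noise features
chosen = moi_greedy_eliminate(X, y, target_size=1)
```

On this toy data the procedure retains the single informative feature, since removing either noise feature barely changes the output mutual information while removing feature 0 collapses it.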