Showing papers in "IEEE Transactions on Neural Networks in 1999"


Journal ArticleDOI
TL;DR: Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA is introduced that enables both the estimation of the whole decomposition by minimizing mutual information and the estimation of individual independent components as projection pursuit directions.
Abstract: Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. We use a combination of two different approaches for linear ICA: Comon's information theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixed-point algorithms for practical optimization of the contrast functions.

6,144 citations
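
As an illustration of the fixed-point algorithms mentioned in the abstract above, here is a minimal sketch of a one-unit fixed-point iteration in the form commonly associated with FastICA. The abstract does not state the update rule, so the tanh nonlinearity, the whitening step, and the function names `whiten` and `one_unit_fixed_point` are assumptions made for illustration only.

```python
import numpy as np

def whiten(x):
    """Center and whiten data x of shape (n_features, n_samples)."""
    x = x - x.mean(axis=1, keepdims=True)
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    # Symmetric whitening matrix E D^{-1/2} E^T (assumes full-rank covariance).
    w_mat = eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T
    return w_mat @ x

def one_unit_fixed_point(z, n_iter=200, tol=1e-8, seed=0):
    """Estimate one unmixing direction w from whitened data z.

    Assumed update (tanh contrast): w <- E[z g(w^T z)] - E[g'(w^T z)] w,
    followed by renormalization to unit length.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(z.shape[0])
    w /= np.linalg.norm(w)
    for _ in range(n_iter):
        y = w @ z
        g = np.tanh(y)
        g_prime = 1.0 - g ** 2
        w_new = (z * g).mean(axis=1) - g_prime.mean() * w
        w_new /= np.linalg.norm(w_new)
        # Convergence is checked up to sign, since w and -w are equivalent.
        converged = abs(abs(w_new @ w) - 1.0) < tol
        w = w_new
        if converged:
            break
    return w
```

Applying `w @ z` to the whitened data then gives one estimated component, up to sign and scale.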


Journal ArticleDOI
Vladimir Vapnik
TL;DR: This overview demonstrates how abstract learning theory established conditions for generalization that are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.
Abstract: Statistical learning theory was introduced in the late 1960's. Until the 1990's it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990's new types of learning algorithms (called support vector machines) based on the developed theory were proposed. This made statistical learning theory not only a tool for the theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory including both theoretical and algorithmic aspects of the theory. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization which are more general than those discussed in classical statistical paradigms and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.

5,370 citations


Journal ArticleDOI
TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing them to three other classification algorithms: Ripper, Rocchio, and boosting decision trees; SVMs performed best when using binary features.
Abstract: We study the use of support vector machines (SVM) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM performed best when using binary features. For both data sets, boosting trees and SVM had acceptable test performance in terms of accuracy and speed. However, SVM had significantly less training time.

1,536 citations


Journal ArticleDOI
TL;DR: It is observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVM to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.

1,510 citations
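
The heavy-tailed kernel family named in the abstract above can be evaluated directly. The sketch below assumes non-negative histogram inputs; the function name `heavy_tailed_rbf` and the default parameter values are illustrative choices, not taken from the paper.

```python
import numpy as np

def heavy_tailed_rbf(x, y, rho=1.0, a=1.0, b=2.0):
    """K(x, y) = exp(-rho * sum_i |x_i^a - y_i^a|^b).

    Assumes x and y are non-negative histograms of equal length.
    With a = 1 and b = 2 this reduces to the usual Gaussian RBF kernel.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.exp(-rho * np.sum(np.abs(x ** a - y ** a) ** b)))
```

Setting `a=0.5, b=1.0` gives one heavier-tailed member of the family, with the remapping $x_i \to x_i^a$ applied inside the kernel.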


Journal ArticleDOI
TL;DR: The geometry of feature space is reviewed, and the connection between feature space and input space is discussed by dealing with the question of how one can, given some vector in feature space, find a preimage in input space.
Abstract: This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use it to reduce the computational complexity of SV decision functions; second, we combine it with the kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.

1,258 citations


Journal ArticleDOI
TL;DR: A novel classification method, called the nearest feature line (NFL), for face recognition, based on the nearest distance from the query feature point to each FL, which achieves the lowest error rate reported for the ORL face database.
Abstract: We propose a classification method, called the nearest feature line (NFL), for face recognition. Any two feature points of the same class (person) are generalized by the feature line (FL) passing through the two points. The derived FL can capture more variations of face images than the original points and thus expands the capacity of the available database. The classification is based on the nearest distance from the query feature point to each FL. With a combined face database, the NFL error rate is about 43.7-65.4% of that of the standard eigenface method. Moreover, the NFL achieves the lowest error rate reported to date for the ORL face database.

555 citations
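
The nearest feature line rule described in the abstract above can be sketched as follows: the query point is projected onto the line through each pair of same-class feature points, and the class of the closest line wins. The function names and the dictionary layout for `prototypes` are assumptions for illustration; the sketch also assumes that the two points defining a line are distinct.

```python
from itertools import combinations
import numpy as np

def feature_line_distance(q, x1, x2):
    """Distance from query q to the (infinite) line through x1 and x2."""
    d = x2 - x1
    mu = np.dot(q - x1, d) / np.dot(d, d)  # position of the projection on the line
    p = x1 + mu * d                        # projection point
    return float(np.linalg.norm(q - p))

def nfl_classify(q, prototypes):
    """prototypes: dict mapping label -> array of shape (n_points, n_dims), n_points >= 2."""
    best_label, best_dist = None, np.inf
    for label, pts in prototypes.items():
        for i, j in combinations(range(len(pts)), 2):
            dist = feature_line_distance(q, pts[i], pts[j])
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist
```

Because the line extends beyond its two defining points, a query can match a class even when it lies far from every stored prototype.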


Journal ArticleDOI
TL;DR: The linking field modulation term is shown to be a universal feature of any biologically grounded dendritic model and the PCNN image decomposition (factoring) model is described in new detail.
Abstract: Pulse coupled neural network (PCNN) models are described. The linking field modulation term is shown to be a universal feature of any biologically grounded dendritic model. Applications and implementations of PCNNs are reviewed. Application based variations and simplifications are summarized. The PCNN image decomposition (factoring) model is described in detail.

555 citations


Journal ArticleDOI
TL;DR: Successive overrelaxation for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points.
Abstract: Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm (1999) which handles two constraints at a time and Joachims' SVM$^{light}$ (1998) which handles a small number of points at a time, SOR can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 000 000 points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods. On smaller problems, SOR was faster than SVM$^{light}$ and comparable or faster than SMO.

443 citations


Journal ArticleDOI
TL;DR: This work proposes to evaluate different binary classification schemes (support vector machine, multilayer perceptron, C4.5 decision tree, Fisher's linear discriminant, Bayesian classifier) to carry out the fusion of experts for taking a final decision on identity authentication.
Abstract: Biometric person identity authentication is gaining more and more attention. The authentication task performed by an expert is a binary classification problem: reject or accept identity claim. Combining experts, each based on a different modality (speech, face, fingerprint, etc.), increases the performance and robustness of identity authentication systems. In this context, a key issue is the fusion of the different experts for taking a final decision (i.e., accept or reject identity claim). We propose to evaluate different binary classification schemes (support vector machine, multilayer perceptron, C4.5 decision tree, Fisher's linear discriminant, Bayesian classifier) to carry out the fusion. The experimental results show that support vector machines and the Bayesian classifier achieve almost the same performance, and both outperform the other evaluated classifiers.

383 citations


Journal ArticleDOI
TL;DR: Conditions for perfect image segmentation are derived and it is shown that addition of an inhibition receptive field to the neuron model increases the possibility of perfect segmentation.
Abstract: This paper describes a method for segmenting digital images using pulse coupled neural networks (PCNN). The pulse coupled neuron (PCN) model used in PCNN is a modification of the cortical neuron model of Eckhorn et al. (1990). A single layered laterally connected PCNN is capable of perfectly segmenting digital images even when there is a considerable overlap in the intensity ranges of adjacent regions. Conditions for perfect image segmentation are derived. It is also shown that addition of an inhibition receptive field to the neuron model increases the possibility of perfect segmentation. The inhibition input reduces the overlap of intensity ranges of adjacent regions by effectively compressing the intensity range of each region.

326 citations


Journal ArticleDOI
TL;DR: A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. Its core is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation.
Abstract: A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations. Lateral connections between oscillators encode harmonicity, and proximity in frequency and time. Prior to the oscillator network are a model of the auditory periphery and a stage in which mid-level auditory representations are formed. The model has been systematically evaluated using a corpus of voiced speech mixed with interfering sounds, and produces improvements in terms of signal-to-noise ratio for every mixture. A number of issues including biological plausibility and real-time implementation are also discussed.

Journal ArticleDOI
TL;DR: The recurrent property of the RSONFIN makes it suitable for dealing with temporal problems and no predetermination, like the number of hidden nodes, must be given, since the RSONFIN can find its optimal structure and parameters automatically and quickly.
Abstract: A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed. The RSONFIN is inherently a recurrent multilayered connectionist network for realizing the basic elements and functions of dynamic fuzzy inference, and may be considered to be constructed from a series of dynamic fuzzy rules. The temporal relations embedded in the network are built by adding some feedback connections representing the memory elements to a feedforward neural fuzzy network. Each weight as well as node in the RSONFIN has its own meaning and represents a special element in a fuzzy rule. There are no hidden nodes initially in the RSONFIN. They are created online via concurrent structure identification and parameter identification. The structure learning together with the parameter learning forms a fast learning algorithm for building a small, yet powerful, dynamic neural fuzzy network. Two major characteristics of the RSONFIN can thus be seen: 1) the recurrent property of the RSONFIN makes it suitable for dealing with temporal problems and 2) no predetermination, like the number of hidden nodes, must be given, since the RSONFIN can find its optimal structure and parameters automatically and quickly. Moreover, to reduce the number of fuzzy rules generated, a flexible input partition method, the aligned clustering-based algorithm, is proposed. Various simulations on temporal problems are done and performance comparisons with some existing recurrent networks are also made. Efficiency of the RSONFIN is verified from these results.

Journal ArticleDOI
TL;DR: This paper compares four different methods to preprocess the inputs and outputs of the River Nile flow time series, including a novel method proposed here based on the discrete Fourier series.
Abstract: Estimating the flows of rivers can have significant economic impact, as this can help in agricultural water management and in protection from water shortages and possible flood damage. The first goal of the paper is to apply neural networks to the problem of forecasting the flow of the River Nile in Egypt. The second goal of the paper is to utilize the time series as a benchmark to compare several neural-network forecasting methods. We compare four different methods to preprocess the inputs and outputs, including a novel method proposed here based on the discrete Fourier series. We also compare three different methods for the multistep ahead forecast problem: the direct method, the recursive method, and the recursive method trained using a backpropagation through time scheme. We also include a theoretical comparison between these three methods. The final comparison is between different methods to perform a longer horizon forecast, and that includes ways to partition the problem into several subproblems of forecasting K steps ahead.
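
The difference between the direct and recursive multistep methods named in the abstract above can be sketched with a simple stand-in model. The sketch below uses a least-squares linear predictor in place of a neural network, which is an assumption made to keep the example short; the function names are hypothetical, and the backpropagation-through-time variant is not shown.

```python
import numpy as np

def fit_linear(inputs, targets):
    """Least-squares linear predictor (stand-in for a neural network)."""
    coef, *_ = np.linalg.lstsq(inputs, targets, rcond=None)
    return coef

def make_windows(series, lag, horizon):
    """Pair each window of `lag` past values with the value `horizon` steps ahead."""
    n = len(series) - lag - horizon + 1
    x = np.array([series[i:i + lag] for i in range(n)])
    y = np.array([series[i + lag + horizon - 1] for i in range(n)])
    return x, y

def recursive_forecast(series, lag, steps):
    """Train one one-step model and feed its predictions back as inputs."""
    coef = fit_linear(*make_windows(series, lag, 1))
    window = list(series[-lag:])
    out = []
    for _ in range(steps):
        nxt = float(np.dot(coef, window[-lag:]))
        out.append(nxt)
        window.append(nxt)
    return out

def direct_forecast(series, lag, steps):
    """Train a separate model for each horizon k = 1, ..., steps."""
    out = []
    for k in range(1, steps + 1):
        coef = fit_linear(*make_windows(series, lag, k))
        out.append(float(np.dot(coef, series[-lag:])))
    return out
```

On a noiseless linear series the two methods agree; they can differ once the model is imperfect, because the recursive method compounds its own one-step errors.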

Journal ArticleDOI
TL;DR: Experiments involving a variety of reformulated RBF networks generated by linear and exponential generator functions indicate that gradient descent learning is simple, easily implementable, and produces RBF networks that perform considerably better than conventional RBF models trained by existing algorithms.
Abstract: This paper presents an axiomatic approach for constructing radial basis function (RBF) neural networks. This approach results in a broad variety of admissible RBF models, including those employing Gaussian RBFs. The form of the RBFs is determined by a generator function. New RBF models can be developed according to the proposed approach by selecting generator functions other than exponential ones, which lead to Gaussian RBFs. This paper also proposes a supervised learning algorithm based on gradient descent for training reformulated RBF neural networks constructed using the proposed approach. A sensitivity analysis of the proposed algorithm relates the properties of RBFs with the convergence of gradient descent learning. Experiments involving a variety of reformulated RBF networks generated by linear and exponential generator functions indicate that gradient descent learning is simple, easily implementable, and produces RBF networks that perform considerably better than conventional RBF models trained by existing algorithms.

Journal ArticleDOI
TL;DR: A type of recurrent neuro-fuzzy network is proposed in this paper to build long-term prediction models for nonlinear processes and it has the advantage that control actions can be calculated analytically avoiding the time consuming nonlinear programming procedures required in conventional nonlinear model-based predictive control.
Abstract: A type of recurrent neuro-fuzzy network is proposed in this paper to build long-term prediction models for nonlinear processes. The process operation is partitioned into several fuzzy operating regions. Within each region, a local linear model is used to model the process. The global model output is obtained through the centre of gravity defuzzification which is essentially the interpolation of local model outputs. This modeling strategy utilizes both process knowledge and process input/output data. Process knowledge is used to initially divide the process operation into several fuzzy operating regions and to set up the initial fuzzification layer weights. Process I/O data are used to train the network. Network weights are trained so that the long-term prediction errors are minimized. Through training, membership functions of fuzzy operating regions are refined and local models are learnt. Based on the recurrent neuro-fuzzy network model, a novel type of nonlinear model-based long range predictive controller can be developed and it consists of several local linear model-based predictive controllers. Local controllers are constructed based on the corresponding local linear models and their outputs are combined to form a global control action by using their membership functions. This control strategy has the advantage that control actions can be calculated analytically avoiding the time consuming nonlinear programming procedures required in conventional nonlinear model-based predictive control. The techniques have been successfully applied to the modeling and control of a neutralization process.
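
The centre of gravity defuzzification described in the abstract above amounts to a membership-weighted average of the local linear model outputs. The sketch below shows only that interpolation step for a scalar input; the Gaussian membership functions and all names are assumptions for illustration, and the recurrent structure and training are not shown.

```python
import numpy as np

def gaussian_membership(x, centers, widths):
    """Membership of scalar x in each fuzzy operating region (assumed Gaussian)."""
    return np.exp(-0.5 * ((x - centers) / widths) ** 2)

def cog_output(x, centers, widths, slopes, intercepts):
    """Global output = sum_i mu_i(x) * y_i(x) / sum_i mu_i(x),
    where y_i(x) = slopes[i] * x + intercepts[i] is the i-th local linear model."""
    mu = gaussian_membership(x, centers, widths)
    local = slopes * x + intercepts
    return float(np.sum(mu * local) / np.sum(mu))
```

Where the memberships of two regions are equal, the global output is the plain average of the two local model outputs; near a region centre, that region's local model dominates.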

Journal Article
TL;DR: This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms and observes that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them a valid alternative to RBF kernels.
Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel Stock Photo Collection and shown to far outperform traditional polynomial or Gaussian RBF kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels. Keywords: Support Vector Machines, Radial Basis Functions, Image Histogram, Image Classification, Corel

Journal ArticleDOI
TL;DR: In this article, the authors propose a method for decomposing pattern classification problems based on the class relations among training data, which can divide a K-class classification problem into a series of $\binom{K}{2}$ two-class problems.
Abstract: We propose a method for decomposing pattern classification problems based on the class relations among training data. By using this method, we can divide a K-class classification problem into a series of $\binom{K}{2}$ two-class problems. These two-class problems are to discriminate class $C_i$ from class $C_j$ for i=1, ..., K and j=i+1, while the existence of the training data belonging to the other K-2 classes is ignored. If the two-class problem of discriminating class $C_i$ from class $C_j$ is still hard to learn, we can further break it down into a set of two-class subproblems as small as we expect. Since each of the two-class problems can be treated as a completely separate classification problem with the proposed learning framework, all of the two-class problems can be learned in parallel. We also propose two module combination principles which give practical guidelines in integrating individual trained network modules. After learning of each of the two-class problems with a network module, we can easily integrate all of the trained modules into a min-max modular ($M^3$) network according to the module combination principles and obtain a solution to the original problem. Consequently, a large-scale and complex K-class classification problem can be solved effortlessly and efficiently by learning a series of smaller and simpler two-class problems in parallel.
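
The first decomposition step in the abstract above, splitting a K-class problem into $\binom{K}{2} = K(K-1)/2$ two-class problems, can be sketched as below. The generator name `pairwise_subproblems` and the 1/0 target encoding are assumptions for illustration; the further splitting of hard two-class problems and the min-max module combination are not shown.

```python
from itertools import combinations
import numpy as np

def pairwise_subproblems(x, labels):
    """Yield ((ci, cj), x_sub, y_sub) for every unordered pair of classes.

    x: array of shape (n_samples, n_features); labels: array of shape (n_samples,).
    Samples from the other K-2 classes are left out of each subproblem.
    """
    classes = sorted(set(labels.tolist()))
    for ci, cj in combinations(classes, 2):
        mask = (labels == ci) | (labels == cj)
        # Binary target: 1 for class ci, 0 for class cj.
        yield (ci, cj), x[mask], (labels[mask] == ci).astype(int)
```

Each yielded subproblem is independent of the others, so the corresponding modules could be trained in parallel.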

Journal ArticleDOI
TL;DR: A regularized orthogonal least squares (ROLS) algorithm is employed at the lower level to construct RBF networks while the two key learning parameters, the regularization parameter and the RBF width, are optimized using a genetic algorithm at the upper level.
Abstract: Presents a two-level learning method for radial basis function (RBF) networks. A regularized orthogonal least squares (ROLS) algorithm is employed at the lower level to construct RBF networks while the two key learning parameters, the regularization parameter and the RBF width, are optimized using a genetic algorithm (GA) at the upper level. Nonlinear time series modeling and prediction is used as an example to demonstrate the effectiveness of this hierarchical learning approach.

Journal ArticleDOI
TL;DR: A large class of simple pulse-coupled neural networks are obtained that can memorize and reproduce synchronized temporal patterns the same way a Hopfield network does with static patterns.
Abstract: We study pulse-coupled neural networks that satisfy only two assumptions: each isolated neuron fires periodically, and the neurons are weakly connected. Each such network can be transformed by a piece-wise continuous change of variables into a phase model, whose synchronization behavior and oscillatory associative properties are easier to analyze and understand. Using the phase model, we can predict whether a given pulse-coupled network has oscillatory associative memory, or what minimal adjustments should be made so that it can acquire memory. In the search for such minimal adjustments we obtain a large class of simple pulse-coupled neural networks that can memorize and reproduce synchronized temporal patterns the same way a Hopfield network does with static patterns. The learning occurs via modification of synaptic weights and/or synaptic transmission delays.

Journal ArticleDOI
TL;DR: This paper establishes two theorems for adaptive nonlinear identification and trajectory tracking: the first gives a bound for the identification error and the second establishes a bound for the tracking error.
Abstract: In this paper the adaptive nonlinear identification and trajectory tracking are discussed via dynamic neural networks. By means of a Lyapunov-like analysis we determine stability conditions for the identification error. Then we analyze the trajectory tracking error by a local optimal controller. An algebraic Riccati equation and a differential one are used for the identification and the tracking error analysis. As our main original contributions, we establish two theorems: the first one gives a bound for the identification error, and the second one establishes a bound for the tracking error. We illustrate the effectiveness of these results by two examples: the second-order relay system with multiple isolated equilibrium points and the chaotic system given by the Duffing equation.

Journal ArticleDOI
TL;DR: Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions, demonstrating the advantages of VC-based complexity control with finite samples.
Abstract: It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs to have some provisions for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC)-theory have been proposed by Vapnik. This paper describes application of VC-bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC-bounds can be rigorously applied, i.e., linear models and penalized linear models where the VC-dimension can be accurately estimated, and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.

Journal ArticleDOI
TL;DR: An experiment was conducted where neural networks compete for survival in an evolving population based on their ability to play checkers, and multilayer feedforward neural networks were used to evaluate alternative board positions and games were played using a minimax search strategy.
Abstract: An experiment was conducted where neural networks compete for survival in an evolving population based on their ability to play checkers. More specifically, multilayer feedforward neural networks were used to evaluate alternative board positions and games were played using a minimax search strategy. At each generation, the extant neural networks were paired in competitions and selection was used to eliminate those that performed poorly relative to other networks. Offspring neural networks were created from the survivors using random variation of all weights and bias terms. After a series of 250 generations, the best-evolved neural network was played against human opponents in a series of 90 games on an Internet website. The neural network was able to defeat two expert-level players and played to a draw against a master. The final rating of the neural network placed it in the "Class A" category using a standard rating system. Of particular importance in the design of the experiment was the fact that no features beyond the piece differential were given to the neural networks as a priori knowledge. The process of evolution was able to extract all of the additional information required to play at this level of competency. It accomplished this based almost solely on the feedback offered in the final aggregated outcome of each game played (i.e., win, lose, or draw). This procedure stands in marked contrast to the typical artifice of explicitly injecting expert knowledge into a game-playing program.

Journal ArticleDOI
TL;DR: A new gradient-based procedure called recursive backpropagation (RBP) is proposed whose online version, causal recursive backpropagation (CRBP), presents some advantages with respect to the other online training methods.
Abstract: This paper focuses on online learning procedures for locally recurrent neural nets with emphasis on multilayer perceptron (MLP) with infinite impulse response (IIR) synapses and its variations which include generalized output and activation feedback multilayer networks (MLN). We propose a new gradient-based procedure called recursive backpropagation (RBP) whose online version, causal recursive backpropagation (CRBP), has some advantages over other online methods. CRBP includes as particular cases backpropagation (BP), temporal BP, Back-Tsoi algorithm (1991) among others, thereby providing a unifying view on gradient calculation for recurrent nets with local feedback. The only learning method known for locally recurrent nets with no architectural restriction is the one by Back and Tsoi. The proposed algorithm has better stability and faster convergence with respect to the Back-Tsoi algorithm. The computational complexity of the CRBP is comparable with that of the Back-Tsoi algorithm, e.g., less than a factor of 1.5 for usual architectures and parameter settings. The superior performance of the new algorithm, however, easily justifies this small increase in computational burden. In addition, the general paradigms of truncated BPTT and RTRL are applied to networks with local feedback and compared with CRBP. CRBP exhibits similar performances and the detailed analysis of complexity reveals that CRBP is much simpler and easier to implement, e.g., CRBP is local in space and in time while RTRL is not local in space.

Journal ArticleDOI
TL;DR: The performance of the PNN when used in conjunction with these feature extraction and postprocessing schemes showed the potential of this neural-network-based cloud classification system.
Abstract: The problem of cloud data classification from satellite imagery using neural networks is considered. Several image transformations such as singular value decomposition (SVD) and wavelet packet (WP) were used to extract the salient spectral and textural features attributed to satellite cloud data in both visible and infrared (IR) channels. In addition, the well-known gray-level cooccurrence matrix (GLCM) method and spectral features were examined for the sake of comparison. Two different neural-network paradigms, namely probability neural network (PNN) and unsupervised Kohonen self-organized feature map (SOM), were examined and their performance was also benchmarked on the geostationary operational environmental satellite (GOES) 8 data. Additionally, a postprocessing scheme was developed which utilizes the contextual information in the satellite images to improve the final classification accuracy. Overall, the performance of the PNN when used in conjunction with these feature extraction and postprocessing schemes showed the potential of this neural-network-based cloud classification system.

Journal ArticleDOI
R. Eckhorn
TL;DR: This paper aims at relating signals of stimulus dependent synchronization and desynchronization, observed by the authors in the visual cortex of monkeys, with models of basic neural circuits explaining the measured signals and extending the former linking field model.
Abstract: Synchronization of neural activity has been proposed to code feature linking. This was supported by the discovery of synchronized neural activities in cat and monkey visual cortex, which occurred stimulus dependent, either oscillatory (30-100 Hz) or nonrhythmical, internally generated or stimulus dominated. The area in visual space covered by receptive fields of an actually synchronized assembly of neurons was termed the "linking field". The present paper aims at relating signals of stimulus dependent synchronization and desynchronization, observed by the authors in the visual cortex of monkeys, with models of basic neural circuits explaining the measured signals and extending the authors' former linking field model. The circuits include: (1) a model neuron with the capability of fast mutual spike linking and decoupling which does not degrade the receptive field properties; (2) linking connections for fast synchronization in neighboring assemblies driven by the same stimulus; (3) feedback inhibition in local assemblies via a common interneuron subserving synchronization, desynchronization, and suppression of uncorrelated signals; and (4) common-input connectivity among members of local and distant assemblies supporting zero-delay phase difference in distributed assemblies. Other recently observed cortical effects that potentially support scene segmentation are briefly reviewed to stimulate further ideas for models. Finally, the linking field hypothesis is critically discussed, including contradictory psychophysical work and new supportive neurophysiological evidence.

Journal ArticleDOI
TL;DR: This paper presents the first physiologically motivated pulse coupled neural network (PCNN)-based image fusion network for object detection, which exceeded the accuracy obtained by any individual filtering method or by logically ANDing the results of the individual object detection techniques.
Abstract: This paper presents the first physiologically motivated pulse coupled neural network (PCNN)-based image fusion network for object detection. Primate vision processing principles, such as expectation driven filtering, state dependent modulation, temporal synchronization, and multiple processing paths are applied to create a physiologically motivated image fusion network. PCNN are used to fuse the results of several object detection techniques to improve object detection accuracy. Image processing techniques (wavelets, morphological, etc.) are used to extract target features and PCNN are used to focus attention by segmenting and fusing the information. The object detection property of the resulting image fusion network is demonstrated on mammograms and forward-looking infrared radar (FLIR) images. The network removed 94% of the false detections without removing any true detections in the FLIR images and removed 46% of the false detections while removing only 7% of the true detections in the mammograms. The model exceeded the accuracy obtained by any individual filtering method or by logically ANDing the results of the individual object detection techniques.

Journal ArticleDOI
TL;DR: The results show that some of the new algorithms can greatly improve the learning rate of the neural-network control structure, and that for the considered experimental setup a neural- network controller can outperform linear controllers.
Abstract: Active control of sound and vibration has been the subject of extensive research, and examples of applications are now numerous. However, few practical implementations of nonlinear active controllers have been realized. Nonlinear active controllers may be required in cases where the actuators used in active control systems exhibit nonlinear characteristics, or in cases where the structure to be controlled exhibits nonlinear behavior. A multilayer perceptron neural-network based control structure was previously introduced as a nonlinear active controller, with a training algorithm based on an extended backpropagation scheme. This paper introduces new heuristic training algorithms for the same neural-network control structure. The objective is to develop new algorithms with faster convergence speed and/or lower computational loads. Experimental results of active sound control using a nonlinear actuator with linear and nonlinear controllers are presented. The results show that some of the new algorithms can greatly improve the learning rate of the neural-network control structure, and that for the considered experimental setup a neural-network controller can outperform linear controllers.

Journal ArticleDOI
TL;DR: A novel artificial neural-network decision tree algorithm (ANN-DT), which extracts binary decision trees from a trained neural network, and is shown to have significant benefits in certain cases when compared with the standard criteria of minimum weighted variance over the branches.
Abstract: Although artificial neural networks can represent a variety of complex systems with a high degree of accuracy, these connectionist models are difficult to interpret. This significantly limits the applicability of neural networks in practice, especially where a premium is placed on the comprehensibility or reliability of systems. A novel artificial neural-network decision tree algorithm (ANN-DT) is therefore proposed, which extracts binary decision trees from a trained neural network. The ANN-DT algorithm uses the neural network to generate outputs for samples interpolated from the training data set. In contrast to existing techniques, ANN-DT can extract rules from feedforward neural networks with continuous outputs. These rules are extracted from the neural network without making assumptions about the internal structure of the neural network or the features of the data. A novel attribute selection criterion based on a significance analysis of the variables on the neural-network output is examined. It is shown to have significant benefits in certain cases when compared with the standard criteria of minimum weighted variance over the branches. In three case studies the ANN-DT algorithm compared favorably with CART, a standard decision tree algorithm.

Journal ArticleDOI
TL;DR: A geometrical representation of the McCulloch-Pitts neural model is presented, giving a clear visual picture and interpretation of the model, and two interesting applications based on the interpretation are discussed.
Abstract: In this paper, a geometrical representation of the McCulloch-Pitts neural model (1943) is presented. From the representation, a clear visual picture and interpretation of the model can be seen. Two interesting applications based on the interpretation are discussed. They are 1) a new design principle of feedforward neural networks and 2) a new proof of mapping abilities of three-layer feedforward neural networks.

Journal ArticleDOI
TL;DR: The main result is that tedious manual adaptation of the temporal size of the receptive fields can be avoided by employing a novel method to adapt the corresponding time delay and related network structure parameters during the training process.
Abstract: We present an algorithm based on a time-delay neural network with spatio-temporal receptive fields and adaptable time delays for image sequence analysis. Our main result is that tedious manual adaptation of the temporal size of the receptive fields can be avoided by employing a method to adapt the corresponding time delay and related network structure parameters during the training process.