Showing papers in "IEEE Transactions on Neural Networks in 1999"

An overview of statistical learning theory

TL;DR: Using maximum entropy approximations of differential entropy, a family of new contrast (objective) functions for ICA is introduced that enables both the estimation of the whole decomposition by minimizing mutual information and the estimation of individual independent components as projection pursuit directions.


Abstract: Independent component analysis (ICA) is a statistical method for transforming an observed multidimensional random vector into components that are statistically as independent from each other as possible. We use a combination of two different approaches for linear ICA: Comon's information theoretic approach and the projection pursuit approach. Using maximum entropy approximations of differential entropy, we introduce a family of new contrast functions for ICA. These contrast functions enable both the estimation of the whole decomposition by minimizing mutual information, and estimation of individual independent components as projection pursuit directions. The statistical properties of the estimators based on such contrast functions are analyzed under the assumption of the linear mixture model, and it is shown how to choose contrast functions that are robust and/or of minimum variance. Finally, we introduce simple fixed-point algorithms for practical optimization of the contrast functions.


6,144 citations

Journal Article•DOI•

[...]

Vladimir Vapnik¹•Institutions (1)

AT&T Labs¹

Support vector machines for spam categorization

TL;DR: This overview demonstrates how abstract learning theory established conditions for generalization that are more general than those discussed in classical statistical paradigms, and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.


Abstract: Statistical learning theory was introduced in the late 1960's. Until the 1990's it was a purely theoretical analysis of the problem of function estimation from a given collection of data. In the middle of the 1990's new types of learning algorithms (called support vector machines) based on the developed theory were proposed. This made statistical learning theory not only a tool for the theoretical analysis but also a tool for creating practical algorithms for estimating multidimensional functions. This article presents a very general overview of statistical learning theory including both theoretical and algorithmic aspects of the theory. The goal of this overview is to demonstrate how the abstract learning theory established conditions for generalization which are more general than those discussed in classical statistical paradigms and how the understanding of these conditions inspired new algorithmic approaches to function estimation problems.


5,370 citations

Journal Article•DOI•

[...]

H. Drucker¹, Donghui Wu, Vladimir Vapnik•Institutions (1)

AT&T Labs¹

Support vector machines for histogram-based image classification

TL;DR: The use of support vector machines in classifying e-mail as spam or nonspam is studied by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. SVMs performed best when using binary features.


Abstract: We study the use of support vector machines (SVM) in classifying e-mail as spam or nonspam by comparing it to three other classification algorithms: Ripper, Rocchio, and boosting decision trees. These four algorithms were tested on two different data sets: one data set where the number of features was constrained to the 1000 best features and another data set where the dimensionality was over 7000. SVM performed best when using binary features. For both data sets, boosting trees and SVM had acceptable test performance in terms of accuracy and speed. However, SVM required significantly less training time.


1,536 citations

Journal Article•DOI•

[...]

Olivier Chapelle¹, Patrick Haffner², Vladimir Vapnik²•Institutions (2)

AT&T Labs¹, AT&T²

Input space versus feature space in kernel-based methods

TL;DR: It is observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.


Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that support vector machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel stock photo collection and shown to far outperform traditional polynomial or Gaussian radial basis function (RBF) kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVM to such an extent that it makes them, for this problem, a valid alternative to RBF kernels.
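The kernel family in this abstract can be written out directly. The following is a minimal sketch, assuming histograms are plain Python lists of non-negative bin values; the function name `heavy_tailed_rbf`, the default parameters, and the toy histograms are illustrative choices, not taken from the paper.

```python
import math

def heavy_tailed_rbf(x, y, rho=1.0, a=1.0, b=2.0):
    """K(x, y) = exp(-rho * sum_i |x_i^a - y_i^a|^b) for non-negative bins."""
    if len(x) != len(y):
        raise ValueError("histograms must have the same number of bins")
    total = sum(abs(xi ** a - yi ** a) ** b for xi, yi in zip(x, y))
    return math.exp(-rho * total)

# Two toy 4-bin histograms (hypothetical values for illustration only).
h1 = [0.1, 0.4, 0.3, 0.2]
h2 = [0.2, 0.3, 0.3, 0.2]

gaussian = heavy_tailed_rbf(h1, h2, a=1.0, b=2.0)   # a = 1, b = 2: Gaussian RBF
laplacian = heavy_tailed_rbf(h1, h2, a=1.0, b=1.0)  # a = 1, b = 1: Laplacian RBF
remapped = heavy_tailed_rbf(h1, h2, a=0.5, b=1.0)   # a < 1: bins are remapped first
```

Setting `a = 1` and `b = 2` recovers the ordinary Gaussian RBF kernel, so the heavy-tailed variants differ only in the two exponents.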


1,510 citations

Journal Article•DOI•

[...]

Bernhard Schölkopf, Sebastian Mika¹, C.J.C. Burges², P. Knirsch³, Klaus-Robert Müller, Gunnar Rätsch¹, Alexander J. Smola¹•Institutions (3)

Fraunhofer Institute for Open Communication Systems¹, Alcatel-Lucent², Max Planck Society³

Face recognition using the nearest feature line method

TL;DR: The geometry of feature space is reviewed, and the connection between feature space and input space is discussed by dealing with the question of how one can, given some vector in feature space, find a preimage in input space.


Abstract: This paper collects some ideas targeted at advancing our understanding of the feature spaces associated with support vector (SV) kernel functions. We first discuss the geometry of feature space. In particular, we review what is known about the shape of the image of input space under the feature space map, and how this influences the capacity of SV methods. Following this, we describe how the metric governing the intrinsic geometry of the mapped surface can be computed in terms of the kernel, using the example of the class of inhomogeneous polynomial kernels, which are often used in SV pattern recognition. We then discuss the connection between feature space and input space by dealing with the question of how one can, given some vector in feature space, find a preimage (exact or approximate) in input space. We describe algorithms to tackle this issue, and show their utility in two applications of kernel methods. First, we use it to reduce the computational complexity of SV decision functions; second, we combine it with the kernel PCA algorithm, thereby constructing a nonlinear statistical denoising technique which is shown to perform well on real-world data.


1,258 citations

Journal Article•DOI•

[...]

Stan Z. Li, Juwei Lu¹•Institutions (1)

Nanyang Technological University¹

PCNN models and applications

TL;DR: A novel classification method, called the nearest feature line (NFL), for face recognition, based on the nearest distance from the query feature point to each FL, which achieves the lowest error rate reported for the ORL face database.


Abstract: We propose a classification method, called the nearest feature line (NFL), for face recognition. Any two feature points of the same class (person) are generalized by the feature line (FL) passing through the two points. The derived FL can capture more variations of face images than the original points and thus expands the capacity of the available database. The classification is based on the nearest distance from the query feature point to each FL. With a combined face database, the NFL error rate is about 43.7-65.4% of that of the standard eigenface method. Moreover, the NFL achieves the lowest error rate reported to date for the ORL face database.
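The distance from a query point to a feature line is an orthogonal projection onto the line through two same-class points. Below is a minimal sketch of that geometry, assuming feature points are tuples of floats; the names `feature_line_distance` and `nearest_feature_line` and the toy data are hypothetical, and the eigenface feature extraction used in the paper is not reproduced.

```python
import math
from itertools import combinations

def feature_line_distance(query, p1, p2):
    """Distance from query to the (infinite) line passing through p1 and p2."""
    direction = [b - a for a, b in zip(p1, p2)]
    norm_sq = sum(d * d for d in direction)
    if norm_sq == 0:
        raise ValueError("feature line needs two distinct points")
    # Position parameter of the projection along the line (may lie outside [0, 1]).
    t = sum((q - a) * d for q, a, d in zip(query, p1, direction)) / norm_sq
    projection = [a + t * d for a, d in zip(p1, direction)]
    return math.dist(query, projection)

def nearest_feature_line(query, classes):
    """classes maps a label to its feature points; returns (label, distance)."""
    best_label, best_dist = None, math.inf
    for label, points in classes.items():
        for p1, p2 in combinations(points, 2):
            dist = feature_line_distance(query, p1, p2)
            if dist < best_dist:
                best_label, best_dist = label, dist
    return best_label, best_dist

# Toy 2-D example: the query lies far beyond the stored points of class "A"
# but close to the line they span, so the feature line still matches it.
toy_classes = {"A": [(0.0, 0.0), (2.0, 0.0)], "B": [(0.0, 5.0), (2.0, 5.0)]}
label, dist = nearest_feature_line((10.0, 1.0), toy_classes)
```

Because the line extends beyond the two stored points, a query can match a class by extrapolation, which is how the feature line captures more variation than the original points.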


555 citations

Journal Article•DOI•

[...]

John L. Johnson¹, Mary Lou Padgett¹•Institutions (1)

United States Department of the Army¹

Successive overrelaxation for support vector machines

TL;DR: The linking field modulation term is shown to be a universal feature of any biologically grounded dendritic model and the PCNN image decomposition (factoring) model is described in new detail.


Abstract: Pulse coupled neural network (PCNN) models are described. The linking field modulation term is shown to be a universal feature of any biologically grounded dendritic model. Applications and implementations of PCNNs are reviewed. Application based variations and simplifications are summarized. The PCNN image decomposition (factoring) model is described in detail.


555 citations

Journal Article•DOI•

[...]

Olvi L. Mangasarian¹, David R. Musicant¹•Institutions (1)

University of Wisconsin-Madison¹

Fusion of face and speech data for person identity verification

TL;DR: Successive overrelaxation for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points.


Abstract: Successive overrelaxation (SOR) for symmetric linear complementarity problems and quadratic programs is used to train a support vector machine (SVM) for discriminating between the elements of two massive datasets, each with millions of points. Because SOR handles one point at a time, similar to Platt's sequential minimal optimization (SMO) algorithm (1999) which handles two constraints at a time and Joachims' SVM$^{light}$ (1998) which handles a small number of points at a time, SOR can process very large datasets that need not reside in memory. The algorithm converges linearly to a solution. Encouraging numerical results are presented on datasets with up to 10 000 000 points. Such massive discrimination problems cannot be processed by conventional linear or quadratic programming methods, and to our knowledge have not been solved by other methods. On smaller problems, SOR was faster than SVM$^{light}$ and comparable to or faster than SMO.


443 citations

Journal Article•DOI•

[...]

S. Ben-Yacoub, Y. Abdeljaoued, E. Mayoraz

Perfect image segmentation using pulse coupled neural networks

TL;DR: This work evaluates different binary classification schemes (support vector machine, multilayer perceptron, C4.5 decision tree, Fisher's linear discriminant, Bayesian classifier) for fusing the outputs of several experts into a final decision on identity authentication.


Abstract: Biometric person identity authentication is gaining more and more attention. The authentication task performed by an expert is a binary classification problem: reject or accept identity claim. Combining experts, each based on a different modality (speech, face, fingerprint, etc.), increases the performance and robustness of identity authentication systems. In this context, a key issue is the fusion of the different experts for taking a final decision (i.e., accept or reject identity claim). We propose to evaluate different binary classification schemes (support vector machine, multilayer perceptron, C4.5 decision tree, Fisher's linear discriminant, Bayesian classifier) to carry out the fusion. The experimental results show that support vector machines and the Bayesian classifier achieve almost the same performance, and both outperform the other evaluated classifiers.


383 citations

Journal Article•DOI•

[...]

G. Kuntimad¹, Heggere S. Ranganath²•Institutions (2)

Aerojet Rocketdyne¹, University of Alabama in Huntsville²

Separation of speech from interfering sounds based on oscillatory correlation

TL;DR: Conditions for perfect image segmentation are derived and it is shown that addition of an inhibition receptive field to the neuron model increases the possibility of perfect segmentation.


Abstract: This paper describes a method for segmenting digital images using pulse coupled neural networks (PCNN). The pulse coupled neuron (PCN) model used in PCNN is a modification of the cortical neuron model of Eckhorn et al. (1990). A single layered laterally connected PCNN is capable of perfectly segmenting digital images even when there is a considerable overlap in the intensity ranges of adjacent regions. Conditions for perfect image segmentation are derived. It is also shown that addition of an inhibition receptive field to the neuron model increases the possibility of perfect segmentation. The inhibition input reduces the overlap of intensity ranges of adjacent regions by effectively compressing the intensity range of each region.


326 citations

Journal Article•DOI•

[...]

DeLiang Wang¹, Guy J. Brown²•Institutions (2)

Ohio State University¹, University of Sheffield²

A recurrent self-organizing neural fuzzy inference network

TL;DR: A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. Its core is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation.


Abstract: A multistage neural model is proposed for an auditory scene analysis task: segregating speech from interfering sound sources. The core of the model is a two-layer oscillator network that performs stream segregation on the basis of oscillatory correlation. In the oscillatory correlation framework, a stream is represented by a population of synchronized relaxation oscillators, each of which corresponds to an auditory feature, and different streams are represented by desynchronized oscillator populations. Lateral connections between oscillators encode harmonicity, and proximity in frequency and time. Prior to the oscillator network are a model of the auditory periphery and a stage in which mid-level auditory representations are formed. The model has been systematically evaluated using a corpus of voiced speech mixed with interfering sounds, and produces improvements in terms of signal-to-noise ratio for every mixture. A number of issues including biological plausibility and real-time implementation are also discussed.


Journal Article•DOI•

[...]

Chia-Feng Juang¹, Chin-Teng Lin¹•Institutions (1)

National Chiao Tung University¹

01 Jul 1999-IEEE Transactions on Neural Networks

TL;DR: The recurrent property of the RSONFIN makes it suitable for dealing with temporal problems, and no predetermination, such as the number of hidden nodes, is required, since the RSONFIN can find its optimal structure and parameters automatically and quickly.


Abstract: A recurrent self-organizing neural fuzzy inference network (RSONFIN) is proposed. The RSONFIN is inherently a recurrent multilayered connectionist network for realizing the basic elements and functions of dynamic fuzzy inference, and may be considered to be constructed from a series of dynamic fuzzy rules. The temporal relations embedded in the network are built by adding some feedback connections representing the memory elements to a feedforward neural fuzzy network. Each weight as well as node in the RSONFIN has its own meaning and represents a special element in a fuzzy rule. There are no hidden nodes initially in the RSONFIN. They are created online via concurrent structure identification and parameter identification. The structure learning together with the parameter learning forms a fast learning algorithm for building a small, yet powerful, dynamic neural fuzzy network. Two major characteristics of the RSONFIN can thus be seen: 1) the recurrent property of the RSONFIN makes it suitable for dealing with temporal problems and 2) no predetermination, like the number of hidden nodes, must be given, since the RSONFIN can find its optimal structure and parameters automatically and quickly. Moreover, to reduce the number of fuzzy rules generated, a flexible input partition method, the aligned clustering-based algorithm, is proposed. Various simulations on temporal problems are done and performance comparisons with some existing recurrent networks are also made. Efficiency of the RSONFIN is verified from these results.


Journal Article•DOI•

A comparison between neural-network forecasting techniques-case study: river flow forecasting

[...]

Amir F. Atiya¹, S.M. El-Shoura, Samir I. Shaheen², M.S. El-Sherif•Institutions (2)

California Institute of Technology¹, Cairo University²

Reformulated radial basis neural networks trained by gradient descent

TL;DR: This paper compares four different methods to preprocess the inputs and outputs of the River Nile flow time series, including a novel method proposed here based on the discrete Fourier series.


Abstract: Estimating the flows of rivers can have significant economic impact, as this can help in agricultural water management and in protection from water shortages and possible flood damage. The first goal of the paper is to apply neural networks to the problem of forecasting the flow of the River Nile in Egypt. The second goal of the paper is to utilize the time series as a benchmark to compare several neural-network forecasting methods. We compare four different methods to preprocess the inputs and outputs, including a novel method proposed here based on discrete Fourier series. We also compare three different methods for the multistep ahead forecast problem: the direct method, the recursive method, and the recursive method trained using a backpropagation through time scheme. We also include a theoretical comparison between these three methods. The final comparison is between different methods to perform a longer horizon forecast, and that includes ways to partition the problem into several subproblems of forecasting K steps ahead.
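The difference between the direct and recursive multistep methods can be illustrated on a toy series. This is a minimal sketch that swaps in a one-parameter linear model for the neural networks used in the paper and omits the backpropagation-through-time variant; the helper names and the geometric toy series are assumptions made for illustration.

```python
def fit_scale(inputs, targets):
    """Least-squares slope c such that targets ~ c * inputs (no intercept)."""
    numerator = sum(x * y for x, y in zip(inputs, targets))
    denominator = sum(x * x for x in inputs)
    return numerator / denominator

def recursive_forecast(one_step_coef, last_value, horizon):
    """Apply a one-step-ahead model repeatedly, feeding predictions back in."""
    value = last_value
    for _ in range(horizon):
        value = one_step_coef * value
    return value

def direct_forecast(series, last_value, horizon):
    """Fit a separate model that maps y[t] straight to y[t + horizon]."""
    coef = fit_scale(series[:-horizon], series[horizon:])
    return coef * last_value

# Toy noiseless series y[t] = 0.5 ** t (hypothetical data, not river flow).
series = [0.5 ** t for t in range(12)]
one_step = fit_scale(series[:-1], series[1:])
recursive_3 = recursive_forecast(one_step, series[-1], horizon=3)
direct_3 = direct_forecast(series, series[-1], horizon=3)
```

On this noiseless linear series the two forecasts coincide; with noisy data and nonlinear models they generally differ, because the recursive method compounds one-step errors while the direct method needs one model per horizon.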


Journal Article•DOI•

[...]

Nicolaos B. Karayiannis¹•Institutions (1)

University of Houston¹

Recurrent neuro-fuzzy networks for nonlinear process modeling

TL;DR: Experiments involving a variety of reformulated RBF networks generated by linear and exponential generator functions indicate that gradient descent learning is simple, easily implementable, and produces RBF networks that perform considerably better than conventional RBF models trained by existing algorithms.


Abstract: This paper presents an axiomatic approach for constructing radial basis function (RBF) neural networks. This approach results in a broad variety of admissible RBF models, including those employing Gaussian RBFs. The form of the RBFs is determined by a generator function. New RBF models can be developed according to the proposed approach by selecting generator functions other than exponential ones, which lead to Gaussian RBFs. This paper also proposes a supervised learning algorithm based on gradient descent for training reformulated RBF neural networks constructed using the proposed approach. A sensitivity analysis of the proposed algorithm relates the properties of RBFs with the convergence of gradient descent learning. Experiments involving a variety of reformulated RBF networks generated by linear and exponential generator functions indicate that gradient descent learning is simple, easily implementable, and produces RBF networks that perform considerably better than conventional RBF models trained by existing algorithms.


Journal Article•DOI•

[...]

Jie Zhang¹, A.J. Morris²•Institutions (2)

Universities UK¹, University of Newcastle²

SVMs for Histogram Based Image Classification

TL;DR: A type of recurrent neuro-fuzzy network is proposed in this paper to build long-term prediction models for nonlinear processes and it has the advantage that control actions can be calculated analytically avoiding the time consuming nonlinear programming procedures required in conventional nonlinear model-based predictive control.


Abstract: A type of recurrent neuro-fuzzy network is proposed in this paper to build long-term prediction models for nonlinear processes. The process operation is partitioned into several fuzzy operating regions. Within each region, a local linear model is used to model the process. The global model output is obtained through the centre of gravity defuzzification which is essentially the interpolation of local model outputs. This modeling strategy utilizes both process knowledge and process input/output data. Process knowledge is used to initially divide the process operation into several fuzzy operating regions and to set up the initial fuzzification layer weights. Process I/O data are used to train the network. Network weights are trained so that the long-term prediction errors are minimized. Through training, membership functions of fuzzy operating regions are refined and local models are learnt. Based on the recurrent neuro-fuzzy network model, a novel type of nonlinear model-based long range predictive controller can be developed and it consists of several local linear model-based predictive controllers. Local controllers are constructed based on the corresponding local linear models and their outputs are combined to form a global control action by using their membership functions. This control strategy has the advantage that control actions can be calculated analytically, avoiding the time-consuming nonlinear programming procedures required in conventional nonlinear model-based predictive control. The techniques have been successfully applied to the modeling and control of a neutralization process.


Journal Article•

[...]

Olivier Chapelle¹, Patrick Haffner², Vapnik²•Institutions (2)

Max Planck Society¹, AT&T²

01 Jan 1999-IEEE Transactions on Neural Networks

TL;DR: This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms, and observes that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them a valid alternative to RBF kernels.


Abstract: Traditional classification approaches generalize poorly on image classification tasks, because of the high dimensionality of the feature space. This paper shows that Support Vector Machines (SVM) can generalize well on difficult image classification problems where the only features are high dimensional histograms. Heavy-tailed RBF kernels of the form $K(x, y) = e^{-\rho \sum_i |x_i^a - y_i^a|^b}$ with $a \le 1$ and $b \le 2$ are evaluated on the classification of images extracted from the Corel Stock Photo Collection and shown to far outperform traditional polynomial or Gaussian RBF kernels. Moreover, we observed that a simple remapping of the input $x_i \to x_i^a$ improves the performance of linear SVMs to such an extent that it makes them, for this problem, a valid alternative to RBF kernels. Keywords: Support Vector Machines, Radial Basis Functions, Image Histogram, Image Classification, Corel


Journal Article•DOI•

Task decomposition and module combination based on class relations: a modular neural network for pattern classification

[...]

Bao-Liang Lu, M. Ito

Combined genetic algorithm optimization and regularized orthogonal least squares learning for radial basis function networks

TL;DR: In this article, the authors propose a method for decomposing pattern classification problems based on the class relations among training data, which can divide a K-class classification problem into a series of $\binom{K}{2}$ two-class problems.


Abstract: We propose a method for decomposing pattern classification problems based on the class relations among training data. By using this method, we can divide a K-class classification problem into a series of $\binom{K}{2}$ two-class problems. These two-class problems are to discriminate class $C_i$ from class $C_j$ for $i = 1, \ldots, K$ and $j = i+1$, while the existence of the training data belonging to the other $K-2$ classes is ignored. If the two-class problem of discriminating class $C_i$ from class $C_j$ is still hard to learn, we can further break it down into a set of two-class subproblems as small as we expect. Since each of the two-class problems can be treated as a completely separate classification problem with the proposed learning framework, all of the two-class problems can be learned in parallel. We also propose two module combination principles which give practical guidelines in integrating individual trained network modules. After learning each of the two-class problems with a network module, we can easily integrate all of the trained modules into a min-max modular ($M^3$) network according to the module combination principles and obtain a solution to the original problem. Consequently, a large-scale and complex K-class classification problem can be solved effortlessly and efficiently by learning a series of smaller and simpler two-class problems in parallel.
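The first decomposition step, splitting a K-class problem into one two-class problem per pair of classes, can be sketched as follows. This is a minimal illustration, assuming training data is a list of `(features, label)` tuples; the function name `pairwise_problems` and the toy data are hypothetical, and the further subdivision and the min-max module combination described in the abstract are not implemented.

```python
from itertools import combinations

def pairwise_problems(samples):
    """Split labeled samples into one two-class training set per class pair.

    Samples from the other K-2 classes are left out of each subset.
    """
    labels = sorted({label for _, label in samples})
    problems = {}
    for ci, cj in combinations(labels, 2):
        problems[(ci, cj)] = [(x, y) for x, y in samples if y in (ci, cj)]
    return problems

# Toy 4-class data set (hypothetical values): expect 4 * 3 / 2 = 6 subproblems.
toy_samples = [
    ((0.0, 0.1), "a"), ((0.2, 0.0), "a"),
    ((1.0, 1.1), "b"), ((0.9, 1.0), "b"),
    ((2.0, 0.1), "c"),
    ((3.0, 3.0), "d"),
]
subproblems = pairwise_problems(toy_samples)
```

Each entry of `subproblems` is independent of the others, which is what allows the two-class modules to be trained in parallel.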


Journal Article•DOI•

[...]

Sheng Chen¹, Y. Wu, B.L. Luk•Institutions (1)

University of Southampton¹

Weakly pulse-coupled oscillators, FM interactions, synchronization, and oscillatory associative memory

TL;DR: A regularized orthogonal least squares (ROLS) algorithm is employed at the lower level to construct RBF networks while the two key learning parameters, the regularization parameter and the RBF width, are optimized using a genetic algorithm at the upper level.


Abstract: This paper presents a two-level learning method for radial basis function (RBF) networks. A regularized orthogonal least squares (ROLS) algorithm is employed at the lower level to construct RBF networks while the two key learning parameters, the regularization parameter and the RBF width, are optimized using a genetic algorithm (GA) at the upper level. Nonlinear time series modeling and prediction is used as an example to demonstrate the effectiveness of this hierarchical learning approach.


Journal Article•DOI•

[...]

Eugene M. Izhikevich¹•Institutions (1)

Arizona State University¹

Nonlinear adaptive trajectory tracking using dynamic neural networks

TL;DR: A large class of simple pulse-coupled neural networks are obtained that can memorize and reproduce synchronized temporal patterns the same way a Hopfield network does with static patterns.


Abstract: We study pulse-coupled neural networks that satisfy only two assumptions: each isolated neuron fires periodically, and the neurons are weakly connected. Each such network can be transformed by a piecewise continuous change of variables into a phase model, whose synchronization behavior and oscillatory associative properties are easier to analyze and understand. Using the phase model, we can predict whether a given pulse-coupled network has oscillatory associative memory, or what minimal adjustments should be made so that it can acquire memory. In the search for such minimal adjustments we obtain a large class of simple pulse-coupled neural networks that can memorize and reproduce synchronized temporal patterns the same way a Hopfield network does with static patterns. The learning occurs via modification of synaptic weights and/or synaptic transmission delays.


Journal Article•DOI•

[...]

Alexander S. Poznyak¹, Wen Yu, E.N. Sanchez¹, Jose P. Perez²•Institutions (2)

CINVESTAV¹, Universidad Autónoma de Nuevo León²

Model complexity control for regression using VC generalization bounds

TL;DR: This paper establishes two theorems for adaptive nonlinear identification and trajectory tracking: the first gives a bound for the identification error, and the second establishes a bound for the tracking error.


Abstract: In this paper, adaptive nonlinear identification and trajectory tracking are discussed via dynamic neural networks. By means of a Lyapunov-like analysis we determine stability conditions for the identification error. Then we analyze the trajectory tracking error by a local optimal controller. An algebraic Riccati equation and a differential one are used for the identification and the tracking error analysis. As our main original contributions, we establish two theorems: the first one gives a bound for the identification error, and the second one establishes a bound for the tracking error. We illustrate the effectiveness of these results by two examples: the second-order relay system with multiple isolated equilibrium points and the chaotic system given by the Duffing equation.


Journal Article•DOI•

[...]

Vladimir Cherkassky¹, Xuhui Shao, Filip M. Mulier, Vladimir Vapnik²•Institutions (2)

University of Minnesota¹, AT&T²

Evolving neural networks to play checkers without relying on expert knowledge

TL;DR: Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions, demonstrating the advantages of VC-based complexity control with finite samples.


Abstract: It is well known that for a given sample size there exists a model of optimal complexity corresponding to the smallest prediction (generalization) error. Hence, any method for learning from finite samples needs to have some provisions for complexity control. Existing implementations of complexity control include penalization (or regularization), weight decay (in neural networks), and various greedy procedures (aka constructive, growing, or pruning methods). There are numerous proposals for determining optimal model complexity (aka model selection) based on various (asymptotic) analytic estimates of the prediction risk and on resampling approaches. Nonasymptotic bounds on the prediction risk based on Vapnik-Chervonenkis (VC)-theory have been proposed by Vapnik. This paper describes application of VC-bounds to regression problems with the usual squared loss. An empirical study is performed for settings where the VC-bounds can be rigorously applied, i.e., linear models and penalized linear models where the VC-dimension can be accurately estimated, and the empirical risk can be reliably minimized. Empirical comparisons between model selection using VC-bounds and classical methods are performed for various noise levels, sample size, target functions and types of approximating functions. Our results demonstrate the advantages of VC-based complexity control with finite samples.


Journal Article•DOI•

[...]

Kumar Chellapilla¹, David B. Fogel•Institutions (1)

University of California, San Diego¹

On-line learning algorithms for locally recurrent neural networks

TL;DR: An experiment was conducted where neural networks compete for survival in an evolving population based on their ability to play checkers, and multilayer feedforward neural networks were used to evaluate alternative board positions and games were played using a minimax search strategy.


Abstract: An experiment was conducted where neural networks compete for survival in an evolving population based on their ability to play checkers. More specifically, multilayer feedforward neural networks were used to evaluate alternative board positions and games were played using a minimax search strategy. At each generation, the extant neural networks were paired in competitions and selection was used to eliminate those that performed poorly relative to other networks. Offspring neural networks were created from the survivors using random variation of all weights and bias terms. After a series of 250 generations, the best-evolved neural network was played against human opponents in a series of 90 games on an Internet website. The neural network was able to defeat two expert-level players and played to a draw against a master. The final rating of the neural network placed it in the "Class A" category using a standard rating system. Of particular importance in the design of the experiment was the fact that no features beyond the piece differential were given to the neural networks as a priori knowledge. The process of evolution was able to extract all of the additional information required to play at this level of competency. It accomplished this based almost solely on the feedback offered in the final aggregated outcome of each game played (i.e., win, lose, or draw). This procedure stands in marked contrast to the typical artifice of explicitly injecting expert knowledge into a game-playing program.
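The search component mentioned in this abstract, a minimax search whose leaves are scored by an evaluation function, can be sketched generically. This is a minimal illustration on a toy game tree, not a checkers engine: the evaluation function below is a trivial stand-in for the evolved neural network, and all names and the tree itself are hypothetical.

```python
def minimax(node, depth, maximizing, evaluate, children):
    """Depth-limited minimax: score leaves (or depth-0 nodes) with evaluate()."""
    successors = children(node)
    if depth == 0 or not successors:
        return evaluate(node)
    scores = [
        minimax(child, depth - 1, not maximizing, evaluate, children)
        for child in successors
    ]
    return max(scores) if maximizing else min(scores)

def toy_children(node):
    """Toy game tree: inner nodes are lists, leaves are numbers."""
    return node if isinstance(node, list) else []

def toy_evaluate(node):
    """Stand-in for a learned board evaluator: leaves carry their own score."""
    return float(node) if not isinstance(node, list) else 0.0

toy_tree = [[3, 5], [2, 9]]
value_for_max = minimax(toy_tree, 2, True, toy_evaluate, toy_children)
value_for_min = minimax(toy_tree, 2, False, toy_evaluate, toy_children)
```

In the setting the abstract describes, `evaluate` would be the network that scores a board position, and evolution would act on its weights rather than on the search procedure.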

Journal Article•DOI•

P. Campolucci, Aurelio Uncini, Francesco Piazza, Bhaskar D. Rao¹•Institutions (1)

University of California, San Diego¹

A study of cloud classification with neural networks using spectral and textural features

TL;DR: A new gradient-based procedure called recursive backpropagation (RBP) is proposed whose on-line version, causal recursive backpropagation (CRBP), presents some advantages with respect to the other on-line training methods.

Abstract: This paper focuses on online learning procedures for locally recurrent neural nets with emphasis on multilayer perceptron (MLP) with infinite impulse response (IIR) synapses and its variations which include generalized output and activation feedback multilayer networks (MLN). We propose a new gradient-based procedure called recursive backpropagation (RBP) whose online version, causal recursive backpropagation (CRBP), has some advantages over other online methods. CRBP includes as particular cases backpropagation (BP), temporal BP, and the Back-Tsoi algorithm (1991), among others, thereby providing a unifying view on gradient calculation for recurrent nets with local feedback. The only learning method known for locally recurrent nets with no architectural restriction is the one by Back and Tsoi. The proposed algorithm has better stability and faster convergence with respect to the Back-Tsoi algorithm. The computational complexity of the CRBP is comparable with that of the Back-Tsoi algorithm, e.g., less than a factor of 1.5 for usual architectures and parameter settings. The superior performance of the new algorithm, however, easily justifies this small increase in computational burden. In addition, the general paradigms of truncated BPTT and RTRL are applied to networks with local feedback and compared with CRBP. CRBP exhibits similar performance, and the detailed analysis of complexity reveals that CRBP is much simpler and easier to implement, e.g., CRBP is local in space and in time while RTRL is not local in space.

Journal Article•DOI•

Bin Tian¹, M.A. Shaikh, Mahmood R. Azimi-Sadjadi, T.H.V. Haar, Donald L. Reinke•Institutions (1)

Colorado State University¹

01 Jan 1999-IEEE Transactions on Neural Networks

TL;DR: The performance of the PNN when used in conjunction with these feature extraction and postprocessing schemes showed the potential of this neural-network-based cloud classification system.

Abstract: The problem of cloud data classification from satellite imagery using neural networks is considered. Several image transformations such as singular value decomposition (SVD) and wavelet packet (WP) were used to extract the salient spectral and textural features attributed to satellite cloud data in both visible and infrared (IR) channels. In addition, the well-known gray-level cooccurrence matrix (GLCM) method and spectral features were examined for the sake of comparison. Two different neural-network paradigms, namely the probability neural network (PNN) and the unsupervised Kohonen self-organized feature map (SOM), were examined and their performance was also benchmarked on the geostationary operational environmental satellite (GOES) 8 data. Additionally, a postprocessing scheme was developed which utilizes the contextual information in the satellite images to improve the final classification accuracy. Overall, the performance of the PNN when used in conjunction with these feature extraction and postprocessing schemes showed the potential of this neural-network-based cloud classification system.

Journal Article•DOI•

Neural mechanisms of scene segmentation: recordings from the visual cortex suggest basic circuits for linking field models

R. Eckhorn¹•Institutions (1)

University of Marburg¹

01 Jan 1999-IEEE Transactions on Neural Networks

TL;DR: This paper aims at relating signals of stimulus dependent synchronization and desynchronization, observed by us in the visual cortex of monkeys, with models of basic neural circuits explaining the measured signals and extending the former linking field model.

Abstract: Synchronization of neural activity has been proposed to code feature linking. This was supported by the discovery of synchronized neural activities in cat and monkey visual cortex, which occurred stimulus dependent, either oscillatory (30-100 Hz) or nonrhythmical, internally generated or stimulus dominated. The area in visual space covered by receptive fields of an actually synchronized assembly of neurons was termed the "linking field". The present paper aims at relating signals of stimulus dependent synchronization and desynchronization, observed by the authors in the visual cortex of monkeys, with models of basic neural circuits explaining the measured signals and extending the authors' former linking field model. The circuits include: (1) a model neuron with the capability of fast mutual spike linking and decoupling which does not degrade the receptive field properties; (2) linking connections for fast synchronization in neighboring assemblies driven by the same stimulus; (3) feedback inhibition in local assemblies via a common interneuron subserving synchronization, desynchronization, and suppression of uncorrelated signals; and (4) common-input connectivity among members of local and distant assemblies supporting zero-delay phase difference in distributed assemblies. Other recently observed cortical effects that potentially support scene segmentation are briefly reviewed to stimulate further ideas for models. Finally, the linking field hypothesis is critically discussed, including contradictory psychophysical work and new supportive neurophysiological evidence.

Journal Article•DOI•

Physiologically motivated image fusion for object detection using a pulse coupled neural network

R.P. Broussard, Steven K. Rogers¹, Mark E. Oxley², G.L. Tarr³•Institutions (3)

Battelle Memorial Institute¹, Air Force Institute of Technology², Air Force Research Laboratory³

Improved training of neural networks for the nonlinear active control of sound and vibration

TL;DR: This paper presents the first physiologically motivated pulse coupled neural network (PCNN)-based image fusion network for object detection, which exceeded the accuracy obtained by any individual filtering method or by logically ANDing the results of the individual object detection techniques.

Abstract: This paper presents the first physiologically motivated pulse coupled neural network (PCNN)-based image fusion network for object detection. Primate vision processing principles, such as expectation driven filtering, state dependent modulation, temporal synchronization, and multiple processing paths are applied to create a physiologically motivated image fusion network. PCNN are used to fuse the results of several object detection techniques to improve object detection accuracy. Image processing techniques (wavelets, morphological, etc.) are used to extract target features and PCNN are used to focus attention by segmenting and fusing the information. The object detection property of the resulting image fusion network is demonstrated on mammograms and forward-looking infrared radar (FLIR) images. The network removed 94% of the false detections without removing any true detections in the FLIR images and removed 46% of the false detections while removing only 7% of the true detections in the mammograms. The model exceeded the accuracy obtained by any individual filtering method or by logically ANDing the results of the individual object detection techniques.

Journal Article•DOI•

Martin Bouchard¹, B. Paillard², Chon Tan Le Dinh²•Institutions (2)

Ottawa University¹, Université de Sherbrooke²

ANN-DT: an algorithm for extraction of decision trees from artificial neural networks

TL;DR: The results show that some of the new algorithms can greatly improve the learning rate of the neural-network control structure, and that for the considered experimental setup a neural- network controller can outperform linear controllers.

Abstract: Active control of sound and vibration has been the subject of a lot of research, and examples of applications are now numerous. However, few practical implementations of nonlinear active controllers have been realized. Nonlinear active controllers may be required in cases where the actuators used in active control systems exhibit nonlinear characteristics, or in cases when the structure to be controlled exhibits a nonlinear behavior. A multilayer perceptron neural-network based control structure was previously introduced as a nonlinear active controller, with a training algorithm based on an extended backpropagation scheme. This paper introduces new heuristical training algorithms for the same neural-network control structure. The objective is to develop new algorithms with faster convergence speed and/or lower computational loads. Experimental results of active sound control using a nonlinear actuator with linear and nonlinear controllers are presented. The results show that some of the new algorithms can greatly improve the learning rate of the neural-network control structure, and that for the considered experimental setup a neural-network controller can outperform linear controllers.

Journal Article•DOI•

G.P.J. Schmitz¹, Chris Aldrich, F.S. Gouws•Institutions (1)

Stellenbosch University¹

A geometrical representation of McCulloch-Pitts neural model and its applications

TL;DR: A novel artificial neural-network decision tree algorithm (ANN-DT), which extracts binary decision trees from a trained neural network, and is shown to have significant benefits in certain cases when compared with the standard criteria of minimum weighted variance over the branches.

Abstract: Although artificial neural networks can represent a variety of complex systems with a high degree of accuracy, these connectionist models are difficult to interpret. This significantly limits the applicability of neural networks in practice, especially where a premium is placed on the comprehensibility or reliability of systems. A novel artificial neural-network decision tree algorithm (ANN-DT) is therefore proposed, which extracts binary decision trees from a trained neural network. The ANN-DT algorithm uses the neural network to generate outputs for samples interpolated from the training data set. In contrast to existing techniques, ANN-DT can extract rules from feedforward neural networks with continuous outputs. These rules are extracted from the neural network without making assumptions about the internal structure of the neural network or the features of the data. A novel attribute selection criterion based on a significance analysis of the variables on the neural-network output is examined. It is shown to have significant benefits in certain cases when compared with the standard criteria of minimum weighted variance over the branches. In three case studies the ANN-DT algorithm compared favorably with CART, a standard decision tree algorithm.
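Two ingredients named in the abstract, querying the network on samples interpolated from the training data and choosing a split by minimum weighted variance over the branches, can be sketched for a single input variable. This is an illustration under stated assumptions, not the ANN-DT algorithm itself: `network` is a hypothetical stand-in for a trained model, midpoint interpolation between neighbouring training inputs is an assumed sampling scheme, and `variance` and `best_split` are names introduced here.

```python
def variance(values):
    """Population variance; 0.0 for an empty list."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def best_split(xs, ys):
    """Return (threshold, cost) minimizing the size-weighted variance of ys over the two branches."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    best_threshold, best_cost = None, float("inf")
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal inputs
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = (len(left) * variance(left) + len(right) * variance(right)) / n
        if cost < best_cost:
            best_threshold = (pairs[i - 1][0] + pairs[i][0]) / 2.0
            best_cost = cost
    return best_threshold, best_cost

def network(x):
    """Hypothetical trained network, reduced to a simple function of one input."""
    return 0.0 if x < 0.5 else 1.0

# Training inputs plus midpoints between neighbours (assumed interpolation scheme).
train_xs = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
midpoints = [(a + b) / 2.0 for a, b in zip(train_xs, train_xs[1:])]
samples = train_xs + midpoints

# Label every sample with the network's output, then pick the root split.
threshold, cost = best_split(samples, [network(x) for x in samples])
```

A full tree would apply the same split search recursively to each branch. The attribute selection criterion proposed in the paper is a different, significance-based measure that this sketch does not attempt to reproduce.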

Journal Article•DOI•

Ling Zhang¹, Bo Zhang²•Institutions (2)

Anhui University¹, Tsinghua University²

01 Jul 1999-IEEE Transactions on Neural Networks

TL;DR: A geometrical representation of the McCulloch-Pitts neural model is presented, giving a clear visual picture and interpretation of the model, and two applications based on the interpretation are discussed.

Abstract: In this paper, a geometrical representation of the McCulloch-Pitts neural model (1943) is presented. From the representation, a clear visual picture and interpretation of the model can be seen. Two interesting applications based on the interpretation are discussed. They are 1) a new design principle of feedforward neural networks and 2) a new proof of mapping abilities of three-layer feedforward neural networks.
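For readers unfamiliar with the model, the following minimal sketch shows the usual textbook form of a McCulloch-Pitts unit as a threshold on a weighted sum, so that the boundary between firing and not firing is a hyperplane in input space. It is not the paper's geometrical construction, and the function name `mp_neuron` and the example weights are assumptions made for illustration.

```python
def mp_neuron(weights, threshold, inputs):
    """Fire (1) when the weighted input sum reaches the threshold, otherwise stay silent (0)."""
    activation = sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation >= threshold else 0

# Logical AND: the line x1 + x2 = 1.5 separates (1, 1) from the other three corners.
and_outputs = [mp_neuron([1, 1], 1.5, (a, b)) for a in (0, 1) for b in (0, 1)]
```

Lowering the threshold to 0.5 with the same weights moves the separating line so that the unit computes logical OR instead.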

Journal Article•DOI•

An adaptable time-delay neural-network algorithm for image sequence analysis

C. Wohler¹, J.K. Anlauf²•Institutions (2)

Daimler AG¹, University of Bonn²