
Showing papers on "Artificial neural network published in 2003"


Journal ArticleDOI
TL;DR: As with previous analyses of effective connectivity, the focus is on experimentally induced changes in coupling, but unlike previous approaches in neuroimaging, the causal model ascribes responses to designed deterministic inputs, as opposed to treating inputs as unknown and stochastic.

4,182 citations


Journal ArticleDOI
TL;DR: A model is presented that reproduces spiking and bursting behavior of known types of cortical neurons and combines the biological plausibility of Hodgkin-Huxley-type dynamics with the computational efficiency of integrate-and-fire neurons.
Abstract: A model is presented that reproduces spiking and bursting behavior of known types of cortical neurons. The model combines the biological plausibility of Hodgkin-Huxley-type dynamics and the computational efficiency of integrate-and-fire neurons. Using this model, one can simulate tens of thousands of spiking cortical neurons in real time (1 ms resolution) using a desktop PC.
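The dynamics referred to above are the two-variable Izhikevich equations, dv/dt = 0.04v² + 5v + 140 − u + I and du/dt = a(bv − u), with the reset v ← c, u ← u + d whenever v reaches 30 mV. A minimal Euler-integration sketch (the parameter defaults are the paper's "regular spiking" values; the function name and interface are ours):

```python
def izhikevich(I, T=1000, dt=1.0, a=0.02, b=0.2, c=-65.0, d=8.0):
    """Simulate one Izhikevich neuron at ~1 ms resolution under a constant
    input current I; returns the list of spike times in ms.

    The defaults a, b, c, d are the 'regular spiking' parameters from the
    2003 paper; other firing patterns come from other (a, b, c, d) choices.
    """
    v, u = -65.0, b * -65.0
    spikes = []
    for step in range(int(T / dt)):
        if v >= 30.0:                # spike cutoff: record, then reset
            spikes.append(step * dt)
            v, u = c, u + d
        # two half-steps for v, as in the paper's reference implementation
        v += 0.5 * dt * (0.04 * v * v + 5 * v + 140 - u + I)
        v += 0.5 * dt * (0.04 * v * v + 5 * v + 140 - u + I)
        u += dt * a * (b * v - u)
    return spikes
```

With a constant input of I = 10, the default neuron fires tonically over the simulated second.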

4,082 citations


Journal ArticleDOI
TL;DR: Experimental results with real data sets indicate that the combined model can be an effective way to improve forecasting accuracy achieved by either of the models used separately.

3,155 citations


Proceedings ArticleDOI
03 Aug 2003
TL;DR: A set of concrete best practices that document analysis researchers can use to get good results with neural networks, including a simple "do-it-yourself" implementation of convolution with a flexible architecture suitable for many visual document problems.
Abstract: Neural networks are a powerful technology for classification of visual inputs arising from documents. However, there is a confusing plethora of different neural network methods that are used in the literature and in industry. This paper describes a set of concrete best practices that document analysis researchers can use to get good results with neural networks. The most important practice is getting a training set as large as possible: we expand the training set by adding a new form of distorted data. The next most important practice is that convolutional neural networks are better suited for visual document tasks than fully connected networks. We propose that a simple "do-it-yourself" implementation of convolution with a flexible architecture is suitable for many visual document problems. This simple convolutional neural network does not require complex methods, such as momentum, weight decay, structure-dependent learning rates, averaging layers, tangent prop, or even fine-tuning the architecture. The end result is a very simple yet general architecture which can yield state-of-the-art performance for document analysis. We illustrate our claims on the MNIST set of English digit images.
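The "new form of distorted data" in that paper is elastic distortion of the training images. A crude stand-in sketch: the paper smooths a dense random displacement field with a Gaussian, while here a coarse field is simply upsampled piecewise-constant with np.kron and pixels are resampled by nearest-neighbor lookup; alpha and block are illustrative values, not the paper's.

```python
import numpy as np

def distort(img, alpha=2.0, block=4, rng=None):
    """Crude elastic-style distortion for training-set expansion.

    A coarse random displacement field is upsampled with np.kron (a
    piecewise-constant stand-in for the paper's Gaussian-smoothed field),
    scaled by alpha, and applied with nearest-neighbor resampling.
    """
    rng = rng or np.random.default_rng(0)
    h, w = img.shape
    ch, cw = h // block + 1, w // block + 1
    up = np.ones((block, block))
    dy = np.kron(rng.uniform(-1, 1, (ch, cw)), up)[:h, :w] * alpha
    dx = np.kron(rng.uniform(-1, 1, (ch, cw)), up)[:h, :w] * alpha
    ys, xs = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")
    sy = np.clip(np.rint(ys + dy), 0, h - 1).astype(int)
    sx = np.clip(np.rint(xs + dx), 0, w - 1).astype(int)
    return img[sy, sx]  # resample: each output pixel reads a displaced source pixel
```

Each call with a different rng seed yields another plausible variant of the same digit, which is how the training set gets expanded.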

2,783 citations


Journal ArticleDOI
TL;DR: The results reveal the self-organization of the network into a state where the distribution of community sizes is self-similar, suggesting that a universal mechanism, responsible for the emergence of scaling in other self-organized complex systems such as river networks, could also be the underlying driving force in the formation and evolution of social networks.
Abstract: We propose a procedure for analyzing and characterizing complex networks. We apply this to the social network as constructed from email communications within a medium-sized university with about 1700 employees. Email networks provide an accurate and nonintrusive description of the flow of information within human organizations. Our results reveal the self-organization of the network into a state where the distribution of community sizes is self-similar. This suggests that a universal mechanism, responsible for the emergence of scaling in other self-organized complex systems such as river networks, could also be the underlying driving force in the formation and evolution of social networks.

1,396 citations


Journal ArticleDOI
TL;DR: Basic concepts, important progress, and significant results in the current studies of various complex networks, with emphasis on the relationship between the topology and the dynamics of such complex networks are reviewed.
Abstract: In the past few years, the discovery of small-world and scale-free properties of many natural and artificial complex networks has stimulated a great deal of interest in studying the underlying organizing principles of various complex networks, which has led to dramatic advances in this emerging and active field of research. The present article reviews some basic concepts, important progress, and significant results in the current studies of various complex networks, with emphasis on the relationship between the topology and the dynamics of such complex networks. Some fundamental properties and typical complex network models are described; and, as an example, epidemic dynamics are analyzed and discussed in some detail. Finally, the important issue of robustness versus fragility of dynamical synchronization in complex networks is introduced and discussed.

1,315 citations


Journal ArticleDOI
TL;DR: The PaD method was found to be the most useful as it gave the most complete results, followed by the Profile method that gave the contribution profile of the input variables, and the classical stepwise methods gave the poorest results.

1,073 citations


Journal ArticleDOI
TL;DR: The difference in predictive performance between the neural network methods and that of the matrix‐driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher order sequence correlation is most strongly present in high‐binding peptides.
Abstract: In this paper we describe an improved neural network method to predict T-cell class I epitopes. A novel input representation has been developed consisting of a combination of sparse encoding, Blosum encoding, and input derived from hidden Markov models. We demonstrate that the combination of several neural networks derived using different sequence-encoding schemes has a performance superior to neural networks derived using a single sequence-encoding scheme. The new method is shown to have a performance that is substantially higher than that of other methods. By use of mutual information calculations we show that peptides that bind to the HLA A*0204 complex display signal of higher order sequence correlations. Neural networks are ideally suited to integrate such higher order correlations when predicting the binding affinity. It is this feature combined with the use of several neural networks derived from different and novel sequence-encoding schemes and the ability of the neural network to be trained on data consisting of continuous binding affinities that gives the new method an improved performance. The difference in predictive performance between the neural network methods and that of the matrix-driven methods is found to be most significant for peptides that bind strongly to the HLA molecule, confirming that the signal of higher order sequence correlation is most strongly present in high-binding peptides. Finally, we use the method to predict T-cell epitopes for the genome of hepatitis C virus and discuss possible applications of the prediction method to guide the process of rational vaccine design.
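The "sparse encoding" in the input representation above is one-hot encoding of the peptide sequence; a minimal sketch (Blosum encoding would instead replace each indicator vector with the residue's row of a substitution matrix, and the function name is ours):

```python
import numpy as np

AAS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids, alphabetical

def sparse_encode(peptide):
    """'Sparse' (one-hot) encoding: each residue becomes a 20-dimensional
    indicator vector; the flattened concatenation is the network input."""
    out = np.zeros((len(peptide), 20))
    for i, aa in enumerate(peptide):
        out[i, AAS.index(aa)] = 1.0
    return out.ravel()
```

A 9-mer peptide thus becomes a 180-dimensional input vector, which is why combining several encoding schemes multiplies the input width rather than the network depth.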

1,010 citations


Journal ArticleDOI
TL;DR: SVM with adaptive parameters can both achieve higher generalization performance and use fewer support vectors than the standard SVM in financial forecasting.
Abstract: A novel type of learning machine called support vector machine (SVM) has been receiving increasing interest in areas ranging from its original application in pattern recognition to other applications such as regression estimation, due to its remarkable generalization performance. This paper deals with the application of SVM to financial time series forecasting. The feasibility of applying SVM to financial forecasting is first examined by comparing it with the multilayer back-propagation (BP) neural network and the regularized radial basis function (RBF) neural network. The variability in performance of SVM with respect to the free parameters is investigated experimentally. Adaptive parameters are then proposed by incorporating the nonstationarity of financial time series into SVM. Five real futures contracts collated from the Chicago Mercantile Market are used as the data sets. The simulation shows that among the three methods, SVM outperforms the BP neural network in financial forecasting, and its generalization performance is comparable to that of the regularized RBF neural network. Furthermore, the free parameters of SVM have a great effect on the generalization performance. SVM with adaptive parameters can both achieve higher generalization performance and use fewer support vectors than the standard SVM in financial forecasting.
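Whichever regressor is used (SVM, BP, or RBF network), this kind of forecasting first turns the series into supervised (X, y) pairs of lagged inputs and next-step targets; a minimal sketch (function name and default lag count are illustrative, not the paper's setup):

```python
import numpy as np

def make_lagged(series, n_lags=4):
    """Build (X, y) from a 1-D series: row i of X holds the n_lags values
    preceding y[i], the standard windowed setup for SVM/NN forecasters."""
    s = np.asarray(series, dtype=float)
    X = np.stack([s[i:len(s) - n_lags + i] for i in range(n_lags)], axis=1)
    y = s[n_lags:]
    return X, y
```

The resulting X has shape (len(series) − n_lags, n_lags), and any of the three compared models can then be fit on (X, y).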

916 citations


Journal ArticleDOI
TL;DR: This paper focuses on neural network-based approaches to novelty detection; statistical approaches are covered in the companion Part 1 paper.

862 citations


Journal ArticleDOI
TL;DR: It is shown that the improved GA performs better than the standard GA based on some benchmark test functions, and a neural network with switches introduced to its links is proposed that can learn both the input-output relationships of an application and the network structure using the improved GA.
Abstract: This paper presents the tuning of the structure and parameters of a neural network using an improved genetic algorithm (GA). It is also shown that the improved GA performs better than the standard GA based on some benchmark test functions. A neural network with switches introduced to its links is proposed. By doing this, the proposed neural network can learn both the input-output relationships of an application and the network structure using the improved GA. The number of hidden nodes is chosen manually by increasing it from a small number until the learning performance in terms of fitness value is good enough. Application examples on sunspot forecasting and associative memory are given to show the merits of the improved GA and the proposed neural network.
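One way to read the "switches introduced to links" idea: binary mask matrices multiply the weight matrices element-wise, so a single GA chromosome holding (weights, switches) can evolve structure and parameters jointly. A minimal forward-pass sketch (layer sizes, names, and the tanh activation are our assumptions, not the paper's):

```python
import numpy as np

def masked_forward(x, W1, s1, W2, s2):
    """Forward pass of a one-hidden-layer net whose links are gated by
    binary switch matrices s1, s2 (0 = link deleted, 1 = link present),
    so a GA chromosome can encode weights and structure together."""
    h = np.tanh(x @ (W1 * s1))  # switched-off input-to-hidden links contribute nothing
    return h @ (W2 * s2)        # likewise for hidden-to-output links
```

A GA fitness function would decode each chromosome into (W1, s1, W2, s2), run masked_forward over the training data, and score the error plus, optionally, a penalty on the number of active switches.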

Journal ArticleDOI
TL;DR: This paper rigorously proves in a constructive method that two-hidden-layer feedforward networks (TLFNs) with 2√((m+2)N) (≪ N) hidden neurons can learn any N distinct samples with any arbitrarily small error, where m is the required number of output neurons.
Abstract: The problem of the necessary complexity of neural networks is of interest in applications. In this paper, learning capability and storage capacity of feedforward neural networks are considered. We markedly improve the recent results by introducing neural-network modularity logically. This paper rigorously proves in a constructive method that two-hidden-layer feedforward networks (TLFNs) with 2√((m+2)N) (≪ N) hidden neurons can learn any N distinct samples (x_i, t_i) with any arbitrarily small error, where m is the required number of output neurons. It implies that the required number of hidden neurons needed in feedforward networks can be decreased significantly compared with previous results. Conversely, a TLFN with Q hidden neurons can store at least Q²/(4(m+2)) distinct data (x_i, t_i) with any desired precision.
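The two bounds above turn directly into quick calculations (plain restatements of the formulas; the function names are ours):

```python
import math

def tlfn_hidden(N, m):
    """Hidden neurons sufficient for a two-hidden-layer feedforward network
    to learn N distinct samples with m outputs: ceil(2 * sqrt((m+2) * N))."""
    return math.ceil(2 * math.sqrt((m + 2) * N))

def tlfn_capacity(Q, m):
    """Lower bound on samples storable with Q hidden neurons: Q^2 / (4(m+2))."""
    return Q * Q // (4 * (m + 2))
```

For example, tlfn_hidden(10000, 1) gives 347 hidden neurons for 10,000 samples with a single output, and tlfn_capacity confirms that 347 neurons suffice to store at least that many samples.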

Journal ArticleDOI
TL;DR: The proposed procedure requires only a few features extracted from the measured vibration data either directly or with simple preprocessing, leading to faster training requiring far fewer iterations, making the procedure suitable for on-line condition monitoring and diagnostics of machines.

Journal ArticleDOI
28 Jul 2003
TL;DR: The results of a linear classifier (linear discriminant analysis) and two nonlinear classifiers applied to the classification of spontaneous EEG during five mental tasks are reported, showing that nonlinear classifiers produce only slightly better classification results.
Abstract: The reliable operation of brain-computer interfaces (BCIs) based on spontaneous electroencephalogram (EEG) signals requires accurate classification of multichannel EEG. The design of EEG representations and classifiers for BCI are open research questions whose difficulty stems from the need to extract complex spatial and temporal patterns from noisy multidimensional time series obtained from EEG measurements. The high-dimensional and noisy nature of EEG may limit the advantage of nonlinear classification methods over linear ones. This paper reports the results of a linear (linear discriminant analysis) and two nonlinear classifiers (neural networks and support vector machines) applied to the classification of spontaneous EEG during five mental tasks, showing that nonlinear classifiers produce only slightly better classification results. An approach to feature selection based on genetic algorithms is also presented with preliminary results of application to EEG during finger movement.

Journal ArticleDOI
TL;DR: Hierarchical generative models enable the learning of empirical priors and eschew prior assumptions about the causes of sensory input that are inherent in non-hierarchical models but are not necessary in a hierarchical context.

Book
01 Jan 2003
TL;DR: This book provides comprehensive treatment of the theory of both static and dynamic neural networks, and end-of-chapter exercises for both students and teachers.
Abstract: From the Publisher: Provides comprehensive treatment of the theory of both static and dynamic neural networks. Theoretical concepts are illustrated by reference to practical examples. Includes end-of-chapter exercises.

Journal ArticleDOI
TL;DR: A Bayesian approach is adopted in which some of the model parameters are shared and others more loosely connected through a joint prior distribution that can be learned from the data to combine the best parts of both the statistical multilevel approach and the neural network machinery.
Abstract: Modeling a collection of similar regression or classification tasks can be improved by making the tasks 'learn from each other'. In machine learning, this subject is approached through 'multitask learning', where parallel tasks are modeled as multiple outputs of the same network. In multilevel analysis this is generally implemented through the mixed-effects linear model where a distinction is made between 'fixed effects', which are the same for all tasks, and 'random effects', which may vary between tasks. In the present article we will adopt a Bayesian approach in which some of the model parameters are shared (the same for all tasks) and others more loosely connected through a joint prior distribution that can be learned from the data. We seek in this way to combine the best parts of both the statistical multilevel approach and the neural network machinery. The standard assumption expressed in both approaches is that each task can learn equally well from any other task. In this article we extend the model by allowing more differentiation in the similarities between tasks. One such extension is to make the prior mean depend on higher-level task characteristics. More unsupervised clustering of tasks is obtained if we go from a single Gaussian prior to a mixture of Gaussians. This can be further generalized to a mixture of experts architecture with the gates depending on task characteristics. All three extensions are demonstrated through application both on an artificial data set and on two real-world problems, one a school problem and the other involving single-copy newspaper sales.

Journal ArticleDOI
TL;DR: Fundamental concepts in this emerging area of neural-network computational modules are described, aimed at teaching RF/microwave engineers what neural networks are, why they are useful, when they can be used, and how to use them.
Abstract: Neural-network computational modules have recently gained recognition as an unconventional and useful tool for RF and microwave modeling and design. Neural networks can be trained to learn the behavior of passive/active components/circuits. A trained neural network can be used for high-level design, providing fast and accurate answers to the task it has learned. Neural networks are attractive alternatives to conventional methods such as numerical modeling methods, which could be computationally expensive, or analytical methods, which could be difficult to obtain for new devices, or empirical modeling solutions, whose range and accuracy may be limited. This tutorial describes fundamental concepts in this emerging area, aimed at teaching RF/microwave engineers what neural networks are, why they are useful, when they can be used, and how to use them. Neural-network structures and their training methods are described from the RF/microwave designer's perspective. Electromagnetics-based training for passive component models and physics-based training for active device models are illustrated. Circuit design and yield optimization using passive/active neural models are also presented. A multimedia slide presentation along with narrative audio clips is included in the electronic version of this paper. A hyperlink to the NeuroModeler demonstration software is provided to allow readers to practice neural-network-based design concepts.

Journal ArticleDOI
TL;DR: In this study, differential evolution has been analyzed as a candidate global optimization method for feed-forward neural networks and seems not to provide any distinct advantage in terms of learning rate or solution quality.
Abstract: An evolutionary optimization method over continuous search spaces, differential evolution, has recently been successfully applied to real-world and artificial optimization problems, and has also been proposed for neural network training. However, differential evolution has not been comprehensively studied in the context of training neural network weights, i.e., how useful differential evolution is in finding the global optimum at the expense of convergence speed. In this study, differential evolution has been analyzed as a candidate global optimization method for feed-forward neural networks. In comparison to gradient-based methods, differential evolution seems not to provide any distinct advantage in terms of learning rate or solution quality. Differential evolution can rather be used in validation of reached optima and in the development of regularization terms and non-conventional transfer functions that do not necessarily provide gradient information.
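A minimal DE/rand/1/bin loop of the kind evaluated in such studies, written so that a neural-network training loss can be plugged in as the objective (population size and the F, CR defaults are common illustrative choices, not the paper's settings):

```python
import numpy as np

def de_optimize(loss, dim, pop=20, F=0.5, CR=0.9, gens=100, rng=None):
    """DE/rand/1/bin: mutate with a + F*(b - c), binomial crossover with
    rate CR, greedy selection. `loss` maps a parameter vector (e.g. the
    flattened weights of a network) to a scalar to minimize."""
    rng = rng or np.random.default_rng(0)
    X = rng.uniform(-1, 1, (pop, dim))
    f = np.array([loss(x) for x in X])
    for _ in range(gens):
        for i in range(pop):
            others = [j for j in range(pop) if j != i]
            a, b, c = X[rng.choice(others, 3, replace=False)]
            mutant = a + F * (b - c)
            cross = rng.random(dim) < CR
            cross[rng.integers(dim)] = True   # force at least one mutant gene
            trial = np.where(cross, mutant, X[i])
            f_trial = loss(trial)
            if f_trial <= f[i]:               # greedy one-to-one replacement
                X[i], f[i] = trial, f_trial
    best = np.argmin(f)
    return X[best], f[best]
```

Replacing the toy quadratic below with a network's training error reproduces the paper's setup, and also its caveat: each generation costs pop full loss evaluations, which is where the convergence-speed disadvantage against gradient methods comes from.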

Journal ArticleDOI
TL;DR: Although SVM outperformed the ANN classifiers with regard to overall prediction accuracy, both methods were shown to complement each other, as the sets of true positives, false positives, true negatives, and false negatives produced by the two classifiers were not identical.
Abstract: Support vector machine (SVM) and artificial neural network (ANN) systems were applied to a drug/nondrug classification problem as an example of binary decision problems in early-phase virtual compound filtering and screening. The results indicate that solutions obtained by SVM training seem to be more robust with a smaller standard error compared to ANN training. Generally, the SVM classifier yielded slightly higher prediction accuracy than ANN, irrespective of the type of descriptors used for molecule encoding, the size of the training data sets, and the algorithm employed for neural network training. The performance was compared using various different descriptor sets and descriptor combinations based on the 120 standard Ghose-Crippen fragment descriptors, a wide range of 180 different properties and physicochemical descriptors from the Molecular Operating Environment (MOE) package, and 225 topological pharmacophore (CATS) descriptors. For the complete set of 525 descriptors cross-validated classificati...

Journal ArticleDOI
TL;DR: Several new sufficient conditions for ascertaining the existence, uniqueness, and global asymptotic stability of the equilibrium point of such recurrent neural networks are obtained by using the theory of topological degree and properties of nonsingular M-matrix, and constructing suitable Lyapunov functionals.
Abstract: In this paper, the existence and uniqueness of the equilibrium point and its global asymptotic stability are discussed for a general class of recurrent neural networks with time-varying delays and Lipschitz continuous activation functions. The neural network model considered includes the delayed Hopfield neural networks, bidirectional associative memory networks, and delayed cellular neural networks as its special cases. Several new sufficient conditions for ascertaining the existence, uniqueness, and global asymptotic stability of the equilibrium point of such recurrent neural networks are obtained by using the theory of topological degree and properties of nonsingular M-matrix, and constructing suitable Lyapunov functionals. The new criteria do not require the activation functions to be differentiable, bounded or monotone nondecreasing and the connection weight matrices to be symmetric. Some stability results from previous works are extended and improved. Two illustrative examples are given to demonstrate the effectiveness of the obtained results.
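For reference, the class of delayed recurrent networks covered by such stability results is usually written in the following form (a standard formulation from this literature, assumed here rather than quoted from the paper):

```latex
\dot{x}_i(t) = -d_i x_i(t) + \sum_{j=1}^{n} a_{ij} f_j(x_j(t))
             + \sum_{j=1}^{n} b_{ij} f_j\bigl(x_j(t - \tau_j(t))\bigr) + u_i,
\qquad i = 1, \dots, n,
```

where d_i > 0 are self-decay rates, A = (a_ij) and B = (b_ij) are the instantaneous and delayed connection weight matrices, τ_j(t) are bounded time-varying delays, u_i are constant inputs, and the f_j are the Lipschitz continuous activation functions. Choosing A, B, and f appropriately recovers the delayed Hopfield, BAM, and delayed cellular neural networks named above as special cases.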

Journal ArticleDOI
TL;DR: It is concluded that neural network rule extraction and decision tables are powerful management tools that allow us to build advanced and user-friendly decision-support systems for credit-risk evaluation.
Abstract: Credit-risk evaluation is a very challenging and important management science problem in the domain of financial analysis. Many classification methods have been suggested in the literature to tackle this problem. Neural networks, especially, have received a lot of attention because of their universal approximation property. However, a major drawback associated with the use of neural networks for decision making is their lack of explanation capability. While they can achieve a high predictive accuracy rate, the reasoning behind how they reach their decisions is not readily available. In this paper, we present the results from analysing three real-life credit-risk data sets using neural network rule extraction techniques. Clarifying the neural network decisions by explanatory rules that capture the learned knowledge embedded in the networks can help the credit-risk manager in explaining why a particular applicant is classified as either bad or good. Furthermore, we also discuss how these rules can be visualized as a decision table in a compact and intuitive graphical format that facilitates easy consultation. It is concluded that neural network rule extraction and decision tables are powerful management tools that allow us to build advanced and user-friendly decision-support systems for credit-risk evaluation.

Journal ArticleDOI
TL;DR: In this article, various principles of the neural network approach for predicting certain properties of polymer composite materials, such as fatigue life, wear performance, response under combined loading situations, and dynamic mechanical properties, are discussed.

Journal ArticleDOI
TL;DR: A universal model for the firing-frequency dynamics of an adapting neuron that is independent of the specific adaptation process and spike generator is derived and the specific nature of high-pass filter properties caused by spike-frequency adaptation is elucidated.
Abstract: Spike-frequency adaptation is a prominent feature of neural dynamics. Among other mechanisms, various ionic currents modulating spike generation cause this type of neural adaptation. Prominent examples are voltage-gated potassium currents (M-type currents), the interplay of calcium currents and intracellular calcium dynamics with calcium-gated potassium channels (AHP-type currents), and the slow recovery from inactivation of the fast sodium current. While recent modeling studies have focused on the effects of specific adaptation currents, we derive a universal model for the firing-frequency dynamics of an adapting neuron that is independent of the specific adaptation process and spike generator. The model is completely defined by the neuron's onset f-I curve, the steady-state f-I curve, and the time constant of adaptation. For a specific neuron, these parameters can be easily determined from electrophysiological measurements without any pharmacological manipulations. At the same time, the simplicity of the model allows one to analyze mathematically how adaptation influences signal processing on the single-neuron level. In particular, we elucidate the specific nature of high-pass filter properties caused by spike-frequency adaptation. The model is limited to firing frequencies higher than the reciprocal adaptation time constant and to moderate fluctuations of the adaptation and the input current. As an extension of the model, we introduce a framework for combining an arbitrary spike generator with a generalized adaptation current.
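A heavily simplified subtractive-adaptation rate model in the spirit of the paper (not its exact derivation; the linear gain, rectification, and parameter values are our assumptions) reproduces the basic effect: the firing rate relaxes from its onset value toward a lower adapted steady state.

```python
import numpy as np

def adapting_rate(I, g=1.0, alpha=0.5, tau_a=0.1, dt=0.001):
    """Firing rate with subtractive adaptation:
        f(t) = max(g * (I(t) - A(t)), 0)
        tau_a * dA/dt = -A + alpha * f
    I is a sequence of input samples at step dt (seconds); returns f(t)."""
    A = 0.0
    f_trace = []
    for I_t in I:
        f = max(g * (I_t - A), 0.0)
        A += dt / tau_a * (-A + alpha * f)  # forward-Euler adaptation update
        f_trace.append(f)
    return np.array(f_trace)
```

With a constant input I = 2 and these defaults, the rate starts at the onset value 2 and settles near 4/3, the fixed point of f = I − αf; a step input is thus transmitted transiently and attenuated at steady state, which is the high-pass filtering the abstract describes.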

Journal ArticleDOI
TL;DR: A study to compare the performance of bearing fault detection using two different classifiers, namely, artificial neural networks and support vector machines (SMVs), using time-domain vibration signals of a rotating machine with normal and defective bearings.

Book
30 Jun 2003
TL;DR: This book examines the mathematical governing principles of simulation-based optimization, thereby providing the reader with the ability to model relevant real-life problems using these techniques, and outlines the computational technology underlying these methods.
Abstract: From the Publisher: "Simulation-Based Optimization: Parametric Optimization Techniques and Reinforcement Learning introduces the evolving area of simulation-based optimization. Since it became possible to analyze random systems using computers, scientists and engineers have sought the means to optimize systems using simulation models. Only recently, however, has this objective had success in practice. Cutting-edge work in computational operations research, including non-linear programming (simultaneous perturbation), dynamic programming (reinforcement learning), and game theory (learning automata) has made it possible to use simulation in conjunction with optimization techniques. As a result, this research has given simulation added dimensions and power that it did not have in the recent past." "The book's objective is two-fold: (1) It examines the mathematical governing principles of simulation-based optimization, thereby providing the reader with the ability to model relevant real-life problems using these techniques. (2) It outlines the computational technology underlying these methods. Taken together, these two aspects demonstrate that the mathematical and computational methods discussed in this book do work." "Broadly speaking, the book has two parts: (1) parametric (static) optimization and (2) control (dynamic) optimization. Some of the book's special features are: an accessible introduction to reinforcement learning and parametric-optimization techniques; a step-by-step description of several algorithms of simulation-based optimization; a clear and simple introduction to the methodology of neural networks; a gentle introduction to convergence analysis of some of the methods enumerated above; and Computer programs for many algorithms of simulation-based optimization." 
This book is written for students and researchers in the fields of engineering (electrical, industrial and computer), computer science, operations research, management science, and applied mathematics.

Journal ArticleDOI
TL;DR: Modifications of this algorithm that improve its learning speed are discussed and the new optimization methods are empirically compared to the existing Rprop variants, the conjugate gradient method, Quickprop, and the BFGS algorithm on a set of neural network benchmark problems.

Journal ArticleDOI
TL;DR: Results from the theory of differential equations with discontinuous right-hand side as introduced by Filippov are employed, and global convergence is addressed by using a Lyapunov-like approach based on the concept of monotone trajectories of a differential inclusion.
Abstract: The paper introduces a general class of neural networks where the neuron activations are modeled by discontinuous functions. The neural networks have an additive interconnecting structure and they include as particular cases the Hopfield neural networks (HNNs), and the standard cellular neural networks (CNNs), in the limiting situation where the HNNs and CNNs possess neurons with infinite gain. Conditions are derived which ensure the existence of a unique equilibrium point, and a unique output equilibrium point, which are globally attractive for the state and the output trajectories of the neural network, respectively. These conditions, which are applicable to general nonsymmetric neural networks, are based on the concept of Lyapunov diagonally-stable neuron interconnection matrices, and they can be thought of as a generalization to the discontinuous case of previous results established for neural networks possessing smooth neuron activations. Moreover, by suitably exploiting the presence of sliding modes, entirely new conditions are obtained which ensure global convergence in finite time, where the convergence time can be easily estimated on the basis of the relevant neural-network parameters. The analysis in the paper employs results from the theory of differential equations with discontinuous right-hand side as introduced by Filippov. In particular, global convergence is addressed by using a Lyapunov-like approach based on the concept of monotone trajectories of a differential inclusion.

Journal ArticleDOI
TL;DR: A technique is described which allows one to continuously control both the age of a synthetic voice and the quantity of emotions that are expressed, and the first large-scale data mining experiment on the automatic recognition of basic emotions in informal everyday short utterances is presented.
Abstract: This paper presents algorithms that allow a robot to express its emotions by modulating the intonation of its voice. They are very simple and efficiently provide life-like speech thanks to the use of concatenative speech synthesis. We describe a technique which allows one to continuously control both the age of a synthetic voice and the quantity of emotions that are expressed. Also, we present the first large-scale data mining experiment on the automatic recognition of basic emotions in informal everyday short utterances. We focus on the speaker-dependent problem. We compare a large set of machine learning algorithms, ranging from neural networks and Support Vector Machines to decision trees, together with 200 features, using a large database of several thousand examples. We show that the difference in performance among learning schemes can be substantial, and that some features which were previously unexplored are of crucial importance. An optimal feature set is derived through the use of a genetic algorithm. Finally, we explain how this study can be applied to real-world situations in which very few examples are available. Furthermore, we describe a game to play with a personal robot which facilitates the teaching of examples of emotional utterances in a natural and rather unconstrained manner.

Proceedings ArticleDOI
27 Jan 2003
TL;DR: This paper applies the technique of deleting one feature at a time to perform experiments on SVMs and neural networks to rank the importance of input features for the DARPA collected intrusion data and shows that SVM-based and neural network based IDSs using a reduced number of features can deliver enhanced or comparable performance.
Abstract: Intrusion detection is a critical component of secure information systems. This paper addresses the issue of identifying important input features in building an intrusion detection system (IDS). Since elimination of the insignificant and/or useless inputs leads to a simplification of the problem, faster and more accurate detection may result. Feature ranking and selection, therefore, is an important issue in intrusion detection. We apply the technique of deleting one feature at a time to perform experiments on SVMs and neural networks to rank the importance of input features for the DARPA collected intrusion data. Important features for each of the 5 classes of intrusion patterns in the DARPA data are identified. It is shown that SVM-based and neural network based IDSs using a reduced number of features can deliver enhanced or comparable performance. An IDS for class-specific detection based on five SVMs is proposed.
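The delete-one-feature-at-a-time ranking described above can be sketched generically (the train_eval wrapper interface is our assumption; in the paper it would retrain an SVM or neural network on the reduced data and return detection accuracy):

```python
import numpy as np

def rank_features(train_eval, X, y, n_features):
    """Rank features by deleting one at a time: retrain/evaluate with each
    feature removed and measure the accuracy drop against the full set.
    `train_eval(X, y) -> score` is any classifier wrapper; the larger the
    drop when feature j is removed, the more important feature j is."""
    base = train_eval(X, y)
    drops = []
    for j in range(n_features):
        X_without_j = np.delete(X, j, axis=1)
        drops.append(base - train_eval(X_without_j, y))
    return np.argsort(drops)[::-1]  # indices, most important first
```

Features whose removal leaves the score unchanged (or improves it) are the candidates for elimination, which is how the reduced-feature IDSs in the paper are built.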