
Showing papers on "Artificial neural network" published in 1998


Journal ArticleDOI
01 Jan 1998
TL;DR: In this article, graph transformer networks (GTNs) are proposed for globally training multimodule document recognition systems, and convolutional neural networks trained with gradient-based learning are shown to synthesize complex decision surfaces that classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing.
Abstract: Multilayer neural networks trained with the back-propagation algorithm constitute the best example of a successful gradient based learning technique. Given an appropriate network architecture, gradient-based learning algorithms can be used to synthesize a complex decision surface that can classify high-dimensional patterns, such as handwritten characters, with minimal preprocessing. This paper reviews various methods applied to handwritten character recognition and compares them on a standard handwritten digit recognition task. Convolutional neural networks, which are specifically designed to deal with the variability of 2D shapes, are shown to outperform all other techniques. Real-life document recognition systems are composed of multiple modules including field extraction, segmentation, recognition, and language modeling. A new learning paradigm, called graph transformer networks (GTN), allows such multimodule systems to be trained globally using gradient-based methods so as to minimize an overall performance measure. Two systems for online handwriting recognition are described. Experiments demonstrate the advantage of global training, and the flexibility of graph transformer networks. A graph transformer network for reading a bank cheque is also described. It uses convolutional neural network character recognizers combined with global training techniques to provide record accuracy on business and personal cheques. It is deployed commercially and reads several million cheques per day.

42,067 citations
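As a rough illustration of the weight sharing that makes convolutional networks tractable, here is a minimal numpy sketch of a single convolutional layer's forward pass. It is not the paper's LeNet implementation; the layer sizes, tanh squashing, and function names are illustrative assumptions.

```python
import numpy as np

def conv2d_forward(x, kernels, bias):
    """Valid cross-correlation of one 2D input with a bank of shared kernels.

    x       : (H, W) input image
    kernels : (n_maps, kH, kW), one shared kernel per feature map
    bias    : (n_maps,), one shared bias per feature map
    returns : (n_maps, H-kH+1, W-kW+1) feature maps
    """
    n_maps, kH, kW = kernels.shape
    H, W = x.shape
    out = np.empty((n_maps, H - kH + 1, W - kW + 1))
    for m in range(n_maps):
        for i in range(H - kH + 1):
            for j in range(W - kW + 1):
                # every output unit reuses the same kH x kW weights:
                # this sharing is what keeps the parameter count small
                out[m, i, j] = np.sum(x[i:i + kH, j:j + kW] * kernels[m]) + bias[m]
    return np.tanh(out)  # squashing nonlinearity

# toy usage: 6 feature maps of 5x5 kernels over a 28x28 image
x = np.random.randn(28, 28)
maps = conv2d_forward(x, np.random.randn(6, 5, 5) * 0.1, np.zeros(6))
print(maps.shape)  # (6, 24, 24); only 6*(25+1) = 156 trainable parameters
```

Each feature map is produced by sliding one small kernel over the whole image, so the parameter count is independent of the image size.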


Book
16 Jul 1998
TL;DR: Thorough, well-organized, and completely up to date, this book examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks.
Abstract: From the Publisher: This book represents the most comprehensive treatment available of neural networks from an engineering perspective. Thorough, well-organized, and completely up to date, it examines all the important aspects of this emerging technology, including the learning process, back-propagation learning, radial-basis function networks, self-organizing systems, modular networks, temporal processing and neurodynamics, and VLSI implementation of neural networks. Written in a concise and fluid manner, by a foremost engineering textbook author, to make the material more accessible, this book is ideal for professional engineers and graduate students entering this exciting field. Computer experiments, problems, worked examples, a bibliography, photographs, and illustrations reinforce key concepts.

29,130 citations


Journal ArticleDOI
TL;DR: In this article, a visual attention system inspired by the behavior and the neuronal architecture of the early primate visual system is presented, where multiscale image features are combined into a single topographical saliency map.
Abstract: A visual attention system, inspired by the behavior and the neuronal architecture of the early primate visual system, is presented. Multiscale image features are combined into a single topographical saliency map. A dynamical neural network then selects attended locations in order of decreasing saliency. The system breaks down the complex problem of scene understanding by rapidly selecting, in a computationally efficient manner, conspicuous locations to be analyzed in detail.

10,525 citations
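A toy sketch of the pipeline the abstract describes, reduced to a single intensity channel: multiscale center-surround contrast is summed into one saliency map, and fixations are read out in decreasing order of saliency with inhibition of return standing in for the dynamical winner-take-all network. All functions and parameters here are illustrative assumptions, not the authors' model.

```python
import numpy as np

def downsample(img, f):
    """Crude average pooling (stands in for a Gaussian pyramid level)."""
    h, w = img.shape[0] // f, img.shape[1] // f
    return img[:h * f, :w * f].reshape(h, f, w, f).mean(axis=(1, 3))

def saliency_map(img, scales=(2, 4, 8)):
    """Sum multiscale center-surround contrast into one topographic map.
    Assumes the image dimensions are divisible by every scale."""
    sal = np.zeros_like(img, dtype=float)
    for s in scales:
        coarse = downsample(img, s)
        surround = np.kron(coarse, np.ones((s, s)))   # upsample local means
        sal += np.abs(img - surround)                 # center-surround proxy
    return sal / len(scales)

def attend(sal, n_fixations=3, radius=8):
    """Visit locations in decreasing saliency with inhibition of return
    (a stand-in for the paper's winner-take-all dynamical network)."""
    sal = sal.copy()
    for _ in range(n_fixations):
        y, x = np.unravel_index(np.argmax(sal), sal.shape)
        yield y, x
        sal[max(0, y - radius):y + radius, max(0, x - radius):x + radius] = 0

img = np.random.rand(64, 64)
print(list(attend(saliency_map(img))))
```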


Journal ArticleDOI
TL;DR: A common theoretical framework for combining classifiers which use distinct pattern representations is developed and it is shown that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision.
Abstract: We develop a common theoretical framework for combining classifiers which use distinct pattern representations and show that many existing schemes can be considered as special cases of compound classification where all the pattern representations are used jointly to make a decision. An experimental comparison of various classifier combination schemes demonstrates that the combination rule developed under the most restrictive assumptions, the sum rule, outperforms the other classifier combination schemes. A sensitivity analysis of the various schemes to estimation errors is carried out to show that this finding can be justified theoretically.

5,670 citations
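The combination rules compared in the paper are easy to state concretely. The sketch below (illustrative interfaces, not the authors' code) combines per-classifier posterior estimates and shows why the sum rule tolerates a single badly estimated posterior better than the product rule.

```python
import numpy as np

def combine(posteriors, rule="sum"):
    """Combine per-classifier posterior estimates P(class | representation).

    posteriors : (n_classifiers, n_classes) array, rows sum to 1
    returns    : index of the winning class
    """
    if rule == "sum":        # the paper's most robust rule under estimation noise
        score = posteriors.sum(axis=0)
    elif rule == "product":  # follows from treating representations as independent
        score = posteriors.prod(axis=0)
    elif rule == "max":
        score = posteriors.max(axis=0)
    else:
        raise ValueError(rule)
    return int(np.argmax(score))

# three classifiers, three classes; one noisy estimate tanks the product rule
p = np.array([[0.70, 0.20, 0.10],
              [0.60, 0.30, 0.10],
              [0.02, 0.58, 0.40]])   # badly underestimated posterior for class 0
print(combine(p, "sum"), combine(p, "product"))  # sum picks 0, product picks 1
```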


Journal ArticleDOI
TL;DR: A neural network-based upright frontal face detection system that arbitrates between multiple networks to improve performance over a single network, and a straightforward procedure for aligning positive face examples for training.
Abstract: We present a neural network-based upright frontal face detection system. A retinally connected neural network examines small windows of an image and decides whether each window contains a face. The system arbitrates between multiple networks to improve performance over a single network. We present a straightforward procedure for aligning positive face examples for training. To collect negative examples, we use a bootstrap algorithm, which adds false detections into the training set as training progresses. This eliminates the difficult task of manually selecting nonface training examples, which must be chosen to span the entire space of nonface images. Simple heuristics, such as using the fact that faces rarely overlap in images, can further improve the accuracy. Comparisons with several other state-of-the-art face detection systems are presented, showing that our system has comparable performance in terms of detection and false-positive rates.

4,105 citations
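The bootstrap idea for collecting negatives is essentially a training loop, sketched below with a stand-in classifier. The window size, round counts, and the use of scikit-learn's LogisticRegression are illustrative assumptions; the paper uses its retinally connected neural network.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression  # stand-in classifier

def bootstrap_train(faces, nonface_pool, detector, rounds=5, batch=128):
    """Grow the negative training set from the detector's own false alarms.

    faces        : list of flattened positive (face) windows
    nonface_pool : list of candidate windows cut from face-free scenery
    detector     : any classifier with .fit(X, y) / .predict(X)
    """
    negatives = list(nonface_pool[:batch])              # small seed set
    for _ in range(rounds):
        X = np.array(faces + negatives)
        y = np.array([1] * len(faces) + [0] * len(negatives))
        detector.fit(X, y)
        # scan the face-free pool: every detection there is a false positive
        false_pos = [w for w in nonface_pool
                     if detector.predict(w[None])[0] == 1]
        if not false_pos:
            break
        negatives.extend(false_pos[:batch])             # add the hardest mistakes
    return detector

# toy usage with fake 20x20 windows, flattened to length-400 vectors
faces = [np.random.rand(400) + 0.5 for _ in range(50)]
pool = [np.random.rand(400) for _ in range(500)]
model = bootstrap_train(faces, pool, LogisticRegression(max_iter=500))
```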


Journal ArticleDOI
TL;DR: In this paper, the authors present a state-of-the-art survey of ANN applications in forecasting and provide a synthesis of published research in this area, insights on ANN modeling issues, and future research directions.

3,680 citations


Book
01 May 1998
TL;DR: Multitask learning as discussed by the authors is an approach to inductive transfer that improves learning for one task by using the information contained in the training signals of other related tasks, and it does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better.
Abstract: Multitask Learning is an approach to inductive transfer that improves learning for one task by using the information contained in the training signals of other related tasks. It does this by learning tasks in parallel while using a shared representation; what is learned for each task can help other tasks be learned better. In this thesis we demonstrate multitask learning for a dozen problems. We explain how multitask learning works and show that there are many opportunities for multitask learning in real domains. We show that in some cases features that would normally be used as inputs work better if used as multitask outputs instead. We present suggestions for how to get the most out of multitask learning in artificial neural nets, present an algorithm for multitask learning with case-based methods like k-nearest neighbor and kernel regression, and sketch an algorithm for multitask learning in decision trees. Multitask learning improves generalization performance, can be applied in many different kinds of domains, and can be used with different learning algorithms. We conjecture there will be many opportunities for its use on real-world problems.

2,642 citations
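A minimal numpy sketch of the shared-representation idea: one hidden layer serves several output "tasks", and every task's gradient shapes the same hidden weights. Layer sizes and the squared-error loss are illustrative assumptions, not the thesis's experimental setup.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_tasks = 10, 16, 3
W1 = rng.normal(0, 0.1, (n_in, n_hidden))        # representation shared by all tasks
heads = rng.normal(0, 0.1, (n_tasks, n_hidden))  # one linear output head per task

def sgd_step(x, y, lr=0.01):
    """One gradient step on the summed squared error of all task outputs."""
    global W1, heads
    h = np.tanh(x @ W1)                           # shared hidden layer
    err = h @ heads.T - y                         # (n_tasks,) per-task errors
    dh = (err @ heads) * (1 - h ** 2)             # gradients from *all* tasks...
    heads -= lr * np.outer(err, h)
    W1 -= lr * np.outer(x, dh)                    # ...pool in the shared weights

x, y = rng.normal(size=n_in), rng.normal(size=n_tasks)
for _ in range(100):
    sgd_step(x, y)
```

Because `dh` sums error terms from every head, each task's training signal influences the shared weights `W1`, which is the inductive-transfer mechanism the abstract describes.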


Journal ArticleDOI
TL;DR: In this paper, the authors used information geometry to calculate the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the spaces of linear dynamical systems for blind source deconvolution, and proved that Fisher efficient online learning has asymptotically the same performance as the optimal batch estimation of parameters.
Abstract: When a parameter space has a certain underlying structure, the ordinary gradient of a function does not represent its steepest direction, but the natural gradient does. Information geometry is used for calculating the natural gradients in the parameter space of perceptrons, the space of matrices (for blind source separation), and the space of linear dynamical systems (for blind source deconvolution). The dynamical behavior of natural gradient online learning is analyzed and is proved to be Fisher efficient, implying that it has asymptotically the same performance as the optimal batch estimation of parameters. This suggests that the plateau phenomenon, which appears in the backpropagation learning algorithm of multilayer perceptrons, might disappear or might not be so serious when the natural gradient is used. An adaptive method of updating the learning rate is proposed and analyzed.

2,504 citations
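The update the abstract describes replaces the ordinary gradient with the Fisher-preconditioned gradient. A minimal sketch, using the common empirical-Fisher estimate plus a damping term for numerical safety; both are practical conveniences of this sketch, not the paper's closed-form information-geometric expressions.

```python
import numpy as np

def natural_gradient_step(theta, score_samples, grad_loss, lr=0.1, damping=1e-3):
    """theta <- theta - lr * F^{-1} grad, with F the Fisher information matrix.

    score_samples : (n, dim) per-sample gradients of the log-likelihood,
                    used for the empirical-Fisher estimate
    grad_loss     : (dim,) ordinary gradient of the loss at theta
    """
    F = score_samples.T @ score_samples / len(score_samples)
    F += damping * np.eye(len(theta))        # keep F invertible
    return theta - lr * np.linalg.solve(F, grad_loss)

theta = np.zeros(3)
scores = np.random.randn(200, 3)             # toy score samples
theta = natural_gradient_step(theta, scores, np.array([1.0, 0.0, -1.0]))
print(theta)
```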


Journal ArticleDOI
TL;DR: This paper presents a general introduction and discussion of recent applications of the multilayer perceptron, one type of artificial neural network, in the atmospheric sciences.

2,389 citations


01 Jan 1998
TL;DR: The Structural Risk Minimisation (SRM) principle, as discussed by the authors, has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks: SRM minimises an upper bound on the generalisation error, whereas ERM minimises the error on the training data.
Abstract: The foundations of Support Vector Machines (SVM) have been developed by Vapnik, and the approach is gaining popularity due to many attractive features and promising empirical performance. The formulation embodies the Structural Risk Minimisation (SRM) principle, which has been shown to be superior to the traditional Empirical Risk Minimisation (ERM) principle employed by conventional neural networks. SRM minimises an upper bound on the generalisation error (via the VC dimension), as opposed to ERM, which minimises the error on the training data. It is this difference which equips SVMs with a greater ability to generalise, which is our goal in statistical learning. SVMs were developed to solve the classification problem, but recently they have been extended to the domain of regression problems.

2,295 citations
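SVMs are classically trained through a dual quadratic program. As a deliberately simplified sketch of the SRM trade-off the abstract describes, the primal soft-margin objective below penalizes capacity (the norm of w) alongside empirical error and is minimized by plain subgradient descent; hyperparameters and data are illustrative.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Soft-margin linear SVM by subgradient descent on the primal objective

        lam/2 * ||w||^2  +  mean(max(0, 1 - y * (X @ w + b))),   y in {-1, +1}.

    The ||w||^2 capacity penalty is the SRM-style term; the hinge term is the
    empirical error. (Classical SVM training solves the dual QP instead.)
    """
    w, b, n = np.zeros(X.shape[1]), 0.0, len(y)
    for _ in range(epochs):
        viol = y * (X @ w + b) < 1                     # margin violators
        w -= lr * (lam * w - (y[viol, None] * X[viol]).sum(axis=0) / n)
        b -= lr * (-y[viol].sum() / n)
    return w, b

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (50, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)
w, b = train_linear_svm(X, y)
print("train accuracy:", np.mean(np.sign(X @ w + b) == y))
```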


Journal ArticleDOI
TL;DR: A modular approach to motor learning and control based on multiple pairs of inverse (controller) and forward (predictor) models that can simultaneously learn the multiple inverse models necessary for control as well as how to select the inverse models appropriate for a given environment is proposed.
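A sketch of the selection mechanism the TL;DR describes, for scalar states and commands: each forward model predicts the next state, prediction errors are turned into "responsibilities" via a softmax, and the paired inverse models' commands are blended by those responsibilities. The Gaussian error scaling and all names are illustrative assumptions.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def mosaic_command(x_observed, forward_preds, inverse_cmds, sigma=1.0):
    """Blend inverse-model commands by each paired forward model's accuracy.

    forward_preds : (n_models,) predictions of the next state, one per module
    inverse_cmds  : (n_models,) motor commands proposed by the inverse models
    """
    errors = (forward_preds - x_observed) ** 2
    resp = softmax(-errors / (2 * sigma ** 2))   # low error -> high responsibility
    return resp @ inverse_cmds, resp

cmd, resp = mosaic_command(1.0,
                           forward_preds=np.array([0.9, 2.0, -1.0]),
                           inverse_cmds=np.array([0.5, 1.5, -0.5]))
print(cmd, resp)   # the module that predicted 0.9 dominates
```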

Journal ArticleDOI
TL;DR: An overview of the self-organizing map algorithm, on which the papers in this issue are based, is presented in this article.
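For reference alongside the overview, a compact sketch of the standard SOM update: find the best-matching unit, then pull it and its grid neighbors toward the input. The codebook initialization and the learning-rate and neighborhood schedules are illustrative choices.

```python
import numpy as np

def train_som(data, grid=(8, 8), epochs=20, lr0=0.5, radius0=3.0):
    """Classic SOM: move the best-matching unit and its grid neighbors toward x."""
    rng = np.random.default_rng(0)
    W = rng.random((grid[0], grid[1], data.shape[1]))           # codebook vectors
    coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                  indexing="ij"), axis=-1)      # unit grid positions
    for t in range(epochs):
        lr = lr0 * (1 - t / epochs)                             # decaying schedules
        radius = radius0 * (1 - t / epochs) + 0.5
        for x in rng.permutation(data):
            bmu = np.unravel_index(np.argmin(((W - x) ** 2).sum(axis=2)), grid)
            d2 = ((coords - np.array(bmu)) ** 2).sum(axis=2)    # distance on the grid
            h = np.exp(-d2 / (2 * radius ** 2))                 # neighborhood kernel
            W += lr * h[..., None] * (x - W)
    return W

codebook = train_som(np.random.rand(200, 3))   # maps 3D data onto an 8x8 sheet
```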

Journal ArticleDOI
TL;DR: Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights.
Abstract: Sample complexity results from computational learning theory, when applied to neural network learning for pattern classification problems, suggest that for good generalization performance the number of training examples should grow at least linearly with the number of adjustable parameters in the network. Results in this paper show that if a large neural network is used for a pattern classification problem and the learning algorithm finds a network with small weights that has small squared error on the training patterns, then the generalization performance depends on the size of the weights rather than the number of weights. For example, consider a two-layer feedforward network of sigmoid units, in which the sum of the magnitudes of the weights associated with each unit is bounded by A and the input dimension is n. We show that the misclassification probability is no more than a certain error estimate (that is related to squared error on the training set) plus $A^3\sqrt{(\log n)/m}$ (ignoring log A and log m factors), where m is the number of training patterns. This may explain the generalization performance of neural networks, particularly when the number of training examples is considerably smaller than the number of weights. It also supports heuristics (such as weight decay and early stopping) that attempt to keep the weights small during training. The proof techniques appear to be useful for the analysis of other pattern classifiers: when the input domain is a totally bounded metric space, we use the same approach to give upper bounds on misclassification probability for classifiers with decision boundaries that are far from the training examples.
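Restating the abstract's bound in conventional notation (the O(.) absorbs the log A and log m factors the authors set aside):

```latex
% m : number of training patterns          n : input dimension
% A : bound on the sum of the weight magnitudes into each unit
% \hat{E}_m : the abstract's error estimate on the training set
\Pr[\text{misclassification}]
  \;\le\; \hat{E}_m + O\!\left( A^{3} \sqrt{\frac{\log n}{m}} \right)
```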

Book
01 Mar 1998
TL;DR: This is an interdisciplinary book on neural networks, statistics and fuzzy systems that establishes a general framework for adaptive data modeling within which various methods from statistics, neural networks and fuzzy logic are presented.
Abstract: From the Publisher: This is an interdisciplinary book on neural networks, statistics and fuzzy systems. A unique feature is the establishment of a general framework for adaptive data modeling within which various methods from statistics, neural networks and fuzzy logic are presented. Chapter summaries, examples and case studies are also included. Includes a companion Web site with software for use with the book.

Journal ArticleDOI
TL;DR: A linear transformation for each input variable can be incorporated into the network so that far fewer rules are needed or higher accuracy can be achieved.
Abstract: A self-constructing neural fuzzy inference network (SONFIN) with online learning ability is proposed in this paper. The SONFIN is inherently a modified Takagi-Sugeno-Kang (TSK)-type fuzzy rule-based model possessing neural network learning ability. There are no rules initially in the SONFIN. They are created and adapted as online learning proceeds via simultaneous structure and parameter identification. In the structure identification of the precondition part, the input space is partitioned in a flexible way according to an aligned clustering-based algorithm. As to the structure identification of the consequent part, only a singleton value selected by a clustering method is assigned to each rule initially. Afterwards, some additional significant terms selected via a projection-based correlation measure for each rule will be added to the consequent part incrementally as learning proceeds. The combined precondition and consequent structure identification scheme can set up an economic and dynamically growing network, a main feature of the SONFIN. In the parameter identification, the consequent parameters are tuned optimally by either least mean squares or recursive least squares algorithms and the precondition parameters are tuned by a backpropagation algorithm. To enhance the knowledge representation ability of the SONFIN, a linear transformation for each input variable can be incorporated into the network so that far fewer rules are needed or higher accuracy can be achieved.
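The structure-learning step that distinguishes SONFIN, creating a rule online only when no existing rule fires strongly, can be sketched as below. This shows only that rule-spawning criterion, with Gaussian preconditions and singleton consequents; the aligned clustering, consequent-term growth, and LMS/RLS/backpropagation tuning described in the abstract are omitted, and all thresholds are illustrative.

```python
import numpy as np

class OnlineRuleBase:
    """Spawn a new fuzzy rule only when no existing rule fires strongly."""

    def __init__(self, threshold=0.5, width=0.3):
        self.centers, self.consequents = [], []
        self.threshold, self.width = threshold, width

    def firing(self, x):
        return np.array([np.exp(-np.sum((x - c) ** 2) / (2 * self.width ** 2))
                         for c in self.centers])

    def observe(self, x, y):
        if not self.centers or self.firing(x).max() < self.threshold:
            self.centers.append(x.copy())   # new precondition (cluster center)
            self.consequents.append(y)      # singleton consequent, as in SONFIN
        # (tuning of centers/consequents by LMS/RLS/backprop is omitted here)

    def predict(self, x):
        f = self.firing(x)
        return f @ np.array(self.consequents) / f.sum()  # weighted-average defuzz

rb = OnlineRuleBase()
for x, y in zip(np.random.rand(50, 2), np.random.rand(50)):
    rb.observe(x, y)
print(len(rb.centers), rb.predict(np.array([0.5, 0.5])))
```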

Journal ArticleDOI
TL;DR: To support a better-founded choice of stopping criterion, 14 different automatic stopping criteria from three classes were evaluated empirically for their efficiency and effectiveness on 12 different classification and approximation tasks, using multi-layer perceptrons with RPROP training.
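A representative stopping criterion of the kind evaluated in this line of work is "generalization loss": the relative rise of validation error above its best value so far, with training stopped once it exceeds a threshold. A sketch under that assumption (the 5% threshold and error trace are illustrative):

```python
def generalization_loss(val_errors):
    """GL(t) = 100 * (E_va(t)/E_opt(t) - 1): relative rise of the current
    validation error over the best value seen so far."""
    return 100.0 * (val_errors[-1] / min(val_errors) - 1.0)

def should_stop(val_errors, alpha=5.0):
    """Stop once validation error is alpha percent above its minimum."""
    return generalization_loss(val_errors) > alpha

history = []
for epoch, e_va in enumerate([0.9, 0.7, 0.55, 0.50, 0.52, 0.56, 0.60]):
    history.append(e_va)
    if should_stop(history):
        print("stop at epoch", epoch)   # epoch 5: GL = 12% > 5%
        break
```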

01 Jan 1998
TL;DR: This chapter will assess whether the feedforward network has been superseded, for supervised regression and classification tasks, and will review work on this idea by Williams and Rasmussen (1996), Neal (1997), Barber and Williams (1997) and Gibbs and MacKay (1997).
Abstract: Feedforward neural networks such as multilayer perceptrons are popular tools for nonlinear regression and classification problems. From a Bayesian perspective, a choice of a neural network model can be viewed as defining a prior probability distribution over non-linear functions, and the neural network's learning process can be interpreted in terms of the posterior probability distribution over the unknown function. (Some learning algorithms search for the function with maximum posterior probability; others, such as Monte Carlo methods, draw samples from this posterior distribution.) In the limit of large but otherwise standard networks, Neal (1996) has shown that the prior distribution over non-linear functions implied by the Bayesian neural network falls in a class of probability distributions known as Gaussian processes. The hyperparameters of the neural network model determine the characteristic length scales of the Gaussian process. Neal's observation motivates the idea of discarding parameterized networks and working directly with Gaussian processes. Computations in which the parameters of the network are optimized are then replaced by simple matrix operations using the covariance matrix of the Gaussian process. In this chapter I will review work on this idea by Williams and Rasmussen (1996), Neal (1997), Barber and Williams (1997) and Gibbs and MacKay (1997), and will assess whether, for supervised regression and classification tasks, the feedforward network has been superseded.
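The "simple matrix operations" the chapter refers to are the standard Gaussian-process regression equations. A minimal numpy sketch with an RBF covariance (the kernel choice, noise level, and length scale are illustrative hyperparameters):

```python
import numpy as np

def rbf_kernel(A, B, length=1.0):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / length ** 2)

def gp_predict(X, y, X_star, noise=1e-2, length=1.0):
    """GP regression: covariance-matrix algebra in place of weight training."""
    K = rbf_kernel(X, X, length) + noise * np.eye(len(X))
    Ks = rbf_kernel(X_star, X, length)
    mean = Ks @ np.linalg.solve(K, y)                       # predictive mean
    cov = rbf_kernel(X_star, X_star, length) - Ks @ np.linalg.solve(K, Ks.T)
    return mean, np.sqrt(np.clip(np.diag(cov), 0.0, None))  # and pointwise std

X = np.linspace(0, 5, 20)[:, None]
y = np.sin(X).ravel() + 0.1 * np.random.randn(20)
mu, sd = gp_predict(X, y, np.linspace(0, 5, 100)[:, None])
```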

Journal ArticleDOI
TL;DR: The chaotic nature of the balanced state of this network model is revealed by showing that the evolution of the microscopic state of the network is extremely sensitive to small deviations in its initial conditions.
Abstract: The nature and origin of the temporal irregularity in the electrical activity of cortical neurons in vivo are not well understood. We consider the hypothesis that this irregularity is due to a balance of excitatory and inhibitory currents into the cortical cells. We study a network model with excitatory and inhibitory populations of simple binary units. The internal feedback is mediated by relatively large synaptic strengths, so that the magnitude of the total excitatory and inhibitory feedback is much larger than the neuronal threshold. The connectivity is random and sparse. The mean number of connections per unit is large, though small compared to the total number of cells in the network. The network also receives a large, temporally regular input from external sources. We present an analytical solution of the mean-field theory of this model, which is exact in the limit of large network size. This theory reveals a new cooperative stationary state of large networks, which we term a balanced state.
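The model is simple enough to simulate directly. The toy sketch below builds a sparse random network of binary units with strong (1/sqrt(K)-scaled) excitation and inhibition plus a steady external drive, and probes the sensitivity claim by flipping a single unit in the initial state and comparing the two trajectories; all sizes and constants are illustrative, not the paper's parameters.

```python
import numpy as np

def simulate(s0, W, ext, theta, steps, seed=1):
    rng = np.random.default_rng(seed)   # identical update order across runs
    s = s0.copy()
    for _ in range(steps):
        i = rng.integers(len(s))        # asynchronous random-sequential update
        s[i] = 1.0 if W[i] @ s + ext > theta else 0.0
    return s

N, K = 800, 40
rng = np.random.default_rng(0)
J = 1.0 / np.sqrt(K)                    # strong synapses: K*J = sqrt(K) >> theta
W = (rng.random((N, N)) < K / N) * J \
  - (rng.random((N, N)) < K / N) * J    # sparse excitation and inhibition
ext, theta = 0.7 * np.sqrt(K) * J, 1.0  # large, steady external drive

s_a = (rng.random(N) < 0.3).astype(float)
s_b = s_a.copy()
s_b[0] = 1 - s_b[0]                     # flip a single unit initially
a = simulate(s_a, W, ext, theta, steps=20000)
b = simulate(s_b, W, ext, theta, steps=20000)
print("final Hamming distance:", int(np.abs(a - b).sum()))  # typically >> 1
```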

Journal ArticleDOI
TL;DR: Models of neural networks are developed from a biological point of view and small networks are analysed using techniques from dynamical systems.
Abstract: Models of neural networks are developed from a biological point of view. Small networks are analysed using techniques from dynamical systems. The behaviour of spatially and temporally organized neural fields is then discussed from the point of view of pattern formation. Bifurcation methods, analytic solutions and perturbation methods are applied to these models.

Book
01 Aug 1998
TL;DR: This comprehensive tutorial introduces artificial neural networks, covering all the important network architectures together with the underlying theory, and provides a broad range of applications for each architecture.
Abstract: This comprehensive tutorial on artificial neural networks covers all the important neural network architectures as well as the most recent theory, e.g., pattern recognition, statistical theory, and other mathematical prerequisites. A broad range of applications is provided for each of the architectures.

Journal ArticleDOI
TL;DR: This paper introduces two algorithms for analog and digital modulation recognition: the first utilizes a decision-theoretic approach, in which a set of decision criteria for identifying different types of modulations is developed, and the second uses an artificial neural network.
Abstract: This paper introduces two algorithms for analog and digital modulation recognition. The first algorithm utilizes the decision-theoretic approach, in which a set of decision criteria for identifying different types of modulations is developed. In the second algorithm the artificial neural network (ANN) is used as a new approach for the modulation recognition process. Computer simulations of different types of band-limited analog and digitally modulated signals corrupted by band-limited Gaussian noise sequences have been carried out to measure the performance of the developed algorithms. For the decision-theoretic algorithm the overall success rate is over 94% at a signal-to-noise ratio (SNR) of 15 dB, while for the ANN algorithm the overall success rate is over 96% at the same SNR.

Journal ArticleDOI
C. Nebauer1
TL;DR: Instead of training convolutional networks by time-consuming error backpropagation, a modular procedure is applied whereby layers are trained sequentially from the input to the output layer in order to recognize features of increasing complexity.
Abstract: Convolutional neural networks provide an efficient method to constrain the complexity of feedforward neural networks by weight sharing and restriction to local connections. This network topology has been applied in particular to image classification when sophisticated preprocessing is to be avoided and raw images are to be classified directly. In this paper two variations of convolutional networks, the neocognitron and a modification of the neocognitron, are compared with classifiers based on fully connected feedforward layers with respect to their visual recognition performance. For a quantitative experimental comparison with standard classifiers two very different recognition tasks have been chosen: handwritten digit recognition and face recognition. In the first example, the generalization of convolutional networks is compared to that of fully connected networks; in the second example human face recognition is investigated under constrained and variable conditions, and the limitations of convolutional networks are discussed.

Book
26 Jun 1998
TL;DR: This book treats probabilistic inference in graphical models and its applications to pattern classification, unsupervised learning, data compression, and channel coding, closing with future research directions.
Abstract: Probabilistic inference in graphical models; pattern classification; unsupervised learning; data compression; channel coding; future research directions.

Proceedings ArticleDOI
23 Jun 1998
TL;DR: This paper presents a neural network-based face detection system which, unlike similar systems that are limited to detecting upright frontal faces, detects faces at any degree of rotation in the image plane, and presents preliminary results for detecting faces rotated out of the image plane, such as profiles and semi-profiles.
Abstract: In this paper, we present a neural network-based face detection system. Unlike similar systems which are limited to detecting upright, frontal faces, this system detects faces at any degree of rotation in the image plane. The system employs multiple networks; a "router" network first processes each input window to determine its orientation and then uses this information to prepare the window for one or more "detector" networks. We present the training methods for both types of networks. We also perform sensitivity analysis on the networks, and present empirical results on a large test set. Finally, we present preliminary results for detecting faces rotated out of the image plane, such as profiles and semi-profiles.
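The router/detector division of labor can be sketched as a small pipeline. The lambdas below are trivial stand-ins for the two trained networks, and scipy's rotate does the derotation; everything here is illustrative, not the paper's implementation.

```python
import numpy as np
from scipy.ndimage import rotate

def detect_rotated(windows, router, detector):
    """Router-then-detector pipeline: estimate each window's in-plane
    orientation, derotate it, then apply an upright-face detector."""
    hits = []
    for w in windows:
        angle = router(w)                           # estimated rotation, degrees
        upright = rotate(w, -angle, reshape=False)  # undo the rotation
        if detector(upright):
            hits.append((w, angle))
    return hits

# trivial stand-ins for the two trained networks
windows = [np.random.rand(20, 20) for _ in range(10)]
hits = detect_rotated(windows,
                      router=lambda w: 0.0,           # pretends all are upright
                      detector=lambda w: w.mean() > 0.5)
print(len(hits), "detections")
```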

Book ChapterDOI
Vladimir Vapnik1
01 Jan 1998
TL;DR: For the Support Vector method, neither the quality nor the complexity of the solution depends directly on the dimensionality of the input space, so on the basis of this technique one can obtain a good estimate using a given number of high-dimensional data.
Abstract: This chapter describes the Support Vector technique for function estimation problems such as pattern recognition, regression estimation, and solving linear operator equations. It shows that for the Support Vector method neither the quality nor the complexity of the solution depends directly on the dimensionality of the input space. Therefore, on the basis of this technique one can obtain a good estimate using a given number of high-dimensional data.

Book
31 Aug 1998
TL;DR: This book surveys data mining and knowledge discovery techniques, covering rough sets, fuzzy sets, Bayesian methods, evolutionary computing, machine learning, neural networks, clustering, and preprocessing.
Abstract: Foreword. Preface. 1. Data Mining and Knowledge Discovery. 2. Rough Sets. 3. Fuzzy Sets. 4. Bayesian Methods. 5. Evolutionary Computing. 6. Machine Learning. 7. Neural Networks. 8. Clustering. 9. Preprocessing. Index.

Journal ArticleDOI
TL;DR: A fundamental open problem in computer vision—determining pose and correspondence between two sets of points in space—is solved with a novel, fast, robust and easily implementable algorithm using a combination of optimization techniques.

Journal ArticleDOI
TL;DR: A unifying framework is introduced to understand existing approaches to the universal approximation problem for feedforward neural networks, and two training algorithms are introduced which can determine the weights of a feedforward neural network with sigmoidal activation neurons to any prescribed degree of accuracy.

MonographDOI
01 Oct 1998
TL;DR: The authors provide a coherent account of various important concepts and techniques that are currently only found scattered in papers, supplement this with background material in mathematics and physics, and include many examples and exercises.
Abstract: From the Publisher: The effort to build machines that are able to learn and undertake tasks such as data mining, image processing and pattern recognition has led to the development of artificial neural networks in which learning from examples may be described and understood. The contribution to this subject made over the past decade by researchers applying the techniques of statistical mechanics is the subject of this book. The authors provide a coherent account of various important concepts and techniques that are currently only found scattered in papers, supplement this with background material in mathematics and physics, and include many examples and exercises.

Book
09 Dec 1998
TL;DR: The text has been tailored to give a comprehensive study of robot dynamics, present structured network models for robots, and provide systematic approaches for neural network based adaptive controller design for rigid robots, flexible joint robots, and robots in constrained motion.
Abstract: There has been considerable research interest in neural network control of robots, and satisfactory results have been obtained in solving some of the special issues associated with the problems of robot control in an "on-and-off" fashion. This text is dedicated to issues of adaptive control of robots based on neural networks. The text has been tailored to give a comprehensive study of robot dynamics, present structured network models for robots, and provide systematic approaches for neural network based adaptive controller design for rigid robots, flexible joint robots, and robots in constrained motion. Rigorous proofs of the stability properties of adaptive neural network controllers are provided. Simulation examples are also presented to verify the effectiveness of the controllers, and practical implementation issues associated with the controllers are also discussed.