Showing papers on "Artificial neural network published in 1994"

PDF

Open Access

Journal Article•DOI•

Learning long-term dependencies with gradient descent is difficult

[...]

Yoshua Bengio¹, Patrice Y. Simard², Paolo Frasconi³•Institutions (3)

Université de Montréal¹, AT&T², University of Florence³

01 Mar 1994-IEEE Transactions on Neural Networks

TL;DR: This work shows why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases, and exposes a trade-off between efficient learning by gradient descent and latching on information for long periods.

...read moreread less

Abstract: Recurrent neural networks can be used to map input sequences to output sequences, such as for recognition, production or prediction problems. However, practical difficulties have been reported in training recurrent neural networks to perform tasks in which the temporal contingencies present in the input/output sequences span long intervals. We show why gradient based learning algorithms face an increasingly difficult problem as the duration of the dependencies to be captured increases. These results expose a trade-off between efficient learning by gradient descent and latching on information for long periods. Based on an understanding of this problem, alternatives to standard gradient descent are considered. >

...read moreread less

7,309 citations

Journal Article•DOI•

Training feedforward networks with the Marquardt algorithm

[...]

Martin T. Hagan, Mohammad Bagher Menhaj

01 Nov 1994-IEEE Transactions on Neural Networks

TL;DR: The Marquardt algorithm for nonlinear least squares is presented and is incorporated into the backpropagation algorithm for training feedforward neural networks and is found to be much more efficient than either of the other techniques when the network contains no more than a few hundred weights.

...read moreread less

Abstract: The Marquardt algorithm for nonlinear least squares is presented and is incorporated into the backpropagation algorithm for training feedforward neural networks. The algorithm is tested on several function approximation problems, and is compared with a conjugate gradient algorithm and a variable learning rate algorithm. It is found that the Marquardt algorithm is much more efficient than either of the other techniques when the network contains no more than a few hundred weights. >

...read moreread less

6,899 citations

Journal Article•DOI•

Using mutual information for selecting features in supervised neural net learning

[...]

Roberto Battiti¹•Institutions (1)

University of Trento¹

01 Jul 1994-IEEE Transactions on Neural Networks

TL;DR: This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier.

...read moreread less

Abstract: This paper investigates the application of the mutual information criterion to evaluate a set of candidate features and to select an informative subset to be used as input data for a neural network classifier. Because the mutual information measures arbitrary dependencies between random variables, it is suitable for assessing the "information content" of features in complex classification tasks, where methods bases on linear relations (like the correlation) are prone to mistakes. The fact that the mutual information is independent of the coordinates chosen permits a robust estimation. Nonetheless, the use of the mutual information for tasks characterized by high input dimensionality requires suitable approximations because of the prohibitive demands on computation and samples. An algorithm is proposed that is based on a "greedy" selection of the features and that takes both the mutual information with respect to the output class and with respect to the already-selected features into account. Finally the results of a series of experiments are discussed. >

...read moreread less

2,423 citations

Journal Article•DOI•

Hierarchical mixtures of experts and the EM algorithm

[...]

Michael I. Jordan¹, Robert A. Jacobs²•Institutions (2)

Massachusetts Institute of Technology¹, University of Rochester²

01 Mar 1994-Neural Computation

TL;DR: An Expectation-Maximization (EM) algorithm for adjusting the parameters of the tree-structured architecture for supervised learning and an on-line learning algorithm in which the parameters are updated incrementally.

...read moreread less

Abstract: We present a tree-structured architecture for supervised learning. The statistical model underlying the architecture is a hierarchical mixture model in which both the mixture coefficients and the mixture components are generalized linear models (GLIM's). Learning is treated as a maximum likelihood problem; in particular, we present an Expectation-Maximization (EM) algorithm for adjusting the parameters of the architecture. We also develop an on-line learning algorithm in which the parameters are updated incrementally. Comparative simulation results are presented in the robot dynamics domain.

...read moreread less

2,418 citations

Proceedings Article•

Neural Network Ensembles, Cross Validation, and Active Learning

[...]

Anders Krogh, Jesper Vedelsby¹•Institutions (1)

Technical University of Denmark¹

01 Jan 1994

TL;DR: It is shown how to estimate the optimal weights of the ensemble members using unlabeled data and how the ambiguity can be used to select new training data to be labeled in an active learning scheme.

...read moreread less

Abstract: Learning of continuous valued functions using neural network ensembles (committees) can give improved accuracy, reliable estimation of the generalization error, and active learning. The ambiguity is defined as the variation of the output of ensemble members averaged over unlabeled data, so it quantifies the disagreement among the networks. It is discussed how to use the ambiguity in combination with cross-validation to give a reliable estimate of the ensemble generalization error, and how this type of ensemble cross-validation can sometimes improve performance. It is shown how to estimate the optimal weights of the ensemble members using unlabeled data. By a generalization of query by committee, it is finally shown how the ambiguity can be used to select new training data to be labeled in an active learning scheme.

...read moreread less

1,952 citations

Journal Article•DOI•

Fuzzy logic, neural networks, and soft computing

[...]

Lotfi A. Zadeh¹•Institutions (1)

University of California, Berkeley¹

01 Mar 1994-Communications of The ACM

TL;DR: Today, the authors have microwave ovens and washing machines that can figure out on their own what settings to use to perform their tasks optimally; cameras that come close to professional photographers in picture-taking ability; and many other products that manifest an impressive capability to reason, make intelligent decisions, and learn from experience.

...read moreread less

Abstract: Prof. Zadeh presented a comprehensive lecture on fuzzy logic, neural networks, and soft computing. In addition, he lead a spirited discussion of how these relatively new techniques may be applied to safety evaluation of time variant and nonlinear structures based on identification approaches. The abstract of his lecture is given as follows.

...read moreread less

1,390 citations

Journal Article•DOI•

Growing cell structures—a self-organizing network for unsupervised and supervised learning

[...]

Bernd Fritzke

01 Nov 1994-Neural Networks

TL;DR: A new self-organizing neural network model that has two variants that performs unsupervised learning and can be used for data visualization, clustering, and vector quantization is presented and results on the two-spirals benchmark and a vowel classification problem are presented that are better than any results previously published.

...read moreread less

1,319 citations

Journal Article•DOI•

Recurrent neural networks and robust time series prediction

[...]

J. Connor, R.D. Martin¹, Les Atlas¹•Institutions (1)

University of Washington¹

01 Mar 1994-IEEE Transactions on Neural Networks

TL;DR: A robust learning algorithm is proposed and applied to recurrent neural networks, NARMA(p,q), which show advantages over feedforward neural networks for time series with a moving average component and are shown to give better predictions than neural networks trained on unfiltered time series.

...read moreread less

Abstract: We propose a robust learning algorithm and apply it to recurrent neural networks. This algorithm is based on filtering outliers from the data and then estimating parameters from the filtered data. The filtering removes outliers from both the target function and the inputs of the neural network. The filtering is soft in that some outliers are neither completely rejected nor accepted. To show the need for robust recurrent networks, we compare the predictive ability of least squares estimated recurrent networks on synthetic data and on the Puget Power Electric Demand time series. These investigations result in a class of recurrent neural networks, NARMA(p,q), which show advantages over feedforward neural networks for time series with a moving average component. Conventional least squares methods of fitting NARMA(p,q) neural network models are shown to suffer a lack of robustness towards outliers. This sensitivity to outliers is demonstrated on both the synthetic and real data sets. Filtering the Puget Power Electric Demand time series is shown to automatically remove the outliers due to holidays. Neural networks trained on filtered data are then shown to give better predictions than neural networks trained on unfiltered time series. >

...read moreread less

1,169 citations

Journal Article•DOI•

Neural Networks: A Review from a Statistical Perspective

[...]

Bing Cheng, D. M. Titterington

01 Feb 1994-Statistical Science

TL;DR: This paper informs a statistical readership about Artificial Neural Networks (ANNs), points out some of the links with statistical methodology and encourages cross-disciplinary research in the directions most likely to bear fruit, and treats various topics in more depth.

...read moreread less

Abstract: This paper informs a statistical readership about Artificial Neural Networks (ANNs), points out some of the links with statistical methodology and encourages cross-disciplinary research in the directions most likely to bear fruit. The areas of statistical interest are briefly outlined, and a series of examples indicates the flavor of ANN models. We then treat various topics in more depth. In each case, we describe the neural network architectures and training rules and provide a statistical commentary. The topics treated in this way are perceptrons (from single-unit to multilayer versions), Hopfield-type recurrent networks (including probabilistic versions strongly related to statistical physics and Gibbs distributions) and associative memory networks trained by so-called unsupervised learning rules. Perceptrons are shown to have strong associations with discriminant analysis and regression, and unsupervized networks with cluster analysis. The paper concludes with some thoughts on the future of the interface between neural networks and statistics.

...read moreread less

1,114 citations

Journal Article•DOI•

An evolutionary algorithm that constructs recurrent neural networks

[...]

Peter J. Angeline¹, Gregory M. Saunders¹, Jordan Pollack¹•Institutions (1)

Ohio State University¹

01 Jan 1994-IEEE Transactions on Neural Networks

TL;DR: It is argued that genetic algorithms are inappropriate for network acquisition and an evolutionary program is described, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks.

...read moreread less

Abstract: Standard methods for simultaneously inducing the structure and weights of recurrent neural networks limit every task to an assumed class of architectures. Such a simplification is necessary since the interactions between network structure and function are not well understood. Evolutionary computations, which include genetic algorithms and evolutionary programming, are population-based search methods that have shown promise in many similarly complex tasks. This paper argues that genetic algorithms are inappropriate for network acquisition and describes an evolutionary program, called GNARL, that simultaneously acquires both the structure and weights for recurrent networks. GNARL's empirical acquisition method allows for the emergence of complex behaviors and topologies that are potentially excluded by the artificial architectural constraints imposed in standard network induction methods. >

...read moreread less

1,092 citations

Mixture density networks

[...]

Christopher M. Bishop

01 Jan 1994

TL;DR: This paper introduces a new class of network models obtained by combining a conventional neural network with a mixture density model, called a Mixture Density Network, which can in principle represent arbitrary conditional probability distributions in the same way that aventional neural network can represent arbitrary functions.

...read moreread less

Abstract: Minimization of a sum-of-squares or cross-entropy error function leads to network outputs which approximate the conditional averages of the target data, conditioned on the input vector. For classifications problems, with a suitably chosen target coding scheme, these averages represent the posterior probabilities of class membership, and so can be regarded as optimal. For problems involving the prediction of continuous variables, however, the conditional averages provide only a very limited description of the properties of the target variables. This is particularly true for problems in which the mapping to be learned is multi-valued, as often arises in the solution of inverse problems, since the average of several correct target values is not necessarily itself a correct value. In order to obtain a complete description of the data, for the purposes of predicting the outputs corresponding to new input vectors, we must model the conditional probability distribution of the target data, again conditioned on the input vector. In this paper we introduce a new class of network models obtained by combining a conventional neural network with a mixture density model. The complete system is called a Mixture Density Network, and can in principle represent arbitrary conditional probability distributions in the same way that a conventional neural network can represent arbitrary functions. We demonstrate the effectiveness of Mixture Density Networks using both a toy problem and a problem involving robot inverse kinematics.

...read moreread less

Journal Article•DOI•

Learning and generalization characteristics of the random vector functional-link net

[...]

Yoh Han Pao¹, Gwang Hoon Park¹, Dejan J. Sobajic¹•Institutions (1)

Case Western Reserve University¹

01 Apr 1994-Neurocomputing

TL;DR: The learning and generalization characteristics of the random vector version of the Functional-link net are explored and compared with those attainable with the GDR algorithm and it seems that ‘ overtraining ’ occurs for stochastic mappings.

...read moreread less

Journal Article•DOI•

Knowledge-based artificial neural networks

[...]

Geoffrey G. Towell¹, Jude W. Shavlik¹•Institutions (1)

University of Wisconsin-Madison¹

01 Oct 1994-Artificial Intelligence

TL;DR: These tests show that the networks created by KBANN generalize better than a wide variety of learning systems, as well as several techniques proposed by biologists.

...read moreread less

Book•

Neurofuzzy adaptive modelling and control

[...]

Martin Brown¹, Chris Harris¹•Institutions (1)

University of Southampton¹

01 Jan 1994

TL;DR: This book provides a unified description of several adaptive neural and fuzzy networks and introduces the associative memory class of systems - which describe the similarities and differences existing between fuzzy and neural algorithms.

...read moreread less

Abstract: This book provides a unified description of several adaptive neural and fuzzy networks and introduces the associative memory class of systems - which describe the similarities and differences existing between fuzzy and neural algorithms. Three networks are described in detail - the Albus CMAC, the B-spline network and a class of fuzzy systems - and then analysed, their desirable features (local learning, linearly dependent on the parameter set, fuzzy interpretation) are emphasised and the algorithms are all evaluated on a common time series problem and applied to a common ship control benchmark. Chapters: 1. An Introduction to Learning Modelling and Control 1.1 Preliminaries 1.2 Intelligent Control 1.3 Learning Modelling and Control 1.4 Artificial Neural Networks 1.5 Fuzzy Control Systems 1.6 Book Description 2. Neural Networks for Modelling and Control 2.1 Introduction 2.2 Neuromodelling and Control Architectures 2.3 Neural Network Structure 2.4 Training Algorithms 2.5 Validation of a Neural Model 2.6 Discussion 3. Associative Memory Networks 3.1 Introduction 3.2 A Common Description 3.3 Five Associative Memory Networks 3.4 Summary 4. Adaptive Linear Modelling 4.1 Introduction 4.2 Linear Models 4.3 Performance of the Model 4.4 Gradient Descent 4.5 Multi-Layer Perceptrons and Back Propagation 4.6 Network Stability 4.7 Conclusion 5. Instantaneous Learning Algorithms 5.1 Introduction 5.2 Instantaneous Learning Rules 5.3 Parameter Convergence 5.4 The Effects of Instantaneous Estimates 5.5 Learning Interference in Associative Memory Networks 5.6 Higher Order Learning Rules 5.7 Discussion 6. The CMAC Algorithm 6.1 Introduction 6.2 The Basic Algorithm 6.3 Adaptation Strategies 6.4 Higher Order Basis Functions 6.5 Computational Requirements 6.6 Nonlinear Time Series Modelling 6.7 Modelling and Control Applications 6.8 Conclusions 7. The Modelling Capabilities of the Binary CMAC 7.1 Modelling and Generalisation in the Binary CMAC 7.2 Measuring the Flexibility of the Binary CMAC 7.3 Consistency Equations 7.4 Orthogonal Functions 7.5 Bounding the Modelling Error 7.6 Investigating the CMAC's Coarse Coding Map 7.7 Conclusion 8. Adaptive B-spline Networks 8.1 Introduction 8.2 Basic Algorithm 8.3 B-spline Learning Rules 8.4 B-spline Time Series Modelling 8.5 Model Adaptation Rules 8.6 ASMOD Time Series Modelling 8.7 Discussion 9. B-spline Guidance Algorithms 9.1 Introduction 9.2 Autonomous Docking 9.3 Constrained Trajectory Generation 9.4 B-spline Interpolants 9.5 Boundary and Kinematic Constraints 9.6 Example: A Quadratic Velocity Interpolant 9.7 Discussion 10. The Representation of Fuzzy Algorithms 10.1 Introduction: How Fuzzy is a Fuzzy Model? 10.2 Fuzzy Algorithms 10.3 Fuzzy Sets 10.4 Logical Operators 10.5 Compositional Rule of Inference 10.6 Defuzzification 10.7 Conclusions 11. Adaptive Fuzzy Modelling and Control 11.1 Introduction 11.2 Learning Algorithms 11.3 Plant Modelling 11.4 Indirect Fuzzy Control 11.5 Direct Fuzzy Control References. Appendix A. Modified Error Correction Rule Appendix B. Improved CMAC Displacement Tables Appendix C. Associative Memory Network Software Structure C.1 Data Structures C.2 Interface Functions C.3 Sample C Code Appendix D. Fuzzy Intersection Appendix E. Weight to Rule Confidence Vector Map For further information about this book (mailing/shipping costs etc.) and other neurofuzzy titles in the Prentice Hall series please contact: LIZ DICKINSON Prentice Hall Paramount Publishing International Campus 400 Maylands Avenue Hemel Hempstead, HP2 7EZ United Kingdom Tel: 0442 881900 Fax: 0442 257115 Contents

...read moreread less

Journal Article•DOI•

Network information criterion-determining the number of hidden units for an artificial neural network model

[...]

Noboru Murata¹, Shuji Yoshizawa¹, Shun-ichi Amari¹•Institutions (1)

University of Tokyo¹

01 Nov 1994-IEEE Transactions on Neural Networks

TL;DR: The problem of model selection, or determination of the number of hidden units, can be approached statistically, by generalizing Akaike's information criterion (AIC) to be applicable to unfaithful models with general loss criteria including regularization terms.

...read moreread less

Abstract: The problem of model selection, or determination of the number of hidden units, can be approached statistically, by generalizing Akaike's information criterion (AIC) to be applicable to unfaithful (i.e., unrealizable) models with general loss criteria including regularization terms. The relation between the training error and the generalization error is studied in terms of the number of the training examples and the complexity of a network which reduces to the number of parameters in the ordinary statistical theory of AIC. This relation leads to a new network information criterion which is useful for selecting the optimal network model based on a given training set. >

...read moreread less

A Model of How the Basal Ganglia Generate and Use Neural Signals That Predict Reinforcement

[...]

James C. Houk¹, Joel L. Davis, David G. Beiser•Institutions (1)

Northwestern University¹

01 Jan 1994

TL;DR: This chapter contains sections titled: Introduction, Dopamine Neurons, Organization of Strtosomal Modules, Mechanism of Responsiveness to Predictors of Reinforcement, Correspondence with the Theory of Adaptive Critics, and Relation to the Actor-Critic Architecture.

...read moreread less

Abstract: This chapter contains sections titled: Introduction, Dopamine Neurons, Organization of Strtosomal Modules, Mechanism of Responsiveness to Predictors of Reinforcement, Correspondence with the Theory of Adaptive Critics, Learning to Predict Primary Reinforcement, Learning Earlier Predictors of Reinforcement, Relation to the Actor-Critic Architecture, More Realistic Assumptions, Summary, Acknowledgments, References

...read moreread less

Journal Article•DOI•

Neural Networks for River Flow Prediction

[...]

Nachimuthu Karunanithi, William J. Grenney, Darrell Whitley, Ken Bovee

01 Apr 1994-Journal of Computing in Civil Engineering

TL;DR: This paper demonstrates how a neural network can be used as an adaptive model synthesizer as well as a predictor in the flow prediction of the Huron River at the Dexter sampling station, near Ann Arbor, Mich.

...read moreread less

Abstract: The surface-water hydrographs of rivers exhibit large variations due to many natural phenomena. One of the most commonly used approachs for interpolating and extending streamflow records is to fit observed data with an analytic power model. However, such analytic models may not adequately represent the flow process, because they are based on many simplifying assumptions about the natural phenomena that influence the river flow. This paper demonstrates how a neural network can be used as an adaptive model synthesizer as well as a predictor. Issues such as selecting an appropriate neural network architecture and a correct training algorithm as well as presenting data to neural networks are addressed using a constructive algorithm called the cascade-correlation algorithm. The neural-network approach is applied to the flow prediction of the Huron River at the Dexter sampling station, near Ann Arbor, Mich. Empirical comparisons are performed between the predictive capability of the neural network models and the most commonly used analytic nonlinear power model in terms of accuracy and convenience of use. Our preliminary results are quite encouraging. An analysis performed on the structure of the networks developed by the cascade-correlation algorithm shows that the neural networks are capable of adapting their complexity to match changes in the flow history and that the models developed by the neural-network approach are more complex than the power model.

...read moreread less

Journal Article•DOI•

Cryptographic limitations on learning Boolean formulae and finite automata

[...]

Michael Kearns¹, Leslie G. Valiant²•Institutions (2)

Bell Labs¹, Harvard University²

02 Jan 1994-Journal of the ACM

TL;DR: It is proved that a polynomial-time learning algorithm for Boolean formulae, deterministic finite automata or constant-depth threshold circuits would have dramatic consequences for cryptography and number theory and is applied to obtain strong intractability results for approximating a generalization of graph coloring.

...read moreread less

Abstract: In this paper, we prove the intractability of learning several classes of Boolean functions in the distribution-free model (also called the Probably Approximately Correct or PAC model) of learning from examples. These results are representation independent, in that they hold regardless of the syntactic form in which the learner chooses to represent its hypotheses.Our methods reduce the problems of cracking a number of well-known public-key cryptosystems to the learning problems. We prove that a polynomial-time learning algorithm for Boolean formulae, deterministic finite automata or constant-depth threshold circuits would have dramatic consequences for cryptography and number theory. In particular, such an algorithm could be used to break the RSA cryptosystem, factor Blum integers (composite numbers equivalent to 3 modulo 4), and detect quadratic residues. The results hold even if the learning algorithm is only required to obtain a slight advantage in prediction over random guessing. The techniques used demonstrate an interesting duality between learning and cryptography.We also apply our results to obtain strong intractability results for approximating a generalization of graph coloring.

...read moreread less

Book•

Neural Networks in Computer Intelligence

[...]

LiMin Fu

01 Apr 1994

TL;DR: Neural Networks in Computer Intelligence provides basic concepts, algorithms, and analysis of important neural network models developed to date, with emphasis on the importance of knowledge in intelligent system design.

...read moreread less

Abstract: From the Publisher: Neural Networks in Computer Intelligence provides basic concepts,algorithms,and analysis of important neural network models developed to date,with emphasis on the importance of knowledge in intelligent system design The book bridges the gap between artificial intelligence and neural networks Unlike many other network books,this one pioneers the effort to offer a unified perspective which could be used to integrate intelligence technologies The broad coverage of the book and the emphasis on basic principles can accommodate the diverse background of readers

...read moreread less

Proceedings Article•

Generalization in Reinforcement Learning: Safely Approximating the Value Function

[...]

Justin A. Boyan¹, Andrew W. Moore¹•Institutions (1)

Carnegie Mellon University¹

01 Jan 1994

TL;DR: Grow-Support is introduced, a new algorithm which is safe from divergence yet can still reap the benefits of successful generalization, and which is not robust, and in even very benign cases, may produce an entirely wrong policy.

...read moreread less

Abstract: A straightforward approach to the curse of dimensionality in reinforcement learning and dynamic programming is to replace the lookup table with a generalizing function approximator such as a neural net. Although this has been successful in the domain of backgammon, there is no guarantee of convergence. In this paper, we show that the combination of dynamic programming and function approximation is not robust, and in even very benign cases, may produce an entirely wrong policy. We then introduce Grow-Support, a new algorithm which is safe from divergence yet can still reap the benefits of successful generalization.

...read moreread less

Journal Article•DOI•

Nonlinear flight control using neural networks

[...]

Byoung Soo Kim¹, Anthony J. Calise¹•Institutions (1)

Georgia Institute of Technology¹

01 Jan 1994-Journal of Guidance Control and Dynamics

TL;DR: The theoretical development of a direct adaptive tracking control architecture using neural networks based on feedback linearization of the aircraft dynamics is presented and a stable weights adjustment rule for the on-line neural network is derived.

...read moreread less

Abstract: The theoretical developmentofa direct adaptivetracking controlarchitectureusingneuralnetworks ispresented. Emphasis is placed on utilization of neural networks in a ight control architecture based on feedback linearization of the aircraft dynamics. Neural networks are used to represent the nonlinear inverse transformation needed for feedback linearization. Neural networks may be rst trained off line using a nominalmathematicalmodel, which provides an approximate inversion that can accommodate the total ight envelope. Neural networks capable of on-line learning are required to compensate for inversion error, which may arise from imperfect modeling, approximate inversion, or sudden changes in aircraft dynamics. A stable weights adjustment rule for the on-line neural network is derived. Under mild assumptions on the nonlinearities representing the inversion error, the adaptation algorithm ensures that all of the signals in the loop are uniformly bounded and that the weights of the on-line neural network tend to constant values. Simulation results for an F-18 aircraft model are presented to illustrate the performance of the on-line neural network based adaptation algorithm.

...read moreread less

Journal Article•DOI•

Neural networks and their applications

[...]

Christopher M. Bishop

01 Jan 1994-Review of Scientific Instruments

TL;DR: This review of neural networks provides a range of powerful new techniques for solving problems in pattern recognition, data analysis, and control and describes these models in detail and explains the various techniques used to train them.

...read moreread less

Abstract: Neural networks provide a range of powerful new techniques for solving problems in pattern recognition, data analysis, and control. They have several notable features including high processing speeds and the ability to learn the solution to a problem from a set of examples. The majority of practical applications of neural networks currently make use of two basic network models. We describe these models in detail and explain the various techniques used to train them. Next we discuss a number of key issues which must be addressed when applying neural networks to practical problems, and highlight several potential pitfalls. Finally, we survey the various classes of problem which may be addressed using neural networks, and we illustrate them with a variety of successful applications drawn from a range of fields. It is intended that this review should be accessible to readers with no previous knowledge of neural networks, and yet also provide new insights for those already making practical use of these techniques.

...read moreread less

Book•

Neural networks

[...]

Simon Haykin

01 Jan 1994

Book Chapter•DOI•

Novelty detection and neural network validation

[...]

Christopher M. Bishop¹•Institutions (1)

Aston University¹

01 Jan 1994

TL;DR: This paper provides a quantitative procedure for measuring novelty, and its performance is demonstrated using an application involving the monitoring of oil flow in multi-phase pipelines.

...read moreread less

Abstract: One of the key factors limiting the use of neural networks in many industrial applications has been the difficulty of demonstrating that a trained network will continue to generate reliable outputs once it is in routine use. An important potential source of errors arises from input data which differs significantly from that used to train the network. In this paper we investigate the relation between the degree of novelty of input data and the corresponding reliability of the output data. We provide a quantitative procedure for measuring novelty, and we demonstrate its performance using an application involving the monitoring of oil flow in multi-phase pipelines.

...read moreread less

Book•

The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting

[...]

Paul J. Werbos¹•Institutions (1)

National Science Foundation¹

01 Jan 1994

TL;DR: This book discusses forms of Backpropagation for Sensitivity Analysis, Optimization, and Neural Networks, and the importance of the Multivariate ARMA(1,1) Model in this regard.

...read moreread less

Abstract: THESIS. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences. Dynamic Feedback, Statistical Estimation, and Systems Optimization: General Techniques. The Multivariate ARMA(1,1) Model: Its Significance and Estimation. Simulation Studies of Techniques of Time--Series Analysis. General Applications of These Ideas: Practical Hazards and New Possibilities. Nationalism and Social Communications: A Test Case for Mathematical Approaches. APPLICATIONS AND EXTENSIONS. Forms of Backpropagation for Sensitivity Analysis, Optimization, and Neural Networks. Backpropagation Through Time: What It Does and How to Do It. Neurocontrol: Where It Is Going and Why It Is Crucial. Neural Networks and the Human Mind: New Mathematics Fits Humanistic Insight. Index.

...read moreread less

Journal Article•DOI•

Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks

[...]

G.V. Puskorius¹, L.A. Feldkamp¹•Institutions (1)

Ford Motor Company¹

01 Mar 1994-IEEE Transactions on Neural Networks

TL;DR: These simulations suggest that recurrent controller networks trained by Kalman filter methods can combine the traditional features of state-space controllers and observers in a homogeneous architecture for nonlinear dynamical systems, while simultaneously exhibiting less sensitivity than do purely feedforward controller networks to changes in plant parameters and measurement noise.

...read moreread less

Abstract: Although the potential of the powerful mapping and representational capabilities of recurrent network architectures is generally recognized by the neural network research community, recurrent neural networks have not been widely used for the control of nonlinear dynamical systems, possibly due to the relative ineffectiveness of simple gradient descent training algorithms. Developments in the use of parameter-based extended Kalman filter algorithms for training recurrent networks may provide a mechanism by which these architectures will prove to be of practical value. This paper presents a decoupled extended Kalman filter (DEKF) algorithm for training of recurrent networks with special emphasis on application to control problems. We demonstrate in simulation the application of the DEKF algorithm to a series of example control problems ranging from the well-known cart-pole and bioreactor benchmark problems to an automotive subsystem, engine idle speed control. These simulations suggest that recurrent controller networks trained by Kalman filter methods can combine the traditional features of state-space controllers and observers in a homogeneous architecture for nonlinear dynamical systems, while simultaneously exhibiting less sensitivity than do purely feedforward controller networks to changes in plant parameters and measurement noise. >

...read moreread less

Journal Article•DOI•

Neural Networks in Civil Engineering. I: Principles and Understanding

[...]

Ian Flood, Nabil A. Kartam

01 Apr 1994-Journal of Computing in Civil Engineering

TL;DR: An understanding of how these devices operate is developed and the main issues concerning their use are explained, including factors affecting their ability to learn and generalize.

...read moreread less

Abstract: This is the first of two papers providing a discourse on the understanding, usage, and potential for application of artificial neural networks within civil engineering. The present paper develops an understanding of how these devices operate and explains the main issues concerning their use. A simple structural‐analysis problem is solved using the most popular form of neural‐networking system—a feedforward network trained using a supervised scheme. A graphical interpretation of the way in which neural networks operate is first presented. This is followed by discussions of the primary concepts and issues concerning their use, including factors affecting their ability to learn and generalize, the selection of an appropriate set of training patterns, theoretical limitations of alternative network configurations, and network validation. The second paper demonstrates the ways in which different types of civil engineering problems can be tackled using neural networks. The objective of the two papers is to ensur...

...read moreread less

Journal Article•DOI•

Neural Networks and Related Methods for Classification

[...]

Brian D. Ripley

01 Sep 1994-Journal of the royal statistical society series b-methodological

Journal Article•DOI•

Advanced supervised learning in multi-layer perceptrons — From backpropagation to adaptive learning algorithms

[...]

Martin Riedmiller¹•Institutions (1)

Karlsruhe Institute of Technology¹

01 Jul 1994-Computer Standards & Interfaces

TL;DR: The concept of supervised learning in multi-layer perceptrons based on the technique of gradient descent is introduced and the behavior of several learning procedures on some popular benchmark problems is reported, thereby illuminating convergence, robustness, and scaling properties of the respective algorithms.

...read moreread less

Book•

Neural Network Time Series: Forecasting of Financial Markets

[...]

E. Michael Azoff

13 Sep 1994

TL;DR: This book takes the reader beyond the 'black-box' approach to neural networks and provides the knowledge that is required for their proper design and use in financial markets forecasting - with an emphasis on futures trading.

...read moreread less

Abstract: From the Publisher: A neural network is a computer program that can recognise patterns in data, learn from this and (in the case of time series data) make forecasts of future patterns. There are now over 20 commercially available neural network programs designed for use on financial markets and there have been some notable reports of their successful application. However, like any other computer program, neural networks are only as good as the data they are given and the questions that are asked of them. Proper use of a neural network involves spending time understanding and cleaning the data: removing errors, preprocessing and postprocessing. This book takes the reader beyond the 'black-box' approach to neural networks and provides the knowledge that is required for their proper design and use in financial markets forecasting - with an emphasis on futures trading. Comprehensively specified benchmarks are provided (including weight values), drawn from time series examples in chaos theory and financial futures. The book covers data preprocessing, random walk theory, trading systems and risk analysis. It also provides a literature review, a tutorial on backpropagation, and a chapter on further reading and software. For the professional financial forecaster this book is without parallel as a comprehensive, practical and up-to-date guide to this important subject.

...read moreread less

Collapse