
Showing papers on "Deep learning published in 1992"


Book ChapterDOI
01 Jan 1992
TL;DR: A speculative neurophysiological model illustrating how the backpropagation neural network architecture might plausibly be implemented in the mammalian brain for corticocortical learning between nearby regions of the cerebral cortex is presented.
Abstract: Publisher Summary This chapter presents a survey of the elementary theory of the basic backpropagation neural network architecture, covering the areas of architectural design, performance measurement, function approximation capability, and learning. The survey includes a formulation of the backpropagation neural network architecture to make it a valid neural network and a proof that the backpropagation mean squared error function exists and is differentiable. Also included in the survey is a theorem showing that any L2 function can be implemented to any desired degree of accuracy with a three-layer backpropagation neural network. An appendix presents a speculative neurophysiological model illustrating the way in which the backpropagation neural network architecture might plausibly be implemented in the mammalian brain for corticocortical learning between nearby regions of cerebral cortex. One of the crucial decisions in the design of the backpropagation architecture is the selection of a sigmoidal activation function.
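The chapter's main ingredients (a three-layer network, a sigmoidal activation function, and a differentiable mean squared error minimized by backpropagation) can be illustrated with a short NumPy sketch. The layer sizes, learning rate, and toy data below are illustrative assumptions, not anything taken from the chapter.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Toy regression data: approximate a smooth L2 function (here sin) on [0, 1].
X = rng.uniform(0.0, 1.0, size=(200, 1))
Y = np.sin(2.0 * np.pi * X)

# Three-layer network: input -> sigmoidal hidden layer -> linear output.
n_in, n_hidden, n_out = 1, 20, 1
W1 = rng.normal(0.0, 0.5, size=(n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=(n_hidden, n_out)); b2 = np.zeros(n_out)

lr = 0.1
for epoch in range(5000):
    # Forward pass.
    H = sigmoid(X @ W1 + b1)          # hidden activations
    out = H @ W2 + b2                 # network output
    err = out - Y
    mse = np.mean(err ** 2)           # the differentiable mean squared error

    # Backward pass: gradients of the MSE with respect to the weights.
    d_out = 2.0 * err / len(X)
    dW2 = H.T @ d_out; db2 = d_out.sum(axis=0)
    d_hidden = (d_out @ W2.T) * H * (1.0 - H)   # sigmoid derivative
    dW1 = X.T @ d_hidden; db1 = d_hidden.sum(axis=0)

    # Gradient descent step.
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2

print(f"final MSE: {mse:.4f}")
```

The hidden layer uses the sigmoid whose selection the chapter discusses; the output layer is kept linear so the three-layer network can approximate the target to the desired accuracy on the interval.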

1,729 citations


Journal ArticleDOI
TL;DR: A system architecture and a network computational approach compatible with the goal of devising a general-purpose artificial neural network computer are described and the functionalities of supervised learning and optimization are illustrated.
Abstract: A system architecture and a network computational approach compatible with the goal of devising a general-purpose artificial neural network computer are described. The functionalities of supervised learning and optimization are illustrated, and cluster analysis and associative recall are briefly mentioned.

692 citations


Book
03 Jan 1992
TL;DR: In this book, a collection of essays explores neural network applications in signal and image processing, function and estimation, robotics and control, associative memories, and electrical and optical networks.
Abstract: This collection of essays explores neural network applications in signal and image processing, function and estimation, robotics and control, associative memories, and electrical and optical networks. Intended as a companion to "Neural Networks and Fuzzy Systems", this reference is designed to be of use to scientists, engineers and others working in the neural network field.

408 citations


Journal ArticleDOI
07 Sep 1992
TL;DR: Gelenbe et al. as mentioned in this paper presented a learning algorithm for the recurrent random network model using gradient descent of a quadratic error function, which requires the solution of a system of n linear and n nonlinear equations each time the n-neuron network "learns" a new input-output pair.
Abstract: The capacity to learn from examples is one of the most desirable features of neural network models. We present a learning algorithm for the recurrent random network model (Gelenbe 1989, 1990) using gradient descent of a quadratic error function. The analytical properties of the model lead to a "backpropagation" type algorithm that requires the solution of a system of n linear and n nonlinear equations each time the n-neuron network "learns" a new input-output pair.
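As a rough illustration of the setup (not the paper's algorithm), the sketch below iterates Gelenbe's signal-flow equations to a fixed point and then performs gradient descent on the quadratic error. Where the paper obtains the gradient analytically by solving n linear plus n nonlinear equations per pattern, a finite-difference gradient is substituted purely to keep the sketch short; the network size, rates, and target vector are assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 4                                            # number of neurons (assumed)

def steady_state(Wp, Wm, Lam, lam, iters=100):
    """Fixed-point iteration for the random network's steady-state activities
    q_i = lambda_plus_i / (r_i + lambda_minus_i)  (Gelenbe's signal-flow equations)."""
    r = Wp.sum(axis=1) + Wm.sum(axis=1)          # neuron firing rates
    q = np.full(n, 0.5)
    for _ in range(iters):
        lp = q @ Wp + Lam                        # positive signal arrival rates
        lm = q @ Wm + lam                        # negative signal arrival rates
        q = np.clip(lp / (r + lm), 0.0, 1.0)
    return q

def quadratic_error(Wp, Wm, Lam, lam, target):
    q = steady_state(Wp, Wm, Lam, lam)
    return 0.5 * np.sum((q - target) ** 2)

# Gradient descent on the quadratic error for one input-output pair.
Wp = rng.uniform(0.1, 0.5, (n, n)); Wm = rng.uniform(0.1, 0.5, (n, n))
Lam = rng.uniform(0.2, 0.6, n); lam = rng.uniform(0.0, 0.2, n)
target = np.array([0.8, 0.2, 0.6, 0.4])
eta, eps = 0.2, 1e-5
for _ in range(100):
    for W in (Wp, Wm):
        G = np.zeros_like(W)
        for idx in np.ndindex(W.shape):
            W[idx] += eps; e_plus = quadratic_error(Wp, Wm, Lam, lam, target)
            W[idx] -= 2 * eps; e_minus = quadratic_error(Wp, Wm, Lam, lam, target)
            W[idx] += eps
            G[idx] = (e_plus - e_minus) / (2 * eps)
        W -= eta * G
        np.clip(W, 0.0, None, out=W)             # excitatory/inhibitory rates stay non-negative
print(quadratic_error(Wp, Wm, Lam, lam, target)) # error after training
```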

377 citations


Journal ArticleDOI
TL;DR: A multilayer feedforward neural network is proposed for short-term load forecasting and it is found that, once trained by the proposed learning algorithm, the neural network can yield the desired hourly load forecast efficiently and accurately.
Abstract: A multilayer feedforward neural network is proposed for short-term load forecasting. To speed up the training process, a learning algorithm for the adaptive training of neural networks is presented. The effectiveness of the neural network with the proposed adaptive learning algorithm is demonstrated by short-term load forecasting of the Taiwan power system. It is found that, once trained by the proposed learning algorithm, the neural network can yield the desired hourly load forecast efficiently and accurately. The proposed adaptive learning algorithm converges much faster than the conventional backpropagation-momentum learning method.
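The abstract does not spell out the adaptive rule, so the sketch below uses a generic "bold driver" style heuristic (grow the step while the error falls, shrink it when the error rises) only to illustrate how adaptive learning rates can speed up gradient training; it should not be read as the authors' algorithm.

```python
import numpy as np

def adaptive_gradient_descent(loss_and_grad, w0, lr=0.01, inc=1.05, dec=0.5, epochs=200):
    """Generic adaptive learning-rate loop ('bold driver' style): grow the step
    size while the error keeps falling, shrink it and retry when the error rises.
    Illustrative only; not the authors' adaptive training algorithm."""
    w = w0.copy()
    for _ in range(epochs):
        loss, grad = loss_and_grad(w)
        trial = w - lr * grad
        trial_loss, _ = loss_and_grad(trial)
        if trial_loss <= loss:
            w, lr = trial, lr * inc      # accept the step, be more aggressive
        else:
            lr *= dec                    # reject the step, be more cautious
    return w

# Usage on a toy quadratic surrogate of a forecasting error surface.
target = np.array([3.0, -2.0])
loss_and_grad = lambda w: (float(np.sum((w - target) ** 2)), 2.0 * (w - target))
print(adaptive_gradient_descent(loss_and_grad, np.zeros(2)))   # approaches [3, -2]
```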

270 citations


Proceedings Article
30 Nov 1992
TL;DR: A neural network learning method is introduced that generalizes rationally from many fewer data points by relying on prior knowledge encoded in previously learned neural networks, which is used to bias generalization when learning the target function.
Abstract: How can artificial neural nets generalize better from fewer examples? In order to generalize successfully, neural network learning methods typically require large training data sets. We introduce a neural network learning method that generalizes rationally from many fewer data points, relying instead on prior knowledge encoded in previously learned neural networks. For example, in robot control learning tasks reported here, previously learned networks that model the effects of robot actions are used to guide subsequent learning of robot control functions. For each observed training example of the target function (e.g. the robot control policy), the learner explains the observed example in terms of its prior knowledge, then analyzes this explanation to infer additional information about the shape, or slope, of the target function. This shape knowledge is used to bias generalization when learning the target function. Results are presented applying this approach to a simulated robot task based on reinforcement learning.
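A minimal way to see how slope hints derived from prior knowledge can bias generalization is to fit both the observed values and the slopes suggested by a prior model, as in the sketch below. The basis-function learner, the fake "prior" model, and the weighting mu are illustrative assumptions; this is not the authors' explanation-based procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# Target function (unknown to the learner) and a handful of training points.
target = lambda x: np.sin(3.0 * x)
X = np.array([0.1, 0.5, 0.9, 1.3])
Y = target(X)

# "Prior knowledge": an earlier model good enough to provide slope estimates of
# the target at the training points (here a slightly perturbed target stands in
# for a previously learned network).
prior = lambda x: np.sin(3.0 * x + 0.05)
eps = 1e-3
S = (prior(X + eps) - prior(X - eps)) / (2 * eps)   # slope hints

# Learner: linear combination of Gaussian basis functions, f(x) = phi(x) @ w.
centers = np.linspace(0.0, 1.4, 15)
width = 0.15
phi  = lambda x: np.exp(-((x[:, None] - centers) ** 2) / (2 * width ** 2))
dphi = lambda x: phi(x) * (-(x[:, None] - centers) / width ** 2)

# Combined fit: match the values AND the slopes suggested by prior knowledge.
mu = 1.0   # weight on the slope term (assumed, for illustration)
A = np.vstack([phi(X), np.sqrt(mu) * dphi(X)])
b = np.concatenate([Y, np.sqrt(mu) * S])
w, *_ = np.linalg.lstsq(A, b, rcond=None)

x_test = np.linspace(0.0, 1.4, 5)
print(phi(x_test) @ w)      # generalization guided by value + slope information
print(target(x_test))
```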

128 citations


Journal ArticleDOI
TL;DR: In this article, a neural network model for dynamical systems is proposed; such models are often characterized by the fact that they use a fairly large number of parameters, which can make them unsuitable for dynamic systems.

123 citations


Book ChapterDOI
01 Jul 1992
TL;DR: A generalized criterion for the training of feedforward neural networks is proposed, which leads to a variety of fast learning algorithms for single-layered as well as multilayered neural networks.
Abstract: A generalized criterion for the training of feedforward neural networks is proposed. Depending on the optimization strategy used, this criterion leads to a variety of fast learning algorithms for single-layered as well as multilayered neural networks. The simplest algorithm devised on the basis of this generalized criterion is the fast delta rule algorithm, proposed for the training of single-layered neural networks. The application of a similar optimization strategy to multilayered neural networks in conjunction with the proposed generalized criterion provides the fast backpropagation algorithm. Another set of fast algorithms with better convergence properties is derived on the basis of the same strategy that provided recently a family of Efficient LEarning Algorithms for Neural NEtworks (ELEANNE). Several experiments verify that the fast algorithms developed perform the training of neural networks faster than the corresponding learning algorithms existing in the literature.

116 citations


Journal ArticleDOI
TL;DR: Simulation results on the 4-b parity checker and multiplexer networks indicate significant reduction in the total number of iterations when compared with those of the conventional and accelerated back-propagation algorithms.
Abstract: A new approach for the learning process of multilayer perceptron neural networks using the recursive least squares (RLS) type algorithm is proposed. This method minimizes the global sum of the square of the errors between the actual and the desired output values iteratively. The weights in the network are updated upon the arrival of a new training sample and by solving a system of normal equations recursively. To determine the desired target in the hidden layers an analog of the back-propagation strategy used in the conventional learning algorithms is developed. This permits the application of the learning procedure to all the layers. Simulation results on the 4-b parity checker and multiplexer networks were obtained which indicate significant reduction in the total number of iterations when compared with those of the conventional and accelerated back-propagation algorithms.
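The core of the approach is the recursive least squares recursion itself. The sketch below shows it for a single linear unit updated sample by sample; the paper applies an analogous update layer by layer, with back-propagated targets playing the role of the desired outputs for hidden units. The initialization constant and toy data are assumptions.

```python
import numpy as np

class RLSNeuron:
    """Recursive least squares update for one linear unit: w is refreshed as each
    new (x, d) training pair arrives. Only the core recursion is sketched here."""
    def __init__(self, n_inputs, delta=100.0):
        self.w = np.zeros(n_inputs)
        self.P = delta * np.eye(n_inputs)   # inverse correlation matrix estimate

    def update(self, x, d):
        Px = self.P @ x
        k = Px / (1.0 + x @ Px)             # gain vector
        e = d - self.w @ x                  # a priori error
        self.w += k * e
        self.P -= np.outer(k, Px)           # rank-one update of P
        return e

# Usage: recover a linear mapping from streaming samples.
rng = np.random.default_rng(2)
true_w = np.array([1.5, -0.7, 2.0])
neuron = RLSNeuron(3)
for _ in range(200):
    x = rng.normal(size=3)
    d = true_w @ x + 0.01 * rng.normal()
    neuron.update(x, d)
print(neuron.w)   # close to true_w
```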

108 citations


Book
01 May 1992
TL;DR: The application of neural networks to robotics; neural networks in vision; image labelling with a neural network; object recognition with optimum neural networks; handwritten digit recognition with a backpropagation network; higher order networks for invariant pattern recognition; the bionic retina and beyond.
Abstract: Neural network basics; using adaptive networks for resource allocation in changing environments; medical risk assessment for insurance underwriting; modelling chemical process systems via neural computation; the application of neural networks to robotics; neural networks in vision; image labelling with a neural network; object recognition with optimum neural networks; handwritten digit recognition with a backpropagation network; higher order networks for invariant pattern recognition; the bionic retina and beyond.

98 citations


Book
01 Jun 1992
TL;DR: Weightless neural tools - toward cognitive macrostructures, L. Aleksander; toward hierarchical matched filtering, R. Hecht-Nielsen; some variations on training of recurrent networks, G.M. Kuhn and N.P. Herzberg.
Abstract: Weightless neural tools - toward cognitive macrostructures, L. Aleksander; an estimation theoretic basis for the design of sorting and classification network, R.W. Brockett; a self organizing ARTMAP neural architecture for supervised learning and pattern recognition, G.A. Carpenter et al; hybrid neural network architectures - equilibrium systems that pay attention, L.N. Cooper; neural networks for internal representation of movements in primates and robots, R. Eckmiller et al; recognition and segmentation of characters in handwriting with selective attention, K. Fukushima et al; adaptive acquisition of language, A.L. Gorin et al; what connectionist models learn - learning and representation in connectionist networks, S.J. Hanson and D.J. Burr; early vision, focal attention and neural nets, B. Julesz; toward hierarchical matched filtering, R. Hecht-Nielsen; some variations on training of recurrent networks, G.M. Kuhn and N.P. Herzberg; generalized perception networks with nonlinear discriminant functions, S.Y. Kung et al; neural tree networks, A. Sankar and R. Mammone; capabilities and training of feedforward nets, E.D. Sontag; a fast learning algorithm for multilayer neural network based on projection methods, S.J. Yeh and H. Stark.

Proceedings ArticleDOI
07 Jun 1992
TL;DR: Cresceptron uses a hierarchical framework to grow neural networks automatically, adaptively, and incrementally through learning by creating new neurons and synapses which memorize the new concepts and their context.
Abstract: Cresceptron uses a hierarchical framework to grow neural networks automatically, adaptively, and incrementally through learning. At every level of the hierarchy, new concepts are detected automatically and the network grows by creating new neurons and synapses which memorize the new concepts and their context. The training samples are generalized to other perceptually equivalent items through hierarchical tolerance of deviation. The neural network recognizes the learned items and their variations by hierarchically associating the learned knowledge with the input. It segments the recognized items from the input through back training along the response paths.

Journal ArticleDOI
TL;DR: The analogy of the neural network procedure to a qualitatively similar non-parametric identification approach, which was previously developed by the authors for handling arbitrary non-linear systems, is discussed and the utility of the neural network approach is demonstrated by application to several illustrative problems.
Abstract: Explores the potential of using parallel distributed processing (neural network) approaches to identify the internal forces of structure-unknown non-linear dynamic systems typically encountered in the field of applied mechanics. The relevant characteristics of neural networks, such as the processing elements, network topology, and learning algorithms, are discussed in the context of system identification. The analogy of the neural network procedure to a qualitatively similar non-parametric identification approach, which was previously developed by the authors for handling arbitrary non-linear systems, is discussed. The utility of the neural network approach is demonstrated by application to several illustrative problems.

Proceedings ArticleDOI
30 Aug 1992
TL;DR: Experimental results show that using both directional PDFs and the completely connected feedforward neural network classifier is valuable for building the first stage of a complete AHSVS.
Abstract: The first stage of a complete automatic handwritten signature verification system (AHSVS) is described in this paper. Since only random forgeries are taken into account in this first stage of decision, the directional probability density function (PDF), which is related to the overall shape of the handwritten signature, has been taken into account as the feature vector. Experimental results show that using both directional PDFs and the completely connected feedforward neural network classifier is valuable for building the first stage of a complete AHSVS.
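One plausible reading of the directional PDF feature is a normalized histogram of local gradient orientations over the signature strokes, sketched below; the bin count, threshold, and stand-in image are assumptions, not details from the paper.

```python
import numpy as np

def directional_pdf(image, n_bins=16, mag_threshold=0.1):
    """Directional probability density function of an image: a histogram of local
    gradient orientations over the strokes, normalized to sum to one. A generic
    reading of 'directional PDF'; bin count and threshold are assumed."""
    gy, gx = np.gradient(image.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx)[mag > mag_threshold]          # orientations on strokes
    hist, _ = np.histogram(ang, bins=n_bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)

# Usage: the resulting feature vector would feed a fully connected feedforward classifier.
rng = np.random.default_rng(8)
fake_signature = (rng.random((64, 256)) > 0.97).astype(float)   # stand-in binary image
print(directional_pdf(fake_signature))
```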

Proceedings ArticleDOI
18 Oct 1992
TL;DR: The authors propose a novel feature extraction method for neural networks based on the decision boundary feature extraction algorithm; the method preserves the ability of neural networks to define an arbitrary decision boundary.
Abstract: The authors propose a novel feature extraction method for neural networks. The method is based on the decision boundary feature extraction algorithm. It has been shown that all the necessary features for classification can be extracted from the decision boundary. To apply the method, the authors first define the decision boundary in neural networks. Next, they propose a procedure for extracting all the necessary features for classification from the decision boundary. The proposed algorithm preserves the characteristics of neural networks, which can define an arbitrary decision boundary. Experiments show promising results.
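A simplified reading of decision boundary feature extraction is sketched below: locate boundary points by bisecting between differently classified samples, estimate the boundary normal from the gradient of the discriminant, accumulate the normals' outer products, and keep the eigenvectors with large eigenvalues as features. The toy discriminant and sampling scheme are assumptions, not the authors' experimental setup.

```python
import numpy as np

rng = np.random.default_rng(3)

# Stand-in for a trained network's discriminant: the class boundary depends only
# on the first two of three input dimensions (assumed toy setup).
g = lambda x: np.tanh(2.0 * x[0] + 1.0 * x[1] - 0.3)

def boundary_point(a, b, tol=1e-6):
    """Bisect the segment between a (g > 0) and b (g <= 0) until g is ~0."""
    while np.linalg.norm(a - b) > tol:
        m = 0.5 * (a + b)
        if g(m) > 0: a = m
        else:        b = m
    return 0.5 * (a + b)

def normal_at(p, eps=1e-4):
    """Numerical gradient of the discriminant = normal to the decision boundary."""
    grad = np.array([(g(p + eps * e) - g(p - eps * e)) / (2 * eps)
                     for e in np.eye(len(p))])
    return grad / np.linalg.norm(grad)

# Accumulate a decision-boundary scatter matrix from many boundary normals.
X = rng.uniform(-1, 1, size=(400, 3))
pos = [x for x in X if g(x) > 0]
neg = [x for x in X if g(x) <= 0]
M = np.zeros((3, 3))
for a, b in zip(pos, neg):
    n = normal_at(boundary_point(np.array(a), np.array(b)))
    M += np.outer(n, n)
eigvals, eigvecs = np.linalg.eigh(M / min(len(pos), len(neg)))
print(eigvals)   # large eigenvalues mark the directions needed for classification
```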

01 Jan 1992
TL;DR: An approach to combining symbolic and connectionist approaches to machine learning is described; a three-stage framework is presented, and the research of several groups is reviewed with respect to this framework.
Abstract: This article describes an approach to combining symbolic and connectionist approaches to machine learning. A three-stage framework is presented and the research of several groups is reviewed with respect to this framework. The first stage involves the insertion of symbolic knowledge into neural networks, the second addresses the refinement of this prior knowledge in its neural representation, while the third concerns the extraction of the refined symbolic knowledge. Experimental results and open research issues are discussed.

Journal ArticleDOI
TL;DR: A neural network-based machine fault diagnosis model is developed using the back propagation (BP) learning paradigm and network training efficiency is studied by varying the learning rate and learning momentum of the activation function.
Abstract: This paper presents a neural network approach for machine fault diagnosis. Specifically, two tasks are explained and discussed: (1) a neural network-based machine fault diagnosis model is developed using the back propagation (BP) learning paradigm; (2) network training efficiency is studied by varying the learning rate and learning momentum of the activation function. The results are presented and discussed.

Journal ArticleDOI
TL;DR: A number of visualization techniques for understanding the learning and decision-making processes of neural networks are surveyed, and the techniques the authors use to understand knowledge-based neural networks are described.
Abstract: Scientific visualization is the process of using graphical images to form succinct and lucid representations of numerical data. Visualization has proven to be a useful method for understanding both learning and computation in artificial neural networks. While providing a powerful and general technique for inductive learning, artificial neural networks are difficult to comprehend because they form representations that are encoded by a large number of real-valued parameters. By viewing these parameters pictorially, a better understanding can be gained of how a network maps inputs into outputs. In this article, we survey a number of visualization techniques for understanding the learning and decision-making processes of neural networks. We also describe our work in knowledge-based neural networks and the visualization techniques we have used to understand these networks. In a knowledge-based neural network, the topology and initial weight values of the network are determined by an approximately-correct set of inference rules. Knowledge-based networks are easier to interpret than conventional networks because of the synergy between visualization methods and the relation of the networks to symbolic rules.
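One of the classic techniques such surveys cover is the Hinton diagram, which displays each weight as a square whose area encodes magnitude and whose color encodes sign. The matplotlib sketch below is a generic illustration of that idea, not code from the article.

```python
import numpy as np
import matplotlib.pyplot as plt

def hinton_diagram(weights, ax=None):
    """Draw a Hinton-style diagram: one square per weight, area proportional to
    magnitude, white for positive weights and black for negative ones."""
    if ax is None:
        ax = plt.gca()
    ax.set_facecolor("gray")
    max_w = np.abs(weights).max()
    for (i, j), w in np.ndenumerate(weights):
        size = np.sqrt(abs(w) / max_w)
        color = "white" if w > 0 else "black"
        ax.add_patch(plt.Rectangle((j - size / 2, i - size / 2), size, size,
                                   facecolor=color, edgecolor=color))
    ax.set_xlim(-1, weights.shape[1]); ax.set_ylim(-1, weights.shape[0])
    ax.set_aspect("equal"); ax.invert_yaxis()
    ax.set_xticks([]); ax.set_yticks([])

# Usage with an arbitrary weight matrix (e.g. hidden-to-output weights).
rng = np.random.default_rng(4)
hinton_diagram(rng.normal(size=(6, 10)))
plt.show()
```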


Book
25 Sep 1992
TL;DR: This important new work recognizes the advanced nature of today's artificial neural networks, uniquely emphasizing a modular approach to neural network learning, and covers the full range of conceivable approaches to the modularization of learning.
Abstract: From the Publisher: This important new work recognizes the advanced nature of today's artificial neural networks, uniquely emphasizing a modular approach to neural network learning. By breaking down the learning task into relatively independent parts of lower complexity, Modular Learning in Neural Networks demonstrates how neural network learning can be made more powerful and efficient. The book's modular approach, unlike the monolithic viewpoint, admits intermediary solution stages whose success can be independently verified, as in other engineering fields. Each stage can be evaluated before moving on to the subsequent one, and the reason for possible failures can be analyzed, ultimately leading to the improved development and engineering of applications. The modular approach also takes into account growing network complexity, reducing the difficulty of such inevitable problems as scaling and convergence. Modular Learning in Neural Networks' modular approach is also fully in step with important psychological and neurobiological research. Studies in developmental psychology demonstrate the incremental nature of human learning, in which the success of each stage is conditioned by the successful accomplishment of the previous stage, while neurobiology has depicted the human brain as a complex structure of cooperating modules. Modular Learning in Neural Networks covers the full range of conceivable approaches to the modularization of learning, including decomposition of learning into modules using supervised and unsupervised learning types; decomposition of the function to be mapped into linear and nonlinear parts; decomposition of the neural network to minimize harmful interferences between a large number of network parameters during learning; decomposition of the application task into subtasks that are learned separately; decomposition into a knowledge-based part and a learning part. The book attempts to show that modular learning based on these approaches is helpful in improving t

Proceedings ArticleDOI
07 Jun 1992
TL;DR: It is shown that a given trajectory sequence with the corresponding time steps can be represented by a discrete-time connected recurrent neural net and the result is generalized to an approximation of a differentiable trajectory on a compact time interval.
Abstract: It is shown that a given trajectory sequence with the corresponding time steps can be represented by a discrete-time connected recurrent neural net. The result is generalized to an approximation of a differentiable trajectory on a compact time interval. It is shown that fully recurrent neural nets of sigmoid type units can approximate a large class of continuous real functions of time. This implies that fully recurrent neural networks can be universal approximators of trajectories. This fundamental principle of constructing a set of linearly independent vectors can be used to obtain the weights which serve for constructing such networks either directly or by providing a good initial guess for iterative learning algorithms. The estimation of network size is given.

Book ChapterDOI
24 Aug 1992
TL;DR: Some of the central results in the complexity theory of neural networks, with pointers to the literature, are surveyed.
Abstract: We survey some of the central results in the complexity theory of neural networks, with pointers to the literature.

Proceedings Article
30 Nov 1992
TL;DR: The present paper clarifies the asymptotic properties of, and the relation between, two learning curves, one concerning the predictive or generalization loss and the other the training loss; the result gives a natural definition of the complexity of a neural network.
Abstract: Learning curves show how a neural network improves as the number of training examples increases and how this improvement is related to the network complexity. The present paper clarifies the asymptotic properties of, and the relation between, two learning curves, one concerning the predictive loss or generalization loss and the other the training loss. The result gives a natural definition of the complexity of a neural network. Moreover, it provides a new criterion for model selection.
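The flavor of such a result can be written in its simplest textbook form, for a faithful model with m parameters trained on n examples and best achievable loss E_0. The paper itself derives a refined effective complexity in place of the raw parameter count, so the expressions below are background on this family of expansions rather than a quotation of its theorem.

```latex
% Asymptotic shape of the two learning curves and the resulting criterion
% (simplest faithful-model form; E_0 = best achievable loss, m = number of
% parameters, n = number of training examples).
\begin{aligned}
\langle E_{\mathrm{gen}}(n)\rangle   &\approx E_0 + \frac{m}{2n}, \qquad
\langle E_{\mathrm{train}}(n)\rangle \approx E_0 - \frac{m}{2n}, \\[4pt]
\langle E_{\mathrm{gen}}\rangle - \langle E_{\mathrm{train}}\rangle &\approx \frac{m}{n}
\quad\Longrightarrow\quad
\widehat{E}_{\mathrm{gen}} = E_{\mathrm{train}} + \frac{m}{n}
\ \text{(an AIC-like model-selection criterion).}
\end{aligned}
```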

Journal ArticleDOI
TL;DR: The asymptotic properties of the proposed recurrent neural networks for linear programming are analyzed theoretically, and the design principles for synthesizing the recurrent networks are discussed based on the results of analysis.

Journal ArticleDOI
TL;DR: A special-purpose chip, optimized for computational needs of neural networks and performing over 2000 multiplications and additions simultaneously, is described, which is particularly suitable for the convolutional architectures typical in pattern classification networks.
Abstract: A special-purpose chip, optimized for computational needs of neural networks and performing over 2000 multiplications and additions simultaneously, is described. Its data path is particularly suitable for the convolutional architectures typical in pattern classification networks but can also be configured for fully connected or feedback topologies. A development system permits rapid prototyping of new applications and analysis of the impact of the specialized hardware on system performance. The power and flexibility of the processor are demonstrated with a neural network for handwritten character recognition containing over 133,000 connections.

Book ChapterDOI
01 Jan 1992
TL;DR: A modified error measure is proposed which can reduce the tendency to over-fit and whose properties can be controlled by a single scalar parameter.
Abstract: A central problem in the theory of feed-forward neural networks is the determination of the number of hidden neurons. With too few neurons the network may be unable to achieve the desired accuracy, while too many neurons leads to over-fitting. In this paper we propose a modification to the standard mean square error function which will reduce the tendency to over-fit, and which can be controlled by a single scalar parameter. A new learning algorithm, which can be used to minimise this error function, is also described.
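The abstract does not give the exact form of the modified error, so the sketch below uses the most familiar single-parameter modification, weight decay, as a stand-in to show how one scalar trades data fit against over-fitting on a small noisy sample. It illustrates the idea only; it is not the paper's measure or learning algorithm.

```python
import numpy as np

rng = np.random.default_rng(5)

# Few noisy samples of a smooth function, deliberately inviting over-fitting.
X = np.linspace(0, 1, 12)
Y = np.sin(2 * np.pi * X) + 0.15 * rng.normal(size=X.size)
Phi = np.vander(X, 10, increasing=True)          # 9th-degree polynomial features

def fit(lam):
    """Minimize a modified error: mean squared error + lam * ||w||^2 (weight decay,
    a stand-in for the paper's single-parameter measure)."""
    n, d = Phi.shape
    return np.linalg.solve(Phi.T @ Phi / n + lam * np.eye(d), Phi.T @ Y / n)

X_test = np.linspace(0, 1, 200)
Phi_test = np.vander(X_test, 10, increasing=True)
for lam in (0.0, 1e-6, 1e-3):
    w = fit(lam)
    test_err = np.mean((Phi_test @ w - np.sin(2 * np.pi * X_test)) ** 2)
    print(f"lam={lam:g}  test error={test_err:.4f}")   # moderate lam reduces over-fit
```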

Proceedings ArticleDOI
07 Jun 1992
TL;DR: A novel neural network model, the projection neural network, is developed to overcome three key drawbacks of backpropagation-trained neural networks (BPNN), i.e., long training times, the large number of nodes required to form closed regions for classification of high-dimensional problems and the lack of modularity.
Abstract: A novel neural network model, the projection neural network, is developed to overcome three key drawbacks of backpropagation-trained neural networks (BPNN), i.e., long training times, the large number of nodes required to form closed regions for classification of high-dimensional problems and the lack of modularity. This network combines advantages of hypersphere classifiers, such as the restricted Coulomb energy (RCE) network, radial basis function methods, and BPNN. It provides the ability to initialize nodes to serve either as hyperplane separators or as spherical prototypes (radial basis functions), followed by a modified gradient descent error minimization training of the network weights and thresholds, which adjusts the prototype positions and sizes and may convert closed prototype decision boundaries to open boundaries, and vice versa. The network can provide orders of magnitude decrease in the required training time over BPNN and a reduction in the number of required nodes. Theory and examples are given.


Proceedings ArticleDOI
06 Jun 1992
TL;DR: Genetic algorithms are used as a means of achieving neural network inversion, and the input patterns obtained can be used in training partially-trained networks, as well as in the building of neural network system explanation facilities.
Abstract: Genetic algorithms are used as a means of achieving neural network inversion. Neural network inversion allows a user to find one or more neural network input patterns which yield a specific output. The input patterns obtained from the genetic algorithm can be used in training partially-trained networks, as well as in the building of neural network system explanation facilities.
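A minimal version of the idea is sketched below: treat the trained network as a fixed black box and let a genetic algorithm search input space for patterns whose output is close to a desired value. The network stand-in, population size, and genetic operators (tournament selection, blend crossover, Gaussian mutation) are generic assumptions rather than the authors' choices.

```python
import numpy as np

rng = np.random.default_rng(6)

# A fixed, already-trained "network" to invert (a small hand-wired function
# standing in for a trained net): maps a 2-D input to a scalar output in (0, 1).
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
net = lambda x: sigmoid(np.sin(3 * x[..., 0]) + x[..., 1] ** 2 - 0.5)

def invert_by_ga(target, pop_size=60, generations=150, sigma=0.2):
    """Search input space with a simple genetic algorithm for patterns whose
    network output is close to `target`."""
    pop = rng.uniform(-2, 2, size=(pop_size, 2))
    for _ in range(generations):
        fitness = -np.abs(net(pop) - target)             # closer output = fitter
        # Tournament selection of parents.
        i, j = rng.integers(pop_size, size=(2, pop_size))
        parents = np.where((fitness[i] > fitness[j])[:, None], pop[i], pop[j])
        # Blend crossover between consecutive parents, then Gaussian mutation.
        alpha = rng.uniform(size=(pop_size, 1))
        children = alpha * parents + (1 - alpha) * np.roll(parents, 1, axis=0)
        pop = children + sigma * rng.normal(size=children.shape)
    return pop[np.argmin(np.abs(net(pop) - target))]

x_star = invert_by_ga(target=0.9)
print(x_star, net(x_star))   # an input pattern whose output is near 0.9
```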

Proceedings ArticleDOI
01 Jan 1992
TL;DR: Simulations show that training recurrent networks with different amounts of partial knowledge to recognize simple grammars improves the training times by orders of magnitude, even when only a small fraction of all transitions are inserted as rules.
Abstract: The authors present a method that incorporates a priori knowledge in the training of recurrent neural networks. This a priori knowledge can be interpreted as hints about the problem to be learned and these hints are encoded as rules which are then inserted into the neural network. The authors demonstrate the approach by training recurrent neural networks with inserted rules to learn to recognize regular languages from grammatical string examples. Because the recurrent networks have second-order connections, rule-insertion is a straightforward mapping of rules into weights and neurons. Simulations show that training recurrent networks with different amounts of partial knowledge to recognize simple grammars improves the training times by orders of magnitude, even when only a small fraction of all transitions are inserted as rules. In addition, there appears to be no loss in generalization performance.
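A simplified version of the rule-insertion mapping is sketched below for a second-order recurrent network with one-hot state and symbol encodings: each known transition delta(q_j, a_k) = q_i is programmed by strengthening the weight into the target state neuron and weakening the weight into the source state neuron, with a common bias keeping unprogrammed neurons off. The toy automaton (even parity of 1s) and the programming strength H are assumptions chosen for illustration, not the authors' exact construction.

```python
import numpy as np

sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

# Toy DFA over {0, 1} accepting strings with an even number of 1s (assumed example).
# transitions[state][symbol] -> next state; state 0 is the accepting start state.
transitions = {0: {0: 0, 1: 1},
               1: {0: 1, 1: 0}}
n_states, n_symbols, H = 2, 2, 6.0       # H = programming strength (assumed)

# Second-order recurrent net: S_i(t+1) = sigmoid(sum_jk W[i,j,k] S_j(t) I_k(t) - H/2).
# Insert each known transition delta(q_j, a_k) = q_i as a pair of large weights:
# strengthen the target state neuron, weaken the source state neuron.
W = np.zeros((n_states, n_states, n_symbols))
for j, row in transitions.items():
    for k, i in row.items():
        W[i, j, k] += H
        if i != j:
            W[j, j, k] -= H

def run(string):
    S = np.zeros(n_states); S[0] = 1.0                 # one-hot start state
    for ch in string:
        I = np.zeros(n_symbols); I[int(ch)] = 1.0      # one-hot input symbol
        # Bias of -H/2 keeps neurons with no programmed support switched off.
        S = sigmoid(np.einsum("ijk,j,k->i", W, S, I) - H / 2.0)
    return S[0] > 0.5                                  # accept if state-0 neuron is on

for s in ["", "11", "101", "1101"]:
    print(repr(s), run(s))    # accepts the even-parity strings, rejects "1101"
```

Iterative training would then start from these programmed weights instead of random ones, which is what the abstract reports as cutting training time by orders of magnitude.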