Journal Article

Weight perturbation: an optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks

01 Jan 1992-IEEE Transactions on Neural Networks (IEEE Trans Neural Netw)-Vol. 3, Iss: 1, pp 154-157
TL;DR: It is shown that using gradient descent with direct approximation of the gradient instead of back-propagation is more economical for parallel analog implementations and is suitable for multilayer recurrent networks as well.
Abstract: Previous work on analog VLSI implementation of multilayer perceptrons with on-chip learning has mainly targeted the implementation of algorithms such as back-propagation. Although back-propagation is efficient, its implementation in analog VLSI requires excessive computational hardware. It is shown that using gradient descent with direct approximation of the gradient instead of back-propagation is more economical for parallel analog implementations. It is shown that this technique (which is called 'weight perturbation') is suitable for multilayer recurrent networks as well. A discrete-level analog implementation showing the training of an XOR network as an example is presented.
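To make the idea in the abstract above concrete, here is a minimal software sketch of weight perturbation on the XOR task: each weight is nudged in turn, the resulting change in error is measured, and that difference quotient stands in for the back-propagated gradient. The 2-2-1 network shape, the activation functions, the batch error, and the values of `eta`, `pert`, `steps`, and `seed` are all illustrative assumptions, not the paper's analog circuit or its settings.

```python
import math
import random

# XOR training set: (inputs, target)
XOR = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]


def forward(w, x):
    """Assumed 2-2-1 network; w is a flat list of 9 weights (biases included)."""
    h0 = math.tanh(w[0] * x[0] + w[1] * x[1] + w[2])
    h1 = math.tanh(w[3] * x[0] + w[4] * x[1] + w[5])
    return 1.0 / (1.0 + math.exp(-(w[6] * h0 + w[7] * h1 + w[8])))


def total_error(w):
    """Sum-of-squares error over the four XOR patterns."""
    return 0.5 * sum((forward(w, x) - t) ** 2 for x, t in XOR)


def weight_perturbation_step(w, eta=0.2, pert=1e-3):
    """One gradient-descent step using only forward passes.

    For each weight, the gradient is approximated by
    (E(w_i + pert) - E(w_i)) / pert, so no backward pass is needed.
    """
    base = total_error(w)
    grads = []
    for i in range(len(w)):
        old = w[i]
        w[i] = old + pert                      # perturb a single weight
        grads.append((total_error(w) - base) / pert)
        w[i] = old                             # restore it exactly
    return [wi - eta * g for wi, g in zip(w, grads)]


def train(steps=2000, seed=0):
    """Return (initial error, final error, final weights)."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(9)]
    initial = total_error(w)
    for _ in range(steps):
        w = weight_perturbation_step(w)
    return initial, total_error(w), w
```

Calling `train()` returns the error before and after training. Each step costs one extra forward pass per weight, which is the trade the abstract describes: more forward evaluations in exchange for not needing dedicated backward-pass hardware.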
Citations
Book
01 Jan 1995
TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.
Abstract: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations


Cites methods from "Weight perturbation: an optimal arc..."

  • ...This technique is called node perturbation (Jabri and Flower, 1991), and is closely related to the Madaline III learning rule (Widrow and Lehr, 1990)....


Journal Article
27 Nov 2019-Nature
TL;DR: An overview of the developments in neuromorphic computing for both algorithms and hardware is provided and the fundamentals of learning and hardware frameworks are highlighted, with emphasis on algorithm–hardware codesign.
Abstract: Guided by brain-like ‘spiking’ computational frameworks, neuromorphic computing—brain-inspired computing for machine intelligence—promises to realize artificial intelligence while reducing the energy requirements of computing platforms. This interdisciplinary field began with the implementation of silicon circuits for biological neural routines, but has evolved to encompass the hardware implementation of algorithms with spike-based encoding and event-driven representations. Here we provide an overview of the developments in neuromorphic computing for both algorithms and hardware and highlight the fundamentals of learning and hardware frameworks. We discuss the main challenges and the future prospects of neuromorphic computing, with emphasis on algorithm–hardware codesign. The authors review the advantages and future prospects of neuromorphic computing, a multidisciplinary engineering concept for energy-efficient artificial intelligence with brain-inspired functionality.

877 citations

Journal Article
TL;DR: This work derives a technique that directly calculates Hv, where v is an arbitrary vector, and shows that this technique can be used at the heart of many iterative techniques for computing various properties of H, obviating any need to calculate the full Hessian.
Abstract: Just storing the Hessian H (the matrix of second derivatives $\partial^2 E / \partial w_i \partial w_j$ of the error E with respect to each pair of weights) of a large neural network is difficult. Since a common use of a large matrix like H is to compute its product with various vectors, we derive a technique that directly calculates Hv, where v is an arbitrary vector. To calculate Hv, we first define a differential operator $R_v\{f(w)\} = (\partial/\partial r) f(w + rv)|_{r=0}$, note that $R_v\{\nabla_w\} = Hv$ and $R_v\{w\} = v$, and then apply $R_v\{\cdot\}$ to the equations used to compute $\nabla_w$. The result is an exact and numerically stable procedure for computing Hv, which takes about as much computation, and is about as local, as a gradient evaluation. We then apply the technique to a one pass gradient calculation algorithm (backpropagation), a relaxation gradient calculation algorithm (recurrent backpropagation), and two stochastic gradient calculation algorithms (Boltzmann machines and weight perturbation). Finally, we show that this technique can be used at the heart of many iterative techniques for computing various properties of H, obviating any need to calculate the full Hessian.

700 citations
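The operator in the Hessian-vector-product abstract above, the derivative of f(w + rv) with respect to r at r = 0, is a directional derivative, so applying it to the gradient equations amounts to pushing a tangent v through them. The sketch below illustrates this with a tiny dual-number class on a toy error E(w) = w0^2 * w1 + w1^3. The `Dual` class, the toy error, and all function names are hypothetical and chosen for illustration; this is not the cited paper's code or its network equations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """A value together with its directional derivative (tangent) along v."""
    val: float
    tan: float = 0.0

    def __add__(self, other):
        other = _lift(other)
        return Dual(self.val + other.val, self.tan + other.tan)

    __radd__ = __add__

    def __mul__(self, other):
        other = _lift(other)
        # Product rule carries the tangent through a multiplication.
        return Dual(self.val * other.val,
                    self.val * other.tan + self.tan * other.val)

    __rmul__ = __mul__


def _lift(x):
    """Treat plain numbers as constants with zero tangent."""
    return x if isinstance(x, Dual) else Dual(float(x))


def grad_E(w0, w1):
    """Hand-written gradient of the toy error E(w) = w0^2 * w1 + w1^3."""
    return (2 * w0 * w1, w0 * w0 + 3 * w1 * w1)


def hessian_vector_product(w, v):
    """Apply the directional-derivative operator to the gradient equations.

    Seeding each weight with tangent v_i corresponds to R_v{w} = v; the
    tangents of the gradient outputs are then the entries of Hv, and the
    full Hessian is never formed.
    """
    duals = [Dual(wi, vi) for wi, vi in zip(w, v)]
    return tuple(g.tan for g in grad_E(*duals))
```

For this toy error the Hessian is [[2*w1, 2*w0], [2*w0, 6*w1]], so at w = (1, 2) the call `hessian_vector_product((1.0, 2.0), (1.0, 0.0))` returns `(4.0, 2.0)`, the first column of H. The cost is one gradient-sized pass, matching the abstract's remark that Hv takes about as much computation as a gradient evaluation.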

Journal Article
TL;DR: This article presents a comprehensive overview of the hardware realizations of artificial neural network models, known as hardware neural networks (HNN), appearing in academic studies as prototypes as well as in commercial use.

638 citations


Cites background from "Weight perturbation: an optimal arc..."

  • ...This can be important in certain high-volume applications, such as ubiquitous consumer-products for real-time image processing, that are very price-sensitive....


Posted Content
TL;DR: An exhaustive review of the research conducted in neuromorphic computing since the inception of the term is provided to motivate further work by illuminating gaps in the field where new research is needed.
Abstract: Neuromorphic computing has come to refer to a variety of brain-inspired computers, devices, and models that contrast the pervasive von Neumann computer architecture. This biologically inspired approach has created highly connected synthetic neurons and synapses that can be used to model neuroscience theories as well as solve challenging machine learning problems. The promise of the technology is to create a brain-like ability to learn and adapt, but the technical challenges are significant, starting with an accurate neuroscience model of how the brain works, to finding materials and engineering breakthroughs to build devices to support these models, to creating a programming framework so the systems can learn, to creating applications with brain-like capabilities. In this work, we provide a comprehensive survey of the research and motivations for neuromorphic computing over its history. We begin with a 35-year review of the motivations and drivers of neuromorphic computing, then look at the major research areas of the field, which we define as neuro-inspired models, algorithms and learning approaches, hardware and devices, supporting systems, and finally applications. We conclude with a broad discussion on the major research topics that need to be addressed in the coming years to see the promise of neuromorphic computing fulfilled. The goals of this work are to provide an exhaustive review of the research conducted in neuromorphic computing since the inception of the term, and to motivate further work by illuminating gaps in the field where new research is needed.

570 citations



References
Journal Article
01 Sep 1990
TL;DR: The history, origination, operating characteristics, and basic theory of several supervised neural-network training algorithms (including the perceptron rule, the least-mean-square algorithm, three Madaline rules, and the backpropagation technique) are described.
Abstract: Fundamental developments in feedforward artificial neural networks from the past thirty years are reviewed. The history, origination, operating characteristics, and basic theory of several supervised neural-network training algorithms (including the perceptron rule, the least-mean-square algorithm, three Madaline rules, and the backpropagation technique) are described. The concept underlying these iterative adaptation algorithms is the minimal disturbance principle, which suggests that during training it is advisable to inject new information into a network in a manner that disturbs stored information to the smallest extent possible. The two principal kinds of online rules that have developed for altering the weights of a network are examined for both single-threshold elements and multielement networks. They are error-correction rules, which alter the weights of a network to correct error in the output response to the present input pattern, and gradient rules, which alter the weights of a network during each pattern presentation by gradient descent with the objective of reducing mean-square error (averaged over all training patterns).

2,297 citations

Journal Article
TL;DR: It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.
Abstract: Error backpropagation in feedforward neural network models is a popular learning algorithm that has its roots in nonlinear estimation and optimization. It is being used routinely to calculate error gradients in nonlinear systems with hundreds of thousands of parameters. However, the classical architecture for backpropagation has severe restrictions. The extension of backpropagation to networks with recurrent connections will be reviewed. It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.

174 citations


"Weight perturbation: an optimal arc..." refers background in this paper

  • ...$\Delta w_{ij} = -\eta \, \dfrac{E(w_{ij} + \mathrm{pert}_{ij}) - E(w_{ij})}{\mathrm{pert}_{ij}}$ (3)...


Journal Article
TL;DR: Some test cases are presented, concerning MLPs with hidden layers of different sizes, on pattern recognition problems, to demonstrate the validity and the generalization capability of the method and give some insight into the behavior of the learning algorithm.
Abstract: Multilayer perceptrons (MLPs) with weight values restricted to powers of two or sums of powers of two are introduced. In a digital implementation, these neural networks do not need multipliers but only shift registers when computing in forward mode, thus saving chip area and computation time. A learning procedure, based on backpropagation, is presented for such neural networks. This learning procedure requires full real arithmetic and therefore must be performed offline. Some test cases are presented, concerning MLPs with hidden layers of different sizes, on pattern recognition problems. Such tests demonstrate the validity and the generalization capability of the method and give some insight into the behavior of the learning algorithm.

142 citations
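The claim in the abstract above, that power-of-two weights need only shifts in forward mode, can be illustrated with a short sketch. The rounding rule below (nearest exponent in the log domain) and both function names are assumptions made for illustration; the cited learning procedure is based on backpropagation and is not reproduced here, and sums of powers of two are not covered.

```python
import math


def quantize_to_power_of_two(w):
    """Return (sign, exponent) so that sign * 2**exponent approximates w.

    Assumption: the exponent is the rounded base-2 logarithm of |w|.
    A zero weight is encoded with sign 0.
    """
    if w == 0:
        return 0, 0
    sign = 1 if w > 0 else -1
    return sign, round(math.log2(abs(w)))


def shift_multiply(x, sign, exponent):
    """Multiply the integer activation x by sign * 2**exponent using shifts only."""
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return sign * shifted
```

For example, `quantize_to_power_of_two(0.23)` gives `(1, -2)`, so multiplying an integer activation of 64 by that weight is `shift_multiply(64, 1, -2)`, which returns 16 without using a multiplier. A right shift discards low-order bits, so a fixed-point design would need enough fractional bits to keep that truncation acceptable.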

Journal Article
TL;DR: The standard cell design style is investigated, and two probabilistic models are presented that estimate the wiring space requirements in the routing channels between the cell rows and the number of feedthroughs that must be inserted in the cell rows to interconnect cells placed several rows apart.
Abstract: The standard cell design style is investigated. Two probabilistic models are presented. The first model estimates the wiring space requirements in the routing channels between the cell rows. The second model estimates the number of feedthroughs that must be inserted in the cell rows to interconnect cells placed several rows apart. These models were implemented in the standard cell area estimation program PLEST (PLotting ESTimator). PLEST was used to estimate the areas of a set of 12 standard cell chips. In all cases, the estimates were accurate to within 10% of the actual areas. PLEST's estimation of a chip layout area takes only a few seconds to produce, as compared with more than 10 h to generate the chip layout itself using an industrial layout system.

109 citations

Proceedings Article
01 Jun 1988
TL;DR: A model is presented for the prediction of shape functions for aspect ratios up to 1:5 and can be used for many different design styles and has been tested for standard cell blocks for the placement of general cells.
Abstract: Area estimation of IC layouts has become an important requirement for early design and top-down chip planning tools. In particular, the relation between area and aspect ratio (the shape function) is necessary for chip planning. Statistical models have been published with good results for standard cell blocks with near unity aspect ratios. This paper describes a new model for the prediction of shape functions for aspect ratios up to 1:5. The model is based on the shape and connectivity of adjacent cells. It can be used for many different design styles and has been tested for standard cell blocks and for the placement of general cells.

96 citations