Weight perturbation: an optimal architecture and learning technique for analog VLSI feedforward and recurrent multilayer networks

doi:10.1109/72.105429

Journal Article•DOI•

Marwan A. Jabri¹, Barry Flower¹•Institutions (1)

University of Sydney¹

01 Jan 1992-IEEE Transactions on Neural Networks (IEEE Trans Neural Netw)-Vol. 3, Iss: 1, pp 154-157

TL;DR: It is shown that using gradient descent with direct approximation of the gradient instead of back-propagation is more economical for parallel analog implementations and is suitable for multilayer recurrent networks as well.

Abstract: Previous work on analog VLSI implementation of multilayer perceptrons with on-chip learning has mainly targeted the implementation of algorithms such as back-propagation. Although back-propagation is efficient, its implementation in analog VLSI requires excessive computational hardware. It is shown that using gradient descent with direct approximation of the gradient instead of back-propagation is more economical for parallel analog implementations. It is shown that this technique (which is called 'weight perturbation') is suitable for multilayer recurrent networks as well. A discrete level analog implementation showing the training of an XOR network as an example is presented.
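
The idea in the abstract, gradient descent driven by a direct finite-difference estimate of the gradient rather than back-propagation, can be illustrated in software. The following is a minimal sketch only, not the authors' analog implementation: the 2-2-1 sigmoid network, the flat weight layout, the function names, and the values of `pert`, `lr`, `steps`, and `seed` are all assumptions chosen for illustration.

```python
import numpy as np

def perturbation_gradient(error_fn, w, pert=1e-3):
    """Estimate dE/dw_i as (E(w_i + pert) - E(w_i)) / pert, one weight at a time."""
    base = error_fn(w)
    grad = np.empty_like(w)
    for i in range(w.size):
        w_pert = w.copy()
        w_pert[i] += pert  # perturb a single weight, leave the rest unchanged
        grad[i] = (error_fn(w_pert) - base) / pert
    return grad

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def xor_error(w, X, y):
    """Sum-squared error of a 2-2-1 sigmoid network (hypothetical weight layout)."""
    W1, b1 = w[:4].reshape(2, 2), w[4:6]
    W2, b2 = w[6:8], w[8]
    hidden = sigmoid(X @ W1 + b1)
    out = sigmoid(hidden @ W2 + b2)
    return 0.5 * np.sum((out - y) ** 2)

def train_xor(steps=500, lr=0.5, pert=1e-3, seed=0):
    """Run plain gradient descent using only forward evaluations of the error."""
    X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
    y = np.array([0.0, 1.0, 1.0, 0.0])
    rng = np.random.default_rng(seed)
    w = rng.uniform(-1.0, 1.0, size=9)  # assumed initialisation range
    err = lambda v: xor_error(v, X, y)
    initial = err(w)
    for _ in range(steps):
        w = w - lr * perturbation_gradient(err, w, pert)
    return initial, err(w)
```

Each update needs only forward evaluations of the error (one per weight plus a baseline), which is the property the abstract argues makes the approach economical for parallel analog hardware. Whether this particular sketch reaches a full XOR solution depends on the assumed initialisation and step size.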

Citations

Book•

Neural networks for pattern recognition

Christopher M. Bishop¹•Institutions (1)

Aston University¹

01 Jan 1995

TL;DR: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition, and is designed as a text, with over 100 exercises, to benefit anyone involved in the fields of neural computation and pattern recognition.

Abstract: From the Publisher: This is the first comprehensive treatment of feed-forward neural networks from the perspective of statistical pattern recognition. After introducing the basic concepts, the book examines techniques for modelling probability density functions and the properties and merits of the multi-layer perceptron and radial basis function network models. Also covered are various forms of error functions, principal algorithms for error function minimalization, learning and generalization in neural networks, and Bayesian techniques and their applications. Designed as a text, with over 100 exercises, this fully up-to-date work will benefit anyone involved in the fields of neural computation and pattern recognition.

19,056 citations

Cites methods from "Weight perturbation: an optimal arc..."

...This technique is called node perturbation (Jabri and Flower, 1991), and is closely related to the madeline III learning rule (Widrow and Lehr, 1990)....

Journal Article•DOI•

Towards spike-based machine intelligence with neuromorphic computing.

Kaushik Roy¹, Akhilesh Jaiswal¹, Priyadarshini Panda¹•Institutions (1)

Purdue University¹

27 Nov 2019-Nature

TL;DR: An overview of the developments in neuromorphic computing for both algorithms and hardware is provided and the fundamentals of learning and hardware frameworks are highlighted, with emphasis on algorithm–hardware codesign.

Abstract: Guided by brain-like ‘spiking’ computational frameworks, neuromorphic computing—brain-inspired computing for machine intelligence—promises to realize artificial intelligence while reducing the energy requirements of computing platforms. This interdisciplinary field began with the implementation of silicon circuits for biological neural routines, but has evolved to encompass the hardware implementation of algorithms with spike-based encoding and event-driven representations. Here we provide an overview of the developments in neuromorphic computing for both algorithms and hardware and highlight the fundamentals of learning and hardware frameworks. We discuss the main challenges and the future prospects of neuromorphic computing, with emphasis on algorithm–hardware codesign. The authors review the advantages and future prospects of neuromorphic computing, a multidisciplinary engineering concept for energy-efficient artificial intelligence with brain-inspired functionality.

877 citations

Journal Article•DOI•

Fast exact multiplication by the Hessian

Barak A. Pearlmutter¹•Institutions (1)

Princeton University¹

01 Jan 1994-Neural Computation

TL;DR: This work derives a technique that directly calculates Hv, where v is an arbitrary vector, and shows that this technique can be used at the heart of many iterative techniques for computing various properties of H, obviating any need to calculate the full Hessian.

Abstract: Just storing the Hessian $H$ (the matrix of second derivatives $\partial^2 E / \partial w_i \partial w_j$ of the error $E$ with respect to each pair of weights) of a large neural network is difficult. Since a common use of a large matrix like $H$ is to compute its product with various vectors, we derive a technique that directly calculates $Hv$, where $v$ is an arbitrary vector. To calculate $Hv$, we first define a differential operator $R_v\{f(w)\} = (\partial/\partial r) f(w + rv)|_{r=0}$, note that $R_v\{\nabla_w\} = Hv$ and $R_v\{w\} = v$, and then apply $R_v\{\cdot\}$ to the equations used to compute $\nabla_w$. The result is an exact and numerically stable procedure for computing $Hv$, which takes about as much computation, and is about as local, as a gradient evaluation. We then apply the technique to a one pass gradient calculation algorithm (backpropagation), a relaxation gradient calculation algorithm (recurrent backpropagation), and two stochastic gradient calculation algorithms (Boltzmann machines and weight perturbation). Finally, we show that this technique can be used at the heart of many iterative techniques for computing various properties of $H$, obviating any need to calculate the full Hessian.
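
To make the operator concrete, here is a small sketch that applies $R_v$ line by line to the gradient computation of a single sigmoid unit with error $E = \frac{1}{2}(\sigma(w \cdot x) - t)^2$. The model, the function names, and the variable names are assumptions chosen for illustration; the abstract itself covers full backpropagation and other algorithms, which this sketch does not reproduce.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gradient(w, x, t):
    """Gradient of E = 0.5 * (sigmoid(w.x) - t)^2 with respect to w."""
    y = sigmoid(w @ x)
    return (y - t) * y * (1.0 - y) * x

def hessian_vector_product(w, v, x, t):
    """Exact Hv, obtained by applying R_v to each step of gradient() above."""
    y = sigmoid(w @ x)
    r_a = v @ x                    # R{a} = R{w}.x = v.x, since R{w} = v
    r_y = y * (1.0 - y) * r_a      # R{y} = sigmoid'(a) * R{a}
    # Product rule on (y - t) * y * (1 - y)
    r_delta = r_y * (y * (1.0 - y) + (y - t) * (1.0 - 2.0 * y))
    return r_delta * x             # x does not depend on w, so R{x} = 0
```

The result never forms the full Hessian and costs about as much as one gradient evaluation, matching the claim in the abstract. It can be checked numerically against a central difference of `gradient` along `v`.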

700 citations

Journal Article•DOI•

Artificial neural networks in hardware: A survey of two decades of progress

Janardan Misra, Indranil Saha¹•Institutions (1)

University of California, Los Angeles¹

01 Dec 2010-Neurocomputing

TL;DR: This article presents a comprehensive overview of the hardware realizations of artificial neural network models, known as hardware neural networks (HNN), appearing in academic studies as prototypes as well as in commercial use.

638 citations

Cites background from "Weight perturbation: an optimal arc..."

...This can be important in certain high-volume applications, such as ubiquitous consumer-products for real-time image processing, that are very price-sensitive....

Posted Content•

A Survey of Neuromorphic Computing and Neural Networks in Hardware.

Catherine D. Schuman, Thomas E. Potok, Robert M. Patton, J. Douglas Birdwell, Mark Edward Dean, Garrett S. Rose, James S. Plank

19 May 2017-arXiv: Neural and Evolutionary Computing

TL;DR: An exhaustive review of the research conducted in neuromorphic computing since the inception of the term is provided to motivate further work by illuminating gaps in the field where new research is needed.

Abstract: Neuromorphic computing has come to refer to a variety of brain-inspired computers, devices, and models that contrast the pervasive von Neumann computer architecture. This biologically inspired approach has created highly connected synthetic neurons and synapses that can be used to model neuroscience theories as well as solve challenging machine learning problems. The promise of the technology is to create a brain-like ability to learn and adapt, but the technical challenges are significant, starting with an accurate neuroscience model of how the brain works, to finding materials and engineering breakthroughs to build devices to support these models, to creating a programming framework so the systems can learn, to creating applications with brain-like capabilities. In this work, we provide a comprehensive survey of the research and motivations for neuromorphic computing over its history. We begin with a 35-year review of the motivations and drivers of neuromorphic computing, then look at the major research areas of the field, which we define as neuro-inspired models, algorithms and learning approaches, hardware and devices, supporting systems, and finally applications. We conclude with a broad discussion on the major research topics that need to be addressed in the coming years to see the promise of neuromorphic computing fulfilled. The goals of this work are to provide an exhaustive review of the research conducted in neuromorphic computing since the inception of the term, and to motivate further work by illuminating gaps in the field where new research is needed.

570 citations

Additional excerpts

...[647], [655], [670], [679], [682], [693], [698], [699], [702]–...
...[655], [669], [682], [698], [699], [708], [710], [712], [713],...

References

Journal Article•DOI•

30 years of adaptive neural networks: perceptron, Madaline, and backpropagation

Bernard Widrow¹, Michael A. Lehr¹•Institutions (1)

Stanford University¹

01 Sep 1990

TL;DR: The history, origination, operating characteristics, and basic theory of several supervised neural-network training algorithms (including the perceptron rule, the least-mean-square algorithm, three Madaline rules, and the backpropagation technique) are described.

Abstract: Fundamental developments in feedforward artificial neural networks from the past thirty years are reviewed. The history, origination, operating characteristics, and basic theory of several supervised neural-network training algorithms (including the perceptron rule, the least-mean-square algorithm, three Madaline rules, and the backpropagation technique) are described. The concept underlying these iterative adaptation algorithms is the minimal disturbance principle, which suggests that during training it is advisable to inject new information into a network in a manner that disturbs stored information to the smallest extent possible. The two principal kinds of online rules that have developed for altering the weights of a network are examined for both single-threshold elements and multielement networks. They are error-correction rules, which alter the weights of a network to correct error in the output response to the present input pattern, and gradient rules, which alter the weights of a network during each pattern presentation by gradient descent with the objective of reducing mean-square error (averaged over all training patterns).

2,297 citations

Journal Article•DOI•

Recurrent backpropagation and the dynamical approach to adaptive neural computation

Fernando J. Pineda¹•Institutions (1)

California Institute of Technology¹

01 Jun 1989-Neural Computation

TL;DR: It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.

Abstract: Error backpropagation in feedforward neural network models is a popular learning algorithm that has its roots in nonlinear estimation and optimization. It is being used routinely to calculate error gradients in nonlinear systems with hundreds of thousands of parameters. However, the classical architecture for backpropagation has severe restrictions. The extension of backpropagation to networks with recurrent connections will be reviewed. It is now possible to efficiently compute the error gradients for networks that have temporal dynamics, which opens applications to a host of problems in systems identification and control.

174 citations

"Weight perturbation: an optimal arc..." refers background in this paper

...$w_{ij} = \frac{E(w_{ij} + \mathrm{pert}_{ij}) - E(w_{ij})}{\mathrm{pert}_{ij}}$ (3)...

Journal Article•DOI•

Fast neural networks without multipliers

Michele Marchesi, G. Orlandi, Francesco Piazza, Aurelio Uncini

01 Jan 1993-IEEE Transactions on Neural Networks

TL;DR: Some test cases are presented, concerning MLPs with hidden layers of different sizes, on pattern recognition problems, to demonstrate the validity and the generalization capability of the method and give some insight into the behavior of the learning algorithm.

Abstract: Multilayer perceptrons (MLPs) with weight values restricted to powers of two or sums of powers of two are introduced. In a digital implementation, these neural networks do not need multipliers but only shift registers when computing in forward mode, thus saving chip area and computation time. A learning procedure, based on backpropagation, is presented for such neural networks. This learning procedure requires full real arithmetic and therefore must be performed offline. Some test cases are presented, concerning MLPs with hidden layers of different sizes, on pattern recognition problems. Such tests demonstrate the validity and the generalization capability of the method and give some insight into the behavior of the learning algorithm.
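
The forward-mode saving described in this abstract can be sketched as follows: when a weight equals a signed power of two, multiplying an integer activation by it reduces to a bit shift. This is an illustrative sketch only. The `(sign, exponent)` weight encoding, the function names, and the assumption of non-negative integer (fixed-point) inputs are hypothetical and are not taken from the paper, and the offline learning procedure is not shown.

```python
def shift_multiply(x, sign, exponent):
    """Multiply integer x by sign * 2**exponent using only shifts and negation."""
    # A negative exponent is a right shift, which discards low-order bits.
    shifted = x << exponent if exponent >= 0 else x >> -exponent
    return shifted if sign >= 0 else -shifted

def neuron_preactivation(inputs, weights):
    """Weighted sum where each weight is a (sign, exponent) pair, i.e. sign * 2**exponent."""
    return sum(shift_multiply(x, s, e) for x, (s, e) in zip(inputs, weights))
```

For example, `neuron_preactivation([8, 4, 16], [(1, 1), (-1, 0), (1, -2)])` computes 8·2 − 4·1 + 16·¼ without any multiplication. A weight that is a sum of powers of two would need one shift per term plus an addition.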

142 citations

Journal Article•DOI•

Techniques for area estimation of VLSI layouts

Fadi J. Kurdahi¹, Alice C. Parker²•Institutions (2)

University of California, Irvine¹, University of Southern California²

01 Jan 1989-IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems

TL;DR: The standard cell design style is investigated, and two probabilistic models are presented that estimate the wiring space requirements in the routing channels between the cell rows and the number of feedthroughs that must be inserted in the cell rows to interconnect cells placed several rows apart.

Abstract: The standard cell design style is investigated. Two probabilistic models are presented. The first model estimates the wiring space requirements in the routing channels between the cell rows. The second model estimates the number of feedthroughs that must be inserted in the cell rows to interconnect cells placed several rows apart. These models were implemented in the standard cell area estimation program PLEST (PLotting ESTimator). PLEST was used to estimate the areas of a set of 12 standard cell chips. In all cases, the estimates were accurate to within 10% of the actual areas. PLEST's estimation of a chip layout area takes only a few seconds to produce, as compared with more than 10 h to generate the chip layout itself using an industrial layout system.

109 citations

Proceedings Article•DOI•

A new area and shape function estimation technique for VLSI layouts

Gerhard Zimmerman

01 Jun 1988

TL;DR: A model is presented for the prediction of shape functions for aspect ratios up to 1:5; it can be used for many different design styles and has been tested for standard cell blocks and for the placement of general cells.

Abstract: Area estimation of IC layouts has become an important requirement for early design and top-down chip planning tools. Especially the relation of area and aspect ratio (shape function) is necessary for chip planning. Statistical models have been published with good results for standard cell blocks with near unity aspect ratios. This paper describes a new model for the prediction of shape functions for aspect ratios up to 1:5. The model is based on the shape and connectivity of adjacent cells. It can be used for many different design styles and has been tested for standard cell blocks and for the placement of general cells. Categories: 6, 9

96 citations