# Stochastic neural network

About: Stochastic neural network is a research topic. Over its lifetime, 5,913 publications on this topic have received 138,214 citations. The topic is also known as: SNN.

##### Papers

[...]

TL;DR: It is demonstrated that finite linear combinations of compositions of a fixed univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube.

Abstract: In this paper we demonstrate that finite linear combinations of compositions of a fixed, univariate function and a set of affine functionals can uniformly approximate any continuous function of n real variables with support in the unit hypercube; only mild conditions are imposed on the univariate function. Our results settle an open question about representability in the class of single hidden layer neural networks. In particular, we show that arbitrary decision regions can be arbitrarily well approximated by continuous feedforward neural networks with only a single internal, hidden layer and any continuous sigmoidal nonlinearity. The paper discusses approximation properties of other possible types of nonlinearities that might be implemented by artificial neural networks.

10,615 citations
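For reference, the approximating family this abstract describes can be written explicitly. The following restates the construction in the paper's terms, with $I_n = [0,1]^n$ the unit hypercube and $\sigma$ a continuous sigmoidal function:

```latex
G(x) = \sum_{j=1}^{N} \alpha_j \, \sigma\!\left(y_j^{\top} x + \theta_j\right),
\qquad y_j \in \mathbb{R}^n, \quad \alpha_j, \theta_j \in \mathbb{R}.
```

The theorem states that sums of this form are dense in $C(I_n)$: for any continuous $f$ and any $\varepsilon > 0$, some $G$ satisfies $\sup_{x \in I_n} |f(x) - G(x)| < \varepsilon$.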

[...]

TL;DR: The bias-variance decomposition of the error is provided in this paper, showing that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance.

Abstract: Neural network ensemble is a learning paradigm where many neural networks are jointly used to solve a problem. In this paper, the relationship between the ensemble and its component neural networks is analyzed in the context of both regression and classification, which reveals that it may be better to ensemble many instead of all of the neural networks at hand. This result is interesting because at present, most approaches ensemble all the available neural networks for prediction. Then, to show that the appropriate neural networks for composing an ensemble can be effectively selected from a set of available neural networks, an approach named GASEN is presented. GASEN first trains a number of neural networks. It then assigns random weights to those networks and employs a genetic algorithm to evolve the weights so that they characterize, to some extent, the fitness of the neural networks for constituting an ensemble. Finally, it selects some neural networks based on the evolved weights to make up the ensemble. A large empirical study shows that, compared with some popular ensemble approaches such as Bagging and Boosting, GASEN can generate neural network ensembles with far smaller sizes but stronger generalization ability. Furthermore, to explain the working mechanism of GASEN, the bias-variance decomposition of the error is provided, which shows that the success of GASEN may lie in its ability to significantly reduce both the bias and the variance.

1,703 citations
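The selection mechanism described above lends itself to a compact sketch. What follows is a minimal, hedged illustration in Python, assuming pre-trained candidate networks represented only by their validation-set predictions; the GA hyperparameters, mutation scheme, and toy data are illustrative assumptions, not the paper's settings. Only the overall GASEN recipe is taken from the abstract: evolve weights over the candidates, then keep the networks whose evolved weight exceeds the uniform threshold.

```python
# Hedged sketch of GASEN-style ensemble selection (toy data, assumed settings).
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for pre-trained networks: each row is one network's predictions
# on a shared validation set.
n_nets, n_val = 20, 200
y_val = np.sin(np.linspace(0.0, 3.0, n_val))
preds = y_val + rng.normal(0.0, 0.3, size=(n_nets, n_val))

def fitness(w):
    """Negative validation MSE of the w-weighted ensemble (higher is better)."""
    w = w / w.sum()
    return -np.mean((w @ preds - y_val) ** 2)

# Simple evolutionary loop: Gaussian mutation plus truncation selection.
pop = rng.random((50, n_nets))
for _ in range(100):
    children = np.clip(pop + rng.normal(0.0, 0.05, size=pop.shape), 1e-6, None)
    candidates = np.vstack([pop, children])
    scores = np.array([fitness(w) for w in candidates])
    pop = candidates[np.argsort(scores)[-50:]]  # keep the 50 fittest

best = pop[-1] / pop[-1].sum()
# Selection step: keep networks whose evolved weight exceeds the uniform 1/N
# threshold; the selected networks would then be combined by simple averaging.
selected = np.flatnonzero(best > 1.0 / n_nets)
print(f"selected {selected.size} of {n_nets} networks:", selected)
```

Note the design point: the evolved weights are used only to decide which networks enter the ensemble, not as the final combination weights.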

[...]

14 Apr 1993

TL;DR: A practical, book-length treatment of neural network techniques, covering classification, autoassociation, time-series prediction, function approximation, network design, training-set preparation, and performance evaluation, together with the accompanying NEURAL program.

Abstract: Foundations. Classification. Autoassociation. Time Series Prediction. Function Approximation. Multilayer Feedforward Networks. Eluding Local Minima I: Simulated Annealing. Eluding Local Minima II: Genetic Optimisation. Regression and Neural Networks. Designing Feedforward Network Architectures. Interpreting Weights: How Does This Thing Work? Probabilistic Neural Networks. Functional Link Networks. Hybrid Networks. Designing the Training Set. Preparing Input Data. Fuzzy Data and Processing. Unsupervised Training. Evaluating Performance of Neural Networks. Confidence Measures. Optimizing the Decision Threshold. Using the NEURAL Program. Appendix. Bibliography. Index.

1,639 citations
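One of the chapter topics above, simulated annealing for eluding local minima, is easy to illustrate. Below is a minimal sketch in Python on a one-dimensional toy objective; the objective, step size, and geometric cooling schedule are assumptions made for illustration, not the book's code.

```python
# Hedged sketch: simulated annealing escaping local minima of a toy objective.
import math
import random

random.seed(0)

def loss(w):
    """Toy non-convex objective with several local minima."""
    return math.sin(5.0 * w) + 0.1 * w * w

w = best_w = 3.0
temp = 1.0
for _ in range(2000):
    cand = w + random.gauss(0.0, 0.5)  # propose a random move
    delta = loss(cand) - loss(w)
    # Always accept downhill moves; accept uphill moves with prob. e^(-delta/T),
    # which lets the search climb out of local minima while temp is high.
    if delta < 0 or random.random() < math.exp(-delta / temp):
        w = cand
    if loss(w) < loss(best_w):
        best_w = w
    temp *= 0.995  # geometric cooling

print(f"best weight {best_w:.3f}, loss {loss(best_w):.3f}")
```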

[...]

TL;DR: This work considers a small-scale version of *conditional computation*, where sparse stochastic units form a distributed representation of gaters that can turn off large chunks of the computation performed in the rest of the neural network in combinatorially many ways.

Abstract: Stochastic neurons and hard non-linearities can be useful for a number of reasons in deep learning models, but in many cases they pose a challenging problem: how to estimate the gradient of a loss function with respect to the input of such stochastic or non-smooth neurons? That is, can we "back-propagate" through these stochastic neurons? We examine this question, survey existing approaches, and compare four families of solutions, applicable in different settings. One of them is the minimum-variance unbiased gradient estimator for stochastic binary neurons (a special case of the REINFORCE algorithm). A second approach, introduced here, decomposes the operation of a binary stochastic neuron into a stochastic binary part and a smooth differentiable part, which approximates the expected effect of the pure stochastic binary neuron to first order. A third approach involves the injection of additive or multiplicative noise in a computational graph that is otherwise differentiable. A fourth approach heuristically copies the gradient with respect to the stochastic output directly as an estimator of the gradient with respect to the sigmoid argument (we call this the straight-through estimator). To explore a context where these estimators are useful, we consider a small-scale version of *conditional computation*, where sparse stochastic units form a distributed representation of gaters that can turn off, in combinatorially many ways, large chunks of the computation performed in the rest of the neural network. In this case, it is important that the gating units produce an actual 0 most of the time. The resulting sparsity can potentially be exploited to greatly reduce the computational cost of large deep networks for which conditional computation would be useful.

1,474 citations
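The fourth approach, the straight-through estimator, reduces to a one-line trick in an autograd framework. Here is a minimal sketch in PyTorch; the surrounding toy model and objective are assumptions, and only the estimator itself (use the sampled binary value on the forward pass, copy the gradient through the sampling step on the backward pass) follows the abstract's description.

```python
# Hedged sketch of the straight-through estimator for a binary stochastic neuron.
import torch

torch.manual_seed(0)
x = torch.randn(32, 10)                     # toy inputs (assumed model)
W = torch.randn(10, 1, requires_grad=True)  # toy weights (assumed model)

a = x @ W                                   # pre-activation of the neuron
p = torch.sigmoid(a)                        # firing probability
b_hard = (torch.rand_like(p) < p).float()   # sampled binary output (no gradient)
# Straight-through: the forward pass sees b_hard, but the backward pass treats
# db/dp as 1, so the gradient w.r.t. the stochastic output is copied to the
# sigmoid argument through p.
b = p + (b_hard - p).detach()

loss = ((b - 1.0) ** 2).mean()  # toy objective: encourage the neuron to fire
loss.backward()
print(W.grad.norm())  # gradient reached W despite the discrete sampling step
```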

[...]

TL;DR: Theoretical results concerning the capabilities and limitations of various neural network models are summarized, and some of their extensions are discussed.

Abstract: Theoretical results concerning the capabilities and limitations of various neural network models are summarized, and some of their extensions are discussed. The network models considered are divided into two basic categories: static networks and dynamic networks. Unlike static networks, dynamic networks have memory. They fall into three groups: networks with feedforward dynamics, networks with output feedback, and networks with state feedback, which are emphasized in this work. Most of the networks discussed are trained using supervised learning.

1,222 citations
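The static/dynamic distinction this survey draws can be made concrete with a short sketch. Below, a static map depends only on its current input, while a state-feedback network carries a hidden state across time steps; the layer sizes and tanh nonlinearity are illustrative assumptions, not the survey's notation.

```python
# Hedged sketch: static network vs. dynamic (state-feedback) network.
import numpy as np

rng = np.random.default_rng(0)
U = rng.normal(size=(4, 3))  # input-to-hidden weights (assumed sizes)
W = rng.normal(size=(4, 4))  # hidden-state feedback weights

def static_net(x):
    """Static network: output is a memoryless function of the current input."""
    return np.tanh(U @ x)

def state_feedback_net(xs):
    """Dynamic network: the fed-back hidden state gives the model memory."""
    h = np.zeros(4)
    for x in xs:
        h = np.tanh(U @ x + W @ h)  # state feedback: h depends on all past inputs
    return h

xs = rng.normal(size=(5, 3))
print(static_net(xs[-1]))      # uses only the last input
print(state_feedback_net(xs))  # uses the whole input sequence
```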