
Showing papers in "Neural Computation in 2007"


Journal ArticleDOI
TL;DR: This letter proposes two projected gradient methods for nonnegative matrix factorization, both of which exhibit strong optimization properties, discusses efficient implementations, and demonstrates that one of the proposed methods converges faster than the popular multiplicative update approach.
Abstract: Nonnegative matrix factorization (NMF) can be formulated as a minimization problem with bound constraints. Although bound-constrained optimization has been studied extensively in both theory and practice, so far no study has formally applied its techniques to NMF. In this letter, we propose two projected gradient methods for NMF, both of which exhibit strong optimization properties. We discuss efficient implementations and demonstrate that one of the proposed methods converges faster than the popular multiplicative update approach. A simple Matlab code is also provided.

1,808 citations
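
To make the projected-gradient idea concrete, here is a minimal NumPy sketch of the alternating scheme for minimizing 0.5*||V - WH||_F^2 subject to W, H >= 0. The paper's methods choose the step size by line search and come with convergence guarantees; the fixed step size and iteration count here are illustrative assumptions only.

```python
import numpy as np

def nmf_projected_gradient(V, r, n_iter=500, step=1e-3, seed=0):
    """Alternating projected-gradient NMF sketch (fixed step, not the
    paper's line search): minimize 0.5*||V - W@H||_F^2 with W, H >= 0."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], r))
    H = rng.random((r, V.shape[1]))
    for _ in range(n_iter):
        # Gradient step on H, then project back onto the nonnegative orthant
        H = np.maximum(H - step * W.T @ (W @ H - V), 0.0)
        # Gradient step on W, same projection
        W = np.maximum(W - step * (W @ H - V) @ H.T, 0.0)
    return W, H
```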


Journal ArticleDOI
TL;DR: It is pointed out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility.
Abstract: Most literature on support vector machines (SVMs) concentrates on the dual optimization problem. In this letter, we point out that the primal problem can also be solved efficiently for both linear and nonlinear SVMs and that there is no reason for ignoring this possibility. On the contrary, from the primal point of view, new families of algorithms for large-scale SVM training can be investigated.

837 citations
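
As an illustration of training in the primal, the sketch below runs plain subgradient descent on the regularized hinge loss of a linear SVM. The letter itself advocates more sophisticated primal solvers (e.g., Newton-type methods), so treat the optimizer and all parameter values here as assumptions.

```python
import numpy as np

def primal_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the primal objective
    lam/2 * ||w||^2 + mean(max(0, 1 - y*(X@w + b))), with y in {-1, +1}."""
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1                      # points violating the margin
        grad_w = lam * w - (y[active] @ X[active]) / n
        grad_b = -y[active].sum() / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b
```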


Journal ArticleDOI
TL;DR: A method is proposed for objectively selecting the bin size from the spike count statistics alone, so that the resulting bar or line graph time histogram best represents the unknown underlying spike rate.
Abstract: The time histogram method is the most basic tool for capturing a time-dependent rate of neuronal spikes. Generally in the neurophysiological literature, the bin size that critically determines the goodness of the fit of the time histogram to the underlying spike rate has been subjectively selected by individual researchers. Here, we propose a method for objectively selecting the bin size from the spike count statistics alone, so that the resulting bar or line graph time histogram best represents the unknown underlying spike rate. For a small number of spike sequences generated from a modestly fluctuating rate, the optimal bin size may diverge, indicating that any time histogram is likely to capture a spurious rate. Given a paucity of data, the method presented here can nevertheless suggest how many experimental trials should be added in order to obtain a meaningful time-dependent histogram with the required accuracy.

461 citations
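
A minimal sketch of the selection recipe, assuming the published cost function C(Δ) = (2k̄ − v)/Δ², where k̄ and v are the mean and (biased) variance of the spike counts in bins of width Δ; the Δ minimizing C is chosen. The candidate grid is up to the user.

```python
import numpy as np

def optimal_bin_size(spike_times, t_start, t_stop, candidates):
    """Cost-based bin-size selection (sketch): for each candidate width,
    compute C(delta) = (2*mean - var) / delta**2 from the bin counts and
    return the width that minimizes it."""
    best = None
    for delta in candidates:
        edges = np.arange(t_start, t_stop + delta, delta)
        counts, _ = np.histogram(spike_times, bins=edges)
        mean, var = counts.mean(), counts.var()   # biased variance, as required
        cost = (2 * mean - var) / delta**2
        if best is None or cost < best[0]:
            best = (cost, delta)
    return best[1]
```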


Journal ArticleDOI
TL;DR: This article shows how many aspects of the anatomy and physiology of the circuit involving the cortex and basal ganglia are exactly those required to implement the computation defined by an asymptotically optimal statistical test for decision making: the multihypothesis sequential probability ratio test (MSPRT).
Abstract: Neurophysiological studies have identified a number of brain regions critically involved in solving the problem of action selection or decision making. In the case of highly practiced tasks, these regions include cortical areas hypothesized to integrate evidence supporting alternative actions and the basal ganglia, hypothesized to act as a central switch in gating behavioral requests. However, despite our relatively detailed knowledge of basal ganglia biology and its connectivity with the cortex and numerical simulation studies demonstrating selective function, no formal theoretical framework exists that supplies an algorithmic description of these circuits. This article shows how many aspects of the anatomy and physiology of the circuit involving the cortex and basal ganglia are exactly those required to implement the computation defined by an asymptotically optimal statistical test for decision making: the multihypothesis sequential probability ratio test (MSPRT). The resulting model of basal ganglia provides a new framework for understanding the computation in the basal ganglia during decision making in highly practiced tasks. The predictions of the theory concerning the properties of particular neuronal populations are validated in existing experimental data. Further, we show that this neurobiologically grounded implementation of MSPRT outperforms other candidates for neural decision making, that it is structurally and parametrically robust, and that it can accommodate cortical mechanisms for decision making in a way that complements those in basal ganglia.

390 citations
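
The statistical test itself is easy to state in code. The sketch below implements a generic MSPRT: accumulate per-hypothesis log-likelihoods and stop as soon as one posterior probability (under a flat prior) crosses a threshold. The mapping of these quantities onto cortical and basal ganglia populations is the article's contribution and is not reproduced here; `loglik_fn` and the threshold value are assumptions.

```python
import numpy as np

def msprt(loglik_fn, observations, n_hyp, threshold=0.99):
    """Generic multihypothesis SPRT sketch: loglik_fn(x, i) returns the
    log-likelihood of observation x under hypothesis i."""
    L = np.zeros(n_hyp)                        # accumulated log-likelihoods
    post = np.full(n_hyp, 1.0 / n_hyp)
    for t, x in enumerate(observations, start=1):
        L += np.array([loglik_fn(x, i) for i in range(n_hyp)])
        post = np.exp(L - L.max())
        post /= post.sum()                     # softmax = posterior, flat prior
        if post.max() > threshold:
            return int(post.argmax()), t       # (chosen hypothesis, decision time)
    return int(post.argmax()), len(observations)
```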


Journal ArticleDOI
TL;DR: It is shown that the modulation of STDP by a global reward signal leads to reinforcement learning, and learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity are derived analytically; these rules may be used for training generic artificial spiking neural networks, regardless of the neural model used.
Abstract: The persistent modification of synaptic efficacy as a function of the relative timing of pre- and postsynaptic spikes is a phenomenon known as spike-timing-dependent plasticity (STDP). Here we show that the modulation of STDP by a global reward signal leads to reinforcement learning. We first derive analytically learning rules involving reward-modulated spike-timing-dependent synaptic and intrinsic plasticity, by applying a reinforcement learning algorithm to the stochastic spike response model of spiking neurons. These rules have several features common to plasticity mechanisms experimentally found in the brain. We then demonstrate in simulations of networks of integrate-and-fire neurons the efficacy of two simple learning rules involving modulated STDP. One rule is a direct extension of the standard STDP model (modulated STDP), and the other one involves an eligibility trace stored at each synapse that keeps a decaying memory of recent pre- and postsynaptic spike pairs (modulated STDP with eligibility trace). This latter rule permits learning even if the reward signal is delayed. The proposed rules are able to solve the XOR problem with both rate coded and temporally coded input and to learn a target output firing-rate pattern. These learning rules are biologically plausible, may be used for training generic artificial spiking neural networks, regardless of the neural model used, and suggest the experimental investigation in animals of the existence of reward-modulated STDP.

383 citations
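
A toy sketch of the second rule (modulated STDP with an eligibility trace): each pre/post pairing produces a standard STDP increment that is stored in a decaying trace, and the weight changes only when the trace is multiplied by the reward signal, which may arrive later. All parameter names and values below are illustrative, not the paper's.

```python
import numpy as np

def rstdp_step(w, trace, reward, dt_pre_post=None, lr=0.01,
               a_plus=1.0, a_minus=1.05, tau_stdp=20.0, tau_e=500.0, dt=1.0):
    """One time step of modulated STDP with an eligibility trace (toy).
    dt_pre_post = t_post - t_pre for a pairing occurring in this step,
    or None if no pairing occurred."""
    stdp = 0.0
    if dt_pre_post is not None:
        if dt_pre_post > 0:    # pre before post: potentiation
            stdp = a_plus * np.exp(-dt_pre_post / tau_stdp)
        else:                  # post before pre: depression
            stdp = -a_minus * np.exp(dt_pre_post / tau_stdp)
    trace += dt * (-trace / tau_e) + stdp   # decaying memory of pairings
    w += lr * reward * trace                # reward gates the weight change
    return w, trace
```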


Journal ArticleDOI
TL;DR: A novel STDP update rule is proposed, with a multiplicative dependence on the synaptic weight for depression, and a power law dependence for potentiation, and it is shown that this rule, when implemented in large, balanced networks of realistic connectivity and sparseness, is compatible with the asynchronous irregular activity regime.
Abstract: The balanced random network model attracts considerable interest because it explains the irregular spiking activity at low rates and large membrane potential fluctuations exhibited by cortical neurons in vivo. In this article, we investigate to what extent this model is also compatible with the experimentally observed phenomenon of spike-timing-dependent plasticity (STDP). Confronted with the plethora of theoretical models for STDP available, we reexamine the experimental data. On this basis, we propose a novel STDP update rule, with a multiplicative dependence on the synaptic weight for depression, and a power law dependence for potentiation. We show that this rule, when implemented in large, balanced networks of realistic connectivity and sparseness, is compatible with the asynchronous irregular activity regime. The resultant equilibrium weight distribution is unimodal with fluctuating individual weight trajectories and does not exhibit development of structure. We investigate the robustness of our results with respect to the relative strength of depression. We introduce synchronous stimulation to a group of neurons and demonstrate that the decoupling of this group from the rest of the network is so severe that it cannot effectively control the spiking of other neurons, even those with the highest convergence from this group.

375 citations
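
The proposed weight dependence is easy to sketch: depression scales linearly (multiplicatively) with the current weight, while potentiation follows a power law with exponent μ < 1. The exponent, learning rate, and reference weight below are illustrative placeholders, not the fitted values from the paper.

```python
def stdp_update(w, pre_trace, post_trace, is_pre_spike,
                lam=0.1, alpha=1.0, mu=0.4, w0=1.0):
    """Weight dependence of the update rule described above (sketch).
    pre_trace/post_trace are exponentially filtered spike trains of the
    pre- and postsynaptic neuron; all parameter values are illustrative."""
    if is_pre_spike:
        # Presynaptic spike after postsynaptic activity: multiplicative depression
        return max(w - lam * alpha * w * post_trace, 0.0)
    # Postsynaptic spike after presynaptic activity: power-law potentiation
    return w + lam * w0 ** (1 - mu) * w ** mu * pre_trace
```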


Journal ArticleDOI
TL;DR: A model of spike-driven synaptic plasticity inspired by experimental observations and motivated by the desire to build an electronic hardware device that can learn to classify complex stimuli in a semisupervised fashion is presented.
Abstract: We present a model of spike-driven synaptic plasticity inspired by experimental observations and motivated by the desire to build an electronic hardware device that can learn to classify complex stimuli in a semisupervised fashion. During training, patterns of activity are sequentially imposed on the input neurons, and an additional instructor signal drives the output neurons toward the desired activity. The network is made of integrate-and-fire neurons with constant leak and a floor. The synapses are bistable, and they are modified by the arrival of presynaptic spikes. The sign of the change is determined by both the depolarization and the state of a variable that integrates the postsynaptic action potentials. Following the training phase, the instructor signal is removed, and the output neurons are driven purely by the activity of the input neurons weighted by the plastic synapses. In the absence of stimulation, the synapses preserve their internal state indefinitely. Memories are also very robust to the disruptive action of spontaneous activity. A network of 2000 input neurons is shown to be able to classify correctly a large number (thousands) of highly overlapping patterns (300 classes of preprocessed LaTeX characters, 30 patterns per class, and a subset of the NIST characters data set) and to generalize with performances that are better than or comparable to those of artificial neural networks. Finally we show that the synaptic dynamics is compatible with many of the experimental observations on the induction of long-term modifications (spike-timing-dependent plasticity and its dependence on both the postsynaptic depolarization and the frequency of pre- and postsynaptic neurons).

371 citations


Journal ArticleDOI
TL;DR: The proposed analog VLSI synaptic circuit is based on a computational model that fits the real postsynaptic currents with exponentials and can be connected to additional modules for implementing a wide range of synaptic properties.
Abstract: Synapses are crucial elements for computation and information transfer in both real and artificial neural systems. Recent experimental findings and theoretical models of pulse-based neural networks suggest that synaptic dynamics can play a crucial role for learning neural codes and encoding spatiotemporal spike patterns. Within the context of hardware implementations of pulse-based neural networks, several analog VLSI circuits modeling synaptic functionality have been proposed. We present an overview of previously proposed circuits and describe a novel analog VLSI synaptic circuit suitable for integration in large VLSI spike-based neural systems. The circuit proposed is based on a computational model that fits the real postsynaptic currents with exponentials. We present experimental data showing how the circuit exhibits realistic dynamics and show how it can be connected to additional modules for implementing a wide range of synaptic properties.

356 citations


Journal ArticleDOI
TL;DR: Two new support vector approaches for ordinal regression are proposed, which optimize multiple thresholds to define parallel discriminant hyperplanes for the ordinal scales, and guarantee that the thresholds are properly ordered at the optimal solution.
Abstract: In this letter, we propose two new support vector approaches for ordinal regression, which optimize multiple thresholds to define parallel discriminant hyperplanes for the ordinal scales. Both approaches guarantee that the thresholds are properly ordered at the optimal solution. The size of these optimization problems is linear in the number of training samples. The sequential minimal optimization algorithm is adapted for the resulting optimization problems; it is extremely easy to implement and scales efficiently as a quadratic function of the number of examples. The results of numerical experiments on some benchmark and real-world data sets, including applications of ordinal regression to information retrieval, verify the usefulness of these approaches.

293 citations
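
Prediction with such a model is simple once trained: project a sample onto the shared weight vector and locate the projection among the ordered thresholds. A hedged NumPy sketch of that decision rule follows (training via the adapted SMO algorithm is not shown):

```python
import numpy as np

def predict_rank(scores, thresholds):
    """Ordinal prediction with parallel hyperplanes (sketch): the rank of x
    is where the projection w.x falls among the ordered thresholds
    b_1 <= ... <= b_{r-1}."""
    thresholds = np.sort(thresholds)   # properly ordered at the optimum
    return 1 + np.searchsorted(thresholds, scores)

# Example: predict_rank(np.array([-2.0, 0.3, 5.1]), np.array([0.0, 1.0, 2.5]))
# yields ranks 1, 2, 4.
```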


Journal ArticleDOI
TL;DR: It is shown that Evolino-based LSTM can solve tasks that Echo State nets cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-basedLSTM.
Abstract: In recent years, gradient-based LSTM recurrent neural networks (RNNs) solved many previously RNN-unlearnable tasks. Sometimes, however, gradient information is of little use for training RNNs, due to numerous local minima. For such cases, we present a novel method: EVOlution of systems with LINear Outputs (Evolino). Evolino evolves weights to the nonlinear, hidden nodes of RNNs while computing optimal linear mappings from hidden state to output, using methods such as pseudo-inverse-based linear regression. If we instead use quadratic programming to maximize the margin, we obtain the first evolutionary recurrent support vector machines. We show that Evolino-based LSTM can solve tasks that Echo State nets (Jaeger, 2004a) cannot and achieves higher accuracy in certain continuous function generation tasks than conventional gradient descent RNNs, including gradient-based LSTM.

264 citations


Journal ArticleDOI
TL;DR: A functional space approximation framework is presented to better understand the operation of ESNs and an information-theoretic metric, the average entropy of echo states, is proposed to assess the richness of the ESN dynamics.
Abstract: The design of echo state network (ESN) parameters relies on the selection of the maximum eigenvalue of the linearized system around zero (spectral radius). However, this procedure does not quantify in a systematic manner the performance of the ESN in terms of approximation error. This article presents a functional space approximation framework to better understand the operation of ESNs and proposes an information-theoretic metric, the average entropy of echo states, to assess the richness of the ESN dynamics. Furthermore, it provides an interpretation of the ESN dynamics rooted in system theory as families of coupled linearized systems whose poles move according to the input signal dynamics. With this interpretation, a design methodology for functional approximation is put forward where ESNs are designed with uniform pole distributions covering the frequency spectrum to abide by the richness metric, irrespective of the spectral radius. A single bias parameter at the ESN input, adapted with the modeling error, configures the ESN spectral radius to the input-output joint space. Function approximation examples compare the proposed design methodology versus the conventional design.
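
The design prescription — poles spread uniformly over the frequency spectrum — can be realized, for example, with a block-diagonal reservoir of 2x2 rotation blocks, each contributing a complex-conjugate pole pair r*exp(+/- i*theta). This sketch is one possible reading of that prescription, not the authors' construction:

```python
import numpy as np

def esn_reservoir_uniform_poles(n_units, radius=0.9):
    """Reservoir whose linearization has poles spread uniformly in
    frequency (sketch): conjugate pole pairs radius*exp(+/- i*theta),
    with theta uniform over [0, pi)."""
    assert n_units % 2 == 0
    W = np.zeros((n_units, n_units))
    thetas = np.linspace(0, np.pi, n_units // 2, endpoint=False)
    for b, th in enumerate(thetas):
        c, s = radius * np.cos(th), radius * np.sin(th)
        W[2*b:2*b+2, 2*b:2*b+2] = [[c, -s], [s, c]]
    return W   # eigenvalues: radius * exp(+/- i * thetas)
```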

Journal ArticleDOI
TL;DR: A parametric generalization of the two different multiplicative update rules for nonnegative matrix factorization by Lee and Seung (2001) is shown to lead to locally optimal solutions of the nonnegative matrix factorization problem with this new cost function.
Abstract: This letter presents a general parametric divergence measure. The metric includes as special cases quadratic error and Kullback-Leibler divergence. A parametric generalization of the two different multiplicative update rules for nonnegative matrix factorization by Lee and Seung (2001) is shown to lead to locally optimal solutions of the nonnegative matrix factorization problem with this new cost function. Numeric simulations demonstrate that the new update rule may improve the quadratic distance convergence speed. A proof of convergence is given that, as in Lee and Seung, uses an auxiliary function known from the expectation-maximization theoretical framework.

Journal ArticleDOI
TL;DR: Overall it is found that fluctuation-driven persistent activity in the very simplified type of models the authors analyze is not a robust phenomenon.
Abstract: Spike trains from cortical neurons show a high degree of irregularity, with coefficients of variation (CV) of their interspike interval (ISI) distribution close to or higher than one. It has been suggested that this irregularity might be a reflection of a particular dynamical state of the local cortical circuit in which excitation and inhibition balance each other. In this "balanced" state, the mean current to the neurons is below threshold, and firing is driven by current fluctuations, resulting in irregular Poisson-like spike trains. Recent data show that the degree of irregularity in neuronal spike trains recorded during the delay period of working memory experiments is the same for both low-activity states of a few Hz and for elevated, persistent activity states of a few tens of Hz. Since the difference between these persistent activity states cannot be due to external factors coming from sensory inputs, this suggests that the underlying network dynamics might support coexisting balanced states at different firing rates. We use mean field techniques to study the possible existence of multiple balanced steady states in recurrent networks of current-based leaky integrate-and-fire (LIF) neurons. To assess the degree of balance of a steady state, we extend existing mean-field theories so that not only the firing rate, but also the coefficient of variation of the interspike interval distribution of the neurons, are determined self-consistently. Depending on the connectivity parameters of the network, we find bistable solutions of different types. If the local recurrent connectivity is mainly excitatory, the two stable steady states differ mainly in the mean current to the neurons. In this case, the mean drive in the elevated persistent activity state is suprathreshold and typically characterized by low spiking irregularity. If the local recurrent excitatory and inhibitory drives are both large and nearly balanced, or even dominated by inhibition, two stable states coexist, both with subthreshold current drive. In this case, the spiking variability in both the resting state and the mnemonic persistent state is large, but the balance condition implies parameter fine-tuning. Since the degree of required fine-tuning increases with network size and, on the other hand, the size of the fluctuations in the afferent current to the cells increases for small networks, overall we find that fluctuation-driven persistent activity in the very simplified type of models we analyze is not a robust phenomenon. Possible implications of considering more realistic models are discussed.

Journal ArticleDOI
TL;DR: This article derives multiplicative updates that improve the value of the objective function at each iteration and converge monotonically to the global minimum for convex problems in quadratic programming where the optimization is confined to an axis-aligned region in the nonnegative orthant.
Abstract: Many problems in neural computation and statistical learning involve optimizations with nonnegativity constraints. In this article, we study convex problems in quadratic programming where the optimization is confined to an axis-aligned region in the nonnegative orthant. For these problems, we derive multiplicative updates that improve the value of the objective function at each iteration and converge monotonically to the global minimum. The updates have a simple closed form and do not involve any heuristics or free parameters that must be tuned to ensure convergence. Despite their simplicity, they differ strikingly in form from other multiplicative updates used in machine learning. We provide complete proofs of convergence for these updates and describe their application to problems in signal processing and pattern recognition.
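
For concreteness, a sketch of this class of updates for minimizing 0.5*x'Ax + b'x subject to x >= 0: split A into its positive and negative parts and apply the closed-form multiplicative rule, which needs no step size or tunable parameters. Iteration count, initialization, and the small epsilon guard below are assumptions.

```python
import numpy as np

def nqp_multiplicative(A, b, n_iter=500, eps=1e-12, seed=0):
    """Multiplicative updates for nonnegative quadratic programming
    (sketch): x_i <- x_i * (-b_i + sqrt(b_i^2 + 4*(A+ x)_i*(A- x)_i))
    / (2*(A+ x)_i), with A+ and A- the positive/negative parts of A."""
    Ap = np.maximum(A, 0.0)    # elementwise positive part of A
    An = np.maximum(-A, 0.0)   # elementwise negative part of A
    x = np.random.default_rng(seed).random(len(b))
    for _ in range(n_iter):
        a_ = Ap @ x
        c_ = An @ x
        x *= (-b + np.sqrt(b * b + 4 * a_ * c_)) / (2 * a_ + eps)
    return x
```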

Journal ArticleDOI
TL;DR: Simulations of human brain rhythms were carried out, and physiologically plausible results were obtained with this anatomically constrained neural mass model, whose sparse connectivity matrix substantially reduces the number of connection parameters and makes the system faster to solve.
Abstract: We study the generation of EEG rhythms by means of realistically coupled neural mass models. Previous neural mass models were used to model cortical voxels and the thalamus. Interactions between voxels of the same and other cortical areas and with the thalamus were taken into account. Voxels within the same cortical area were coupled (short-range connections) with both excitatory and inhibitory connections, while coupling between areas (long-range connections) was considered to be excitatory only. Short-range connection strengths were modeled by using a connectivity function depending on the distance between voxels. Coupling strength parameters between areas were defined from empirical anatomical data employing the information obtained from probabilistic paths, which were tracked by water diffusion imaging techniques and used to quantify white matter tracts in the brain. Each cortical voxel was then described by a set of 16 random differential equations, while the thalamus was described by a set of 12 random differential equations. Thus, for analyzing the neuronal dynamics emerging from the interaction of several areas, a large system of differential equations needs to be solved. The sparseness of the estimated anatomical connectivity matrix reduces the number of connection parameters substantially, making the solution of this system faster. Simulations of human brain rhythms were carried out in order to test the model. Physiologically plausible results were obtained based on this anatomically constrained neural mass model.

Journal ArticleDOI
TL;DR: This model reproduces perceived spatial frame shifts due to the audiovisual adaptation known as the ventriloquism aftereffect by adaptively changing the inner representation of the Bayesian observer in terms of experience.
Abstract: We study a computational model of audiovisual integration by setting a Bayesian observer that localizes visual and auditory stimuli without presuming the binding of audiovisual information. The observer adopts the maximum a posteriori approach to estimate the physically delivered position or timing of presented stimuli, simultaneously judging whether they are from the same source or not. Several experimental results on the perception of spatial unity and the ventriloquism effect can be explained comprehensively if the subjects in the experiments are regarded as Bayesian observers who try to accurately locate the stimulus. Moreover, by adaptively changing the inner representation of the Bayesian observer in terms of experience, we show that our model reproduces perceived spatial frame shifts due to the audiovisual adaptation known as the ventriloquism aftereffect.

Journal ArticleDOI
TL;DR: It is shown that by exploiting the existence of a minimal synaptic propagation delay, the need for a central event queue is removed, so that the precision of event-driven simulation on the level of single neurons is combined with the efficiency of time-driven global scheduling.
Abstract: Very large networks of spiking neurons can be simulated efficiently in parallel under the constraint that spike times are bound to an equidistant time grid. Within this scheme, the subthreshold dynamics of a wide class of integrate-and-fire-type neuron models can be integrated exactly from one grid point to the next. However, the loss in accuracy caused by restricting spike times to the grid can have undesirable consequences, which has led to interest in interpolating spike times between the grid points to retrieve an adequate representation of network dynamics. We demonstrate that the exact integration scheme can be combined naturally with off-grid spike events found by interpolation. We show that by exploiting the existence of a minimal synaptic propagation delay, the need for a central event queue is removed, so that the precision of event-driven simulation on the level of single neurons is combined with the efficiency of time-driven global scheduling. Further, for neuron models with linear subthreshold dynamics, even local event queuing can be avoided, resulting in much greater efficiency on the single-neuron level. These ideas are exemplified by two implementations of a widely used neuron model. We present a measure for the efficiency of network simulations in terms of their integration error and show that for a wide range of input spike rates, the novel techniques we present are both more accurate and faster than standard techniques.

Journal ArticleDOI
TL;DR: It is shown how intrinsic and synaptic plasticity mechanisms interact and allow the neuron to discover heavy-tailed directions in the input, and it is demonstrated that intrinsic plasticity may be an alternative explanation for the sliding threshold postulated in the BCM theory of synaptic plasticity.
Abstract: We propose a model of intrinsic plasticity for a continuous activation model neuron based on information theory. We then show how intrinsic and synaptic plasticity mechanisms interact and allow the neuron to discover heavy-tailed directions in the input. We also demonstrate that intrinsic plasticity may be an alternative explanation for the sliding threshold postulated in the BCM theory of synaptic plasticity. We present a theoretical analysis of the interaction of intrinsic plasticity with different Hebbian learning rules for the case of clustered inputs. Finally, we perform experiments on the “bars” problem, a popular nonlinear independent component analysis problem.

Journal ArticleDOI
TL;DR: In this paper, a simple biophysical model for the coupling between synaptic transmission and the local calcium concentration on an enveloping astrocytic domain is presented, which suggests a novel, testable hypothesis for the spike timing statistics measured for rapidly firing cells in culture experiments.
Abstract: We present a simple biophysical model for the coupling between synaptic transmission and the local calcium concentration on an enveloping astrocytic domain. This interaction enables the astrocyte to modulate the information flow from presynaptic to postsynaptic cells in a manner dependent on previous activity at this and other nearby synapses. Our model suggests a novel, testable hypothesis for the spike timing statistics measured for rapidly firing cells in culture experiments.

Journal ArticleDOI
TL;DR: A theoretical network analysis is described that can distinguish statistically causal interactions in population neural activity leading to a specific output, and the concept of a causal core is introduced to refer to the set of neuronal interactions that are causally significant for the output, as assessed by Granger causality.
Abstract: We describe a theoretical network analysis that can distinguish statistically causal interactions in population neural activity leading to a specific output. We introduce the concept of a causal core to refer to the set of neuronal interactions that are causally significant for the output, as assessed by Granger causality. Because our approach requires extensive knowledge of neuronal connectivity and dynamics, an illustrative example is provided by analysis of Darwin X, a brain-based device that allows precise recording of the activity of neuronal units during behavior. In Darwin X, a simulated neuronal model of the hippocampus and surrounding cortical areas supports learning of a spatial navigation task in a real environment. Analysis of Darwin X reveals that large repertoires of neuronal interactions contain comparatively small causal cores and that these causal cores become smaller during learning, a finding that may reflect the selection of specific causal pathways from diverse neuronal repertoires.
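
The building block of the analysis, pairwise Granger causality, can be sketched in a few lines: compare the residual variance of predicting y from its own past against predicting it from its own past plus x's past. The model order p and the plain least-squares fit here are simplifying assumptions.

```python
import numpy as np

def granger_causality(x, y, p=2):
    """Pairwise Granger causality (sketch): does the past of x improve the
    prediction of y beyond y's own past? Returns log(var_restricted /
    var_full); values > 0 indicate a Granger-causal interaction x -> y."""
    T = len(y)
    target = y[p:]
    # Design matrices: y's own lags, and y's lags plus x's lags
    lags_y = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
    lags_xy = np.column_stack([lags_y] + [x[p - k:T - k] for k in range(1, p + 1)])
    res_r = target - lags_y @ np.linalg.lstsq(lags_y, target, rcond=None)[0]
    res_f = target - lags_xy @ np.linalg.lstsq(lags_xy, target, rcond=None)[0]
    return np.log(res_r.var() / res_f.var())
```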

Journal ArticleDOI
TL;DR: This letter proposes a one-parameter family of integration, called α-integration, which includes all of these well-known integrations as special cases; these are generalizations of various averages of numbers, such as the arithmetic, geometric, and harmonic averages.
Abstract: When there are a number of stochastic models in the form of probability distributions, one needs to integrate them. Mixtures of distributions are frequently used, but exponential mixtures also provide a good means of integration. This letter proposes a one-parameter family of integration, called α-integration, which includes all of these well-known integrations. These are generalizations of various averages of numbers such as arithmetic, geometric, and harmonic averages. There are psychophysical experiments that suggest that α-integrations are used in the brain. The α-divergence between two distributions is defined, which is a natural generalization of Kullback-Leibler divergence and Hellinger distance, and it is proved that α-integration is optimal in the sense of minimizing α-divergence. The theory is applied to generalize the mixture of experts and the product of experts to the α-mixture of experts. The α-predictive distribution is also stated in the Bayesian framework.
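
A small numeric sketch of the α-mean on positive values, using one common convention for the α-representation (f_α(u) = u^((1-α)/2) for α ≠ 1 and log u for α = 1); conventions vary across the literature, so treat this parameterization as an assumption.

```python
import numpy as np

def alpha_mean(p, w, alpha):
    """alpha-integration of positive values p with weights w (sketch).
    With this convention: alpha = -1 gives the arithmetic mean,
    alpha = 1 the geometric mean, alpha = 3 the harmonic mean.
    Weights w are assumed to sum to 1."""
    p, w = np.asarray(p, float), np.asarray(w, float)
    if alpha == 1:
        return np.exp(np.sum(w * np.log(p)))   # geometric mean
    e = (1 - alpha) / 2
    return np.sum(w * p**e) ** (1 / e)
```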

Journal ArticleDOI
TL;DR: An augmented complex-valued extended Kalman filter algorithm for the class of nonlinear adaptive filters realized as fully connected recurrent neural networks is introduced based on some recent developments in the so-called augmented complex statistics and the use of general fully complex nonlinear activation functions within the neurons.
Abstract: An augmented complex-valued extended Kalman filter (ACEKF) algorithm for the class of nonlinear adaptive filters realized as fully connected recurrent neural networks is introduced. This is achieved based on some recent developments in the so-called augmented complex statistics and the use of general fully complex nonlinear activation functions within the neurons. This makes the ACEKF suitable for processing general complex-valued nonlinear and nonstationary signals and also bivariate signals with strong component correlations. Simulations on benchmark and real-world complex-valued signals support the approach.

Journal ArticleDOI
TL;DR: Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of data sets.
Abstract: We present and study the contribution-selection algorithm (CSA), a novel algorithm for feature selection. The algorithm is based on the multiperturbation Shapley analysis (MSA), a framework that relies on game theory to estimate usefulness. The algorithm iteratively estimates the usefulness of features and selects them accordingly, using either forward selection or backward elimination. It can optimize various performance measures over unseen data such as accuracy, balanced error rate, and area under receiver-operator-characteristic curve. Empirical comparison with several other existing feature selection methods shows that the backward elimination variant of CSA leads to the most accurate classification results on an array of data sets.
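
The Shapley-style contribution values that CSA ranks features by can be estimated by sampling permutations, as in this sketch. Here `perf` is a hypothetical callable returning any held-out performance measure of a classifier trained on a feature subset, and the sampling estimator is a generic stand-in for MSA's own procedure.

```python
import numpy as np

def shapley_contributions(features, perf, n_perm=50, seed=0):
    """Permutation-sampling estimate of each feature's Shapley-style
    contribution (sketch): average the marginal performance gain of
    adding the feature to a random coalition of the others."""
    rng = np.random.default_rng(seed)
    contrib = {f: 0.0 for f in features}
    for _ in range(n_perm):
        order = rng.permutation(features)
        coalition, prev = [], perf(())
        for f in order:
            coalition.append(f)
            cur = perf(tuple(coalition))
            contrib[f] += (cur - prev) / n_perm   # marginal gain of adding f
            prev = cur
    return contrib
```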

Journal ArticleDOI
TL;DR: This application shows that even in the presence of strong correlations, the methods constrain precisely the amount of information encoded by real spike trains recorded in vivo, which can provide data-robust upper and lower bounds to the mutual information.
Abstract: The estimation of the information carried by spike times is crucial for a quantitative understanding of brain function, but it is difficult because of an upward bias due to limited experimental sampling. We present new progress, based on two basic insights, on reducing the bias problem. First, we show that by means of a careful application of data-shuffling techniques, it is possible to cancel almost entirely the bias of the noise entropy, the most biased part of information. This procedure provides a new information estimator that is much less biased than the standard direct one and has similar variance. Second, we use a nonparametric test to determine whether all the information encoded by the spike train can be decoded assuming a low-dimensional response model. If this is the case, the complexity of response space can be fully captured by a small number of easily sampled parameters. Combining these two different procedures, we obtain a new class of precise estimators of information quantities, which can provide data-robust upper and lower bounds to the mutual information. These bounds are tight even when the number of trials per stimulus available is one order of magnitude smaller than the number of possible responses. The effectiveness and the usefulness of the methods are tested through applications to simulated data and recordings from somatosensory cortex. This application shows that even in the presence of strong correlations, our methods constrain precisely the amount of information encoded by real spike trains recorded in vivo.

Journal ArticleDOI
TL;DR: A probabilistic interpretation of the slow feature analysis (SFA) algorithm is developed, showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA.
Abstract: The brain extracts useful features from a maelstrom of sensory information, and a fundamental goal of theoretical neuroscience is to work out how it does so. One proposed feature extraction strategy is motivated by the observation that the meaning of sensory data, such as the identity of a moving visual object, is often more persistent than the activation of any single sensory receptor. This notion is embodied in the slow feature analysis (SFA) algorithm, which uses “slowness” as a heuristic by which to extract semantic information from multidimensional time series. Here, we develop a probabilistic interpretation of this algorithm, showing that inference and learning in the limiting case of a suitable probabilistic model yield exactly the results of SFA. Similar equivalences have proved useful in interpreting and extending comparable algorithms such as independent component analysis. For SFA, we use the equivalent probabilistic model as a conceptual springboard with which to motivate several novel extensions to the algorithm.
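
For reference, the deterministic algorithm that the probabilistic model recovers can be sketched in its linear form: whiten the signal, then keep the directions along which the temporal derivative has least variance (the nonlinear version applies the same steps in an expanded feature space). The sketch assumes a full-rank covariance.

```python
import numpy as np

def linear_sfa(X, n_features):
    """Linear slow feature analysis (sketch). X has shape (T, d), rows
    ordered in time; returns the n_features slowest output signals."""
    Xc = X - X.mean(axis=0)
    # Whitening transform from the covariance eigendecomposition
    evals, evecs = np.linalg.eigh(np.cov(Xc.T))
    W = evecs / np.sqrt(evals)          # assumes all eigenvalues > 0
    Z = Xc @ W                          # whitened signal
    dZ = np.diff(Z, axis=0)             # discrete-time derivative
    d_evals, d_evecs = np.linalg.eigh(np.cov(dZ.T))
    # eigh returns ascending eigenvalues: smallest = slowest directions
    return Z @ d_evecs[:, :n_features]
```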

Journal ArticleDOI
TL;DR: A recently introduced policy learning algorithm from machine learning is applied to networks of spiking neurons, and a spike-time-dependent plasticity rule is derived that ensures convergence to a local optimum of the expected average reward.
Abstract: Learning agents, whether natural or artificial, must update their internal parameters in order to improve their behavior over time. In reinforcement learning, this plasticity is influenced by an environmental signal, termed a reward, that directs the changes in appropriate directions. We apply a recently introduced policy learning algorithm from machine learning to networks of spiking neurons and derive a spike-time-dependent plasticity rule that ensures convergence to a local optimum of the expected average reward. The approach is applicable to a broad class of neuronal models, including the Hodgkin-Huxley model. We demonstrate the effectiveness of the derived rule in several toy problems. Finally, through statistical analysis, we show that the synaptic plasticity rule established is closely related to the widely used BCM rule, for which good biological evidence exists.

Journal ArticleDOI
TL;DR: It is concluded that this dimension-reduction method gives ill-posed problems for a wide range of physiological parameters, and future directions are suggested.
Abstract: Computational techniques within the population density function (PDF) framework have provided time-saving alternatives to classical Monte Carlo simulations of neural network activity. Efficiency of the PDF method is lost as the underlying neuron model is made more realistic and the number of state variables increases. In a detailed theoretical and computational study, we elucidate strengths and weaknesses of dimension reduction by a particular moment closure method (Cai, Tao, Shelley, & McLaughlin, 2004; Cai, Tao, Rangan, & McLaughlin, 2006) as applied to integrate-and-fire neurons that receive excitatory synaptic input only. When the unitary postsynaptic conductance event has a single-exponential time course, the evolution equation for the PDF is a partial differential integral equation in two state variables, voltage and excitatory conductance. In the moment closure method, one approximates the conditional kth centered moment of excitatory conductance given voltage by the corresponding unconditioned moment. The result is a system of k coupled partial differential equations with one state variable, voltage, and k coupled ordinary differential equations. Moment closure at k = 2 works well, and at k = 3 works even better, in the regime of high dynamically varying synaptic input rates. Both closures break down at lower synaptic input rates. Phase-plane analysis of the k = 2 problem with typical parameters proves, and reveals why, no steady-state solutions exist below a synaptic input rate that gives a firing rate of 59 s⁻¹ in the full 2D problem. Closure at k = 3 fails for similar reasons. Low firing-rate solutions can be obtained only with parameters for the amplitude or kinetics (or both) of the unitary postsynaptic conductance event that are on the edge of the physiological range. We conclude that this dimension-reduction method gives ill-posed problems for a wide range of physiological parameters, and we suggest future directions.

Journal ArticleDOI
TL;DR: The formulation of synaptic dynamics through an optimality criterion provides a simple graphical argument for the stability of synapses, necessary for synaptic memory.
Abstract: We studied the hypothesis that synaptic dynamics is controlled by three basic principles: (1) synapses adapt their weights so that neurons can effectively transmit information, (2) homeostatic processes stabilize the mean firing rate of the postsynaptic neuron, and (3) weak synapses adapt more slowly than strong ones, while maintenance of strong synapses is costly. Our results show that a synaptic update rule derived from these principles shares features with spike-timing-dependent plasticity, is sensitive to correlations in the input, and is useful for synaptic memory. Moreover, input selectivity (sharply tuned receptive fields) of postsynaptic neurons develops only if stimuli with strong features are presented. Sharply tuned neurons can coexist with unselective ones, and the distribution of synaptic weights can be unimodal or bimodal. The formulation of synaptic dynamics through an optimality criterion provides a simple graphical argument for the stability of synapses, necessary for synaptic memory.

Journal ArticleDOI
TL;DR: This model exploits arbitrary and reconfigurable connectivity between cells in the multichip architecture, achieved by asynchronously routing neural spike events within and between chips according to a memory-based look-up table.
Abstract: We present a multichip, mixed-signal VLSI system for spike-based vision processing. The system consists of an 80 × 60 pixel neuromorphic retina and a 4800 neuron silicon cortex with 4,194,304 synapses. Its functionality is illustrated with experimental data on multiple components of an attention-based hierarchical model of cortical object recognition, including feature coding, salience detection, and foveation. This model exploits arbitrary and reconfigurable connectivity between cells in the multichip architecture, achieved by asynchronously routing neural spike events within and between chips according to a memory-based look-up table. Synaptic parameters, including conductance and reversal potential, are also stored in memory and are used to dynamically configure synapse circuits within the silicon neurons.

Journal ArticleDOI
TL;DR: The experimental results provide promising evidence that it is possible to successfully employ the proposed algorithm ahead of SVM training, and to select only the patterns that are likely to be located near the decision boundary.
Abstract: The support vector machine (SVM) has been spotlighted in the machine learning community because of its theoretical soundness and practical performance. When applied to a large data set, however, it requires a large memory and a long time for training. To cope with the practical difficulty, we propose a pattern selection algorithm based on neighborhood properties. The idea is to select only the patterns that are likely to be located near the decision boundary. Those patterns are expected to be more informative than the randomly selected patterns. The experimental results provide promising evidence that it is possible to successfully employ the proposed algorithm ahead of SVM training.
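
One hedged reading of the neighborhood idea: a pattern is likely to lie near the decision boundary when its k nearest neighbors carry mixed labels, so only such patterns are passed on to SVM training. The label-disagreement test below is a simplification of the paper's neighborhood properties, not its exact criterion.

```python
import numpy as np

def select_boundary_patterns(X, y, k=5):
    """Neighborhood-based pattern selection (sketch): keep only patterns
    whose k nearest neighbors contain more than one class label, i.e.
    patterns likely to be near the decision boundary."""
    # Pairwise squared Euclidean distances (brute force; fine for a sketch)
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(d2, np.inf)            # exclude self-matches
    keep = []
    for i in range(len(X)):
        nn = np.argsort(d2[i])[:k]
        if len(np.unique(y[nn])) > 1:       # label disagreement -> near boundary
            keep.append(i)
    return np.array(keep)
```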