
Showing papers in "Neural Computation in 2008"


Journal ArticleDOI
TL;DR: The diffusion decision model is reviewed to show how it translates behavioral data (accuracy, mean response times, and response time distributions) into components of cognitive processing, and its broad range of applications is surveyed, including research in the domains of aging and neurophysiology.
Abstract: The diffusion decision model allows detailed explanations of behavior in two-choice discrimination tasks. In this article, the model is reviewed to show how it translates behavioral data—accuracy, mean response times, and response time distributions—into components of cognitive processing. Three experiments are used to illustrate experimental manipulations of three components: stimulus difficulty affects the quality of information on which a decision is based; instructions emphasizing either speed or accuracy affect the criterial amounts of information that a subject requires before initiating a response; and the relative proportions of the two stimuli affect biases in drift rate and starting point. The experiments also illustrate the strong constraints that ensure the model is empirically testable and potentially falsifiable. The broad range of applications of the model is also reviewed, including research in the domains of aging and neurophysiology.
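
As a rough, self-contained illustration of the model's components (drift rate, boundary separation, starting point, nondecision time), the Python sketch below simulates single trials of a two-boundary diffusion process; all parameter names and default values are illustrative, not taken from the article.

```python
import numpy as np

def simulate_ddm(drift=0.2, boundary=1.0, start=0.5, noise=1.0,
                 non_decision=0.3, dt=1e-3, rng=None):
    """Simulate one trial of a two-boundary diffusion decision process.

    Evidence starts at `start` (between 0 and `boundary`) and drifts with
    rate `drift` plus gaussian noise until it hits 0 or `boundary`.
    Returns (choice, response_time).
    """
    rng = np.random.default_rng() if rng is None else rng
    x, t = start, 0.0
    while 0.0 < x < boundary:
        x += drift * dt + noise * np.sqrt(dt) * rng.standard_normal()
        t += dt
    choice = 1 if x >= boundary else 0          # upper vs. lower boundary
    return choice, t + non_decision             # add nondecision time

rng = np.random.default_rng(0)
trials = [simulate_ddm(rng=rng) for _ in range(2000)]
acc = np.mean([c for c, _ in trials])
mean_rt = np.mean([rt for _, rt in trials])
print(f"P(upper boundary) = {acc:.2f}, mean RT = {mean_rt:.3f} s")
```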

2,318 citations


Journal ArticleDOI
TL;DR: This work proves that adding hidden units yields strictly improved modeling power, a second theorem shows that RBMs are universal approximators of discrete distributions, and the analysis suggests a new and less greedy criterion for training RBMs within DBNs.
Abstract: Deep belief networks (DBN) are generative neural network models with many layers of hidden explanatory factors, recently introduced by Hinton, Osindero, and Teh (2006) along with a greedy layer-wise unsupervised learning algorithm. The building block of a DBN is a probabilistic model called a restricted Boltzmann machine (RBM), used to represent one layer of the model. Restricted Boltzmann machines are interesting because inference is easy in them and because they have been successfully used as building blocks for training deeper models. We first prove that adding hidden units yields strictly improved modeling power, while a second theorem shows that RBMs are universal approximators of discrete distributions. We then study the question of whether DBNs with more layers are strictly more powerful in terms of representational power. This suggests a new and less greedy criterion for training RBMs within DBNs.
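
To make the RBM building block concrete, here is a minimal binary RBM trained with one step of contrastive divergence (CD-1) on toy data; the theoretical criteria analyzed in the paper are not reproduced here, and the layer sizes, learning rate, and data are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class RBM:
    """Binary restricted Boltzmann machine trained with CD-1 (illustrative)."""
    def __init__(self, n_visible, n_hidden, lr=0.1):
        self.W = 0.01 * rng.standard_normal((n_visible, n_hidden))
        self.b = np.zeros(n_visible)   # visible biases
        self.c = np.zeros(n_hidden)    # hidden biases
        self.lr = lr

    def sample_h(self, v):
        p = sigmoid(v @ self.W + self.c)
        return p, (rng.random(p.shape) < p).astype(float)

    def sample_v(self, h):
        p = sigmoid(h @ self.W.T + self.b)
        return p, (rng.random(p.shape) < p).astype(float)

    def cd1_update(self, v0):
        ph0, h0 = self.sample_h(v0)
        _, v1 = self.sample_v(h0)
        ph1, _ = self.sample_h(v1)
        # positive minus negative phase statistics
        self.W += self.lr * (v0.T @ ph0 - v1.T @ ph1) / len(v0)
        self.b += self.lr * (v0 - v1).mean(axis=0)
        self.c += self.lr * (ph0 - ph1).mean(axis=0)

# Toy data: 4-bit patterns that are either all zeros or all ones.
data = rng.integers(0, 2, size=(200, 1)).repeat(4, axis=1).astype(float)
rbm = RBM(n_visible=4, n_hidden=2)
for _ in range(500):
    rbm.cd1_update(data)
```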

800 citations


Journal ArticleDOI
TL;DR: A locally competitive algorithm (LCA) is described that solves a collection of sparse coding principles minimizing a weighted combination of mean-squared error and a coefficient cost function to produce coefficients with sparsity levels comparable to the most popular centralized sparse coding algorithms while being readily suited for neural implementation.
Abstract: While evidence indicates that neural systems may be employing sparse approximations to represent sensed stimuli, the mechanisms underlying this ability are not understood. We describe a locally competitive algorithm (LCA) that solves a collection of sparse coding principles minimizing a weighted combination of mean-squared error and a coefficient cost function. LCAs are designed to be implemented in a dynamical system composed of many neuron-like elements operating in parallel. These algorithms use thresholding functions to induce local (usually one-way) inhibitory competitions between nodes to produce sparse representations. LCAs produce coefficients with sparsity levels comparable to the most popular centralized sparse coding algorithms while being readily suited for neural implementation. Additionally, LCA coefficients for video sequences demonstrate inertial properties that are both qualitatively and quantitatively more regular (i.e., smoother and more predictable) than the coefficients produced by greedy algorithms.
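
A minimal sketch of LCA-style dynamics, assuming leaky integration of the driving input minus lateral inhibition through dictionary overlaps, with a soft-threshold nonlinearity producing the sparse coefficients; the dictionary, time constant, and threshold below are illustrative choices rather than values from the article.

```python
import numpy as np

def soft_threshold(u, lam):
    """Thresholding function that induces sparsity (soft-threshold variant)."""
    return np.sign(u) * np.maximum(np.abs(u) - lam, 0.0)

def lca(signal, dictionary, lam=0.1, tau=0.01, dt=1e-3, n_steps=500):
    """Locally competitive dynamics for sparse coding (illustrative sketch).

    dictionary: (n_pixels, n_features) with unit-norm columns.
    Returns the sparse coefficient vector a.
    """
    D = dictionary
    b = D.T @ signal                  # driving input to each node
    G = D.T @ D - np.eye(D.shape[1])  # lateral inhibition via feature overlaps
    u = np.zeros(D.shape[1])          # internal states
    for _ in range(n_steps):
        a = soft_threshold(u, lam)    # active coefficients inhibit the others
        u += (dt / tau) * (b - u - G @ a)
    return soft_threshold(u, lam)

rng = np.random.default_rng(0)
D = rng.standard_normal((64, 128))
D /= np.linalg.norm(D, axis=0)                # unit-norm dictionary elements
x = D[:, [3, 40]] @ np.array([1.0, -0.5])     # signal built from two elements
a = lca(x, D)
print("nonzero coefficients:", np.flatnonzero(a))
```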

453 citations


Journal ArticleDOI
TL;DR: The dynamics of spiking neurons can be interpreted as a form of Bayesian inference in time, and firing statistics are close to Poisson, albeit providing a deterministic representation of probabilities.
Abstract: We show that the dynamics of spiking neurons can be interpreted as a form of Bayesian inference in time. Neurons that optimally integrate evidence about events in the external world exhibit properties similar to leaky integrate-and-fire neurons with spike-dependent adaptation and maximally respond to fluctuations of their input. Spikes signal the occurrence of new information---what cannot be predicted from the past activity. As a result, firing statistics are close to Poisson, albeit providing a deterministic representation of probabilities.

310 citations


Journal ArticleDOI
TL;DR: Simulations of larger networks (up to 350,000 neurons) demonstrated that the survival time of self-sustained activity increases exponentially with network size, and a simple mean-field approach correctly predicted an intriguing property of conductance-based networks that does not appear to be shared by current-based models: they exhibit states of low-rate asynchronous irregular activity that persist for some period of time even in the absence of external inputs and without cortical pacemakers.
Abstract: We studied the dynamics of large networks of spiking neurons with conductance-based (nonlinear) synapses and compared them to networks with current-based (linear) synapses. For systems with sparse and inhibition-dominated recurrent connectivity, weak external inputs induced asynchronous irregular firing at low rates. Membrane potentials fluctuated a few millivolts below threshold, and membrane conductances were increased by a factor 2 to 5 with respect to the resting state. This combination of parameters characterizes the ongoing spiking activity typically recorded in the cortex in vivo. Many aspects of the asynchronous irregular state in conductance-based networks could be sufficiently well characterized with a simple numerical mean field approach. In particular, it correctly predicted an intriguing property of conductance-based networks that does not appear to be shared by current-based models: they exhibit states of low-rate asynchronous irregular activity that persist for some period of time even in the absence of external inputs and without cortical pacemakers. Simulations of larger networks (up to 350,000 neurons) demonstrated that the survival time of self-sustained activity increases exponentially with network size.

223 citations


Journal ArticleDOI
TL;DR: It is shown that when the gaussian-like and max-like operations are approximated by the circuit proposed here, the model is capable of generating selective and invariant neural responses and performing object recognition, in good agreement with neurophysiological data.
Abstract: A few distinct cortical operations have been postulated over the past few years, suggested by experimental data on nonlinear neural response across different areas in the cortex. Among these, the energy model proposes the summation of quadrature pairs following a squaring nonlinearity in order to explain phase invariance of complex V1 cells. The divisive normalization model assumes a gain-controlling, divisive inhibition to explain sigmoid-like response profiles within a pool of neurons. A gaussian-like operation hypothesizes a bell-shaped response tuned to a specific, optimal pattern of activation of the presynaptic inputs. A max-like operation assumes the selection and transmission of the most active response among a set of neural inputs. We propose that these distinct neural operations can be computed by the same canonical circuitry, involving divisive normalization and polynomial nonlinearities, for different parameter values within the circuit. Hence, this canonical circuit may provide a unifying framework for several circuit models, such as the divisive normalization and the energy models. As a case in point, we consider a feedforward hierarchical model of the ventral pathway of the primate visual cortex, which is built on a combination of the gaussian-like and max-like operations. We show that when the two operations are approximated by the circuit proposed here, the model is capable of generating selective and invariant neural responses and performing object recognition, in good agreement with neurophysiological data.
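
A schematic version of such a canonical circuit, combining divisive normalization with polynomial nonlinearities; the specific exponents and weights below, which push the response toward a max-like selection or toward a normalized, tuning-like operation, are illustrative assumptions rather than the article's exact parameterization.

```python
import numpy as np

def canonical_circuit(x, w, p=1.0, q=2.0, r=1.0, k=1e-6):
    """Divisive normalization with polynomial nonlinearities (schematic).

    y = sum_j w_j * x_j**p / (k + (sum_j x_j**q)**r)
    Different settings of (p, q, r, w) push the response toward a max-like
    selection or toward a normalized, tuning-like (gaussian-like) operation.
    """
    x = np.asarray(x, dtype=float)
    return float(np.dot(w, x ** p) / (k + np.sum(x ** q) ** r))

x = np.array([0.2, 0.9, 0.4])
w = np.ones_like(x)

# With p = q + 1 and r = 1, the strongest input dominates (max-like behavior).
print("max-like   :", canonical_circuit(x, w, p=4.0, q=3.0, r=1.0),
      "true max:", x.max())

# With p = 1, q = 2, r = 0.5, the output is a normalized dot product, a
# tuning-like operation that is largest when x points along the weight vector.
print("tuning-like:", canonical_circuit(x, w / np.linalg.norm(w), p=1.0, q=2.0, r=0.5))
```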

179 citations


Journal ArticleDOI
TL;DR: To reduce ambiguities of this type of decomposition, updates that can impose sparseness in any combination of modalities are developed, yielding algorithms for sparse nonnegative Tucker decomposition (SN-TUCKER).
Abstract: There is an increasing interest in the analysis of large-scale multiway data. The concept of multiway data refers to arrays of data with more than two dimensions, that is, taking the form of tensors. To analyze such data, decomposition techniques are widely used. The two most common decompositions for tensors are the Tucker model and the more restricted PARAFAC model. Both models can be viewed as generalizations of regular factor analysis to data of more than two modalities. Nonnegative matrix factorization (NMF), in conjunction with sparse coding, has recently been given much attention due to its part-based and easily interpretable representation. While NMF has been extended to the PARAFAC model, no such attempt has been made to extend NMF to the Tucker model. However, if the tensor data analyzed are nonnegative, it may well be relevant to consider purely additive (i.e., nonnegative) Tucker decompositions. To reduce the ambiguities of this type of decomposition, we develop updates that can impose sparseness in any combination of modalities, hence proposing algorithms for sparse nonnegative Tucker decomposition (SN-TUCKER). We demonstrate how the proposed algorithms are superior to existing algorithms for Tucker decompositions when the data and interactions can be considered nonnegative. We further illustrate how sparse coding can help identify which model (PARAFAC or Tucker) is more appropriate for the data as well as select the number of components by turning off excess components. The algorithms for SN-TUCKER can be downloaded from Morup (2007).
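
The SN-TUCKER updates themselves are not reproduced here; as a flavor of the multiplicative, sparsity-promoting updates involved, the sketch below implements sparse NMF (the two-way special case that the Tucker and PARAFAC models generalize) with an L1 penalty on one factor. The sizes, penalty strength, and normalization of W are illustrative choices.

```python
import numpy as np

def sparse_nmf(V, n_components, n_iter=500, lam=0.1, eps=1e-9, seed=0):
    """Sparse nonnegative matrix factorization via multiplicative updates.

    Approximately minimizes ||V - W H||_F^2 + lam * sum(H) with W, H >= 0.
    This is the two-way (matrix) special case; the SN-TUCKER algorithms apply
    multiplicative, sparsity-promoting updates of this kind to the factors
    and core of a Tucker model.
    """
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, n_components))
    H = rng.random((n_components, m))
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + lam + eps)   # L1 sparsity acts on H
        W *= (V @ H.T) / (W @ H @ H.T + eps)
        W /= W.sum(axis=0, keepdims=True) + eps      # fix the scaling ambiguity
    return W, H

# Toy nonnegative data generated from 3 parts but factorized with 5 components;
# the sparsity penalty can drive the energy of excess components toward zero.
rng = np.random.default_rng(1)
V = rng.random((30, 3)) @ rng.random((3, 40))
W, H = sparse_nmf(V, n_components=5)
print("component energies:", np.round(H.sum(axis=1), 2))
```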

170 citations


Journal ArticleDOI
TL;DR: It is shown that exponentially deep belief networks can approximate any distribution over binary vectors to arbitrary accuracy, even when the width of each layer is limited to the dimensionality of the data.
Abstract: In this note, we show that exponentially deep belief networks can approximate any distribution over binary vectors to arbitrary accuracy, even when the width of each layer is limited to the dimensionality of the data. We further show that such networks can be greedily learned in an easy yet impractical way.

151 citations


Journal ArticleDOI
TL;DR: An improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations, including those when rewards are omitted or received early.
Abstract: The phasic firing of dopamine neurons has been theorized to encode a reward-prediction error as formalized by the temporal-difference (TD) algorithm in reinforcement learning. Most TD models of dopamine have assumed a stimulus representation, known as the complete serial compound, in which each moment in a trial is distinctly represented. We introduce a more realistic temporal stimulus representation for the TD model. In our model, all external stimuli, including rewards, spawn a series of internal microstimuli, which grow weaker and more diffuse over time. These microstimuli are used by the TD learning algorithm to generate predictions of future reward. This new stimulus representation injects temporal generalization into the TD model and enhances correspondence between model and data in several experiments, including those when rewards are omitted or received early. This improved fit mostly derives from the absence of large negative errors in the new model, suggesting that dopamine alone can encode the full range of TD errors in these situations.
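
A rough sketch of the microstimulus idea: a cue launches a decaying memory trace, a set of temporal basis functions of that trace serves as the state representation, and standard TD(λ) learning with linear function approximation operates on it. The trace dynamics, basis shapes, and parameter values below are simplified illustrations of the construction described in the abstract, not the article's exact model (for instance, rewards here do not spawn their own microstimuli).

```python
import numpy as np

def microstimuli(trace_height, centers, sigma=0.08):
    """Basis functions of a decaying memory trace (grow weaker and more diffuse)."""
    return trace_height * np.exp(-(trace_height - centers) ** 2 / (2 * sigma ** 2))

def run_trial(w, cue_time=5, reward_time=25, trial_len=40, n_micro=10,
              alpha=0.05, gamma=0.98, lam=0.9, decay=0.9):
    """One trial of TD(lambda) learning with a microstimulus representation."""
    centers = np.linspace(1.0, 0.1, n_micro)
    trace = 0.0                       # memory trace launched by the cue
    x_prev = np.zeros(n_micro)
    e = np.zeros(n_micro)             # eligibility traces
    deltas = []
    for t in range(trial_len):
        if t == cue_time:
            trace = 1.0               # cue onset spawns the trace
        x = microstimuli(trace, centers) if trace > 0 else np.zeros(n_micro)
        r = 1.0 if t == reward_time else 0.0
        delta = r + gamma * (w @ x) - (w @ x_prev)   # TD error
        e = gamma * lam * e + x_prev
        w += alpha * delta * e
        deltas.append(delta)
        x_prev = x
        trace *= decay                # trace decays between time steps
    return w, deltas

w = np.zeros(10)
for _ in range(200):
    w, deltas = run_trial(w)
# TD error at the reward time step (t = 25) shrinks as the reward is predicted.
print("TD error at reward after training: %.3f" % deltas[25])
```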

146 citations


Journal ArticleDOI
TL;DR: It is hypothesized that the amplified competitive advantage of coherent gamma-frequency input to targets with GABAA-receptor-mediated inhibition accounts, at least in part, for the functional significance of the experimentally observed link between attentional biasing of stimulus competition and gamma frequency rhythmicity.
Abstract: More coherent excitatory stimuli are known to have a competitive advantage over less coherent ones. We show here that this advantage is amplified greatly when the target includes inhibitory interneurons acting via GABAA-receptor-mediated synapses and the coherent input oscillates at gamma frequency. We hypothesize that therein lies, at least in part, the functional significance of the experimentally observed link between attentional biasing of stimulus competition and gamma frequency rhythmicity.

144 citations


Journal ArticleDOI
TL;DR: A large class of regularization methods, collectively known as spectral regularization and originally designed for solving ill-posed inverse problems, gives rise to regularized learning algorithms that are consistent kernel methods and can be easily implemented.
Abstract: We discuss how a large class of regularization methods, collectively known as spectral regularization and originally designed for solving ill-posed inverse problems, gives rise to regularized learning algorithms. All of these algorithms are consistent kernel methods that can be easily implemented. The intuition behind their derivation is that the same principle allowing for the numerical stabilization of a matrix inversion problem is crucial to avoid overfitting. The various methods have a common derivation but different computational and theoretical properties. We describe examples of such algorithms, analyze their classification performance on several data sets and discuss their applicability to real-world problems.
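
To make the spectral-filtering view concrete, the sketch below expresses two members of this class, Tikhonov regularization and truncated eigenvalue decomposition, as filters applied to the eigenvalues of a kernel matrix; the RBF kernel, regression data, and parameter values are illustrative assumptions.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=0.5):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def spectral_fit(K, y, filter_fn):
    """Expansion coefficients c = U diag(filter(s)) U^T y, where K = U diag(s) U^T.

    Different members of the spectral regularization family differ only in the
    filter applied to the eigenvalues of the kernel matrix.
    """
    s, U = np.linalg.eigh(K)
    return U @ (filter_fn(s) * (U.T @ y))

def tikhonov_filter(s, lam=1e-2):
    return 1.0 / (s + lam)                 # classical Tikhonov regularization

def truncation_filter(s, k=20):
    f = np.zeros_like(s)
    top = np.argsort(s)[::-1][:k]          # keep only the k largest eigenvalues
    f[top] = 1.0 / s[top]
    return f

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(100)
K = rbf_kernel(X, X)

c_tik = spectral_fit(K, y, tikhonov_filter)
c_cut = spectral_fit(K, y, truncation_filter)

X_test = np.linspace(-3, 3, 5)[:, None]
K_test = rbf_kernel(X_test, X)
print("Tikhonov   predictions:", np.round(K_test @ c_tik, 2))
print("truncation predictions:", np.round(K_test @ c_cut, 2))
```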

Journal ArticleDOI
TL;DR: This letter develops an extension of the RNN to the case when synchronous interactions can occur, leading to synchronous firing by large ensembles of cells, and presents an O(N³) gradient descent learning algorithm for an N-cell recurrent network having both conventional excitatory-inhibitory interactions and synchronous interactions.
Abstract: Large-scale distributed systems, such as natural neuronal and artificial systems, have many local interconnections, but they often also have the ability to propagate information very fast over relatively large distances. Mechanisms that enable such behavior include very long physical signaling paths and possibly saccades of synchronous behavior that may propagate across a network. This letter studies the modeling of such behaviors in neuronal networks and develops a related learning algorithm. This is done in the context of the random neural network (RNN), a probabilistic model with a well-developed mathematical theory, which was inspired by the apparently stochastic spiking behavior of certain natural neuronal systems. Thus, we develop an extension of the RNN to the case when synchronous interactions can occur, leading to synchronous firing by large ensembles of cells. We also present an O(N³) gradient descent learning algorithm for an N-cell recurrent network having both conventional excitatory-inhibitory interactions and synchronous interactions. Finally, the model and its learning algorithm are applied to a resource allocation problem that is NP-hard and requires fast approximate decisions.

Journal ArticleDOI
TL;DR: An online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs) is presented and generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations.
Abstract: We present an online version of the expectation-maximization (EM) algorithm for hidden Markov models (HMMs). The sufficient statistics required for parameter estimation are computed recursively with time, that is, in an online way instead of using the batch forward-backward procedure. This computational scheme is generalized to the case where the model parameters can change with time by introducing a discount factor into the recurrence relations. The resulting algorithm is equivalent to the batch EM algorithm for an appropriate discount factor and scheduling of parameter updates. On the other hand, the online algorithm is able to deal with dynamic environments, i.e., when the statistics of the observed data are changing with time. The implications of the online algorithm for probabilistic modeling in neuroscience are briefly discussed.

Journal ArticleDOI
TL;DR: Full brain (40,000 voxels) single TR (repetition time) classifiers are trained on data from 10 subjects in two different recognition tasks on the most controversial classes of stimuli and show 97.4% median out-of-sample (unseen TRs) generalization.
Abstract: Over the past decade, object recognition work has confounded voxel response detection with potential voxel class identification. Consequently, the claim that there are areas of the brain that are necessary and sufficient for object identification cannot be resolved with existing associative methods (e.g., the general linear model) that are dominant in brain imaging methods. In order to explore this controversy we trained full brain (40,000 voxels) single TR (repetition time) classifiers on data from 10 subjects in two different recognition tasks on the most controversial classes of stimuli (house and face) and show 97.4% median out-of-sample (unseen TRs) generalization. This performance allowed us to reliably and uniquely assay the classifier's voxel diagnosticity in all individual subjects' brains. In this two-class case, there may be specific areas diagnostic for house stimuli (e.g., LO) or for face stimuli (e.g., STS); however, in contrast to the detection results common in this literature, neither the fusiform face area nor parahippocampal place area is shown to be uniquely diagnostic for faces or places, respectively.

Journal ArticleDOI
TL;DR: From the proof of the existence and uniqueness of the solution, it is proved that the solution of a delayed dynamical system with high-slope activations approximates the Filippov solution of the dynamical system with discontinuous activations.
Abstract: We use the concept of the Filippov solution to study the dynamics of a class of delayed dynamical systems with discontinuous right-hand side, which contains the widely studied delayed neural network models with almost periodic self-inhibitions, interconnection weights, and external inputs. We prove that diagonal-dominant conditions can guarantee the existence and uniqueness of an almost periodic solution, as well as its global exponential stability. As special cases, we derive a series of results on the dynamics of delayed dynamical systems with discontinuous activations and periodic coefficients or constant coefficients, respectively. From the proof of the existence and uniqueness of the solution, we prove that the solution of a delayed dynamical system with high-slope activations approximates the Filippov solution of the dynamical system with discontinuous activations.

Journal ArticleDOI
Kar-Ann Toh
TL;DR: By approximating the nonlinear counting step function using a quadratic function, the classification error rate is shown to be deterministically solvable, and empirical results indicate the SLFN's effectiveness on classification generalization.
Abstract: This letter presents a minimum classification error learning formulation for a single-layer feedforward network (SLFN). By approximating the nonlinear counting step function using a quadratic function, the classification error rate is shown to be deterministically solvable. Essentially the derived solution is related to an existing weighted least-squares method with class-specific weights set according to the size of the data set. By considering the class-specific weights as adjustable parameters, the learning formulation extends the classification robustness of the SLFN without sacrificing its intrinsic advantage of being a closed-form algorithm. While the method is applicable to other linear formulations, our empirical results indicate the SLFN's effectiveness on classification generalization.
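
The closed-form flavor of such a solution can be illustrated with a weighted least-squares sketch in which class-specific weights enter a diagonal weighting matrix and the parameters follow from a single regularized normal-equation solve; the weighting rule (inverse class size), ridge term, and data below are illustrative assumptions, not the letter's exact formulation.

```python
import numpy as np

def weighted_ls_classifier(X, y, class_weights, ridge=1e-6):
    """Closed-form weighted least-squares fit of a linear (single-layer) classifier.

    X: (n, d) inputs including a bias column; y: labels in {0, 1}.
    class_weights: dict mapping class label -> weight (e.g., inverse class size).
    Returns w minimizing sum_i d_i (x_i @ w - y_i)^2 + ridge * ||w||^2.
    """
    d = np.array([class_weights[label] for label in y], dtype=float)
    D = np.diag(d)
    A = X.T @ D @ X + ridge * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ D @ y)

rng = np.random.default_rng(0)
n0, n1 = 180, 20                                   # imbalanced two-class problem
X0 = rng.normal([-1, 0], 0.8, size=(n0, 2))
X1 = rng.normal([+1, 0], 0.8, size=(n1, 2))
X = np.hstack([np.vstack([X0, X1]), np.ones((n0 + n1, 1))])  # add bias column
y = np.concatenate([np.zeros(n0), np.ones(n1)])

# Class-specific weights inversely proportional to class size (an assumption).
w = weighted_ls_classifier(X, y, {0.0: 1.0 / n0, 1.0: 1.0 / n1})
pred = (X @ w > 0.5).astype(float)
print("training accuracy:", (pred == y).mean())
```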

Journal ArticleDOI
TL;DR: It is shown that the structure of the connectivity matrix of large recurrent networks of excitatory and inhibitory neurons induces considerable correlations between synaptic currents as well as between subthreshold membrane potentials, provided Dale's principle is respected.
Abstract: The function of cortical networks depends on the collective interplay between neurons and neuronal populations, which is reflected in the correlation of signals that can be recorded at different levels. To correctly interpret these observations it is important to understand the origin of neuronal correlations. Here we study how cells in large recurrent networks of excitatory and inhibitory neurons interact and how the associated correlations affect stationary states of idle network activity. We demonstrate that the structure of the connectivity matrix of such networks induces considerable correlations between synaptic currents as well as between subthreshold membrane potentials, provided Dale's principle is respected. If, in contrast, synaptic weights are randomly distributed, input correlations can vanish, even for densely connected networks. Although correlations are strongly attenuated when proceeding from membrane potentials to action potentials (spikes), the resulting weak correlations in the spike output can cause substantial fluctuations in the population activity, even in highly diluted networks. We show that simple mean-field models that take the structure of the coupling matrix into account can adequately describe the power spectra of the population activity. The consequences of Dale's principle on correlations and rate fluctuations are discussed in the light of recent experimental findings.

Journal ArticleDOI
TL;DR: This work develops a strategy to reduce the dimensionality of the dynamics of a large-size network by utilizing the fact that a continuous attractor can eliminate the noise components perpendicular to the attractor space very quickly, and simplifies it successfully as a one-dimensional Ornstein-Uhlenbeck process.
Abstract: Continuous attractor is a promising model for describing the encoding of continuous stimuli in neural systems. In a continuous attractor, the stationary states of the neural system form a continuous parameter space, on which the system is neutrally stable. This property enables the neural system to track time-varying stimuli smoothly, but it also degrades the accuracy of information retrieval, since these stationary states are easily disturbed by external noise. In this work, based on a simple model, we systematically investigate the dynamics and the computational properties of continuous attractors. In order to analyze the dynamics of a large-size network, which is otherwise extremely complicated, we develop a strategy to reduce its dimensionality by utilizing the fact that a continuous attractor can eliminate the noise components perpendicular to the attractor space very quickly. We therefore project the network dynamics onto the tangent of the attractor space and simplify it successfully as a one-dimensional Ornstein-Uhlenbeck process. Based on this simplified model, we investigate (1) the decoding error of a continuous attractor under the driving of external noisy inputs, (2) the tracking speed of a continuous attractor when external stimulus experiences abrupt changes, (3) the neural correlation structure associated with the specific dynamics of a continuous attractor, and (4) the consequence of asymmetric neural correlation on statistical population decoding. The potential implications of these results on our understanding of neural information processing are also discussed.

Journal ArticleDOI
TL;DR: This work identifies a general form for the solution to the problem of converting unrealistic network models into biologically plausible models that respect this constraint and describes how the precise solution for a given cortical network can be determined empirically.
Abstract: In cortical neural networks, connections from a given neuron are either inhibitory or excitatory but not both. This constraint is often ignored by theoreticians who build models of these systems. There is currently no general solution to the problem of converting such unrealistic network models into biologically plausible models that respect this constraint. We demonstrate a constructive transformation of models that solves this problem for both feedforward and dynamic recurrent networks. The resulting models give a close approximation to the original network functions and temporal dynamics of the system, and they are biologically plausible. More precisely, we identify a general form for the solution to this problem. As a result, we also describe how the precise solution for a given cortical network can be determined empirically.

Journal ArticleDOI
TL;DR: The minimum principle is used to obtain minimum acceleration trajectories with the jerk as a control signal; to find a solution that does not include nonphysiological impulse functions, constraints on the maximum and minimum jerk values are assumed.
Abstract: Rapid arm-reaching movements serve as an excellent test bed for any theory about trajectory formation. How are these movements planned? A minimum acceleration criterion has been examined in the past, and the solution obtained, based on the Euler-Poisson equation, failed to predict that the hand would begin and end the movement at rest (i.e., with zero acceleration). Therefore, this criterion was rejected in favor of the minimum jerk, which was proved to be successful in describing many features of human movements. This letter follows an alternative approach and solves the minimum acceleration problem with constraints using Pontryagin's minimum principle. We use the minimum principle to obtain minimum acceleration trajectories and use the jerk as a control signal. In order to find a solution that does not include nonphysiological impulse functions, constraints on the maximum and minimum jerk values are assumed. The analytical solution provides a three-phase piecewise constant jerk signal (bang-bang control) where the magnitude of the jerk and the two switching times depend on the magnitude of the maximum and minimum available jerk values. This result fits the observed trajectories of reaching movements and takes into account both the extrinsic coordinates and the muscle limitations in a single framework. The minimum acceleration with constraints principle is discussed as a unifying approach for many observations about the neural control of movements.
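
To convey the flavor of the solution, the sketch below constructs a symmetric three-phase piecewise-constant (bang-bang) jerk profile for a rest-to-rest movement and integrates it numerically to check the boundary conditions approximately; the switch times at T/4 and 3T/4 and the jerk magnitude J = 32D/T³ are simplifying assumptions for this symmetric case, whereas the letter derives the general switching times from the available maximum and minimum jerk values.

```python
import numpy as np

def bang_bang_trajectory(D=0.3, T=0.5, n=5001):
    """Integrate a three-phase piecewise-constant (bang-bang) jerk profile.

    Assumes a symmetric rest-to-rest movement of amplitude D and duration T,
    with jerk +J on [0, T/4], -J on [T/4, 3T/4], +J on [3T/4, T] and
    J = 32 * D / T**3 (an illustrative closed form for this symmetric case).
    Returns time, jerk, acceleration, velocity, and position arrays.
    """
    t = np.linspace(0.0, T, n)
    J = 32.0 * D / T ** 3
    jerk = np.where((t < T / 4) | (t >= 3 * T / 4), J, -J)
    dt = t[1] - t[0]
    acc = np.cumsum(jerk) * dt        # crude left-Riemann integration
    vel = np.cumsum(acc) * dt
    pos = np.cumsum(vel) * dt
    return t, jerk, acc, vel, pos

t, jerk, acc, vel, pos = bang_bang_trajectory(D=0.3, T=0.5)
# End acceleration and velocity should be near zero (up to discretization error)
# and the end position near the target amplitude D.
print("end acceleration:", round(acc[-1], 3))
print("end velocity    :", round(vel[-1], 3))
print("end position    :", round(pos[-1], 3), "(target D = 0.3)")
```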

Journal ArticleDOI
TL;DR: In this letter, it is demonstrated how a variational approximation scheme, based on a variational expectation-maximization-type algorithm, enables effective inference of key parameters in the probabilistic L1-PCA model.
Abstract: We introduce a robust probabilistic L1-PCA model in which the conventional gaussian distribution for the noise in the observed data is replaced by the Laplacian distribution (or L1 distribution). Due to the heavy-tail characteristics of the L1 distribution, the proposed model is expected to be more robust against data outliers. In this letter, we demonstrate how a variational approximation scheme enables effective inference of key parameters in the probabilistic L1-PCA model. As the L1 density can be expanded as a superposition of an infinite number of gaussian densities, we express the L1-PCA model as a marginalized model over the superpositions. By doing so, a tractable Bayesian inference can be achieved based on the variational expectation-maximization-type algorithm.

Journal ArticleDOI
TL;DR: It is demonstrated that correlation functions and statistical second-order measures generally exhibit a complex dependence on the filter properties and the statistics of the presynaptic spike trains, both of which can play a significant role in modulating the interaction strength between neurons or neuron populations.
Abstract: Correlated neural activity has been observed at various signal levels (e.g., spike count, membrane potential, local field potential, EEG, fMRI BOLD). Most of these signals can be considered as superpositions of spike trains filtered by components of the neural system (synapses, membranes) and the measurement process. It is largely unknown how the spike train correlation structure is altered by this filtering and what the consequences for the dynamics of the system and for the interpretation of measured correlations are. In this study, we focus on linearly filtered spike trains and particularly consider correlations caused by overlapping presynaptic neuron populations. We demonstrate that correlation functions and statistical second-order measures like the variance, the covariance, and the correlation coefficient generally exhibit a complex dependence on the filter properties and the statistics of the presynaptic spike trains. We point out that both contributions can play a significant role in modulating the interaction strength between neurons or neuron populations. In many applications, the coherence allows a filter-independent quantification of correlated activity. In different network models, we discuss the estimation of network connectivity from the high-frequency coherence of simultaneous intracellular recordings of pairs of neurons.
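
As a small numerical illustration of how filtering shapes second-order measures, the sketch below builds two spike trains that share a jittered common Poisson component, filters both with exponential kernels of different time constants, and compares the resulting correlation coefficients; all rates, time constants, and the shared-input construction are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, T = 1e-3, 200.0                       # time step (s) and duration (s)
n = int(T / dt)

def poisson_train(rate):
    return (rng.random(n) < rate * dt).astype(float)

# Two trains share a common presynaptic Poisson component (overlapping pools);
# in the second train the shared spikes are jittered by ~10 ms, so the input
# cross-correlation function has temporal structure rather than being a delta.
common = poisson_train(5.0)
idx = np.flatnonzero(common)
jit = np.clip(idx + np.round(rng.normal(0, 0.010 / dt, idx.size)).astype(int), 0, n - 1)
common_jit = np.zeros(n)
common_jit[jit] = 1.0
s1 = np.clip(common + poisson_train(5.0), 0, 1)
s2 = np.clip(common_jit + poisson_train(5.0), 0, 1)

def exp_filter(spikes, tau):
    """Causal exponential kernel, a stand-in for synaptic/membrane filtering."""
    kernel = np.exp(-np.arange(0, 5 * tau, dt) / tau)
    return np.convolve(spikes, kernel)[:n] * dt

# Longer filters integrate over the jitter and recover more of the correlation.
for tau in (0.002, 0.02, 0.2):
    x1, x2 = exp_filter(s1, tau), exp_filter(s2, tau)
    print(f"tau = {tau:5.3f} s   correlation coefficient = {np.corrcoef(x1, x2)[0, 1]:.3f}")
```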

Journal ArticleDOI
TL;DR: It is proven that the neural network with a sufficiently high gain is globally convergent to the optimal solution of the linear programming problem.
Abstract: A one-layer recurrent neural network with a discontinuous activation function is proposed for linear programming. The number of neurons in the neural network is equal to that of decision variables in the linear programming problem. It is proven that the neural network with a sufficiently high gain is globally convergent to the optimal solution. Its application to linear assignment is discussed to demonstrate the utility of the neural network. Several simulation examples are given to show the effectiveness and characteristics of the neural network.

Journal ArticleDOI
TL;DR: An extension of the van Rossum metric to a multineuron metric is suggested, giving a measure that is both natural and easy to calculate.
Abstract: The Victor-Purpura spike train metric has recently been extended to a family of multineuron metrics and used to analyze spike trains recorded simultaneously from pairs of proximate neurons. The metric is one of the two metrics commonly used for quantifying the distance between two spike trains; the other is the van Rossum metric. Here, we suggest an extension of the van Rossum metric to a multineuron metric. We believe this gives a metric that is both natural and easy to calculate. Both types of multineuron metric are applied to simulated data and are compared.
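
A sketch of a multineuron van Rossum-style distance in this spirit: each neuron's spike trains are exponentially filtered, the population responses are placed along unit vectors whose pairwise inner product interpolates between labeled-line and summed-population coding, and the distance is the norm of the difference. The kernel width, normalization, and example spike times are illustrative assumptions rather than the letter's exact definition.

```python
import numpy as np

def filtered(spikes, t_grid, tau):
    """Causal exponential filtering of a list of spike times (van Rossum style)."""
    f = np.zeros_like(t_grid)
    for s in spikes:
        f += np.where(t_grid >= s, np.exp(-(t_grid - s) / tau), 0.0)
    return f

def multineuron_vr_distance(trains_a, trains_b, tau=0.012, cos_theta=0.5,
                            t_max=1.0, dt=1e-3):
    """Multineuron van Rossum-style distance between two population responses.

    Each neuron's responses live along unit vectors with pairwise inner product
    cos_theta, interpolating between labeled-line (cos_theta = 0) and
    summed-population (cos_theta = 1) coding.
    """
    t = np.arange(0.0, t_max, dt)
    diffs = [filtered(a, t, tau) - filtered(b, t, tau)
             for a, b in zip(trains_a, trains_b)]
    d2 = 0.0
    for i, di in enumerate(diffs):
        for j, dj in enumerate(diffs):
            inner = 1.0 if i == j else cos_theta
            d2 += inner * np.sum(di * dj) * dt
    return np.sqrt(2.0 * d2 / tau)

# Two recordings from a pair of neurons (spike times in seconds, illustrative).
A = [[0.10, 0.30, 0.55], [0.20, 0.60]]
B = [[0.12, 0.33, 0.50], [0.18, 0.65, 0.90]]
for c in (0.0, 0.5, 1.0):
    print(f"cos_theta = {c:.1f}  distance = {multineuron_vr_distance(A, B, cos_theta=c):.3f}")
```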

Journal ArticleDOI
TL;DR: The trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, are analyzed, and their general properties are shown.
Abstract: We explicitly analyze the trajectories of learning near singularities in hierarchical networks, such as multilayer perceptrons and radial basis function networks, which include permutation symmetry of hidden nodes, and show their general properties. Such symmetry induces singularities in their parameter space, where the Fisher information matrix degenerates and odd learning behaviors, especially the existence of plateaus in gradient descent learning, arise due to the geometric structure of singularity. We plot dynamic vector fields to demonstrate the universal trajectories of learning near singularities. The singularity induces two types of plateaus, the on-singularity plateau and the near-singularity plateau, depending on the stability of the singularity and the initial parameters of learning. The results presented in this letter are universally applicable to a wide class of hierarchical models. Detailed stability analysis of the dynamics of learning in radial basis function networks and multilayer perceptrons will be presented in separate work.

Journal ArticleDOI
TL;DR: This work develops a theory of Bayesian learning in spiking neural networks, where the neurons learn to recognize temporal dynamics of their synaptic inputs, and successive layers of neurons learn hierarchical causal models for the sensory input.
Abstract: In the companion letter in this issue (“Bayesian Spiking Neurons I: Inference”), we showed that the dynamics of spiking neurons can be interpreted as a form of Bayesian integration, accumulating evidence over time about events in the external world or the body. We proceed to develop a theory of Bayesian learning in spiking neural networks, where the neurons learn to recognize temporal dynamics of their synaptic inputs. Meanwhile, successive layers of neurons learn hierarchical causal models for the sensory input. The corresponding learning rule is local, spike-time dependent, and highly nonlinear. This approach provides a principled description of spiking and plasticity rules maximizing information transfer, while limiting the number of costly spikes, between successive layers of neurons.

Journal ArticleDOI
TL;DR: A sensorimotor model is presented that shows that a naive organism can learn the auditory space based solely on acoustic inputs and their relation to motor states, and demonstrates quantitatively that the experience of the sensory consequences of its voluntary motor actions allows an organism to learn the spatial location of any sound source.
Abstract: Sound localization is known to be a complex phenomenon, combining multisensory information processing, experience-dependent plasticity, and movement. Here we present a sensorimotor model that addresses the question of how an organism could learn to localize sound sources without any a priori neural representation of its head-related transfer function or prior experience with auditory spatial information. We demonstrate quantitatively that the experience of the sensory consequences of its voluntary motor actions allows an organism to learn the spatial location of any sound source. Using examples from humans and echolocating bats, our model shows that a naive organism can learn the auditory space based solely on acoustic inputs and their relation to motor states.

Journal ArticleDOI
TL;DR: It is demonstrated that bandlimited stimuli can be faithfully represented with spike trains generated by the ensemble of neurons, and it is shown that recovery is perfect if the number of neurons in the population is larger than a threshold value.
Abstract: We consider a formal model of stimulus encoding with a circuit consisting of a bank of filters and an ensemble of integrate-and-fire neurons. Such models arise in olfactory systems, vision, and hearing. We demonstrate that bandlimited stimuli can be faithfully represented with spike trains generated by the ensemble of neurons. We provide a stimulus reconstruction scheme based on the spike times of the ensemble of neurons and derive conditions for perfect recovery. The key result calls for the spike density of the neural population to be above the Nyquist rate. We also show that recovery is perfect if the number of neurons in the population is larger than a threshold value. Increasing the number of neurons to achieve a faithful representation of the sensory world is consistent with basic neurobiological thought. Finally we demonstrate that in general, the problem of faithful recovery of stimuli from the spike train of single neurons is ill posed. The stimulus can be recovered, however, from the information contained in the spike train of a population of neurons.

Journal ArticleDOI
TL;DR: It is shown that the linear case of SFA can be interpreted as a variant of predictive coding that maximizes the mutual information between the current output of the system and the input signal in the next time step, demonstrating that the slowness principle and predictive coding are intimately related.
Abstract: Understanding the guiding principles of sensory coding strategies is a main goal in computational neuroscience. Among others, the principles of predictive coding and slowness appear to capture aspects of sensory processing. Predictive coding postulates that sensory systems are adapted to the structure of their input signals such that information about future inputs is encoded. Slow feature analysis (SFA) is a method for extracting slowly varying components from quickly varying input signals, thereby learning temporally invariant features. Here, we use the information bottleneck method to state an information-theoretic objective function for temporally local predictive coding. We then show that the linear case of SFA can be interpreted as a variant of predictive coding that maximizes the mutual information between the current output of the system and the input signal in the next time step. This demonstrates that the slowness principle and predictive coding are intimately related.
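
A compact sketch of linear SFA, the case analyzed in the letter: after whitening, the slowest unit-variance feature is the direction along which the time derivative has minimal variance, obtained from an eigenvalue problem. The toy data and mixing matrix are illustrative.

```python
import numpy as np

def linear_sfa(X, dt=1.0):
    """Linear slow feature analysis (illustrative sketch).

    X: (n_samples, n_dims) signal. Returns the weight vector (in the original
    coordinates) of the slowest unit-variance linear feature.
    """
    X = X - X.mean(axis=0)
    # Whiten the signal so that all unit-norm projections have unit variance.
    C = np.cov(X, rowvar=False)
    s, U = np.linalg.eigh(C)
    W = U / np.sqrt(s)                     # whitening matrix: cov(X @ W) = I
    Z = X @ W
    # Minimize the variance of the time derivative among unit-variance features.
    dZ = np.diff(Z, axis=0) / dt
    d, V = np.linalg.eigh(np.cov(dZ, rowvar=False))
    return W @ V[:, 0]                     # eigenvector with smallest eigenvalue

# Toy data: a slow sine and a fast sine mixed linearly into two observed channels.
t = np.linspace(0, 100, 10000)
sources = np.column_stack([np.sin(0.1 * t), np.sin(5.0 * t)])
X = sources @ np.array([[1.0, 0.6], [0.4, 1.0]])
w = linear_sfa(X, dt=t[1] - t[0])
slow = X @ w
print("correlation with the slow source:",
      round(abs(np.corrcoef(slow, sources[:, 0])[0, 1]), 3))
```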

Journal ArticleDOI
TL;DR: A novel paradigm for spike train decoding is proposed, which entirely avoids spike sorting based on waveform measurements and involves an exact expectation EM algorithm that is fast enough that it could also be left to run during decoding to capture potential slow changes in the states of the neurons.
Abstract: We propose a novel paradigm for spike train decoding, which avoids entirely spike sorting based on waveform measurements. This paradigm directly uses the spike train collected at recording electrodes from thresholding the bandpassed voltage signal. Our approach is a paradigm, not an algorithm, since it can be used with any of the current decoding algorithms, such as population vector or likelihood-based algorithms. Based on analytical results and an extensive simulation study, we show that our paradigm is comparable to, and sometimes more efficient than, the traditional approach based on well-isolated neurons and that it remains efficient even when all electrodes are severely corrupted by noise, a situation that would render spike sorting particularly difficult. Our paradigm will also save time and computational effort, both of which are crucially important for successful operation of real-time brain-machine interfaces. Indeed, in place of the lengthy spike-sorting task of the traditional approach, it involves an exact expectation EM algorithm that is fast enough that it could also be left to run during decoding to capture potential slow changes in the states of the neurons.