
Showing papers in "Neural Computation in 2006"


Journal ArticleDOI
TL;DR: A fast, greedy algorithm is derived that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory.
Abstract: We show how to use "complementary priors" to eliminate the explaining-away effects that make inference difficult in densely connected belief nets that have many hidden layers. Using complementary priors, we derive a fast, greedy algorithm that can learn deep, directed belief networks one layer at a time, provided the top two layers form an undirected associative memory. The fast, greedy algorithm is used to initialize a slower learning procedure that fine-tunes the weights using a contrastive version of the wake-sleep algorithm. After fine-tuning, a network with three hidden layers forms a very good generative model of the joint distribution of handwritten digit images and their labels. This generative model gives better digit classification than the best discriminative learning algorithms. The low-dimensional manifolds on which the digits lie are modeled by long ravines in the free-energy landscape of the top-level associative memory, and it is easy to explore these ravines by using the directed connections to display what the associative memory has in mind.

15,055 citations
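The greedy layer-wise procedure rests on training a stack of restricted Boltzmann machines, each on the hidden activities of the layer below, typically with contrastive divergence. A minimal NumPy sketch of a single CD-1 update for one binary RBM layer; all sizes, names, and constants here are illustrative, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.05):
    """One CD-1 step for a binary RBM: visible data v0, weights W,
    visible biases b, hidden biases c (all updated in place)."""
    # Up: hidden probabilities and samples given the data
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Down and up again: one step of alternating Gibbs sampling
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Approximate log-likelihood gradient: data term minus reconstruction term
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / len(v0)
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)

# Toy data: 100 binary "images" of 20 pixels; 10 hidden units
v = (rng.random((100, 20)) < 0.3).astype(float)
W = 0.01 * rng.standard_normal((20, 10))
b, c = np.zeros(20), np.zeros(10)
for epoch in range(50):
    cd1_update(v, W, b, c)
# To grow a deep belief net, the hidden probabilities sigmoid(v @ W + c)
# would serve as the "data" for training the next RBM in the stack.
```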


Journal ArticleDOI
TL;DR: A minimal spiking network that can polychronize, that is, exhibit reproducible time-locked but not synchronous firing patterns with millisecond precision, as in synfire braids is presented.
Abstract: We present a minimal spiking network that can polychronize, that is, exhibit reproducible time-locked but not synchronous firing patterns with millisecond precision, as in synfire braids. The network consists of cortical spiking neurons with axonal conduction delays and spike-timing-dependent plasticity (STDP); a ready-to-use MATLAB code is included. It exhibits sleeplike oscillations, gamma (40 Hz) rhythms, conversion of firing rates to spike timings, and other interesting regimes. Due to the interplay between the delays and STDP, the spiking neurons spontaneously self-organize into groups and generate patterns of stereotypical polychronous activity. To our surprise, the number of coexisting polychronous groups far exceeds the number of neurons in the network, resulting in an unprecedented memory capacity of the system. We speculate on the significance of polychrony to the theory of neuronal group selection (TNGS, neural Darwinism), cognitive neural computations, binding and gamma rhythm, mechanisms of attention, and consciousness as "attention to memories."

1,171 citations
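The paper includes ready-to-use MATLAB code. The sketch below is a Python transcription of the simpler pulse-coupled network from Izhikevich's earlier simple-model paper, without the axonal delays and STDP that produce polychronization, to show the basic ingredients (quadratic membrane equation, reset, random thalamic drive):

```python
import numpy as np

rng = np.random.default_rng(1)

# 800 excitatory, 200 inhibitory neurons with heterogeneous parameters
Ne, Ni = 800, 200
re, ri = rng.random(Ne), rng.random(Ni)
a = np.concatenate([0.02 * np.ones(Ne), 0.02 + 0.08 * ri])
b = np.concatenate([0.2 * np.ones(Ne), 0.25 - 0.05 * ri])
c = np.concatenate([-65 + 15 * re**2, -65 * np.ones(Ni)])
d = np.concatenate([8 - 6 * re**2, 2 * np.ones(Ni)])
S = np.hstack([0.5 * rng.random((Ne + Ni, Ne)), -rng.random((Ne + Ni, Ni))])

v = -65.0 * np.ones(Ne + Ni)   # membrane potential
u = b * v                      # recovery variable
for t in range(1000):          # 1000 ms of simulation, 1 ms steps
    I = np.concatenate([5 * rng.standard_normal(Ne), 2 * rng.standard_normal(Ni)])
    fired = v >= 30.0
    v[fired], u[fired] = c[fired], u[fired] + d[fired]
    I += S[:, fired].sum(axis=1)
    # two 0.5 ms sub-steps for numerical stability, as in the original code
    v += 0.5 * (0.04 * v**2 + 5 * v + 140 - u + I)
    v += 0.5 * (0.04 * v**2 + 5 * v + 140 - u + I)
    u += a * (b * v - u)
```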


Journal ArticleDOI
TL;DR: This article presents an attempt to deconstruct this homunculus through powerful learning mechanisms that allow a computational model of the prefrontal cortex to control both itself and other brain areas in a strategic, task-appropriate manner.
Abstract: The prefrontal cortex has long been thought to subserve both working memory (the holding of information online for processing) and executive functions (deciding how to manipulate working memory and perform processing). Although many computational models of working memory have been developed, the mechanistic basis of executive function remains elusive, often amounting to a homunculus. This article presents an attempt to deconstruct this homunculus through powerful learning mechanisms that allow a computational model of the prefrontal cortex to control both itself and other brain areas in a strategic, task-appropriate manner. These learning mechanisms are based on subcortical structures in the midbrain, basal ganglia, and amygdala, which together form an actor-critic architecture. The critic system learns which prefrontal representations are task relevant and trains the actor, which in turn provides a dynamic gating mechanism for controlling working memory updating. Computationally, the learning mechanism is designed to simultaneously solve the temporal and structural credit assignment problems. The model's performance compares favorably with standard backpropagation-based temporal learning mechanisms on the challenging 1-2-AX working memory task and other benchmark working memory tasks.

971 citations


Journal ArticleDOI
Wei Wu, Yun Gao, Elie Bienenstock, John P. Donoghue, Michael J. Black
TL;DR: A real-time system that uses Bayesian inference techniques to estimate hand motion from the firing rates of multiple neurons, which provides a principled probabilistic model of motor-cortical coding, decodes hand motion in real time, provides an estimate of uncertainty, and is straightforward to implement.
Abstract: Effective neural motor prostheses require a method for decoding neural activity representing desired movement. In particular, the accurate reconstruction of a continuous motion signal is necessary for the control of devices such as computer cursors, robots, or a patient's own paralyzed limbs. For such applications, we developed a real-time system that uses Bayesian inference techniques to estimate hand motion from the firing rates of multiple neurons. In this study, we used recordings that were previously made in the arm area of primary motor cortex in awake behaving monkeys using a chronically implanted multielectrode microarray. Bayesian inference involves computing the posterior probability of the hand motion conditioned on a sequence of observed firing rates; this is formulated in terms of the product of a likelihood and a prior. The likelihood term models the probability of firing rates given a particular hand motion. We found that a linear gaussian model could be used to approximate this likelihood and could be readily learned from a small amount of training data. The prior term defines a probabilistic model of hand kinematics and was also taken to be a linear gaussian model. Decoding was performed using a Kalman filter, which gives an efficient recursive method for Bayesian inference when the likelihood and prior are linear and gaussian. In off-line experiments, the Kalman filter reconstructions of hand trajectory were more accurate than previously reported results. The resulting decoding algorithm provides a principled probabilistic model of motor-cortical coding, decodes hand motion in real time, provides an estimate of uncertainty, and is straightforward to implement. Additionally the formulation unifies and extends previous models of neural coding while providing insights into the motor-cortical code.

457 citations
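Because both the likelihood and the prior are linear-gaussian, decoding reduces to the standard Kalman recursion. A sketch with synthetic data; the matrices and sizes below are placeholders, whereas in the paper A, H, Q, and R are fit to training data by least squares:

```python
import numpy as np

def kalman_decode(z, A, H, Q, R, x0, P0):
    """Causal Kalman-filter reconstruction of hand kinematics x_t from
    binned firing rates z_t, under x_t = A x_{t-1} + w, z_t = H x_t + q."""
    x, P, out = x0, P0, []
    for zt in z:
        # Predict from the kinematic prior
        x, P = A @ x, A @ P @ A.T + Q
        # Update with the observed firing rates
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (zt - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        out.append(x)
    return np.array(out)

rng = np.random.default_rng(2)
T, n_cells = 200, 10
A = np.array([[1.0, 0.03], [0.0, 0.95]])   # toy position-velocity prior
H = rng.standard_normal((n_cells, 2))      # linear tuning (placeholder)
Q, R = 0.01 * np.eye(2), 0.5 * np.eye(n_cells)
x_true = np.zeros((T, 2))
for t in range(1, T):
    x_true[t] = A @ x_true[t - 1] + rng.multivariate_normal(np.zeros(2), Q)
z = x_true @ H.T + rng.multivariate_normal(np.zeros(n_cells), R, size=T)
x_hat = kalman_decode(z, A, H, Q, R, np.zeros(2), np.eye(2))
```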


Journal ArticleDOI
TL;DR: Analysis of the accuracy of the beauty prediction machine as a function of the size of the training data indicates that a machine producing human-like attractiveness ratings could be obtained given a moderately larger data set.
Abstract: This work presents a novel study of the notion of facial attractiveness in a machine learning context. To this end, we collected human beauty ratings for data sets of facial images and used various techniques for learning the attractiveness of a face. The trained predictor achieves a significant correlation of 0.65 with the average human ratings. The results clearly show that facial beauty is a universal concept that a machine can learn. Analysis of the accuracy of the beauty prediction machine as a function of the size of the training data indicates that a machine producing human-like attractiveness ratings could be obtained given a moderately larger data set.

249 citations


Journal ArticleDOI
TL;DR: In this article, the authors use a supervised learning paradigm to derive a synaptic update rule that optimizes by gradient ascent the likelihood of postsynaptic firing at one or several desired firing times.
Abstract: In timing-based neural codes, neurons have to emit action potentials at precise moments in time. We use a supervised learning paradigm to derive a synaptic update rule that optimizes by gradient ascent the likelihood of postsynaptic firing at one or several desired firing times. We find that the optimal strategy of up- and downregulating synaptic efficacies depends on the relative timing between presynaptic spike arrival and desired postsynaptic firing. If the presynaptic spike arrives before the desired postsynaptic spike timing, our optimal learning rule predicts that the synapse should become potentiated. The dependence of the potentiation on spike timing directly reflects the time course of an excitatory postsynaptic potential. However, our approach gives no unique reason for synaptic depression under reversed spike timing. In fact, the presence and amplitude of depression of synaptic efficacies for reversed spike timing depend on how constraints are implemented in the optimization problem. Two different constraints, control of postsynaptic rates and control of temporal locality, are studied. The relation of our results to spike-timing-dependent plasticity and reinforcement learning is discussed.

242 citations
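Under an escape-noise model with firing intensity rho(t) = rho0 * exp((u(t) - theta) / du), the log-likelihood of a spike at the desired time t_des is log rho(t_des) minus the integral of rho, and gradient ascent on it yields the timing-dependent potentiation described above. A discretized sketch; the kernel, constants, and single desired spike are all illustrative:

```python
import numpy as np

dt, T = 1.0, 200.0                            # ms
time = np.arange(0.0, T, dt)
rho0, theta, du, tau = 0.01, 1.0, 0.2, 10.0   # illustrative constants

t_pre = np.array([20.0, 50.0, 90.0])          # presynaptic arrival times (ms)
t_des = 60.0                                  # desired postsynaptic firing time
w = 0.3 * np.ones(len(t_pre))

# E[i, t]: alpha-shaped EPSP evoked by presynaptic spike i
E = np.zeros((len(t_pre), len(time)))
for i, tp in enumerate(t_pre):
    s = time - tp
    E[i, s >= 0] = (s[s >= 0] / tau) * np.exp(1.0 - s[s >= 0] / tau)

k = int(t_des / dt)                           # index of the desired spike time
for step in range(100):
    u = w @ E                                 # membrane potential
    rho = rho0 * np.exp((u - theta) / du)
    # Gradient ascent on  log rho(t_des) - integral of rho
    grad = E[:, k] / du - (rho * E).sum(axis=1) * dt / du
    w += 0.1 * grad
# Synapses whose EPSP peaks just before t_des end up potentiated,
# mirroring the timing dependence discussed in the abstract.
```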


Journal ArticleDOI
TL;DR: It is shown that the information capacity of a heterogeneous network is not limited by the correlated noise but scales linearly with the number of cells in the population, and that an optimal linear readout that takes into account the neuronal heterogeneity can extract most of this information.
Abstract: In many cortical and subcortical areas, neurons are known to modulate their average firing rate in response to certain external stimulus features. It is widely believed that information about the stimulus features is coded by a weighted average of the neural responses. Recent theoretical studies have shown that the information capacity of such a coding scheme is very limited in the presence of the experimentally observed pairwise correlations. However, central to the analysis of these studies was the assumption of a homogeneous population of neurons. Experimental findings show a considerable measure of heterogeneity in the response properties of different neurons. In this study, we investigate the effect of neuronal heterogeneity on the information capacity of a correlated population of neurons. We show that the information capacity of a heterogeneous network is not limited by the correlated noise, but scales linearly with the number of cells in the population. This information cannot be extracted by the population vector readout, whose accuracy is greatly suppressed by the correlated noise. On the other hand, we show that an optimal linear readout that takes into account the neuronal heterogeneity can extract most of this information. We study analytically the nature of the dependence of the optimal linear readout weights on the neuronal diversity. We show that simple online learning can generate readout weights with the appropriate dependence on the neuronal diversity, thereby yielding efficient readout.

242 citations
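A toy version of the argument: with a shared noise component, a uniform population-vector-style readout saturates, while the optimal linear readout w proportional to C^{-1} g exploits the heterogeneous gains g. All numbers below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)
N, trials = 100, 2000
# Heterogeneous gains: the diversity the readout must exploit
g = 1.0 + 0.3 * rng.standard_normal(N)
# Correlated noise: shared component plus private noise
corr, sig = 0.3, 1.0
C = sig**2 * ((1 - corr) * np.eye(N) + corr * np.ones((N, N)))

s = rng.standard_normal(trials)                 # stimulus values
L = np.linalg.cholesky(C)
r = np.outer(s, g) + (L @ rng.standard_normal((N, trials))).T

# Optimal linear readout: w proportional to C^{-1} g, normalized to be unbiased
w = np.linalg.solve(C, g)
w /= w @ g
s_ole = r @ w
# Naive uniform readout (population-vector-like) for comparison
u = np.ones(N) / g.sum()
s_pv = r @ u
print("optimal readout error:", np.var(s_ole - s))
print("uniform readout error:", np.var(s_pv - s))
```

Increasing N shrinks the optimal readout's error roughly as 1/N, while the uniform readout's error saturates at the level set by the shared noise.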


Journal ArticleDOI
TL;DR: Noise is applied for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration, and the resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
Abstract: The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.

236 citations
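The modification is small: after the elite update of the sampling distribution, extra noise Z_t is added to the variance so it cannot shrink to zero too early. A generic sketch, where the toy objective stands in for a Tetris policy evaluation and the decreasing schedule is one of the kinds examined in the paper:

```python
import numpy as np

rng = np.random.default_rng(4)

def score(w):
    """Placeholder for a policy evaluation, e.g. the average Tetris score
    of a linear feature-weighting policy with weights w."""
    return -np.sum((w - 3.0)**2)      # toy objective with optimum at w = 3

dim, n, elite = 10, 100, 10
mu, sigma2 = np.zeros(dim), 100.0 * np.ones(dim)
for t in range(100):
    samples = mu + np.sqrt(sigma2) * rng.standard_normal((n, dim))
    scores = np.array([score(w) for w in samples])
    best = samples[np.argsort(scores)[-elite:]]   # elite fraction
    mu = best.mean(axis=0)
    # Key modification: add noise Z_t to the variance so the search
    # distribution cannot collapse prematurely onto a suboptimal policy.
    Zt = max(5.0 - t / 10.0, 0.0)     # a decreasing noise schedule
    sigma2 = best.var(axis=0) + Zt
```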


Journal ArticleDOI
TL;DR: This is the first time that a fully variational Bayesian treatment for multiclass GP classification has been developed without having to resort to additional explicit approximations to the nongaussian likelihood term.
Abstract: It is well known in the statistics literature that augmenting binary and polychotomous response models with gaussian latent variables enables exact Bayesian analysis via Gibbs sampling from the parameter posterior. By adopting such a data augmentation strategy, dispensing with priors over regression coefficients in favor of gaussian process (GP) priors over functions, and employing variational approximations to the full posterior, we obtain efficient computational methods for GP classification in the multiclass setting. The model augmentation with additional latent variables ensures full a posteriori class coupling while retaining the simple a priori independent GP covariance structure from which sparse approximations, such as multiclass informative vector machines (IVM), emerge in a natural and straightforward manner. This is the first time that a fully variational Bayesian treatment for multiclass GP classification has been developed without having to resort to additional explicit approximations to the nongaussian likelihood term. Empirical comparisons with exact analysis based on Markov chain Monte Carlo (MCMC) and with Laplace approximations illustrate the utility of the variational approximation as a computationally economical alternative to full MCMC; the variational approach is also shown to be more accurate than the Laplace approximation.

229 citations


Journal ArticleDOI
TL;DR: A new boosting algorithm, AdaBoost.RT, is described, which requires selecting the suboptimal value of the error threshold to demarcate examples as poorly or well predicted for regression problems.
Abstract: The application of the boosting technique to regression problems has received relatively little attention in contrast to research aimed at classification problems. This letter describes a new boosting algorithm, AdaBoost.RT, for regression problems. Its idea is to filter out examples whose relative estimation error is higher than a preset threshold value, and then to follow the AdaBoost procedure. Thus, it requires selecting the suboptimal value of the error threshold to demarcate examples as poorly or well predicted. Some experimental results using the M5 model tree as a weak learning machine for several benchmark data sets are reported. The results are compared to other boosting methods, bagging, artificial neural networks, and a single M5 model tree. The preliminary empirical comparisons show higher performance of AdaBoost.RT for most of the considered data sets.

194 citations
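A compact sketch of the scheme described above, using scikit-learn's CART regressor in place of the paper's M5 model tree; the threshold phi, tree depth, and the power n on the error rate are illustrative choices:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def adaboost_rt(X, y, phi=0.1, n_rounds=10, power=2):
    """Sketch of AdaBoost.RT: examples whose absolute relative error
    exceeds the threshold phi are treated as 'misclassified'."""
    m = len(y)
    D = np.ones(m) / m
    learners, betas = [], []
    for _ in range(n_rounds):
        h = DecisionTreeRegressor(max_depth=4).fit(X, y, sample_weight=D)
        are = np.abs((h.predict(X) - y) / y)       # absolute relative error
        eps = D[are > phi].sum()                   # weighted 'error rate'
        beta = max(eps, 1e-12) ** power
        # Down-weight well-predicted examples, then renormalize
        D = np.where(are <= phi, D * beta, D)
        D /= D.sum()
        learners.append(h)
        betas.append(beta)
    def predict(Xq):
        logw = np.log(1.0 / np.array(betas))       # confidence of each round
        preds = np.array([h.predict(Xq) for h in learners])
        return (logw[:, None] * preds).sum(axis=0) / logw.sum()
    return predict

# Toy usage on an invented regression problem
rng = np.random.default_rng(5)
X = rng.uniform(1, 10, size=(300, 3))
y = X[:, 0] * np.sin(X[:, 1]) + X[:, 2]
predict = adaboost_rt(X, y)
```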


Journal ArticleDOI
TL;DR: This model features three interacting populations of cortical neurons and is described by a six-dimensional nonlinear dynamical system, which leads to a compact description of the oscillatory behaviors observed in Jansen and Rit (1995) and Wendling, Bellanger, Bartolomei, and Chauvel (2000).
Abstract: We present a mathematical model of a neural mass developed by a number of people, including Lopes da Silva and Jansen. This model features three interacting populations of cortical neurons and is described by a six-dimensional nonlinear dynamical system. We address some aspects of its behavior through a bifurcation analysis with respect to the input parameter of the system. This leads to a compact description of the oscillatory behaviors observed in Jansen and Rit (1995) (alpha activity) and Wendling, Bellanger, Bartolomei, and Chauvel (2000) (spike-like epileptic activity). In the case of small or slow variation of the input, the model can even be described as a binary unit. Again using the bifurcation framework, we discuss the influence of other parameters of the system on the behavior of the neural mass model.
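For reference, the six-dimensional system in question is the standard Jansen-Rit model. A forward-Euler sketch with the commonly used constants; the input p is the bifurcation parameter studied in the paper:

```python
import numpy as np

rng = np.random.default_rng(6)

# Jansen-Rit neural mass: pyramidal cells plus excitatory and inhibitory
# interneurons; states y0..y2 and their derivatives y3..y5.
A, B = 3.25, 22.0            # mV, average synaptic gains
a, b = 100.0, 50.0           # 1/s, inverse synaptic time constants
v0, e0, r = 6.0, 2.5, 0.56   # sigmoid parameters
C = 135.0
C1, C2, C3, C4 = C, 0.8 * C, 0.25 * C, 0.25 * C

def S(v):                    # population firing-rate sigmoid
    return 2 * e0 / (1 + np.exp(r * (v0 - v)))

def jansen_rit(y, p):
    y0, y1, y2, y3, y4, y5 = y
    return np.array([
        y3, y4, y5,
        A * a * S(y1 - y2) - 2 * a * y3 - a**2 * y0,
        A * a * (p + C2 * S(C1 * y0)) - 2 * a * y4 - a**2 * y1,
        B * b * C4 * S(C3 * y0) - 2 * b * y5 - b**2 * y2,
    ])

dt, T = 1e-4, 2.0
y, trace = np.zeros(6), []
for t in np.arange(0, T, dt):
    p = 120.0 + 30.0 * rng.standard_normal()  # input firing rate
    y = y + dt * jansen_rit(y, p)             # forward Euler, for illustration
    trace.append(y[1] - y[2])                 # model EEG: net PSP on pyramids
```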

Journal ArticleDOI
TL;DR: The new theory assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data.
Abstract: Although the responses of dopamine neurons in the primate midbrain are well characterized as carrying a temporal difference (TD) error signal for reward prediction, existing theories do not offer a credible account of how the brain keeps track of past sensory events that may be relevant to predicting future reward. Empirically, these shortcomings of previous theories are particularly evident in their account of experiments in which animals were exposed to variation in the timing of events. The original theories mispredicted the results of such experiments due to their use of a representational device called a tapped delay line. Here we propose that a richer understanding of history representation and a better account of these experiments can be given by considering TD algorithms for a formal setting that incorporates two features not originally considered in theories of the dopaminergic response: partial observability (a distinction between the animal's sensory experience and the true underlying state of the world) and semi-Markov dynamics (an explicit account of variation in the intervals between events). The new theory situates the dopaminergic system in a richer functional and anatomical context, since it assumes (in accord with recent computational theories of cortex) that problems of partial observability and stimulus history are solved in sensory cortex using statistical modeling and inference and that the TD system predicts reward using the results of this inference rather than raw sensory data. It also accounts for a range of experimental data, including the experiments involving programmed temporal variability and other previously unmodeled dopaminergic response phenomena, which we suggest are related to subjective noise in animals' interval timing. Finally, it offers new experimental predictions and a rich theoretical framework for designing future experiments.

Journal ArticleDOI
TL;DR: It is shown that Volterra and Wiener series can be represented implicitly as elements of a reproducing kernel Hilbert space by using polynomial kernels.
Abstract: Volterra and Wiener series are perhaps the best-understood nonlinear system representations in signal processing. Although both approaches have enjoyed a certain popularity in the past, their application has been limited to rather low-dimensional and weakly nonlinear systems due to the exponential growth of the number of terms that have to be estimated. We show that Volterra and Wiener series can be represented implicitly as elements of a reproducing kernel Hilbert space by using polynomial kernels. The estimation complexity of the implicit representation is linear in the input dimensionality and independent of the degree of nonlinearity. Experiments show performance advantages in terms of convergence, interpretability, and system sizes that can be handled.
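Concretely, a discrete Volterra series of degree p over an input window of length m is a linear function of all monomials of the window up to degree p, and the inhomogeneous polynomial kernel k(x, x') = (1 + x.x')^p supplies exactly that feature space implicitly, so kernel ridge regression estimates the whole series at once. A sketch with an invented toy system:

```python
import numpy as np

rng = np.random.default_rng(7)
m, p, lam = 10, 3, 1e-3
x = rng.standard_normal(600)
# Toy system to identify: a linear filter followed by a static nonlinearity
lin = np.convolve(x, [0.5, 0.3, 0.1], mode="same")
y = lin + 0.2 * lin**2 - 0.1 * lin**3

# Time-delay embedding: each row is one input window
X = np.array([x[t - m:t] for t in range(m, len(x))])
Y = y[m:]
K = (1.0 + X @ X.T) ** p                  # implicit degree-p Volterra expansion
alpha = np.linalg.solve(K + lam * np.eye(len(K)), Y)

def predict(X_new):
    return ((1.0 + X_new @ X.T) ** p) @ alpha

print(np.corrcoef(predict(X), Y)[0, 1])   # fit quality on the training data
```

The cost of the solve grows with the number of samples, not with the number of Volterra coefficients, which is the complexity advantage the abstract refers to.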

Journal ArticleDOI
TL;DR: This work implements and evaluates critically an event-driven algorithm (ED-LUT) that uses precalculated look-up tables to characterize synaptic and neuronal dynamics, and introduces an improved two-stage event-queue algorithm, which allows the simulations to scale efficiently to highly connected networks with arbitrary propagation delays.
Abstract: Nearly all neuronal information processing and interneuronal communication in the brain involves action potentials, or spikes, which drive the short-term synaptic dynamics of neurons, but also their long-term dynamics, via synaptic plasticity. In many brain structures, action potential activity is considered to be sparse. This sparseness of activity has been exploited to reduce the computational cost of large-scale network simulations, through the development of event-driven simulation schemes. However, existing event-driven simulation schemes use extremely simplified neuronal models. Here, we implement and critically evaluate an event-driven algorithm (ED-LUT) that uses precalculated look-up tables to characterize synaptic and neuronal dynamics. This approach enables the use of more complex (and realistic) neuronal models or data in representing the neurons, while retaining the advantage of high-speed simulation. We demonstrate the method's application for neurons containing exponential synaptic conductances, thereby implementing shunting inhibition, a phenomenon that is critical to cellular computation. We also introduce an improved two-stage event-queue algorithm, which allows the simulations to scale efficiently to highly connected networks with arbitrary propagation delays. Finally, the scheme readily accommodates implementation of synaptic plasticity mechanisms that depend on spike timing, enabling future simulations to explore issues of long-term learning and adaptation in large-scale networks.
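The essence of an event-driven scheme: neuron state is advanced analytically only when an event arrives, and a priority queue orders future events by delivery time. A minimal LIF sketch; ED-LUT replaces the closed-form update below with precalculated look-up tables so that richer neuron models can be used:

```python
import heapq
import math

tau_m, v_thresh, v_reset = 20.0, 1.0, 0.0   # illustrative LIF constants

class Neuron:
    def __init__(self):
        self.v, self.t_last = 0.0, 0.0
    def update_to(self, t):
        # Analytic decay between events; no clock-driven integration needed
        self.v *= math.exp(-(t - self.t_last) / tau_m)
        self.t_last = t

def simulate(neurons, synapses, initial_events, t_end):
    """synapses[i] = list of (target_index, weight, delay)."""
    queue = list(initial_events)            # heap of (time, target, weight)
    heapq.heapify(queue)
    spikes = []
    while queue:
        t, i, w = heapq.heappop(queue)
        if t > t_end:
            break
        n = neurons[i]
        n.update_to(t)
        n.v += w
        if n.v >= v_thresh:                 # threshold can only be crossed at events
            n.v = v_reset
            spikes.append((t, i))
            for j, wj, dj in synapses[i]:   # schedule delayed deliveries
                heapq.heappush(queue, (t + dj, j, wj))
    return spikes

neurons = [Neuron() for _ in range(3)]
synapses = {0: [(1, 1.2, 2.0)], 1: [(2, 1.2, 2.0)], 2: []}
print(simulate(neurons, synapses, [(1.0, 0, 1.5)], t_end=50.0))
```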

Journal ArticleDOI
TL;DR: An expectation-maximization algorithm for fitting LDS models to experimental data is presented and the difficulties inherent in estimating the parameters associated with feedback-driven learning are described.
Abstract: Recent studies have employed simple linear dynamical systems to model trial-by-trial dynamics in various sensorimotor learning tasks. Here we explore the theoretical and practical considerations that arise when employing the general class of linear dynamical systems (LDS) as a model for sensorimotor learning. In this framework, the state of the system is a set of parameters that define the current sensorimotor transformation—the function that maps sensory inputs to motor outputs. The class of LDS models provides a first-order approximation for any Markovian (state-dependent) learning rule that specifies the changes in the sensorimotor transformation that result from sensory feedback on each movement. We show that modeling the trial-by-trial dynamics of learning provides a substantially enhanced picture of the process of adaptation compared to measurements of the steady state of adaptation derived from more traditional blocked-exposure experiments. Specifically, these models can be used to quantify sensory and performance biases, the extent to which learned changes in the sensorimotor transformation decay over time, and the portion of motor variability due to either learning or performance variability. We show that previous attempts to fit such models with linear regression have not generally yielded consistent parameter estimates. Instead, we present an expectation-maximization algorithm for fitting LDS models to experimental data and describe the difficulties inherent in estimating the parameters associated with feedback-driven learning. Finally, we demonstrate the application of these methods in a simple sensorimotor learning experiment: adaptation to shifted visual feedback during reaching.
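A sketch of the generative picture for a shifted-feedback experiment, with invented gains; simulating it makes clear that the learning state is latent and noisy, which is why naive regression on the observed outputs gives biased parameter estimates and an EM algorithm (with a Kalman smoother in the E-step) is needed:

```python
import numpy as np

rng = np.random.default_rng(8)

# Trial-by-trial LDS (symbols follow the generic state-space form, not the
# paper's exact parameterization):
#   x_{t+1} = A x_t + B * error_t + eta_t   (state = sensorimotor mapping)
#   y_t     = x_t + eps_t                   (observed reach endpoint)
T = 200
A, Bfb = 0.98, -0.1        # retention and error-correction gains (invented)
shift = 1.0                # imposed visual shift (the perturbation)
x = np.zeros(T)            # internal correction (latent)
y = np.zeros(T)            # observed output
for t in range(T - 1):
    y[t] = x[t] + rng.normal(0, 0.1)          # performance noise
    err = y[t] + shift                        # visual error under the shift
    x[t + 1] = A * x[t] + Bfb * err + rng.normal(0, 0.02)
# Steady state compensates Bfb*shift/(1 - A - Bfb) of the shift (~83% here);
# recovering A and Bfb from y alone requires the EM procedure in the paper.
```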

Journal ArticleDOI
TL;DR: An energy-based model is presented that uses a product of generalized Student-t distributions to capture the statistical structure in data sets; constraining the interactions within the model makes it possible to study the topographic organization of the Gabor-like receptive fields that the model learns.
Abstract: We present an energy-based model that uses a product of generalized Student-t distributions to capture the statistical structure in data sets. This model is inspired by and particularly applicable to "natural" data sets such as images. We begin by providing the mathematical framework, where we discuss complete and overcomplete models and provide algorithms for training these models from data. Using patches of natural scenes, we demonstrate that our approach represents a viable alternative to independent component analysis as an interpretive model of biological visual systems. Although the two approaches are similar in flavor, there are also important differences, particularly when the representations are overcomplete. By constraining the interactions within our model, we are also able to study the topographic organization of Gabor-like receptive fields that our model learns. Finally, we discuss the relation of our new approach to previous work—in particular, gaussian scale mixture models and variants of independent components analysis.

Journal ArticleDOI
TL;DR: The aim of this letter is to provide a more comprehensive understanding of the mechanisms by which the asynchronous states in large, fully connected networks of inhibitory neurons are destabilized as a function of the noise level.
Abstract: GABAergic interneurons play a major role in the emergence of various types of synchronous oscillatory patterns of activity in the central nervous system. Motivated by these experimental facts, modeling studies have investigated mechanisms for the emergence of coherent activity in networks of inhibitory neurons. However, most of these studies have focused either on the case in which the noise in the network is absent or weak or on the opposite situation in which it is strong. Hence, a full picture of how noise affects the dynamics of such systems is still lacking. The aim of this letter is to provide a more comprehensive understanding of the mechanisms by which the asynchronous states in large, fully connected networks of inhibitory neurons are destabilized as a function of the noise level. Three types of single-neuron models are considered: the leaky integrate-and-fire (LIF) model, the exponential integrate-and-fire (EIF) model, and conductance-based models involving sodium and potassium Hodgkin-Huxley (HH) currents. We show that in all models, the instabilities of the asynchronous state can be classified in two classes. The first one consists of clustering instabilities, which exist in a restricted range of noise. These instabilities lead to synchronous patterns in which the population of neurons is broken into clusters of synchronously firing neurons. The irregularity of the firing patterns of the neurons is weak. The second class of instabilities, termed oscillatory firing rate instabilities, exists at any value of noise. They lead to cluster states at low noise. As the noise is increased, the instability occurs at larger coupling, and the pattern of firing that emerges becomes more irregular. In the regime of high noise and strong coupling, these instabilities lead to stochastic oscillations in which neurons fire in an approximately Poisson way with a common instantaneous probability of firing that oscillates in time.

Journal ArticleDOI
TL;DR: It is shown that the old principle of pseudolikelihood estimation provides an estimator that is computationally very simple yet statistically consistent in the basic case of fully visible Boltzmann machines.
Abstract: A Boltzmann machine is a classic model of neural computation, and a number of methods have been proposed for its estimation. Most methods are plagued by either very slow convergence or asymptotic bias in the resulting estimates. Here we consider estimation in the basic case of fully visible Boltzmann machines. We show that the old principle of pseudolikelihood estimation provides an estimator that is computationally very simple yet statistically consistent.
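For a fully visible Boltzmann machine with states in {-1, +1}, each conditional is logistic, p(x_i | x_rest) = sigmoid(2 x_i (b_i + sum_j W_ij x_j)), so the pseudolikelihood and its gradient are available in closed form. A gradient-ascent sketch with illustrative sizes and learning rate:

```python
import numpy as np

rng = np.random.default_rng(9)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pseudolikelihood_grad(X, W, b):
    """Gradient of the log-pseudolikelihood sum_i log p(x_i | x_rest)
    for a fully visible Boltzmann machine with states in {-1, +1}."""
    m = X @ W + b                                # local fields (diag(W) = 0)
    g = (1.0 - sigmoid(2.0 * X * m)) * 2.0 * X   # d/dm_i of each conditional
    gW = X.T @ g / len(X)
    gW = 0.5 * (gW + gW.T)                       # respect the symmetry of W
    np.fill_diagonal(gW, 0.0)
    return gW, g.mean(axis=0)

# Toy data and gradient ascent (consistency holds as the sample grows)
d, n = 8, 5000
X = np.where(rng.random((n, d)) < 0.5, 1.0, -1.0)
W, b = np.zeros((d, d)), np.zeros(d)
for _ in range(200):
    gW, gb = pseudolikelihood_grad(X, W, b)
    W += 0.1 * gW
    b += 0.1 * gb
```

Unlike maximum likelihood, no partition function or Monte Carlo sampling appears anywhere in the update, which is the computational simplicity the abstract highlights.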

Journal ArticleDOI
TL;DR: A new learning algorithm is presented that leverages oscillations in the strength of neural inhibition to train neural networks; it can memorize large numbers of correlated input patterns without collapsing and shows good generalization to test patterns that do not exactly match studied patterns.
Abstract: We present a new learning algorithm that leverages oscillations in the strength of neural inhibition to train neural networks. Raising inhibition can be used to identify weak parts of target memories, which are then strengthened. Conversely, lowering inhibition can be used to identify competitors, which are then weakened. To update weights, we apply the Contrastive Hebbian Learning equation to successive time steps of the network. The sign of the weight change equation varies as a function of the phase of the inhibitory oscillation. We show that the learning algorithm can memorize large numbers of correlated input patterns without collapsing and that it shows good generalization to test patterns that do not exactly match studied patterns.

Journal ArticleDOI
TL;DR: In this article, the dynamics of a class of delayed neural networks with discontinuous activation functions are discussed, and a relaxed set of sufficient conditions is derived, guaranteeing the existence, uniqueness, and global stability of the equilibrium point.
Abstract: In this letter, without assuming the boundedness of the activation functions, we discuss the dynamics of a class of delayed neural networks with discontinuous activation functions. A relaxed set of sufficient conditions is derived, guaranteeing the existence, uniqueness, and global stability of the equilibrium point. Convergence behaviors for both state and output are discussed. The constraints imposed on the feedback matrix are independent of the delay parameter and can be validated by the linear matrix inequality technique. We also prove that the solution of delayed neural networks with discontinuous activation functions can be regarded as a limit of the solutions of delayed neural networks with high-slope continuous activation functions.

Journal ArticleDOI
TL;DR: This letter begins a systematic study of the global parameter space structure of continuous-time recurrent neural networks (CTRNNs), a class of neural models that is simple but dynamically universal.
Abstract: A fundamental challenge for any general theory of neural circuits is how to characterize the structure of the space of all possible circuits over a given model neuron. As a first step in this direction, this letter begins a systematic study of the global parameter space structure of continuous-time recurrent neural networks (CTRNNs), a class of neural models that is simple but dynamically universal. First, we explicitly compute the local bifurcation manifolds of CTRNNs. We then visualize the structure of these manifolds in net input space for small circuits. These visualizations reveal a set of extremal saddle node bifurcation manifolds that divide CTRNN parameter space into regions of dynamics with different effective dimensionality. Next, we completely characterize the combinatorics and geometry of an asymptotically exact approximation to these regions for circuits of arbitrary size. Finally, we show how these regions can be used to calculate estimates of the probability of encountering different kinds of dynamics in CTRNN parameter space.
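For concreteness, the model class is tau_i dy_i/dt = -y_i + sum_j w_ij sigma(y_j + theta_j) + I_i. The sketch below integrates a two-neuron circuit at two net-input settings; the parameter values are invented, chosen only so that moving the net input across a saddle-node manifold changes which attractor is reached:

```python
import numpy as np

def sigma(x):
    return 1.0 / (1.0 + np.exp(-x))

def ctrnn_trajectory(w, theta, I, tau, y0, dt=0.01, steps=5000):
    """Forward-Euler integration of a continuous-time recurrent
    neural network: tau dy/dt = -y + w sigma(y + theta) + I."""
    y, out = np.array(y0, dtype=float), []
    for _ in range(steps):
        dy = (-y + w @ sigma(y + theta) + I) / tau
        y = y + dt * dy
        out.append(y.copy())
    return np.array(out)

# Two-neuron circuit with strong self-excitation (bistable regime)
w = np.array([[5.0, -1.0], [1.0, 5.0]])
theta = np.array([-2.5, -2.5])
for I in ([0.0, 0.0], [2.0, 2.0]):
    traj = ctrnn_trajectory(w, theta, np.array(I), tau=1.0, y0=[0.1, 0.1])
    print(I, "->", traj[-1])   # equilibrium reached at this net input
```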

Journal ArticleDOI
TL;DR: In this paper, an n-neuron cellular neural network with time-varying delay is shown to have 2^n periodic orbits located in saturation regions, and these periodic orbits are locally exponentially attractive.
Abstract: We show that an n-neuron cellular neural network with time-varying delay can have 2^n periodic orbits located in saturation regions, and that these periodic orbits are locally exponentially attractive. In addition, we give some conditions for ascertaining periodic orbits to be locally or globally exponentially attractive and allow them to be located in any designated region. As a special case of exponential periodicity, the exponential stability of delayed cellular neural networks is also characterized. These conditions improve and extend the existing results in the literature. To illustrate and compare the results, simulation results are discussed in three numerical examples.

Journal ArticleDOI
TL;DR: An overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures is given and the natural gradient method is shown to perform well because it takes the singular geometrical structure into account.
Abstract: The parameter spaces of hierarchical systems such as multilayer perceptrons include singularities due to the symmetry and degeneration of hidden units. A parameter space forms a geometrical manifold, called the neuromanifold in the case of neural networks. Such a model is identified with a statistical model, and a Riemannian metric is given by the Fisher information matrix. However, the matrix degenerates at singularities. Such a singular structure is ubiquitous not only in multilayer perceptrons but also in gaussian mixture probability densities, ARMA time-series models, and many other cases. The standard statistical paradigm of the Cramer-Rao theorem does not hold, and the singularity gives rise to strange behaviors in parameter estimation, hypothesis testing, Bayesian inference, model selection, and, in particular, the dynamics of learning from examples. Prevailing theories so far have not paid much attention to the problem caused by singularity, relying only on ordinary statistical theories developed for regular (nonsingular) models. Only recently have researchers remarked on the effects of singularity, and theories are now being developed. This article gives an overview of the phenomena caused by the singularities of statistical manifolds related to multilayer perceptrons and gaussian mixtures. We demonstrate our recent results on these problems. Simple toy models are also used to show explicit solutions. We explain that the maximum likelihood estimator is no longer subject to the gaussian distribution even asymptotically, because the Fisher information matrix degenerates; that model selection criteria such as AIC, BIC, and MDL fail to hold in these models; that a smooth Bayesian prior becomes singular in such models; and that the trajectories of the dynamics of learning are strongly affected by the singularity, causing plateaus or slow manifolds in the parameter space. The natural gradient method is shown to perform well because it takes the singular geometrical structure into account. The generalization error and the training error are studied in some examples.
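The natural gradient preconditions the ordinary gradient with the inverse Fisher information, so steps follow the Riemannian geometry of the model rather than raw parameter coordinates. A minimal sketch on a regular model (logistic regression, where the update coincides with Fisher scoring); the paper's setting is singular models such as MLPs, where this geometry-aware step is what avoids the plateaus:

```python
import numpy as np

rng = np.random.default_rng(10)

# Synthetic logistic-regression data
n, d = 500, 3
X = rng.standard_normal((n, d))
w_true = np.array([1.0, -2.0, 0.5])
p = 1 / (1 + np.exp(-X @ w_true))
y = (rng.random(n) < p).astype(float)

w = np.zeros(d)
for _ in range(50):
    mu = 1 / (1 + np.exp(-X @ w))
    grad = X.T @ (y - mu) / n            # ordinary log-likelihood gradient
    # Fisher information of the conditional model at the current w
    F = (X * (mu * (1 - mu))[:, None]).T @ X / n
    # Natural gradient step: precondition by the inverse Fisher matrix
    w += np.linalg.solve(F + 1e-8 * np.eye(d), grad)
print(w)                                  # close to w_true after a few steps
```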

Journal ArticleDOI
TL;DR: A state equation is derived and illustrated that captures this basic dependency between target and path, enabling the use of estimation to study how brain regions that relate variously to target and path together specify a trajectory.
Abstract: The execution of reaching movements involves the coordinated activity of multiple brain regions that relate variously to the desired target and a path of arm states to achieve that target. These arm states may represent positions, velocities, torques, or other quantities. Estimation has been previously applied to neural activity in reconstructing the target separately from the path. However, the target and path are not independent. Because arm movements are limited by finite muscle contractility, knowledge of the target constrains the path of states that leads to the target. In this letter, we derive and illustrate a state equation to capture this basic dependency between target and path. The solution is described for discrete-time linear systems and gaussian increments with known target arrival time. The resulting analysis enables the use of estimation to study how brain regions that relate variously to target and path together specify a trajectory. The corresponding reconstruction procedure may also be useful in brain-driven prosthetic devices to generate control signals for goal-directed movements.

Journal ArticleDOI
TL;DR: The amplitude and signal-to-noise ratio of the coherence between input and output are analyzed for single-unit versus multi-unit activity, together with how they are affected by the duration of the recording; the results agree with experimental data obtained from monkey visual cortex (V4).
Abstract: The purpose of this study was to obtain a better understanding of neuronal responses to correlated input, in particular focusing on the aspect of synchronization of neuronal activity. The first aim was to obtain an analytical expression for the coherence between the output spike train and correlated input and for the coherence between output spike trains of neurons with correlated input. For Poisson neurons, we could derive that the peak of the coherence between the correlated input and multi-unit activity increases proportionally with the square root of the number of neurons in the multi-unit recording. The coherence between two typical multi-unit recordings (2 to 10 single units) with partially correlated input increases proportionally with the number of units in the multi-unit recordings. The second aim of this study was to investigate to what extent the amplitude and signal-to-noise ratio of the coherence between input and output varied for single-unit versus multi-unit activity and how they are affected by the duration of the recording. The same problem was addressed for the coherence between two single-unit spike series and between two multi-unit spike series. The analytical results for the Poisson neuron and numerical simulations for the conductance-based leaky integrate-and-fire neuron and for the conductance-based Hodgkin-Huxley neuron show that the expectation value of the coherence function does not increase for a longer duration of the recording. The only effect of a longer duration of the spike recording is a reduction of the noise in the coherence function. The results of analytical derivations and computer simulations for model neurons show that the coherence for multi-unit activity is larger than that for single-unit activity. This is in agreement with the results of experimental data obtained from monkey visual cortex (V4). Finally, we show that multitaper techniques greatly contribute to a more accurate estimate of the coherence by reducing the bias and variance in the coherence estimate.

Journal ArticleDOI
TL;DR: A number of extensions of the classical leaky IF neuron model involving approximations of the membrane equation with conductance-based synaptic current are proposed, which lead to simple analytic expressions for the membrane state and therefore can be used in the event-driven framework.
Abstract: Event-driven simulation strategies were proposed recently to simulate integrate-and-fire (IF) type neuronal models. These strategies can lead to computationally efficient algorithms for simulating large-scale networks of neurons; most important, such approaches are more precise than traditional clock-driven numerical integration approaches because the timing of spikes is treated exactly. The drawback of such event-driven methods is that in order to be efficient, the membrane equations must be solvable analytically, or at least provide simple analytic approximations for the state variables describing the system. This requirement prevents, in general, the use of conductance-based synaptic interactions within the framework of event-driven simulations and, thus, the investigation of network paradigms where synaptic conductances are important. We propose here a number of extensions of the classical leaky IF neuron model involving approximations of the membrane equation with conductance-based synaptic current, which lead to simple analytic expressions for the membrane state and therefore can be used in the event-driven framework. These conductance-based IF (gIF) models are compared to commonly used models, such as the leaky IF model or biophysical models in which conductances are explicitly integrated. All models are compared with respect to various spiking response properties in the presence of synaptic activity, such as the spontaneous discharge statistics, the temporal precision in resolving synaptic inputs, and gain modulation under in vivo-like synaptic bombardment. Being based on the passive membrane equation with fixed-threshold spike generation, the proposed gIF models are situated in between leaky IF and biophysical models but are much closer to the latter with respect to their dynamic behavior and response characteristics, while still being nearly as computationally efficient as simple IF neuron models. gIF models should therefore provide a useful tool for efficient and precise simulation of large-scale neuronal networks with realistic, conductance-based synaptic interactions.

Journal ArticleDOI
TL;DR: Three structurally similar approaches (localized learning, concave-convex learning, and winner-relaxing learning) can be applied to both algorithms, and it is shown that the NG results are valid for any data dimension, whereas in the SOM case, the results hold only for the one-dimensional case.
Abstract: We consider different ways to control the magnification in self-organizing maps (SOM) and neural gas (NG). Starting from early approaches of magnification control in vector quantization, we then concentrate on different approaches for SOM and NG. We show that three structurally similar approaches can be applied to both algorithms: localized learning, concave-convex learning, and winner-relaxing learning. Thereby, the approach of concave-convex learning in SOM is extended to a more general description, whereas the concave-convex learning for NG is new. In general, the control mechanisms generate only slightly different behavior when comparing the two neural algorithms. However, we emphasize that the NG results are valid for any data dimension, whereas in the SOM case, the results hold only for the one-dimensional case.

Journal ArticleDOI
TL;DR: The effect of odor stimuli on the first odor-processing network in the honeybee brain, the antennal lobe (which corresponds to the vertebrate olfactory bulb), is analyzed; the modifiable fluctuations in spontaneous activity that follow a stimulus could provide an ideal substrate for Hebbian reverberations and sensory memory in other neural systems.
Abstract: Sensory memory is a short-lived persistence of a sensory stimulus in the nervous system, such as iconic memory in the visual system. However, little is known about the mechanisms underlying olfactory sensory memory. We have therefore analyzed the effect of odor stimuli on the first odor-processing network in the honeybee brain, the antennal lobe, which corresponds to the vertebrate olfactory bulb. We stained output neurons with a calcium-sensitive dye and measured across-glomerular patterns of spontaneous activity before and after a stimulus. Such a single-odor presentation changed the relative timing of spontaneous activity across glomeruli in accordance with Hebb's theory of learning. Moreover, during the first few minutes after odor presentation, correlations between the spontaneous activity fluctuations suffice to reconstruct the stimulus. As spontaneous activity is ubiquitous in the brain, modifiable fluctuations could provide an ideal substrate for Hebbian reverberations and sensory memory in other neural systems.

Journal ArticleDOI
TL;DR: The new technique, called potential support vector machine (P-SVM), is a large-margin method for the construction of classifiers and regression functions for the column objects and can handle data and kernel matrices that are neither positive definite nor square.
Abstract: We describe a new technique for the analysis of dyadic data, where two sets of objects (row and column objects) are characterized by a matrix of numerical values that describe their mutual relationships. The new technique, called potential support vector machine (P-SVM), is a large-margin method for the construction of classifiers and regression functions for the column objects. Contrary to standard support vector machine approaches, the P-SVM minimizes a scale-invariant capacity measure and requires a new set of constraints. As a result, the P-SVM method leads to a usually sparse expansion of the classification and regression functions in terms of the row rather than the column objects and can handle data and kernel matrices that are neither positive definite nor square. We then describe two complementary regularization schemes. The first scheme improves generalization performance for classification and regression tasks; the second scheme leads to the selection of a small, informative set of row support objects and can be applied to feature selection. Benchmarks for classification, regression, and feature selection tasks are performed with toy data as well as with several real-world data sets. The results show that the new method is at least competitive with but often performs better than the benchmarked standard methods for standard vectorial as well as true dyadic data sets. In addition, a theoretical justification is provided for the new approach.

Journal ArticleDOI
Volker Roth
TL;DR: The method presented in this letter bridges the gap between kernelized one-class classification and gaussian density estimation in the induced feature space and it is possible to identify atypical objects by quantifying their deviations from the gaussian model.
Abstract: The problem of detecting atypical objects or outliers is one of the classical topics in (robust) statistics. Recently, it has been proposed to address this problem by means of one-class SVM classifiers. The method presented in this letter bridges the gap between kernelized one-class classification and gaussian density estimation in the induced feature space. Having established the exact relation between the two concepts, it is now possible to identify atypical objects by quantifying their deviations from the gaussian model. This model-based formalization of outliers overcomes the main conceptual shortcoming of most one-class approaches, which, in a strict sense, are unable to detect outliers, since the expected fraction of outliers has to be specified in advance. In order to overcome the inherent model selection problem of unsupervised kernel methods, a cross-validated likelihood criterion for selecting all free model parameters is applied. Experiments for detecting atypical objects in image databases effectively demonstrate the applicability of the proposed method in real-world scenarios.