
Showing papers in "Neural Computation in 2021"


Journal ArticleDOI
TL;DR: This formulation of affective inference offers a principled account of the link between affect, (mental) action, and implicit metacognition and characterizes how a deep biological system can infer its affective state and reduce uncertainty about such inferences through internal action.
Abstract: The positive-negative axis of emotional valence has long been recognized as fundamental to adaptive behavior, but its origin and underlying function have largely eluded formal theorizing and computational modeling. Using deep active inference, a hierarchical inference scheme that rests on inverting a model of how sensory data are generated, we develop a principled Bayesian model of emotional valence. This formulation asserts that agents infer their valence state based on the expected precision of their action model-an internal estimate of overall model fitness ("subjective fitness"). This index of subjective fitness can be estimated within any environment and exploits the domain generality of second-order beliefs (beliefs about beliefs). We show how maintaining internal valence representations allows the ensuing affective agent to optimize confidence in action selection preemptively. Valence representations can in turn be optimized by leveraging the (Bayes-optimal) updating term for subjective fitness, which we label affective charge (AC). AC tracks changes in fitness estimates and lends a sign to otherwise unsigned divergences between predictions and outcomes. We simulate the resulting affective inference by subjecting an in silico affective agent to a T-maze paradigm requiring context learning, followed by context reversal. This formulation of affective inference offers a principled account of the link between affect, (mental) action, and implicit metacognition. It characterizes how a deep biological system can infer its affective state and reduce uncertainty about such inferences through internal action (i.e., top-down modulation of priors that underwrite confidence). Thus, we demonstrate the potential of active inference to provide a formal and computationally tractable account of affect. Our demonstration of the face validity and potential utility of this formulation represents the first step within a larger research program. Next, this model can be leveraged to test the hypothesized role of valence by fitting the model to behavioral and neuronal responses.

92 citations


Journal ArticleDOI
TL;DR: In this article, the authors use numerical simulations to systematically study how essential design parameters of surrogate gradients affect learning performance on a range of classification problems, and they show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives.
Abstract: Brains process information in spiking neural networks. Their intricate connections shape the diverse functions these networks perform. Yet how network connectivity relates to function is poorly understood, and the functional capabilities of models of spiking networks are still rudimentary. The lack of both theoretical insight and practical algorithms to find the necessary connectivity poses a major impediment to both studying information processing in the brain and building efficient neuromorphic hardware systems. The training algorithms that solve this problem for artificial neural networks typically rely on gradient descent. But doing so in spiking networks has remained challenging due to the nondifferentiable nonlinearity of spikes. To avoid this issue, one can employ surrogate gradients to discover the required connectivity. However, the choice of a surrogate is not unique, raising the question of how its implementation influences the effectiveness of the method. Here, we use numerical simulations to systematically study how essential design parameters of surrogate gradients affect learning performance on a range of classification problems. We show that surrogate gradient learning is robust to different shapes of underlying surrogate derivatives, but the choice of the derivative's scale can substantially affect learning performance. When we combine surrogate gradients with suitable activity regularization techniques, spiking networks perform robust information processing at the sparse activity limit. Our study provides a systematic account of the remarkable robustness of surrogate gradient learning and serves as a practical guide to model functional spiking neural networks.
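As a concrete illustration of the method studied here, the snippet below sketches a surrogate spike nonlinearity in PyTorch: the forward pass uses the nondifferentiable Heaviside step, while the backward pass substitutes a fast-sigmoid derivative, one of the surrogate shapes examined in the paper. The scale constant is an illustrative choice, not a recommendation taken from the paper.

    import torch

    class SurrogateSpike(torch.autograd.Function):
        # Heaviside spike in the forward pass; fast-sigmoid surrogate derivative in the backward pass.
        scale = 10.0  # surrogate scale; the paper reports this choice matters more than the shape

        @staticmethod
        def forward(ctx, membrane_potential):
            ctx.save_for_backward(membrane_potential)
            return (membrane_potential > 0).float()

        @staticmethod
        def backward(ctx, grad_output):
            (membrane_potential,) = ctx.saved_tensors
            surrogate = 1.0 / (SurrogateSpike.scale * membrane_potential.abs() + 1.0) ** 2
            return grad_output * surrogate

    spike_fn = SurrogateSpike.apply  # usable inside a spiking network's forward pass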

63 citations


Journal ArticleDOI
TL;DR: In this article, the authors provide an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in reinforcement learning, and an explicit discrete state comparison between active inference and reinforcement learning on an OpenAI gym baseline.
Abstract: Active inference is a first principle account of how autonomous agents operate in dynamic, nonstationary environments. This problem is also considered in reinforcement learning, but limited work exists on comparing the two approaches on the same discrete-state environments. In this letter, we provide (1) an accessible overview of the discrete-state formulation of active inference, highlighting natural behaviors in active inference that are generally engineered in reinforcement learning, and (2) an explicit discrete-state comparison between active inference and reinforcement learning on an OpenAI gym baseline. We begin by providing a condensed overview of the active inference literature, in particular viewing the various natural behaviors of active inference agents through the lens of reinforcement learning. We show that by operating in a pure belief-based setting, active inference agents can carry out epistemic exploration-and account for uncertainty about their environment-in a Bayes-optimal fashion. Furthermore, we show that the reliance on an explicit reward signal in reinforcement learning is removed in active inference, where reward can simply be treated as another observation we have a preference over; even in the total absence of rewards, agent behaviors are learned through preference learning. We make these properties explicit by showing two scenarios in which active inference agents can infer behaviors in reward-free environments compared to both Q-learning and Bayesian model-based reinforcement learning agents and by placing zero prior preferences over rewards and learning the prior preferences over the observations corresponding to reward. We conclude by noting that this formalism can be applied to more complex settings (e.g., robotic arm movement, Atari games) if appropriate generative models can be formulated. In short, we aim to demystify the behavior of active inference agents by presenting an accessible discrete state-space and time formulation and by demonstrating these behaviors in an OpenAI gym environment, alongside reinforcement learning agents.
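To make the discrete-state quantities concrete, the numpy sketch below computes a one-step expected free energy in the common epistemic-plus-pragmatic form; the array names and the preference vector are illustrative assumptions, not the letter's code, and reward enters only as an ordinary observation with a prior preference.

    import numpy as np

    def expected_free_energy(q_s, A, log_C):
        # q_s   : predicted hidden-state distribution under a policy, shape (S,)
        # A     : likelihood matrix P(o|s), shape (O, S)
        # log_C : log prior preferences over observations (reward is just one such observation)
        q_o = A @ q_s                                        # predicted outcome distribution
        joint = A * q_s                                      # P(o|s) Q(s), shape (O, S)
        post = joint / np.maximum(q_o[:, None], 1e-16)       # posterior Q(s|o) for each outcome
        kl = np.sum(post * np.log(np.maximum(post, 1e-16) / np.maximum(q_s, 1e-16)), axis=1)
        epistemic = q_o @ kl                                 # expected information gain
        pragmatic = q_o @ log_C                              # expected log preference
        return -epistemic - pragmatic                        # lower EFE = better policy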

62 citations


Journal ArticleDOI
TL;DR: In this paper, the authors extend the second step of UMAP to a parametric optimization over neural network weights, learning a relationship between data and embedding, and demonstrate that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of learned parametric mapping.
Abstract: UMAP is a nonparametric graph-based dimensionality reduction algorithm using applied Riemannian geometry and algebraic topology to find low-dimensional embeddings of structured data. The UMAP algorithm consists of two steps: (1) computing a graphical representation of a data set (fuzzy simplicial complex) and (2) through stochastic gradient descent, optimizing a low-dimensional embedding of the graph. Here, we extend the second step of UMAP to a parametric optimization over neural network weights, learning a parametric relationship between data and embedding. We first demonstrate that parametric UMAP performs comparably to its nonparametric counterpart while conferring the benefit of a learned parametric mapping (e.g., fast online embeddings for new data). We then explore UMAP as a regularization, constraining the latent distribution of autoencoders, parametrically varying global structure preservation, and improving classifier accuracy for semisupervised learning by capturing structure in unlabeled data.
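A minimal usage sketch, assuming the implementation shipped with the umap-learn package (version 0.5 or later) and its default neural network encoder; the data below are placeholders.

    import numpy as np
    from umap.parametric_umap import ParametricUMAP

    X_train = np.random.rand(1000, 50).astype("float32")   # placeholder training data
    X_new = np.random.rand(100, 50).astype("float32")      # unseen data

    embedder = ParametricUMAP(n_components=2)
    Z_train = embedder.fit_transform(X_train)               # learns the encoder weights
    Z_new = embedder.transform(X_new)                        # fast online embedding via the learned mapping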

50 citations


Journal ArticleDOI
TL;DR: A neural variable risk minimization framework and neural variable optimizers are devised to achieve ANV for conventional network architectures in practice, and the empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
Abstract: Deep learning is often criticized for two serious issues that rarely exist in natural nervous systems: overfitting and catastrophic forgetting. Deep networks can even memorize randomly labeled data, in which there is little knowledge behind the instance-label pairs. When a deep network continually learns over time by accommodating new tasks, it usually quickly overwrites the knowledge learned from previous tasks. In neuroscience, it is well known that human brain reactions exhibit substantial variability even in response to the same stimulus, a phenomenon referred to as neural variability. This mechanism balances accuracy and plasticity/flexibility in the motor learning of natural nervous systems. It thus motivates us to design a similar mechanism, named artificial neural variability (ANV), that helps artificial neural networks learn some advantages from "natural" neural networks. We rigorously prove that ANV acts as an implicit regularizer of the mutual information between the training data and the learned model. This result theoretically guarantees that ANV yields strictly improved generalizability, robustness to label noise, and robustness to catastrophic forgetting. We then devise a neural variable risk minimization (NVRM) framework and neural variable optimizers to achieve ANV for conventional network architectures in practice. The empirical studies demonstrate that NVRM can effectively relieve overfitting, label noise memorization, and catastrophic forgetting at negligible costs.
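The abstract does not spell out the NVRM optimizers, so the following PyTorch sketch only illustrates the general idea of injecting weight variability during training; the function name, noise model, and noise scale are assumptions for illustration, not the authors' method.

    import torch

    def noisy_training_step(model, loss_fn, batch, optimizer, noise_std=0.01):
        # Perturb weights with gaussian noise, compute gradients at the perturbed
        # point, restore the mean weights, then apply the optimizer update.
        x, y = batch
        perturbations = []
        with torch.no_grad():
            for p in model.parameters():
                eps = noise_std * torch.randn_like(p)
                p.add_(eps)
                perturbations.append(eps)
        loss = loss_fn(model(x), y)
        optimizer.zero_grad()
        loss.backward()
        with torch.no_grad():
            for p, eps in zip(model.parameters(), perturbations):
                p.sub_(eps)
        optimizer.step()
        return loss.item()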

34 citations


Journal ArticleDOI
TL;DR: In this paper, the rank of the connectivity matrix and the number of statistically defined populations are independent hyperparameters, and the resulting collective dynamics form a dynamical system, where the rank sets the dimensionality and the population structure shapes the dynamics.
Abstract: An emerging paradigm proposes that neural computations can be understood at the level of dynamic systems that govern low-dimensional trajectories of collective neural activity. How the connectivity structure of a network determines the emergent dynamical system, however, remains to be clarified. Here we consider a novel class of models, gaussian-mixture, low-rank recurrent networks in which the rank of the connectivity matrix and the number of statistically defined populations are independent hyperparameters. We show that the resulting collective dynamics form a dynamical system, where the rank sets the dimensionality and the population structure shapes the dynamics. In particular, the collective dynamics can be described in terms of a simplified effective circuit of interacting latent variables. While having a single global population strongly restricts the possible dynamics, we demonstrate that if the number of populations is large enough, a rank R network can approximate any R-dimensional dynamical system.
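For intuition, a numpy sketch of the underlying model class is given below: a rank-R connectivity matrix built from paired vectors drives low-dimensional collective dynamics tracked by R latent variables. For brevity the sketch uses a single gaussian population; the paper's gaussian-mixture structure would draw the vector entries from several statistically defined populations.

    import numpy as np

    rng = np.random.default_rng(0)
    N, R, dt, steps = 1000, 2, 0.1, 200
    m = rng.standard_normal((N, R))            # output (left) connectivity vectors
    n = rng.standard_normal((N, R))            # input (right) connectivity vectors
    J = (m @ n.T) / N                          # rank-R connectivity matrix

    x = 0.1 * rng.standard_normal(N)
    latents = []
    for _ in range(steps):
        x = x + dt * (-x + J @ np.tanh(x))     # dx/dt = -x + J phi(x)
        latents.append(n.T @ np.tanh(x) / N)   # R latent variables describing the collective state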

32 citations


Journal ArticleDOI
TL;DR: The expected free energy (EFE) is a central quantity in the theory of active inference, and it is the quantity that all active inference agents are mandated to minimize through action as discussed by the authors.
Abstract: The expected free energy (EFE) is a central quantity in the theory of active inference. It is the quantity that all active inference agents are mandated to minimize through action, and its decompos...
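The abstract is cut off above; for orientation, one decomposition commonly quoted in the active inference literature (not necessarily the exact form discussed in this letter) splits the EFE of a policy pi into epistemic and pragmatic terms:

    G(\pi) = -\,\mathbb{E}_{Q(o\mid\pi)}\!\big[ D_{\mathrm{KL}}\!\left[ Q(s\mid o,\pi)\,\|\,Q(s\mid\pi) \right] \big]
             \;-\; \mathbb{E}_{Q(o\mid\pi)}\!\big[ \ln P(o) \big],

where the first term is the (negative) epistemic value, the expected information gain about hidden states, and the second is the (negative) pragmatic value, the expected log preference over outcomes.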

26 citations


Journal ArticleDOI
TL;DR: This work presents WorkMATe, a neural network architecture that models cognitive control over working memory content and learns the appropriate control operations needed to solve complex working memory tasks and provides a new solution for the neural implementation of flexible memory control.
Abstract: Working memory is essential: it serves to guide intelligent behavior of humans and nonhuman primates when task-relevant stimuli are no longer present to the senses. Moreover, complex tasks often require that multiple working memory representations can be flexibly and independently maintained, prioritized, and updated according to changing task demands. Thus far, neural network models of working memory have been unable to offer an integrative account of how such control mechanisms can be acquired in a biologically plausible manner. Here, we present WorkMATe, a neural network architecture that models cognitive control over working memory content and learns the appropriate control operations needed to solve complex working memory tasks. Key components of the model include a gated memory circuit that is controlled by internal actions, encoding sensory information through untrained connections, and a neural circuit that matches sensory inputs to memory content. The network is trained by means of a biologically plausible reinforcement learning rule that relies on attentional feedback and reward prediction errors to guide synaptic updates. We demonstrate that the model successfully acquires policies to solve classical working memory tasks, such as delayed recognition and delayed pro-saccade/anti-saccade tasks. In addition, the model solves much more complex tasks, including the hierarchical 12-AX task or the ABAB ordered recognition task, both of which demand that an agent independently store and update multiple items in memory. Furthermore, the control strategies that the model acquires for these tasks subsequently generalize to new task contexts with novel stimuli, thus bringing symbolic production rule qualities to a neural network architecture. As such, WorkMATe provides a new solution for the neural implementation of flexible memory control.

23 citations


Journal ArticleDOI
TL;DR: Analysis of cognitive robotics research working under the predictive processing framework suggests that research in both cognitive robotics implementations and nonrobotic models needs to be extended to the study of how multiple exteroceptive modalities can be integrated into prediction error minimization schemes.
Abstract: Predictive processing has become an influential framework in cognitive sciences. This framework turns the traditional view of perception upside down, claiming that the main flow of information processing is realized in a top-down, hierarchical manner. Furthermore, it aims at unifying perception, cognition, and action as a single inferential process. However, in the related literature, the predictive processing framework and its associated schemes, such as predictive coding, active inference, perceptual inference, and free-energy principle, tend to be used interchangeably. In the field of cognitive robotics, there is no clear-cut distinction on which schemes have been implemented and under which assumptions. In this letter, working definitions are set with the main aim of analyzing the state of the art in cognitive robotics research working under the predictive processing framework as well as some related nonrobotic models. The analysis suggests that, first, research in both cognitive robotics implementations and nonrobotic models needs to be extended to the study of how multiple exteroceptive modalities can be integrated into prediction error minimization schemes. Second, a relevant distinction found here is that cognitive robotics implementations tend to emphasize the learning of a generative model, while in nonrobotics models, it is almost absent. Third, despite the relevance for active inference, few cognitive robotics implementations examine the issues around control and whether it should result from the substitution of inverse models with proprioceptive predictions. Finally, limited attention has been placed on precision weighting and the tracking of prediction error dynamics. These mechanisms should help to explore more complex behaviors and tasks in cognitive robotics research under the predictive processing framework.

21 citations


Journal ArticleDOI
TL;DR: This work proposes the conductance-based adaptive exponential integrate-and-fire model (CAdEx), a richer alternative to perform network simulations with simplified models reproducing neuronal intrinsic properties, and gives an analysis of the dynamics of the model.
Abstract: The intrinsic electrophysiological properties of single neurons can be described by a broad spectrum of models, from realistic Hodgkin-Huxley-type models with numerous detailed mechanisms to the ph...

19 citations


Journal ArticleDOI
TL;DR: Stable concurrent learning and control of dynamical systems is the subject of adaptive control. Despite being an established field with many practical applications and a rich theory, much of the literature is focused on adaptive control.
Abstract: Stable concurrent learning and control of dynamical systems is the subject of adaptive control. Despite being an established field with many practical applications and a rich theory, much of the de...

Journal ArticleDOI
TL;DR: In this article, the authors provide a comprehensive comparison between replay in the mammalian brain and replay in artificial neural networks, identify multiple aspects of biological replay that are missing in deep learning systems, and hypothesize how they could be used to improve artificial neural networks.
Abstract: Replay is the reactivation of one or more neural patterns that are similar to the activation patterns experienced during past waking experiences. Replay was first observed in biological neural networks during sleep, and it is now thought to play a critical role in memory formation, retrieval, and consolidation. Replay-like mechanisms have been incorporated in deep artificial neural networks that learn over time to avoid catastrophic forgetting of previous knowledge. Replay algorithms have been successfully used in a wide range of deep learning methods within supervised, unsupervised, and reinforcement learning paradigms. In this letter, we provide the first comprehensive comparison between replay in the mammalian brain and replay in artificial neural networks. We identify multiple aspects of biological replay that are missing in deep learning systems and hypothesize how they could be used to improve artificial neural networks.
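For readers coming from the neuroscience side, the deep learning notion of replay referred to here is often as simple as the following buffer, from which past experiences are re-sampled and interleaved with new ones during training (a generic sketch, not taken from the letter).

    import random
    from collections import deque

    class ReplayBuffer:
        def __init__(self, capacity=10000):
            self.buffer = deque(maxlen=capacity)  # oldest experiences are discarded first

        def push(self, state, action, reward, next_state, done):
            self.buffer.append((state, action, reward, next_state, done))

        def sample(self, batch_size):
            # uniformly "replay" stored experiences alongside newly collected ones
            return random.sample(self.buffer, batch_size)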

Journal ArticleDOI
TL;DR: In this paper, the authors use a simple model where the dendrite is implemented as a sequence of thresholded linear units to investigate the impacts of binary branching constraints and repetition of synaptic inputs on neural computation.
Abstract: Physiological experiments have highlighted how the dendrites of biological neurons can nonlinearly process distributed synaptic inputs. However, it is unclear how aspects of a dendritic tree, such as its branched morphology or its repetition of presynaptic inputs, determine neural computation beyond this apparent nonlinearity. Here we use a simple model where the dendrite is implemented as a sequence of thresholded linear units. We manipulate the architecture of this model to investigate the impacts of binary branching constraints and repetition of synaptic inputs on neural computation. We find that models with such manipulations can perform well on machine learning tasks, such as Fashion MNIST or Extended MNIST. We find that model performance on these tasks is limited by binary tree branching and dendritic asymmetry and is improved by the repetition of synaptic inputs to different dendritic branches. These computational experiments further neuroscience theory on how different dendritic properties might determine neural computation of clearly defined tasks.
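A minimal sketch of the model class described here, a binary tree of thresholded linear units feeding a somatic unit; the weights, depth, and the way inputs are split across branches are illustrative assumptions rather than the paper's exact architecture.

    import numpy as np

    def relu(v):
        return np.maximum(0.0, v)

    def two_branch_dendrite(x, W_leaf, w_root):
        # x: input vector, split between two dendritic branches
        # W_leaf: (2, len(x)//2) leaf weights; w_root: (2,) weights at the soma
        half = len(x) // 2
        branches = np.array([
            relu(W_leaf[0] @ x[:half]),
            relu(W_leaf[1] @ x[half:]),
        ])
        return relu(w_root @ branches)    # somatic output of the thresholded-linear tree

    # Repeating synaptic inputs on several branches (as studied in the paper) would
    # correspond to passing overlapping slices of x to different leaves.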

Journal ArticleDOI
TL;DR: In this paper, the authors show that surprise-based learning allows agents to rapidly adapt to nonstationary stochastic environments characterized by sudden changes and that exact Bayesian inference in a hierarchical model gives...
Abstract: Surprise-based learning allows agents to rapidly adapt to nonstationary stochastic environments characterized by sudden changes. We show that exact Bayesian inference in a hierarchical model gives ...

Journal ArticleDOI
Abstract: A new network with super-approximation power is introduced. This network is built with Floor (⌊x⌋) or ReLU (max{0,x}) activation function in each neuron; hence, we call such networks Floor-ReLU net...

Journal ArticleDOI
TL;DR: In this article, the precision of a binary classifier depends on the ratio r of positive to negative cases in the test set, as well as the classifier's true and false-positive rates.
Abstract: In this note, I study how the precision of a binary classifier depends on the ratio r of positive to negative cases in the test set, as well as the classifier's true and false-positive rates. This ...
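The abstract is truncated above; for context, the dependence in question follows directly from the definitions (a standard identity, not quoted from the note). With true-positive rate TPR, false-positive rate FPR, and r the ratio of positive to negative cases in the test set,

    \text{precision} \;=\; \frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}
                     \;=\; \frac{\mathrm{TPR}\cdot r}{\mathrm{TPR}\cdot r + \mathrm{FPR}},
    \qquad r = \frac{\#\,\text{positives}}{\#\,\text{negatives}},

so the same classifier can show very different precision on test sets with different class balance.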

Journal ArticleDOI
TL;DR: In this paper, the authors characterize the spiking activity of individual neurons in a cell ensemble, considering different mechanisms such as synaptic coupling and the spiking activity of the neuron itself and its neighbors.
Abstract: It is of great interest to characterize the spiking activity of individual neurons in a cell ensemble. Many different mechanisms, such as synaptic coupling and the spiking activity of itself and it...

Journal ArticleDOI
TL;DR: It is shown that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions.
Abstract: Nonlinear interactions in the dendritic tree play a key role in neural computation. Nevertheless, modeling frameworks aimed at the construction of large-scale, functional spiking neural networks, such as the Neural Engineering Framework, tend to assume a linear superposition of postsynaptic currents. In this letter, we present a series of extensions to the Neural Engineering Framework that facilitate the construction of networks incorporating Dale's principle and nonlinear conductance-based synapses. We apply these extensions to a two-compartment LIF neuron that can be seen as a simple model of passive dendritic computation. We show that it is possible to incorporate neuron models with input-dependent nonlinearities into the Neural Engineering Framework without compromising high-level function and that nonlinear postsynaptic currents can be systematically exploited to compute a wide variety of multivariate, band-limited functions, including the Euclidean norm, controlled shunting, and nonnegative multiplication. By avoiding an additional source of spike noise, the function approximation accuracy of a single layer of two-compartment LIF neurons is on a par with or even surpasses that of two-layer spiking neural networks up to a certain target function bandwidth.
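For readers unfamiliar with the Neural Engineering Framework, the following numpy sketch shows its basic ingredient, least-squares decoding of a represented variable from tuning-curve activities; it uses rate-based rectified-linear tuning curves purely for brevity and does not include the paper's conductance-based or two-compartment extensions.

    import numpy as np

    rng = np.random.default_rng(1)
    n_neurons, n_samples = 50, 200
    x = np.linspace(-1, 1, n_samples)                 # represented scalar variable
    encoders = rng.choice([-1.0, 1.0], n_neurons)     # preferred directions
    gains = rng.uniform(0.5, 2.0, n_neurons)
    biases = rng.uniform(-1.0, 1.0, n_neurons)

    # Tuning-curve activities A[i, j] of neuron i at sample x[j]
    A = np.maximum(0.0, gains[:, None] * encoders[:, None] * x[None, :] + biases[:, None])
    decoders = np.linalg.lstsq(A.T, x, rcond=None)[0] # least-squares decoders for f(x) = x
    x_hat = A.T @ decoders                            # decoded estimate of x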

Journal ArticleDOI
TL;DR: In this paper, a recursive sparse Bayesian learning (RSBL) algorithm is proposed that combines electromagnetic source imaging (ESI) and independent component analysis (ICA) for M/EEG analysis.
Abstract: Electromagnetic source imaging (ESI) and independent component analysis (ICA) are two popular and apparently dissimilar frameworks for M/EEG analysis. This letter shows that the two frameworks can be linked by choosing biologically inspired source sparsity priors. We demonstrate that ESI carried out by the sparse Bayesian learning (SBL) algorithm yields source configurations composed of a few active regions that are also maximally independent from one another. In addition, we extend the standard SBL approach to source imaging in two important directions. First, we augment the generative model of M/EEG to include artifactual sources. Second, we modify SBL to allow for efficient model inversion with sequential data. We refer to this new algorithm as recursive SBL (RSBL), a source estimation filter with potential for online and offline imaging applications. We use simulated data to verify that RSBL can accurately estimate and demix cortical and artifactual sources under different noise conditions. Finally, we show that on real error-related EEG data, RSBL can yield single-trial source estimates in agreement with the experimental literature. Overall, by demonstrating that ESI can produce maximally independent sources while simultaneously localizing them in cortical space, we bridge the gap between the ESI and ICA frameworks for M/EEG analysis.

Journal ArticleDOI
TL;DR: In this paper, a nonlinear decoding approach for inferring natural scene stimuli from the spiking activities of retinal ganglion cells (RGCs) is presented, which uses neural networks to improve on existing decoders in both accuracy and scalability.
Abstract: Decoding sensory stimuli from neural activity can provide insight into how the nervous system might interpret the physical environment, and facilitates the development of brain-machine interfaces. Nevertheless, the neural decoding problem remains a significant open challenge. Here, we present an efficient nonlinear decoding approach for inferring natural scene stimuli from the spiking activities of retinal ganglion cells (RGCs). Our approach uses neural networks to improve on existing decoders in both accuracy and scalability. Trained and validated on real retinal spike data from more than 1000 simultaneously recorded macaque RGC units, the decoder demonstrates the necessity of nonlinear computations for accurate decoding of the fine structures of visual stimuli. Specifically, high-pass spatial features of natural images can only be decoded using nonlinear techniques, while low-pass features can be extracted equally well by linear and nonlinear methods. Together, these results advance the state of the art in decoding natural stimuli from large populations of neurons.

Journal ArticleDOI
TL;DR: It is demonstrated that DNNs for segmentation with few units have sufficient complexity to solve insideness for any curve, and only recurrent networks trained with small images learn solutions that generalize well to almost any curve.
Abstract: The insideness problem is an aspect of image segmentation that consists of determining which pixels are inside and outside a region. Deep neural networks (DNNs) excel in segmentation benchmarks, but it is unclear if they have the ability to solve the insideness problem as it requires evaluating long-range spatial dependencies. In this letter, we analyze the insideness problem in isolation, without texture or semantic cues, such that other aspects of segmentation do not interfere in the analysis. We demonstrate that DNNs for segmentation with few units have sufficient complexity to solve the insideness problem for any curve. Yet such DNNs have severe problems with learning general solutions. Only recurrent networks trained with small images learn solutions that generalize well to almost any curve. Recurrent networks can decompose the evaluation of long-range dependencies into a sequence of local operations, and learning with small images alleviates the common difficulties of training recurrent networks with a large number of unrolling steps.

Journal ArticleDOI
TL;DR: In this paper, the phase-based functional brain connectivity derived from electroencephalogram (EEG) in a machine learning framework was used to classify the children with autism and typical children in an experimentally obtained data set of 12 autism spectrum disorder (ASD) and 12 typical children.
Abstract: Autism is a psychiatric condition that is typically diagnosed with behavioral assessment methods. Recent years have seen a rise in the number of children with autism. Since this could have serious health and socioeconomic consequences, it is imperative to investigate how to develop strategies for an early diagnosis that might pave the way to an adequate intervention. In this study, phase-based functional brain connectivity derived from the electroencephalogram (EEG) was used in a machine learning framework to classify children with autism and typical children in an experimentally obtained data set of 12 autism spectrum disorder (ASD) and 12 typical children. Specifically, the functional brain connectivity networks were quantitatively characterized by graph-theoretic parameters computed from three proposed approaches based on a standard phase-locking value, which were used as the features in a machine learning environment. Our approach successfully classified the two groups with approximately 95.8% accuracy, 100% sensitivity, and 92% specificity using the trial-averaged phase-locking value (PLV) approach and a cubic support vector machine (SVM). This work has also shown that the aggregated graph-theoretic features reveal significant changes in functional brain connectivity in ASD children in the theta band. Therefore, the findings from this study offer insight into the potential use of functional brain connectivity as a tool for classifying ASD children.
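The core connectivity measure used here, the phase-locking value, can be computed from band-pass-filtered EEG channels as in the following sketch (a standard definition; the paper builds trial-averaged and related variants on top of it).

    import numpy as np
    from scipy.signal import hilbert

    def phase_locking_value(sig_a, sig_b):
        # sig_a, sig_b: band-pass-filtered signals from two EEG channels
        phase_a = np.angle(hilbert(sig_a))
        phase_b = np.angle(hilbert(sig_b))
        return np.abs(np.mean(np.exp(1j * (phase_a - phase_b))))  # 0 = no locking, 1 = perfect locking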

Journal ArticleDOI
TL;DR: In this article, a surrogate gradient approach is proposed to train the LIF units via backpropagation, which can be used to run the neurons in different operating modes, such as simple signal integrators or coincidence detectors.
Abstract: Up to now, modern machine learning (ML) has been based on approximating big data sets with high-dimensional functions, taking advantage of huge computational resources. We show that biologically inspired neuron models such as the leaky-integrate-and-fire (LIF) neuron provide novel and efficient ways of information processing. They can be integrated in machine learning models and are a potential target to improve ML performance. Thus, we have derived simple update rules for LIF units to numerically integrate the differential equations. We apply a surrogate gradient approach to train the LIF units via backpropagation. We demonstrate that tuning the leak term of the LIF neurons can be used to run the neurons in different operating modes, such as simple signal integrators or coincidence detectors. Furthermore, we show that the constant surrogate gradient, in combination with tuning the leak term of the LIF units, can be used to achieve the learning dynamics of more complex surrogate gradients. To prove the validity of our method, we applied it to established image data sets (the Oxford 102 flower data set, MNIST), implemented various network architectures, used several input data encodings and demonstrated that the method is suitable to achieve state-of-the-art classification performance. We provide our method as well as further surrogate gradient methods to train spiking neural networks via backpropagation as an open-source KERAS package to make it available to the neuroscience and machine learning community. To increase the interpretability of the underlying effects and thus make a small step toward opening the black box of machine learning, we provide interactive illustrations, with the possibility of systematically monitoring the effects of parameter changes on the learning characteristics.
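A discrete-time LIF update of the kind described (leaky integration, threshold, reset) can be written in a few lines; the form below is a generic sketch consistent with the abstract, not the paper's exact Keras implementation.

    import numpy as np

    def lif_step(v, x, leak=0.9, threshold=1.0):
        # v: membrane potentials, x: input currents at this time step
        v = leak * v + x                        # leak near 1: signal integrator; small leak: coincidence detector
        spikes = (v >= threshold).astype(float)
        v = v * (1.0 - spikes)                  # reset membrane potential where a spike occurred
        return v, spikes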

Journal ArticleDOI
TL;DR: In this paper, spatial semantic pointers (SSPs) are used to model dynamical systems involving multiple objects represented in a symbol-like manner and integrated with deep neural networks to predict the future of physical trajectories.
Abstract: While neural networks are highly effective at learning task-relevant representations from data, they typically do not learn representations with the kind of symbolic structure that is hypothesized to support high-level cognitive processes, nor do they naturally model such structures within problem domains that are continuous in space and time. To fill these gaps, this work exploits a method for defining vector representations that bind discrete (symbol-like) entities to points in continuous topological spaces in order to simulate and predict the behavior of a range of dynamical systems. These vector representations are spatial semantic pointers (SSPs), and we demonstrate that they can (1) be used to model dynamical systems involving multiple objects represented in a symbol-like manner and (2) be integrated with deep neural networks to predict the future of physical trajectories. These results help unify what have traditionally appeared to be disparate approaches in machine learning.
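The SSP construction referenced here is usually implemented by fractional binding, exponentiating the Fourier coefficients of a fixed base vector by the continuous coordinate; the sketch below is an illustrative, approximate version of that idea, not code from the paper.

    import numpy as np

    d = 256
    phases = np.random.uniform(-np.pi, np.pi, d)
    base_fft = np.exp(1j * phases)                  # unit-modulus Fourier coefficients of a base vector

    def ssp(x):
        # Fractional binding: raise the Fourier coefficients to the power x.
        # Taking the real part is an approximation; exact real-valued SSPs impose
        # conjugate symmetry on the phase spectrum.
        return np.fft.ifft(base_fft ** x).real

    loc_a, loc_b = ssp(2.5), ssp(2.6)
    print(np.dot(loc_a, loc_b))   # nearby coordinates produce similar (high-similarity) vectors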

Journal ArticleDOI
TL;DR: In this article, the authors used scalp electroencephalography (EEG) to study the brain's ability to maintain task focus over extended periods of time (Mackworth, 1948; Chun, Golomb, & Turk-Browne, 2011).
Abstract: Sustained attention is a cognitive ability to maintain task focus over extended periods of time (Mackworth, 1948; Chun, Golomb, & Turk-Browne, 2011). In this study, scalp electroencephalography (EE...

Journal ArticleDOI
TL;DR: In this paper, the authors utilize maximum caliber, a dynamical inference principle, to build a minimal yet general model of the collective (mean field) dynamics of large populations of neurons.
Abstract: The relationship between complex brain oscillations and the dynamics of individual neurons is poorly understood. Here we utilize maximum caliber, a dynamical inference principle, to build a minimal yet general model of the collective (mean field) dynamics of large populations of neurons. In agreement with previous experimental observations, we describe a simple, testable mechanism, involving only a single type of neuron, by which many of these complex oscillatory patterns may emerge. Our model predicts that the refractory period of neurons, which has often been neglected, is essential for these behaviors.

Journal ArticleDOI
TL;DR: In this paper, the authors explore the possibility that cortical microcircuits implement canonical correlation analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections.
Abstract: Cortical pyramidal neurons receive inputs from multiple distinct neural populations and integrate these inputs in separate dendritic compartments. We explore the possibility that cortical microcircuits implement canonical correlation analysis (CCA), an unsupervised learning method that projects the inputs onto a common subspace so as to maximize the correlations between the projections. To this end, we seek a multichannel CCA algorithm that can be implemented in a biologically plausible neural network. For biological plausibility, we require that the network operates in the online setting and its synaptic update rules are local. Starting from a novel CCA objective function, we derive an online optimization algorithm whose optimization steps can be implemented in a single-layer neural network with multicompartmental neurons and local non-Hebbian learning rules. We also derive an extension of our online CCA algorithm with adaptive output rank and output whitening. Interestingly, the extension maps onto a neural network whose neural architecture and synaptic updates resemble neural circuitry and non-Hebbian plasticity observed in the cortex.
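For reference, classical offline CCA can be computed by whitening and an SVD as below; the letter's contribution is an online network that optimizes an equivalent objective with local, non-Hebbian synaptic updates, which this sketch does not attempt to reproduce.

    import numpy as np

    def cca(X, Y, k=1, eps=1e-8):
        # X: (samples, dx), Y: (samples, dy); returns top-k canonical directions and correlations
        X = X - X.mean(axis=0)
        Y = Y - Y.mean(axis=0)
        Cxx = X.T @ X / len(X) + eps * np.eye(X.shape[1])
        Cyy = Y.T @ Y / len(Y) + eps * np.eye(Y.shape[1])
        Cxy = X.T @ Y / len(X)
        Lx = np.linalg.cholesky(Cxx)                 # Cxx = Lx Lx^T
        Ly = np.linalg.cholesky(Cyy)
        M = np.linalg.solve(Lx, Cxy) @ np.linalg.inv(Ly).T
        U, s, Vt = np.linalg.svd(M)
        a = np.linalg.solve(Lx.T, U[:, :k])          # canonical directions for X
        b = np.linalg.solve(Ly.T, Vt.T[:, :k])       # canonical directions for Y
        return a, b, s[:k]                           # s[:k] are the canonical correlations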

Journal ArticleDOI
TL;DR: This letter demonstrates how hindsight can be introduced to policy gradient methods, generalizing this idea to a broad class of successful algorithms, and shows that hindsight leads to a remarkable increase in sample efficiency.
Abstract: A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy. In addition to their potential to generalize desirable behavior to unseen goals, such policies may also enable higher-level planning based on subgoals. In sparse-reward environments, the capacity to exploit information about the degree to which an arbitrary goal has been achieved while another goal was intended appears crucial to enabling sample efficient learning. However, reinforcement learning agents have only recently been endowed with such capacity for hindsight. In this letter, we demonstrate how hindsight can be introduced to policy gradient methods, generalizing this idea to a broad class of successful algorithms. Our experiments on a diverse selection of sparse-reward environments show that hindsight leads to a remarkable increase in sample efficiency.
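The hindsight idea mentioned here can be illustrated with a tiny relabeling routine: an episode that failed for its intended goal is reused as a successful episode for the goal actually achieved. This is a generic sketch of goal relabeling (the names and reward rule are assumptions), not the letter's specific policy gradient estimator.

    def relabel_with_achieved_goal(episode):
        # episode: list of (state, action, reward, next_state) tuples collected for one intended goal
        achieved_goal = episode[-1][3]                 # treat the final state as the achieved goal
        relabeled = []
        for state, action, _, next_state in episode:
            reward = 1.0 if next_state == achieved_goal else 0.0
            relabeled.append((state, action, reward, next_state, achieved_goal))
        return relabeled                               # extra experience for goal-conditional learning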

Journal ArticleDOI
TL;DR: A computational neural network model is utilized to investigate the circuit-level effects of N-methyl D-aspartate receptor dysfunction; it demonstrates how one specific mechanism of cellular-level damage in mTBI affects the overall function of a neural network and points to the importance of reversing cellular-level changes to recover important properties of learning and memory in a microcircuit.
Abstract: Mild traumatic brain injury (mTBI) presents a significant health concern with potential persisting deficits that can last decades. Although a growing body of literature improves our understanding o...

Journal ArticleDOI
TL;DR: In this paper, an end-to-end recurrent spiking neural network model is proposed to recognize dynamic spatio-temporal patterns of neural computation in the brain, which achieves a test accuracy of 83.6% on average.
Abstract: Our real-time actions in everyday life reflect a range of spatiotemporal dynamic brain activity patterns, the consequence of neuronal computation with spikes in the brain. Most existing models with spiking neurons aim at solving static pattern recognition tasks such as image classification. Compared with static features, spatiotemporal patterns are more complex due to their dynamics in both space and time domains. Spatiotemporal pattern recognition based on learning algorithms with spiking neurons therefore remains challenging. We propose an end-to-end recurrent spiking neural network model trained with an algorithm based on spike latency and temporal difference backpropagation. Our model is a cascaded network with three layers of spiking neurons where the input and output layers are the encoder and decoder, respectively. In the hidden layer, the recurrently connected neurons with transmission delays carry out high-dimensional computation to incorporate the spatiotemporal dynamics of the inputs. The test results based on the data sets of spiking activities of the retinal neurons show that the proposed framework can recognize dynamic spatiotemporal patterns much better than using spike counts. Moreover, for 3D trajectories of a human action data set, the proposed framework achieves a test accuracy of 83.6% on average. Rapid recognition is achieved through the learning methodology-based on spike latency and the decoding process using the first spike of the output neurons. Taken together, these results highlight a new model to extract information from activity patterns of neural computation in the brain and provide a novel approach for spike-based neuromorphic computing.