Posted Content (DOI)

Rethinking dopamine as generalized prediction error

TL;DR: A new theory of dopamine function is developed that embraces a broader conceptualization of prediction errors and concludes that by signaling errors in both sensory and reward predictions, dopamine supports a form of reinforcement learning that lies between model-based and model-free algorithms.
Abstract: Midbrain dopamine neurons are commonly thought to report a reward prediction error, as hypothesized by reinforcement learning theory. While this theory has been highly successful, several lines of evidence suggest that dopamine activity also encodes sensory prediction errors unrelated to reward. Here we develop a new theory of dopamine function that embraces a broader conceptualization of prediction errors. By signaling errors in both sensory and reward predictions, dopamine supports a form of reinforcement learning that lies between model-based and model-free algorithms. This account remains consistent with current canon regarding the correspondence between dopamine transients and reward prediction errors, while also accounting for new data suggesting a role for these signals in phenomena such as sensory preconditioning and identity unblocking, which ostensibly draw upon knowledge beyond reward predictions.
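To make the proposal concrete, here is a minimal sketch of one way such a generalized prediction error can be formalized: temporal-difference (TD) learning over a successor representation (SR), in which dopamine-like units carry a vector of sensory (feature) prediction errors and the familiar scalar reward prediction error emerges as a reward-weighted projection of that vector. The toy state space, one-hot features, and learning rates are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

# Sketch only: TD learning of the successor representation (SR).
# Toy sizes and one-hot features are assumptions for illustration.
n_states, gamma, alpha = 5, 0.95, 0.1
M = np.eye(n_states)    # SR: expected discounted future feature occupancy
w = np.zeros(n_states)  # reward weights; value is V(s) = M[s] @ w

def sr_td_step(s, r, s_next):
    phi = np.eye(n_states)[s]
    # Vector of sensory (feature) prediction errors, one per feature:
    delta_vec = phi + gamma * M[s_next] - M[s]
    M[s] += alpha * delta_vec
    w[s] += alpha * (r - w[s])   # learn the reward function, r ~ w @ phi
    # The classic scalar RPE is the reward-weighted projection of the vector:
    rpe = w @ delta_vec
    return delta_vec, rpe
```

Because M is learned from sensory transitions even when no reward is delivered, a later change to w immediately revalues states; this is the kind of behavior (e.g., sensory preconditioning, identity unblocking) that a purely scalar reward prediction error cannot support.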
Citations
Journal Article (DOI)
TL;DR: In this paper, the authors provide an overview of neuromodulatory systems and their relationship to emerging pertinent principles in deep neural networks, and further outline opportunities for integrating neuromodulatory principles into deep neural networks, towards endowing artificial intelligence with a key ingredient underlying the flexibility and learning capability of biological systems (a minimal sketch of one such principle follows this entry).

17 citations
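
As a purely illustrative example of the kind of principle such a review surveys, a neuromodulator can be modeled as a scalar signal that gates plasticity, i.e., an error-dependent learning rate on an otherwise ordinary delta-rule update. Everything below, names included, is an assumption for exposition, not code from the cited article.

```python
import numpy as np

# Illustration only: a scalar "neuromodulator" m gates how much a linear
# layer learns from each example (surprise-dependent plasticity).
def neuromodulated_update(W, x, target, base_lr=0.05):
    err = target - W @ x                   # output error vector
    m = np.tanh(np.linalg.norm(err))       # modulator grows with surprise
    W += (base_lr * m) * np.outer(err, x)  # delta rule, gain-gated in place
    return m
```

Large errors transiently raise the effective learning rate, mimicking phasic neuromodulatory gain; small errors leave the weights nearly frozen.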

Journal Article (DOI)
TL;DR: Dopamine dynamics are captured across many striatal regions, and it is demonstrated that dopamine release is regionally extremely heterogeneous and that a reward prediction error–like signal is predominantly found in the relatively small limbic domain of the striatum.
Abstract (Significance): Although it is undisputed that striatal dopamine plays a prominent role in motivated behavior and learning, the precise information conveyed by dopamine signals as such is under active debate. For a long time, the idea dominated that dopamine encodes a reward prediction error and that this signal is broadcast uniformly throughout the brain. However, here, we capture dopamine dynamics across many striatal regions and demonstrate that dopamine release is, regionally, extremely heterogeneous and that a reward prediction error–like signal is predominantly found in the relatively small limbic domain of the striatum. Another striking organizing principle is that stimulus valence directs dopamine concentration homogeneously across all regions (i.e., appetitive stimuli increase dopamine and aversive stimuli decrease dopamine).

15 citations

Posted Content (DOI)
04 Apr 2019 (bioRxiv)
TL;DR: In this paper, the authors show that midbrain dopamine neurons are modulated by contralateral choice in a manner that is distinct from RPE, implying that choice encoding is better explained by movement direction (a toy version of the trial-by-trial regression follows this entry).
Abstract: Although midbrain dopamine (DA) neurons have been thought to primarily encode reward prediction error (RPE), recent studies have also found movement-related DAergic signals. For example, we recently reported that DA neurons in mice projecting to dorsomedial striatum are modulated by choices contralateral to the recording side. Here, we introduce, and ultimately reject, a candidate resolution for the puzzling RPE vs movement dichotomy, by showing how seemingly movement-related activity might be explained by an action-specific RPE. By considering both choice and RPE on a trial-by-trial basis, we find that DA signals are modulated by contralateral choice in a manner that is distinct from RPE, implying that choice encoding is better explained by movement direction. This fundamental separation between RPE and movement encoding may help shed light on the diversity of functions and dysfunctions of the DA system.

14 citations
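
The core analysis logic, considering choice and RPE jointly on a trial-by-trial basis, can be illustrated with a toy multiple regression; the simulated bandit, learning rate, and coefficients below are assumptions, not the study's data or pipeline.

```python
import numpy as np

# Toy version of the trial-by-trial analysis: regress a synthetic "DA"
# signal on both RPE and choice to test for choice coding beyond RPE.
rng = np.random.default_rng(1)
n = 500
choice = rng.integers(0, 2, n)             # 0 = ipsilateral, 1 = contralateral
q = np.zeros(2)
rpe = np.empty(n)
for t in range(n):
    r = rng.binomial(1, 0.7 if choice[t] == 1 else 0.3)
    rpe[t] = r - q[choice[t]]              # simple Q-learning prediction error
    q[choice[t]] += 0.1 * rpe[t]
da = 1.0 * rpe + 0.5 * choice + rng.normal(0, 0.2, n)  # synthetic signal

X = np.column_stack([np.ones(n), rpe, choice])
beta, *_ = np.linalg.lstsq(X, da, rcond=None)
print(beta)   # a nonzero choice weight indicates choice coding beyond RPE
```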

Journal Article (DOI)
TL;DR: In this article, the authors propose that the nature of active inference is abductive and that, to rectify aberrant active inference processes, one should change the "Rule" of abduction, i.e., the "prior beliefs" entailed by a patient's generative model (a toy numerical illustration follows this entry).
Abstract: This paper offers theoretical explanations for why “guided touch” or manual touch with verbal communication can be an effective way of treating the body (e.g., chronic pain) and the mind (e.g., emotional disorders). The active inference theory suggests that chronic pain and emotional disorders can be attributed to distorted and exaggerated patterns of interoceptive and proprioceptive inference. We propose that the nature of active inference is abductive. As such, to rectify aberrant active inference processes, we should change the “Rule” of abduction, or the “prior beliefs” entailed by a patient’s generative model. This means pre-existing generative models should be replaced with new models. To facilitate such replacement—or updating—the present treatment proposes that we should weaken prior beliefs, especially the one at the top level of hierarchical generative models, thereby altering the sense of agency, and redeploying attention. Then, a new prior belief can be installed through inner communication along with manual touch. The present paper proposes several hypotheses for possible experimental studies. If touch with verbal guidance is proven to be effective, this would demonstrate the relevance of active inference and the implicit prediction model at a behavioral level. Furthermore, it would open new possibilities of employing inner communication interventions, including self-talk training, for a wide range of psychological and physical therapies.

13 citations
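
The paper's key move, weakening a prior belief so a new one can be installed, has a simple Bayesian reading: lowering the prior's precision lets the same evidence move the posterior much further. The Gaussian toy below is our illustration, not the authors' model.

```python
# Toy Gaussian belief update: precision-weighted averaging of prior and evidence.
def posterior(mu_prior, prec_prior, obs, prec_obs):
    prec = prec_prior + prec_obs
    mu = (prec_prior * mu_prior + prec_obs * obs) / prec
    return mu, prec

print(posterior(0.0, 10.0, 1.0, 1.0))  # strong prior: posterior mean ~0.09
print(posterior(0.0, 0.5, 1.0, 1.0))   # weakened prior: posterior mean ~0.67
```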

Posted Content (DOI)
02 Mar 2022 (bioRxiv)
TL;DR: A new "Vector RPE" model is introduced, positing that DA neurons report individual RPEs for a subset of a population vector code for an animal's state. This provides a path to reconcile new observations of DA neuron heterogeneity with classic ideas about RPE coding, while also offering a new perspective on how the brain performs reinforcement learning in high-dimensional environments (see the sketch after this entry).
Abstract: The hypothesis that midbrain dopamine (DA) neurons broadcast an error signal for the prediction of reward (reward prediction error, RPE) is among the great successes of computational neuroscience [1–3]. However, recent results contradict a core aspect of this theory: that the neurons uniformly convey a scalar, global signal. Instead, when animals are placed in a high-dimensional environment, DA neurons in the ventral tegmental area (VTA) display substantial heterogeneity in the features to which they respond, while also having more consistent RPE-like responses at the time of reward. Here we introduce a new "Vector RPE" model that explains these findings, by positing that DA neurons report individual RPEs for a subset of a population vector code for an animal's state (moment-to-moment situation). To investigate this claim, we train a deep reinforcement learning model on a navigation and decision-making task, and compare the Vector RPE derived from the network to population recordings from DA neurons during the same task. The Vector RPE model recapitulates the key features of the neural data: specifically, heterogeneous coding of task variables during the navigation and decision-making period, but uniform reward responses. The model also makes new predictions about the nature of the responses, which we validate. Our work provides a path to reconcile new observations of DA neuron heterogeneity with classic ideas about RPE coding, while also providing a new perspective on how the brain performs reinforcement learning in high-dimensional environments.

11 citations
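
A minimal sketch of the Vector RPE idea, in an assumed linear form: each simulated DA unit reports the TD error for one component of a population feature code. The feature terms differ across units (heterogeneous task coding) while the reward term is shared equally (uniform reward responses), and the components sum to the classic scalar RPE. Sizes and weights below are illustrative assumptions.

```python
import numpy as np

# Sketch: per-unit ("vector") RPEs whose sum is the classic scalar RPE.
gamma, n_units = 0.95, 8
w = np.random.default_rng(2).normal(0.0, 1.0, n_units)  # value readout weights

def vector_rpe(phi, phi_next, r):
    # phi, phi_next: population feature vectors for current / next state
    delta = r / n_units + gamma * w * phi_next - w * phi  # one RPE per unit
    return delta, delta.sum()   # sum = r + gamma*V(s') - V(s)
```

Because w * phi varies across units while every unit receives the same r / n_units share, the sketch reproduces the qualitative pattern described above: heterogeneous responses during navigation, uniform responses at reward.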
